Re: Neural Network predictions converging to one value


I have been running the training as you suggested. It's still in progress, but I thought I would mention some things I have found in the meantime.

Without regularization, it has not got stuck in the identical-predictions problem on the training set, even after 24 epochs. This contradicts what I reported in my first post, where I said that regularization did not affect this problem. I must have got my observations mixed up while juggling the different model hyper-parameters. Sorry about that.

However, with some regularization, I am seeing the problem I saw before. So, it must have been the regularization that pushed it over to a high-bias region, where the best it could do was to learn the mean of the outputs it most recently saw and predict that for every example. In that case, maybe this is similar to the condition shown in the last curve on slide 12 of your lecture on regularization?
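To illustrate the mechanism described above, here is a minimal sketch rather than the actual network: it assumes a plain linear model with an L2 penalty on the weights only (the bias is left unpenalized), fitted in closed form on synthetic data. The function name `fit_ridge`, the data, and the penalty values are all made up for illustration. With a very large penalty the weights shrink toward zero, so every prediction collapses to roughly the mean of the targets.

```python
import numpy as np

def fit_ridge(X, y, lam):
    """Closed-form ridge regression; the intercept is NOT penalized (assumption)."""
    x_mean = X.mean(axis=0)
    y_mean = y.mean()
    Xc = X - x_mean              # center inputs so the intercept separates out
    yc = y - y_mean
    n_features = X.shape[1]
    w = np.linalg.solve(Xc.T @ Xc + lam * np.eye(n_features), Xc.T @ yc)
    b = y_mean - x_mean @ w
    return w, b

# Synthetic data (hypothetical, for illustration only).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = X @ np.array([2.0, -1.0, 0.5]) + 3.0 + 0.1 * rng.normal(size=200)

spread = {}      # standard deviation of the predictions per penalty value
mean_gap = {}    # |mean(prediction) - mean(target)| per penalty value
for lam in (0.0, 1e6):
    w, b = fit_ridge(X, y, lam)
    preds = X @ w + b
    spread[lam] = float(preds.std())
    mean_gap[lam] = float(abs(preds.mean() - y.mean()))
    print(f"lambda={lam:g}  std of predictions={spread[lam]:.6f}")
```

In a network trained with mini-batches the constant would track recent batches rather than the exact training mean, but the high-bias behaviour is the same idea: the penalty dominates, and only the bias term is left to fit the data.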

I still see significant training error in the long, many-epoch run, and also in the randomized runs, though there probably haven't been enough random runs yet to be sure this is always the case. If the trend continues, I suppose it means the model may not have a sufficient number of parameters for this problem?
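As a rough picture of what "not enough parameters" looks like, here is a small sketch under made-up assumptions: polynomial fits of increasing degree to noisy samples of sin(x), standing in for models of increasing capacity. The data and the degrees are hypothetical and have nothing to do with the actual network. The point is only that an under-parameterized model keeps a large training error however well it is optimized, while a larger one drives it down to about the noise level.

```python
import numpy as np

# Synthetic 1-D regression problem (hypothetical, for illustration only).
rng = np.random.default_rng(0)
x = np.linspace(-3.0, 3.0, 200)
y = np.sin(x) + 0.05 * rng.normal(size=x.size)

train_mse = {}
for degree in (1, 3, 7):
    # Least-squares polynomial fit: this is the best the model class can do,
    # so any remaining training error is due to limited capacity, not optimization.
    coeffs = np.polyfit(x, y, degree)
    residual = np.polyval(coeffs, x) - y
    train_mse[degree] = float(np.mean(residual ** 2))
    print(f"degree={degree}  training MSE={train_mse[degree]:.5f}")
```

If the training error of the real network stays high without regularization and across random restarts, the analogous check would be to add hidden units or layers and see whether the training error drops.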
