Quote:
Originally Posted by ladybird2012
1) Slide 21: The graph on the RHS shows that when Qf=15 we need no regularizer. However, if I understand it right, this graph is based on the experiment performed on slide 13 of Lecture 11. On that slide we had overfitting when Qf>=10, since we were trying to fit the target with a tenth-order polynomial. So I would have assumed that for Qf>=10 we would need a regularizer... What am I missing?
The figure in slide 21 uses different parameters from the overfitting figures. The model being regularized is 15th order, and there is zero stochastic noise in that part. Since a 15th-order target lies within a 15th-order model, there is no deterministic noise either at Qf=15, so there is nothing to overfit and hence nothing for a regularizer to combat.
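For concreteness, here is a sketch of the generative model behind those figures, assuming the Lecture 11 setup (up to the coefficient normalization used in the slides):

y = f(x) + \sigma \, \epsilon, \qquad f(x) = \sum_{q=0}^{Q_f} a_q \, L_q(x)

Here L_q is the q-th Legendre polynomial and \epsilon is standard normal noise. Deterministic noise is the part of f that the hypothesis set cannot capture, so it vanishes whenever Q_f is at most the model order.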
Quote:
2) Weight decay versus weight elimination for neural networks: I feel like these two regularizers are doing opposite things. Weight decay reduces the weights and favors small weights, but weight elimination favors bigger weights and eliminates small weights.
Weight elimination does not favor bigger weights. It tries to reduce all weights, but it has a bigger incentive to reduce small weights than to reduce big weights.
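For reference, a sketch of the two penalty terms, assuming the standard weight-elimination form with scale parameter \beta (the slides may use different notation):

\Omega_{\text{decay}}(w) = \sum_q w_q^2, \qquad \Omega_{\text{elim}}(w) = \sum_q \frac{w_q^2}{\beta^2 + w_q^2}

Each term of \Omega_{\text{elim}} is increasing in |w_q|, so it never rewards making a weight larger; it merely saturates near 1 once |w_q| \gg \beta. Differentiating a single term gives

\frac{\partial}{\partial w_q} \frac{w_q^2}{\beta^2 + w_q^2} = \frac{2 \beta^2 w_q}{(\beta^2 + w_q^2)^2},

which is proportional to w_q when |w_q| \ll \beta (just like weight decay) but tends to zero when |w_q| \gg \beta. The gradient pressure therefore concentrates on the small weights, pushing them all the way to zero (elimination), while large weights are left nearly untouched.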