Questions on Lecture 12
Hi,
I have two questions about Lecture 12.
1) Slide 21: the graph on the RHS shows that when Qf = 15, no regularizer is needed. However, if I understand it correctly, this graph is based on the experiment from slide 13 of Lecture 11. On that slide we had overfitting for Qf >= 10, since we were fitting the target with a tenth-order polynomial. So I would have assumed that for Qf >= 10 we would need a regularizer... What am I missing?
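
In case it helps, here is how I tried to reproduce the experiment myself. This is only a rough sketch of my understanding of the setup: N, sigma, the number of trials, and the lambda grid are my own guesses rather than the slide's values, and I skipped the normalization of the target that the lecture uses.

import numpy as np

rng = np.random.default_rng(0)

def make_target(Qf):
    # Random target of degree Qf as a combination of Legendre polynomials,
    # like the setup in Lecture 11 (target normalization skipped here).
    return np.polynomial.legendre.Legendre(rng.standard_normal(Qf + 1))

def best_lambda(Qf, N=30, sigma=0.1, trials=200,
                lambdas=(0.0, 1e-4, 1e-3, 1e-2, 1e-1, 1.0)):
    # Average E_out over many random data sets for each lambda,
    # then return the lambda with the smallest average E_out.
    e_out = np.zeros(len(lambdas))
    x_test = np.linspace(-1, 1, 500)
    Z_test = np.vander(x_test, 11, increasing=True)  # 10th-order hypothesis
    for _ in range(trials):
        f = make_target(Qf)
        x = rng.uniform(-1, 1, N)
        y = f(x) + sigma * rng.standard_normal(N)
        Z = np.vander(x, 11, increasing=True)
        y_test = f(x_test)
        for i, lam in enumerate(lambdas):
            # Regularized least squares: w = (Z'Z + lam*I)^(-1) Z'y
            w = np.linalg.solve(Z.T @ Z + lam * np.eye(11), Z.T @ y)
            e_out[i] += np.mean((Z_test @ w - y_test) ** 2)
    return lambdas[int(np.argmin(e_out))]

for Qf in (5, 10, 15, 20):
    print(f"Qf={Qf:2d}: best lambda = {best_lambda(Qf)}")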
2) Weight decay versus weight elimination for neural networks: I feel like these two regularizers do opposite things. Weight decay shrinks the weights and favors small weights, while weight elimination (as I read it) favors bigger weights and eliminates small ones. So I guess these two regularizers are used under different conditions in neural networks -- could someone give me an example so I can pin it down? Are they ever both used in the same learning problem?
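
To make the comparison concrete, here is the tiny numerical check I did of the two penalty terms as I understand them from the slides: sum of w^2 per weight for weight decay, and w^2 / (beta^2 + w^2) per weight for weight elimination. The value beta = 1 is just my choice, not the lecture's.

import numpy as np

beta = 1.0  # scale in the weight-elimination penalty (my choice; the lecture's may differ)

def weight_decay(w):
    return w ** 2                         # keeps growing: big weights are punished hard

def weight_elimination(w):
    return w ** 2 / (beta ** 2 + w ** 2)  # saturates near 1: big weights barely feel it

for w in (0.1, 1.0, 10.0):
    print(f"w = {w:5.1f}: decay = {weight_decay(w):7.2f}, "
          f"elimination = {weight_elimination(w):.3f}")

If I read the numbers right, weight elimination still penalizes small weights roughly like weight decay does, but its penalty flattens out for large weights, so the pressure toward zero disappears for them -- is that the right way to understand it "favoring" big weights?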
Thanks a lot in advance.