LFD Book Forum (http://book.caltech.edu/bookforum/index.php)
-   Chapter 4 - Overfitting (http://book.caltech.edu/bookforum/forumdisplay.php?f=111)
-   -   Choice of regularization parameter (http://book.caltech.edu/bookforum/showthread.php?t=474)

jbaker 05-10-2012 08:18 PM

Choice of regularization parameter
 
I'll try to lay out my question from the lecture again, since I wasn't quite dexterous enough to get the point across in chat-room format. :confused:

I think the point I was missing is that in Fig. 4.7(b) (the last graph on slide 21 of the May 10 lecture), the stochastic noise is in fact fixed at zero. I was probably having flashbacks to Fig. 4.3(b), where it's fixed at a non-zero value, in which case the behavior of E_out would depend on N as well as lambda, right? So I was wondering what choice of N the graph was plotted for, and how the behavior of the Q_f = {15, 30, 100} lines would change with N. Imagining that N = 15, as in previous examples, it was surprising that regularization wouldn't help when Q_f = 15!

But with zero stochastic noise, the expected deterministic noise is just whatever it is, independent of N, since the fit is the same regardless of which random points you pick. Well, I suppose we'd better have N > Q_f, at least, or we're in trouble!
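
(To convince myself, I tried a quick numpy sketch -- a toy setup of my own, using a Legendre basis like the book's experiments, not anything from the text. With zero noise and N > Q_f, two different random samples give back the same fit:)

Code:

import numpy as np
from numpy.polynomial import legendre as L

Qf, N = 15, 30                       # target complexity and sample size, N > Qf + 1
rng = np.random.default_rng(0)
wf = rng.standard_normal(Qf + 1)     # random target coefficients (Legendre basis)

def fit(seed):
    """Unregularized least-squares fit of a degree-Qf model to N noiseless points."""
    r = np.random.default_rng(seed)
    x = r.uniform(-1, 1, N)
    X = L.legvander(x, Qf)           # N x (Qf+1) design matrix
    y = X @ wf                       # zero stochastic noise
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return w

# Two different random samples recover the same (true) coefficients:
print(np.max(np.abs(fit(1) - wf)))   # ~0 up to round-off
print(np.max(np.abs(fit(2) - wf)))   # ~0 up to round-off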

Have I got that right?

jbaker 05-10-2012 11:25 PM

Re: Choice of regularization parameter
 
Which leads me down the road to a related quandary -- won't the red and green lines (with non-zero stochastic noise) in Fig. 4.7(a) move around a bit depending on N? (Because of the dependence as shown in Fig. 4.3(a)?) So haven't you had to choose a value of N to generate those? Or am I missing something?

magdon 05-11-2012 02:01 PM

Re: Choice of regularization parameter
 
In both plots of Fig. 4.7, N = 30 (as in Fig. 4.6); unfortunately this is not mentioned in the caption of Fig. 4.7. The general shape of the plots will not change as N increases, and yes, if N is less than the degree of the polynomial you are fitting, then there will be problems and you have to use the pseudo-inverse.
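
(As a minimal illustration of that underdetermined case -- a hypothetical numpy sketch with Q = 15 but only N = 10 points, not code from the book -- the normal equations are singular, but the pseudo-inverse still gives the minimum-norm fit:)

Code:

import numpy as np
from numpy.polynomial import legendre as L

rng = np.random.default_rng(0)
Q, N = 15, 10                    # 16 coefficients but only 10 data points
x = rng.uniform(-1, 1, N)
y = np.sin(np.pi * x)            # an arbitrary target, purely for illustration

X = L.legvander(x, Q)            # N x (Q+1): more columns than rows
# X.T @ X is singular, so the normal equations cannot be solved directly;
# the pseudo-inverse picks the minimum-norm solution among the many
# weight vectors that fit the data exactly.
w = np.linalg.pinv(X) @ y
print(np.max(np.abs(X @ w - y))) # ~0: the fit interpolates all N points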

If your model complexity matches the target complexity (Q = Q_f) *and* there is no stochastic noise, then with at least N = Q + 1 data points you will recover the target function exactly without regularization. Any regularization will therefore necessarily result in an inferior E_out.
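
(Here is a small numpy sketch of that claim -- my own toy setup with a Legendre basis, Q = Q_f = 15 and N = 30 as in Fig. 4.6, not the book's code. At lambda = 0 the target is recovered exactly and E_out is zero up to round-off; any lambda > 0 only increases it:)

Code:

import numpy as np
from numpy.polynomial import legendre as L

rng = np.random.default_rng(0)
Q = Qf = 15                          # model complexity matches target complexity
N = 30                               # as in Fig. 4.6 / 4.7
wf = rng.standard_normal(Qf + 1)     # random target coefficients (Legendre basis)

x = rng.uniform(-1, 1, N)
X = L.legvander(x, Q)
y = X @ wf                           # zero stochastic noise

xt = np.linspace(-1, 1, 2001)        # dense grid to approximate E_out
Xt = L.legvander(xt, Q)

for lam in [0.0, 1e-4, 1e-2, 1.0]:
    # weight decay: w_reg = (X'X + lam*I)^{-1} X'y
    w = np.linalg.solve(X.T @ X + lam * np.eye(Q + 1), X.T @ y)
    print(lam, np.mean((Xt @ w - Xt @ wf) ** 2))
# E_out is ~0 at lam = 0 (exact recovery) and grows as lam increases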

Hope this helps,

jbaker 05-21-2012 04:48 PM

Re: Choice of regularization parameter
 
Yes indeed, thanks. I knew there was an N hiding in there somewhere! :D I also wasn't keeping the distinction between Q and Q_f entirely clear in my head, so that demystifies things a bit too.

