LFD Book Forum  

Book Feedback - Learning From Data > Chapter 4 - Overfitting

  #1  
05-10-2012, 07:18 PM
jbaker
Member
 
Join Date: Apr 2012
Posts: 11
Choice of regularization parameter

I'll try my question from the lecture again, since I wasn't quite dexterous enough to get the point across in chat-room format.

I think the point I was missing is that in Fig. 4.7(b) (the last graph on slide 21 of the May 10 lecture), the stochastic noise is in fact fixed at zero. I was probably having flashbacks to Fig. 4.3(b), where it's a fixed non-zero value, in which case the behavior of E_out would depend on N as well as lambda, right? So I was wondering what choice of N the graph was plotted for, and how the behavior of the Q_f = {15, 30, 100} lines would change with N. And imagining that N = 15 as in previous examples, it was surprising that regularization wouldn't help out when Q_f = 15!

But with zero stochastic noise, the expected deterministic noise is just whatever it is, independent of N, since the fit is the same regardless of which random points you pick. Well, I suppose we'd better have N > Q_f, at least, or we're in trouble!

Have I got that right?
  #2  
05-10-2012, 10:25 PM
jbaker
Member
 
Join Date: Apr 2012
Posts: 11
Re: Choice of regularization parameter

Which leads me to a related quandary: won't the red and green lines (with non-zero stochastic noise) in Fig. 4.7(a) move around a bit depending on N, because of the dependence shown in Fig. 4.3(a)? So didn't you have to choose a value of N to generate those? Or am I missing something?
  #3  
05-11-2012, 01:01 PM
magdon
RPI
 
Join Date: Aug 2009
Location: Troy, NY, USA.
Posts: 595
Re: Choice of regularization parameter

In both plots of Fig. 4.7, N = 30 (as in Fig. 4.6); unfortunately this is not mentioned in the caption of Fig. 4.7. The general shape of the plots will not change as N increases, and yes, if N is less than the number of coefficients of the polynomial you are fitting, the least-squares fit is underdetermined and you have to use the pseudo-inverse.
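
A quick illustration of that underdetermined case (my own minimal Python sketch, not the book's experiment; the degree, points, and target values below are arbitrary placeholders): with fewer data points than polynomial coefficients, Z^T Z is singular, and the pseudo-inverse gives the minimum-norm weights that still fit all N points exactly.

Code:
import numpy as np

# N = 5 points but a degree-10 polynomial: more coefficients (Q+1 = 11) than points.
Q, N = 10, 5
x = np.linspace(-1, 1, N)
Z = np.vander(x, Q + 1)        # N x (Q+1) polynomial feature matrix, rank at most N
y = np.sin(np.pi * x)          # arbitrary target values, just for illustration

# Z^T Z is singular, so the normal equations cannot be solved directly;
# np.linalg.pinv returns the minimum-norm least-squares solution instead.
w = np.linalg.pinv(Z) @ y
print(np.allclose(Z @ w, y))   # True: all N points are fit exactly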

If your model complexity matches the target complexity (Q = Q_f) *and* there is no stochastic noise, then with at least N = Q+1 data points you will recover the target function exactly without regularization. Any regularization will therefore necessarily result in an inferior E_out.
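
And a minimal sketch of that exact-recovery claim (again my own Python illustration with an arbitrary random target, not the experiment behind Fig. 4.7): with Q = Q_f, no stochastic noise, and N = Q+1 points, lambda = 0 recovers the target essentially to machine precision, while every lambda > 0 gives a worse E_out.

Code:
import numpy as np

rng = np.random.default_rng(0)
Q = Qf = 5                       # model order matches target order
N = Q + 1                        # just enough noiseless points for exact recovery

w_target = rng.standard_normal(Qf + 1)        # a random degree-Q_f target
x_train = rng.uniform(-1, 1, N)
Z = np.vander(x_train, Q + 1)                 # polynomial features of the training points
y = np.vander(x_train, Qf + 1) @ w_target     # noiseless target values

x_test = np.linspace(-1, 1, 1000)
Z_test = np.vander(x_test, Q + 1)
y_test = np.vander(x_test, Qf + 1) @ w_target

for lam in [0.0, 0.01, 0.1, 1.0]:
    # weight-decay (ridge) solution: w = (Z^T Z + lambda*I)^(-1) Z^T y
    w = np.linalg.solve(Z.T @ Z + lam * np.eye(Q + 1), Z.T @ y)
    E_out = np.mean((Z_test @ w - y_test) ** 2)
    print(f"lambda = {lam:4.2f}   E_out = {E_out:.3e}")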

Hope this helps,

__________________
Have faith in probability
  #4  
05-21-2012, 03:48 PM
jbaker
Member
 
Join Date: Apr 2012
Posts: 11
Re: Choice of regularization parameter

Yes indeed, thanks. I knew there was an N hiding in there somewhere! I also wasn't keeping the potential distinction between Q and Q_f entirely clear in my head, so that demystifies it a bit as well.