LFD Book Forum Overfitting with Polynomials: deterministic noise
#1
08-17-2012, 12:38 PM
 Kevin Junior Member Join Date: Jul 2012 Posts: 1
Overfitting with Polynomials: deterministic noise

I have some difficulty understanding the idea of deterministic noise. There seems to be a disturbing contradiction with what we saw in the bias-variance tradeoff, particularly in the 50th-order noiseless target example.

Chapter 4 states that:
- A 2nd-order polynomial can be better than a 10th-order polynomial for fitting a 50th-order polynomial target, and this is due to the deterministic noise.

--> So I conclude that there is more deterministic noise with the 10th-order model than with the 2nd-order one.

- Deterministic noise = bias

But we saw in the bias-variance tradeoff that a more complex model has a lower bias than a simpler one.

Obviously I'm wrong somewhere, but where?
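To make the puzzle concrete, here is a small simulation in the spirit of the book's experiment. The setup is my own stand-in, not the exact one from Chapter 4: a noiseless 50th-order target built from random Legendre coefficients, 15 training points, and least-squares polynomial fits.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed setup (not the book's exact experiment): a noiseless
# 50th-order target made of random Legendre coefficients on [-1, 1].
coefs = rng.standard_normal(51)

def f(x):
    return np.polynomial.legendre.legval(x, coefs)

x_test = np.linspace(-1, 1, 2001)
y_test = f(x_test)

def median_e_out(degree, n_train=15, trials=100):
    """Median squared out-of-sample error of a degree-`degree`
    least-squares fit to n_train noiseless samples."""
    errs = []
    for _ in range(trials):
        x = rng.uniform(-1, 1, n_train)
        g = np.polynomial.Polynomial.fit(x, f(x), degree)
        errs.append(np.mean((g(x_test) - y_test) ** 2))
    return float(np.median(errs))

print("H2 :", median_e_out(2))
print("H10:", median_e_out(10))
```

With so few samples, the 10th-order fit typically tracks the training points but oscillates between them, so its out-of-sample error usually comes out far larger than the 2nd-order fit's, even though the target has no stochastic noise at all.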
#2
08-17-2012, 02:26 PM
 yaser Caltech Join Date: Aug 2009 Location: Pasadena, California, USA Posts: 1,475
Re: Overfitting with Polynomials: deterministic noise

Quote:
 Originally Posted by Kevin Chapter 4 states that: - A 2nd-order polynomial can be better than a 10th-order polynomial for fitting a 50th-order polynomial target, and this is due to the deterministic noise. --> So I conclude that there is more deterministic noise with the 10th-order model than with the 2nd-order one.
The confusion is justified, since there are two opposing factors here (see Exercise 4.3). There is more deterministic noise with the 2nd order model than with the 10th order model (which would suggest more overfitting with the 2nd order), but the model itself is simpler so that would suggest less overfitting. It turns out that the latter factor wins here.

If you want to isolate the impact of deterministic noise on overfitting without interference from the model complexity, you can fix the model and change the complexity of the target function.
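That isolation can be sketched as follows. This is my own toy setup, assuming random Legendre targets: fitting the fixed 2nd-order model on a dense grid approximates its best hypothesis h^*, so the remaining error is essentially the deterministic noise alone, and it grows with the target complexity Qf.

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(-1, 1, 4001)  # dense grid standing in for the whole input space

# Fix the model at 2nd order and raise only the target complexity Qf.
# Fitting on the dense grid approximates the best hypothesis h*, so the
# residual error is (approximately) the deterministic noise alone.
for qf in (2, 5, 10, 20, 50):
    y = np.polynomial.legendre.legval(x, rng.standard_normal(qf + 1))
    h_star = np.polynomial.Polynomial.fit(x, y, 2)
    det_noise = float(np.mean((h_star(x) - y) ** 2))
    print(f"Qf = {qf:2d}: deterministic noise ~ {det_noise:.3f}")
```

For Qf = 2 the model can represent the target exactly and the deterministic noise is essentially zero; as Qf grows, more of the target falls outside the model's reach.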
__________________
Where everyone thinks alike, no one thinks very much
#3
11-03-2016, 12:45 PM
 CountVonCount Member Join Date: Oct 2016 Posts: 17
Re: Overfitting with Polynomials: deterministic noise

I think the confusion comes from Figure 4.4 compared with the figures for the stochastic noise.

There you write that the shading is the deterministic noise, since it is the difference between the best fit of the current model and the target function.
Exactly this shading appears in the bias-variance analysis, so the value of the deterministic noise is directly related to the bias.

When you talk about stochastic noise, you say that the out-of-sample error increases with the model complexity, and this is related to the area between the final hypothesis g^{(D)} and the target f. Thus the reader might think the bias increases with the complexity of the model. However, the bias depends on the average hypothesis \bar{g} and not on g^{(D)}. The reason this area increases is the stochastic noise: if there isn't any noise, the final hypothesis has a better chance to fit f (depending on the position of the samples).

In fact (and this is not really clear from the text, but from Exercise 4.3), on a noiseless target the shaded area in Figure 4.4 will decrease when the model complexity increases, and thus the bias decreases.
My suggestion is to make it clearer that in the case of stochastic noise you talk about the actual final hypothesis g^{(D)}, while in the case of deterministic noise you talk about the best-fitting hypothesis h^*, which is related to the bias.

From my understanding, I would say:
Overfitting does not apply to the best fit of the model (h^*) but to the actual final hypothesis (g^{(D)}). In the bias-variance analysis we saw that the variance increases together with the model complexity (at the same number of samples). So I think overfitting is mainly a matter of the variance, whether it is caused by the stochastic noise or by the deterministic noise.
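That reading can be checked with a direct bias-variance estimate. A sketch under an assumed toy setup (random 50th-order Legendre target, 15 noiseless points per dataset), where \bar{g} is approximated by averaging many final hypotheses:

```python
import numpy as np

rng = np.random.default_rng(2)
coefs = rng.standard_normal(51)  # noiseless 50th-order target (assumed setup)

def f(x):
    return np.polynomial.legendre.legval(x, coefs)

x_grid = np.linspace(-1, 1, 1001)
y_grid = f(x_grid)

def bias_variance(degree, n_train=15, datasets=300):
    """Estimate bias and variance of degree-`degree` least-squares fits."""
    preds = np.empty((datasets, x_grid.size))
    for d in range(datasets):
        x = rng.uniform(-1, 1, n_train)
        preds[d] = np.polynomial.Polynomial.fit(x, f(x), degree)(x_grid)
    g_bar = preds.mean(axis=0)                 # the average hypothesis
    bias = float(np.mean((g_bar - y_grid) ** 2))
    var = float(np.mean((preds - g_bar) ** 2))
    return bias, var

for deg in (2, 10):
    b, v = bias_variance(deg)
    print(f"H{deg:2d}: bias = {b:.2f}, variance = {v:.2f}")
```

Even with no stochastic noise anywhere, the 10th-order variance comes out far larger than the 2nd-order one, which is exactly the overfitting driven by deterministic noise.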
