#1
I am still a little confused about this. It's clear to me that reducing deterministic noise can lead to overfitting (if there is not enough in-sample data), but the presence of deterministic noise itself seems (to me) to cause underfitting. Am I just being pedantic?
(I contrast this with stochastic noise: it cannot be reduced, and clearly any attempt to fit it is overfitting, because no amount of in-sample data will clarify its shape.)
#2
This is a very subtle question!
The most important thing to realize is that in learning, the target function $f$ is unknown and all you get to see is a finite data set generated from it.

i) If there is stochastic noise with 'magnitude' $\sigma^2$, then you are in trouble.

ii) If there is deterministic noise, then you are in trouble.

The stochastic noise can be viewed as one part of the data generation process (e.g. measurement errors). The deterministic noise can similarly be viewed as another part of the data generation process, namely the part of $f$ that the best hypothesis $h^*$ in $\mathcal{H}$ cannot capture: $f(x) - h^*(x)$.

I just need to tell you what 'trouble' means. Well, we actually use another word instead of 'trouble' - overfitting. This means you may be likely to make an inferior choice over the superior choice because the inferior choice has lower in-sample error. Doing stuff that looks good in-sample but leads to disasters out-of-sample is the essence of overfitting. An example of this is trying to choose the regularization parameter: if you pick a lower regularization parameter, then you have lower in-sample error, but it leads to higher out-of-sample error - you picked the $\lambda$ that looked better in-sample ($E_{\text{in}}$) but was actually worse out-of-sample ($E_{\text{out}}$).

Now let's get back to the subtle part of your question. There is actually another way to decrease the deterministic noise - increase the complexity of $\mathcal{H}$, so that it can capture more of $f$. But with a fixed number of data points $N$, a more complex $\mathcal{H}$ is also better able to fit whatever noise (stochastic or deterministic) happens to be in those $N$ points, so you typically end up with worse $E_{\text{out}}$. That is why, in practice, the presence of deterministic noise shows up as overfitting rather than underfitting.
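To make the regularization example above concrete, here is a minimal sketch. It is a toy setup of my own (not from the book or this thread): a degree-10 polynomial with weight decay is fit to a few noisy points from a sine target, and the smallest in-sample error is obtained at the smallest $\lambda$ even though a larger $\lambda$ generalizes better. The target, noise level, sample size, and $\lambda$ grid are all assumptions for illustration.

```python
# Toy sketch (assumed setup): picking the regularization parameter lambda by
# in-sample error alone tends to select the inferior (overfitting) choice.
import numpy as np

rng = np.random.default_rng(0)

def target(x):
    # the target function (unknown to the learner)
    return np.sin(2 * np.pi * x)

def poly_features(x, degree=10):
    # feature matrix [1, x, x^2, ..., x^degree]
    return np.vander(x, degree + 1, increasing=True)

# small, noisy sample -> prone to overfitting
N = 15
x_in = rng.uniform(-1, 1, N)
y_in = target(x_in) + 0.2 * rng.standard_normal(N)   # stochastic noise
x_out = np.linspace(-1, 1, 1000)                      # dense grid as an E_out proxy
y_out = target(x_out)

Z_in, Z_out = poly_features(x_in), poly_features(x_out)

for lam in [0.0, 1e-3, 1e-1, 1.0]:
    # weight-decay (ridge) solution: w = (Z'Z + lam*I)^(-1) Z'y
    w = np.linalg.solve(Z_in.T @ Z_in + lam * np.eye(Z_in.shape[1]), Z_in.T @ y_in)
    E_in = np.mean((Z_in @ w - y_in) ** 2)
    E_out = np.mean((Z_out @ w - y_out) ** 2)
    print(f"lambda={lam:<7} E_in={E_in:.4f}  E_out={E_out:.4f}")
```

Typically this prints the smallest $E_{\text{in}}$ at $\lambda = 0$ together with the largest $E_{\text{out}}$, which is exactly the "looks good in-sample, disastrous out-of-sample" pattern described above.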
__________________
Have faith in probability |
#3
Thanks very much for the detailed reply!
What I am curious about is how we can be "led astray" if $\mathcal{H}$ and $f$ are fixed: the deterministic noise $f(x) - h^*(x)$ is then also fixed, so isn't it essentially the bias, which I would expect to show up as underfitting rather than overfitting?
#4
The "being led astray" refers to the "noise" in the finite data set leading the learning algorithm in the wrong direction and outputting the wrong final hypothesis (though
![]() ![]() We didn't precisely define deterministic noise, we just gave the intuitive idea. bias is very related to it though not exactly the same. Indeed though ![]() ![]() ![]() ![]() ![]() ![]() Quote:
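To illustrate how a fixed deterministic noise can still lead the learner astray on a finite sample, here is another small sketch, again a toy setup of my own (the target, degree, and sample size are assumptions): with zero stochastic noise, a fixed polynomial hypothesis set fit to a handful of points typically achieves a much smaller $E_{\text{in}}$ than its $E_{\text{out}}$ by fitting the deterministic noise at those points, and its $E_{\text{out}}$ is worse than that of the best hypothesis $h^*$ in the same set.

```python
# Toy sketch (assumed setup): H and f are fixed and there is no stochastic
# noise, yet the hypothesis learned from a small sample fits the deterministic
# noise in-sample and generalizes worse than the best-in-class hypothesis h*.
import numpy as np

rng = np.random.default_rng(1)
DEGREE = 5                                   # fixed hypothesis set H
f = lambda x: np.sin(2 * np.pi * x)          # fixed target f, no noise added

def fit(x, y):
    # least-squares fit of a degree-DEGREE polynomial
    Z = np.vander(x, DEGREE + 1, increasing=True)
    w, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return w

def mse(w, x, y):
    return np.mean((np.vander(x, DEGREE + 1, increasing=True) @ w - y) ** 2)

x_dense = np.linspace(-1, 1, 2000)
y_dense = f(x_dense)

# h*: (approximately) the best degree-5 fit to f itself
w_star = fit(x_dense, y_dense)
print(f"E_out of h*        : {mse(w_star, x_dense, y_dense):.4f}")

# learned hypotheses from small samples: E_in is tiny, E_out exceeds h*'s
for trial in range(3):
    x_in = rng.uniform(-1, 1, 8)
    w = fit(x_in, f(x_in))
    print(f"trial {trial}: E_in={mse(w, x_in, f(x_in)):.2e}  "
          f"E_out={mse(w, x_dense, y_dense):.4f}")
```

Each trial's learned hypothesis is different even though $\mathcal{H}$ and $f$ never change, which is the sense in which the finite data set, and not the bias alone, determines where the learner is led.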
__________________
Have faith in probability