#1
First we generate a training set of 1000 points. We also generate a vector y using the target function given.
Now we are directed to randomly flip the sign of 10% of the training set. The training set has 4000 numbers at this point (1000 points with four values each, counting y). Should we randomly choose 400 of these numbers and flip their signs? Does that include the y values? Does it also include the 1000 x0 values, which we initialized to 1.0?
#2
I'm pretty sure we only flip the sign on the ys. That's what I did and got reasonable results. That corresponds to noise in your sample of the target function.
-James
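For readers following along, the label-only flip described in this post could be sketched roughly as below. This is a minimal illustration, not the official solution: the input range of [-1, 1], the circular form used for `target`, and the names `y_clean`, `y_noisy`, and `flip_idx` are all assumptions, since the thread only refers to "the target function given".

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is repeatable

N = 1000
# Each row is (x0, x1, x2) with x0 = 1.0; the [-1, 1] range is an assumption.
X = np.column_stack([np.ones(N), rng.uniform(-1, 1, size=(N, 2))])

def target(X):
    # Hypothetical stand-in for "the target function given" in the problem.
    return np.sign(X[:, 1] ** 2 + X[:, 2] ** 2 - 0.6)

y_clean = target(X)

# Flip the sign of exactly 10% of the labels, chosen without replacement.
# X (including the x0 column) is left untouched.
flip_idx = rng.choice(N, size=N // 10, replace=False)
y_noisy = y_clean.copy()
y_noisy[flip_idx] *= -1
```

Only 100 of the 1000 y values change sign here; none of the 3000 input numbers are modified.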
#3
|
||||
|
||||
![]()
You are correct.
__________________
Where everyone thinks alike, no one thinks very much
#4
Thank you both for that clarification. I believe I am inching closer to understanding noise.
I have some follow-up questions: How is E_in defined? Do you compare the linear regression results (i.e., sign(w'x), where w is obtained by linear regression on the "noisy" y) to the true value of y or to the noisy value from the training data? In the real world, since the target function is unknown, the best one can do is estimate E_in by comparing sign(w'x) to the "noisy" y. But in this instance we actually do have the target function. So if we are asked to compute E_in, should we use the original y? Edit: I tried both and the closest answer didn't change, but I'd still like to understand the correct definition of E_in.
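The two comparisons asked about here can be computed side by side, as in the rough sketch below. It does not settle which one the problem intends. The target function, the input range, and the names `e_in_noisy` and `e_in_clean` are assumptions made for illustration, and `np.linalg.lstsq` is used as one way to get the least-squares weights.

```python
import numpy as np

rng = np.random.default_rng(1)

N = 1000
X = np.column_stack([np.ones(N), rng.uniform(-1, 1, size=(N, 2))])

# Hypothetical stand-in for the target function given in the problem.
y_clean = np.sign(X[:, 1] ** 2 + X[:, 2] ** 2 - 0.6)

# Noisy training labels: 10% of the signs flipped.
y_noisy = y_clean.copy()
y_noisy[rng.choice(N, size=N // 10, replace=False)] *= -1

# Linear regression is fit on the noisy labels, the only ones a learner sees.
w, *_ = np.linalg.lstsq(X, y_noisy, rcond=None)
pred = np.sign(X @ w)

# Fraction of training points misclassified under each comparison.
e_in_noisy = np.mean(pred != y_noisy)  # against the labels used for training
e_in_clean = np.mean(pred != y_clean)  # against the noiseless target values

print(f"E_in vs noisy y: {e_in_noisy:.3f}")
print(f"E_in vs clean y: {e_in_clean:.3f}")
```

Because the two label vectors disagree on exactly 10% of the points, the two error rates can differ by at most 0.1, which may be why the closest multiple-choice answer does not change.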
#5
|
||||
|
||||
Quote:
__________________
When one teaches, two learn.
#6
What about with Eout? Do we also compare with noisy y or with y without noise?
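The same two comparisons can be made out of sample by drawing fresh points, as in the sketch below. It is only an estimate of E_out from a finite test set, under stated assumptions: the helper `make_data`, the circular target, the [-1, 1] input range, and the 10% noise rate on the fresh points are all hypothetical choices, not something the thread specifies.

```python
import numpy as np

rng = np.random.default_rng(2)

def make_data(n, rng, noise=0.1):
    """Hypothetical generator: inputs, clean labels, and labels with flipped signs."""
    X = np.column_stack([np.ones(n), rng.uniform(-1, 1, size=(n, 2))])
    y_clean = np.sign(X[:, 1] ** 2 + X[:, 2] ** 2 - 0.6)  # assumed target
    y_noisy = y_clean.copy()
    flip = rng.choice(n, size=int(round(noise * n)), replace=False)
    y_noisy[flip] *= -1
    return X, y_clean, y_noisy

# Fit on noisy training labels.
X_train, _, y_train = make_data(1000, rng)
w, *_ = np.linalg.lstsq(X_train, y_train, rcond=None)

# Evaluate on fresh points that were not used for fitting.
X_test, y_test_clean, y_test_noisy = make_data(1000, rng)
pred = np.sign(X_test @ w)

e_out_noisy = np.mean(pred != y_test_noisy)  # fresh points, noisy labels
e_out_clean = np.mean(pred != y_test_clean)  # fresh points, noiseless target

print(f"E_out estimate vs noisy y: {e_out_noisy:.3f}")
print(f"E_out estimate vs clean y: {e_out_clean:.3f}")
```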
#7
Here's how I think about
![]() ![]()
The in-sample performance ![]() ![]() ![]() ![]() ![]()
The out-of-sample performance ![]() ![]() ![]() ![]()
Of course in a real situation we won't have ![]() ![]()
#8
ttt
Think this thread will help others as it did me. The "flip" comment confused me as well.