Thread: bias-variance plot on p67
#1
 Steve_Y Junior Member Join Date: May 2017 Posts: 2

Hi Prof. Abu-Mostafa,

As you suggested, I am posting below the question that I emailed you earlier, in case other people have similar questions. However, I couldn't insert/upload images properly here (only a link showed up), so I'll ask a text-only question.

Specifically, Im a little confused about the bias-variance plot at the bottom of page 67. In the plot, the bias appears to be a flat line, i.e. constant, independent of the sample (training set) size, N. I wondered if this is (approx.) true in general, so I did some experiments (simulations). What I found was that while this was indeed approximately true for the linear regression; it didnt appear so true when I used the 1-nearest-neighbor (1-NN) algorithm. (Similar to Example 2.8, I tried to learn a sinusoid.)

More specifically, for linear regression, the average learned hypothesis, i.e. "g bar", stays almost unchanged when the size of the training set (N) increases from 4 to 10 in my simulation. Even for N=2, "g bar" doesn't deviate much.

However, for the 1-NN algorithm, "g bar" changes considerably as N grows from 2 to 4, and then to 10. This seems reasonable to me, though, because as N increases, the distance between a test point (x) and its nearest neighbor decreases with high probability. So it's natural to expect "g bar" to converge to the sinusoid, and the bias to decrease, as N increases.
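
To make the "with high probability" part a bit more concrete, here is a rough sketch of my own (not from the book). It assumes the N training points are i.i.d. uniform on [0,1] and that the test point x is at least t away from both ends of the interval. Each training point then falls within distance t of x with probability 2t, so

\[
\Pr\Big[\min_{1 \le i \le N} |x_i - x| > t\Big] = (1 - 2t)^N, \qquad 0 \le t \le \min(x,\, 1 - x),
\]

which goes to 0 as N grows for any fixed t > 0. In other words, the nearest neighbor gets arbitrarily close to x, and for a continuous target the 1-NN prediction at x should approach f(x).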

Here's the simulated average (squared) bias when N was 2, 4, and 8:
OLS: 0.205, 0.199, 0.198
1NN: 0.184, 0.052, 0.013
where OLS stands for ordinary least squares linear regression.

Do these results and interpretations look correct to you, or am I mistaken somewhere? I'd greatly appreciate it if you could clarify this a little more for me. Thanks a lot!

BTW, in my simulation, each training set of size N is sampled independently and uniformly from the [0,1] interval. I then averaged the learned hypotheses from 5000 training sets to obtain each "g bar".
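
In case anyone wants to try something similar, here is a minimal sketch of this kind of simulation. It is not my exact script. It assumes the target is f(x) = sin(2*pi*x) on [0,1] with noiseless labels, and it estimates the squared bias as the average of (g_bar(x) - f(x))^2 over an evenly spaced test grid. The function names (target, fit_ols, fit_1nn, squared_bias), the grid size, and the seed are all just illustrative choices.

```python
import numpy as np


def target(x):
    # Assumed target: one full period of a sinusoid on [0, 1].
    return np.sin(2 * np.pi * x)


def fit_ols(x_train, y_train, x_test):
    # Least-squares fit of the line y = a*x + b, then predict on x_test.
    design = np.column_stack([x_train, np.ones_like(x_train)])
    coef, *_ = np.linalg.lstsq(design, y_train, rcond=None)
    return coef[0] * x_test + coef[1]


def fit_1nn(x_train, y_train, x_test):
    # Predict the label of the closest training point for each test point.
    nearest = np.abs(x_test[:, None] - x_train[None, :]).argmin(axis=1)
    return y_train[nearest]


def squared_bias(fit, n, n_sets=5000, n_test=201, seed=0):
    # Average the learned hypotheses over n_sets training sets of size n
    # to get "g bar", then compare it with the target on a test grid.
    rng = np.random.default_rng(seed)
    x_test = np.linspace(0.0, 1.0, n_test)
    g_sum = np.zeros(n_test)
    for _ in range(n_sets):
        x_train = rng.uniform(0.0, 1.0, size=n)
        g_sum += fit(x_train, target(x_train), x_test)
    g_bar = g_sum / n_sets
    return np.mean((g_bar - target(x_test)) ** 2)


if __name__ == "__main__":
    for n in (2, 4, 8):
        print(n, squared_bias(fit_ols, n), squared_bias(fit_1nn, n))
```

Since "g bar" is itself an average over a finite number of training sets, the estimate carries some Monte Carlo noise, so the numbers will shift a bit with the seed and with n_sets.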

Best regards,
Steve