bias-variance plot on p67
Hi Prof. Abu-Mostafa,
As you suggested, I'm posting below the question that I emailed you earlier, in case other people have similar questions. (I couldn't seem to insert/upload images properly here; they showed up only as links, so I'll make this a text-only question.)

Specifically, I'm a little confused about the bias-variance plot at the bottom of page 67. In the plot, the bias appears to be a flat line, i.e. constant, independent of the sample (training set) size N. I wondered whether this is (approximately) true in general, so I ran some simulations. What I found was that while this was indeed approximately true for linear regression, it did not appear to be true when I used the 1-nearest-neighbor (1-NN) algorithm. (Similar to Example 2.8, I tried to learn a sinusoid.)

More specifically, for linear regression the averaged learned hypothesis, i.e. "g bar", stays almost unchanged when the training-set size N increases from 4 to 10 in my simulation; even for N = 2, "g bar" doesn't deviate too much. For the 1-NN algorithm, however, "g bar" changes considerably as N grows from 2 to 4 and then to 10. This seems reasonable to me, because as N increases, the distance between a test point x and its nearest neighbor decreases with high probability, so it is natural to expect "g bar" to converge to the sinusoid and the bias to decrease as N increases.

Here is the simulated average (squared) bias when N was 2, 4, and 8:

OLS: 0.205, 0.199, 0.198
1NN: 0.184, 0.052, 0.013

where OLS stands for ordinary least squares linear regression.

Do these results and interpretations look correct to you, or am I mistaken somewhere? I'd greatly appreciate it if you could clarify this a little more for me. Thanks a lot!

BTW, in my simulation the training set of size N is sampled independently and uniformly on the [0, 1] interval. I then averaged the learned hypotheses from 5000 training sets to obtain each "g bar".

Best regards,
Steve
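For concreteness, here is a minimal sketch of the kind of simulation described above (illustrative only, not my exact code; in particular, the specific sinusoid sin(2*pi*x) and the function names are assumptions). The squared bias is estimated as E_x[(g_bar(x) - f(x))^2], where g_bar is the pointwise average of the hypotheses learned from 5000 independent training sets:

```python
import numpy as np

# Illustrative sketch of the bias simulation described above (assumed
# target: f(x) = sin(2*pi*x) on [0, 1]; names here are placeholders).
f = lambda x: np.sin(2 * np.pi * x)

rng = np.random.default_rng(0)
x_grid = np.linspace(0.0, 1.0, 1000)   # test points used to average over x
n_datasets = 5000                      # number of training sets averaged into g_bar


def ols(x_train, y_train, x_test):
    # Ordinary least squares fit of a line y = w0 + w1*x.
    X = np.column_stack([np.ones_like(x_train), x_train])
    w, *_ = np.linalg.lstsq(X, y_train, rcond=None)
    return w[0] + w[1] * x_test


def one_nn(x_train, y_train, x_test):
    # 1-nearest-neighbor: predict the target value of the closest training point.
    idx = np.abs(x_test[:, None] - x_train[None, :]).argmin(axis=1)
    return y_train[idx]


def squared_bias(N, learner):
    # Average the learned hypotheses over many training sets to get g_bar,
    # then estimate bias = E_x[(g_bar(x) - f(x))^2] on the test grid.
    g_sum = np.zeros_like(x_grid)
    for _ in range(n_datasets):
        x_train = rng.uniform(0.0, 1.0, N)
        y_train = f(x_train)            # noiseless targets, as in my simulation
        g_sum += learner(x_train, y_train, x_grid)
    g_bar = g_sum / n_datasets
    return np.mean((g_bar - f(x_grid)) ** 2)


for N in (2, 4, 8):
    print(f"N={N}  OLS bias: {squared_bias(N, ols):.3f}  "
          f"1NN bias: {squared_bias(N, one_nn):.3f}")
```

The loop at the end prints the estimated squared bias for both learners at N = 2, 4, and 8.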
Re: bias-variance plot on p67
Quote:
[quoted reply not preserved]

Just one more question: in the quote above, when you said "there's some best [inline math not preserved]" ...
Re: bias-variance plot on p67
[This reply consisted of quoted text and inline math images that were not preserved.]