
#1




Exercise 4.10
Could anyone please give me a clue about part (c)? It seems rather counterintuitive to me.
Thanks a lot! 
#2




Re: Exercise 4.10
I think this could be a possible explanation for Exercise 4.10(c):
When K = 1, the validation error is not 'that' good an estimate of the out-of-sample error, because the penalty term in the validation bound (which grows like O(sqrt(ln M / K))) is large. Thus, the model chosen from this poor estimate might not be the 'best' one. This explains E[E_out(g^-_{m*})] < E[E_out(g_{m*})]. The situation improves somewhat as K increases. Please let me know if this explanation is not correct. 
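To see concretely why a single validation point gives such a noisy estimate, here is a small Monte Carlo sketch (my own illustration, not from the book): for a fixed hypothesis with true out-of-sample error p, the validation error on K points is the mean of K Bernoulli(p) outcomes, so its standard deviation shrinks like 1/sqrt(K). The values of p, K, and the trial count below are arbitrary choices for illustration.

```python
# Monte Carlo check: spread of the validation error around the true
# out-of-sample error p shrinks like 1/sqrt(K).
import math
import random

random.seed(0)
p = 0.3          # assumed true out-of-sample error of a fixed hypothesis
trials = 20000   # number of simulated validation sets per K

for K in (1, 10, 100):
    # validation error = fraction of the K validation points misclassified
    errs = [sum(random.random() < p for _ in range(K)) / K for _ in range(trials)]
    mean = sum(errs) / trials
    std = math.sqrt(sum((e - mean) ** 2 for e in errs) / trials)
    print(f"K={K:4d}  mean E_val={mean:.3f}  empirical std={std:.3f}  "
          f"theoretical std={math.sqrt(p * (1 - p) / K):.3f}")
```

With K = 1 the validation error is either 0 or 1, so ranking several models by it is close to a coin flip; by K = 100 its standard deviation is about ten times smaller.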
#3




Re: Exercise 4.10
I also wanted to validate my explanation of the other parts of this exercise.
For part (b), this is what I think: as K increases, the validation error becomes a better estimate of the out-of-sample error. That explains the initial decrease in E[E_out(g_{m*})]. Then, as K increases beyond the 'optimal' value, training degrades (fewer points are left for training), which explains the rise. Please let me know whether my understanding is correct. For part (a), I can't figure out the initial decrease in E[E_out(g^-_{m*})]. Any clue on this would be great. Thanks, Sayan 
#4




Re: Exercise 4.10
Well, I'm not sure about my understanding, but here is my guess (if it is not correct, please tell me, especially for (c)):

(a) Because g^-_{m*} is the hypothesis with the smallest E_val among the M hypotheses g^-_1, ..., g^-_M, and we already know that E_val is close to E_out for small M and large K, hence the initial decrease. As we set aside more data for validation, we use less data for training, which leads to worse hypotheses, hence the subsequent increase.

(b) The reason for the initial decrease is already discussed above. A note here: initially E_out(g_{m*}) is very close to E_out(g^-_{m*}), because the size of the training set used for outputting g_{m*} (all N points) is very close to the size of the training set used for outputting g^-_{m*} (N - K points). Then it takes a rather long while for E[E_out(g_{m*})] to increase again despite the worsening hypotheses, because those worse and worse hypotheses g^-_m still lead us to a good enough choice of learning model m*, until they get so bad that they finally lead us to a worse choice of learning model.

(c) A possible case is that when K is small, g_{m*} and g^-_{m*} have almost the same training-set size and hence almost the same chance of being a good final hypothesis; however, g^-_{m*} has the guarantee of small E_out through small E_val, while g_{m*} does not have this guarantee. However, as K increases, g^-_{m*} is trained on less and less data compared with g_{m*}, so g^-_{m*}'s performance cannot compete with g_{m*}'s anymore. Thank you. 

