Thread: Exercise 4.10
  #4  
05-04-2016, 11:26 AM
ntvy95
Re: Exercise 4.10

Well, I'm not sure about my understanding, but here is my guess (if any of it is wrong, please tell me, especially for (c)):

(a) g^{-}_{m^{*}} is the hypothesis with the smallest validation error E_{val} among the M hypotheses g^{-}_{1}, ..., g^{-}_{M}, and we already know that E_{val}(g^{-}_{m^{*}}) is close to E_{out}(g^{-}_{m^{*}}) when M is small and K is large, so at first a larger K makes the selection more reliable, hence the initial decrease. As we set aside more data for validation, we have less data for training, which makes the M hypotheses themselves worse, hence the later increase.
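To be concrete about the kind of bound I am relying on here (standard Hoeffding plus a union bound over the M finalists, with loss in [0, 1] and confidence parameter \delta; this is my paraphrase, not a quote from the book): with probability at least 1 - \delta,

E_{out}(g^{-}_{m^{*}}) \leq E_{val}(g^{-}_{m^{*}}) + \sqrt{\frac{1}{2K} \ln \frac{2M}{\delta}},

so the penalty term shrinks as K grows and only grows logarithmically with M, which is why small M and large K make the validation estimate trustworthy.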

(b) The reason for the initial decrease is the same as above. Note that initially \mathbb{E}[E_{out}(g^{-}_{m^{*}})] is very close to \mathbb{E}[E_{out}(g_{m^{*}})], because the size N - K of the training set used to output g^{-}_{m^{*}} is very close to the size N of the training set used to output g_{m^{*}}. After that, \mathbb{E}[E_{out}(g_{m^{*}})] takes much longer to increase despite the worsening M hypotheses, because those increasingly poor g^{-}_{m} still lead us to a good enough choice of model; only when they get so bad that they finally lead us to the wrong choice of model does \mathbb{E}[E_{out}(g_{m^{*}})] go up.
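One way I think about why the g_{m^{*}} curve stays flat for so long (my own restatement of the setup, not a quote from the book): the selection step and the final training step are separate,

m^{*} = \arg\min_{1 \leq m \leq M} E_{val}(g^{-}_{m}), \qquad g_{m^{*}} = \text{model } m^{*} \text{ retrained on all } N \text{ points},

so K influences \mathbb{E}[E_{out}(g_{m^{*}})] only through which m^{*} gets picked; as long as the noisier g^{-}_{m} still point to a good model, the curve barely moves.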

(c) A possible case is K = 1: then g^{-}_{m^{*}} and g_{m^{*}} are trained on almost the same amount of data (N - 1 versus N points), so they have almost the same chance of being a good final hypothesis; however, g^{-}_{m^{*}} comes with the guarantee of small \mathbb{E}[E_{out}(g^{-}_{m^{*}})] through its small E_{val}(g^{-}_{m^{*}}), while g_{m^{*}} has no such guarantee. As K increases, though, g^{-}_{m^{*}} is trained on less and less data compared to g_{m^{*}}, so g^{-}_{m^{*}}'s performance can no longer compete with g_{m^{*}}'s.
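If it helps, here is a minimal simulation sketch of the two curves under a toy setup I made up myself (1-D noisy sinusoid, M = 5 polynomial degrees as the candidate models, squared error, selection by E_{val}); none of the specific numbers come from the book, and whether the K = 1 behaviour from (c) shows up will depend on this setup:

[CODE]
import numpy as np

rng = np.random.default_rng(0)
N, M, N_TEST, TRIALS = 30, 5, 1000, 200   # toy sizes, all assumptions
NOISE = 0.2

def target(x):
    return np.sin(2 * np.pi * x)          # assumed target function

def err(coef, x, y):
    return np.mean((np.polyval(coef, x) - y) ** 2)   # squared error

x_test = rng.uniform(-1, 1, N_TEST)
y_test = target(x_test) + NOISE * rng.standard_normal(N_TEST)

for K in (1, 5, 10, 15, 20):
    e_minus, e_full = [], []
    for _ in range(TRIALS):
        x = rng.uniform(-1, 1, N)
        y = target(x) + NOISE * rng.standard_normal(N)
        x_tr, y_tr = x[:N - K], y[:N - K]       # training part, size N - K
        x_val, y_val = x[N - K:], y[N - K:]     # validation part, size K
        # train the M candidate models (degrees 1..M) on the N - K points
        models = [np.polyfit(x_tr, y_tr, d) for d in range(1, M + 1)]
        # pick m* by validation error
        m_star = int(np.argmin([err(c, x_val, y_val) for c in models]))
        # E_out(g^-_{m*}): selected hypothesis, trained on N - K points
        e_minus.append(err(models[m_star], x_test, y_test))
        # E_out(g_{m*}): same model, retrained on all N points
        e_full.append(err(np.polyfit(x, y, m_star + 1), x_test, y_test))
    print(f"K={K:2d}  E_out(g^-_m*)={np.mean(e_minus):.3f}  "
          f"E_out(g_m*)={np.mean(e_full):.3f}")
[/CODE]

Averaging over many datasets like this should at least reproduce the qualitative shapes discussed in (a) and (b).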

Thank you.