Re: Exercise 1.11
If the probability p of +1 is in fact less than 0.5, then h_C does better than h_S out of the training data, since it predicts the more probable -1 for every point. But this requires p < 0.5 while all 25 training points show y=+1, i.e. the sample frequency is v = 1, so |v-p| > 0.5. By the Hoeffding inequality, the probability of that is P(|v-p|>0.5) <= 2exp(-2*0.5**2 * 25) = 7.45e-06
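For anyone who wants to check the number, here is a minimal Python sketch of that bound. The function name hoeffding_bound is just an illustrative choice, and the last line is an extra comparison that assumes the 25 points are drawn independently, in which case the chance of seeing all +1 when p < 0.5 is p**25 < 0.5**25.

```python
import math

def hoeffding_bound(eps, n):
    """Two-sided Hoeffding bound: P(|v - p| > eps) <= 2 * exp(-2 * eps^2 * n)."""
    return 2.0 * math.exp(-2.0 * eps ** 2 * n)

# eps = 0.5 and N = 25, as in the post above
bound = hoeffding_bound(0.5, 25)
print(f"Hoeffding bound: {bound:.2e}")  # prints 7.45e-06

# Assumption: independent draws. Then P(all 25 are +1) = p**25 < 0.5**25 when p < 0.5.
direct = 0.5 ** 25
print(f"0.5**25 = {direct:.2e}")  # prints 2.98e-08
```

The direct value is smaller than the Hoeffding figure, as expected, since Hoeffding is a general bound that does not use the specific event v = 1.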