Regarding part d) of this problem, I think the answer is "no". Is that correct? Isn't the real question whether or not sampled data has any value for inferring a parameter? Did I totally miss the point? I got the first three parts.
If the probability of +1 is in fact less than 0.5, then h_C does better than h_S out of training data (as it will predict the more probable -1 for each point). But the probability of this happening (p being less than 0.5 with all 25 training data points showing y=+1, so the in-sample frequency is ν = 1 and |ν - p| > 0.5) is bounded by Hoeffding: P(|ν - p| > 0.5) <= 2exp(-2 * 0.5**2 * 25) = 7.45e-06
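For anyone who wants to check the arithmetic, here is a minimal sketch in Python. The function names are my own, and the second function is an extra sanity check that is not part of the argument above: it assumes the 25 points are independent with P(y=+1) = p, so the chance of seeing all +1 is p**25, which is largest at the boundary p = 0.5.

```python
import math


def hoeffding_bound(eps, n):
    """Two-sided Hoeffding bound: 2 * exp(-2 * eps^2 * n)."""
    return 2.0 * math.exp(-2.0 * eps ** 2 * n)


def prob_all_plus(p, n):
    """Exact probability that n independent points all show y=+1
    when P(y=+1) = p (independence is an assumption here)."""
    return p ** n


N = 25
bound = hoeffding_bound(0.5, N)        # eps = 0.5, N = 25 as in the post
exact_at_half = prob_all_plus(0.5, N)  # boundary case p = 0.5

print(f"Hoeffding bound:      {bound:.3g}")          # 7.45e-06
print(f"Exact p**25 at p=0.5: {exact_at_half:.3g}")  # 2.98e-08
```

The exact value at p = 0.5 is well under the Hoeffding number, which is expected since Hoeffding is only an upper bound. Either way the conclusion is the same: seeing 25 out of 25 points with y=+1 when p is below 0.5 is extremely unlikely.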

