Exercise 4.7
I feel like I'm overthinking Exercise 4.7 (b) and I am hoping for a little bit of insight.
My gut instinct says that [equation missing]. I arrived at this idea by considering that the probability is similar to the standard deviation, which is the square root of the variance, so since [equation missing] and [equation missing], does [equation missing]?

Then for part (c) of the exercise, assuming that the above is true, I used the notion that the probability of error is at most 0.5, because if the probability of error were greater than 0.5 then the learned g would just flip its classification. Therefore this shows that for any learned g in a classification problem, [equation missing], and therefore: [equation missing]

Any indication as to whether I'm working along the correct lines would be appreciated!
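A quick numerical sanity check may help with part (b). This is only a sketch and is not taken from the book or from this thread. It assumes that each of the K validation points is misclassified independently with the same probability p, where p is a hypothetical stand-in for the probability that the learned g gets a point wrong. Under that assumption the 0/1 error on a single point is a Bernoulli variable with variance p(1 - p), and averaging K independent points divides the variance by K. The function names below are made up for illustration.

```python
import random


def val_error_variance(p, K, trials=20000, seed=0):
    """Empirical variance of the validation error over many validation sets.

    Assumption: each of the K validation points is misclassified
    independently with probability p (hypothetical stand-in for the
    error probability of the learned g).
    """
    rng = random.Random(seed)
    evals = []
    for _ in range(trials):
        # Count 0/1 errors on one simulated validation set of size K.
        errors = sum(1 for _ in range(K) if rng.random() < p)
        evals.append(errors / K)
    mean = sum(evals) / trials
    return sum((e - mean) ** 2 for e in evals) / trials


def predicted_variance(p, K):
    # Variance of one 0/1 error is p(1 - p); the mean of K
    # independent points has that variance divided by K.
    return p * (1 - p) / K


if __name__ == "__main__":
    K = 25
    for p in (0.1, 0.3, 0.5, 0.8):
        # Columns: p, simulated variance, p(1-p)/K, and the bound 1/(4K).
        print(p, val_error_variance(p, K), predicted_variance(p, K), 1 / (4 * K))
```

If the simulation is to be trusted, the variance tracks p(1 - p)/K rather than p squared over K, and since p(1 - p) peaks at 1/4 when p = 0.5, the bound 1/(4K) would hold for every p, including p above 0.5, without needing the flipping argument.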

Re: Exercise 4.7
I'm not sure whether I can reinterpret Figure 4.8 like this: if you train on your data with one horrible hypothesis, will you get a very bad generalization bound even though the number of data points is large?

