I feel like I'm overthinking Exercise 4.7 (b) and I am hoping for a little bit of insight.

My gut instinct says that

I arrived at this idea by reasoning that the probability is analogous to the standard deviation, which is the square root of the variance. So, since:

and

does

???

Then for part (c) of the exercise, assuming that the above is true, I used the notion that
$P[g(x) \neq y] \le 0.5$

because, if the probability of error were greater than 0.5, the learned g could simply flip its classification. This shows that for any

in a classification problem,

and therefore:

Any indication as to whether I'm working along the correct lines would be appreciated!
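
In case it helps anyone checking the same exercise, here is a rough simulation sketch for testing a candidate answer to part (b) numerically. It assumes each validation point is misclassified independently with a fixed probability `p_error`, so the point-wise 0/1 error is a Bernoulli variable with variance `p_error * (1 - p_error)`. The names `simulate_val_error_variance`, `p_error`, `K` and `trials` are my own placeholders, not notation from the exercise.

```python
import random
import statistics


def simulate_val_error_variance(p_error, K, trials=20000, seed=0):
    """Empirical variance of the validation error over many validation sets.

    Assumption (mine, for illustration): each of the K validation points is
    misclassified independently with probability p_error.
    """
    rng = random.Random(seed)
    e_vals = []
    for _ in range(trials):
        # The validation error is the fraction of the K points misclassified.
        mistakes = sum(rng.random() < p_error for _ in range(K))
        e_vals.append(mistakes / K)
    return statistics.pvariance(e_vals)


if __name__ == "__main__":
    K = 25
    for p in (0.1, 0.3, 0.5):
        empirical = simulate_val_error_variance(p, K)
        candidate = p * (1 - p) / K  # Bernoulli variance divided by K
        bound = 1 / (4 * K)          # bound stated in part (c)
        print(f"p={p}: empirical={empirical:.5f}, "
              f"p(1-p)/K={candidate:.5f}, 1/(4K)={bound:.5f}")
```

Comparing the empirical column against any proposed formula (for example one involving the square of the probability) should show fairly quickly whether the formula is plausible, although a simulation like this is only a sanity check and not a proof.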