Well, I'm not sure about my understanding, but here is my guess (please correct me if any of these are wrong, especially (c)).
(a) $g^{-}_{m^{*}}$ is the hypothesis with the smallest validation error $E_{val}$ among the $M$ hypotheses $g^{-}_{1}, \dots, g^{-}_{M}$, and we already know that $E_{val}(g^{-}_{m^{*}})$ is close to $E_{out}(g^{-}_{m^{*}})$ when $M$ is small and $K$ is large, so increasing $K$ at first makes the selection more reliable; hence the initial decrease. As we set aside more data for validation, less data is left for training, which leads to worse hypotheses $g^{-}_{m}$; hence the later increase.
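For reference, I believe (please double-check the exact form in the book) the "$E_{val}$ is close to $E_{out}$ for small $M$ and large $K$" step comes from Hoeffding's inequality plus a union bound over the $M$ finalists, something like

$$E_{out}(g^{-}_{m^{*}}) \;\le\; E_{val}(g^{-}_{m^{*}}) + O\!\left(\sqrt{\frac{\ln M}{K}}\right),$$

so the gap shrinks as $K$ grows and only grows logarithmically in $M$.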
(b) The reason for the initial decrease is the same as above. One thing to note is that initially $\mathbb{E}[E_{out}(g^{-}_{m^{*}})]$ is very close to $\mathbb{E}[E_{out}(g_{m^{*}})]$, because the size $N-K$ of the training set used to output $g^{-}_{m^{*}}$ is very close to the size $N$ of the training set used to output $g_{m^{*}}$. It then takes considerably longer for $\mathbb{E}[E_{out}(g_{m^{*}})]$ to increase again despite the worsening hypotheses $g^{-}_{m}$, because those increasingly poor hypotheses still lead us to a good enough choice of learning model $m^{*}$, until they become so poor that they finally lead us to a worse choice of model.
(c) A possible explanation: when $K$ is small, $g^{-}_{m^{*}}$ and $g_{m^{*}}$ are trained on almost the same amount of data and hence have almost the same chance of being a good final hypothesis, but $g^{-}_{m^{*}}$ comes with a guarantee of small $\mathbb{E}[E_{out}(g^{-}_{m^{*}})]$ through its small $E_{val}(g^{-}_{m^{*}})$, while $g_{m^{*}}$ has no such guarantee. As $K$ increases, however, $g^{-}_{m^{*}}$ is trained on less and less data compared to $g_{m^{*}}$, so $g^{-}_{m^{*}}$'s performance can no longer compete with $g_{m^{*}}$'s.
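Finally, to check my own understanding of the procedure (this is not the book's experiment: the synthetic target, the polynomial candidate models, and every parameter value are my own assumptions; only the select-by-validation-then-retrain logic follows the setup), here is a small simulation sketch:

```python
import numpy as np

rng = np.random.default_rng(0)

def target(x):
    # Hypothetical noiseless target; noise is added by the caller.
    return np.sin(2 * np.pi * x)

def fit_poly(x, y, degree):
    # Least-squares polynomial fit = one candidate model H_m.
    return np.polyfit(x, y, degree)

def sq_err(coeffs, x, y):
    # Mean squared error of a fitted polynomial on (x, y).
    return np.mean((np.polyval(coeffs, x) - y) ** 2)

def one_run(N=40, K=10, degrees=(1, 2, 3, 5), noise=0.2, n_test=2000):
    # Draw a data set of size N and a large test set to estimate E_out.
    x = rng.uniform(0, 1, N)
    y = target(x) + noise * rng.standard_normal(N)
    x_test = rng.uniform(0, 1, n_test)
    y_test = target(x_test) + noise * rng.standard_normal(n_test)

    # Split: N - K points for training, K points for validation.
    x_tr, y_tr = x[:N - K], y[:N - K]
    x_val, y_val = x[N - K:], y[N - K:]

    # Train the M candidate models on the N - K training points (the g_m^-).
    g_minus = [fit_poly(x_tr, y_tr, d) for d in degrees]

    # m* = index of the model with the smallest validation error.
    m_star = int(np.argmin([sq_err(g, x_val, y_val) for g in g_minus]))

    # g^-_{m*}: keep the hypothesis trained on the N - K points.
    e_out_minus = sq_err(g_minus[m_star], x_test, y_test)
    # g_{m*}: retrain the selected model on all N points.
    e_out_full = sq_err(fit_poly(x, y, degrees[m_star]), x_test, y_test)
    return e_out_minus, e_out_full

# Average E_out over many runs for each validation-set size K, as in the figure.
for K in (5, 10, 20, 30):
    runs = np.array([one_run(K=K) for _ in range(200)])
    print(f"K={K:2d}  E[E_out(g-_m*)] = {runs[:, 0].mean():.3f}  "
          f"E[E_out(g_m*)] = {runs[:, 1].mean():.3f}")
```

If the reasoning above is right, the two printed averages should start out close for small $K$ and then separate as $K$ grows.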
Thank you.