Quote:
Originally Posted by hashable
In the book and the lecture, it is said that generally a larger hypothesis set (more complex model) has lower bias and higher variance. This is intuitively explained by the pictures on page 64. Bias is shown as distance of ḡ (gbar) from the target function f. Variance is illustrated with a shaded region around the target function f.
My question is: It appears from the picture that it should be possible to increase the hypothesis space in a way so that it does not include the target function f. E.g. If we include hypotheses in the direction "further away" from the target function f, then we may have managed to still keep the bias high (or even increase the bias).
From this line of reasoning, it appears that adding complexity to a model or a larger hypothesis space does not necessarily imply a decrease in bias (and/or an increase in variance). The decrease in bias occurs only when the hypothesis space grows in a way so that ḡ ends up being closer to f. But this need not happen always (theoretically at least).
Is this conclusion correct? If yes, then should it be kept in mind when applying these concepts in practice? Also if this is correct, then could you give some examples that can illustrate how adding complexity can still increase the bias.

Your conclusion is correct. Since you are referring to the figure that illustrates an idealized version of bias (how far the best hypothesis is, rather than how far the average learned hypothesis is), I will answer your question in terms of that idealized bias.
It is indeed possible to construct cases where you enlarge the hypothesis set without decreasing the bias. In practice, since the target function is unknown, chances are that enlarging the hypothesis set will bring in some hypotheses that are closer to the target, so the bias goes down.
Now, if you want the bias to actually increase as a result of increasing the complexity of the hypothesis set, that cannot be done by enlarging the set, but only by going for an alternative hypothesis set that is more complex and also further away from the target.
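To spell out why enlarging alone cannot raise the idealized bias, here is a short sketch in notation introduced just for this post (b(H) is not the book's symbol, and I assume the minimum is attained). Write the idealized bias of a hypothesis set H as the error of its best hypothesis:

```latex
b(\mathcal{H}) \;=\; \min_{h \in \mathcal{H}} \; \mathbb{E}_x\!\left[\big(h(x) - f(x)\big)^2\right]

\mathcal{H} \subseteq \mathcal{H}'
\;\;\Longrightarrow\;\;
b(\mathcal{H}') \;\le\; b(\mathcal{H})
```

The minimum over a larger set can only stay the same or go down, because every hypothesis in H is still available in H'. Note that this argument applies to the idealized bias only, not to the bias defined through the average learned hypothesis ḡ.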
For a specific construction, take the more complex hypothesis set to be the set of all hypotheses excluding those that happen to be close to the target function. This will be a very complex hypothesis set but still with a large bias, by construction. Of course, that is a deliberately unfriendly hypothesis set that does not happen in practice.
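A small numerical sketch may make the three cases concrete. Everything in it is an assumption chosen for illustration: the target sin(pi x), the one-parameter family h(x) = a*x, the particular slope lists, and the names idealized_bias, H_small, H_away, H_toward and H_alt. It computes the idealized bias (error of the best hypothesis in the set), not the bias based on ḡ.

```python
import math

# Hypothetical target function, chosen only for illustration.
def target(x):
    return math.sin(math.pi * x)

# Evenly spaced input points on [-1, 1] used to approximate the expectation over x.
XS = [-1 + 2 * i / 200 for i in range(201)]

def distance_to_target(slope):
    """Mean squared difference between h(x) = slope * x and the target."""
    return sum((slope * x - target(x)) ** 2 for x in XS) / len(XS)

def idealized_bias(slopes):
    """Error of the best hypothesis in the set (each hypothesis is a slope)."""
    return min(distance_to_target(a) for a in slopes)

# A small hypothesis set.
H_small = [-0.5, 0.0, 0.5]
# Enlarged with hypotheses "further away" from the target: bias is unchanged.
H_away = H_small + [-3.0, -2.0, -1.5]
# Enlarged with a hypothesis closer to the target: bias goes down.
H_toward = H_small + [1.0]
# An alternative (not nested) set: many more hypotheses, all far from the target.
H_alt = [-10 + 0.1 * k for k in range(91)]  # slopes from -10 up to about -1

for name, H in [("H_small", H_small), ("H_away", H_away),
                ("H_toward", H_toward), ("H_alt", H_alt)]:
    print(f"{name:9s} size={len(H):3d} idealized bias={idealized_bias(H):.4f}")
```

Here H_away and H_toward both contain H_small, so neither can have a larger idealized bias than H_small. Only H_alt, which replaces the set rather than enlarging it, ends up both bigger and further from the target.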