LFD Book Forum Question on Bias Variance Tradeoff

#1
08-09-2012, 12:37 AM
 hashable Junior Member Join Date: Jul 2012 Posts: 8

In the book and the lecture, it is said that generally a larger hypothesis set (more complex model) has lower bias and higher variance. This is intuitively explained by the pictures on page 64. Bias is shown as the distance of ḡ (g-bar) from the target function f. Variance is illustrated with a shaded region around the target function f.

My question is: It appears from the picture that it should be possible to increase the hypothesis space in a way so that it does not include the target function f. E.g. If we include hypotheses in the direction "further away" from the target function f, then we may have managed to still keep the bias high (or even increase the bias).

From this line of reasoning, it appears that adding complexity to a model or a larger hypothesis space does not necessarily imply a decrease in bias (and/or an increase in variance). The decrease in bias occurs only when the hypothesis space grows in a way so that ḡ ends up being closer to f. But this need not always happen (theoretically at least).

Is this conclusion correct? If yes, then should it be kept in mind when applying these concepts in practice? Also if this is correct, then could you give some examples that can illustrate how adding complexity can still increase the bias.
#2
08-09-2012, 01:20 AM
 yaser Caltech Join Date: Aug 2009 Location: Pasadena, California, USA Posts: 1,477
Re: Question on Bias Variance Tradeoff

Quote:
 Originally Posted by hashable In the book and the lecture, it is said that generally a larger hypothesis set (more complex model) has lower bias and higher variance. This is intuitively explained by the pictures on page 64. Bias is shown as distance of ḡ (g-bar) from the target function f. Variance is illustrated with a shaded region around the target function f. My question is: It appears from the picture that it should be possible to increase the hypothesis space in a way so that it does not include the target function f. E.g. If we include hypotheses in the direction "further away" from the target function f, then we may have managed to still keep the bias high (or even increase the bias). From this line of reasoning, it appears that adding complexity to a model or a larger hypothesis space does not necessarily imply a decrease in bias (and/or an increase in variance). The decrease in bias occurs only when the hypothesis space grows in a way so that ḡ ends up being closer to f. But this need not happen always (theoretically at least). Is this conclusion correct? If yes, then should it be kept in mind when applying these concepts in practice? Also if this is correct, then could you give some examples that can illustrate how adding complexity can still increase the bias.
Your conclusion is correct. Since you are referring to the figure that illustrates an idealized version of bias (how far the best hypothesis is, rather than how far the average learned hypothesis is), I will answer your question in terms of that idealized bias.

It is indeed possible to construct cases where you enlarge the hypothesis set without decreasing the bias. In practice, since the target function is unknown, chances are that enlarging the hypothesis set will include some hypotheses that are closer to the target, so the bias goes down.

Now, if you want the bias to actually increase as a result of increasing the complexity of the hypothesis set, that cannot be done by enlarging the set, since the best hypothesis of the original set is still available in the enlarged one. It can only be done by going for an alternative hypothesis set that is more complex and also further away from the target.

For a specific construction, take the more complex hypothesis set to be the set of all hypotheses excluding those that happen to be close to the target function. This will be a very complex hypothesis set, but still with a large bias, by construction. Of course, that's a deliberately unfriendly hypothesis set that does not happen in practice.
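The two cases above can be checked numerically with a small sketch. Everything in it is an assumption made for illustration: the target sin(πx), the finite grids of constants and lines, and the shifted cubic family `far_cubics`, which is only a stand-in for a "more complex but further away" hypothesis set and not the exact construction described above. The idealized bias is taken to be the smallest mean squared error that any hypothesis in the set achieves against the target.

```python
import numpy as np

# Evaluation grid and an assumed target function (illustrative only).
x = np.linspace(-1.0, 1.0, 201)
f = np.sin(np.pi * x)

def idealized_bias(hypotheses, target):
    """Smallest mean squared error achieved by any hypothesis in the set."""
    return min(float(np.mean((h - target) ** 2)) for h in hypotheses)

# Simple set: constant hypotheses h(x) = b.
constants = [np.full_like(x, b) for b in np.linspace(-1.0, 1.0, 21)]

# Enlarged set: the same constants plus lines through the origin h(x) = a*x.
lines = [a * x for a in np.linspace(-2.0, 2.0, 41)]
enlarged = constants + lines

# Alternative set (hypothetical): cubics shifted far above the target.
# It has more free parameters than the constants but sits away from f.
far_cubics = [
    3.0 + a * x + c * x**3
    for a in np.linspace(-1.0, 1.0, 11)
    for c in np.linspace(-1.0, 1.0, 11)
]

bias_constants = idealized_bias(constants, f)
bias_enlarged = idealized_bias(enlarged, f)
bias_far = idealized_bias(far_cubics, f)

print(f"constants:  {bias_constants:.3f}")
print(f"enlarged:   {bias_enlarged:.3f}")
print(f"far cubics: {bias_far:.3f}")
```

Because `enlarged` contains every hypothesis in `constants`, its idealized bias can never be larger. The shifted family is richer than the constants, yet its idealized bias is larger because none of its members comes near the target.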
__________________
Where everyone thinks alike, no one thinks very much
#3
08-09-2012, 01:48 AM
 hashable Junior Member Join Date: Jul 2012 Posts: 8
Re: Question on Bias Variance Tradeoff

Thanks for the quick reply. I have some follow up questions in regards to variance.

How does increasing bias artificially in this way (by choosing a pathological hypothesis space) affect the variance?

Variance appears to depend only on ḡ and to be independent of f. Perhaps it could be considered to depend indirectly on f to the extent that each g tries to approximate f. Thus, is variance affected by whether ḡ is close to f or not?

Is it possible to increase complexity/hypothesis-set-size without increasing the variance? It is not obvious that this is not possible although the intuitive explanation is that a larger hypothesis set will result in a larger variance.
#4
08-09-2012, 02:03 AM
 yaser Caltech Join Date: Aug 2009 Location: Pasadena, California, USA Posts: 1,477
Re: Question on Bias Variance Tradeoff

Quote:
 Originally Posted by hashable Thanks for the quick reply. I have some follow up questions in regards to variance. How does increasing bias artificially in this way (by choosing a pathological hypothesis space) affect the variance? Variance appears to only depend on ḡ and is independent of f. Perhaps it could be considered to indirectly depend of f to the extent that each g tries to approximate f. Thus is variance affected by whether ḡ is close to f or not? Is it possible to increase complexity/hypothesis-set-size without increasing the variance? It is not obvious that this is not possible although the intuitive explanation is that a larger hypothesis set will result in a larger variance.
The dependency of the variance on the target is, as you point out, more complicated. For instance, if you try to learn a constant function, most models will converge with little variance, whereas a more complex target will result in bigger variance with the same models. The intuition behind the bias and variance is valid for a lot of situations, but may not be valid for some.
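A rough simulation can illustrate how the variance of the same model changes with the target. This is only a sketch under stated assumptions: noiseless data, two training points drawn uniformly from [-1, 1], least-squares fits of a constant and a line via `np.polyfit`, and two assumed targets (the constant 0.5 and sin(πx)). The helper name `bias_variance` and all parameter values are hypothetical choices for illustration.

```python
import numpy as np

def bias_variance(target, degree, n_points=2, n_datasets=2000, seed=0):
    """Estimate bias and variance of a least-squares polynomial fit.

    Many data sets are drawn, a polynomial of the given degree is fit to
    each, and the fits are compared on a fixed test grid.
    """
    rng = np.random.default_rng(seed)
    x_test = np.linspace(-1.0, 1.0, 201)
    preds = np.empty((n_datasets, x_test.size))
    for i in range(n_datasets):
        x = rng.uniform(-1.0, 1.0, n_points)
        y = target(x)  # noiseless samples of the target
        coeffs = np.polyfit(x, y, degree)
        preds[i] = np.polyval(coeffs, x_test)
    g_bar = preds.mean(axis=0)  # average learned hypothesis
    bias = float(np.mean((g_bar - target(x_test)) ** 2))
    variance = float(np.mean(preds.var(axis=0)))
    return bias, variance

def constant_target(x):
    return np.full_like(x, 0.5)

def sine_target(x):
    return np.sin(np.pi * x)

results = {}
for name, target in [("constant", constant_target), ("sine", sine_target)]:
    for degree in (0, 1):
        results[(name, degree)] = bias_variance(target, degree)
        bias, var = results[(name, degree)]
        print(f"target={name:8s} degree={degree}  bias={bias:.3f}  var={var:.3f}")
```

With the constant target, both models reproduce the target from any data set, so the estimated variance is essentially zero. With the sine target, the same two models show a clearly nonzero variance, and the line fit varies more than the constant fit.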
