LFD Book Forum  


  #1  
Old 08-09-2012, 12:37 AM
hashable
Junior Member
 
Join Date: Jul 2012
Posts: 8
Question on Bias Variance Tradeoff

In the book and the lecture, it is said that, generally, a larger hypothesis set (a more complex model) has lower bias and higher variance. This is intuitively explained by the pictures on page 64. Bias is shown as the distance of the average hypothesis ḡ (g-bar) from the target function f. Variance is illustrated with a shaded region around the target function f.
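
For reference, in the book's notation these quantities are

bias = E_x[ (ḡ(x) - f(x))^2 ]   and   var = E_x[ E_D[ (g^(D)(x) - ḡ(x))^2 ] ],

where g^(D) is the hypothesis learned from a particular dataset D and ḡ(x) = E_D[ g^(D)(x) ] is the average hypothesis.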

My question is: it appears from the picture that it should be possible to enlarge the hypothesis space in a way that still does not include the target function f. E.g., if we add hypotheses in the direction "further away" from the target function f, then we may manage to keep the bias high (or even increase it).

From this line of reasoning, it appears that adding complexity to a model (a larger hypothesis space) does not necessarily imply a decrease in bias (and/or an increase in variance). The decrease in bias occurs only when the hypothesis space grows in a way such that it ends up closer to f, and this need not always happen (theoretically at least).

Is this conclusion correct? If so, should it be kept in mind when applying these concepts in practice? Also, if it is correct, could you give some examples that illustrate how adding complexity can still increase the bias?
  #2  
Old 08-09-2012, 01:20 AM
yaser
Caltech
 
Join Date: Aug 2009
Location: Pasadena, California, USA
Posts: 1,477
Re: Question on Bias Variance Tradeoff

Your conclusion is correct. Since you are referring to the figure that illustrates an idealized version of bias (how far the best hypothesis is from the target, rather than how far the average learned hypothesis is), I will answer your question in terms of that idealized bias.

It is indeed possible to construct cases where you enlarge the hypothesis set without decreasing the bias. In practice, since the target function is unknown, the chances are that enlarging the hypothesis set will result in including some hypotheses that are closer to the target, so the bias goes down.

Now, if you want the bias to actually increase as a result of increasing the complexity of the hypothesis set, that cannot be done by enlarging the set, but by going for an alternative hypothesis set that is more complex and also further away from the target.

For a specific construction, take the more complex hypothesis set to be the set of all hypotheses excluding those that happen to be close to the target function. This will be a very complex hypothesis set, but still one with a large bias, by construction. Of course, that is a deliberately unfriendly hypothesis set that does not arise in practice.
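
As a toy numerical sketch of the first point above (my own construction, not from the lecture or the book): take the target f(x) = sin(pi*x) on [-1, 1] with noiseless data, let the small hypothesis set be the constants, and enlarge it by adding a cos(pi*x) term. The added direction is orthogonal to the target on this interval, so the enlarged set gets no closer to f: the estimated bias stays essentially the same while the variance grows.

Code:
# Enlarging the hypothesis set need not lower the bias (toy experiment).
# Target: f(x) = sin(pi x) on [-1, 1], noiseless data.
# H_small: h(x) = b                  (constants)
# H_big:   h(x) = b + a*cos(pi x)    (strictly larger set; the added direction
#                                     is orthogonal to the target on [-1, 1])
import numpy as np

rng = np.random.default_rng(0)
f = lambda x: np.sin(np.pi * x)

N, runs = 10, 2000                        # points per dataset, number of datasets
x_test = np.linspace(-1, 1, 1001)         # grid for estimating E_x[.]

def features(x, big):
    cols = [np.ones_like(x)]
    if big:
        cols.append(np.cos(np.pi * x))
    return np.column_stack(cols)

for big, name in [(False, "H_small"), (True, "H_big")]:
    preds = np.empty((runs, x_test.size))
    for r in range(runs):
        x = rng.uniform(-1, 1, N)               # a fresh dataset D
        w, *_ = np.linalg.lstsq(features(x, big), f(x), rcond=None)
        preds[r] = features(x_test, big) @ w    # g^(D) evaluated on the grid
    g_bar = preds.mean(axis=0)                  # average hypothesis g-bar
    bias = np.mean((g_bar - f(x_test)) ** 2)    # E_x[(g_bar - f)^2]
    var = np.mean((preds - g_bar) ** 2)         # E_x[E_D[(g^(D) - g_bar)^2]]
    print(f"{name}: bias ~ {bias:.3f}, var ~ {var:.3f}")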
__________________
Where everyone thinks alike, no one thinks very much
  #3  
Old 08-09-2012, 01:48 AM
hashable
Junior Member
 
Join Date: Jul 2012
Posts: 8
Re: Question on Bias Variance Tradeoff

Thanks for the quick reply. I have some follow-up questions regarding variance.

How does artificially increasing the bias in this way (by choosing a pathological hypothesis space) affect the variance?

Variance appears to depend only on g^(D) and ḡ, and to be independent of f. Perhaps it could be considered to depend on f indirectly, to the extent that each g^(D) tries to approximate f. So is the variance affected by whether ḡ is close to f or not?

Is it possible to increase the complexity/size of the hypothesis set without increasing the variance? It is not obvious that this is impossible, although the intuitive explanation is that a larger hypothesis set will result in a larger variance.
  #4  
Old 08-09-2012, 02:03 AM
yaser
Caltech
 
Join Date: Aug 2009
Location: Pasadena, California, USA
Posts: 1,477
Re: Question on Bias Variance Tradeoff

The dependency of the variance on the target is, as you point out, more complicated. For instance, if you try to learn a constant target function, most models will converge with little variance, whereas a more complex target will result in a bigger variance with the same models. The intuition behind bias and variance is valid in a lot of situations, but may not be valid in some.
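
As a toy numerical sketch of this point (my own setup, not from the lecture): fit a line to N = 2 noiseless points, once with a constant target and once with the target sin(pi*x). The same hypothesis set shows essentially zero variance for the constant target and a large variance for the sinusoidal one.

Code:
# Same hypothesis set (lines, fit to N = 2 noiseless points), two targets:
# a constant target gives essentially zero variance; sin(pi x) gives a large one.
import numpy as np

rng = np.random.default_rng(1)
x_test = np.linspace(-1, 1, 1001)         # grid for estimating E_x[.]
N, runs = 2, 5000                         # points per dataset, number of datasets

def line_fit_variance(f):
    preds = np.empty((runs, x_test.size))
    for r in range(runs):
        x = rng.uniform(-1, 1, N)                      # dataset inputs
        A = np.column_stack([np.ones_like(x), x])      # h(x) = b + a*x
        w, *_ = np.linalg.lstsq(A, f(x), rcond=None)   # least-squares line fit
        preds[r] = w[0] + w[1] * x_test                # g^(D) on the grid
    g_bar = preds.mean(axis=0)                         # average hypothesis
    return np.mean((preds - g_bar) ** 2)               # E_x[E_D[(g^(D) - g_bar)^2]]

print("variance, constant target :", line_fit_variance(lambda x: np.ones_like(x)))
print("variance, sin(pi x) target:", line_fit_variance(lambda x: np.sin(np.pi * x)))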
__________________
Where everyone thinks alike, no one thinks very much