LFD Book Forum Bias-Variance Analysis
#1
03-29-2015, 04:40 AM
 Andrew87 Junior Member Join Date: Feb 2015 Posts: 6
Bias-Variance Analysis

Hello,

I'm getting confused about the average hypothesis g bar. Why is it the best approximation of the target function that we could obtain in the unrealistic case of infinite training sets?

Thank you in advance,
Andrea
#2
03-29-2015, 09:48 AM
 yaser Caltech Join Date: Aug 2009 Location: Pasadena, California, USA Posts: 1,478
Re: Bias-Variance Analysis

Quote:
 Originally Posted by Andrew87 Hello, I'm getting confused about the average hypothesis g bar. Why is it the best approximation of the target function that we could obtain in the unrealistic case of infinite training sets? Thank you in advance, Andrea
It is not necessarily the best approximation of the target function, but it is often close. If we have one, infinite-size training set, and we have infinite computational power that goes with it, we can arrive at the best approximation. In the bias-variance analysis, we are given an infinite number of finite training sets, and we are restricted to using one of these finite training sets at a time, then averaging the resulting hypotheses. This restriction can take us away from the absolute optimal, but usually not by much.
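To make this concrete, here is a minimal numpy sketch of that averaging process (the toy target f(x) = x^2 on [-1, 1], the constant model h(x) = b, and all the counts are illustrative assumptions, not the book's exact setup):

```python
import numpy as np

rng = np.random.default_rng(0)
f = lambda x: x ** 2              # toy target on [-1, 1] (an assumption)
n_datasets, n_points = 5000, 2    # many finite (2-point) training sets

# Each finite training set yields one hypothesis; for h(x) = b,
# the least-squares b is just the mean of the target values seen.
bs = []
for _ in range(n_datasets):
    x = rng.uniform(-1, 1, n_points)
    bs.append(f(x).mean())

g_bar = np.mean(bs)   # the average hypothesis; close to E[f(x)] = 1/3
```

Averaging the per-dataset hypotheses gives g bar; with a single infinite training set, the constant model would instead land directly on the one best constant, which is why the two need not coincide in general.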
__________________
Where everyone thinks alike, no one thinks very much
#3
04-03-2015, 06:21 AM
 Andrew87 Junior Member Join Date: Feb 2015 Posts: 6
Re: Bias-Variance Analysis

Thank you very much for your answer Prof. Yaser. It clarified my doubt.

My kind regards,
Andrea
#4
06-04-2015, 02:23 PM
 sayan751 Junior Member Join Date: Jun 2015 Posts: 5
Re: Bias-Variance Analysis

Hi,

I have a doubt regarding g bar.

I tried to calculate the bias for the second learner, i.e. h(x) = ax + b. This is how I did it:
• Generated around 1000 data points (x ranging from -1 to 1)
• Then picked two sample data points at random
• Solved for a and b using the matrix equation
• Repeated this process around 3000 times
• Lastly took the mean of a and the mean of b, which formed g2 bar
• Used this g2 bar to calculate the corresponding bias, which matched the given value of bias
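In case it is useful, here is a numpy sketch of the steps above (the sin(pi x) target is my assumption, taken from the book's corresponding example; the seed and exact counts are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
f = lambda x: np.sin(np.pi * x)     # assumed target from the book's example
pool = rng.uniform(-1, 1, 1000)     # fixed pool of ~1000 data points

coeffs = []
for _ in range(3000):
    x = rng.choice(pool, 2, replace=False)   # two sample points at random
    A = np.column_stack([x, np.ones(2)])     # solve A @ [a, b] = f(x)
    coeffs.append(np.linalg.solve(A, f(x)))

a_bar, b_bar = np.mean(coeffs, axis=0)       # g2 bar: a_bar * x + b_bar

# bias = average of (g2_bar(x) - f(x))^2 over the input range
grid = np.linspace(-1, 1, 1001)
bias = np.mean((a_bar * grid + b_bar - f(grid)) ** 2)
```

The resulting a_bar and b_bar define g2 bar, and bias is its average squared deviation from the target over the input range.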

Now I have two questions:
1. Please let me know whether I am proceeding in the right direction.
2. When I try to repeat this process with a polynomial model instead of the linear model, my calculated bias for the polynomial model varies by a large margin, even though the sample data points don't change. For the polynomial I also took the mean of the coefficients, but my answer (both g bar and bias) still varies greatly with each run. What am I missing here?
#5
06-04-2015, 11:35 PM
 yaser Caltech Join Date: Aug 2009 Location: Pasadena, California, USA Posts: 1,478
Re: Bias-Variance Analysis

Quote:
 Originally Posted by sayan751 1. Please let me know whether I am proceeding in the right direction. 2. When I try to repeat this process with a polynomial model instead of the linear model, my calculated bias for the polynomial model varies by a large margin, even though the sample data points don't change. For the polynomial I also took the mean of the coefficients, but my answer (both g bar and bias) still varies greatly with each run. What am I missing here?
1. Your approach is correct. While sampling from a fixed 1000-point set is not the same as sampling from the whole domain, it should be close enough.

2. Not sure if this is the reason, but if you are still using a 2-point training set, a polynomial model will have too many parameters, leading to non-unique solutions that could vary wildly.
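As a quick illustration (with made-up sample points): for a 2-point training set, the design matrix of a degree-4 fit has more unknowns than equations, so the exact-fit solutions form an infinite family.

```python
import numpy as np

x = np.array([-0.5, 0.5])          # a 2-point training set
A = np.vander(x, 5)                # 2x5 design matrix for a degree-4 fit
rank = np.linalg.matrix_rank(A)    # rank 2 < 5 unknowns: underdetermined
```

Any of the infinitely many exact interpolants can come back depending on the solver, which makes the averaged coefficients unstable from run to run.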
__________________
Where everyone thinks alike, no one thinks very much
#6
06-04-2015, 11:49 PM
 sayan751 Junior Member Join Date: Jun 2015 Posts: 5
Re: Bias-Variance Analysis

Thank you, Prof. Yaser, for your reply.

I am using a 10-point dataset for the polynomial model. However, the problem I am referring to defines y = f(x) + noise = x + noise.

Previously, by mistake, I was treating y itself as f(x) rather than just x. Later I noticed that the calculations of bias and variance concentrate purely on f(x). So I ignored the noise when evaluating them, and now I get stable bias and variance for the polynomial model on each run.
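A numpy sketch of that evaluation (the degree-2 polynomial, the [-1, 1] domain, and the unit noise variance are my assumptions, since the post doesn't pin them down): train on the noisy y, but measure bias and variance against the noiseless f(x).

```python
import numpy as np

rng = np.random.default_rng(2)
f = lambda x: x                           # noiseless target; y = f(x) + noise
deg, n_points, n_datasets = 2, 10, 3000   # polynomial degree is an assumption

grid = np.linspace(-1, 1, 201)
preds = np.empty((n_datasets, grid.size))
for i in range(n_datasets):
    x = rng.uniform(-1, 1, n_points)
    y = f(x) + rng.normal(0, 1, n_points)  # noisy training targets
    preds[i] = np.polyval(np.polyfit(x, y, deg), grid)

g_bar = preds.mean(axis=0)
bias = np.mean((g_bar - f(grid)) ** 2)     # compare to f, not to the noisy y
var = np.mean(preds.var(axis=0))
```

Since evaluation is against the deterministic f rather than a freshly noisy y, the computed bias and variance are stable across runs.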
#7
06-14-2015, 02:03 PM
 prithagupta.nsit Junior Member Join Date: Jun 2015 Posts: 7
Re: Bias-Variance Analysis

Hello,

I have a few questions if we consider the following model:
Suppose instances x are distributed uniformly in X = [0, 10] and outputs are given by
y = f(x) + e = x + e,
where e is an error term with a standard normal distribution.

The task is to analyse the decomposition of the generalization error into bias + variance + noise by generating random samples of size N = 10, fitting the models g_i, and determining the predictions and prediction errors for x = 0, 1/100, ..., 10.

1. When calculating g bar, bias, and variance, wouldn't it be wrong not to include the error term when generating the data sets? If not, why?

2. How can we calculate the noise separately for the polynomial hypothesis?

3. My understanding of how to calculate the predictions and prediction errors:
The prediction would be the value of g bar at x, and the prediction error would be the difference between that value and the value given by f(x). Am I correct?
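For concreteness, here is the kind of experiment I have in mind, sketched in numpy (the linear model is only a placeholder, and the noise variance is 1 for the standard normal error): train on noisy samples, then compare the average prediction error on the grid against bias + variance + noise.

```python
import numpy as np

rng = np.random.default_rng(3)
f = lambda x: x                          # y = f(x) + e, e ~ N(0, 1)
n_points, n_datasets = 10, 5000
grid = np.arange(0, 10.01, 0.01)         # x = 0, 1/100, ..., 10

preds = np.empty((n_datasets, grid.size))
errs = np.empty(n_datasets)
for i in range(n_datasets):
    x = rng.uniform(0, 10, n_points)
    y = f(x) + rng.normal(size=n_points)  # noise IS included in the data
    a, b = np.polyfit(x, y, 1)            # placeholder linear model
    preds[i] = a * grid + b
    # prediction error against fresh noisy outputs on the grid
    errs[i] = np.mean((preds[i] - (f(grid) + rng.normal(size=grid.size))) ** 2)

bias = np.mean((preds.mean(axis=0) - f(grid)) ** 2)
var = np.mean(preds.var(axis=0))
noise = 1.0                               # Var(e) for a standard normal error
# errs.mean() should come out close to bias + var + noise
```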

Looking forward to a reply
#8
06-19-2015, 12:38 AM
 yaser Caltech Join Date: Aug 2009 Location: Pasadena, California, USA Posts: 1,478
Re: Bias-Variance Analysis

Quote:
 Originally Posted by prithagupta.nsit Hello, I have a few questions if we consider the following model: Suppose instances x are distributed uniformly in X = [0, 10] and outputs are given by y = f(x) + e = x + e, where e is an error term with a standard normal distribution. The task is to analyse the decomposition of the generalization error into bias + variance + noise by generating random samples of size N = 10, fitting the models g_i, and determining the predictions and prediction errors for x = 0, 1/100, ..., 10. 1. When calculating g bar, bias, and variance, wouldn't it be wrong not to include the error term when generating the data sets? If not, why? 2. How can we calculate the noise separately for the polynomial hypothesis? 3. My understanding of how to calculate the predictions and prediction errors: the prediction would be the value of g bar at x, and the prediction error would be the difference between that value and the value given by f(x). Am I correct? Looking forward to a reply
Would you clarify some points as I didn't quite understand the questions? First, I take it that what you referred to as model is the target function (target distribution in this noisy case). If so, what is the learning model (hypothesis set) you are using? Perhaps you can rephrase your three questions after you define the model.
__________________
Where everyone thinks alike, no one thinks very much
#9
06-20-2015, 02:51 AM
 prithagupta.nsit Junior Member Join Date: Jun 2015 Posts: 7
Re: Bias-Variance Analysis

Dear Prof. Mostafa,

The two hypothesis sets are:

g1(x) = b

g2(x) = α4 x^4 + α3 x^3 + α2 x^2 + α1 x + b

The task is to analyze the decomposition of the generalization error into bias + variance + noise by generating random samples of size N = 10, fitting the models g_i, and determining the predictions and prediction errors for x = 0, 1/100, ..., 10.
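A numpy sketch of this experiment with the two hypothesis sets (the random seed and dataset counts are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(4)
f = lambda x: x                       # y = f(x) + e, e ~ N(0, 1)
n_points, n_datasets = 10, 3000
grid = np.arange(0, 10.01, 0.01)      # x = 0, 1/100, ..., 10

p1 = np.empty((n_datasets, grid.size))   # g1(x) = b
p2 = np.empty((n_datasets, grid.size))   # g2(x): degree-4 polynomial
for i in range(n_datasets):
    x = rng.uniform(0, 10, n_points)
    y = f(x) + rng.normal(size=n_points)
    p1[i] = y.mean()                     # least-squares constant = sample mean
    p2[i] = np.polyval(np.polyfit(x, y, 4), grid)

def decompose(preds):
    bias = np.mean((preds.mean(axis=0) - f(grid)) ** 2)
    var = np.mean(preds.var(axis=0))
    return bias, var

bias1, var1 = decompose(p1)   # g1: large bias, modest variance
bias2, var2 = decompose(p2)   # g2: essentially no bias
```

Running this shows the tradeoff: g1 has a large bias with modest variance, while g2 has essentially no bias and its reducible error is dominated by variance.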

How should we handle the noise, and during the calculation of bias and variance, can we ignore the error term e in the target function?

How do we determine the predictions and prediction errors for different values of x?
#10
06-20-2015, 04:31 AM
 yaser Caltech Join Date: Aug 2009 Location: Pasadena, California, USA Posts: 1,478
Re: Bias-Variance Analysis

Quote:
 Originally Posted by prithagupta.nsit How should we handle the noise, and during the calculation of bias and variance, can we ignore the error term e in the target function? How do we determine the predictions and prediction errors for different values of x?
The formula for decomposing the out-of-sample error into bias+variance+noise is discussed in Lecture 11 of the Learning From Data online course, in the part corresponding to slides 18-20.

If you look at this derivation, what you refer to as the error in the target function (which I assume is the noisy part) is not ignored. Also, the formula is given for each value of x.

Of course, evaluating these terms explicitly requires knowledge of the target f, which is the case in bias-variance analysis in general. You can calculate them in your example since you spelled out the target. The benefit is to illustrate how these quantities change as you vary the number of data points, the level of noise, etc.
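For reference, the decomposition on those slides can be written as follows (with $\bar g$ the average hypothesis, $g^{(\mathcal D)}$ the hypothesis learned from data set $\mathcal D$, and $\sigma^2$ the noise variance):

$$
\mathbb{E}_{\mathcal D}\left[E_{\text{out}}\big(g^{(\mathcal D)}\big)\right]
= \mathbb{E}_x\Big[\big(\bar g(x) - f(x)\big)^2\Big]
+ \mathbb{E}_x\Big[\mathbb{E}_{\mathcal D}\big[\big(g^{(\mathcal D)}(x) - \bar g(x)\big)^2\big]\Big]
+ \sigma^2
$$

where the three terms are the bias, the variance, and the noise, respectively.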
__________________
Where everyone thinks alike, no one thinks very much
