 LFD Book Forum bias and variance - definition of g bar
#11 magdon RPI Join Date: Aug 2009 Location: Troy, NY, USA. Posts: 597 Re: bias and variance - definition of g bar

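All the numbers you mention below are approximately correct. You can now explicitly compute bias(x) and var(x) in terms of x, mean(a), mean(b), var(a) and var(b) (mean(b)=0):

$\text{bias}(x) = \left(\text{mean}(a)\,x - \sin(\pi x)\right)^2, \qquad \text{var}(x) = \text{var}(a)\,x^2 + \text{var}(b)$

Bias is the average of bias(x) over x; var is the average of var(x) over x. Set [formula missing in source]. One can show that [formula missing in source].

Note: you can also compute the bias and variance via simulation.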
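For readers who want to try the route described above, here is a minimal sketch in plain Python. It assumes the setup of Ex. 2.8 as discussed in this thread: target sin(pi*x), two training points drawn uniformly from [-1, 1], and a line a*x + b fitted through them. The function names, the sample size and the seed are illustrative choices, not anything from the book. The sketch also assumes var(x) = var(a)*x^2 + var(b), which drops the covariance term between a and b; that term vanishes here by symmetry, since flipping the sign of both training inputs leaves a unchanged and flips the sign of b.

```python
import math
import random


def fit_line(x1, x2):
    """Slope a and intercept b of the line through two points on sin(pi*x)."""
    y1, y2 = math.sin(math.pi * x1), math.sin(math.pi * x2)
    a = (y1 - y2) / (x1 - x2)
    b = y1 - a * x1
    return a, b


def mean_and_var(values):
    """Sample mean and (population-style) variance of a list of numbers."""
    m = sum(values) / len(values)
    v = sum((u - m) ** 2 for u in values) / len(values)
    return m, v


def coefficient_moments(n_datasets=100_000, seed=0):
    """Estimate mean(a), mean(b), var(a), var(b) over random two-point data sets."""
    rng = random.Random(seed)
    slopes, intercepts = [], []
    while len(slopes) < n_datasets:
        x1, x2 = rng.uniform(-1, 1), rng.uniform(-1, 1)
        if x1 == x2:  # degenerate data set, no unique line: redraw
            continue
        a, b = fit_line(x1, x2)
        slopes.append(a)
        intercepts.append(b)
    a_mean, var_a = mean_and_var(slopes)
    b_mean, var_b = mean_and_var(intercepts)
    return a_mean, b_mean, var_a, var_b


def bias_and_var(a_mean, b_mean, var_a, var_b, n_grid=2001):
    """Average bias(x) and var(x) over x uniform on [-1, 1] (midpoint grid)."""
    xs = [-1 + (2 * i + 1) / n_grid for i in range(n_grid)]
    bias = sum((a_mean * x + b_mean - math.sin(math.pi * x)) ** 2 for x in xs) / n_grid
    # Assumes a and b are uncorrelated, so no cross term appears in var(x).
    var = sum(var_a * x * x + var_b for x in xs) / n_grid
    return bias, var


if __name__ == "__main__":
    moments = coefficient_moments()
    print("mean(a), mean(b), var(a), var(b):", moments)
    print("bias, var:", bias_and_var(*moments))
```

With enough data sets, the printed values should land close to the figures quoted in this thread (a_mean near 0.776, var near 1.69).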

Quote:
 Originally Posted by the cyclist
 I am struggling to replicate the variance of H_1 of Ex. 2.8 in the text. I was able to get the bias correct (and both bias and variance for H_0), as well as getting the related quiz problem correct, so this is really puzzling me.
 I'm trying to narrow down where my mistake might be. Can someone please verify whether or not the correct average hypothesis is
 g_bar(x) = a_mean * x + b_mean
 where a_mean ~= 0.776 and b_mean ~= 0. I plot that, and it does look like the figure in the book.
 Also, when I take the standard deviation (over the data sets) of the coefficients a and b, I get
 std(a) ~= 1.52
 std(b) ~= 0.96
 Do those look right? I am truly puzzled here!
__________________
Have faith in probability
#12
 munchkin Member Join Date: Jul 2012 Posts: 38 Re: bias and variance - definition of g bar

I'm having doubts about the variance value in example 2.8 since it indicates that the root mean square deviation of the test data from the sinusoid line is 1.3= sqrt(1.69). So the magnitude of the average (a*x+b) difference from (a_mean*x+b_mean) evaluated at a given point on the sinusoid is bigger than the root mean square value (.7071) of the sinusoid that generated the data point in the first place? I'm inclined to doubt that.

The mean square deviation between the slope of each generated line and a_mean is larger than 1.69 so at this point I have no idea where that variance number came from.
#13 yaser Caltech Join Date: Aug 2009 Location: Pasadena, California, USA Posts: 1,478 Re: bias and variance - definition of g bar

Quote:
 Originally Posted by munchkin
 I'm having doubts about the variance value in example 2.8 since it indicates that the root mean square deviation of the test data from the sinusoid line is 1.3= sqrt(1.69). So the magnitude of the average (a*x+b) difference from (a_mean*x+b_mean) evaluated at a given point on the sinusoid is bigger than the root mean square value (.7071) of the sinusoid that generated the data point in the first place? I'm inclined to doubt that.
The average is taken over the entire domain so that includes points where the line (which fits the two training points on the sinusoid) diverges significantly from the sinusoid and from the average line. The figures on page 65 illustrate that.
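A rough worked check of this point, using the standard deviations the cyclist reported earlier in the thread (std(a) ~= 1.52, std(b) ~= 0.96). It assumes that a and b are uncorrelated, so that var(x) = var(a) x^2 + var(b), and that x is uniform on [-1, 1], so the average of x^2 is 1/3. Under those assumptions the variance is small in the middle of the domain and large at the edges, and the domain average comes out at the book's value:

```latex
\begin{align*}
\mathrm{var}(0)     &= \mathrm{var}(b) \approx 0.96^2 \approx 0.92 \\
\mathrm{var}(\pm 1) &= \mathrm{var}(a) + \mathrm{var}(b) \approx 1.52^2 + 0.96^2 \approx 3.23 \\
\mathbb{E}_x[\mathrm{var}(x)] &= \tfrac{1}{3}\,\mathrm{var}(a) + \mathrm{var}(b)
  \approx \tfrac{1}{3}(2.31) + 0.92 \approx 1.69
\end{align*}
```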
__________________
Where everyone thinks alike, no one thinks very much
#14
 munchkin Member Join Date: Jul 2012 Posts: 38 Re: bias and variance - definition of g bar

Yes, I can see that on the charts for example 2.8 but those outlying points do not exert an effect (at a given x) for the averaged g(D)[x] calculation I am using. So I am wrong on both counts!

A careful rereading of page 63 has led me to try averaging over the calculated data set g's at an arbitrary (generic?) point x and using that to calculate the variance of g(D)[x]. This seems to be a step in the right direction since the calculated variance is now a function of that arbitrary x point and has a minimum around x=0 just like the chart in the example 2.8 but based on the values at the extremes and in the middle I can't see how my average variance over the domain [-1,1] would be as low as 1.69. We shall see.
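The pointwise calculation described here can be sketched as follows, in plain Python. This is only an illustration under the Ex. 2.8 setup as discussed in the thread (target sin(pi*x), two training points uniform on [-1, 1], a line fitted through them); the function names, sample size and seed are made up for the example.

```python
import math
import random


def hypothesis_values(x, n_datasets=50_000, seed=1):
    """Evaluate g^(D)(x) at one fixed x for many random two-point data sets D."""
    rng = random.Random(seed)
    values = []
    while len(values) < n_datasets:
        x1, x2 = rng.uniform(-1, 1), rng.uniform(-1, 1)
        if x1 == x2:  # degenerate data set: redraw
            continue
        y1, y2 = math.sin(math.pi * x1), math.sin(math.pi * x2)
        a = (y1 - y2) / (x1 - x2)   # slope of the line through the two points
        b = y1 - a * x1             # intercept
        values.append(a * x + b)
    return values


def pointwise_variance(x, **kwargs):
    """Variance over data sets of g^(D)(x) around its average g_bar(x)."""
    values = hypothesis_values(x, **kwargs)
    g_bar_x = sum(values) / len(values)
    return sum((v - g_bar_x) ** 2 for v in values) / len(values)


if __name__ == "__main__":
    for x in (-1.0, -0.5, 0.0, 0.5, 1.0):
        print(f"x = {x:+.1f}   var(x) ~ {pointwise_variance(x):.2f}")
```

The printed values should be smallest near x = 0 and grow toward the ends of the interval, which is the shape described in the post.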

Thanks so much for your helpful comments, they are really appreciated and this is a great class even if I am a little dense in absorbing some of the material. Have a great day.
#15
 the cyclist Member Join Date: Jul 2012 Posts: 26 Re: bias and variance - definition of g bar

Finally got it. Thanks to magdon for confirming one part of my calculation, so that I did not need to waste time poring over it. Thanks also to yaser for a tip, in another thread, that helped me a lot. It turns out that I was incorrectly reusing the sample dataset to calculate (via simulation) the variance. Instead, I needed to generate a fresh dataset for that.

It's funny how sometimes making mistakes at first leads to a much more solid understanding later!
#16
 munchkin Member Join Date: Jul 2012 Posts: 38 Re: bias and variance - definition of g bar

I've got it too! Repeatedly evaluate var[ g(D)[x] ] over the entire data set with x ranging from -1 to 1 and average those values to get 1.69! Feeling a sense of real accomplishment here.
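One way to carry out that averaging entirely by simulation, without any closed-form expression, is sketched below in plain Python. It assumes the Ex. 2.8 setup as discussed in the thread; the helper names, sample size and seed are hypothetical. The first pass estimates the average hypothesis g_bar(x) = a_mean * x + b_mean, and the second pass pairs every fitted line with an independently drawn test point x, so the result is an average over both data sets and the domain.

```python
import math
import random


def target(x):
    """Target function of the example."""
    return math.sin(math.pi * x)


def random_line(rng):
    """Fit a line through two target points with inputs uniform on [-1, 1]."""
    while True:
        x1, x2 = rng.uniform(-1, 1), rng.uniform(-1, 1)
        if x1 != x2:  # skip the degenerate case with no unique line
            a = (target(x1) - target(x2)) / (x1 - x2)
            return a, target(x1) - a * x1


def bias_variance(n_datasets=100_000, seed=2):
    """Monte Carlo estimate of bias and var, averaged over data sets and over x."""
    rng = random.Random(seed)
    # Pass 1: one fitted line per data set, then the average hypothesis.
    lines = [random_line(rng) for _ in range(n_datasets)]
    a_mean = sum(a for a, _ in lines) / n_datasets
    b_mean = sum(b for _, b in lines) / n_datasets
    # Pass 2: evaluate each line at its own independently drawn test point.
    bias_sum = var_sum = 0.0
    for a, b in lines:
        x = rng.uniform(-1, 1)
        g_bar_x = a_mean * x + b_mean
        bias_sum += (g_bar_x - target(x)) ** 2
        var_sum += (a * x + b - g_bar_x) ** 2
    return bias_sum / n_datasets, var_sum / n_datasets


if __name__ == "__main__":
    bias, var = bias_variance()
    print(f"bias ~ {bias:.2f}, var ~ {var:.2f}")
```

The estimates fluctuate slightly with the seed and sample size, but with this many data sets they should sit close to the variance of 1.69 discussed in the thread.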
The contents of this forum are to be used ONLY by readers of the Learning From Data book by Yaser S. Abu-Mostafa, Malik Magdon-Ismail, and Hsuan-Tien Lin, and participants in the Learning From Data MOOC by Yaser S. Abu-Mostafa. No part of these contents is to be communicated or made accessible to ANY other person or entity.