 LFD Book Forum Online homework 4, question 4

#1
 Tobias Junior Member Join Date: Sep 2013 Posts: 1 Online homework 4, question 4

Hi there.

I have some understanding of how to find g-bar(x). After a lot of tries, I have arrived at the following solution, but I am far from sure it is valid.
g-bar(x) must be the h(x)=ax that minimizes the expected squared error over all points, i.e. the expected value of $(ax - \sin(\pi x))^2$. Since x is uniformly distributed, this is the same as minimizing $\int_{-1}^{1} (ax - \sin(\pi x))^2 \, dx$, which yields a = 3/pi = 0.955.

I have a few questions about this:
1. Am I correct?
2. Does g-bar depend on the size of the sample?
3. Is there a general approach to find g-bar?
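
For reference, the value a = 3/pi quoted in the post above can be checked by hand. The sketch below assumes the target is f(x) = sin(pi x) with x uniform on [-1, 1]; the thread does not state this setup explicitly, so both are assumptions, although they are consistent with the quoted value.

```latex
% Assumed setup: f(x) = \sin(\pi x), x uniform on [-1, 1] (density 1/2).
\begin{align*}
E(a) &= \frac{1}{2}\int_{-1}^{1} \bigl(ax - \sin(\pi x)\bigr)^2 \, dx \\
\frac{dE}{da} &= \int_{-1}^{1} x\,\bigl(ax - \sin(\pi x)\bigr) \, dx = 0 \\
a \int_{-1}^{1} x^2 \, dx &= \int_{-1}^{1} x \sin(\pi x) \, dx
  \quad\Longrightarrow\quad a \cdot \frac{2}{3} = \frac{2}{\pi} \\
a &= \frac{3}{\pi} \approx 0.955
\end{align*}
```

This is the best fit when the whole target function is known, which is a different quantity from the average of fits to small data sets.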
#2
 yaser Caltech Join Date: Aug 2009 Location: Pasadena, California, USA Posts: 1,478 Re: Online homework 4, question 4

Close. What you have calculated is the best approximation of the target using the model, but it is based on knowing the entire target function. If you assume you know only two points at a time (the data set given in the example), then you should fit the two points with a line then get the average of those lines as you vary the two points. You will get something close, but not identical, to the slope you got.

This answers your second question in the affirmative as well. Doing this exercise with two points at a time is not the same as doing it with three points at a time, so g-bar does depend on the size of the training set in general.
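
The averaging procedure described above can be sketched as a small simulation. This is only an illustration under assumptions not stated in the thread: the target is taken to be f(x) = sin(pi x), x is uniform on [-1, 1], and each data set is fitted by least squares with h(x) = ax. The function name `average_slope` and its parameters are hypothetical.

```python
import numpy as np


def average_slope(n_points, n_datasets=100_000, seed=0):
    """Estimate the slope of g-bar by averaging fitted slopes over random data sets.

    Assumptions (not from the thread): f(x) = sin(pi * x), x uniform on [-1, 1],
    and each data set is fitted with h(x) = a * x by least squares.
    """
    rng = np.random.default_rng(seed)
    # One row per data set, one column per training point.
    x = rng.uniform(-1.0, 1.0, size=(n_datasets, n_points))
    y = np.sin(np.pi * x)
    # Least-squares slope for a line through the origin: a = sum(x*y) / sum(x^2).
    slopes = (x * y).sum(axis=1) / (x ** 2).sum(axis=1)
    # g-bar(x) = a_bar * x, where a_bar is the mean of the fitted slopes.
    return slopes.mean()


if __name__ == "__main__":
    for n in (2, 3, 10, 50):
        print(f"N = {n:3d}  average slope = {average_slope(n, n_datasets=20_000):.3f}")
```

Running it for several values of N shows the average slope changing with the training set size and moving toward the full-knowledge fit as N grows.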

The general approach to finding g-bar is to follow the definition exactly. In integral form, it will be a double integral if the data set has two points, a triple integral if it has three points, and so on, but in general it is done with Monte Carlo, so no actual integration is needed.
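
As a sketch of what the integral form looks like for two points, the block below writes out the definition for the model h(x) = ax. It assumes x is uniform on [-1, 1], that each data set is fitted by least squares, and that f denotes the target function; the symbols \hat{a} and \bar{a} are introduced here for illustration and are not from the thread.

```latex
% Definition of the average hypothesis over data sets D:
\[
\bar{g}(x) = \mathbb{E}_{\mathcal{D}}\bigl[g^{(\mathcal{D})}(x)\bigr]
           = \bar{a}\,x,
\qquad
\bar{a} = \mathbb{E}_{\mathcal{D}}\bigl[\hat{a}(\mathcal{D})\bigr].
\]
% Least-squares slope for D = {(x_1, f(x_1)), (x_2, f(x_2))} with h(x) = ax:
\[
\hat{a}(\mathcal{D}) = \frac{x_1 f(x_1) + x_2 f(x_2)}{x_1^2 + x_2^2}.
\]
% With x_1, x_2 independent and uniform on [-1, 1] (joint density 1/4):
\[
\bar{a} = \frac{1}{4}\int_{-1}^{1}\!\int_{-1}^{1}
          \frac{x_1 f(x_1) + x_2 f(x_2)}{x_1^2 + x_2^2}\, dx_1\, dx_2 .
\]
```

With three points the same construction gives a triple integral, which is why sampling data sets at random is usually the more practical route.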
__________________
Where everyone thinks alike, no one thinks very much


The contents of this forum are to be used ONLY by readers of the Learning From Data book by Yaser S. Abu-Mostafa, Malik Magdon-Ismail, and Hsuan-Tien Lin, and participants in the Learning From Data MOOC by Yaser S. Abu-Mostafa. No part of these contents is to be communicated or made accessible to ANY other person or entity.