LFD Book Forum HW 2 Problem 6
#1
07-21-2012, 03:55 PM
 dbaksi@gmail.com Junior Member Join Date: Jul 2012 Posts: 2
HW 2 Problem 6

How is this different from Problem 5, other than N=1000 and the fact that these simulated 'out of sample' points (for E_out) are generated fresh? I may be missing something, but it seems to boil down to running the same program as in Problem 5 with N=1000, 1000 times. Can someone please clarify? Thanks.
#2
07-21-2012, 09:00 PM
 yaser Caltech Join Date: Aug 2009 Location: Pasadena, California, USA Posts: 1,477
Re: HW 2 Problem 6

Quote:
 Originally Posted by dbaksi@gmail.com How is this different from Problem 5, other than N=1000 and the fact that these simulated 'out of sample' points (for E_out) are generated fresh? I may be missing something, but it seems to boil down to running the same program as in Problem 5 with N=1000, 1000 times. Can someone please clarify? Thanks.
There are indeed instances in the homeworks where the same experiment covers a number of homework problems.

Problem 5 asks about E_in while Problem 6 asks about (an estimate of) E_out. In both problems, N is the same (N stands for the number of training examples in our notation).
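
For reference, the two quantities discussed in this thread can be written out as follows. This is a sketch that assumes binary classification error; the symbols M (number of fresh points) and x'_m (the fresh points themselves) are notation introduced here for illustration and do not appear in the thread.

```latex
% In-sample error: fraction of the N training points that g misclassifies
E_{\text{in}}(g) = \frac{1}{N} \sum_{n=1}^{N} \big[\!\big[\, g(\mathbf{x}_n) \neq f(\mathbf{x}_n) \,\big]\!\big]

% Out-of-sample error: probability that g disagrees with the target f on a new point
E_{\text{out}}(g) = \mathbb{P}_{\mathbf{x}}\big[\, g(\mathbf{x}) \neq f(\mathbf{x}) \,\big]

% Estimate of E_out from M fresh points x'_1, ..., x'_M that were not used in training
\hat{E}_{\text{out}}(g) = \frac{1}{M} \sum_{m=1}^{M} \big[\!\big[\, g(\mathbf{x}'_m) \neq f(\mathbf{x}'_m) \,\big]\!\big]
```

The first sum runs over the training set, while the last runs over points generated independently of it, which is why the training set size N is unrelated to the number of fresh points.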
__________________
Where everyone thinks alike, no one thinks very much
#3
07-22-2012, 12:58 AM
 MLearning Senior Member Join Date: Jul 2012 Posts: 56
Re: HW 2 Problem 6

It is my understanding that "fresh data" refers to cross-validation data. Do we then compute E_out using the weights obtained in Problem 5? When I do this, E_out < E_in. When I design the weights using the fresh data, E_out is approximately equal to E_in. Does this make sense?
#4
07-22-2012, 01:06 AM
 yaser Caltech Join Date: Aug 2009 Location: Pasadena, California, USA Posts: 1,477
Re: HW 2 Problem 6

Quote:
 Originally Posted by MLearning It is my understanding that "fresh data" refers to cross-validation data. Do we then compute E_out using the weights obtained in Problem 5?
It is simpler than cross validation (a topic that will be covered in detail in a later lecture). You just generate new data points that were not involved in training and evaluate the final hypothesis on those points.

The final hypothesis is indeed the one whose weights were determined in Problem 5, where the training took place.
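
A minimal sketch of a single run of this procedure, in Python with NumPy. The thread does not spell out the experimental setup, so the following are assumptions made for illustration only: the target is a random line in [-1, 1]^2, the learning algorithm is least-squares linear regression used as a classifier, and all function names (`random_target`, `make_data`, `single_run`, and so on) are hypothetical.

```python
import numpy as np

def random_target(rng):
    """Assumed target: a line through two random points in [-1, 1]^2."""
    p, q = rng.uniform(-1, 1, size=(2, 2))
    w1 = q[1] - p[1]
    w2 = p[0] - q[0]
    w0 = -(w1 * p[0] + w2 * p[1])
    return np.array([w0, w1, w2])  # sign(w . [1, x1, x2]) gives the label

def make_data(f, n, rng):
    """Draw n uniform points and label them with the target f."""
    X = np.column_stack([np.ones(n), rng.uniform(-1, 1, size=(n, 2))])
    y = np.where(X @ f >= 0, 1.0, -1.0)
    return X, y

def fit_linear_regression(X, y):
    """Least-squares weights (assumed learning algorithm)."""
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return w

def classification_error(w, X, y):
    """Fraction of points where sign(w . x) disagrees with y."""
    pred = np.where(X @ w >= 0, 1.0, -1.0)
    return float(np.mean(pred != y))

def single_run(n_train, n_fresh, rng):
    f = random_target(rng)
    # Training: only these n_train points determine the final hypothesis g.
    X_train, y_train = make_data(f, n_train, rng)
    g = fit_linear_regression(X_train, y_train)
    e_in = classification_error(g, X_train, y_train)
    # Fresh points from the same target, never seen during training.
    X_fresh, y_fresh = make_data(f, n_fresh, rng)
    e_out = classification_error(g, X_fresh, y_fresh)
    return e_in, e_out
```

Calling `single_run(100, 1000, np.random.default_rng(0))` would return one (E_in, estimated E_out) pair; the sizes 100 and 1000 here are illustrative values. Note that the fresh points are labeled by the same target that generated the training data, and that the weights are not refit on them.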
#5
07-22-2012, 03:49 AM
 dsvav Junior Member Join Date: Jul 2012 Posts: 8
Re: HW 2 Problem 6

I am confused here; I don't understand what the final hypothesis is.

There are 1000 target functions and 1000 corresponding weight vectors/hypotheses in Problem 5.

So for Problem 6, I generate 1000 out-of-sample data points 1000 times; for each weight vector and target function (from Problem 5) I evaluate E_out on that out-of-sample data, and finally average the results. This is how I have done it.

I don't see a final hypothesis here. What am I missing? Any hint?

Could it be that in Problem 5 there is supposed to be only one target function and many in-sample data sets? If so, then the final hypothesis/weights could be the one that produces the minimum in-sample error E_in.

Thanks a lot.
#6
07-22-2012, 04:00 AM
 yaser Caltech Join Date: Aug 2009 Location: Pasadena, California, USA Posts: 1,477
Re: HW 2 Problem 6

Quote:
 Originally Posted by dsvav I am confused here; I don't understand what the final hypothesis is. There are 1000 target functions and 1000 corresponding weight vectors/hypotheses in Problem 5. So for Problem 6, I generate 1000 out-of-sample data points 1000 times; for each weight vector and target function (from Problem 5) I evaluate E_out on that out-of-sample data, and finally average the results. This is how I have done it. I don't see a final hypothesis here. What am I missing? Any hint? Could it be that in Problem 5 there is supposed to be only one target function and many in-sample data sets? If so, then the final hypothesis/weights could be the one that produces the minimum in-sample error E_in. Please clarify. Thanks a lot.
There is a final hypothesis for each of the 1000 runs. The only reason we are repeating the runs is to average out statistical fluctuations, but all the notions of the learning problem, including the final hypothesis, pertain to a single run.
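
The relationship between a single run and the repeated runs can be sketched as below. As before, this is an illustration under assumptions not stated in the thread: a random-line target on [-1, 1]^2, least-squares linear regression as the learner, hypothetical function names, and illustrative default sizes.

```python
import numpy as np

def one_run(n_train, n_fresh, rng):
    """One complete learning problem: one target, one training set, one final hypothesis."""
    # Assumed target: a line through two random points in [-1, 1]^2.
    p, q = rng.uniform(-1, 1, size=(2, 2))
    w1, w2 = q[1] - p[1], p[0] - q[0]
    f = np.array([-(w1 * p[0] + w2 * p[1]), w1, w2])

    def sample(n):
        X = np.column_stack([np.ones(n), rng.uniform(-1, 1, size=(n, 2))])
        return X, np.where(X @ f >= 0, 1.0, -1.0)

    # The final hypothesis g of THIS run comes from this run's training set only.
    X_train, y_train = sample(n_train)
    g, *_ = np.linalg.lstsq(X_train, y_train, rcond=None)

    def err(X, y):
        return np.mean(np.where(X @ g >= 0, 1.0, -1.0) != y)

    X_fresh, y_fresh = sample(n_fresh)  # fresh points, not used in training
    return err(X_train, y_train), err(X_fresh, y_fresh)

def average_over_runs(n_runs=1000, n_train=100, n_fresh=1000, seed=0):
    """Repeat independent runs and average, purely to smooth out fluctuations."""
    rng = np.random.default_rng(seed)
    results = np.array([one_run(n_train, n_fresh, rng) for _ in range(n_runs)])
    avg_e_in, avg_e_out = results.mean(axis=0)
    return avg_e_in, avg_e_out

if __name__ == "__main__":
    avg_e_in, avg_e_out = average_over_runs()
    print(f"average E_in  = {avg_e_in:.3f}")
    print(f"average E_out = {avg_e_out:.3f}")
```

Each call to `one_run` is self-contained: no hypothesis is selected across runs, and nothing carries over from one run to the next except the random number generator. The outer loop only averages the per-run errors.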
#7
07-22-2012, 05:55 AM
 dbaksi@gmail.com Junior Member Join Date: Jul 2012 Posts: 2
Re: HW 2 Problem 6

Thanks a lot. The statements about (i) N being the number of 'in-sample' training data in both problems and (ii) the freshly generated 1000 points being disjoint from the first set clarified the confusion I had.
