LFD Book Forum  

LFD Book Forum > Course Discussions > Online LFD course > Homework 2

#1  07-21-2012, 04:55 PM
dbaksi@gmail.com (Junior Member; Joined: Jul 2012; Posts: 2)
HW 2 Problem 6

How is this different from Problem 5, other than N=1000 and the fact that these simulated 'out of sample' points (for estimating E_out) are generated fresh? I may be missing something, but it seems to boil down to running the same program as in Problem 5 with N=1000, 1000 times. Can someone clarify, please? Thanks.
#2  07-21-2012, 10:00 PM
yaser (Caltech; Joined: Aug 2009; Pasadena, California, USA; Posts: 1,477)
Re: HW 2 Problem 6

Quote:
Originally Posted by dbaksi@gmail.com
How is this different from Problem 5, other than N=1000 and the fact that these simulated 'out of sample' points (for estimating E_out) are generated fresh? I may be missing something, but it seems to boil down to running the same program as in Problem 5 with N=1000, 1000 times. Can someone clarify, please? Thanks.
There are indeed instances in the homeworks where the same experiment covers a number of homework problems.

Problem 5 asks about E_{\rm in} while Problem 6 asks about (an estimate of) E_{\rm out}. In both problems, N=100 (N stands for the number of training examples in our notation).
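One run of the Problem 5 experiment might be sketched as follows. This is only a rough illustration, assuming the standard homework setup (the target is a random line through two points in [-1,1]^2, and linear regression is used for classification); all names and details here are illustrative, not the official solution.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, for reproducibility of the sketch

# Assumed setup: the target f is a random line through two points
# picked uniformly in [-1, 1]^2.
p, q = rng.uniform(-1, 1, (2, 2))

def f(X):
    # +1 on one side of the line through p and q, -1 on the other
    return np.sign((q[1] - p[1]) * X[:, 0] + (p[0] - q[0]) * X[:, 1]
                   + q[0] * p[1] - p[0] * q[1])

N = 100  # number of training examples, as in Problem 5
X = rng.uniform(-1, 1, (N, 2))
y = f(X)

# Linear regression: least-squares weights on the augmented inputs (x0 = 1)
A = np.column_stack([np.ones(N), X])
w, *_ = np.linalg.lstsq(A, y, rcond=None)

# In-sample classification error of the final hypothesis g(x) = sign(w^T x)
E_in = np.mean(np.sign(A @ w) != y)
print(E_in)
```

Repeating this over many independent runs and averaging E_in gives the quantity Problem 5 asks about.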
__________________
Where everyone thinks alike, no one thinks very much
#3  07-22-2012, 01:58 AM
MLearning (Senior Member; Joined: Jul 2012; Posts: 56)
Re: HW 2 Problem 6

It is my understanding that "fresh data" refers to cross-validation data. Do we then compute E_out using the weights obtained in Problem 5? When I do this, E_out < E_in. When I design the weights using the fresh data, E_out is approximately equal to E_in. Does this make sense?
#4  07-22-2012, 02:06 AM
yaser (Caltech; Joined: Aug 2009; Pasadena, California, USA; Posts: 1,477)
Re: HW 2 Problem 6

Quote:
Originally Posted by MLearning
It is my understanding that "fresh data" refers to cross-validation data. Do we then compute E_out using the weights obtained in Problem 5?
It is simpler than cross validation (a topic that will be covered in detail in a later lecture). You just generate new data points that were not involved in training and evaluate the final hypothesis g on those points.

The final hypothesis is indeed the one whose weights were determined in Problem 5, where the training took place.
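The "fresh points" step might look like the following sketch. The setup is assumed (random-line target, linear regression on N = 100 training points as in Problem 5); the stand-in training run is rebuilt here only so the snippet is self-contained, and all names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

# Assumed setup: target f is a random line through two points in [-1, 1]^2;
# w comes from linear regression on N = 100 training points (Problem 5).
p, q = rng.uniform(-1, 1, (2, 2))

def f(X):
    return np.sign((q[1] - p[1]) * X[:, 0] + (p[0] - q[0]) * X[:, 1]
                   + q[0] * p[1] - p[0] * q[1])

X_train = rng.uniform(-1, 1, (100, 2))
A_train = np.column_stack([np.ones(100), X_train])
w, *_ = np.linalg.lstsq(A_train, f(X_train), rcond=None)

# Estimate E_out: 1000 fresh points that were never involved in training,
# classified by the final hypothesis g(x) = sign(w^T x).
X_fresh = rng.uniform(-1, 1, (1000, 2))
A_fresh = np.column_stack([np.ones(1000), X_fresh])
E_out = np.mean(np.sign(A_fresh @ w) != f(X_fresh))
print(E_out)
```

Note that the fresh points are labeled by the same target f that generated the training data; only the inputs are new, so the disagreement rate estimates E_out for the trained g.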
__________________
Where everyone thinks alike, no one thinks very much
#5  07-22-2012, 04:49 AM
dsvav (Junior Member; Joined: Jul 2012; Posts: 8)
Re: HW 2 Problem 6

I am confused here; I don't understand what the final hypothesis is.

There are 1000 target functions and 1000 corresponding weight vectors/hypotheses in Problem 5.

So for Problem 6, I repeat the following 1000 times: generate 1000 out-of-sample points, and for each weight vector and target function (from Problem 5) evaluate E_out on that out-of-sample data; finally, I average the results. This is how I have done it.

I don't see a final hypothesis here. What am I missing? Any hint?

Could it be that in Problem 5 there is supposed to be only one target function and many in-sample data sets? If so, the final hypothesis/weights could be the one that produces the minimum in-sample error E_in.

Please clarify.
Thanks a lot.
#6  07-22-2012, 05:00 AM
yaser (Caltech; Joined: Aug 2009; Pasadena, California, USA; Posts: 1,477)
Re: HW 2 Problem 6

Quote:
Originally Posted by dsvav
I am confused here; I don't understand what the final hypothesis is.

There are 1000 target functions and 1000 corresponding weight vectors/hypotheses in Problem 5.

So for Problem 6, I repeat the following 1000 times: generate 1000 out-of-sample points, and for each weight vector and target function (from Problem 5) evaluate E_out on that out-of-sample data; finally, I average the results. This is how I have done it.

I don't see a final hypothesis here. What am I missing? Any hint?

Could it be that in Problem 5 there is supposed to be only one target function and many in-sample data sets? If so, the final hypothesis/weights could be the one that produces the minimum in-sample error E_in.

Please clarify.
Thanks a lot.
There is a final hypothesis for each of the 1000 runs. The only reason we are repeating the runs is to average out statistical fluctuations, but all the notions of the learning problem, including the final hypothesis, pertain to a single run.
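Under that reading, the full experiment might be sketched as below: each run gets its own target, its own training set, and its own final hypothesis, and the averages only smooth out statistical fluctuations. The setup is assumed as before (random-line target, linear regression); names and constants are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)
RUNS, N, N_OUT = 1000, 100, 1000  # runs, training points, fresh test points

def run_once(rng):
    # One complete learning problem: a fresh target, a fresh training set,
    # and one final hypothesis g(x) = sign(w^T x).
    p, q = rng.uniform(-1, 1, (2, 2))

    def f(X):
        return np.sign((q[1] - p[1]) * X[:, 0] + (p[0] - q[0]) * X[:, 1]
                       + q[0] * p[1] - p[0] * q[1])

    X = rng.uniform(-1, 1, (N, 2))
    A = np.column_stack([np.ones(N), X])
    w, *_ = np.linalg.lstsq(A, f(X), rcond=None)

    X2 = rng.uniform(-1, 1, (N_OUT, 2))
    A2 = np.column_stack([np.ones(N_OUT), X2])
    return np.mean(np.sign(A @ w) != f(X)), np.mean(np.sign(A2 @ w) != f(X2))

errs = np.array([run_once(rng) for _ in range(RUNS)])
E_in_avg, E_out_avg = errs.mean(axis=0)
print(E_in_avg, E_out_avg)
```

The E_out estimate for each run uses only that run's own target and that run's own weights; mixing targets and hypotheses across runs would not correspond to any single learning problem.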
__________________
Where everyone thinks alike, no one thinks very much
#7  07-22-2012, 06:55 AM
dbaksi@gmail.com (Junior Member; Joined: Jul 2012; Posts: 2)
Re: HW 2 Problem 6

Thanks a lot. The statements that (i) N is the number of in-sample training examples in both problems, and (ii) the freshly generated 1000 points are disjoint from the training set, cleared up my confusion.


The contents of this forum are to be used ONLY by readers of the Learning From Data book by Yaser S. Abu-Mostafa, Malik Magdon-Ismail, and Hsuan-Tien Lin, and participants in the Learning From Data MOOC by Yaser S. Abu-Mostafa. No part of these contents is to be communicated or made accessible to ANY other person or entity.