LFD Book Forum  

#11 | 07-23-2012, 02:41 PM
rakhlin (Member; Join Date: Jun 2012; Posts: 24)
Re: HW 2 Problem 6

Quote: Originally Posted by yaser
Just to clarify. You used the in-sample points to train and arrived at a final set of weights (corresponding to the final hypothesis). Each out-of-sample point is now tested on this hypothesis and compared to the target value on the same point. Now, what exactly do you do to get the two scenarios you are describing?
1st (normal) scenario: I test an out-of-sample data set (100 points) against the linear model. I repeat this 1000 times: generate 100 in-sample points, do a linear fit, generate 100 out-of-sample points, test. On each iteration I accumulate the number of misclassified points, and I average the errors when done. The average error is stable from run to run.

2nd scenario: I fit the linear model only once. Then I repeat 1000 times: generate 100 out-of-sample points, test. I accumulate and average the errors when done. Here I get remarkable variation in the average error.

I'd like to understand why these scenarios differ. I believe they shouldn't.
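Here is a minimal sketch of how I set up both scenarios, in case the difference is in the procedure rather than the concept (this assumes the standard HW 2 setup: the target f is a random line through two uniform points in [-1,1]^2 and labels are ±1 by which side a point falls on; the helper names are mine, not from the homework):

[code]
# Minimal sketch of both scenarios (assumed HW 2 setup: random line target
# in [-1,1]^2, linear regression on the +/-1 labels). Helper names are
# illustrative only.
import numpy as np

rng = np.random.default_rng(0)

def random_target():
    # Line through two random points; label = which side of the line.
    p, q = rng.uniform(-1, 1, (2, 2))
    return lambda X: np.sign((q[0] - p[0]) * (X[:, 1] - p[1])
                             - (q[1] - p[1]) * (X[:, 0] - p[0]))

def sample(n):
    # n uniform points in [-1,1]^2; Z adds the constant bias coordinate.
    X = rng.uniform(-1, 1, (n, 2))
    return np.column_stack([np.ones(n), X]), X

def fit(Z, y):
    return np.linalg.pinv(Z) @ y            # linear regression weights

def err(w, Z, y):
    return np.mean(np.sign(Z @ w) != y)     # fraction misclassified

# Scenario 1: fresh target and fresh fit on every one of the 1000 runs.
e1 = []
for _ in range(1000):
    f = random_target()
    Z, X = sample(100)
    w = fit(Z, f(X))
    Zo, Xo = sample(100)
    e1.append(err(w, Zo, f(Xo)))
print("scenario 1 average error:", np.mean(e1))

# Scenario 2: one target and one fit, then 1000 fresh test sets. The
# average now depends on how good that single fit happened to be.
f = random_target()
Z, X = sample(100)
w = fit(Z, f(X))
e2 = []
for _ in range(1000):
    Zo, Xo = sample(100)
    e2.append(err(w, Zo, f(Xo)))
print("scenario 2 average error:", np.mean(e2))
[/code]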
#12 | 07-23-2012, 04:04 PM
yaser (Caltech; Join Date: Aug 2009; Location: Pasadena, California, USA; Posts: 1,477)
Re: HW 2 Problem 6

Quote: Originally Posted by rakhlin
2nd scenario: I fit the linear model only once. Then I repeat 1000 times: generate 100 out-of-sample points, test. I accumulate and average the errors when done. Here I get remarkable variation in the average error.
Does variation in the average error mean that you repeat the entire experiment you described (including the target, training set, and resulting linear fit) and look at the different averages you get?
__________________
Where everyone thinks alike, no one thinks very much
#13 | 07-23-2012, 05:59 PM
rakhlin (Member; Join Date: Jun 2012; Posts: 24)
Re: HW 2 Problem 6

Quote: Originally Posted by yaser
Does variation in the average error mean that you repeat the entire experiment you described (including the target, training set, and resulting linear fit) and look at the different averages you get?
Not quite. When I repeat the entire experiment (including the target, training set, and resulting linear fit) I always get approximately the same averages. I get different averages (0.01 ... 0.13) when I use one target and average over 1000 out-of-sample sets for that single target.

Now that I have plotted both lines (target and hypothesis) per your advice, I am beginning to think this may be what we should expect. Linear regression does not always fit well. Usually the fit looks good and gives a small in-sample error, but sometimes the disagreement is visually large (>0.1 in-sample error). This is the root of the variation in average error when I reuse the same "bad" regression for all 1000 iterations. I hope this type of experiment isn't what the problem intends; otherwise it has no definite answer - at least two of the answer choices would match.

So there is another question: is an in-sample error above 0.1, with a visually non-optimal fit, still a valid outcome of linear regression on linearly separable data?
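A quick way to check how often that happens, reusing the helpers from the sketch in post #11 (the 0.1 threshold is just the figure mentioned above):

[code]
# How often does a single linear-regression fit land a "bad" in-sample
# error? Reuses random_target, sample, fit, err from the earlier sketch.
e_in = []
for _ in range(1000):
    f = random_target()
    Z, X = sample(100)
    w = fit(Z, f(X))
    e_in.append(err(w, Z, f(X)))
e_in = np.array(e_in)
print("mean E_in:", e_in.mean())
print("fraction of runs with E_in > 0.1:", (e_in > 0.1).mean())
[/code]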
#14 | 07-23-2012, 08:59 PM
ilya239 (Senior Member; Join Date: Jul 2012; Posts: 58)
Re: HW 2 Problem 6

Quote: Originally Posted by rakhlin
When I generate new data and a new hypothesis for every single run of 1000 (as the problem suggests), I get a stable out-of-sample result close to (slightly greater than) the in-sample error.
When I estimate 1000 different out-of-sample sets for one in-sample set and a single hypothesis, I get very different average error rates, with high variability from 0.01 to 0.13. Why so?
That's like Problem 1: if you try enough different out-of-sample sets, inevitably there will be one on which your hypothesis does great and one on which it does badly. As an extreme case, if your out-of-sample test size were 1 instead of 1000, on some of these test sets you'd get a 0% error rate and on some a 100% error rate. To get an actual estimate of the out-of-sample error rate you should pool all of them together.
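To put a number on this (a standard binomial-variance fact, not something from the thread): for a fixed hypothesis with true out-of-sample error p, the error measured on a fresh test set of N points is an average of N Bernoulli(p) indicators, so its standard deviation is

\[
\operatorname{std}\big[\hat{E}_{\text{test}}\big] = \sqrt{\frac{p(1-p)}{N}}.
\]

For example, p = 0.05 and N = 100 gives about 0.022 per test set, and pooling 1000 such sets shrinks this by a further factor of \(\sqrt{1000}\).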
#15 | 07-24-2012, 02:02 AM
rakhlin (Member; Join Date: Jun 2012; Posts: 24)
Re: HW 2 Problem 6

Quote: Originally Posted by ilya239
That's like Problem 1: if you try enough different out-of-sample sets, inevitably there will be one on which your hypothesis does great and one on which it does badly. As an extreme case, if your out-of-sample test size were 1 instead of 1000, on some of these test sets you'd get a 0% error rate and on some a 100% error rate. To get an actual estimate of the out-of-sample error rate you should pool all of them together.
That is not the case here - I average over 1000 out-of-sample sets anyway. It seems the reason is the large variation in in-sample error across different training samples.
#16 | 07-24-2012, 08:17 AM
MLearning (Senior Member; Join Date: Jul 2012; Posts: 56)
Re: HW 2 Problem 6

I also observe some discrepancy while computing E_out. When I hold the target function fixed, E_out is approximately equal to E_in. When I use a different target function for each experiment, E_out is significantly higher than E_in. Is this expected?
#17 | 07-24-2012, 08:54 AM
dsvav (Junior Member; Join Date: Jul 2012; Posts: 8)
Re: HW 2 Problem 6

@MLearning

In my opinion E_out should be near E_in even when the target is not fixed (i.e., when averaging over 1000 iterations).

Unfortunately the problem could be anywhere, but it is most probably in how the error is computed.

Could it be that when you compute the error for E_out in one iteration, the number of out-of-sample points is 1000, and you forgot to change the number of sample points from 100 (from Q5) to 1000? That may be the cause of the difference (error = misclassified / sample_size).

I am just guessing, since I make this kind of mistake often.
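To make the suspected slip concrete, here is a sketch reusing the helpers from the first code sketch in this thread (the 1000-point test size is from the problem statement; everything else is illustrative):

[code]
# The suspected denominator slip: count mistakes on 1000 test points but
# divide by the Q5 sample size of 100. Reuses random_target, sample, fit.
f = random_target()
Z, X = sample(100)
w = fit(Z, f(X))
Zo, Xo = sample(1000)                       # out-of-sample size for Q6
mistakes = np.sum(np.sign(Zo @ w) != f(Xo))
e_out_wrong = mistakes / 100                # stale Q5 denominator: 10x too big
e_out_right = mistakes / 1000               # misclassified / sample_size
print(e_out_wrong, e_out_right)
[/code]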
#18 | 07-24-2012, 09:33 AM
MLearning (Senior Member; Join Date: Jul 2012; Posts: 56)
Re: HW 2 Problem 6

@dsvav,

Thank you for your comments. You were right, I did forget to change the sample size (N) to 1000, but that doesn't change the result. It is possible for E_out to differ from E_in, even though we want them close. After all, we are applying the linear regression hypothesis to random data it hasn't seen before; hence the larger deviation between E_out and E_in.
#19 | 07-24-2012, 09:52 AM
dsvav (Junior Member; Join Date: Jul 2012; Posts: 8)
Re: HW 2 Problem 6

@MLearning

When I compute the difference between E_in and E_out, I get around 0.01.

I still think the difference should not be significant - doesn't this follow from the Hoeffding Inequality?
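For reference, the bound being invoked, in its fixed-hypothesis form (a single hypothesis h evaluated on N independent points; once the hypothesis is chosen using the data, the union-bound version with the factor M from the lectures applies instead):

\[
\mathbb{P}\big[\,|E_{\text{in}}(h) - E_{\text{out}}(h)| > \epsilon\,\big] \;\le\; 2e^{-2\epsilon^2 N}.
\]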

Also, since we suppress both the "very bad event happening" and the "very good event happening" by averaging over 1000 runs, E_out should track E_in.

This is my understanding; there is a good chance that I am wrong.

By the way, what difference are you getting?
#20 | 07-24-2012, 10:22 AM
rakhlin (Member; Join Date: Jun 2012; Posts: 24)
Re: HW 2 Problem 6

Quote: Originally Posted by MLearning
I also observe some discrepancy while computing E_out. When I hold the target function fixed, E_out is approximately equal to E_in. When I use a different target function for each experiment, E_out is significantly higher than E_in. Is this expected?
For a given target and regression fit, E_in and E_out should not deviate much from each other for large N. The intuition: the error region between the two lines is fixed, and the points used to measure E_in and E_out are drawn uniformly from the same distribution, so both errors estimate the probability of landing in that region.
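That intuition can be checked directly, reusing the helpers from the first code sketch: for a fixed fit, E_out is just the probability mass of the region where the target and the hypothesis disagree, which a dense sample estimates well.

[code]
# E_out of a fixed fit = probability mass of the wedge where target f and
# hypothesis w disagree. Estimate it with a dense uniform sample, reusing
# random_target, sample, fit, err from the first sketch.
f = random_target()
Z, X = sample(100)
w = fit(Z, f(X))
Zb, Xb = sample(100_000)            # dense sample of [-1,1]^2
print("E_in :", err(w, Z, f(X)))
print("E_out:", err(w, Zb, f(Xb)))  # ~ area of disagreement region / 4
[/code]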