LFD Book Forum  

  #1  
Old 09-13-2012, 10:24 PM
munchkin munchkin is offline
Member
 
Join Date: Jul 2012
Posts: 38
Default P14-17 Out-Of-Sample Data Set Size?

Should the out-of-sample data set also be 100 randomly-generated points or should it be larger like in several of the earlier homeworks? Thanks for your attention.
  #2  
Old 09-13-2012, 11:13 PM
yaser yaser is offline
Caltech
 
Join Date: Aug 2009
Location: Pasadena, California, USA
Posts: 1,477
Default Re: P15-18 Out-Of-Sample Data Set Size?

Quote:
Originally Posted by munchkin View Post
Should the out-of-sample data set also be 100 randomly-generated points or should it be larger like in several of the earlier homeworks? Thanks for your attention.
Larger will give you a more reliable estimate of E_{\rm out}, and that may be necessary to make sure that the chosen answer is correct.
__________________
Where everyone thinks alike, no one thinks very much
  #3  
Old 09-13-2012, 11:24 PM
JohnH JohnH is offline
Member
 
Join Date: Jul 2012
Posts: 43
Default Re: P15-18 Out-Of-Sample Data Set Size?

The size of the out-of-sample set determines the accuracy and precision of the estimate of E_{out}. I've generally used sets of at least 1000 points.
  #4  
Old 09-14-2012, 12:53 AM
Andrs Andrs is offline
Member
 
Join Date: Jul 2012
Posts: 47
Default Re: P15-18 Out-Of-Sample Data Set Size?

In general, the answer alternatives have good margins. If you run at least 1000 experiments with at least 100 test points each, you should get "reasonable results" (E_out) on average. As always in machine learning, the more test data, the better.
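
The averaging effect can be sketched with a small simulation. This is only an illustration under simplifying assumptions: every run is given the same hypothetical true error (in the real experiments each run learns a different hypothesis), and test errors are drawn as independent coin flips. All names and numbers are made up for the example.

```python
import random
import statistics

def estimate_e_out(true_error, n_test, rng):
    """One noisy estimate of E_out from n_test simulated test points."""
    errors = sum(rng.random() < true_error for _ in range(n_test))
    return errors / n_test

def averaged_estimate(true_error, n_test, n_runs, seed=0):
    """Mean and spread of the per-run estimates over n_runs experiments."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    estimates = [estimate_e_out(true_error, n_test, rng) for _ in range(n_runs)]
    return statistics.mean(estimates), statistics.stdev(estimates)

# Hypothetical true error 0.1, 100 test points per run, 1000 runs.
mean, spread = averaged_estimate(0.1, 100, 1000)
print("average estimate:", mean)
print("spread of single-run estimates:", spread)
```

Each single run is noisy with only 100 test points, but the average over 1000 runs is much tighter, since its standard error falls roughly as the single-run spread divided by sqrt(1000).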
  #5  
Old 09-14-2012, 03:45 AM
TonySuarez TonySuarez is offline
Member
 
Join Date: Jul 2012
Location: Lisboa, Portugal
Posts: 35
Default Re: P15-18 Out-Of-Sample Data Set Size?

Quote:
Originally Posted by munchkin View Post
Should the out-of-sample data set also be 100 randomly-generated points or should it be larger like in several of the earlier homeworks? Thanks for your attention.
I settled on 200 points for testing, using the same set for the whole batch of 1000 runs, and E_out seemed very stable.
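
A minimal sketch of reusing one fixed test set across runs follows. The target function, the input range [-1, 1] x [-1, 1], the seed, and the family of hypotheses are all hypothetical stand-ins chosen for illustration; they are not the ones from the problems.

```python
import random

def make_test_set(n_points, seed):
    """Draw a test set once; the seed makes it reproducible."""
    rng = random.Random(seed)
    return [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(n_points)]

def target(x1, x2):
    """Hypothetical target function, used only for this illustration."""
    return 1 if x2 >= x1 else -1

def make_hypothesis(offset):
    """Hypothetical learned hypothesis: the target boundary shifted by offset."""
    return lambda x1, x2: 1 if x2 >= x1 + offset else -1

def error_rate(hypothesis, points):
    """Fraction of points where the hypothesis disagrees with the target."""
    wrong = sum(hypothesis(x1, x2) != target(x1, x2) for x1, x2 in points)
    return wrong / len(points)

# Generated once and shared by every run, so all runs are scored on the same points.
TEST_SET = make_test_set(200, seed=42)

for offset in (0.0, 0.1, 0.3):
    print(offset, error_rate(make_hypothesis(offset), TEST_SET))
```

One trade-off of this design: sharing the test set makes comparisons between runs consistent, but any quirk of that particular set of 200 points affects every run in the same way, so averaging over runs does not remove it.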
  #6  
Old 09-14-2012, 10:01 AM
samirbajaj samirbajaj is offline
Member
 
Join Date: Jul 2012
Location: Silicon Valley
Posts: 48
Default Re: P15-18 Out-Of-Sample Data Set Size?

Quote:
Originally Posted by Andrs View Post
In general, the answer alternatives have good margins. ...

For questions 17 and 18, my answers are different from one set of experiments to the next.

Can't say any more, but I'm wondering if anyone else had a similar experience.

Thanks.

-Samir
The contents of this forum are to be used ONLY by readers of the Learning From Data book by Yaser S. Abu-Mostafa, Malik Magdon-Ismail, and Hsuan-Tien Lin, and participants in the Learning From Data MOOC by Yaser S. Abu-Mostafa. No part of these contents is to be communicated or made accessible to ANY other person or entity.