  #7  
08-10-2012, 11:34 PM
yaser
Caltech
 
Join Date: Aug 2009
Location: Pasadena, California, USA
Posts: 1,477
Re: Data snooping (test vs. train data)

Quote:
Originally Posted by rseiter View Post
The three cases I am trying to distinguish (understand how they compare in the effect on d_vc) are:
1. The learning algorithm chooses the discretization to use.
2. I choose the discretization to use based on looking at the data (snooping).
3. I choose the discretization to use based on my prior knowledge (without looking at the data).
You are right that case 3 patently has no snooping. It seems to me that in both cases 1 and 2 the choice of discretization can depend entirely on the inputs of the data set, without looking at the outputs (labels), so those cases would not involve snooping either.
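
To make the "inputs only" point concrete, here is a minimal Python sketch. It assumes equal-frequency (quantile) binning as the discretization, which is just one possible choice, and the names `choose_bin_edges` and `discretize` are hypothetical helpers made up for illustration. The thing to notice is that the function choosing the bins never receives the labels.

```python
import numpy as np

def choose_bin_edges(x, n_bins):
    """Pick interior bin edges from the inputs alone (equal-frequency bins).

    Hypothetical helper: the labels y are deliberately not an argument.
    """
    # Interior quantile levels, e.g. n_bins=4 gives 0.25, 0.5, 0.75
    levels = np.linspace(0.0, 1.0, n_bins + 1)[1:-1]
    return np.quantile(x, levels)

def discretize(x, edges):
    """Map each input to the index of the bin it falls in (0 .. len(edges))."""
    return np.digitize(x, edges)

rng = np.random.default_rng(0)
x = rng.normal(size=200)                      # inputs
y = np.sign(x + 0.3 * rng.normal(size=200))   # labels, never used to pick the bins

edges = choose_bin_edges(x, n_bins=4)         # depends on x only
x_binned = discretize(x, edges)               # discretized inputs for the learner
```

Whatever the labels turn out to be, `edges` comes out the same, since `y` is never passed in. If the bins were instead tuned by checking which edges separate the labels best, the choice would depend on the outputs.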
__________________
Where everyone thinks alike, no one thinks very much