08-10-2012, 11:34 PM
yaser
Join Date: Aug 2009
Location: Pasadena, California, USA
Posts: 1,478
Re: Data snooping (test vs. train data)

Originally Posted by rseiter View Post
The three cases I am trying to distinguish (understand how they compare in the effect on d_vc) are:
1. The learning algorithm chooses the discretization to use.
2. I choose the discretization to use based on looking at the data (snooping).
3. I choose the discretization to use based on my prior knowledge (without looking at the data).
You are right that case 3 patently involves no snooping. It seems to me that in both cases 1 and 2 the choice of discretization can depend entirely on the inputs of the data set, without looking at the outputs (labels), in which case those would not involve snooping either.
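
To make the input-only point concrete, here is a minimal sketch of a discretization picked from the inputs alone. It is only an illustration under assumptions of mine: the quantile-based binning rule, the synthetic data, and the helper names `choose_bins_from_inputs` and `discretize` are hypothetical and are not taken from the discussion above.

```python
import numpy as np

def choose_bins_from_inputs(x, n_bins=4):
    """Pick bin edges from quantiles of the inputs only (hypothetical helper).

    The labels are never passed in, so the choice cannot depend on them.
    """
    interior = np.linspace(0.0, 1.0, n_bins + 1)[1:-1]  # interior quantile levels
    return np.quantile(x, interior)

def discretize(x, edges):
    """Map each input to the index of the bin it falls in (0 .. len(edges))."""
    return np.digitize(x, edges)

# Synthetic data, assumed purely for illustration.
rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = np.sign(x + 0.3 * rng.normal(size=200))  # labels: not used to choose the bins

edges = choose_bins_from_inputs(x, n_bins=4)  # depends on x only
x_binned = discretize(x, edges)
```

The design choice that matters here is the signature: `choose_bins_from_inputs` takes `x` and nothing else, so whatever rule goes inside it, the labels `y` cannot influence the discretization.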
Where everyone thinks alike, no one thinks very much