Old 09-14-2012, 11:12 AM
yaser
Join Date: Aug 2009
Location: Pasadena, California, USA
Posts: 1,478
Re: Hiding Initial Hypothesis from the Data

Originally Posted by david.vorick
Suppose I have a set of hypotheses, but I also want to look at the data to refine my choice of which hypotheses to test.

So I randomly select 1/3 of the data points, see how my initial hypotheses perform on them, refine them, and pick a new set H.

If I test the new H on the remaining 2/3 of the data, can I disregard the first 1/3 of the data and the first hypotheses that I tested, and therefore use the smaller H in Hoeffding's bound? Or do I still have to consider all of the hypotheses tested so far?
The remaining 2/3 of the data were not involved in selecting the new hypothesis set, so you can apply the Hoeffding/VC bounds to that 2/3 of the data and the new hypothesis set alone, with N taken as the number of points in the 2/3.
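
To make this concrete, here is a rough sketch of the finite-hypothesis Hoeffding bound, P[|E_in - E_out| > eps] <= 2 M exp(-2 eps^2 N), evaluated on the held-out 2/3 only. The dataset size, the size M of the refined set, and the tolerance eps are made-up numbers for illustration, and the function name is hypothetical. It also assumes the refined H is finite; for an infinite set the VC bound would replace M with the growth function.

```python
import math


def hoeffding_bound(num_hypotheses, num_points, epsilon):
    """Union-bound Hoeffding: 2 * M * exp(-2 * eps^2 * N)."""
    return 2 * num_hypotheses * math.exp(-2 * epsilon ** 2 * num_points)


# Hypothetical numbers, chosen only for illustration.
total_points = 3000
explore_points = total_points // 3            # the 1/3 used to refine H
held_out_points = total_points - explore_points  # the untouched 2/3

refined_set_size = 10   # M for the new H (assumed finite)
epsilon = 0.05

# Only the refined set and the held-out points enter the bound;
# the initial hypotheses and the first 1/3 of the data do not appear.
bound = hoeffding_bound(refined_set_size, held_out_points, epsilon)
print(f"N = {held_out_points}, M = {refined_set_size}, bound = {bound:.3e}")
```

The price of the split shows up in N: the bound uses 2/3 of the points rather than all of them, in exchange for counting only the refined hypotheses.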
Where everyone thinks alike, no one thinks very much