02-05-2013, 09:18 PM
yaser (Caltech)
Re: Is the Hoeffding Inequality really valid for each bin despite non-random sampling

Quote:
Originally Posted by scottedwards2000:
The multiple bin analogy of picking the best h is a very helpful way of visualizing the situation, and I totally get how the union bound sets an upper limit on the probability of exceeding the error threshold. What I am actually questioning is whether those individual probabilities that compose the union bound are correct. I can see that they are just the individual Hoeffding Inequalities for each h, but is the Hoeffding Inequality really valid for all those h's in spite of the fact that we are NOT taking random samples from each "bin"? We are only picking our marbles (x's) ONCE, and then re-picking the same marbles from each bin (yes, the red-green colors of those marbles can change, based on the specific h, but aren't they the same marbles (x's)?).
I see your concern. Here is one way to argue about it.

Assume each marble carries a number identifying the point it is associated with, in addition to being green or red. Start with one bin, pick the marbles at random, and look at the colors in sample and out of sample. The Hoeffding inequality clearly holds for this bin, since this is exactly the single-bin experiment; the numbers on the sampled marbles play no role in it.
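
For reference, the single-bin statement being relied on here can be written out as follows. The symbols are not used in this post and are assumed from the usual notation: \mu is the fraction of red marbles in the bin, \nu is the fraction of red marbles in a random sample of size N, and \epsilon > 0 is the tolerance.

```latex
\mathbb{P}\bigl[\,|\nu - \mu| > \epsilon\,\bigr] \;\le\; 2e^{-2\epsilon^{2}N}
```

Nothing in this bound refers to the numbers on the marbles; it only needs the N marbles to be drawn at random (independently) from the bin.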

Now if you reuse these numbers to do something else somewhere else, that will not alter the applicability of Hoeffding to the bin we started with, right? You can now view the multiple-bin experiment as starting with any given bin, then reusing the numbers you got in sample from that bin for the rest of the bins. Hoeffding still applies to that original given bin, regardless of which one it is. Since that bin is arbitrary, Hoeffding must apply to all of the bins individually.
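
The argument above can be checked empirically with a small simulation. This is only a sketch under assumptions that are not part of the post: a finite input space of 1000 numbered points with a uniform distribution, five bins whose red/green colorings are fixed in advance with arbitrarily chosen error rates, a sample size of 50, and a tolerance of 0.2. The function name `simulate` and all of its parameters are hypothetical. The marble numbers are drawn once per trial and reused in every bin, which is exactly the situation in question.

```python
import math
import random


def simulate(num_points=1000, num_bins=5, n=50, eps=0.2, trials=2000, seed=0):
    """Estimate how often each bin's sample frequency deviates from its true
    frequency by more than eps when all bins share the same sampled points."""
    rng = random.Random(seed)

    # One bin per hypothesis, fixed before any data is drawn.
    # red[j][x] is True when marble number x is red in bin j.
    red = []
    for j in range(num_bins):
        p = (j + 1) / (num_bins + 1)  # arbitrary illustrative error rates
        red.append([rng.random() < p for _ in range(num_points)])

    # mu[j]: exact fraction of red marbles in bin j (the out-of-sample value).
    mu = [sum(colors) / num_points for colors in red]

    bad = [0] * num_bins  # per-bin count of large deviations
    any_bad = 0           # trials in which at least one bin deviated
    for _ in range(trials):
        # Draw the marble numbers ONCE, then reuse them in every bin.
        sample = [rng.randrange(num_points) for _ in range(n)]
        deviated = False
        for j in range(num_bins):
            nu = sum(red[j][x] for x in sample) / n  # in-sample fraction
            if abs(nu - mu[j]) > eps:
                bad[j] += 1
                deviated = True
        any_bad += deviated

    bound = 2 * math.exp(-2 * eps ** 2 * n)  # single-bin Hoeffding bound
    return [b / trials for b in bad], any_bad / trials, bound


if __name__ == "__main__":
    per_bin, any_bin, bound = simulate()
    print("per-bin deviation frequencies:", per_bin)
    print("single-bin Hoeffding bound:   ", bound)
    print("any-bin deviation frequency:  ", any_bin)
    print("union bound:                  ", min(1.0, len(per_bin) * bound))
```

Each per-bin frequency should come out below the single-bin bound even though the bins share their sample, and the frequency with which any bin deviates should stay below the union bound. The deviations across bins are not independent of each other here, but neither the individual Hoeffding bounds nor the union bound requires that.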
__________________
Where everyone thinks alike, no one thinks very much