
#1




Hoeffding Inequality
Hi,
On page 22, it says, "the hypothesis h is fixed before you generate the data set, and the probability is with respect to random data sets D; we emphasize that the assumption "h is fixed before you generate the data set" is critical to the validity of this bound". A few questions:
1. Does the "data set" in "generate the data set" refer to the marbles (the data set D) we pick randomly from the jar, or to the set of outputs (red/green) of h(x) on D?
2. The text keeps mentioning that "h is fixed before you generate the data set". Does this mean that, in machine learning, the set of hypotheses must be defined before seeing any training data, and that no h can be added to the set after seeing the training data?
Thanks!
#2




Re: Hoeffding Inequality
Quote:
Quote:
__________________
Where everyone thinks alike, no one thinks very much 
#3




Re: Hoeffding Inequality
Thanks for your quick reply Professor!
Now, I wonder why "we cannot just plug in g for h in the Hoeffding inequality". g is one of the h's, and for each h the Hoeffding inequality gives a valid upper bound on $P[|E_{in}(h) - E_{out}(h)| > \epsilon]$. Even if g is picked after we look at the outputs of all the h's, g is still one of the h's, so shouldn't the Hoeffding inequality still be valid for g? Thanks!
#4




Re: Hoeffding Inequality
Quote:
#5




Re: Hoeffding Inequality
Hi Professor,
I'm having a hard time understanding how choosing a hypothesis changes a theorem like the Hoeffding inequality. Let's say h1(x) < P1 and h2(x) < P2, and we choose h2 to be g. Is h2(x) < P2 then no longer true? I sort of understand your example: because we pick the run of coin flips that produces the most heads, a plot of the results shows that the Hoeffding inequality doesn't apply. But the Hoeffding inequality is a statement about probability, so reality might be off a bit. Maybe I'm heading in the wrong direction?
#6




Re: Hoeffding Inequality
It's a subtle point. It is "cherry picking" if we fish, over many trials, for a sample that has certain properties, instead of drawing one sample fairly for a hypothesis that was fixed in advance.
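As a rough sketch of the distinction in symbols: the notation below ($N$ for the sample size, $M$ for the number of hypotheses, $\epsilon$ for the tolerance) is assumed here and does not appear earlier in this thread. For a single $h$ fixed before the data is drawn, the Hoeffding inequality reads

$$
P\big[\,|E_{in}(h) - E_{out}(h)| > \epsilon\,\big] \le 2e^{-2\epsilon^2 N}.
$$

If $g$ is selected from $h_1, \dots, h_M$ after looking at the data, then all we know is that the bad event for $g$ is contained in the union of the bad events for the individual hypotheses, so the union bound gives the weaker statement

$$
P\big[\,|E_{in}(g) - E_{out}(g)| > \epsilon\,\big] \le \sum_{m=1}^{M} P\big[\,|E_{in}(h_m) - E_{out}(h_m)| > \epsilon\,\big] \le 2Me^{-2\epsilon^2 N}.
$$

Each individual bound stays true; what changes is which event we are asking about once the choice of $g$ depends on the sample.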
Statements involving probability are tricky because they don't guarantee a particular outcome, just the likelihood of getting that outcome. Therefore, changing the game to allow more trials or different conditions would change the probabilities.
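The coin-flipping example mentioned above can be checked numerically. The following is only a minimal sketch under assumed settings (1000 fair coins, 10 flips each, tolerance 0.45; the function names `hoeffding_bound` and `simulate` are made up for this illustration). It compares a coin chosen before flipping with the coin picked afterwards for having the most heads.

```python
import math
import random


def hoeffding_bound(eps, num_flips):
    """Hoeffding bound 2*exp(-2*eps^2*N) for a single coin fixed in advance."""
    return 2.0 * math.exp(-2.0 * eps ** 2 * num_flips)


def simulate(num_runs=1000, num_coins=1000, num_flips=10, eps=0.45, seed=0):
    """Estimate how often |nu - mu| > eps for a fixed coin and for a cherry-picked coin.

    All parameter values are illustrative assumptions, not taken from the thread.
    """
    rng = random.Random(seed)
    mu = 0.5  # true probability of heads for every (fair) coin
    fixed_bad = 0
    picked_bad = 0
    for _ in range(num_runs):
        # Each coin's head count = number of 1 bits among num_flips random bits.
        heads = [bin(rng.getrandbits(num_flips)).count("1") for _ in range(num_coins)]
        nu_fixed = heads[0] / num_flips     # coin chosen before any flips
        nu_picked = max(heads) / num_flips  # coin chosen after seeing all results
        fixed_bad += abs(nu_fixed - mu) > eps
        picked_bad += abs(nu_picked - mu) > eps
    return fixed_bad / num_runs, picked_bad / num_runs


if __name__ == "__main__":
    fixed_rate, picked_rate = simulate()
    print("Hoeffding bound for one fixed coin:", hoeffding_bound(0.45, 10))
    print("Observed rate, coin fixed in advance:", fixed_rate)
    print("Observed rate, coin picked after the fact:", picked_rate)
```

With these settings the fixed coin should stay well under the bound, while the cherry-picked coin should exceed it by a wide margin, even though every coin individually obeys the inequality.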
Tags 
fixed hypothesis, hoeffding inequality 