LFD Book Forum Hoeffding Inequality

#1
09-18-2015, 11:26 PM
 henry2015 Member Join Date: Aug 2015 Posts: 31
Hoeffding Inequality

Hi,

On page 22, it says, "the hypothesis h is fixed before you generate the data set, and the
probability is with respect to random data sets D; we emphasize that the assumption "h is fixed before you generate the data set" is critical to the validity of this bound".

A few questions:
1. Does the "data set" in "generate the data set" refer to the marbles (which are the data set D) we pick randomly from the jar? Or does it refer to the set of outputs (red/green) of h(x) on D?
2. It keeps mentioning "h is fixed before you generate the data set". Does it mean that, in machine learning, the set of hypotheses h should be predefined before seeing any training data, and no h can be added to the set after seeing the training data?

Thanks!
#2
09-18-2015, 11:38 PM
 yaser Caltech Join Date: Aug 2009 Location: Pasadena, California, USA Posts: 1,477
Re: Hoeffding Inequality

Quote:
 Originally Posted by henry2015 1. Does the "data set" in "generate the data set" refer to the marbles (which are the data set D) we pick randomly from the jar? Or does it refer to the set of outputs (red/green) of h(x) on D?
The target f is assumed to be fixed, so since h is also fixed, the colors of all marbles are fixed, and picking the data set means picking the marbles in the sample.
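
For a single fixed hypothesis, the claim can be checked numerically. The sketch below is only an illustration under assumed values: the fraction of red marbles `mu = 0.6`, the sample size `n = 100`, the tolerance `eps = 0.1`, and the function names are all hypothetical choices, not taken from the book. It draws many data sets from the same bin and compares how often the sample frequency misses `mu` by more than `eps` with the Hoeffding bound 2·exp(-2·eps²·n).

```python
import math
import random


def hoeffding_bound(eps, n):
    """Right-hand side of the Hoeffding inequality for one fixed hypothesis."""
    return 2.0 * math.exp(-2.0 * eps * eps * n)


def violation_rate(mu=0.6, n=100, eps=0.1, num_datasets=5000, seed=0):
    """Fraction of random data sets whose sample frequency nu misses mu by more than eps.

    mu, n and eps are illustrative assumptions. The bin (and hence the color
    of every marble) is fixed; only the sample changes between data sets.
    """
    rng = random.Random(seed)
    violations = 0
    for _ in range(num_datasets):
        # Draw n marbles; each is red with probability mu.
        nu = sum(rng.random() < mu for _ in range(n)) / n
        if abs(nu - mu) > eps:
            violations += 1
    return violations / num_datasets


if __name__ == "__main__":
    print("empirical violation rate:", violation_rate())
    print("Hoeffding bound:         ", hoeffding_bound(0.1, 100))
```

With the hypothesis fixed in advance, the empirical rate should stay below the bound, usually by a wide margin, since Hoeffding is a worst-case guarantee.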

Quote:
 2. It keeps mentioning "h is fixed before you generate the data set". Does it mean that, in machine learning, the set of hypotheses h should be predefined before seeing any training data, and no h can be added to the set after seeing the training data?
This is the assumption that the theory is based on. If one wants to add hypotheses after seeing the data and still apply the theory, one should take the set of hypotheses to include all potential hypotheses that may be added (whatever the data set may be).
__________________
Where everyone thinks alike, no one thinks very much
#3
09-19-2015, 12:36 AM
 henry2015 Member Join Date: Aug 2015 Posts: 31
Re: Hoeffding Inequality

Now, I wonder why "we cannot just plug in g for h in the Hoeffding inequality". g is one of the h's, and for each h the Hoeffding inequality gives a valid upper bound on P[|Ein(h) - Eout(h)| > ε]. Even if g is picked after we look at the outputs of all the h's, g is still one of the h's. So the Hoeffding inequality should still be valid for g. No?

Thanks!
#4
09-19-2015, 03:56 AM
 yaser Caltech Join Date: Aug 2009 Location: Pasadena, California, USA Posts: 1,477
Re: Hoeffding Inequality

Quote:
 Originally Posted by henry2015 Thanks for your quick reply Professor! Now, I wonder why "we cannot just plug in g for h in the Hoeffding inequality". g is one of the h's, and for each h the Hoeffding inequality gives a valid upper bound on P[|Ein(h) - Eout(h)| > ε]. Even if g is picked after we look at the outputs of all the h's, g is still one of the h's. So the Hoeffding inequality should still be valid for g. No? Thanks!
This is the main point of this part. Take the coin flipping example, with each of 1000 fair coins flipped 10 times. Hoeffding applies to each coin, right? Now if we pick "g" to be the coin that produced the most heads, we lose the Hoeffding guarantee because the small probability of bad behavior for each coin accumulates into a not-so-small probability of bad behavior of some coin (which we picked deliberately because it behaved badly).
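
The coin example can be made concrete with a short simulation. This is a minimal sketch, not code from the book: the function names, the number of trials, and the seed are assumptions chosen for illustration. It compares a coin fixed in advance (the first one) with the coin selected after the fact for having the most heads, and reports how often each shows 10 heads out of 10.

```python
import random

NUM_COINS = 1000  # coins per experiment, as in the example above
NUM_FLIPS = 10    # flips per coin, as in the example above


def exact_prob_some_coin_all_heads(num_coins=NUM_COINS, num_flips=NUM_FLIPS):
    """Probability that at least one of num_coins fair coins shows all heads."""
    p_single = 0.5 ** num_flips
    return 1.0 - (1.0 - p_single) ** num_coins


def simulate(num_trials=2000, seed=0):
    """Return (rate for a pre-chosen coin, rate for the best coin) of all-heads runs."""
    rng = random.Random(seed)
    first_hits = 0
    best_hits = 0
    for _ in range(num_trials):
        # Each coin's NUM_FLIPS fair flips are the bits of one random integer.
        heads = [bin(rng.getrandbits(NUM_FLIPS)).count("1") for _ in range(NUM_COINS)]
        if heads[0] == NUM_FLIPS:    # coin fixed before looking at the data
            first_hits += 1
        if max(heads) == NUM_FLIPS:  # coin picked because it looked extreme
            best_hits += 1
    return first_hits / num_trials, best_hits / num_trials


if __name__ == "__main__":
    first, best = simulate()
    print("pre-chosen coin, all heads:", first)
    print("best coin, all heads:      ", best)
    print("exact value for best coin: ", exact_prob_some_coin_all_heads())
```

For any single coin the chance of 10 heads is 1/1024, yet the chance that the selected coin shows 10 heads is about 0.62. The coins are all fair; what changed is that the selection looked at the data.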
#5
10-01-2015, 05:22 AM
 henry2015 Member Join Date: Aug 2015 Posts: 31
Re: Hoeffding Inequality

Hi Professor,

I just have a hard time understanding how choosing a hypothesis changes a theorem, namely the Hoeffding inequality.

Let's say h1(x) < P1, h2(x) < P2. We choose h2 to be g. Then h2(x) < P2 is no longer true?

I sort of understand your example: we pick the run of coin flips that produces the most heads, so if we plot the results, the plot suggests that the Hoeffding inequality doesn't apply. But the Hoeffding inequality is a statement about probability, so an actual outcome might be off a bit.

Maybe I am heading in the wrong direction?
#6
10-01-2015, 10:38 PM
 yaser Caltech Join Date: Aug 2009 Location: Pasadena, California, USA Posts: 1,477
Re: Hoeffding Inequality

It's a subtle point. There is "cherry picking" if we fish for a sample that has certain properties after many trials, instead of having a sample that is fairly drawn from a fixed hypothesis.

Statements involving probability are tricky because they don't guarantee a particular outcome, just the likelihood of getting that outcome. Therefore, changing the game to allow more trials or different conditions would change the probabilities.
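
One way to see how the probabilities change is the standard union-bound argument. The notation below is assumed for illustration: M is the number of hypotheses h_1, ..., h_M in the set, N is the sample size, and g is whichever h_m is selected after seeing the data. Since g must be one of the h_m, a bad event for g implies a bad event for at least one h_m:

```latex
\mathbb{P}\big[\,|E_{\text{in}}(g) - E_{\text{out}}(g)| > \epsilon\,\big]
\;\le\; \sum_{m=1}^{M} \mathbb{P}\big[\,|E_{\text{in}}(h_m) - E_{\text{out}}(h_m)| > \epsilon\,\big]
\;\le\; 2M e^{-2\epsilon^{2} N}
```

Each h_m still satisfies its own Hoeffding bound; the guarantee for g is weaker by a factor of M because g is not fixed before the data set is generated.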
