Old 09-18-2015, 10:38 PM
yaser
Join Date: Aug 2009
Location: Pasadena, California, USA
Posts: 1,478
Re: Hoeffding Inequality

Originally Posted by henry2015
1. Does the "data set" in "generate the data set" refer to the marbles (the data set D) that we pick randomly from the jar, or to the set of outputs (red/green) of h(x) on D?
The target f is assumed to be fixed. Since h is also fixed, the color of every marble in the jar is fixed, so generating the data set means picking the marbles that make up the sample.
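
To make this concrete, here is a minimal simulation sketch of the fixed-h situation. It is only an illustration under stated assumptions: the jar is modeled by a fixed probability mu that a marble is red (h disagrees with f), each data set is N independent draws, and the names hoeffding_bound and deviation_frequency as well as the parameter values are made up for this example. It estimates how often the sample fraction of red marbles deviates from mu by more than eps and compares that with the Hoeffding bound 2*exp(-2*eps^2*N).

```python
import math
import random


def hoeffding_bound(eps, n):
    """Hoeffding bound on P[|nu - mu| > eps] for one fixed hypothesis."""
    return 2.0 * math.exp(-2.0 * eps * eps * n)


def deviation_frequency(mu, n, eps, trials, seed=0):
    """Fraction of simulated data sets whose red fraction is off by more than eps.

    mu is the fixed fraction of red marbles in the jar. It never changes
    between trials, because f and h are fixed; only the sample changes.
    """
    rng = random.Random(seed)
    bad = 0
    for _ in range(trials):
        # Generating the data set = picking n marbles from the jar.
        reds = sum(1 for _ in range(n) if rng.random() < mu)
        # Compare counts, which is the same as comparing |nu - mu| with eps.
        if abs(reds - mu * n) > eps * n:
            bad += 1
    return bad / trials


if __name__ == "__main__":
    # Illustrative values only (assumptions, not from the original post).
    mu, n, eps = 0.4, 100, 0.1
    print("empirical frequency:", deviation_frequency(mu, n, eps, trials=2000))
    print("Hoeffding bound:    ", hoeffding_bound(eps, n))
```

The randomness is entirely in which marbles land in the sample. The colors themselves are determined once f and h are fixed.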

2. It keeps mentioning "h is fixed before you generate the data set". Does this mean that, in machine learning, the set of hypotheses h must be predefined before seeing any training data, and that no h can be added to the set after seeing the training data?
This is the assumption that the theory is based on. If one wants to add hypotheses after seeing the data and still apply the theory, one should take the set of hypotheses to include all potential hypotheses that may be added (whatever the data set may be).
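
A rough sketch of why this matters, under simplifying assumptions: take M hypotheses that all have the same true error mu, simulate the in-sample error count of each one independently (real hypotheses evaluated on the same data set would be correlated), and then pick the one that looks best after seeing the data. The function names and parameter values are hypothetical and chosen only for illustration. The single-hypothesis bound is not guaranteed for the selected hypothesis, while the bound multiplied by M (the union bound over the whole set) still applies.

```python
import math
import random


def single_bound(eps, n):
    """Hoeffding bound for one hypothesis fixed before the data is drawn."""
    return 2.0 * math.exp(-2.0 * eps * eps * n)


def union_bound(eps, n, m):
    """Bound that holds for any choice among m hypotheses (capped at 1)."""
    return min(1.0, m * single_bound(eps, n))


def selected_deviation_frequency(m, n, eps, trials, mu=0.5, seed=0):
    """How often the hypothesis picked AFTER seeing the data is off by more than eps."""
    rng = random.Random(seed)
    bad = 0
    for _ in range(trials):
        # In-sample error count of each hypothesis (simulated independently,
        # which is an assumption of this sketch).
        counts = [sum(1 for _ in range(n) if rng.random() < mu) for _ in range(m)]
        best = min(counts)  # the choice depends on the data
        if abs(best - mu * n) > eps * n:
            bad += 1
    return bad / trials


if __name__ == "__main__":
    # Illustrative values only (assumptions, not from the original post).
    m, n, eps = 100, 10, 0.35
    print("fixed h, m = 1:   ", selected_deviation_frequency(1, n, eps, trials=500))
    print("selected, m = 100:", selected_deviation_frequency(m, n, eps, trials=500))
    print("single bound:     ", single_bound(eps, n))
    print("union bound:      ", union_bound(eps, n, m))
```

With a single hypothesis fixed in advance, the observed frequency should stay under the single bound. Once the hypothesis is chosen after looking at the data, only the bound that accounts for every hypothesis that could have been chosen remains valid.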
Where everyone thinks alike, no one thinks very much