Hoeffding Inequality
Hi,
On page 22, it says, "the hypothesis h is fixed before you generate the data set, and the probability is with respect to random data sets D; we emphasize that the assumption 'h is fixed before you generate the data set' is critical to the validity of this bound".
A few questions:
1. Does the "data set" in "generate the data set" refer to the marbles we pick randomly from the jar (that is, the data set D)? Or does it refer to the set of outputs (red/green) of h(x) on D?
2. It keeps mentioning "h is fixed before you generate the data set". Does this mean that, in machine learning, the set of hypotheses h must be defined before seeing any training data, and that no h can be added to the set after seeing the training data?
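
To make question 2 concrete, here is a minimal Python sketch of how I would test it, not something taken from the book. It assumes every hypothesis behaves like a fair coin (true error rate mu = 0.5) and compares a hypothesis fixed in advance with one picked after looking at the sample. The function name `simulate` and all parameter values are my own choices for illustration.

```python
import math
import random


def simulate(num_hypotheses=200, n=10, eps=0.25, trials=500, seed=0):
    """Estimate P[|nu - mu| > eps] for a fixed vs. a data-selected hypothesis.

    Assumption: each hypothesis is a fair coin, so its true error is mu = 0.5,
    and a "data set" is n independent flips of every coin.
    """
    rng = random.Random(seed)
    mu = 0.5
    bad_fixed = 0
    bad_selected = 0
    for _ in range(trials):
        # nu[i] is the in-sample frequency for hypothesis i on this data set.
        nu = [
            sum(rng.random() < mu for _ in range(n)) / n
            for _ in range(num_hypotheses)
        ]
        nu_fixed = nu[0]        # hypothesis chosen before seeing the data
        nu_selected = min(nu)   # hypothesis chosen after seeing the data
        bad_fixed += int(abs(nu_fixed - mu) > eps)
        bad_selected += int(abs(nu_selected - mu) > eps)
    # Single-hypothesis Hoeffding bound: 2 * exp(-2 * eps^2 * n)
    bound = 2 * math.exp(-2 * eps ** 2 * n)
    return bad_fixed / trials, bad_selected / trials, bound


if __name__ == "__main__":
    fixed_rate, selected_rate, bound = simulate()
    print("fixed h:    ", fixed_rate)
    print("selected h: ", selected_rate)
    print("bound:      ", bound)
```

If my understanding is right, the rate for the fixed hypothesis should stay below the bound, while the rate for the hypothesis selected after seeing the data can exceed it, because the single-hypothesis bound no longer applies to a choice that depends on the sample.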
Thanks!
