M=H? (Lecture 2 slide 1617)
I have a question regarding the value of M in the multiplebins Hoeffding bound slides.
M is supposed to be the number of different alternate hypotheses considered by the learning algorithm.
At the same time, H is the space of possible hypotheses that can be considered by the algorithm (e.g., all linear functions, etc).
I keep going back and forth in my mind about whether M=H.
Specifically, suppose that for a SPECIFIC training set X, after looking at the data points in X, the algorithm only explored some subset of H, say G with G < H.
Would it then be correct to set M = G and say that for the specific training set X, the probability of the hypothesis being bad is at most 2G*the hoeffding bound ? Or would this be incorrect since the theorem only deals with the behavior of the system over all possible X with the distribution P.
Thanks!
