Role of P(X)?
The Hoeffding bound for the model H in Chapter 1 requires only one
assumption: that the input examples are a random sample from the bin.
That is what lets us generalize from the sample error.
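For reference, this is the bound I have in mind, written from memory in what I believe is the book's notation. The symbols are my assumptions about that notation: \mu is the true mismatch fraction in the bin, \nu is the mismatch fraction in the sample, N is the sample size, \epsilon is the tolerance, and M is the number of hypotheses in H.

```latex
% Single fixed hypothesis h (one bin):
\mathbb{P}\big[\, |\nu - \mu| > \epsilon \,\big] \;\le\; 2 e^{-2 \epsilon^{2} N}

% Finite model H with M hypotheses, final hypothesis g (union bound over the M bins):
\mathbb{P}\big[\, |E_{\text{in}}(g) - E_{\text{out}}(g)| > \epsilon \,\big] \;\le\; 2 M e^{-2 \epsilon^{2} N}
```

Note that the right-hand side of either inequality contains only \epsilon, N and M; P(X) does not appear in it explicitly.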
What role does the distribution on X play? It appears to me that we don't need
it (at least the way the issue of feasibility is set up in Chapter 1),
i.e., true mismatch ≈ sample mismatch.
Thanks.
