Hoeffding applies to a single hypothesis
. To take a concrete case of the setting in the problem, suppose that
is obtained by flipping a fair coin for every data point to obtain
. The problem proves that your expected off training set error is 0.5; no surprise, since
is random
Hoeffding tells you that your training error (if N is large enough) will not deviate far from Eout with very high probability. This means that with very high probability, Ein will be close to 0.5.
Problem 1.10 says that if you are in such a pathalogical learning situation with a random
, then no matter what you do insample, your outofsample performance will be very bad. (This is sometimes called nofreelunch, because you cannot expect to succeed unless you assume that f is not completely random).
Hoeffding (and VC) says that provided your hypothesis set is not too large compared to N, you will
know that you are in a bad situation because your Ein will reflect Eout and be close to 0.5. The nature of your
with respect to
will determine how good your Eout; if
is very bad, then Eout will be close to 0.5. Hoeffding role is to ensure that Ein will tell you that you are in this tough situation.
To summarize: problem 1.10 identifies one case in which you are in trouble and Eout will be 0.5. Hoeffeing tells you when you will be able to know that you are in trouble.
Quote:
Originally Posted by vsthakur
Hi,
If I got it right, in a noiseless setting, for a fixed D, if all f are equally likely, the expected offtrainingerror of any hypothesis h is 0.5 (part d of problem 1.10, page 37) and hence any two algorithms are the same in terms of expected off training error (part e of the same problem).
My question is, does this not contradict the generalization by Hoeffding. Specifically, the following point is bothering me
By Hoeffding : Ein approaches Eout for larger number of hypothesis (i.e for small epsilon) as N grows sufficiently large. Which would imply that expected(Eout) should be approximately the same as expected(Ein) and not a constant (0.5).
Can you please provide some insight on this, perhaps my comparison is erroneous.
Thanks.
