View Single Post
Old 08-11-2012, 07:15 PM
magdon's Avatar
magdon magdon is offline
Join Date: Aug 2009
Location: Troy, NY, USA.
Posts: 597
Default Re: Problem 1.10 : Expected Off Training Error

Hoeffding applies to a single hypothesis h. To take a concrete case of the setting in the problem, suppose that f is obtained by flipping a fair coin for every data point to obtain \pm 1. The problem proves that your expected off training set error is 0.5; no surprise, since f is random

Hoeffding tells you that your training error (if N is large enough) will not deviate far from Eout with very high probability. This means that with very high probability, Ein will be close to 0.5.

Problem 1.10 says that if you are in such a pathalogical learning situation with a random f, then no matter what you do in-sample, your out-of-sample performance will be very bad. (This is sometimes called no-free-lunch, because you cannot expect to succeed unless you assume that f is not completely random).

Hoeffding (and VC) says that provided your hypothesis set is not too large compared to N, you will know that you are in a bad situation because your Ein will reflect Eout and be close to 0.5. The nature of your \cal H with respect to f will determine how good your Eout; if f is very bad, then Eout will be close to 0.5. Hoeffding role is to ensure that Ein will tell you that you are in this tough situation.

To summarize: problem 1.10 identifies one case in which you are in trouble and Eout will be 0.5. Hoeffeing tells you when you will be able to know that you are in trouble.

Originally Posted by vsthakur View Post
If I got it right, in a noiseless setting, for a fixed D, if all f are equally likely, the expected off-training-error of any hypothesis h is 0.5 (part d of problem 1.10, page 37) and hence any two algorithms are the same in terms of expected off training error (part e of the same problem).

My question is, does this not contradict the generalization by Hoeffding. Specifically, the following point is bothering me

By Hoeffding : Ein approaches Eout for larger number of hypothesis (i.e for small epsilon) as N grows sufficiently large. Which would imply that expected(Eout) should be approximately the same as expected(Ein) and not a constant (0.5).

Can you please provide some insight on this, perhaps my comparison is erroneous.

Have faith in probability
Reply With Quote