Old 04-12-2013, 03:58 PM
grozhd
Junior Member
Join Date: Apr 2013
Posts: 4
Default Re: Is the Hoeffding Inequality really valid for each bin despite non-random sampling

I have the same concern as scottedwards2000 and I still don't understand how it is resolved.

As I understand it, the bin represents the probability space of all possible inputs X, and a sample of balls drawn randomly from the bin represents our training set \bar{x_0} = (x_1, ..., x_N).

Now we pick a hypothesis h_1 (suppose we are running PLA). We look at our sample \bar{x_0}, compute \nu_1, and apply Hoeffding's Inequality. We then do one step of PLA and come up with a new hypothesis h_2, which automatically gives us \nu_2, and the professor says that we can write down Hoeffding's Inequality for \nu_2 and \mu_2.
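For reference, the single-hypothesis bound being invoked here (for a hypothesis h fixed before the sample of size N is drawn) is:

P\left( |\nu - \mu| > \varepsilon \right) \leq 2 e^{-2 \varepsilon^2 N}

The whole question is whether h_2 still counts as "fixed" in this sense.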

I suppose we can. But that inequality tells us something about the random variable \nu_2: X^N \rightarrow [0;1], namely about P\left( |\nu_2(\bar{x}) - \mu_2| > \varepsilon \right) where \bar{x} is a random sample. Yet it seems we are plugging in \nu_2(\bar{x_0}), where \bar{x_0} is hardly random with respect to h_2, since we built h_2 using that very sample.

Here is an example that illustrates my point: say we tried some random h_1, compared it with the target function f on our training sample (x_1,...,x_N), and wrote down Hoeffding's Inequality. Now construct h_2 as follows: h_2(x_i) = f(x_i), \ i=1,...,N and h_2(x) = h_1(x) for x \neq x_i, \ i=1,...,N. Let's write down Hoeffding's inequality for this hypothesis. If we are indeed using \nu_2(\bar{x_0}), then \nu_2 would equal 1, since h_2 = f on \bar{x_0}, and we would conclude that P\left( |1 - \mu_2| > \varepsilon \right) is small. So we would be claiming, with high probability, that h_2 does an excellent job out of sample, even though we barely changed it from h_1. This example shouldn't be correct, right? And if it isn't, how is the one with PLA correct?
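A quick simulation makes the worry concrete. The setup below is my own toy version of the construction above (a finite input space, a random target f, h_1 constant; none of this is from the lectures). If the fixed-hypothesis bound applied to h_2, deviations |\nu_2 - \mu_2| > \varepsilon should occur with probability at most 2e^{-2\varepsilon^2 N} \approx 0.055 for N = 20, \varepsilon = 0.3; but because h_2 is built from the sample, they occur essentially every time:

```python
import math
import random

# Toy illustration of the h_2 construction (my own setup, not from the course):
# finite input space, a fixed random target f, uniform sampling.
random.seed(0)
X = list(range(1000))
f = {x: random.choice([-1, 1]) for x in X}   # target function

def h1(x):
    return 1                                  # a hypothesis fixed before seeing any data

N, eps, trials = 20, 0.3, 2000
hoeffding_bound = 2 * math.exp(-2 * eps**2 * N)

violations = 0
for _ in range(trials):
    sample = [random.choice(X) for _ in range(N)]
    memo = {x: f[x] for x in sample}          # h_2 memorizes f on the sample ...
    h2 = lambda x: memo.get(x, h1(x))         # ... and copies h_1 everywhere else
    nu2 = sum(h2(x) == f[x] for x in sample) / N    # in-sample agreement: 1 by construction
    mu2 = sum(h2(x) == f[x] for x in X) / len(X)    # true agreement over all of X
    if abs(nu2 - mu2) > eps:
        violations += 1

print(f"Hoeffding bound on P[|nu - mu| > {eps}]: {hoeffding_bound:.3f}")
print(f"Empirical frequency of |nu_2 - mu_2| > {eps}: {violations / trials:.3f}")
```

The bound (~0.055) is wildly wrong for h_2 precisely because h_2 was chosen after looking at \bar{x_0}, which is exactly the concern in my question.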