#1
I really struggled to understand Hoeffding's inequality in the lectures, and now we have a question on it, so I would really like to understand it. Please can anybody help?

P[|Ein(g) − Eout(g)| > epsilon] ≤ 2M·e^(−2·epsilon²·N)

My questions are:

1) From the lectures I understood Eout(g) to be the "true" error (the error you would get if you had access to the full population of data). In reality, we apply a machine learning technique precisely because we have limited data, so I fail to understand how this formula can be applied. Perhaps I misunderstood, and Eout is the error found on a separate set of data (not on the whole population)? I have learnt elsewhere that when "learning from data" it is a good idea to have a training set, a cross-validation set and a test set. Maybe Eout actually refers to the cross-validation set?

2) What is a suitable value for epsilon? This wasn't mentioned in the lectures (unless I missed it). Can it be calculated?
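To make the first question concrete: the right-hand side of the bound can be evaluated without ever knowing Eout. A minimal Python sketch (the function name is mine, not from the course):

```python
import math

def hoeffding_bound(N, epsilon, M=1):
    """Right-hand side of the (union-bounded) Hoeffding inequality:
    P[|Ein - Eout| > epsilon] <= 2 * M * exp(-2 * epsilon^2 * N)."""
    return 2 * M * math.exp(-2 * epsilon**2 * N)

# The bound depends only on N, epsilon and M -- no access to Eout needed.
for N in (100, 1000, 10000):
    print(N, hoeffding_bound(N, epsilon=0.1))
```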
#2
Just watching lecture 2 again and it's all starting to make sense... but I'm sure I will still be back with more questions later lol
#3
Going back to my questions, I now feel I grasp the principles. Because the right-hand side of the inequality can be calculated without knowing Eout (or Ein), it tells us the probability of (failing to) generalise based only on the values of N, epsilon and M: a large N (big sample size) will tend to mean better generalisation, while a large M has the adverse effect, causing poor generalisation. But M will be infinite for learning techniques such as perceptron learning.

Because we were told M will be infinite for "learning techniques", I believe we will be told a way to get round this problem in future lectures?

I also believe epsilon acts as something like a confidence tolerance on how far the sample error Ein may stray from the population error Eout. Please could somebody confirm the following: if I set epsilon to, say, 0.01 in Hoeffding's inequality, is this equivalent to asking "what is the probability that Ein fails to match Eout to within 0.01, given the parameters N and M"?
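For what it's worth, with epsilon = 0.01 the bound is extremely weak unless N is large. Solving 2M·e^(−2·epsilon²·N) ≤ delta for N shows how many examples are needed; a small sketch (the function name, delta, and the numbers are illustrative, not from the lectures):

```python
import math

def min_samples(epsilon, delta, M=1):
    """Smallest N such that 2*M*exp(-2*epsilon^2*N) <= delta,
    i.e. N >= ln(2*M/delta) / (2*epsilon^2)."""
    return math.ceil(math.log(2 * M / delta) / (2 * epsilon**2))

print(min_samples(0.01, 0.05))          # ~18,445 examples for a single hypothesis
print(min_samples(0.01, 0.05, M=1000))  # ~52,984 -- N grows only with log(M)
```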
#4
Quote:
Originally Posted by itooam
Because we were told M will be infinite for "learning techniques", I believe we will be told a way to get round this problem in future lectures?

You are correct. How to deal with an infinite M is the subject of an upcoming lecture, where M will be replaced by a quantity that can remain finite even when the hypothesis set is infinite.
__________________
Where everyone thinks alike, no one thinks very much
#5
Thanks yaser, I will look forward to that lecture (btw, I'm loving the course!)
#6
I got all the numbers, but I don't understand the problem.

First, c_rand and c_min aren't particular coins: they are different random coins in each run, so we can't derive their properties from the experiment; formally we can only speak of c_1. Second, the averages of all three nu's converge to theoretical limits (I'm not sure I can post the actual numbers here: two equal values and one different). In this sense all three "coins" satisfy Hoeffding's inequality, that is, they generate data that probabilistically converge to certain limits (not 0.5 in the case of c_min). But we can't select all three as the answer, so I would be grateful for clarification of this problem.
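For anyone following along, the experiment can be reproduced in a few lines. This sketch assumes the standard setup of 1,000 fair coins flipped 10 times each (my reading of the homework, so treat the constants as assumptions):

```python
import random

def one_run(num_coins=1000, flips=10):
    """Flip num_coins fair coins `flips` times each; return the heads
    fractions nu_1 (first coin), nu_rand (a random coin), nu_min (fewest heads)."""
    heads = [sum(random.random() < 0.5 for _ in range(flips))
             for _ in range(num_coins)]
    return heads[0] / flips, random.choice(heads) / flips, min(heads) / flips

runs = 10_000
totals = [0.0, 0.0, 0.0]
for _ in range(runs):
    for i, nu in enumerate(one_run()):
        totals[i] += nu

# nu_1 and nu_rand average near 0.5; nu_min averages far below it,
# because c_min is selected *after* looking at the data.
print([round(t / runs, 3) for t in totals])
```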
#7
This is precisely the point, and it is why the question asks whether or not they satisfy Hoeffding's inequality. The fact that they are not fixed coins does not prevent us from recording whether we got heads or tails from them in each iteration, so we can compute the frequency of heads. Since all the coins are identical in terms of the probability of getting a head or a tail, we have that same probability of heads regardless of which coin comes out.

Now the question is: when you compute the statistics of heads and tails in such an experiment, do these statistics happen to fall under the bound that the basic Hoeffding inequality predicts for a fixed coin, notwithstanding that this experiment is not based on a fixed coin?
__________________
Where everyone thinks alike, no one thinks very much
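To put the professor's question in numbers: for each selection method one can estimate P[|nu − 0.5| > epsilon] over many runs and compare it with the single-coin bound 2·e^(−2·epsilon²·N). A sketch under the same assumed setup as above (the constants, including epsilon = 0.3, are mine, chosen to make the contrast visible):

```python
import math
import random

FLIPS, COINS, RUNS, EPS = 10, 1000, 10000, 0.3

def one_run():
    heads = [sum(random.random() < 0.5 for _ in range(FLIPS))
             for _ in range(COINS)]
    return heads[0] / FLIPS, random.choice(heads) / FLIPS, min(heads) / FLIPS

deviations = [0, 0, 0]  # runs where |nu - 0.5| > EPS, per selection method
for _ in range(RUNS):
    for i, nu in enumerate(one_run()):
        deviations[i] += abs(nu - 0.5) > EPS

bound = 2 * math.exp(-2 * EPS**2 * FLIPS)  # single-coin Hoeffding bound, ~0.33
for name, count in zip(("c_1", "c_rand", "c_min"), deviations):
    print(f"{name}: {count / RUNS:.3f} vs bound {bound:.3f}")
# c_1 and c_rand fall well under the bound; c_min violates it badly,
# because that coin is chosen after seeing the flips.
```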
#8
Thank you for the explanation, Professor.

Quote:
Originally Posted by yaser
Do these statistics happen to fall under the bound that the basic Hoeffding inequality predicts for a fixed coin, notwithstanding that this experiment is not based on a fixed coin?

All three "coins" satisfy Hoeffding's inequality, in my opinion. Of course, c_min is a special story: its frequency converges to a different limit, which is apparent. Maybe I am overthinking the problem.