Hoeffding’s inequality
I really struggled to understand Hoeffding's inequality in the lectures, and now we have a question on it, so I would really like to understand it. Please, can anybody help?
P[|Ein(g) − Eout(g)| > ε] ≤ 2M·exp(−2ε²N)

My questions are:

1) From the lectures I understood Eout(g) to be the "true" error (the error you would get if you had access to the full population of data). In reality, when we apply a machine learning technique, we do so because we have limited data, and therefore I fail to understand how this formula can be applied. Perhaps I misunderstood; perhaps Eout is the error found from a separate set of data (not from the whole population)? I have learnt elsewhere that when "learning from data" it is a good idea to have a training set, a cross-validation set and a test set. Maybe Eout actually refers to the cross-validation set?

2) What is a suitable value for ε? This wasn't mentioned in the lectures (unless I missed it). Can it be calculated?
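To get a feel for question 2, here is a minimal sketch (my own illustration, not from the lectures) that just evaluates the right-hand side of the bound for a single fixed hypothesis at a few values of ε and N:

```python
import math

# Right-hand side of Hoeffding's bound for a single, fixed hypothesis:
# P[|Ein - Eout| > eps] <= 2 * exp(-2 * eps^2 * N)
def hoeffding_bound(eps, N):
    return 2 * math.exp(-2 * eps**2 * N)

for N in (100, 1000, 10000):
    for eps in (0.01, 0.05, 0.1):
        print(f"N={N:6d} eps={eps:.2f} bound={hoeffding_bound(eps, N):.4f}")
```

Notice that for small N or small ε the right-hand side exceeds 1, in which case the bound says nothing at all. So ε is not something you calculate: it is a tolerance you choose, and the bound then tells you how much confidence you can claim for that tolerance given N.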
Re: Hoeffding’s inequality
Just watching lecture 2 again, and it's all starting to make sense... but I'm sure I will be back with more questions later, lol
Re: Hoeffding’s inequality
Going back to my questions, I now feel I grasp the principles. I realise that because the right-hand side of the inequality doesn't depend on Eout (or Ein), it tells us the probability of generalisation based only on the values of N, ε and M. But M will be infinite when we deal with learning techniques such as perceptron learning. That is, a large value of N (a big sample size) will tend to mean better generalisation, while a large value of M will have the adverse effect, causing poor generalisation.
Because we were told M will be infinite for "learning techniques", I believe we will be shown a way to get around this problem in future lectures? I believe ε gives us a confidence level as to whether the sample (Ein) generalises to the overall population. Please could somebody confirm the following: if I set ε to, say, 0.01 in Hoeffding's inequality, is this equivalent to saying "what is the probability that Ein does NOT generalise to Eout by at least 99%, given the parameters N and M"?
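To make the interplay between N, ε and M concrete, here is a small sketch (my own, with illustrative numbers) of the finite-M version of the bound, 2M·exp(−2ε²N):

```python
import math

# Union-bound version for a finite hypothesis set of size M:
# P[|Ein(g) - Eout(g)| > eps] <= 2 * M * exp(-2 * eps^2 * N)
def hoeffding_union_bound(eps, N, M):
    return 2 * M * math.exp(-2 * eps**2 * N)

eps, N, M = 0.1, 1000, 100  # illustrative values, not from the course
bad = hoeffding_union_bound(eps, N, M)
print(f"P[|Ein - Eout| > {eps}] <= {bad:.2e}")
# With probability at least 1 - bad, Ein tracks Eout to within eps:
print(f"confidence >= {max(0.0, 1 - bad):.6f}")
```

For what it's worth, one distinction is worth keeping in mind here: ε is the tolerance on |Ein − Eout|, while one minus the right-hand side plays the role of the confidence level; so ε = 0.01 is not, by itself, the "99%".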
Re: Hoeffding’s inequality
Thanks yaser, I look forward to that lecture (btw, I'm loving the course :) a big thank you)
Re: Hoeffding’s inequality
I got all the numbers, but I don't understand the problem.
First, c_rand and c_min aren't particular coins: they are different coins on each run, so we can't derive their properties from a single experiment. Formally, we can only speak of c_1. Second, all three ν's (their averages) converge to theoretical limits (I'm not sure I can post the actual numbers: two equal numbers and a different one). In this sense all three "coins" satisfy Hoeffding's inequality, that is, they generate data that probabilistically converge to certain limits (not 0.5 in the case of c_min). But we can't select all three as the answer. So I would be grateful for clarification of this problem.
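For what it's worth, here is a minimal simulation sketch, assuming the setup I believe the exercise uses (1,000 fair coins flipped 10 times each; that specific setup is my assumption):

```python
import random

NUM_COINS, NUM_FLIPS, RUNS = 1000, 10, 1000  # assumed experiment parameters

def one_run():
    # Flip every coin NUM_FLIPS times; record each coin's fraction of heads (nu).
    nus = [sum(random.random() < 0.5 for _ in range(NUM_FLIPS)) / NUM_FLIPS
           for _ in range(NUM_COINS)]
    return nus[0], random.choice(nus), min(nus)  # c_1, c_rand, c_min

runs = [one_run() for _ in range(RUNS)]
for name, i in (("nu_1", 0), ("nu_rand", 1), ("nu_min", 2)):
    avg = sum(r[i] for r in runs) / RUNS
    print(f"average {name}: {avg:.3f}")
```

The averages of ν_1 and ν_rand sit near μ = 0.5, while the average of ν_min sits far below it: c_min is selected after looking at the data, and that selection is exactly what the professor's question about a "fixed coin" below is getting at.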
Re: Hoeffding’s inequality
Quote:
Now the question is, when you compute the statistics of heads and tails in such an experiment, do these statistics happen to fall under the bound that the basic Hoeffding inequality predicts for a fixed coin, notwithstanding that this experiment is not based on a fixed coin?
Re: Hoeffding’s inequality
Thank you for the explanation, Professor.
Quote:
All three "coins" satisfy Hoeffding's inequality, in my opinion. Of course, c_min is a special story; it has a different prediction. That's apparent. Maybe I am overthinking the problem.
Re: Hoeffding’s inequality
hi
I have some trouble grasping Hoeffding's inequality and the idea of treating coins as marbles. First of all, the probability of error in the whole space is 0.5, since the coins are fair, while the probability of error in the sample is the frequency of heads, say. Hoeffding's inequality states that, given an experiment, the probability of how badly we learned does not depend on how the experiment performed; however, in the case of finite flipping, the right-hand side does depend on the number of flips and the threshold. So, in our problem, after running the experiment we have some coins with known in-sample error. We choose one of them, and now how should we use Hoeffding's inequality? Thank you for putting your time into this.
Re: Hoeffding’s inequality
My point of view on this question is that we should take Hoeffding's inequality as a form of convergence guarantee. This is an obvious observation, of course, but seeing things in this light lets you compare the averages to the true expected value and conclude that if some value is close enough (say with ε = 0.01), then it satisfies the inequality, because the probability of something bad happening is very close to zero.
On the subject of the coin experiments, it doesn't matter that the coin you select is different every time; in reality they should be indistinguishable from each other, since they are all fair coins tossed independently.