Hoeffding’s inequality
I really struggled to understand Hoeffding’s inequality in the lectures and now we have a question on it so I really would like to understand. Please can anybody help?...
P[Ein(g) − Eout(g) > epsilon] ≤ ...
My questions are:
1) from the lectures I understood Eout(g) to be the "true" error (if you had access to the full population of data). In reality, when we are applying a machine learning technique, we do so because we have limited data and therefore I fail to understand how this formula can be applied? Perhaps I misunderstood, perhaps Eout is the error found from a seperate set of data (not from the whole population)? I.e., I have learn't elswehere that it is a good idea when "learning from data" to have a training set, a cross validation set and a test set. Maybe Eout actually refers to the cross validation set?
2) what is a suitable value for epsilon? This wasn't mentioned in the lectures (unless I missed it)? Can this be calculated?
