Thread: HW 4 question3 View Single Post
#13
05-02-2012, 06:10 AM
 IamMrBB Invited Guest Join Date: Apr 2012 Posts: 107
Re: HW 4 question3

Quote:
 Originally Posted by elkka I first thought the same thing about 1. But then, where do we see ? It is the measure of difference between E_in and E_out, which can be small, and can be big depending on the experiment. Suppose you are talking about an experiment with very large numbers, like the number of minutes people use in a month on a cell phone (which, say, average 200). Than it is totally meaningful to consider a prediction that assures you that (or 5, or 10) with probability 0.95. So, it totally makes sense to rate the bounds even if they all are >1
I don't think you are right on this. E_in and E_out in the Vapnik-Chervonenkis Inequality (lecture 6), which is the basis for the VC bound, are fractions and not absolute numbers. I know elsewhere in the course the professor has used E_out also for numbers which can be bigger than 1 (e.g. squared error, lecture 8), however when you lookup the Vapnik-Chervonenkis Inequality, you'll see that E_in and E_out are probabilities/probility measures (i.e. fraction incorrectly classified).

To see that your example probably doesn't make sense (IMHO): replace the minutes in your example with either nanoseconds or, on the other hand, ages, and you would get very different numbers on the left side of the equation (i.e. epsilon) while it wouldn't make a difference for the right side of the equation. This can't be right (it would e.g. be unlikely that E_in and E_out are 60 seconds apart but likely that they are a minute apart?!): it would make the inequalities meaningless.

Also on the slides of lecture 6, it is fractions (in)correctly classified that are used for the Vapnik-Chervonenkis Inequality.

Dislaimer: I'm not an expert on the matter, and perhaps I miss a/the point somewhere, so hope we'll get a verdict by the course staff.