LFD Book Forum (http://book.caltech.edu/bookforum/index.php)
-   Homework 4 (http://book.caltech.edu/bookforum/forumdisplay.php?f=133)
-   -   HW 4 question3 (http://book.caltech.edu/bookforum/showthread.php?t=421)

 rohanag 05-02-2012 04:49 AM

Re: HW 4 question3

Thanks elkka, I don't know what I was thinking :(

 silvrous 05-02-2012 05:01 AM

Re: HW 4 question3

Quote:
 Originally Posted by elkka (Post 1753) I first thought the same thing about 1. But then, where do we see epsilon? It is the measure of difference between E_in and E_out, which can be small, and can be big depending on the experiment. Suppose you are talking about an experiment with very large numbers, like the number of minutes people use in a month on a cell phone (which, say, average 200). Then it is totally meaningful to consider a prediction that assures you that (or 5, or 10) with probability 0.95. So, it totally makes sense to rate the bounds even if they all are >1
Except that Ein and Eout are ratios. I quote from HW2: "Eout (number of out-of-sample points misclassified / total number of out-of-sample points)".
Therefore, it is impossible for epsilon to ever exceed 1.

 IamMrBB 05-02-2012 05:10 AM

Re: HW 4 question3

Quote:
 Originally Posted by elkka (Post 1753) I first thought the same thing about 1. But then, where do we see epsilon? It is the measure of difference between E_in and E_out, which can be small, and can be big depending on the experiment. Suppose you are talking about an experiment with very large numbers, like the number of minutes people use in a month on a cell phone (which, say, average 200). Then it is totally meaningful to consider a prediction that assures you that (or 5, or 10) with probability 0.95. So, it totally makes sense to rate the bounds even if they all are >1
I don't think you are right on this. E_in and E_out in the Vapnik-Chervonenkis Inequality (lecture 6), which is the basis for the VC bound, are fractions and not absolute numbers. I know that elsewhere in the course the professor has also used E_out for quantities that can be bigger than 1 (e.g. squared error, lecture 8); however, when you look up the Vapnik-Chervonenkis Inequality, you'll see that E_in and E_out are probabilities/probability measures (i.e. fraction incorrectly classified).

To see that your example probably doesn't make sense (IMHO): replace the minutes in your example with either nanoseconds or, on the other hand, ages, and you would get very different numbers on the left side of the equation (i.e. epsilon) while it wouldn't make a difference for the right side of the equation. This can't be right (it would e.g. be unlikely that E_in and E_out are 60 seconds apart but likely that they are a minute apart?!): it would make the inequalities meaningless.

Also on the slides of lecture 6, it is fractions (in)correctly classified that are used for the Vapnik-Chervonenkis Inequality.

Disclaimer: I'm not an expert on the matter, and perhaps I am missing a/the point somewhere, so I hope we'll get a verdict from the course staff.

 elkka 05-02-2012 05:12 AM

Re: HW 4 question3

You know, I think you are right. We are indeed only talking about a classification problem, so E_in and E_out must be <= 1.

 kkkkk 05-02-2012 07:21 AM

Re: HW 4 question3

Here is my view, which may be wrong. Refer to lecture 4, slides 7 onwards.

Ein and Eout are the average of the error measure per point, and it is up to the user to choose the error measure. So Ein and Eout are just numbers and not probabilities, and epsilon, which is the difference between the two, is also a number.

Also see lecture 8, slides 15 and 20: Eout = bias + variance = 0.21 + 1.69 > 1

 mikesakiandcp 05-02-2012 09:16 AM

Re: HW 4 question3

Quote:
 Originally Posted by silvrous (Post 1723) I also got values larger than 1 for all of them, and therefore considered them to be equally meaningless for small N...
I also assumed this, since it is a classification problem. Since they are bounds and all greater than one, we cannot infer anything about epsilon for all of them in this range of N, thus they should all be equivalent.

 silvrous 05-03-2012 07:57 AM

Re: HW 4 question3

Could someone from the course staff perhaps weigh in on this? There seem to be two equally valid theories....

 yaser 05-03-2012 01:57 PM

Re: HW 4 question3

Quote:
 Originally Posted by silvrous (Post 1812) Could someone from the course staff perhaps weigh in on this? There seem to be two equally valid theories....
If it is a probability then indeed bounds greater than 1 are trivial, but the question just asked about the quality of the bounds for what it's worth. In general, the behavior in practice is proportional to the VC bound, so the actual value (as opposed to relative value) is not as critical.
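
To make the "trivial for small N" point concrete, here is a minimal sketch that evaluates the original VC bound, epsilon <= sqrt((8/N) * ln(4 * m_H(2N) / delta)), at a few sample sizes. The values d_vc = 50 and delta = 0.05 are assumptions taken from the numbers mentioned in this thread, the growth function is replaced by the simple approximation (2N)^d_vc, and the function names are made up for illustration.

```python
import math

def vc_bound_poly(n, d_vc=50, delta=0.05):
    """Original VC bound on epsilon, with m_H(2N) replaced by (2N)^d_vc.

    The growth function enters only through its logarithm, which avoids
    overflow for large N. d_vc and delta defaults are assumed values.
    """
    log_growth = d_vc * math.log(2 * n)
    return math.sqrt(8.0 / n * (math.log(4.0 / delta) + log_growth))

def is_trivial_for_classification(epsilon):
    """For error fractions in [0, 1], a bound of 1 or more says nothing."""
    return epsilon >= 1.0

for n in (5, 100, 10_000, 1_000_000):
    eps = vc_bound_poly(n)
    label = "trivial" if is_trivial_for_classification(eps) else "informative"
    print(f"N = {n:>9,}  epsilon <= {eps:.3f}  ({label})")
```

Under these assumptions the bound is above 1 at N = 5 and drops below 1 only for much larger N, but it decreases monotonically with N throughout, which is the relative behavior the reply refers to.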

 data_user 07-31-2012 05:35 PM

Re: HW 4 question3

It is suggested to use the simple approximate bound N^d_vc for the growth function, if N > d_vc. In Problem 3, N=5<d_vc=50. Should we still use N^d_vc as an approximation for the growth function? Or maybe it is more reasonable to use 2^N, assuming that H is complex enough?

 yaser 07-31-2012 09:24 PM

Re: HW 4 question3

Quote:
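 Originally Posted by data_user (Post 3773) It is suggested to use the simple approximate bound N^d_vc for the growth function, if N > d_vc. In Problem 3, N=5<d_vc=50. Should we still use N^d_vc as an approximation for the growth function? Or maybe it is more reasonable to use 2^N, assuming that H is complex enough?
Indeed, if N <= d_vc, then the growth function is exactly 2^N. The fact that H is complex enough is already implied by the value of the VC dimension.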
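
As a rough illustration of how much the choice matters at N = 5, the sketch below evaluates the original VC bound, epsilon <= sqrt((8/N) * ln(4 * m_H(2N) / delta)), once with the polynomial approximation and once with the exact 2^(2N). It assumes delta = 0.05 and d_vc = 50, and the helper names are hypothetical. Note that the bound uses the growth function at 2N, and 2N = 10 is still below d_vc here.

```python
import math

def log_growth_exact_small_n(n):
    """ln m_H(n) when n <= d_vc: all dichotomies are realizable, so m_H(n) = 2^n."""
    return n * math.log(2)

def log_growth_poly(n, d_vc):
    """ln of the simple polynomial approximation n^d_vc."""
    return d_vc * math.log(n)

def vc_bound(n, d_vc, delta, use_poly_approx):
    """VC bound on epsilon: sqrt((8/N) * ln(4 * m_H(2N) / delta))."""
    m = 2 * n  # the bound evaluates the growth function at 2N
    if use_poly_approx or m > d_vc:
        # Above d_vc the exact value is not known in general,
        # so fall back to the polynomial approximation.
        log_growth = log_growth_poly(m, d_vc)
    else:
        log_growth = log_growth_exact_small_n(m)
    return math.sqrt(8.0 / n * (math.log(4.0 / delta) + log_growth))

N, D_VC, DELTA = 5, 50, 0.05  # assumed values for illustration
eps_poly = vc_bound(N, D_VC, DELTA, use_poly_approx=True)
eps_exact = vc_bound(N, D_VC, DELTA, use_poly_approx=False)
print(f"using (2N)^d_vc: epsilon <= {eps_poly:.2f}")
print(f"using 2^(2N):    epsilon <= {eps_exact:.2f}")
```

With these assumed values the exact growth function gives a noticeably smaller bound than the polynomial approximation, although both remain above 1 at N = 5.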

All times are GMT -7.