LFD Book Forum (http://book.caltech.edu/bookforum/index.php)
-   Chapter 2 - Training versus Testing (http://book.caltech.edu/bookforum/forumdisplay.php?f=109)
-   -   Help in understanding proof for VC-dimension of perceptron. (http://book.caltech.edu/bookforum/showthread.php?t=4609)

 yongxien 07-22-2015 11:20 PM

Help in understanding proof for VC-dimension of perceptron.

https://work.caltech.edu/library/072.pdf

I am referring to the slides given in the link above.

I have a few questions regarding this proof:
1. Is the matrix invertible because of the way we construct it, such that it is lower triangular? If that is the case, I don't see why the construction does not work for d+2 or any other k > dimension of the perceptron.
2. Why does the second part of the proof (that d + 1 >= d_vc) work only for k = d + 2 and not for k = d + 1?
3. I don't understand the statement that more points than dimensions means we must have x_j = \sum_i a_i x_i. (Btw, d+1 is also more points than the dimension d.)

 yongxien 07-23-2015 12:01 AM

Re: Help in understanding proof for VC-dimension of perceptron.

I don't understand the second part of the proof either. Why is y_i = sign(w^T x_i) = sign(a_i)?

 yaser 07-23-2015 01:20 AM

Re: Help in understanding proof for VC-dimension of perceptron.

Quote:
 Originally Posted by yongxien (Post 11986) 1. Is the matrix invertible because of the way we construct it, such that it is lower triangular? If that is the case, I don't see why the construction does not work for d+2 or any other k > dimension of the perceptron.
The number of columns is restricted to d+1 (the length of the input vector), so the matrix would not be square if we had d+2 rows.
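For concreteness, here is a small numpy sketch (my own, not from the slides) of the standard construction behind this part: d+1 points whose matrix X is lower triangular with a non-zero diagonal, hence invertible, so w = X^{-1} y realizes any dichotomy y.

```python
import numpy as np

d = 4                         # perceptron input dimension
# d+1 points in R^{d+1}: x_1 = (1, 0, ..., 0) and x_{k+1} = (1, e_k).
X = np.eye(d + 1)
X[:, 0] = 1.0                 # zeroth (bias) coordinate of every point is 1
# X is lower triangular with ones on the diagonal, hence invertible.
assert abs(np.linalg.det(X)) > 0

# Any dichotomy y in {-1, +1}^{d+1} is realized by w = X^{-1} y,
# since then X @ w = y exactly and sign(y) = y componentwise.
rng = np.random.default_rng(0)
for _ in range(100):
    y = rng.choice([-1.0, 1.0], size=d + 1)
    w = np.linalg.solve(X, y)
    assert np.array_equal(np.sign(X @ w), y)
print("these d+1 points are shattered (sampled check of dichotomies)")
```

With a (d+2)-th row the matrix is no longer square, so this "solve for w" argument is unavailable, which is exactly the restriction mentioned above.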

Quote:
 2. Why does the second part of the proof (that d + 1 >= d_vc) work only for k = d + 2 and not for k = d + 1?
Because if you have only d+1 vectors, they can be linearly independent. The inevitable linear dependence of any d+2 vectors is what makes that part of the proof work.
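This linear-algebra fact is easy to check numerically; here is a sketch of mine (assuming numpy) with d+2 random points in R^{d+1}:

```python
import numpy as np

rng = np.random.default_rng(1)
d = 4
# d+2 points in R^{d+1}: one more vector than the dimension of the space.
X = rng.standard_normal((d + 2, d + 1))
X[:, 0] = 1.0                         # zeroth coordinate is 1

# The rank is at most d+1 < d+2, so the rows are linearly dependent.
assert np.linalg.matrix_rank(X) <= d + 1

# A non-trivial combination sum_i c_i x_i = 0 spans the null space of X^T;
# rearranging with c_j != 0 gives x_j = sum_{i != j} a_i x_i.
c = np.linalg.svd(X.T)[2][-1]         # null-space direction of X^T
assert np.allclose(c @ X, 0, atol=1e-10)

j = int(np.argmax(np.abs(c)))         # an index with c_j != 0
a = -np.delete(c, j) / c[j]
x_rest = np.delete(X, j, axis=0)
assert np.allclose(X[j], a @ x_rest)  # x_j = sum_{i != j} a_i x_i
```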

Quote:
 3. I don't understand the statement that more points than dimensions means we must have x_j = \sum_i a_i x_i. (Btw, d+1 is also more points than the dimension d.)
This is a consequence of the basic linear algebra result mentioned in the response to point 2. As for the other remark, the length of the vector is d+1, so d+1 points (because of the zeroth coordinate) would not be more points than dimensions.

 MaciekLeks 05-05-2016 06:55 AM

Re: Help in understanding proof for VC-dimension of perceptron.

Context:
To show that d_vc <= d+1 we need to prove that we cannot shatter any set of d+2 points. In this case we have d+2 vectors x_1, ..., x_{d+2}, where x_i is in R^{d+1}, hence we have one more vector than the total number of dimensions (our space is (d+1)-dimensional). According to the theory of vector spaces, when we have more vectors than dimensions of the space, they are linearly dependent (one of the vectors is a linear combination of the others). This means that, for instance, our extra point x_j can be computed by the equation (in (d+1)-dimensional space): x_j = \sum_{i \neq j} a_i x_i (where not all coefficients a_i are zero).
We need to show that we can't get all 2^N dichotomies, where the data set is of size N = d+2.

Problem:

And now let me define the problem I have with understanding the video proof. Why do we correlate the coefficient a_i with the output y_i when they don't correlate? On what basis can we draw that y_i = sign(w^T x_i) = sign(a_i), and therefore that sign(w^T x_j) = +1, hence we can't generate y_j = -1?

 ntvy95 05-05-2016 09:57 AM

Re: Help in understanding proof for VC-dimension of perceptron.

Quote:
 Originally Posted by MaciekLeks (Post 12334) Context: To show that d_vc <= d+1 we need to prove that we cannot shatter any set of d+2 points. In this case we have d+2 vectors x_1, ..., x_{d+2}, where x_i is in R^{d+1}, hence we have one more vector than the total number of dimensions (our space is (d+1)-dimensional). According to the theory of vector spaces, when we have more vectors than dimensions of the space, they are linearly dependent (one of the vectors is a linear combination of the others). This means that, for instance, our extra point x_j can be computed by the equation (in (d+1)-dimensional space): x_j = \sum_{i \neq j} a_i x_i (where not all coefficients a_i are zero). We need to show that we can't get all 2^N dichotomies, where the data set is of size N = d+2. Problem: And now let me define the problem I have with understanding the video proof. Why do we correlate the coefficient a_i with the output y_i when they don't correlate? On what basis can we draw that y_i = sign(w^T x_i) = sign(a_i), and therefore that sign(w^T x_j) = +1, hence we can't generate y_j = -1?
Here is my understanding:

For any data set of size d + 2 we must have linear dependence, and the question is: with such inevitable linear dependence, can we find at least one specific data set on which all 2^N dichotomies can be implemented? The video lecture shows that for any data set of size d + 2 there are some dichotomies (specific to the data set) that the perceptron cannot implement, hence there is no data set of size d + 2 that can be shattered by the perceptron hypothesis set.

The proof considers a dichotomy (specific to the data set) with the two following properties:
- every point x_i with a non-zero coefficient a_i gets y_i = sign(a_i);
- the dependent point x_j gets y_j = -1.

Hope this helps.
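A quick numerical illustration of these two properties (my own sketch, not from the lecture; assumes numpy): take d+2 random points, compute the dependence coefficients a_i for the extra point x_j, and check that every sampled w that realizes y_i = sign(a_i) on the other points is forced to output +1 on x_j, so the dichotomy with y_j = -1 never appears.

```python
import numpy as np

rng = np.random.default_rng(2)
d, trials = 3, 200_000
X = rng.standard_normal((d + 2, d + 1))
X[:, 0] = 1.0                                  # zeroth coordinate is 1

# Dependence coefficients: x_j = sum_{i != j} a_i x_i.
c = np.linalg.svd(X.T)[2][-1]                  # null-space direction of X^T
j = int(np.argmax(np.abs(c)))
a = -np.delete(c, j) / c[j]
x_rest = np.delete(X, j, axis=0)
y = np.sign(a)                                 # target dichotomy: y_i = sign(a_i)

# Sample random perceptrons; keep those realizing y on the other d+1 points.
W = rng.standard_normal((trials, d + 1))
hits = np.all(np.sign(W @ x_rest.T) == y, axis=1)
assert hits.any()

# For each such w, w.x_j = sum_i a_i (w.x_i) is a sum of positive terms,
# so its sign is forced to +1 and y_j = -1 is unreachable.
assert np.all(np.sign(W[hits] @ X[j]) == 1.0)
print(f"{hits.sum()} sampled w's realized y_i = sign(a_i); all forced sign(w.x_j) = +1")
```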

 MaciekLeks 05-06-2016 12:40 AM

Re: Help in understanding proof for VC-dimension of perceptron.

Quote:
 Originally Posted by ntvy95 (Post 12336) Here is my understanding: For any data set of size d + 2 we must have linear dependence, and the question is: with such inevitable linear dependence, can we find at least one specific data set on which all 2^N dichotomies can be implemented? The video lecture shows that for any data set of size d + 2 there are some dichotomies (specific to the data set) that the perceptron cannot implement, hence there is no data set of size d + 2 that can be shattered by the perceptron hypothesis set.
I agree. I tried to write that in the Context part of my post.

Quote:
 Originally Posted by ntvy95 (Post 12336) The proof considers a dichotomy (specific to the data set) with the two following properties: - every point x_i with a non-zero coefficient a_i gets y_i = sign(a_i); - the dependent point x_j gets y_j = -1. Hope this helps.
Unfortunately it does not help. I understand the assumption, but I do not understand the source of confidence that we can correlate the coefficients a_i with the perceptron outputs y_i.

 ntvy95 05-06-2016 01:42 AM

Re: Help in understanding proof for VC-dimension of perceptron.

Quote:
 Originally Posted by MaciekLeks (Post 12339) Yes, I tried to write that in the Context part of my post. Unfortunately it does not help. I understand the assumption, but I do not understand the source of confidence that we can correlate the coefficients a_i with the perceptron outputs y_i.
I'm not sure if I understand your point right:

If the perceptron hypothesis set can shatter the data set, then for the same x_i there exist w_1 and w_2 such that sign(w_1^T x_i) = +1 and sign(w_2^T x_i) = -1. It means that if the data set is shatterable by the perceptron hypothesis set, then for any x_i in the data set we can choose the case to be y_i = +1 or y_i = -1 and be sure that there exists at least one hypothesis in the perceptron hypothesis set that implements the case we have chosen.

Instead of choosing y_i = +1 or y_i = -1 explicitly, the proof lets the choice of the value of y_i depend on the value of a_i: y_i = sign(a_i). Whatever the real value of a_i is, sign(a_i) only takes the value +1 or -1, hence y_i = +1 or y_i = -1. In my understanding, this dependence does not make the chosen dichotomy invalid.

 MaciekLeks 05-06-2016 03:34 AM

Re: Help in understanding proof for VC-dimension of perceptron.

Quote:
 Originally Posted by ntvy95 (Post 12340) Instead of choosing y_i = +1 or y_i = -1 explicitly, the proof lets the choice of the value of y_i depend on the value of a_i: y_i = sign(a_i). Whatever the real value of a_i is, sign(a_i) only takes the value +1 or -1, hence y_i = +1 or y_i = -1. In my understanding, this dependence does not make the chosen dichotomy invalid.
That's the point. This part is crucial: "the proof lets the choice of the value of y_i depend on the value of a_i". The only certain correlation is y_i = sign(w^T x_i). How do we know (definition, theorem, lemma, ...) that sign(a_i) works the same as sign(w^T x_i)? IMHO it cannot be drawn from x_j = \sum_{i \neq j} a_i x_i (where not all coefficients a_i are zero).

 ntvy95 05-06-2016 08:08 AM

Re: Help in understanding proof for VC-dimension of perceptron.

Quote:
 Originally Posted by MaciekLeks (Post 12341) That's the point. This part is crucial: "the proof lets the choice of the value of y_i depend on the value of a_i". The only certain correlation is y_i = sign(w^T x_i). How do we know (definition, theorem, lemma, ...) that sign(a_i) works the same as sign(w^T x_i)? IMHO it cannot be drawn from x_j = \sum_{i \neq j} a_i x_i (where not all coefficients a_i are zero).
I'm not sure if this is the point you are talking about. Here is my understanding:

If the perceptron hypothesis set can shatter the data set, then: if we set y_i = sign(a_i) for every i with a_i \neq 0, there must exist a w that satisfies sign(w^T x_i) = y_i = sign(a_i) for every such i. In other words, we first choose y_i = sign(a_i) for every i with a_i \neq 0, and then we must be able to find a w that satisfies the equation, if the perceptron hypothesis set can indeed shatter the data set.
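Written out, my reconstruction of the algebra behind this step is:

```latex
% d+2 points in R^{d+1} are linearly dependent:
x_j = \sum_{i \neq j} a_i x_i, \qquad \text{not all } a_i = 0.
% Choose the dichotomy: y_i = \mathrm{sign}(a_i) for a_i \neq 0, and y_j = -1.
% If some w realizes the y_i, then \mathrm{sign}(w^{\mathsf T} x_i) = \mathrm{sign}(a_i),
% i.e. a_i\, w^{\mathsf T} x_i > 0 for every i with a_i \neq 0. Hence
w^{\mathsf T} x_j = \sum_{i \neq j} a_i\, w^{\mathsf T} x_i > 0
\;\Longrightarrow\; \mathrm{sign}(w^{\mathsf T} x_j) = +1 \neq -1 = y_j,
% so no w realizes this dichotomy, and no d+2 points can be shattered.
```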

 MaciekLeks 05-09-2016 07:18 AM

Re: Help in understanding proof for VC-dimension of perceptron.

Quote:
 Originally Posted by ntvy95 (Post 12342) I'm not sure if this is the point you are talking about. Here is my understanding: If the perceptron hypothesis set can shatter the data set, then: if we set y_i = sign(a_i) for every i with a_i \neq 0, there must exist a w that satisfies sign(w^T x_i) = y_i = sign(a_i) for every such i. In other words, we first choose y_i = sign(a_i) for every i with a_i \neq 0, and then we must be able to find a w that satisfies the equation, if the perceptron hypothesis set can indeed shatter the data set.
You've built up the intuition for the proof. I knew this, but when I read what you wrote, I started to have less doubt. Thanks for that. I still have some doubt, but it's even hard to describe. It's high time to move on.
