LFD Book Forum Help in understanding proof for VC-dimension of perceptron.

#1
07-22-2015, 11:20 PM
 yongxien Junior Member Join Date: Jun 2015 Posts: 8
Help in understanding proof for VC-dimension of perceptron.

https://work.caltech.edu/library/072.pdf

I am referring to the slides given in the link above.

I have a few questions regarding this proof:
1. Is the matrix invertible because of the way we construct it , such that it is lower triangular. If it is the case, I don't see why it does not work for d+2 or any other k > dimension of the perceptron.
2. Why the second part of the proof d + 1 >= d_vc not work for the case when k = d + 1 but only d +2?
3. I don't understand the statement why more points than dimension means we must have x_j = \sigma a_i x_i? (btw, d+1 is also more points than dimension)
#2
07-23-2015, 12:01 AM
 yongxien Junior Member Join Date: Jun 2015 Posts: 8
Re: Help in understanding proof for VC-dimension of perceptron.

I don't understand the second part of the proof too. Why y_i = sign(w^T x_i) = sign(a_i).
#3
07-23-2015, 01:20 AM
 yaser Caltech Join Date: Aug 2009 Location: Pasadena, California, USA Posts: 1,477
Re: Help in understanding proof for VC-dimension of perceptron.

Quote:
 Originally Posted by yongxien 1. Is the matrix invertible because of the way we construct it , such that it is lower triangular. If it is the case, I don't see why it does not work for d+2 or any other k > dimension of the perceptron.
The number of columns is restricted to (length of the input vector) so the matrix would not be square if we had rows.

Quote:
 2. Why the second part of the proof d + 1 >= d_vc not work for the case when k = d + 1 but only d +2?
Because if you have only vectors, they can be linearly independent. The inevitable linear dependence of any vectors is what makes that part of the proof work.

Quote:
 3. I don't understand the statement why more points than dimension means we must have x_j = \sigma a_i x_i? (btw, d+1 is also more points than dimension)
This is a consequence of the basic linear algebra result mentioned in the response to point 2. As for the other remark, the length of the ector is , so (because of the zeroth coordinate) would not be more points than dimensions.
__________________
Where everyone thinks alike, no one thinks very much
#4
05-05-2016, 06:55 AM
 MaciekLeks Member Join Date: Jan 2016 Location: Katowice, Upper Silesia, Poland Posts: 17
Re: Help in understanding proof for VC-dimension of perceptron.

Context:
To show that we need to prove we cannot shatter any set od points. In this case we have vectors of , where , hence we have vector than the total number of dimensions (our space is -dimensional). According to the theory of vector spaces when we have more vectors than dimensions of the space then they are linearly dependent (one of the vectors is a linear combination of the others). Which means that for instance our extra point , can be computed by the equation (in -dimensional space): , (where not all coefficients are zeros).
We need to show that we can't get ( ) dichotomies, where is of size.

Problem:

And now let me define my problem I have with understanding the video proof. Why do we correlate coefficient with the while they don't correlate? On what basis we can draw that , and that's why , hence we can't generate ?
#5
05-05-2016, 09:57 AM
 ntvy95 Member Join Date: Jan 2016 Posts: 37
Re: Help in understanding proof for VC-dimension of perceptron.

Quote:
 Originally Posted by MaciekLeks Context: To show that we need to prove we cannot shatter any set od points. In this case we have vectors of , where , hence we have vector than the total number of dimensions (our space is -dimensional). According to the theory of vector spaces when we have more vectors than dimensions of the space then they are linearly dependent (one of the vectors is a linear combination of the others). Which means that for instance our extra point , can be computed by the equation (in -dimensional space): , (where not all coefficients are zeros). We need to show that we can't get ( ) dichotomies, where is of size. Problem: And now let me define my problem I have with understanding the video proof. Why do we correlate coefficient with the while they don't correlate? On what basis we can draw that , and that's why , hence we can't generate ?
Here is my understanding:

For any data set of size d + 2, we must have linear dependence, and the question is: With such inevitable linear dependence, can we find at least a specific data set that can be implemented 2^N dichotomies? The video lecture shows that for any data set of size d + 2, there are some dichotomies (specific to the data set) that the perceptron cannot implement, hence there's no such a data set of size d + 2 can be shattered by the perceptron hypothesis set.

The proof tries to consider some dichotomies (specific to the data set) have two following properties:
- with non-zero get .
- gets .

Hope this helps.
#6
05-06-2016, 12:40 AM
 MaciekLeks Member Join Date: Jan 2016 Location: Katowice, Upper Silesia, Poland Posts: 17
Re: Help in understanding proof for VC-dimension of perceptron.

Quote:
 Originally Posted by ntvy95 Here is my understanding: For any data set of size d + 2, we must have linear dependence, and the question is: With such inevitable linear dependence, can we find at least a specific data set that can be implemented 2^N dichotomies? The video lecture shows that for any data set of size d + 2, there are some dichotomies (specific to the data set) that the perceptron cannot implement, hence there's no such a data set of size d + 2 can be shattered by the perceptron hypothesis set.
I agree. I tried to write it in Context part of my post.

Quote:
 Originally Posted by ntvy95 The proof tries to consider some dichotomies (specific to the data set) have two following properties: - with non-zero get . - gets . Hope this helps.
Unfortunately it does not help. I understand the assumption, but I do not understand the source of confidence that we can correlate with perceptron outputs.
#7
05-06-2016, 01:42 AM
 ntvy95 Member Join Date: Jan 2016 Posts: 37
Re: Help in understanding proof for VC-dimension of perceptron.

Quote:
 Originally Posted by MaciekLeks Yes, I tried to write it in Context part of my post. Unfortunately it does not help. I understand the assumption, but I do not understand the source of confidence that we can correlate with perceptron outputs.
I'm not sure if I understand your point right:

If the perceptron hypothesis set can shatter the data set then for the same , there exists and such that and . It means that if the data set is shatterable by the perceptron hypothesis set, for any in the data set, we can choose the case to be or and be sure that there exists at least one hypothesis from perceptron hypothesis set can implement the case we have chosen.

Instead of choosing or explicitly, the proof let the choice for the value of depends on the value of : , because for whatever the real value of is, only has value of 1 or -1, hence or . In my understanding, this dependence does not make the chosen dichotomy invalid.
#8
05-06-2016, 03:34 AM
 MaciekLeks Member Join Date: Jan 2016 Location: Katowice, Upper Silesia, Poland Posts: 17
Re: Help in understanding proof for VC-dimension of perceptron.

Quote:
 Originally Posted by ntvy95 Instead of choosing or explicitly, the proof let the choice for the value of depends on the value of : , because for whatever the real value of is, only has value of 1 or -1, hence or . In my understanding, this dependence does not make the chosen dichotomy invalid.
That's the point. This part is crucial: "the proof let the choice for the value of depends on the value of ". The only certain correlation is . How do we know (definition, theorem, lemma,...) that works the same as ? IMHO it cannot be drawn from the , (where not all coefficients are zeros).
#9
05-06-2016, 08:08 AM
 ntvy95 Member Join Date: Jan 2016 Posts: 37
Re: Help in understanding proof for VC-dimension of perceptron.

Quote:
 Originally Posted by MaciekLeks That's the point. This part is crucial: "the proof let the choice for the value of depends on the value of ". The only certain correlation is . How do we know (definition, theorem, lemma,...) that works the same as ? IMHO it cannot be drawn from the , (where not all coefficients are zeros).
I'm not sure if this is the point you are talking about: Here is my understanding:

If the perceptron hypothesis set can shatter the data set then: If for every that then there exists such a that satisfies for every that . In other words, we first have for every that , then we must be able to find a that satisfies the equation if the perceptron hypothesis set can shatter the data set.
#10
05-09-2016, 07:18 AM
 MaciekLeks Member Join Date: Jan 2016 Location: Katowice, Upper Silesia, Poland Posts: 17
Re: Help in understanding proof for VC-dimension of perceptron.

Quote:
 Originally Posted by ntvy95 I'm not sure if this is the point you are talking about: Here is my understanding: If the perceptron hypothesis set can shatter the data set then: If for every that then there exists such a that satisfies for every that . In other words, we first have for every that , then we must be able to find a that satisfies the equation if the perceptron hypothesis set can shatter the data set.
You've built up the intuition for the proof. I knew this, but when I read what you wrote, I started to have less doubt. Thanks for that. I still have some doubt, but it's even hard to describe. It's high time to move on.

 Thread Tools Display Modes Linear Mode

 Posting Rules You may not post new threads You may not post replies You may not post attachments You may not edit your posts BB code is On Smilies are On [IMG] code is On HTML code is Off Forum Rules
 Forum Jump User Control Panel Private Messages Subscriptions Who's Online Search Forums Forums Home General     General Discussion of Machine Learning     Free Additional Material         Dynamic e-Chapters         Dynamic e-Appendices Course Discussions     Online LFD course         General comments on the course         Homework 1         Homework 2         Homework 3         Homework 4         Homework 5         Homework 6         Homework 7         Homework 8         The Final         Create New Homework Problems Book Feedback - Learning From Data     General comments on the book     Chapter 1 - The Learning Problem     Chapter 2 - Training versus Testing     Chapter 3 - The Linear Model     Chapter 4 - Overfitting     Chapter 5 - Three Learning Principles     e-Chapter 6 - Similarity Based Methods     e-Chapter 7 - Neural Networks     e-Chapter 8 - Support Vector Machines     e-Chapter 9 - Learning Aides     Appendix and Notation     e-Appendices

All times are GMT -7. The time now is 01:36 PM.