Quote:
Originally Posted by MaciekLeks
Context:
To show that  we need to prove we cannot shatter any set od  points. In this case we have  vectors of  , where  , hence we have  vector than the total number of dimensions (our space is  -dimensional). According to the theory of vector spaces when we have more vectors than dimensions of the space then they are linearly dependent (one of the vectors is a linear combination of the others). Which means that for instance our extra point  , can be computed by the equation (in  -dimensional space):  , (where not all  coefficients are zeros).
We need to show that we can't get  (  ) dichotomies, where  is of  size.
Problem:
And now let me define my problem I have with understanding the video proof. Why do we correlate  coefficient with the  while they don't correlate? On what basis we can draw that  , and that's why  , hence we can't generate  ?
|
Here is my understanding:
For
any data set of size d + 2, we must have linear dependence, and the question is: With such inevitable linear dependence, can we find at least a specific data set that can be implemented 2^N dichotomies? The video lecture shows that for
any data set of size d + 2, there are
some dichotomies (specific to the data set) that the perceptron cannot implement, hence there's no such a data set of size d + 2 can be shattered by the perceptron hypothesis set.
The proof tries to consider
some dichotomies (specific to the data set) have two following properties:
-

with non-zero

get

.
-

gets

.
Hope this helps.