Quote:
Originally Posted by hsolo
Is the 'heuristic' number of parameters (the VC-dimension proxy) to be used when reasoning about generalization then the number of margin support vectors (<< the number of all support vectors)?
When we use kernel functions with soft-margin SVMs (Problem 2, etc.), where there is no explicit w, does the above translate to:
* 1 ==> Use all support vectors to compute the sigma (summation) term in the hypothesis function g()
* 2 ==> Use only margin support vectors for b (which is also used in g())?
I was wondering if this aspect was covered in the lecture or any of the additional material -- I seem to have missed it.
The computation of $\mathbf{w} = \sum_{n=1}^{N} \alpha_n y_n \mathbf{x}_n$ involves all support vectors, margin and otherwise, since it involves all $\alpha_n$'s that are bigger than zero. Assuming $\mathbf{w}$ has been computed, the computation of $b$, for both hard and soft margins, involves any one support vector (a margin support vector in the case of soft margin), since it is based on solving the equation $y_n(\mathbf{w}^{\mathsf{T}}\mathbf{x}_n + b) = 1$ for $b$.
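A minimal sketch of these two steps, assuming the dual problem has already been solved; the names `alphas`, `X`, `y`, and `C` (the soft-margin upper bound) are hypothetical. Picking a margin support vector uses the soft-margin condition $0 < \alpha_n < C$, and for a margin support vector $(\mathbf{x}_m, y_m)$, solving $y_m(\mathbf{w}^{\mathsf{T}}\mathbf{x}_m + b) = 1$ gives $b = y_m - \mathbf{w}^{\mathsf{T}}\mathbf{x}_m$, since $1/y_m = y_m$ when $y_m = \pm 1$.

```python
import numpy as np

# Hypothetical inputs from a solved dual QP:
#   X: (N, d) training inputs, y: (N,) labels in {-1, +1},
#   alphas: (N,) dual variables, C: soft-margin upper bound.

def compute_w(alphas, X, y, tol=1e-8):
    sv = alphas > tol                      # all support vectors, margin or otherwise
    return (alphas[sv] * y[sv]) @ X[sv]    # w = sum_n alpha_n * y_n * x_n

def compute_b(alphas, X, y, C, w, tol=1e-8):
    # Soft margin: margin support vectors satisfy 0 < alpha_n < C.
    # (Hard margin: any support vector, alpha_n > 0, will do.)
    m = np.where((alphas > tol) & (alphas < C - tol))[0][0]
    return y[m] - w @ X[m]                 # from y_m * (w . x_m + b) = 1
```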
In the case of kernels, the explicit evaluation of $\mathbf{w}$ followed by taking an inner product with a point $\mathbf{x}$ is replaced by evaluating the kernel with two arguments, one being a support vector (margin or otherwise) and the other being the point $\mathbf{x}$, and repeating that for all support vectors (margin or otherwise).
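To illustrate, here is a hedged sketch of the resulting hypothesis $g(\mathbf{x}) = \operatorname{sign}\bigl(\sum_{\alpha_n > 0} \alpha_n y_n K(\mathbf{x}_n, \mathbf{x}) + b\bigr)$; the RBF kernel below is just a stand-in for whatever kernel $K$ is in use, and the names continue the sketch above:

```python
import numpy as np

def rbf_kernel(x1, x2, gamma=1.0):
    # One possible choice of K; any valid kernel fits here.
    return np.exp(-gamma * np.sum((x1 - x2) ** 2))

def g(x, alphas, X, y, b, kernel=rbf_kernel, tol=1e-8):
    # Evaluate the kernel between each support vector and the point x,
    # over ALL support vectors (margin or otherwise); w is never formed.
    sv = np.where(alphas > tol)[0]
    s = sum(alphas[n] * y[n] * kernel(X[n], x) for n in sv)
    return np.sign(s + b)
```

Note that in the kernel case $b$ comes from the same margin-support-vector equation as above, with the kernel sum standing in for $\mathbf{w}^{\mathsf{T}}\mathbf{x}_m$.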