Question 12
It looks like there are two answers for Q13. It's possible to get a different number of support vectors with Octave's qp and with libsvm.

Re: Question 13
I think it has to do with the fact that qp (and quadprog in MATLAB) returns alpha values that are negligibly small rather than exactly zero. By setting an appropriate threshold, it is possible to filter out these very small values.
In Homework 7, one of the students introduced a trick to work around the initialization problem in qp (or quadprog). When I applied this trick, qp and libsvm gave different numbers of SVs. However, when I initialized all the alphas to zero, libsvm and Octave's qp yielded the same number of SVs.
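The thresholding idea can be sketched as follows. This is only an illustration: the alpha values and the tolerance are made up, and `count_support_vectors` is a hypothetical helper, not part of qp, quadprog, or libsvm.

```python
def count_support_vectors(alphas, tol=1e-8):
    """Count multipliers larger than tol; anything smaller is treated as zero."""
    return sum(1 for a in alphas if a > tol)


# Hypothetical multipliers, invented for illustration only.
alphas = [0.0, 0.7037, 0.7037, 0.8889, 0.2593, 3.1e-13, 0.2593]

# With no tolerance, the negligible 3.1e-13 entry is counted as a support vector.
print(count_support_vectors(alphas, tol=0.0))
# With a small tolerance, it is filtered out.
print(count_support_vectors(alphas, tol=1e-8))
```

The count depends on the tolerance, which is one way two solvers can report different numbers of support vectors for the same problem.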
Re: Question 13
In this problem the vectors are placed symmetrically. In the qp solution, one of them lies on the margin with alpha == 0.

Re: Question 13
This is the only question I got wrong on the final, and I would have gotten it right if I had used my libsvm answer rather than my hand-built version with qp (all in Octave). My qp (wrong!) answer had one fewer support vector than I got with libsvm, and that might only be because I used 10e-12 as a threshold. (If I had omitted the threshold, I would have gotten the same number of SVs as with libsvm :shock:).
I got w = [0.88889, 5.0e-16] and b = 1.6667 using qp, but strangely I get w = [0.88869, 0] and b = 1.6663 using libsvm. Both have Ein = 0, and over a thousand test runs of a million random points in [-3,3]^2 they agree on the labels in 99.999% of cases on average. (For libsvm, I call svmpredict with all labels = +1, which is ~71% accurate :), to get the actual predicted labels.) The difference in sign may not be significant. I got w and b for qp directly by following the class slides, but in the libsvm case I used w = model.SVs'*model.sv_coef and b = model.rho (which may not be exactly correct). The values of alpha (for qp) differ from model.sv_coef, and the qp version uses all but the last of the libsvm support vectors. So I do agree that there may be two correct answers for this question, arising from numerical issues and from the different ways qp and libsvm handle the calculations, which are beyond the student's control. If required, I can PM the alphas and the code I used to support the claim, or wait and post an **answer** after the deadline.
Re: Question 13
Can you guys do the following: perturb one of the symmetric SVs by a small amount, run your qp programs again, and see if the ambiguity goes away? I will do that myself, but I wanted more people with different packages to try as well. Thank you.

Re: Question 13
Quote:
Code:
w = model.SVs' * model.sv_coef; 
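For reference, the same computation in plain Python, as a sketch only. It assumes a linear kernel, where w = sum_i sv_coef_i * SV_i is meaningful in input space, and it uses b = -rho because libsvm's decision function is sum_i sv_coef_i K(SV_i, x) - rho. The helper name `linear_w_b` and the numbers are hypothetical; with a non-linear kernel, as in this question, the input-space w is not the classifier's weight vector.

```python
def linear_w_b(svs, sv_coef, rho):
    """Return (w, b) for a linear-kernel model.

    w is the coefficient-weighted sum of the support vectors;
    b is -rho, following libsvm's sign convention for the decision function.
    """
    dim = len(svs[0])
    w = [sum(c * sv[k] for sv, c in zip(svs, sv_coef)) for k in range(dim)]
    return w, -rho


# Hypothetical model values, invented for illustration only.
svs = [[1.0, 0.0], [-1.0, 0.0]]
sv_coef = [-0.5, 0.5]  # each entry is alpha_i * y_i
rho = 0.25

w, b = linear_w_b(svs, sv_coef, rho)
print(w, b)
```

Some libsvm interfaces also flip the sign of the whole decision function depending on which label comes first in the training data, so it is worth checking predictions against svmpredict before trusting w and b.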
Re: Question 13
Using x = [1 0; 0 1; 0 -1.00002; -1 0; 0 2.0001; 0 -2; -2 0]; I still get one fewer SV with qp than with libsvm (i.e., the same values I get without perturbing). This remains the case when perturbing only one SV. The most notable change is that the second weight entry grows, although the first entry and b also change.
Also, thanks to fgpancorbo for the code for getting w & b from libsvm. Useful for the future. 
Re: Question 13
Quote:
In z space, X6 and X7 map to the same point, i.e., X6 (0, -2) and X7 (-2, 0) both map to (3, 5). I wonder if this has any effect on the computation.
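A quick check of that coincidence, as a sketch under two assumptions that are not stated in this thread: that the points are X6 = (0, -2) and X7 = (-2, 0), and that the transform is z1 = x2^2 - 2*x1 - 1, z2 = x1^2 - 2*x2 + 1. The function name `to_z` is hypothetical.

```python
def to_z(x1, x2):
    """Assumed transform: z1 = x2^2 - 2*x1 - 1, z2 = x1^2 - 2*x2 + 1."""
    return (x2 ** 2 - 2 * x1 - 1, x1 ** 2 - 2 * x2 + 1)


x6 = (0, -2)  # assumed coordinates of X6
x7 = (-2, 0)  # assumed coordinates of X7

print(to_z(*x6), to_z(*x7))
print(to_z(*x6) == to_z(*x7))
```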
Re: Question 13
Using libsvm in MATLAB, I perturbed [0,1] to [0,0.94], and it reduced the number of support vectors by one. w and b agree with what others have seen for libsvm.

Re: Question 13
I used libsvm and got this question right.

Re: Question 13
As a quick (read: minimal-effort) check of model equivalence, I compared the predicted labels for 1,000,000 randomly selected points within [-3,3] x [-3,3], using the support vectors from both Octave/qp and Python/libsvm. Only three points were classified differently, despite the difference in the number of support vectors returned by the two approaches. I'm certain that an analytical comparison of the support vectors would prove their equivalence; however, it hardly seems necessary given the empirical results.
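A comparison of this kind can be sketched as below. The two decision functions are hypothetical stand-ins, not the actual qp or libsvm models from this thread; in practice each would be sum_i alpha_i y_i K(x_i, x) + b built from the respective solver's output.

```python
import random


def agreement(f, g, n=100_000, lo=-3.0, hi=3.0, seed=0):
    """Fraction of n uniform random points in [lo, hi]^2 where f and g give the same sign."""
    rng = random.Random(seed)
    same = 0
    for _ in range(n):
        x = (rng.uniform(lo, hi), rng.uniform(lo, hi))
        if (f(x) >= 0) == (g(x) >= 0):
            same += 1
    return same / n


# Hypothetical stand-ins for two nearly identical classifiers.
def f_a(x):
    return x[0] ** 2 - 1.5


def f_b(x):
    return 1.0001 * x[0] ** 2 - 1.5


print(agreement(f_a, f_b))
```

Fixing the seed makes the comparison repeatable; averaging over several seeds gives a figure comparable to the multi-run averages reported above.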

Re: Question 13
I had quite a few problems solving this question, and it forced me to play around with different approaches (qp, libsvm). One more approach to consider is this: Lecture 15, slide 5:
In the case of the kernel used in exercise 13, there is a corresponding transformation, given explicitly on that slide. So why not give it a try? I gained some confidence in the result after reading the slide's title :).
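A sketch of that idea, assuming the kernel is the second-order polynomial kernel K(x, x') = (1 + x.x')^2 on 2-D inputs (the thread does not state the kernel, so this is an assumption). Expanding the square gives the explicit map Phi(x) = (1, x1^2, x2^2, sqrt(2) x1, sqrt(2) x2, sqrt(2) x1 x2), and the code checks numerically that Phi(x).Phi(x') matches K(x, x'). The function names are hypothetical.

```python
import math


def kernel(x, xp):
    """Assumed kernel: K(x, x') = (1 + x.x')^2 for 2-D inputs."""
    return (1 + x[0] * xp[0] + x[1] * xp[1]) ** 2


def phi(x):
    """Explicit feature map whose inner product reproduces the kernel above."""
    r2 = math.sqrt(2)
    return (1.0, x[0] ** 2, x[1] ** 2, r2 * x[0], r2 * x[1], r2 * x[0] * x[1])


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


# Arbitrary test points, chosen only to exercise the identity.
x, xp = (0.5, -2.0), (-1.5, 0.25)
print(kernel(x, xp), dot(phi(x), phi(xp)))
```

With the explicit map in hand, the problem becomes an ordinary linear SVM in the 6-dimensional feature space, which gives a third way to cross-check the qp and libsvm answers.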