#1
I generated 10 real-number pairs in [-1, +1] and labeled them using the function f = 1 + 1.91*x1 + 2.13*x2. I used the LibSVM package in a Windows C++ environment, setting svm_type to C_SVC and kernel_type to LINEAR, and called svm_train() to train on this data. It returned 6 SVs and the model parameters. Testing on the same data set (via svm_predict()), I found that 2 of the 10 points are misclassified! This happens quite frequently when I repeat the process with a randomly changed f() and data. That's quite a poor result, given that a PLA can easily get to 100% classification on the training set. I have to believe I'm doing something wrong. Has anyone used LibSVM this way? I would appreciate any help, tips, or pointers. This is driving me up the wall! Thanks.
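For reference, here is a minimal sketch of the data-generation step described above, in plain Python rather than C++ (the function name `make_dataset` is my own, not part of LibSVM):

```python
import random

# Draw n points uniformly from [-1, +1]^2 and label each with the sign of
# the target f(x1, x2) = 1 + 1.91*x1 + 2.13*x2, as described in the post.
def make_dataset(n=10, seed=0):
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x1 = rng.uniform(-1.0, 1.0)
        x2 = rng.uniform(-1.0, 1.0)
        y = 1 if 1 + 1.91 * x1 + 2.13 * x2 > 0 else -1
        data.append(((x1, x2), y))
    return data

dataset = make_dataset()
```

Since the labels come directly from a linear function, the set is linearly separable by construction, which is what makes in-sample misclassifications surprising.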
#2
In this homework, where we are using "hard-margin" SVM (the type covered in Lecture 14, as opposed to "soft-margin" which is covered next week) and with the data set being linearly separable by design, the in-sample error has to be zero. There has to be a bug somewhere. Hopefully others can share their experience with other QP packages as well.
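To illustrate the zero in-sample error benchmark, here is a sketch (not homework code, and independent of any QP package) showing that the perceptron learning algorithm reaches zero in-sample error on a linearly separable set like the one in this thread; a correct hard-margin SVM must do at least as well:

```python
import random

# PLA on points labeled by a linear target: on separable data it converges
# to a weight vector w = (w0, w1, w2) that classifies every point correctly.
def pla(points, labels, max_iters=100000):
    w = [0.0, 0.0, 0.0]  # weights for the augmented input (1, x1, x2)
    for _ in range(max_iters):
        misclassified = [i for i, (x, y) in enumerate(zip(points, labels))
                         if (1 if w[0] + w[1] * x[0] + w[2] * x[1] > 0 else -1) != y]
        if not misclassified:
            return w  # zero in-sample error
        i = misclassified[0]  # deterministic pick; any misclassified point works
        x, y = points[i], labels[i]
        w[0] += y
        w[1] += y * x[0]
        w[2] += y * x[1]
    return w

rng = random.Random(1)
pts = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(10)]
ys = [1 if 1 + 1.91 * x1 + 2.13 * x2 > 0 else -1 for (x1, x2) in pts]
w = pla(pts, ys)
```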
__________________
Where everyone thinks alike, no one thinks very much
#3
I'm using a different package (Python and cvxopt), but had the same issue (some points would occasionally be misclassified). It turned out to be a bug in my code, in the way I was computing the inputs to the quadratic program. Try checking whether the returned solution satisfies the constraints, and whether it is at least a local minimum (small feasible perturbations of the solution should not decrease the objective).
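Along the lines of the constraint check suggested above, here is a sketch of the feasibility tests one could run on a candidate hard-margin dual solution. All names are hypothetical (nothing here is any library's API), and it only checks the constraints, not the local-minimum condition:

```python
# Sanity-check a hard-margin SVM dual solution (alphas, b) against the data.
# Assumes the standard dual: alpha_n >= 0, sum_n alpha_n*y_n = 0, and the
# recovered w = sum_n alpha_n*y_n*x_n must give margins y_n*(w.x_n + b) >= 1.
def check_hard_margin_solution(X, y, alpha, b, tol=1e-6):
    n = len(X)
    # Dual feasibility.
    assert all(a >= -tol for a in alpha), "negative alpha"
    assert abs(sum(a * yn for a, yn in zip(alpha, y))) < tol, "sum alpha*y != 0"
    # Recover the primal weights and check every margin constraint.
    w = [sum(alpha[i] * y[i] * X[i][d] for i in range(n)) for d in range(len(X[0]))]
    for xn, yn in zip(X, y):
        margin = yn * (sum(wd * xd for wd, xd in zip(w, xn)) + b)
        assert margin >= 1 - tol, "margin constraint violated: %g" % margin
    return w

# Toy check: for x = (-1,0) labeled -1 and x = (1,0) labeled +1, the
# hard-margin solution is w = (1, 0), b = 0, with alpha = (0.5, 0.5).
w = check_hard_margin_solution([(-1.0, 0.0), (1.0, 0.0)], [-1, 1], [0.5, 0.5], 0.0)
```

If the solver's output fails one of these assertions, the bug is upstream of the solver (e.g. in how the quadratic coefficient matrix was built) or in how its answer is being read back.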
#4
I am using the svm command in R (e1071 package) and get good results from it. But I found that for the "linear" kernel I need to set the cost (i.e. C) value very high (e.g. 200 to 2000) to get a good fit and a low number of support vectors. The problem did not specify that we should use a high C for model fitting. Is that a typical observation for a linear kernel?
#6
I'm getting pretty poor results from qp() in Octave: so far it has classified my points correctly, but the line is frequently in the wrong place, so that it is not equidistant from the support points. I found from the previous class's discussion that this is a known problem with the Octave implementation, and there is a trick (which I haven't tried yet) to make Octave perform better.
So it wouldn't surprise me if other packages were also giving so-so results, and that this is a problem that has nothing to do with student bugs. On the other hand, lots of people do have bugs in their code...
#7
Thanks for your suggestions, Ilya, Anand & Anne.
I'll try making the cost parameter C high; so far I have only tried C=1.0 and C=0.5. I've noticed that the SVs the LibSVM package identifies cannot all be equidistant from any single line, and that all their alpha values are identical. In my implementation I do the training, which creates the model, and then test on the exact same file as the training file. The testing uses the model file created by LibSVM, so the alphas, the parameter b, etc. are handled transparently and I don't touch them. The only remaining possibilities for error are a wrong parameter setting or errors in the input.
#8
Thanks for the hint on the cost parameter - that finally got my R code to produce sensible results :-)
#9
Just wanted to close the loop. The cost parameter C was the culprit! When it is above roughly 50, the results make sense and the number of SVs and the alpha values stabilize. Around C=15, the number of SVs begins to increase and the results get poorer. I think the comment that a high C forces a hard-margin solution probably explains this. I also noticed that at lower values of C the solver uses fewer iterations to reach a solution.
Thanks all.