LFD Book Forum (http://book.caltech.edu/bookforum/index.php)
-   The Final (http://book.caltech.edu/bookforum/forumdisplay.php?f=138)
-   -   P13 Question (http://book.caltech.edu/bookforum/showthread.php?t=1571)

 DavidNJ 09-17-2012 02:26 AM

P13 Question

The problem asks us to find the runs that are not separable by a hard-margin SVM with the RBF kernel; the wording suggests there should be some.

Using LIBSVM with C = 10^6, Ein is always 0 and svmpredict is always 100%. The number of support vectors varies between 6 and 12; the mode and median are 8.

Based on the question I was expecting some runs to be inseparable, so I increased the noise level (from 0.25 to 5.0). It still solved every run, although the number of support vectors went up.

I tried 1000 points instead of 100. Eout soared, but Ein didn't budge.
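For reference, here is roughly how I set this up. sklearn's SVC wraps LIBSVM, and the target f(x) = sign(x2 − x1 + 0.25 sin(πx1)) on [−1, 1]² with γ = 1.5 is my reading of the problem statement, so treat those as assumptions:

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Assumed setup (my reading of the problem): x uniform on [-1, 1]^2,
# target f(x) = sign(x2 - x1 + 0.25 sin(pi * x1)), RBF kernel with gamma = 1.5.
def target(X):
    return np.sign(X[:, 1] - X[:, 0] + 0.25 * np.sin(np.pi * X[:, 0]))

def one_run(n=100):
    X = rng.uniform(-1.0, 1.0, size=(n, 2))
    y = target(X)
    # C = 1e6 approximates the hard margin in the soft-margin formulation
    clf = SVC(C=1e6, kernel="rbf", gamma=1.5)
    clf.fit(X, y)
    e_in = float(np.mean(clf.predict(X) != y))
    return e_in, int(clf.n_support_.sum())

runs = [one_run() for _ in range(20)]
frac_inseparable = float(np.mean([e_in > 0 for e_in, _ in runs]))
print("fraction of runs with Ein > 0:", frac_inseparable)
print("support vector counts:", sorted(sv for _, sv in runs))
```

A run counts as inseparable when even this hard-margin approximation cannot drive Ein to 0 on the training set.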

 Andrs 09-17-2012 05:09 AM

Re: P13 Question

Quote:
 Originally Posted by DavidNJ (Post 5401) The problem wording is to find the runs that are not linearly separable with a hard margin SVM. The expectation is that there are some. Using LIBSVM with C=10^6, Ein is always 0; svmpredict is always 100%. The number of support vectors varies between 6 and 12, mode and median are 8. Based on the question I was expecting some to be linearly inseparable. I increased the noise level (from .25 to 5.0). Still solved everyone, although the number of support vectors went up. I tried 1000 points instead of 100. Eout soared, but Ein didn't budge.
How many experiments are you running, with 100 different training points in each?
If you are running enough experiments, you should get some with data that are not separable with the RBF kernel. Are you calculating your f(x) correctly?

 MLearning 09-17-2012 06:12 AM

Re: P13 Question

Quote:
 Originally Posted by Andrs (Post 5408) How many experiments are you running with 100 training points? If you are running enough experiments, you should get some experiments with data that are not linearly separable with the rbf kernel. Are you calculating your f(x) correctly?
Regardless of the number of experiments run, libsvm returns Ein = 0. I tried 10,000 runs and libsvm sees linearly separable data in all of them. On the other hand, an Octave implementation that I wrote returns Ein ≠ 0 most of the time (clearly qp could not handle the 100×100 matrix).

 vtrajan@vtrajan.net 09-17-2012 06:24 AM

Re: P13 Question

Quote:
 Originally Posted by DavidNJ (Post 5401) The problem wording is to find the runs that are not linearly separable with a hard margin SVM. The expectation is that there are some. Using LIBSVM with C=10^6, Ein is always 0; svmpredict is always 100%. The number of support vectors varies between 6 and 12, mode and median are 8. Based on the question I was expecting some to be linearly inseparable. I increased the noise level (from .25 to 5.0). Still solved everyone, although the number of support vectors went up. I tried 1000 points instead of 100. Eout soared, but Ein didn't budge.
My results agree with yours. I used libsvm in Python (via sklearn).
Rajan

 DavidNJ 09-17-2012 06:47 AM

Re: P13 Question

Quote:
 Originally Posted by MLearning (Post 5412) Regardless of the number of experiments run, libsvm returns Ein=0. I tried 10000 runs and libsvm sees a linearly separable data for all the runs. On the other hand, an Octave implementation that I wrote returns Ein ~=0 most of the time (clearly qp could not handle 100X100 matrix).
Which would be correct: LIBSVM or your Octave QP implementation?

Note that Eout varies: it gets worse with higher N and with greater noise (presumably the RBF kernel is fitting the noise).

 MLearning 09-17-2012 07:04 AM

Re: P13 Question

Quote:
 Originally Posted by DavidNJ (Post 5416) Which would be correct: LIBSVM or your Octave QP implementation? Note, Eout varies, gets worse with a higher N, and with greater noise (presumably the RBF kernel is fitting the noise).
In my case, when I plot the training data, I can verify that the data is linearly separable, but my Octave implementation fails to see it. Personally, I would rely on libsvm, given that it is designed to handle large matrices and uses efficient optimization techniques.
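One way to settle which implementation to trust is to check the hard-margin condition directly rather than just Ein: for a separable run, every training point should satisfy y·g(x) ≥ 1, up to solver tolerance. A sketch, with the target and γ again being my assumptions about the setup:

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(1)
X = rng.uniform(-1.0, 1.0, size=(100, 2))
# Assumed target from the problem statement
y = np.sign(X[:, 1] - X[:, 0] + 0.25 * np.sin(np.pi * X[:, 0]))

clf = SVC(C=1e6, kernel="rbf", gamma=1.5).fit(X, y)

# Hard-margin KKT check: y_i * g(x_i) >= 1 for all i (SVs sit at exactly 1).
# A margin that is positive but well below 1 points to solver-tolerance
# trouble rather than genuinely inseparable data.
margins = y * clf.decision_function(X)
print("smallest functional margin:", margins.min())
```

If a home-grown QP reports Ein ≠ 0 while this minimum margin is near 1, the QP (not the data) is the problem.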

 yaser 09-17-2012 07:46 AM

Re: P13 Question

Quote:
 Originally Posted by DavidNJ (Post 5401) The problem wording is to find the runs that are not linearly separable with a hard margin SVM. The expectation is that there are some.
There is not necessarily an expectation one way or the other. You should report whatever the data gives you.

 DavidNJ 09-17-2012 01:44 PM

Re: P13 Question

Quote:
 Originally Posted by yaser (Post 5422) There is not necessarily an expectation one way or the other. You should report whatever the data gives you.
Professor Yaser: A trick question? Ouch! :(

Quote:
 Originally Posted by MLearning In my case, when I plot the training data, I can verify that the data is linearly separable.
The noise definition (0.25*sin()) limits the noise to fairly small values. And this is fitting an arbitrary number of support vectors (in my test, up to 12 with 0.25 and up to 17 with 5.0), which greatly extends what counts as "linear". For comparison, question 12 could be answered by visual inspection of the plot.

 yaser 09-17-2012 02:31 PM

Re: P13 Question

Quote:
 Originally Posted by DavidNJ (Post 5436) Professor Yaser: A trick question? Ouch! :(
We have to keep everyone on their toes. :)


The contents of this forum are to be used ONLY by readers of the Learning From Data book by Yaser S. Abu-Mostafa, Malik Magdon-Ismail, and Hsuan-Tien Lin, and participants in the Learning From Data MOOC by Yaser S. Abu-Mostafa. No part of these contents is to be communicated or made accessible to ANY other person or entity.