LFD Book Forum P13 Question
#1
09-17-2012, 03:26 AM
 DavidNJ Member Join Date: Jul 2012 Posts: 28
P13 Question

The problem wording is to find the runs that are not linearly separable with a hard margin SVM. The expectation is that there are some.

Using LIBSVM with C = 10^6, Ein is always 0; svmpredict always reports 100% accuracy. The number of support vectors varies between 6 and 12; the mode and median are 8.

Based on the question I was expecting some to be linearly inseparable. I increased the noise level (from 0.25 to 5.0). It still solved every one, although the number of support vectors went up.

I tried 1000 points instead of 100. Eout soared, but Ein didn't budge.
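For concreteness, here is a sketch of how one run might be generated. The target f(x) = sign(x2 − x1 + 0.25·sin(πx1)) on [−1, 1]² is my reading of the problem setup, inferred from the "0.25*sin()" term mentioned later in this thread, so treat the exact form as an assumption:

```python
import math
import random

def target(x1, x2):
    # Assumed target: the 0.25*sin(pi*x1) term is the "noise"
    # component discussed in this thread.
    return 1 if x2 - x1 + 0.25 * math.sin(math.pi * x1) > 0 else -1

def generate_run(n=100, seed=None):
    """Draw n points uniformly from [-1, 1]^2 and label them with the target."""
    rng = random.Random(seed)
    X = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(n)]
    y = [target(x1, x2) for x1, x2 in X]
    return X, y

X, y = generate_run(n=100, seed=0)
```

Each experiment would then draw a fresh run like this and train the hard-margin RBF SVM on it before checking Ein.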
#2
09-17-2012, 06:09 AM
 Andrs Member Join Date: Jul 2012 Posts: 47
Re: P14 Question

Quote:
 Originally Posted by DavidNJ The problem wording is to find the runs that are not linearly separable with a hard margin SVM. The expectation is that there are some. Using LIBSVM with C=10^6, Ein is always 0; svmpredict is always 100%. The number of support vectors varies between 6 and 12, mode and median are 8. Based on the question I was expecting some to be linearly inseparable. I increased the noise level (from .25 to 5.0). Still solved everyone, although the number of support vectors went up. I tried 1000 points instead of 100. Eout soared, but Ein didn't budge.
How many experiments are you running, with 100 different training points in each?
If you run enough experiments, you should get some experiments with data that is not linearly separable with the RBF kernel. Are you calculating your f(x) correctly?
#3
09-17-2012, 07:12 AM
 MLearning Senior Member Join Date: Jul 2012 Posts: 56
Re: P14 Question

Quote:
 Originally Posted by Andrs How many experiments are you running with 100 training points? If you are running enough experiments, you should get some experiments with data that are not linearly separable with the rbf kernel. Are you calculating your f(x) correctly?
Regardless of the number of experiments run, libsvm returns Ein = 0. I tried 10,000 runs and libsvm sees linearly separable data in every run. On the other hand, an Octave implementation that I wrote returns Ein ≠ 0 most of the time (clearly qp could not handle the 100×100 matrix).
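For anyone debugging a hand-rolled Octave/qp version: the quadratic program both solvers are in principle solving is the standard hard-margin SVM dual, stated here for reference:

```latex
\max_{\alpha}\ \sum_{i=1}^{N} \alpha_i
  - \frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N}
    \alpha_i \alpha_j \, y_i y_j \, K(x_i, x_j)
\qquad \text{subject to} \qquad
\alpha_i \ge 0, \quad \sum_{i=1}^{N} \alpha_i y_i = 0,
```

with the RBF kernel \(K(x_i, x_j) = e^{-\gamma \lVert x_i - x_j \rVert^2}\). One practical caveat: the N×N kernel matrix is often nearly singular, and generic QP solvers can fail or stall on it unless the quadratic term is regularized slightly (e.g. by adding a small multiple of the identity). That alone can make a qp-based implementation report a nonzero Ein on data that libsvm separates without trouble.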
#4
09-17-2012, 07:24 AM
 vtrajan@vtrajan.net Junior Member Join Date: Jul 2012 Posts: 5
Re: P14 Question

Quote:
 Originally Posted by DavidNJ The problem wording is to find the runs that are not linearly separable with a hard margin SVM. The expectation is that there are some. Using LIBSVM with C=10^6, Ein is always 0; svmpredict is always 100%. The number of support vectors varies between 6 and 12, mode and median are 8. Based on the question I was expecting some to be linearly inseparable. I increased the noise level (from .25 to 5.0). Still solved everyone, although the number of support vectors went up. I tried 1000 points instead of 100. Eout soared, but Ein didn't budge.
My results agree with yours. I used libsvm in Python (scikit-learn).
Rajan
#5
09-17-2012, 07:47 AM
 DavidNJ Member Join Date: Jul 2012 Posts: 28
Re: P14 Question

Quote:
 Originally Posted by MLearning Regardless of the number of experiments run, libsvm returns Ein = 0. I tried 10,000 runs and libsvm sees linearly separable data in every run. On the other hand, an Octave implementation that I wrote returns Ein ≠ 0 most of the time (clearly qp could not handle the 100×100 matrix).
Which would be correct: LIBSVM or your Octave QP implementation?

Note that Eout varies, and gets worse with higher N and with greater noise (presumably the RBF kernel is fitting the noise).
#6
09-17-2012, 08:04 AM
 MLearning Senior Member Join Date: Jul 2012 Posts: 56
Re: P14 Question

Quote:
 Originally Posted by DavidNJ Which would be correct: LIBSVM or your Octave QP implementation? Note, Eout varies, gets worse with a higher N, and with greater noise (presumably the RBF kernel is fitting the noise).
In my case, when I plot the training data, I can verify that the data is linearly separable, but my Octave implementation fails to see it. Personally, I would rely on libsvm, given that it is designed to handle large matrices and uses efficient optimization techniques.
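One way to break the tie between the two solvers without trusting either QP: a kernel perceptron. If the data is separable in the RBF feature space, the kernel perceptron provably converges to zero training errors; if it never converges, the data is not separable there. A minimal pure-Python sketch (gamma = 1.5 is my assumption; use whatever value the problem specifies):

```python
import math

def rbf(a, b, gamma=1.5):
    """RBF kernel K(a, b) = exp(-gamma * ||a - b||^2)."""
    d2 = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-gamma * d2)

def kernel_perceptron_separable(X, y, gamma=1.5, max_epochs=1000):
    """Return True if the kernel perceptron reaches zero training errors,
    i.e. the data is separable in the RBF feature space (within max_epochs)."""
    n = len(X)
    alpha = [0] * n  # mistake counts per point
    K = [[rbf(X[i], X[j], gamma) for j in range(n)] for i in range(n)]
    for _ in range(max_epochs):
        mistakes = 0
        for i in range(n):
            s = sum(alpha[j] * y[j] * K[j][i] for j in range(n))
            if y[i] * s <= 0:  # misclassified (or on the boundary)
                alpha[i] += 1
                mistakes += 1
        if mistakes == 0:
            return True   # Ein = 0 in the RBF feature space
    return False          # no separation found within the budget

# Example: XOR is not linearly separable in input space,
# but it is separable with the RBF kernel.
X = [(0, 0), (0, 1), (1, 0), (1, 1)]
y = [-1, 1, 1, -1]
print(kernel_perceptron_separable(X, y))  # True
```

Agreement between this check and libsvm would suggest the Octave qp call, not the data, is the problem.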
#7
09-17-2012, 08:46 AM
 yaser Caltech Join Date: Aug 2009 Location: Pasadena, California, USA Posts: 1,477
Re: P14 Question

Quote:
 Originally Posted by DavidNJ The problem wording is to find the runs that are not linearly separable with a hard margin SVM. The expectation is that there are some.
There is not necessarily an expectation one way or the other. You should report whatever the data gives you.
__________________
Where everyone thinks alike, no one thinks very much
#8
09-17-2012, 02:44 PM
 DavidNJ Member Join Date: Jul 2012 Posts: 28
Re: P14 Question

Quote:
 Originally Posted by yaser There is not necessarily an expectation one way or the other. You should report whatever the data gives you.
Professor Yaser: A trick question? Ouch!

Quote:
 Originally Posted by MLearning In my case, when I plot the training data, I can verify that the data is linearly separable.
The noise definition (0.25*sin()) limits the noise to fairly small values. And the fit uses an arbitrary number of support vectors (in my test, up to 12 with 0.25 and up to 17 with 5.0), which greatly stretches what counts as "linear". For comparison, question 12 could be answered by visual inspection of the plot.
#9
09-17-2012, 03:31 PM
 yaser Caltech Join Date: Aug 2009 Location: Pasadena, California, USA Posts: 1,477
Re: P14 Question

Quote:
 Originally Posted by DavidNJ Professor Yaser: A trick question? Ouch!
We have to keep everyone on their toes.
__________________
Where everyone thinks alike, no one thinks very much

