LFD Book Forum (http://book.caltech.edu/bookforum/index.php)
-   Homework 7 (http://book.caltech.edu/bookforum/forumdisplay.php?f=136)
-   -   Q9, SVM vs PLA (http://book.caltech.edu/bookforum/showthread.php?t=4301)

 Dorian 05-20-2013 09:28 PM

Q9, SVM vs PLA

In Question 9, I would naively have expected that the more training points one has, the closer SVM and PLA become, and thus the more "balanced" the percentage of runs in which SVM beats PLA.

I am saying this because with more training points you have less margin for the margin (sorry for the play on words). My program also concluded this, but obviously something went wrong with both the program and my expectation :)

Why does the opposite happen, i.e. why does SVM approximate the target function better than PLA with more points?

 marek 05-20-2013 09:48 PM

Re: Q9, SVM vs PLA

Quote:
 Originally Posted by Dorian (Post 10895) In Question 9, I would naively have expected that the more training points one has, the closer SVM and PLA become, and thus the more "balanced" the percentage of runs in which SVM beats PLA. I am saying this because with more training points you have less margin for the margin (sorry for the play on words). My program also concluded this, but obviously something went wrong with both the program and my expectation :) Why does the opposite happen, i.e. why does SVM approximate the target function better than PLA with more points?
From my numbers, the performance of SVM vs PLA didn't change very much at all between N = 10 and N = 100. Granted, I'm not sure my program is functioning properly, since I know I have a few small bugs with the QP.

 yaser 05-20-2013 10:16 PM

Re: Q9, SVM vs PLA

Quote:
 Originally Posted by Dorian (Post 10895) In Question 9, I would naively have expected that the more training points one has, the closer SVM and PLA become, and thus the more "balanced" the percentage of runs in which SVM beats PLA.
Without commenting directly on whether the percentage would go up or down or stay the same, let me just address the quoted point. The fact that there is less room for improvement doesn't necessarily relate to how often SVM would beat PLA, since the percentage reflects being better regardless of how much better it is.
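
To make the distinction concrete, here is a minimal Python sketch. It is not from the original posts: the function name `win_rate_and_mean_gap` and the four pairs of error values are hypothetical, chosen only to show that "how often SVM wins" and "how much better it is on average" are separate quantities.

```python
def win_rate_and_mean_gap(eout_svm, eout_pla):
    """Return (fraction of runs where SVM has strictly lower E_out,
    mean of E_out_PLA - E_out_SVM over all runs)."""
    pairs = list(zip(eout_svm, eout_pla))
    wins = sum(1 for s, p in pairs if s < p)
    mean_gap = sum(p - s for s, p in pairs) / len(pairs)
    return wins / len(pairs), mean_gap


# Hypothetical per-run out-of-sample errors (illustration only, not real results).
svm_runs = [0.010, 0.011, 0.012, 0.050]
pla_runs = [0.011, 0.012, 0.013, 0.020]

rate, gap = win_rate_and_mean_gap(svm_runs, pla_runs)
print(f"SVM wins in {rate:.0%} of runs; mean (E_out_PLA - E_out_SVM) = {gap:+.5f}")
```

In this toy case SVM wins most of the runs by a tiny amount and loses one run by a large amount, so the win percentage favours SVM while the average error does not.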

 Dorian 05-20-2013 10:22 PM

Re: Q9, SVM vs PLA

OK, thinking more about it, maybe this happens because SVM generalizes better, as it has a smaller effective VC dimension (dVC). Still looking for that bug in my code :)

 catherine 05-21-2013 05:51 PM

Re: Q9, SVM vs PLA

Same thing here. I used ipop from the kernlab package in R. I checked Ein and b, and they behave as expected; I'm also getting the expected number of support vectors. I plotted the results for one iteration as well, and they match the figures I'm getting. Still, the performance of my SVM model is only marginally better than the performance of a perceptron-based model, especially for N = 100.

Here are the results I'm getting:
For N = 10: SVM fares better than PLA for 63.9% of the iterations. |EoutSVM| = 0.09050551, whereas |EoutPLA| = 0.1221962
For N = 100: Even though SVM fares better than PLA for 56.9% of the iterations, |EoutSVM| = 0.01877277, whereas |EoutPLA| = 0.01374174

In a way these results (I mean the fact that PLA catches up with SVM as the training set grows) match my expectations - though I'm a bit disappointed about the SVM's lack of flamboyance in this particular case - is this because this is completely random data? They don't match the answer key though, according to which the SVM's overall performance as compared to PLA improves with the number of items in the training set.

Note: Not sure this is relevant - I'm using a test set of 1,000 data points.

 Elroch 05-22-2013 02:56 AM

Re: Q9, SVM vs PLA

You can think of it like this. How big are the sets of misclassified points in the two experiments? How many of your 1000 points are misclassified on average? How accurate an estimate do you think you are getting for each of the misclassified sets?

Actually it's worse than if you only wanted to estimate the misclassification error for one method: given the two sets of misclassified points (one per method), you are only interested in the points that are in one set but not the other.

Note: if you are trying to estimate a fraction of a set (here, the fraction of misclassified points) and you use N sample points, it's not difficult to calculate the standard deviation of such an estimate, which you can use to get a very good handle on how reliable your estimates and conclusions are.
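
As a rough illustration of that calculation, here is a minimal Python sketch (not part of the original post). It assumes the test points are independent draws, so the number of misclassified points is binomial and the estimated fraction has standard deviation sqrt(p(1-p)/N). The helper name `proportion_std` is hypothetical; the error values and the test-set size of 1,000 are the ones reported earlier in this thread for N = 100.

```python
import math


def proportion_std(p, n):
    """Standard deviation of a fraction estimated from n independent
    samples when the true fraction is p (binomial assumption)."""
    return math.sqrt(p * (1.0 - p) / n)


n_test = 1000  # test-set size mentioned earlier in the thread
reported = {"SVM": 0.01877277, "PLA": 0.01374174}  # N = 100 figures from the thread

for name, e_out in reported.items():
    sd = proportion_std(e_out, n_test)
    print(f"{name}: E_out ~ {e_out:.4f}, "
          f"about {e_out * n_test:.0f} misclassified test points, "
          f"std of a single estimate ~ {sd:.4f}")
```

With these inputs the standard deviation of a single estimate is roughly 0.004, which is about the same size as the gap between the two reported error rates. On any one run, a test set of this size can therefore easily misjudge which method did better, and the set of points on which the two methods disagree is smaller still.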

 jlaurentum 05-22-2013 06:23 AM

Re: Q9, SVM vs PLA

@Catherine:

I also attempted to use ipop in the R kernlab package. I was having issues with the upper constraint u bounding the alphas. Depending on the u value I used, I'd get more volatility in the differences between the b values (I mean the bias term b in the weights). As many in other threads have pointed out, you never get any alphas exactly equal to zero, just really low values on the order of <10^(-5). Whether I calculated the weight vector by summing over all alphas or by first wiping out those alphas close to zero, my bias terms were not equal when I solved for them at the different support vectors. What really rang the alarm bells, though, was that the Ein error rate for the proposed solution obtained through the quadratic programming routine was never zero. Furthermore, ipop sometimes returned with errors.

So I opted for using the ksvm function in the same package to obtain the support vector solutions and thereafter using predict to calculate the out-of-sample error rate (with a large test data set). The ksvm function always returned an in-sample error of zero but, although I got question 8 correct using it, I failed to get questions 9 and 10 correct.

Could you indicate how you got the ipop function to work? What parameters did you feed it? Did you use "vanilladot" as the kernel function for the H matrix?

 Elroch 05-22-2013 07:19 AM

Re: Q9, SVM vs PLA

I managed to get ipop to work with vanilladot. Took me a while before I was completely confident in the results though.

Did you plot your solutions? This is what made me confident I had it spot on. [You can do a pretty good job of SVM with 100 points in 2 dimensions using a ruler (straightedge)]
Did you manage to keep the sampling errors low enough as hinted at in my last post?

 jlaurentum 05-22-2013 10:53 AM

Re: Q9, SVM vs PLA

Elroch:

I didn't plot the solutions obtained by ipop because, on seeing that the in-sample error was not zero, that invalidated everything for me. What parameters did you use for ipop?

 Elroch 05-22-2013 12:16 PM

Re: Q9, SVM vs PLA

Quote:
 Originally Posted by jlaurentum (Post 10918) Elroch: I didn't plot the solutions obtained by ipop because, on seeing that the in-sample error was not zero, that invalidated everything for me. What parameters did you use for ipop?
You may have seen a clue from the plot. I recall it helped me.

Essentially, I used the recipe in the R documentation page for ipop, except after a bit of experimentation I changed the value of H to kernelPol(vanilladot(),x,,y) and played about with the cost (I'm still not sure about that - anyone able to clarify?)

I should point out that (possibly due to not being at all familiar with ipop) I wrote a chunk of code to get the hypothesis needed for comparison. Basically it constructed the weight vector from the alphas and the support vectors as described in the lecture, calculated the values of the dot products on all of the support vectors, and then adjusted the first parameter of the weight vector so that the zero line was right in the middle of the support vectors. The main help of visualisation was seeing that the right points were support vectors. I am guessing there is probably a way to do this more directly (by doing the dual?)
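
The post-processing described above can be sketched roughly as follows. This is an illustrative Python version, not the R code actually used in the thread: the function names, the 1e-5 threshold (taken from the order of magnitude mentioned earlier for near-zero alphas) and the three toy points are all assumptions. The alphas are written in by hand, as if returned by a QP solver, with one tiny residual value.

```python
def weights_from_alphas(alphas, xs, ys, tol=1e-5):
    """w = sum over support vectors of alpha_n * y_n * x_n.
    Points with alpha <= tol are treated as non-support vectors."""
    dim = len(xs[0])
    sv_idx = [i for i, a in enumerate(alphas) if a > tol]
    w = [0.0] * dim
    for i in sv_idx:
        for d in range(dim):
            w[d] += alphas[i] * ys[i] * xs[i][d]
    return w, sv_idx


def centered_bias(w, xs, ys, sv_idx):
    """Choose b so the line w.x + b = 0 lies midway between the
    closest positive and negative support vectors."""
    def dot(x):
        return sum(wd * xd for wd, xd in zip(w, x))
    lowest_pos = min(dot(xs[i]) for i in sv_idx if ys[i] > 0)
    highest_neg = max(dot(xs[i]) for i in sv_idx if ys[i] < 0)
    return -(lowest_pos + highest_neg) / 2.0


def predict(w, b, x):
    """Classify x by the sign of w.x + b."""
    return 1 if sum(wd * xd for wd, xd in zip(w, x)) + b > 0 else -1


# Toy data (hypothetical): two support vectors and one far-away point.
xs = [(0.0, 2.0), (0.0, 0.0), (3.0, 3.0)]
ys = [1, -1, 1]
alphas = [0.5, 0.5, 1e-7]  # hand-written stand-in for QP output

w, sv_idx = weights_from_alphas(alphas, xs, ys)
b = centered_bias(w, xs, ys, sv_idx)
e_in = sum(predict(w, b, x) != y for x, y in zip(xs, ys)) / len(xs)
print("w =", w, "b =", b, "support vectors:", sv_idx, "E_in =", e_in)
```

Checking that the in-sample error is zero and that the thresholded support vectors are the points you would pick by eye on a plot is a cheap sanity test before comparing against PLA.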
