LFD Book Forum Should SVMs ALWAYS converge to the same solution given the same data?
#1
09-03-2012, 03:40 AM
 itooam Senior Member Join Date: Jul 2012 Posts: 100
Should SVMs ALWAYS converge to the same solution given the same data?

I've been running a few tests on Q7, and I find that if I randomize the order of the data in both the training and test sets, I get different solutions/errors. I am now wondering whether I should be averaging over, say, a hundred runs, and whether I need to go back to previous questions and do the same where necessary. Argh (if so)! Please can somebody confirm? Maybe my "shuffling" code is incorrect, but it looks correct to me:

Code:
```trainingData = ReadCaltechFile('features.train');

% randomize the row order (randperm is equivalent to sorting a random vector)
ix = randperm(rows(trainingData));
trainingData = trainingData(ix, :);

% column 1 is the label; the remaining columns are the features
y = double(trainingData(:, 1));
X = double(trainingData(:, 2:end));```
...and similar for test data.
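For anyone checking their shuffle logic in another language, here is a minimal sketch of the same row-permutation idea in Python (toy data of my own; the point is that each row stays intact, so every label keeps its features):

```python
import random

def shuffle_rows(data, seed=None):
    """Return a row-permuted copy of `data`; each row stays intact,
    so every (label, features) pairing is preserved."""
    rng = random.Random(seed)
    ix = list(range(len(data)))   # like sorting a random vector in the Octave code,
    rng.shuffle(ix)               # but as a direct random permutation of indices
    return [data[i] for i in ix]

# toy data: [label, feature1, feature2]
training = [[1, 0.1, 0.2], [-1, 0.3, 0.4], [1, 0.5, 0.6]]
shuffled = shuffle_rows(training, seed=0)

y = [row[0] for row in shuffled]
X = [row[1:] for row in shuffled]

# the multiset of rows is unchanged; only the order differs
assert sorted(map(tuple, shuffled)) == sorted(map(tuple, training))
```

If that assertion holds for your own shuffle, the data set the SVM sees is identical as a set, and any run-to-run variation must come from the solver rather than from corrupted rows.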
#2
09-03-2012, 04:53 AM
 Andrs Member Join Date: Jul 2012 Posts: 47
Re: Should SVMs ALWAYS converge to the same solution given the same data?

Hi

I ran into the same problem when I used 10-fold cross validation with shuffling. See the following: http://book.caltech.edu/bookforum/showthread.php?t=1282

My guess is that if the data has reasonably well separated features ("good data" in relation to our model), we should get the same result (or more or less the same result) when the same data is processed in different orders. But with "difficult" data, the classification margins may be very narrow, and processing the data in different orders may give different results. I think that is the case with "one against all" in Q5, where I do not get repeatable results. I am skipping the shuffle.
What is the purpose of randomizing the data in this case?
#3
09-03-2012, 05:13 AM
 itooam Senior Member Join Date: Jul 2012 Posts: 100
Re: Should SVMs ALWAYS converge to the same solution given the same data?

In answer to my original question I found this:

http://compbio.soe.ucsc.edu/genex/ge...tml/node3.html

which says:

In addition to counteracting overfitting, the SVM's use of the maximum margin hyperplane leads to a straightforward learning algorithm that can be reduced to a convex optimization problem. In order to train the system, the SVM must find the unique minimum of a convex function. Unlike the backpropagation learning algorithm for artificial neural networks, a given SVM will always deterministically converge to the same solution for a given data set, regardless of the initial conditions. For training sets containing less than approximately 5000 points, gradient descent provides an efficient solution to this optimization problem [Campbell and Cristianini, 1999].
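The "unique minimum of a convex function" claim is easy to sanity-check on a toy convex objective (my own illustration, not the actual SVM objective): gradient descent lands on the same minimizer from any starting point.

```python
def grad_descent(start, lr=0.1, steps=200):
    """Minimize f(w) = (w - 3)^2, a convex function with unique minimum at w = 3."""
    w = start
    for _ in range(steps):
        w -= lr * 2 * (w - 3)   # gradient of (w - 3)^2 is 2(w - 3)
    return w

# different initial conditions, same solution
results = [grad_descent(s) for s in (-100.0, 0.0, 50.0)]
print(results)  # all approximately 3.0
```

So if repeated runs on identical data disagree, the suspect is not the convex objective itself but the numerical solver (tolerances, iteration caps) or the problem setup.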

So the varying results I am seeing seem to contradict the above. Something else mentioned at the above link may explain this:

The selection of an appropriate kernel function is important, since the kernel function defines the feature space in which the training set examples will be classified. As long as the kernel function is legitimate, an SVM will operate correctly even if the designer does not know exactly what features of the training data are being used in the kernel-induced feature space. The definition of a legitimate kernel function is given by Mercer's theorem [Vapnik, 1998]: the function must be continuous and positive definite. Human experts often find it easier to specify a kernel function than to specify explicitly the training set features that should be used by the classifier. The kernel expresses prior knowledge about the phenomenon being modeled, encoded as a similarity measure between two vectors.

Maybe, then, I can propose that the polynomial kernel as supplied for the homework doesn't represent the feature space well enough, and that this is the reason for the variation in my results?

One last point: I tried using RBF in Q7 instead, and though there were still discrepancies from run to run, they were much, much smaller!

But before I get carried away, please could somebody confirm they see different results when they shuffle their data sets prior to learning?
#4
09-03-2012, 05:19 AM
 itooam Senior Member Join Date: Jul 2012 Posts: 100
Re: Should SVMs ALWAYS converge to the same solution given the same data?

Thanks Andrs (sorry, I started composing the above before seeing your post). I can only think it is the choice of kernel, as I saw this problem in Q7, which is the 1-versus-5 classifier. It would be great to hear from one of the professors on this (please)?
#5
09-03-2012, 05:44 AM
 Keith Member Join Date: Jul 2012 Posts: 16
Re: Should SVMs ALWAYS converge to the same solution given the same data?

I see different cross validation errors, but the models always give the same Ein and Eout. There is variation in the svmtrain output (different iteration counts, values of obj and rho, and even nBSV change), but the resulting models always seem to behave the same with svmpredict.
#6
09-03-2012, 05:55 AM
 itooam Senior Member Join Date: Jul 2012 Posts: 100
Re: Should SVMs ALWAYS converge to the same solution given the same data?

Quote:
 Originally Posted by Andrs ...But with "difficult" data, the classification margins may be very narrow, and processing the data in different orders may give different results.
That also sounds reasonable to me as an explanation.

Keith, can you confirm that you shuffled your data first? I can understand why the output could be considerably different when cross validation is applied, because of which data ends up in each fold. However, for a straight train-and-predict I hadn't anticipated a "considerable" difference in Ein and Eout when the data is the same but in a different order.
#7
09-03-2012, 06:07 AM
 Keith Member Join Date: Jul 2012 Posts: 16
Re: Should SVMs ALWAYS converge to the same solution given the same data?

Yes, I shuffled, ran svmtrain and svmpredict, then shuffled again and repeated many times. I saw NO change in Ein or Eout.
#8
09-03-2012, 06:48 AM
 itooam Senior Member Join Date: Jul 2012 Posts: 100
Re: Should SVMs ALWAYS converge to the same solution given the same data?

I have looked in more detail, and the discrepancies occur at the larger values of C, i.e., when C >= 100. This makes sense: from what I remember from the lectures, the higher the value of C, the closer you get to a "hard" margin, which makes the classifier more sensitive to noise in the data.

Keith, that is unusual (maybe you were using low values of C in your tests)?

I did a quick Google search; this explains it well:

http://stackoverflow.com/questions/4...r-soft-margins

I will put it down to "sensitivity to noise" caused by high C (harder margins), unless otherwise corrected.
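A rough way to see the C effect numerically (a made-up 1-D example, not the homework data): compare the soft-margin objective 0.5*w^2 + C*sum(slacks) for a wide-margin separator that tolerates one mislabeled point against a narrow-margin one that classifies it with no slack.

```python
def objective(w, b, C, data):
    """Soft-margin objective 0.5*w^2 + C * sum of hinge slacks
    for a 1-D classifier sign(w*x + b)."""
    slack = sum(max(0.0, 1 - y * (w * x + b)) for x, y in data)
    return 0.5 * w * w + C * slack

# two clean points plus one noisy label at x = 0.1 (hypothetical data)
data = [(-1, -1), (1, +1), (0.1, -1)]

wide   = (1.0, 0.0)       # wide margin; the noisy point costs slack 1.1
narrow = (20/9, -11/9)    # narrow margin; no slack, but much larger ||w||

for C in (0.01, 100.0):
    a = objective(*wide, C, data)
    n = objective(*narrow, C, data)
    winner = "wide" if a < n else "narrow"
    print(f"C={C}: wide={a:.3f} narrow={n:.3f} -> {winner} wins")
```

At small C the cheap-slack wide margin wins; at large C the slack penalty forces the near-hard narrow margin, so a single noisy point can swing the solution, which is consistent with the large jumps in Ein appearing only at C >= 100.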

For clarity, here are my results over 3 runs of Q7 with the polynomial degree Q set to 5. Each run uses a freshly shuffled training and test set:

Run 1:
C          #SVs    Eout       Ein
0.01       23      0.021226   0.0038437
1          21      0.021226   0.0032031
100        13      0.021226   0.0038437
10000      11      0.030660   0.017937
1000000     9      0.063679   0.051249

Run 2:
C          #SVs    Eout       Ein
0.01       23      0.021226   0.0038437
1          21      0.021226   0.0032031
100        13      0.023585   0.0044843
10000       9      0.33491    0.33312
1000000     9      0.33726    0.34465

Run 3:
C          #SVs    Eout       Ein
0.01       23      0.021226   0.0038437
1          21      0.021226   0.0032031
100        13      0.023585   0.0032031
10000       9      0.35377    0.34145
1000000    11      0.35142    0.35106
#9
09-03-2012, 07:20 AM
 Keith Member Join Date: Jul 2012 Posts: 16
Re: Should SVMs ALWAYS converge to the same solution given the same data?

This seems to occur when the solver does not converge: maximum iterations reached, large Ein.
#10
09-03-2012, 07:35 AM
 htlin NTU Join Date: Aug 2009 Location: Taipei, Taiwan Posts: 601
Re: Should SVMs ALWAYS converge to the same solution given the same data?

Mathematically, the primal problem of SVM enjoys a unique solution; the dual, on the other hand, may or may not come with a unique solution. In some degenerate cases, there can be multiple solutions of the dual (that correspond to the same solution of the primal).
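Prof. Lin's point can be checked on a tiny degenerate example (my own construction, not from the homework): four points at (±1, ±1) with label sign(x1). The primal solution is unique (w = (1, 0), b = 0; all four points sit exactly on the margin), yet two different dual vectors alpha satisfy the dual constraints and reconstruct the same w.

```python
# four points, label = sign of the first coordinate; all four are on the margin
X = [(1, 1), (1, -1), (-1, 1), (-1, -1)]
y = [1, 1, -1, -1]

def reconstruct_w(alpha):
    """w = sum_i alpha_i * y_i * x_i  (the KKT stationarity condition)."""
    return tuple(sum(a * yi * xi[d] for a, yi, xi in zip(alpha, y, X))
                 for d in range(2))

alphas = [
    (0.25, 0.25, 0.25, 0.25),   # one valid dual solution
    (0.5,  0.0,  0.5,  0.0),    # a different valid dual solution
]
for alpha in alphas:
    # dual feasibility: alpha_i >= 0 and sum_i alpha_i * y_i = 0
    assert all(a >= 0 for a in alpha)
    assert abs(sum(a * yi for a, yi in zip(alpha, y))) < 1e-12
    print(reconstruct_w(alpha))   # both print (1.0, 0.0): the same primal w
```

So a solver that stops at different dual iterates on different runs can report different alphas and support-vector counts while the classifier itself is unchanged, which matches Keith's observation that obj, rho, and nBSV vary but Ein and Eout do not.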

Hope this helps.
__________________
When one teaches, two learn.
