LFD Book Forum > Course Discussions > Online LFD course > Homework 8
  #1
09-03-2012, 03:40 AM
itooam
Senior Member
Join Date: Jul 2012
Posts: 100

Should SVMs ALWAYS converge to the same solution given the same data?

I've been running a few tests on Q7, and I find that if I randomize the order of the data in both the training and test sets, I get different solutions/errors. I am now wondering whether I should be averaging over, say, a hundred runs, and whether I need to go back to previous questions and do the same where necessary. Argh (if so)! Please can somebody confirm? Maybe my "shuffling" code is incorrect, but it looks correct to me:

Code:
% Read the training data (Octave)
trainingData = ReadCaltechFile('features.train');

% Randomize the row order with a random permutation
% (equivalent to the sort-of-random-numbers trick)
ix = randperm(rows(trainingData));
trainingData = trainingData(ix, :);

y = double(trainingData(:, 1));
X = double(trainingData(:, 2:end));
...and similar for the test data.
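(Aside for anyone working in Python rather than Octave: the same shuffle can be written with numpy's `permutation`, and it is easy to check that shuffling only reorders the rows without losing or altering any example. The data below is a synthetic stand-in, not the homework file.)

```python
import numpy as np

# Toy stand-in for the homework data: label in column 0, features after.
rng = np.random.default_rng(0)
trainingData = rng.normal(size=(10, 3))

# Shuffle the row order with a random permutation
# (same effect as the sort-of-random-numbers trick above).
ix = rng.permutation(len(trainingData))
shuffled = trainingData[ix, :]

y = shuffled[:, 0]
X = shuffled[:, 1:]

# Sanity check: shuffling reorders rows but loses nothing.
same_rows = {tuple(r) for r in trainingData} == {tuple(r) for r in shuffled}
print(same_rows)  # True
```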
  #2
09-03-2012, 04:53 AM
Andrs
Member
Join Date: Jul 2012
Posts: 47

Re: Should SVMs ALWAYS converge to the same solution given the same data?

Hi

I ran into the same problem when I used 10-fold cross validation with shuffling. See the following thread: http://book.caltech.edu/bookforum/showthread.php?t=1282

My guess is that with reasonably well separated data ("good data" relative to our model), we should get the same result (or nearly the same result) regardless of the order in which the same data is processed. But with "difficult" data, the classification margins may be very narrow, and processing the data in different orders may give different results. I think that is the case with "one against all" in Q5, where I do not get repeatable results. I am skipping the shuffle.
What is the purpose of randomizing the data in this case?
  #3
09-03-2012, 05:55 AM
itooam

Re: Should SVMs ALWAYS converge to the same solution given the same data?

Quote:
Originally Posted by Andrs
...But if we have a "difficult data", the classification margins may be very narrow and if we handle the data in different sequences we may get different results.
That also sounds reasonable to me as an explanation.

Keith, can you confirm that you shuffled your data first? I can understand why the output could differ considerably under cross validation, because of which data ends up in each fold. But for a straight learn-and-predict, I hadn't anticipated a "considerable" difference in Ein and Eout when the data is the same, just in a different order.
  #4
09-03-2012, 06:07 AM
Keith
Member
Join Date: Jul 2012
Posts: 16

Re: Should SVMs ALWAYS converge to the same solution given the same data?

Yes, I shuffled, ran svmtrain and svmpredict, then shuffled again and repeated many times. I saw NO change in Ein or Eout.
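(A quick way to test order-invariance directly is to fit the same data in several shuffled orders and compare the solutions. The Python/numpy sketch below uses an ordinary least-squares fit as a stand-in for the SVM solver; like the SVM, it is a convex problem with a unique optimum here, so the row order cannot matter.)

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(50, 3))
y = X @ np.array([2.0, -1.0, 0.5]) + rng.normal(scale=0.1, size=50)

def fit(X, y):
    # Least squares has a unique solution here, so row order cannot matter.
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return w

w_ref = fit(X, y)
agree = all(
    np.allclose(w_ref, fit(X[ix], y[ix]))
    for ix in (rng.permutation(len(y)) for _ in range(5))
)
print(agree)  # True
```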
  #5
09-03-2012, 06:48 AM
itooam

Re: Should SVMs ALWAYS converge to the same solution given the same data?

I have looked in more detail, and the discrepancies occur at the larger values of C, i.e. when C >= 100. This makes some sense: from what I remember from the lectures, the higher the value of C, the closer you get to a "hard" margin, which makes the classifier more sensitive to noise in the data.

Keith, that is unusual (maybe you were using low values of C in your tests)?

I did a quick Google search; this explains it well:

http://stackoverflow.com/questions/4...r-soft-margins

I will put it down to "sensitivity to noise" caused by a high C (harder margins), unless somebody corrects me.
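(To make the effect of C concrete: the soft-margin primal objective is (1/2)||w||^2 + C * sum of margin violations, so C directly scales how heavily violations are punished. The toy numbers below just evaluate that objective at a fixed, hypothetical separator; they are not a trained model.)

```python
import numpy as np

# Toy 1-D data: one noisy point sits on the wrong side of the margin.
X = np.array([-2.0, -1.5, 1.5, 2.0, 0.1])
y = np.array([-1, -1, 1, 1, -1])      # the last label is the "noise"

w, b = 1.0, 0.0                        # a fixed, hypothetical separator
slack = np.maximum(0.0, 1 - y * (w * X + b))   # margin violations

for C in (0.01, 1.0, 100.0):
    print(f"C={C:>6}: margin term {0.5 * w**2:.3f}, "
          f"violation penalty {C * slack.sum():.3f}")

# At C=100 the single violation (1.1) costs 110 versus a margin term of
# 0.5, so the optimizer is pushed toward a hard margin around that point.
```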

For clarity, here are my results over 3 runs of Q7 with the polynomial degree Q set to 5. Each run uses a reshuffled training and test set:

Run 1:
C        NoOfSVs  Eout      Ein
0.01        23    0.021226  0.0038437
1           21    0.021226  0.0032031
100         13    0.021226  0.0038437
10000       11    0.030660  0.017937
1000000      9    0.063679  0.051249

Run 2:
C        NoOfSVs  Eout      Ein
0.01        23    0.021226  0.0038437
1           21    0.021226  0.0032031
100         13    0.023585  0.0044843
10000        9    0.33491   0.33312
1000000      9    0.33726   0.34465

Run 3:
C        NoOfSVs  Eout      Ein
0.01        23    0.021226  0.0038437
1           21    0.021226  0.0032031
100         13    0.023585  0.0032031
10000        9    0.35377   0.34145
1000000     11    0.35142   0.35106
  #6
09-03-2012, 07:20 AM
Keith

Re: Should SVMs ALWAYS converge to the same solution given the same data?

This seems to occur when the optimizer fails to converge: the maximum number of iterations is reached and Ein is large.
  #7
09-03-2012, 05:13 AM
itooam

Re: Should SVMs ALWAYS converge to the same solution given the same data?

In answer to my original question, I found this:

http://compbio.soe.ucsc.edu/genex/ge...tml/node3.html

which says:

In addition to counteracting overfitting, the SVM's use of the maximum margin hyperplane leads to a straightforward learning algorithm that can be reduced to a convex optimization problem. In order to train the system, the SVM must find the unique minimum of a convex function. Unlike the backpropagation learning algorithm for artificial neural networks, a given SVM will always deterministically converge to the same solution for a given data set, regardless of the initial conditions. For training sets containing less than approximately 5000 points, gradient descent provides an efficient solution to this optimization problem [Campbell and Cristianini, 1999].
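(The determinism claim above is easy to sanity-check on a toy convex problem: gradient descent on ||Aw - b||^2 from two very different starting points lands on the same unique minimum. This is a Python/numpy sketch of that idea, not anything specific to libsvm.)

```python
import numpy as np

# f(w) = ||A w - b||^2 is convex with a unique minimizer for this A.
rng = np.random.default_rng(2)
A = rng.normal(size=(20, 4))
b = rng.normal(size=20)

# Step size chosen safely below 1/L, where L is the gradient's
# Lipschitz constant 2 * ||A^T A||.
lr = 0.45 / np.linalg.norm(A.T @ A, 2)

def minimize(w0, steps=30000):
    w = w0.copy()
    for _ in range(steps):
        w -= lr * 2 * A.T @ (A @ w - b)
    return w

# Two very different initial points end up at the same minimum.
w_a = minimize(np.zeros(4))
w_b = minimize(10 * np.ones(4))
print(np.allclose(w_a, w_b, atol=1e-6))  # True
```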

So the random results I am seeing seem to contradict the above. The same page also mentions the following, which may explain it:

The selection of an appropriate kernel function is important, since the kernel function defines the feature space in which the training set examples will be classified. As long as the kernel function is legitimate, an SVM will operate correctly even if the designer does not know exactly what features of the training data are being used in the kernel-induced feature space. The definition of a legitimate kernel function is given by Mercer's theorem [Vapnik, 1998]: the function must be continuous and positive definite. Human experts often find it easier to specify a kernel function than to specify explicitly the training set features that should be used by the classifier. The kernel expresses prior knowledge about the phenomenon being modeled, encoded as a similarity measure between two vectors.

Maybe, then, the polynomial kernel supplied for the homework doesn't represent the feature space well enough, and that is the reason for the variation in my results?

One last point: I tried using the RBF kernel in Q7 instead, and though there were still discrepancies from run to run, they were much, much smaller!

But before I get carried away, could somebody please confirm that they see different results when they shuffle their data sets prior to learning?
  #8
09-03-2012, 05:19 AM
itooam

Re: Should SVMs ALWAYS converge to the same solution given the same data?

Thanks Andrs (sorry, I started composing the above before seeing your post). I can only think it is the choice of kernel, as I saw this problem in Q7, where the classifier was 1 vs 5. It would be great to hear from one of the professors on this (please)?
  #9
09-03-2012, 05:44 AM
Keith

Re: Should SVMs ALWAYS converge to the same solution given the same data?

I see different cross validation errors, but the models always give the same Ein and Eout. There is variation in the svmtrain output (different iteration counts, values of obj, rho, and even nBSV), but the models always seem to behave the same with svmpredict.



Powered by vBulletin® Version 3.8.3
Copyright ©2000 - 2020, Jelsoft Enterprises Ltd.
The contents of this forum are to be used ONLY by readers of the Learning From Data book by Yaser S. Abu-Mostafa, Malik Magdon-Ismail, and Hsuan-Tien Lin, and participants in the Learning From Data MOOC by Yaser S. Abu-Mostafa. No part of these contents is to be communicated or made accessible to ANY other person or entity.