LFD Book Forum

LFD Book Forum (http://book.caltech.edu/bookforum/index.php)
-   Homework 7 (http://book.caltech.edu/bookforum/forumdisplay.php?f=136)
-   -   Octave QP issue (http://book.caltech.edu/bookforum/showthread.php?t=1133)

patrickjtierney 08-23-2012 08:16 PM

Octave QP issue
 
I came across a clever trick in the previous class's discussions for improving the stability and results of qp() in Octave. Essentially, you add a tiny amount (10^-15) to the diagonal of the H matrix and solve for alpha0 using qp(). Then you rerun qp() with alpha0 as the initial value and your original H to obtain alpha.
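A minimal sketch of the two-stage solve described above, assuming the usual hard-margin dual (minimize 0.5*alpha'*H*alpha - sum(alpha) subject to y'*alpha = 0 and alpha >= 0). The toy data, the variable names and the support-vector threshold are made up for illustration and are not the homework setup.

```octave
% Two-stage warm start for qp() on a toy, linearly separable data set.
X = [1 2; 2 3; 3 3; 2 1; 3 2; 4 1];      % N x 2 inputs (hypothetical)
y = [1; 1; 1; -1; -1; -1];               % labels in {-1, +1}
N = size(X, 1);

H  = (y * y') .* (X * X');               % H(i,j) = y_i * y_j * x_i' * x_j
q  = -ones(N, 1);                        % linear term of the dual objective
lb = zeros(N, 1);                        % lower bound: alpha >= 0

% Stage 1: add a tiny amount to the diagonal and solve for a starting point.
H_reg = H + 1e-15 * eye(N);
[alpha0, obj0, info0] = qp(zeros(N, 1), H_reg, q, y', 0, lb, []);

% Stage 2: rerun on the original H, starting from alpha0.
[alpha, obj, info] = qp(alpha0, H, q, y', 0, lb, []);

w  = X' * (alpha .* y);                  % weight vector recovered from alpha
sv = find(alpha > 1e-6);                 % support vectors (threshold is an assumption)
```

The equality constraint y'*alpha = 0 goes in through the A and b arguments of qp(), and the upper bound is left empty because the hard-margin dual has none.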

However, for N=10, I found that the percentage of runs in which SVM is better than PLA consistently went up by 15-20% as a result. This gives an in-between value that is on the verge of changing my answer to Q8.

I know this must have come up last time, but I wonder whether this possible spread in the results due to implementation issues is accounted for in the answers.

htlin 08-24-2012 05:35 AM

Re: Octave QP issue
 
Quote:

Originally Posted by patrickjtierney (Post 4360)
I came across a clever trick in the previous class's discussions for improving the stability and results of qp() in Octave. Essentially, you add a tiny amount (10^-15) to the diagonal of the H matrix and solve for alpha0 using qp(). Then you rerun qp() with alpha0 as the initial value and your original H to obtain alpha.

Your trick indeed looks very clever. I find it difficult to believe the huge difference between your solutions, though. So you may want to double-check on which one is correct. Hope this helps.

patrickjtierney 08-24-2012 08:22 AM

Re: Octave QP issue
 
I think that Octave's qp() is very sensitive to the initial value for alpha. By improving the invertibility of the input matrix H, one gets a good approximation of alpha, which can then be used as a starting point when using the actual H. So it's not so much magically improving the results as it is correcting a problem in qp() for this kind of usage. Basically, qp() without the fix frequently runs to MaxIter without finding a solution. But I stand by the improvement in results. On different runs, SVM is always 10-20% better versus PLA than it was without the "trick".
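One way to see why H is hard to invert, sketched under the assumption that H is built from 2-D inputs as H(i,j) = y_i * y_j * x_i' * x_j: such a matrix has rank at most 2, so it is singular whenever N > 2. The random data and the target line x2 = x1 below are hypothetical.

```octave
% With 2-D inputs, H = diag(y) * X * X' * diag(y) has rank at most 2.
N = 10;
X = 2 * rand(N, 2) - 1;                  % random points in [-1, 1]^2 (assumed setup)
y = sign(X(:, 2) - X(:, 1));             % labels from a hypothetical target line x2 = x1
y(y == 0) = 1;                           % guard against a point exactly on the line
H = (y * y') .* (X * X');

printf("rank(H) = %d for N = %d\n", rank(H), N);
printf("smallest eigenvalue of H = %g\n", min(eig(H)));
```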

Also, credit for the idea goes to previous student elkka.

zifmia 08-24-2012 12:17 PM

Re: Octave QP issue
 
Thanks for posting this. Went from only occasionally converging to converging always (N=100) or usually (N=10). Got all these answers correct on submission.

For N=10, I was finding that Octave failed to converge even on the modified problem about 10% of the time. For purposes of this assignment I just ignored these cases, but clearly there is more tweaking to be done. (Other posts on Octave didn't seem to mention this, though, so perhaps there is something wrong with my problem formulation.)

Followup: A quick experiment I should have done sooner shows that all of the cases that failed to converge were degenerate cases where all 10 points had the same classification. Apparently with my random line target and random point data this happens about 10% of the time with 10 points, and almost all of these fail to solve within the default 200 iterations. (Of course when they do solve, they have zero support vectors.)
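A rough way to estimate how often this degenerate case occurs, assuming the target line passes through two points drawn uniformly from [-1, 1]^2 and the data points are drawn from the same square. The trial count and variable names are illustrative.

```octave
% Estimate how often all N points fall on the same side of a random line.
N = 10;
trials = 10000;
degenerate = 0;
for t = 1:trials
  p = 2 * rand(2, 2) - 1;                % two random points defining the target line
  X = 2 * rand(N, 2) - 1;                % N random data points
  d = p(2, :) - p(1, :);                 % direction vector of the line
  % Sign of the 2-D cross product tells which side of the line each point is on.
  s = d(1) * (X(:, 2) - p(1, 2)) - d(2) * (X(:, 1) - p(1, 1));
  y = sign(s);
  if abs(sum(y)) == N                    % every point has the same label
    degenerate = degenerate + 1;
  end
end
frac = degenerate / trials;
printf("fraction of degenerate data sets: %.3f\n", frac);
```

The same `abs(sum(y)) == N` check can be used to skip or redraw such data sets before calling qp(). With a single class, the constraints y'*alpha = 0 and alpha >= 0 leave alpha = 0 as the only feasible point.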

dthal 02-20-2013 08:33 PM

Re: Octave QP issue
 
I have to second this trick. It got me from never converging to always converging.

Earlier, my code worked some of the time, and only on N=10. Other times, it returned wrong output. It took me a while to figure out that the problem was that qp() wasn't converging. If you are using Octave, check the info struct that you get back from qp() (its third output) to see whether you are getting code 0 (OK) or something else.
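A small sketch of that check on a toy problem. It relies on the `info` and `solveiter` fields of the struct qp() returns. As far as I recall from the Octave documentation, `info.info` is 0 when a global solution was found and 3 when the maximum number of iterations was reached.

```octave
% Check whether qp() converged before trusting its result (toy problem).
H = [2 0; 0 2];
q = [-2; -5];
[x, obj, info] = qp([0; 0], H, q);       % unconstrained convex problem

if info.info == 0
  printf("converged in %d iterations\n", info.solveiter);
else
  printf("qp did not converge, info.info = %d\n", info.info);
end
```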


The contents of this forum are to be used ONLY by readers of the Learning From Data book by Yaser S. Abu-Mostafa, Malik Magdon-Ismail, and Hsuan-Tien Lin, and participants in the Learning From Data MOOC by Yaser S. Abu-Mostafa. No part of these contents is to be communicated or made accessible to ANY other person or entity.