LFD Book Forum  


08-29-2012, 11:58 AM
tzs29970
Invited Guest
Join Date: Apr 2012
Posts: 52
Calculating w from just support vectors--numerically risky?

In a numerically perfect world, \alpha_n would be exactly 0 except at the support vectors, and so w=\sum_{n=1}^N \alpha_n y_n x_n would give the same result as w=\sum_{x_n \text{ is SV}} \alpha_n y_n x_n.

On real computers, of course, we have to deal with the fact that our calculations have limited precision, and so \alpha_n is usually non-zero nearly everywhere.

I found that if I identified the support vectors before calculating w, by looking for \alpha_n>\epsilon for some small \epsilon, and then calculated w from just those support vectors, I did not get a consistent b. If there were 3 support vectors, sometimes I'd get the same b from all 3, but maybe half the time I'd get one b from two of them, and the third would give a b that was significantly off.

If, however, I used all the vectors to calculate w, rather than just the support vectors, then I'd get the same b from all the support vectors.
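To make the two procedures concrete, here is a minimal NumPy sketch of both. It assumes b is recovered from a support vector via b = y_n - w \cdot x_n (which follows from y_n(w \cdot x_n + b) = 1 on the margin). The function names, the toy data, and the hand-written \alpha values are illustrative assumptions, not output from an actual QP solver.

```python
import numpy as np

def weights(X, y, alpha, eps=None):
    """w = sum_n alpha_n y_n x_n.

    If eps is given, only terms with alpha_n > eps are summed
    (the "support vectors only" variant); otherwise all N terms are used.
    """
    keep = np.ones(len(alpha), dtype=bool) if eps is None else alpha > eps
    return (alpha[keep] * y[keep]) @ X[keep]

def bias_per_sv(X, y, alpha, w, eps):
    """Return b_n = y_n - w . x_n for each point with alpha_n > eps."""
    sv = alpha > eps
    return y[sv] - X[sv] @ w

# Hypothetical toy problem with a known exact solution: w = (1, -1), b = -1.
X = np.array([[0.0, 0.0], [2.0, 2.0], [2.0, 0.0], [3.0, 0.0]])
y = np.array([-1.0, -1.0, 1.0, 1.0])
alpha = np.array([0.5, 0.5, 1.0, 0.0])  # assumed exact, not from a solver
eps = 1e-6

w_sv = weights(X, y, alpha, eps)    # support vectors only
w_all = weights(X, y, alpha)        # every alpha_n
b_sv = bias_per_sv(X, y, alpha, w_sv, eps)
b_all = bias_per_sv(X, y, alpha, w_all, eps)
```

With exact \alpha values the two routes agree; to reproduce the effect described above you would swap in the \alpha vector your own QP solver returns.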

My speculation is that just as the \alpha_n values that are supposed to be 0 are off slightly due to floating point precision issues, so too are those that are supposed to be non-zero, and that when you use ALL of the \alpha's to calculate w the errors balance out. When you exclude the ones that were "supposed" to be 0, you increase the error in w. This makes intuitive sense because the QP solver was using all the \alpha's to try to achieve minimization, and so any error should be spread among all of them. If we only have 3 support vectors, and so only use 3 \alpha's, the error will be high because with so few terms we get high variance. By using all the \alpha's, the variance will be lower, and so the error is closer to the mean error, which should be zero.
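One thing that can be checked directly, whatever the underlying cause: the two versions of w differ by exactly the terms that were thresholded away, w_all - w_SV = \sum_{\alpha_n \le \epsilon} \alpha_n y_n x_n, so the norm of the gap is at most \epsilon \sum_{\alpha_n \le \epsilon} \|x_n\|. The sketch below computes that gap; the helper name and the perturbed \alpha vector are made up for illustration.

```python
import numpy as np

def dropped_contribution(X, y, alpha, eps):
    """Sum of alpha_n y_n x_n over the points with alpha_n <= eps.

    This is exactly (w from all alphas) - (w from alphas above eps).
    """
    dropped = alpha <= eps
    return (alpha[dropped] * y[dropped]) @ X[dropped]

# Hypothetical data; the last alpha "should" be 0 but carries a small error.
X = np.array([[0.0, 0.0], [2.0, 2.0], [2.0, 0.0], [3.0, 0.0]])
y = np.array([-1.0, -1.0, 1.0, 1.0])
alpha = np.array([0.5, 0.5, 1.0, 3e-4])
eps = 1e-3

w_all = (alpha * y) @ X
sv = alpha > eps
w_sv = (alpha[sv] * y[sv]) @ X[sv]

gap = dropped_contribution(X, y, alpha, eps)
# Upper bound on the size of the gap implied by the threshold.
bound = eps * np.linalg.norm(X[~sv], axis=1).sum()
```

For a given support vector x_n, switching between the two versions of w changes the computed b by gap \cdot x_n, which gives a rough sense of how much of an observed discrepancy the thresholding alone can explain.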

For those who had errors on problems 8-10: if you used just the support vectors and calculated b from a single support vector, it might be worth adding a check to see whether different support vectors give you a different b.
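A check along those lines might look like the following sketch. It computes w from all the \alpha's, gets one b per support vector, and reports the spread between the largest and smallest b. The function name, the default threshold, and the deliberately perturbed \alpha are assumptions for illustration only.

```python
import numpy as np

def bias_spread(X, y, alpha, eps=1e-6):
    """Return (b values per support vector, max(b) - min(b)).

    w is computed from all alphas; support vectors are alpha_n > eps.
    """
    sv = alpha > eps
    if not sv.any():
        raise ValueError("no alpha exceeds eps; no support vectors found")
    w = (alpha * y) @ X
    b = y[sv] - X[sv] @ w
    return b, float(b.max() - b.min())

# Hypothetical toy data with exact solution w = (1, -1), b = -1.
X = np.array([[0.0, 0.0], [2.0, 2.0], [2.0, 0.0], [3.0, 0.0]])
y = np.array([-1.0, -1.0, 1.0, 1.0])

good_alpha = np.array([0.5, 0.5, 1.0, 0.0])
b_good, spread_good = bias_spread(X, y, good_alpha)

# Deliberately wrong alpha to show what an inconsistent b looks like.
bad_alpha = np.array([0.5, 0.5, 1.1, 0.0])
b_bad, spread_bad = bias_spread(X, y, bad_alpha)
```

A spread near machine precision suggests the support vectors agree on b; a spread that is large relative to the margin is a sign that something upstream (the \alpha's, the threshold, or w) deserves a second look.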