LFD Book Forum  

LFD Book Forum > Course Discussions > Online LFD course > Homework 7

#1  02-21-2013, 06:23 PM
ilya239 (Senior Member, joined Jul 2012, 58 posts)
SVMs and the input distribution

If the input distribution has high density near the target boundary, the sample will likely contain points near the boundary, so large-margin and small-margin classifiers will end up similar. If the input distribution has low density near the boundary, the sample will have few near-boundary points, which gives the advantage to a large-margin classifier; but then the probability of drawing a near-boundary point during out-of-sample use is also low, so E_out for small-margin classifiers is not much affected.

Why does this not limit the advantage of large-margin classifiers in practice?
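
For concreteness, here is a minimal sketch of the kind of experiment I have in mind (my own rough code, not from the course; it assumes numpy and scikit-learn, and uses SVC with a linear kernel and a very large C as a stand-in for the hard-margin SVM). It compares PLA and SVM when the training and test inputs are either uniform in [-1,1]^2 or kept at least a fixed distance away from the target boundary:

Code:
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

def target():
    # random target boundary: a line through two uniform points in [-1,1]^2, as weights (w0, w1, w2)
    p, q = rng.uniform(-1, 1, (2, 2))
    return np.array([p[0]*q[1] - p[1]*q[0], p[1] - q[1], q[0] - p[0]])

def label(w, X):
    return np.sign(w[0] + X @ w[1:])

def sample(w, n, gap=0.0):
    # uniform points in [-1,1]^2, rejecting points within `gap` of the target line;
    # gap > 0 mimics an input distribution with low density near the boundary
    X = np.empty((0, 2))
    while len(X) < n:
        cand = rng.uniform(-1, 1, (2 * n, 2))
        dist = np.abs(w[0] + cand @ w[1:]) / np.linalg.norm(w[1:])
        X = np.vstack([X, cand[dist >= gap]])
    X = X[:n]
    return X, label(w, X)

def pla(X, y):
    # perceptron learning algorithm; terminates with E_in = 0 since the data is separable
    w, Xa = np.zeros(3), np.hstack([np.ones((len(X), 1)), X])
    while True:
        bad = np.where(np.sign(Xa @ w) != y)[0]
        if len(bad) == 0:
            return w
        i = rng.choice(bad)
        w = w + y[i] * Xa[i]

def e_out(w_true, w_hyp, gap, n_test=5000):
    # Monte Carlo estimate of E_out under the same (possibly gapped) input distribution
    X, y = sample(w_true, n_test, gap)
    return np.mean(label(w_hyp, X) != y)

def experiment(n_train=20, gap=0.0, runs=200):
    pla_err, svm_err = [], []
    for _ in range(runs):
        w = target()
        X, y = sample(w, n_train, gap)
        if len(np.unique(y)) < 2:
            continue  # skip degenerate samples with only one class
        w_pla = pla(X, y)
        svm = SVC(kernel="linear", C=1e6).fit(X, y)
        w_svm = np.r_[svm.intercept_, svm.coef_.ravel()]
        pla_err.append(e_out(w, w_pla, gap))
        svm_err.append(e_out(w, w_svm, gap))
    return np.mean(pla_err), np.mean(svm_err)

for gap in (0.0, 0.2):
    p_err, s_err = experiment(gap=gap)
    print(f"gap={gap}: E_out(PLA)={p_err:.4f}  E_out(SVM)={s_err:.4f}")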
#2  02-21-2013, 10:10 PM
yaser (Caltech, joined Aug 2009, Pasadena, California, USA, 1,477 posts)
Re: SVMs and the input distribution

Quote:
Originally Posted by ilya239
If the input distribution has high density near the target boundary, the sample will likely contain points near the boundary, so large-margin and small-margin classifiers will end up similar.
You seem to have an interesting way of looking at the situation here, but I want to clarify the setup first. The sample here is the data set that is going to be used to train the SVM, right? If so, can you explain why the large-margin and small-margin solutions will be similar in the above situation?
__________________
Where everyone thinks alike, no one thinks very much
#3  02-21-2013, 10:22 PM
ilya239 (Senior Member, joined Jul 2012, 58 posts)
Re: SVMs and the input distribution

Quote:
Originally Posted by yaser
You seem to have an interesting way of looking at the situation here, but I want to clarify the setup first. The sample here is the data set that is going to be used to train the SVM, right? If so, can you explain why the large-margin and small-margin solutions will be similar in the above situation?
From Q8-10 with 100 points, it looked like the hypotheses with E_in = 0 were confined to a narrow sliver. If you have two pairs of oppositely labeled points near the target boundary, they largely determine the allowed solutions, so the solutions can't differ much from each other.
#4  02-21-2013, 11:24 PM
yaser (Caltech, joined Aug 2009, Pasadena, California, USA, 1,477 posts)
Re: SVMs and the input distribution

Quote:
Originally Posted by ilya239
From Q8-10 with 100 points, it looked like the hypotheses with E_in = 0 were confined to a narrow sliver. If you have two pairs of oppositely labeled points near the target boundary, they largely determine the allowed solutions, so the solutions can't differ much from each other.
Indeed. One way to look at this is that margins are basically regularizers. The more training points you have, the less regularization is needed, and the closer the regularized and unregularized solutions are to each other. Is this the main issue in your first post?

This observation does not affect the answers to Problems 8,9 one way or the other, since those problems only ask which of the two methods is better, regardless of whether it is slightly or significantly better.
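
As a quick illustration of the regularization view, here is a small simulation sketch of the Problem 8/9-style comparison at two sample sizes (this is only a rough check, not a solution to the problems; it assumes numpy and scikit-learn, with SVC using a linear kernel and a very large C as an approximation of the hard-margin SVM). The thing to notice is that the gap between PLA and SVM narrows as N grows:

Code:
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(1)

def target():
    # random target line through two uniform points in [-1,1]^2, as weights (w0, w1, w2)
    p, q = rng.uniform(-1, 1, (2, 2))
    return np.array([p[0]*q[1] - p[1]*q[0], p[1] - q[1], q[0] - p[0]])

def label(w, X):
    return np.sign(w[0] + X @ w[1:])

def pla(X, y):
    # perceptron learning algorithm; stops at E_in = 0 on separable data
    w, Xa = np.zeros(3), np.hstack([np.ones((len(X), 1)), X])
    while True:
        bad = np.where(np.sign(Xa @ w) != y)[0]
        if len(bad) == 0:
            return w
        i = rng.choice(bad)
        w = w + y[i] * Xa[i]

def compare(n_train, runs=300, n_test=5000):
    wins, e_pla, e_svm = [], [], []
    for _ in range(runs):
        w = target()
        X = rng.uniform(-1, 1, (n_train, 2))
        y = label(w, X)
        if len(np.unique(y)) < 2:
            continue  # need both classes to train the SVM
        w_pla = pla(X, y)
        svm = SVC(kernel="linear", C=1e6).fit(X, y)
        w_svm = np.r_[svm.intercept_, svm.coef_.ravel()]
        T = rng.uniform(-1, 1, (n_test, 2))
        yT = label(w, T)
        e_pla.append(np.mean(label(w_pla, T) != yT))
        e_svm.append(np.mean(label(w_svm, T) != yT))
        wins.append(e_svm[-1] < e_pla[-1])
    print(f"N={n_train}: E_out(PLA)={np.mean(e_pla):.3f}, "
          f"E_out(SVM)={np.mean(e_svm):.3f}, SVM better in {np.mean(wins):.0%} of runs")

for n in (10, 100):
    compare(n)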
__________________
Where everyone thinks alike, no one thinks very much
#5  02-22-2013, 05:30 PM
ilya239 (Senior Member, joined Jul 2012, 58 posts)
Re: SVMs and the input distribution

Quote:
Originally Posted by yaser
Indeed. One way to look at this is that margins are basically regularizers. The more training points you have, the less regularization is needed, and the closer the regularized and unregularized solutions are to each other. Is this the main issue in your first post?
The issue was more that (1) I was expecting more of an improvement for SVM vs. PLA and was trying to understand why there wasn't one; and (2) in a real problem, points near the true boundary would be rarer than points farther away, both because the space around the boundary is a small fraction of the total space and because (hopefully) the + and - examples come from real-world distributions centered somewhat away from the boundary. So I was trying to understand what happens in such a case.

If few training points fall near the true boundary, this could be because (1) the dataset is too small, or (2) the underlying data distribution has low density near the boundary. If (1), then the SVM has an advantage because it is more likely to track the true boundary than a somewhat arbitrary linear separator like the one PLA produces.
If (2), then the SVM still does better near the boundary, but the density of points there is so small that E_out is not much improved by getting them right.
I guess in practice (1) is more common?
#6  02-27-2013, 12:12 AM
gah44 (Invited Guest, joined Jul 2012, Seattle, WA, 153 posts)
Re: SVMs and the input distribution

In the problem, the points are uniformly randomly distributed. With a smaller number of points, the gap between the two classes is, statistically, larger. Given N points, the target line that created the classification could be anywhere in the gap. The SVM solution should be close to the center of the gap, while my guess is that the PLA solution can end up anywhere in the gap.

Given that, you can see that the SVM solution should be closer to the target more often, though it is not so easy to guess how often.
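
As a rough check of this picture, here is a small sketch (my own assumptions: numpy and scikit-learn, with SVC using a linear kernel and a very large C as a hard-margin stand-in). It measures how centered each learned line is within the gap, as the min/max ratio of its distances to the nearest + and nearest - training points, so a value of 1 means exactly centered:

Code:
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(2)

def target():
    # random target line through two uniform points in [-1,1]^2, as weights (w0, w1, w2)
    p, q = rng.uniform(-1, 1, (2, 2))
    return np.array([p[0]*q[1] - p[1]*q[0], p[1] - q[1], q[0] - p[0]])

def label(w, X):
    return np.sign(w[0] + X @ w[1:])

def pla(X, y):
    # perceptron learning algorithm; returns some separator with E_in = 0
    w, Xa = np.zeros(3), np.hstack([np.ones((len(X), 1)), X])
    while True:
        bad = np.where(np.sign(Xa @ w) != y)[0]
        if len(bad) == 0:
            return w
        i = rng.choice(bad)
        w = w + y[i] * Xa[i]

def centeredness(w, X, y):
    # min/max ratio of the separator's distances to the nearest + and - points;
    # 1 means the line sits exactly in the middle of the gap
    d = np.abs(w[0] + X @ w[1:]) / np.linalg.norm(w[1:])
    m_pos, m_neg = d[y > 0].min(), d[y < 0].min()
    return min(m_pos, m_neg) / max(m_pos, m_neg)

def experiment(n_train=20, runs=200):
    c_pla, c_svm = [], []
    for _ in range(runs):
        w = target()
        X = rng.uniform(-1, 1, (n_train, 2))
        y = label(w, X)
        if len(np.unique(y)) < 2:
            continue  # need both classes
        c_pla.append(centeredness(pla(X, y), X, y))
        svm = SVC(kernel="linear", C=1e6).fit(X, y)
        c_svm.append(centeredness(np.r_[svm.intercept_, svm.coef_.ravel()], X, y))
    print(f"average centeredness: PLA {np.mean(c_pla):.2f}, SVM {np.mean(c_svm):.2f}")

experiment()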