03-28-2013, 07:57 AM
 sbgaucho
Probability estimate from soft margin SVMs

Apologies if this has already been covered either in the forums or in a lecture, but I don't recall it in any lecture and couldn't find anything in the forum.

It seems to me that it would be nice, after using SVM, to get a probability estimate that a given x (particularly an out-of-sample x) corresponds to y=1. For noisy data that is not linearly separable, it seems like it would be ideal to combine the probabilistic output of logistic regression with the power of SVM. I googled this and found a couple of presentations/references, but it doesn't seem like there is a clear-cut answer. Am I way off base? If not, what is the simplest/easiest direction to go in terms of learning about and implementing such a thing? Is it easiest just to use something like libsvm or weka?

Thanks
03-29-2013, 11:41 PM
 htlin
Re: Probability estimate from soft margin SVMs

SVM with probabilistic outputs is useful for some applications. The most popular technique was proposed by Platt. It basically runs a variant of logistic regression to post-process the outputs of the SVM. An earlier work of mine improves Platt's proposed algorithm from an optimization perspective:

http://www.csie.ntu.edu.tw/~htlin/pa.../plattprob.pdf

Hope this helps.
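
To make the post-processing idea concrete, here is a minimal sketch in plain Python. It assumes the SVM has already been trained and that its raw decision values f(x) are available for a held-out set; it then fits the two-parameter sigmoid P(y=+1 | f) = 1 / (1 + exp(A*f + B)) by plain gradient descent on the cross-entropy. The function names, the learning rate, the iteration count, and the toy scores are all hypothetical choices for illustration. This is not the optimization procedure from either paper, and it uses plain 0/1 targets for simplicity.

```python
import math


def platt_probability(score, a, b):
    """P(y=+1 | score) = 1 / (1 + exp(a*score + b)), computed without overflow."""
    z = a * score + b
    if z >= 0:
        e = math.exp(-z)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(z))


def fit_platt(scores, labels, lr=0.1, iters=5000):
    """Fit (a, b) by gradient descent on the mean cross-entropy.

    scores: raw SVM decision values f(x) on held-out points (assumed given).
    labels: +1 / -1 class labels for the same points.
    lr, iters: illustrative settings, not tuned values.
    """
    a, b = 0.0, 0.0
    n = len(scores)
    for _ in range(iters):
        grad_a = 0.0
        grad_b = 0.0
        for f, y in zip(scores, labels):
            t = 1.0 if y > 0 else 0.0      # 0/1 target (simplification)
            p = platt_probability(f, a, b)
            grad_a += (t - p) * f          # d(loss)/da for this point
            grad_b += (t - p)              # d(loss)/db for this point
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b


# Toy, made-up decision values: two points fall on the "wrong" side,
# so the data are noisy and the fitted sigmoid stays finite.
scores = [-2.0, -1.5, -1.0, -0.3, 0.2, -0.1, 0.4, 1.0, 1.6, 2.1]
labels = [-1, -1, -1, -1, -1, +1, +1, +1, +1, +1]

a, b = fit_platt(scores, labels)
print("a =", a, "b =", b)
print("P(y=+1 | f=1.0) =", platt_probability(1.0, a, b))
```

With this sign convention a well-behaved fit has a negative a, so that larger decision values map to larger probabilities. Fitting on points the SVM was not trained on is the safer choice, since training-set decision values tend to be overconfident.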
03-30-2013, 12:47 AM
 yaser
Re: Probability estimate from soft margin SVMs

A related question asked by email

 Can you please tell me if the following would be a good idea for post-processing after performing SVM: use the same z-space, but instead of maximizing the margin, use logistic regression (in z-space) and also allow the width of the logistic function to be a free parameter (let the cross-entropy be the objective function and use gradient descent). The solution from SVM could be used as the initial guess. Would this be a good idea (i.e., would it improve the SVM result)?
and the answer from htlin (my colleague Professor Hsuan-Tien Lin):

 Post-processing the outputs of SVM with a logistic regression formulation has been explored for getting probabilistic (soft) outputs from SVMs. The formulation comes with two parameters: the width (scaling) of the SVM output, as you suggest, and an additional "bias" term. You can check http://www.csie.ntu.edu.tw/~htlin/pa.../plattprob.pdf and the earlier work of John Platt for some additional information. Hope this helps.
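
For reference, the two-parameter formulation can be written out as follows. This is a sketch in the usual Platt-style sign convention: f_i denotes the SVM output on point i, A is the scaling, B is the bias, and the 0/1 targets t_i are a simplifying assumption here (the symbols are chosen for illustration and do not come from the thread).

```latex
\begin{align*}
p_i &= P(y_i = +1 \mid f_i) = \frac{1}{1 + \exp(A f_i + B)},
  \qquad t_i = \frac{y_i + 1}{2}, \\
E(A, B) &= -\sum_{i=1}^{N} \bigl[\, t_i \ln p_i + (1 - t_i) \ln (1 - p_i) \,\bigr], \\
\frac{\partial E}{\partial A} &= \sum_{i=1}^{N} (t_i - p_i)\, f_i,
  \qquad
\frac{\partial E}{\partial B} = \sum_{i=1}^{N} (t_i - p_i).
\end{align*}
```

Only A and B are optimized; the SVM itself is left untouched, which is what distinguishes this from rerunning logistic regression over all the weights in z-space.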
