Quote:
Originally Posted by melipone
Might be off-topic but I'm not sure where it would go since there is no SVM chapter in the book.
I came across one-class SVMs where support vectors are found w/o class separation. How could that be? What is the hyperplane?

There are two common one-class SVM formulations for separating outliers from normal examples without any label information. The two are equivalent for certain kernels (e.g., the RBF kernel, where k(x, x) is constant), but they differ in how they express what an "outlier" is.
Perhaps the more intuitive formulation, known as support vector data description (SVDD), bounds the normal examples with the "smallest" hypersphere; examples falling outside the hypersphere are considered outliers. So roughly, we minimize (the size of the hypersphere) + (the penalty for falling outside the ball).
http://dl.acm.org/citation.cfm?id=960109
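In symbols, a rough sketch of that objective (here φ is the feature map, a the center and R the radius of the hypersphere, and C a trade-off constant I'm introducing for illustration):

```latex
\min_{R,\, a,\, \xi} \; R^2 + C \sum_i \xi_i
\quad \text{s.t.} \quad \|\phi(x_i) - a\|^2 \le R^2 + \xi_i,
\;\; \xi_i \ge 0 .
```

Points with ξ_i > 0 are the ones paying the "outside the ball" penalty.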
The formulation can then be kernelized via the Lagrange dual, like the binary SVM discussed in class.
The more popular formulation nowadays considers the "normal" examples to be those "far from the origin" (in feature space), and outliers to be those close to the origin. In a sense, the observed examples are treated as the positive class, and the origin as the representative of the negative class, with a hyperplane separating the two. So roughly, we minimize (1 / the margin to the origin) + (the penalty for being on the wrong side of the hyperplane). The actual ν-parameterized formulation, proposed by Schölkopf et al. and implemented in solvers like LIBSVM, is slightly more sophisticated than that.
http://dl.acm.org/citation.cfm?id=1119749
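Roughly in symbols (a sketch of the ν-parameterized objective; n is the number of examples and φ the feature map):

```latex
\min_{w,\, \xi,\, \rho} \; \frac{1}{2}\|w\|^2
+ \frac{1}{\nu n} \sum_i \xi_i - \rho
\quad \text{s.t.} \quad \langle w, \phi(x_i) \rangle \ge \rho - \xi_i,
\;\; \xi_i \ge 0 .
```

The hyperplane the original poster asked about is ⟨w, φ(x)⟩ = ρ, which separates the data from the origin; its margin to the origin is ρ/‖w‖.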
The formulation can also be kernelized.
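To make this concrete, here is a minimal sketch using scikit-learn's OneClassSVM, which wraps the LIBSVM implementation of this ν-formulation; the data and parameter values (nu, gamma) below are made up for illustration:

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.RandomState(0)

# "Normal" training examples: a 2-D Gaussian cloud (no labels needed).
X_train = rng.randn(200, 2)

# Test set: 10 more normal points followed by 5 obvious outliers far away.
X_test = np.vstack([rng.randn(10, 2),
                    rng.uniform(low=6, high=8, size=(5, 2))])

# nu upper-bounds the fraction of training points treated as outliers.
clf = OneClassSVM(kernel="rbf", nu=0.05, gamma=0.5).fit(X_train)

pred = clf.predict(X_test)  # +1 = inside the support ("normal"), -1 = outlier
print(pred)
```

The RBF kernel is the usual choice here; with it, the far-away test points fall on the origin side of the learned hyperplane and come out as −1.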
Hope this helps.