My thoughts on skwong's post:

(1) There are reasons why SVM is a major workhorse of machine learning, while PLA is mainly found early in machine learning courses and books (and the RBF regular form is another method that is not popular. EDIT: thanks, Yaser, for the information that it used to be used more). And it's not mere fashion! In realistic linearly separable situations, SVM gives better **generalisation** than PLA, and it also usually generalises better than the RBF regular form. It's E_out that really matters, not E_in! The advantage over PLA carries over to PLA with non-linear transforms. In addition, for problems with not too many data points, SVM is computationally efficient.
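To make the generalisation claim concrete, here is a minimal sketch of my own (not from skwong's post), using scikit-learn on a synthetic separable dataset; the blob centres, sample size, and large C are illustrative choices, with large C approximating the hard-margin SVM:

```python
# Sketch: PLA vs. (near) hard-margin linear SVM on separable toy data.
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.linear_model import Perceptron
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Two well-separated clusters -> linearly separable data.
X, y = make_blobs(n_samples=400, centers=[[-3, -3], [3, 3]],
                  cluster_std=1.0, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, random_state=0)

pla = Perceptron().fit(X_tr, y_tr)                  # stops at *some* separator
svm = SVC(kernel="linear", C=1e10).fit(X_tr, y_tr)  # large C ~ hard margin

# Both separate the training set; the margin-maximising boundary tends to
# hold up at least as well on the held-out half (that is, better E_out).
print("PLA test accuracy:", pla.score(X_te, y_te))
print("SVM test accuracy:", svm.score(X_te, y_te))
```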

Moreover, the soft-margin SVM is a major tool for classification when the data are scarce or noisy (it's a struggle to get useful results from PLA in these cases: the pocket algorithm is a bit like taking shots in the dark, whereas the soft-margin SVM heads straight to the globally optimal solution of a convex problem).
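Here is a similar sketch (again mine, with an illustrative noise level and C) showing the soft-margin SVM coping with label noise that would leave plain PLA cycling forever; C trades margin width against training violations, and because the optimisation is convex there is a single global optimum to head to:

```python
# Sketch: soft-margin SVM on noisy, non-separable data.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = np.sign(X[:, 0] + X[:, 1])       # true boundary: x1 + x2 = 0
flip = rng.random(200) < 0.1         # 10% label noise -> not separable
y[flip] = -y[flip]

soft_svm = SVC(kernel="linear", C=1.0).fit(X, y)   # modest C tolerates noise
print("training accuracy:", soft_svm.score(X, y))  # < 1.0 is expected here
```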

(2) see (1)

(3) The **w** is simply a natural way of representing a hyperplane. The relationship to polynomials is that polynomial models become linear models when viewed in the transformed space (with a dimension for each power of **x**). This is worth studying.
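For instance, here is a small sketch (my own, with an illustrative target function) of a degree-2 polynomial model in x treated as a linear model in the transformed space z = (1, x, x^2), where the usual **w**-and-hyperplane picture applies unchanged:

```python
# Sketch: a polynomial model is linear in the transformed feature space.
import numpy as np

def poly_transform(x, degree=2):
    """Map scalar inputs x to z = (1, x, x^2, ..., x^degree)."""
    return np.vander(x, N=degree + 1, increasing=True)

x = np.linspace(-1, 1, 50)
y = 1.0 - 3.0 * x + 2.0 * x**2        # a genuinely nonlinear target

Z = poly_transform(x)                  # the linear-model view of the problem
w, *_ = np.linalg.lstsq(Z, y, rcond=None)
print(w)                               # recovers [1, -3, 2] up to rounding
```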

(4) Yes

(5) I think you are right. The RBF regular form tends not to generalise as well as SVM in realistic scenarios, hence people use SVM (spot the recurring theme?).
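For reference, here is a sketch of what I mean by the regular form (one Gaussian bump centred on every training point, with weights chosen to interpolate the training labels exactly); the data and gamma here are illustrative:

```python
# Sketch: RBF regular form -- exact interpolation of the training set.
import numpy as np

def rbf_regular_fit(X, y, gamma=1.0):
    """Solve Phi w = y, where Phi[i, j] = exp(-gamma * ||x_i - x_j||^2)."""
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    return np.linalg.solve(np.exp(-gamma * sq), y)

def rbf_regular_predict(X_train, w, X_new, gamma=1.0):
    sq = ((X_new[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * sq) @ w

rng = np.random.default_rng(0)
X = rng.normal(size=(30, 2))
y = np.sign(X[:, 0])
w = rbf_regular_fit(X, y)
print(rbf_regular_predict(X, w, X[:3]))  # ~ y[:3]: zero in-sample error,
                                         # which is exactly why it overfits
```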