Sorry to annoy you again. Let me summarize my understanding point by point:

(1) In one sense, hard-margin SVM is no different from a simpler algorithm
like PLA on linearly separable data (the resulting hyperplane may differ,
but they are the same in the sense that both achieve Ein = 0, ...).
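To make (1) concrete, here is a minimal PLA sketch on hypothetical synthetic data (the data, seed, and margin filter are my own assumptions, not from the course): on linearly separable data PLA is guaranteed to stop only when every point is classified correctly, i.e., Ein = 0, just like hard-margin SVM, even though the final hyperplane generally differs.

```python
import numpy as np

# Hypothetical linearly separable data: label with a hidden hyperplane and
# keep only points with a clear margin so PLA converges quickly.
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(200, 2))
Xb = np.hstack([np.ones((len(X), 1)), X])    # add bias coordinate x0 = 1
w_true = np.array([0.3, 1.0, -0.5])          # hidden target hyperplane
s = Xb @ w_true
Xb, y = Xb[np.abs(s) > 0.2], np.sign(s[np.abs(s) > 0.2])

# Classic PLA: repeatedly fix one misclassified point until none remain.
w = np.zeros(3)
while True:
    wrong = np.flatnonzero(np.sign(Xb @ w) != y)
    if wrong.size == 0:                      # no mistakes left: Ein = 0
        break
    i = wrong[0]
    w = w + y[i] * Xb[i]                     # PLA update on one mistake

ein = float(np.mean(np.sign(Xb @ w) != y))
print(ein)  # 0.0
```

The loop can only exit with zero training mistakes, which is exactly the "Ein = 0 on separable data" property shared with hard-margin SVM.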

(2) Point (1) still applies to nonlinearly transformed data.

(3) In ML, trying to find a separating plane (or line) is somewhat similar
to finding the coefficients of a polynomial (in case a polynomial is
used as the hypothesis set).

(3a) Although the coefficients of the polynomial are not found in explicit
form, one can either view it as the data being transformed into a
different space (higher or lower dimensional, though a lower dimension
is normally not useful) and separated linearly; alternatively, the
separator can be mapped back to the original space and interpreted as
a higher-order polynomial.
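A tiny 1-D sketch of (3a), with hypothetical data I made up for illustration: labels that depend on x^2 cannot be split by any threshold on x itself, but after the explicit transform z = x^2 a single line (threshold) in z-space separates them, and mapping that line back gives a second-order polynomial boundary in the original space.

```python
import numpy as np

# Hypothetical 1-D data: +1 outside [-0.5, 0.5], -1 inside.
x = np.linspace(-1, 1, 21)
y = np.where(x**2 > 0.25, 1.0, -1.0)

# No single threshold on x separates these labels (they are not monotone in x).
separable_in_x = any(
    np.all(np.sign(x - t) == y) or np.all(np.sign(t - x) == y)
    for t in np.linspace(-1.5, 1.5, 301)
)
print(separable_in_x)                        # False

# The explicit transform z = x^2 makes them separable by the line z = 0.3,
# which maps back to the polynomial boundary x^2 - 0.3 = 0.
z = x**2
print(bool(np.all(np.sign(z - 0.3) == y)))   # True
```

The "linear" decision in z-space and the "quadratic" decision in x-space are the same classifier, viewed in two spaces.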

(4) This holds true for hard-margin SVM, and for data explicitly transformed
nonlinearly.

(5) From what I have done in Q14, hard-margin SVM with an RBF kernel on 100
data points can always separate the data linearly (Ein = 0), and that
matches my understanding.
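The Ein = 0 result in (5) is no accident: the Gaussian/RBF kernel Gram matrix on N distinct points is full rank, so a kernel model can interpolate any +/-1 labeling exactly. A small numpy sketch (my own hypothetical data and gamma, solving the plain interpolation system rather than the SVM dual, just to show why RBF can always fit):

```python
import numpy as np

# Hypothetical data: 100 distinct points with arbitrary +/-1 labels.
rng = np.random.default_rng(1)
N, gamma = 100, 1.0
X = rng.standard_normal((N, 2))
y = rng.choice([-1.0, 1.0], size=N)

# RBF Gram matrix K_ij = exp(-gamma * ||x_i - x_j||^2); full rank for
# distinct points, hence invertible.
sq = np.sum((X[:, None, :] - X[None, :, :])**2, axis=-1)
K = np.exp(-gamma * sq)

c = np.linalg.solve(K, y)                    # exact interpolation weights
ein = float(np.mean(np.sign(K @ c) != y))    # in-sample error
print(ein)  # 0.0
```

This is why hard-margin SVM + RBF kernel found a separator on all 100 points in Q14: in the (infinite-dimensional) RBF feature space, any finite set of distinct points is linearly separable.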

Then, my question is: is the regular form of RBF not normally used for
supervised learning?

We learned a lot from the final exam paper about the regular RBF form.
Since its performance is normally not as good as SVM's, and we have no
clue about the best K (the number of centers), does that mean that, in
supervised learning, we normally will not consider using the regular
RBF form?
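For reference, here is what I mean by the regular RBF form, as a minimal sketch on hypothetical data (the center selection, K, gamma, and labels are all my own assumptions): keep only K prototype centers instead of one center per training point, then fit linear weights on the K RBF features. Unlike the kernel case above, Ein is generally not 0, and the result depends on the choice of K.

```python
import numpy as np

# Hypothetical data: noisy XOR-like labels that no small-K RBF model fits exactly.
rng = np.random.default_rng(2)
N, K, gamma = 100, 5, 1.0
X = rng.standard_normal((N, 2))
y = np.sign(X[:, 0] * X[:, 1] + 0.1 * rng.standard_normal(N))

# Regular RBF form: K prototype centers (crudely chosen here as random
# training points; k-means is the usual choice), then linear regression
# on the K-dimensional RBF feature vector.
mu = X[rng.choice(N, size=K, replace=False)]
Phi = np.exp(-gamma * np.sum((X[:, None, :] - mu[None, :, :])**2, axis=-1))
w, *_ = np.linalg.lstsq(Phi, y, rcond=None)

ein = float(np.mean(np.sign(Phi @ w) != y))
print(ein)   # typically > 0 with only K = 5 centers
```

The sensitivity to K (and to where the centers land) is exactly the practical drawback I am asking about.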