Intuition behind the PLA update step
According to the book, the update rule for PLA is w(t+1) = w(t) + y(t)x(t), and the book mentions "this rule moves the boundary in the direction of classifying x(t) correctly".
I understand that there is a convergence proof for PLA. But it is hard for me to see why such a rule (or step) moves the boundary in the direction of classifying x(t) correctly. The formula just adds the actual outcome (i.e. y(t)) times the misclassified point (i.e. x(t)) to the current weight vector (which is just a vector of coefficients of the hypothesis equation).
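For what it's worth, I tried a quick numerical sketch (made-up numbers, not from the book) and I can verify that the update increases the signed margin y(t) w(t)·x(t) for the misclassified point by exactly ||x(t)||², but I still lack the geometric picture:

```python
import numpy as np

# Made-up example: x is misclassified under w, so y * (w . x) <= 0
w = np.array([0.5, -1.0, 2.0])   # current weight vector (arbitrary values)
x = np.array([1.0, 2.0, 0.5])    # misclassified point (arbitrary values)
y = 1.0                          # true label of x

before = y * w.dot(x)            # signed margin before the update (here -0.5)
w_new = w + y * x                # PLA update: w(t+1) = w(t) + y(t) x(t)
after = y * w_new.dot(x)         # signed margin after the update

# The increase equals ||x||^2, so repeated updates push y * (w . x) positive
print(before, after, after - before)
```

So algebraically the step helps, but why this is "moving the boundary toward classifying x(t) correctly" geometrically is what I'm missing.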
Any pointers will help.
Thanks in advance!
