You can run PLA on data that is not linearly separable, but it will never converge, because there is always at least one misclassified point to trigger another update. A better alternative is the pocket algorithm, which you will see in Chapter 3. Chapter 3 also describes several other algorithms that can be used for nonseparable data, including linear regression and linear programming. These algorithms stick with the linear model and tolerate errors. The hard-margin SVM also needs the data to be separable, but you could use the soft-margin SVM instead.
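To make the pocket idea concrete, here is a minimal sketch in plain Python. It is not the book's pseudocode: the function names (`pocket`, `count_errors`), the `max_updates` cap, the random choice of a misclassified point, and the XOR-style toy data are all my own assumptions for illustration. The only change from PLA is that the best weights seen so far are kept "in the pocket" and returned when the update budget runs out.

```python
import random

def predict(w, x):
    # x is assumed to include the constant coordinate x0 = 1
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) > 0 else -1

def count_errors(w, X, y):
    # Number of training points the weights w misclassify
    return sum(1 for xi, yi in zip(X, y) if predict(w, xi) != yi)

def pocket(X, y, max_updates=1000, seed=0):
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    best_w, best_err = list(w), count_errors(w, X, y)
    for _ in range(max_updates):
        wrong = [i for i in range(len(X)) if predict(w, X[i]) != y[i]]
        if not wrong:
            break  # only reachable if the data happens to be separable
        i = rng.choice(wrong)
        # Ordinary PLA update on one misclassified point
        w = [wj + y[i] * xj for wj, xj in zip(w, X[i])]
        err = count_errors(w, X, y)
        if err < best_err:
            # Keep the best weights seen so far "in the pocket"
            best_w, best_err = list(w), err
    return best_w, best_err

# Toy data (assumed): XOR labels, which no line can separate
X = [(1.0, 0.0, 0.0), (1.0, 0.0, 1.0), (1.0, 1.0, 0.0), (1.0, 1.0, 1.0)]
y = [-1, 1, 1, -1]
best_w, best_err = pocket(X, y)
```

Plain PLA would keep cycling on this data forever. The pocket version stops after `max_updates` and returns weights whose in-sample error is never worse than that of the starting weights.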
The more complex algorithms you mention, such as neural networks, can perform nonlinear separation. The same thing can also be accomplished with the linear model combined with a nonlinear transform (see Chapter 3).
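As a small illustration of the nonlinear-transform route, the sketch below maps each point to a new feature space and then runs ordinary PLA there. The feature map `transform` (which adds the product x1*x2), the `max_updates` cap, and the XOR-style toy data are assumptions of mine, chosen because this data is not linearly separable in the original coordinates but is separable after the transform.

```python
def transform(x1, x2):
    # Hypothetical feature map: constant, both inputs, and their product
    return (1.0, x1, x2, x1 * x2)

def predict(w, z):
    return 1 if sum(wi * zi for wi, zi in zip(w, z)) > 0 else -1

def pla(Z, y, max_updates=1000):
    # Plain PLA, run on the transformed points Z
    w = [0.0] * len(Z[0])
    for _ in range(max_updates):
        wrong = [i for i in range(len(Z)) if predict(w, Z[i]) != y[i]]
        if not wrong:
            return w  # every point is classified correctly
        i = wrong[0]
        w = [wj + y[i] * zj for wj, zj in zip(w, Z[i])]
    return w

# Toy data (assumed): XOR labels in the original two coordinates
points = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = [-1, 1, 1, -1]
Z = [transform(a, b) for a, b in points]
w = pla(Z, labels)
train_errors = sum(predict(w, z) != t for z, t in zip(Z, labels))
```

The algorithm is still the linear PLA. Only the inputs changed, and the boundary it finds is linear in the transformed space but nonlinear in the original one.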
Quote:
Originally Posted by netweavercn
On page 7, it seems PLA has a prerequisite: the data must be linearly separable. In many cases you have thousands of data points, and it is almost impossible for them to be linearly separable, either because of noise or because the target itself is nonlinear. Does PLA work in this case? If not, any suggestions, e.g. a neural network or SVM?
