Quote:
Originally Posted by magdon
Any one of these three can happen:
1) the linear regression weights are optimal
2) the linear regression weights are not optimal and the PLA/Pocket algorithm can improve the weights.
3) the linear regression weights are not optimal and the PLA/Pocket algorithm cannot improve the weights.
In practice, we will not know which case we are in, because actually finding the optimal weights is an NP-hard combinatorial optimization problem.
However, no matter which case we are in, other than some extra CPU cycles, there is no harm done in running the pocket algorithm on the regression weights to see if they can be improved.
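The idea above can be sketched in code: fit linear regression once to get initial weights, then hand them to the pocket algorithm, which keeps the best weights seen so far and therefore can never do worse than the regression solution. This is a minimal illustration on made-up toy data, not the course's reference implementation.

```python
import numpy as np

def linear_regression_weights(X, y):
    # One-shot least-squares fit via the pseudo-inverse: w = X^+ y
    return np.linalg.pinv(X) @ y

def pocket(X, y, w_init, max_iters=1000):
    # Pocket algorithm: run PLA updates, but remember ("pocket") the
    # best weights encountered, so the result is never worse than w_init.
    w = w_init.copy()
    best_w = w.copy()
    best_err = np.mean(np.sign(X @ w) != y)
    for _ in range(max_iters):
        misclassified = np.where(np.sign(X @ w) != y)[0]
        if len(misclassified) == 0:
            return w, 0.0  # linearly separated: done
        i = misclassified[0]
        w = w + y[i] * X[i]          # standard PLA update
        err = np.mean(np.sign(X @ w) != y)
        if err < best_err:
            best_err, best_w = err, w.copy()
    return best_w, best_err

# Toy data (hypothetical): bias column plus 2 features, labels in {-1, +1}
rng = np.random.default_rng(0)
X = np.c_[np.ones(100), rng.normal(size=(100, 2))]
y = np.sign(X[:, 1] + 0.5 * X[:, 2] + 0.1 * rng.normal(size=100))

w_reg = linear_regression_weights(X, y)       # case 1/2/3: we can't tell which
w_pocket, err = pocket(X, y, w_reg)           # but this can only help
```

Because the pocket step only replaces the weights when the in-sample error strictly decreases, `err` is guaranteed to be at most the error of the plain regression weights, matching the "no harm done" point above.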

Hi Professor, how do I plot the separator together with the training data if I use logistic regression for classification, trained with gradient descent? Logistic regression gives a probability for every point in the figure, but how do I turn those probabilities into a separator I can plot?
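(For what it's worth, here is a minimal sketch of one common approach, not necessarily the intended answer: the natural separator is the set of points where the predicted probability equals 0.5, i.e. where w·x = 0, which in 2D is a straight line you can solve for and plot. The weight values below are made-up numbers standing in for a fitted model.)

```python
import numpy as np

def sigmoid(z):
    # Logistic function: maps w.x to a probability in (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical fitted weights [w0 (bias), w1, w2] from gradient descent
w = np.array([-0.5, 2.0, -1.5])

# The decision boundary is where P(y=1|x) = 0.5, i.e. w0 + w1*x1 + w2*x2 = 0.
# Solving for x2 as a function of x1 gives the line to draw (requires w2 != 0):
x1 = np.linspace(-3, 3, 50)
x2 = -(w[0] + w[1] * x1) / w[2]

# Sanity check: every point on this line gets probability exactly 0.5
probs = sigmoid(w[0] + w[1] * x1 + w[2] * x2)
```

With matplotlib you would then overlay `plt.plot(x1, x2)` on a scatter plot of the training points; in higher dimensions, the same condition w·x = 0 defines the separating hyperplane.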