Quote:
Originally Posted by srport@pacbell.net
Question about the pocket algorithm described on p. 80 and illustrated on p. 83.
For data that is not linearly separable, I originally thought the pocket algorithm would run exactly like the PLA and simply take the best performing w vector over the entire iteration range and report it back. However, the description on p. 80 says that the w vector is not updated if Ein is worse.
Suppose after a certain number of iterations that the set of misclassified points is such that every misclassified point used to perform an update returns a worse performing w vector. In this case, the algorithm will never be able to find a better w vector.
My experience watching the PLA as it iterates is that it seems to jump about, and eventually (randomly) converges (with linearly separable data). I've seen proofs on the internet that it will converge, but do not really understand them. My point is that each iteration of the PLA does not necessarily improve g(x), but that in order to reach the best g(x), this jumping about is part of the process.
Can you please comment?

Hi,
There are two
vectors that the pocket algorithm keeps track of. The "pocket weight vector"
is the one that is not updated if
gets worse. However, the other weight vector
, which is identical to its counterpart in PLA, keeps getting updated with every iteration using a misclassified point. The algorithm then reports
as its final solution.