The proof essentially shows that the (normalized) inner product between the current weights w(t) and the separating weights w* gets larger and larger with each iteration. But the normalized inner product is the cosine of the angle between the two vectors, so it is upper bounded by 1 and cannot be arbitrarily large. Hence PLA will converge.
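
To make the argument concrete, here is a minimal sketch (not the book's code) that runs PLA on a small synthetic data set and tracks the normalized inner product after every update. The data, the separating weights `w_star`, the margin threshold 0.1, and the helper name `run_pla` are all assumptions made up for this illustration.

```python
import numpy as np

def run_pla(X, y, max_iters=10_000):
    """Run PLA from zero weights; return final weights and the weights after each update."""
    w = np.zeros(X.shape[1])
    history = []
    for _ in range(max_iters):
        # Indices of currently misclassified points (sign(0) = 0 counts as wrong).
        mis = np.flatnonzero(np.sign(X @ w) != y)
        if mis.size == 0:
            break  # every point is classified correctly: PLA has converged
        n = mis[0]
        w = w + y[n] * X[n]  # standard PLA update on one misclassified point
        history.append(w.copy())
    return w, history

# Hypothetical separable data: labels come from an assumed w_star (bias first).
rng = np.random.default_rng(0)
w_star = np.array([0.5, 1.0, -1.0])
X = np.hstack([np.ones((50, 1)), rng.uniform(-1, 1, (50, 2))])
scores = X @ w_star
keep = np.abs(scores) > 0.1  # drop points too close to the boundary
X, y = X[keep], np.sign(scores[keep])

w, history = run_pla(X, y)

# Unnormalized and normalized inner products with w_star after each update.
inner = [w_t @ w_star for w_t in history]
cosines = [
    w_t @ w_star / (np.linalg.norm(w_t) * np.linalg.norm(w_star))
    for w_t in history
]
print("updates:", len(history))
print("first / last normalized inner product:", cosines[0], cosines[-1])
```

In this sketch the unnormalized inner product with `w_star` increases at every update, while the normalized one can never exceed 1, which is why the number of updates has to be finite. Note that the labels are generated from `w_star`, so the data is separable by construction.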

Hi, I just want to check with you that the proof in this question assumes the data is separable, since the proof relies on p and p relies on w*.
Thanks in advance.