It is a mistake to talk about converging. It is the vector **w** that converges.

**w** should not be *normalized* after each update, because doing so alters the relative scale of the error adjustments performed with each iteration. I suspect this could produce cases where convergence fails to occur even for a linearly separable training set.
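As a sketch of the point above, here is a minimal perceptron loop (the function name and toy data are my own, for illustration) in which **w** simply accumulates the update from each misclassified example and is never rescaled; on a linearly separable set, the loop terminates once every example is correctly classified:

```python
import numpy as np

def perceptron_train(X, y, epochs=100):
    """Standard perceptron rule: w accumulates updates and is
    never normalized, so each error adjustment keeps its full scale."""
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        errors = 0
        for xi, yi in zip(X, y):
            if yi * np.dot(w, xi) <= 0:  # misclassified (or on the boundary)
                w += yi * xi             # update w; note: no normalization
                errors += 1
        if errors == 0:                  # converged: all examples classified
            break
    return w

# Hypothetical linearly separable data; last component is a bias term.
X = np.array([[ 2.0,  1.0, 1.0],
              [ 1.0,  3.0, 1.0],
              [-1.0, -1.0, 1.0],
              [-2.0, -0.5, 1.0]])
y = np.array([1, 1, -1, -1])

w = perceptron_train(X, y)
assert all(yi * np.dot(w, xi) > 0 for xi, yi in zip(X, y))
```

Normalizing **w** inside the loop would rescale all previously accumulated adjustments relative to the next one, which is the alteration the comment warns about.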