Regression then PLA
In the Homework 1 section there is a discussion showing that PLA is
independent of alpha when the weights start at zero: alpha only rescales
the weights, and only their ratio to w0 counts.
It seems, though, that when regression is run before PLA and its result
is used as the initial weights for PLA, alpha becomes important again.
If alpha is too big, the initial weights don't do much, since the first
few updates swamp them; if it is too small, PLA converges slowly.
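Here is a rough sketch for trying this out, not anyone's reference solution. The target line `w_target`, the seed, N = 100, the alpha values, and the rule of always updating on the first misclassified point are all assumptions made up for illustration; `pla` is a hypothetical helper name.

```python
import numpy as np

def pla(X, y, w_init, alpha, max_updates=10000):
    """Perceptron learning; returns (weights, number of updates made)."""
    w = np.array(w_init, dtype=float)  # copy, so the caller's array is untouched
    updates = 0
    while updates < max_updates:
        wrong = np.flatnonzero(np.sign(X @ w) != y)
        if wrong.size == 0:
            break                      # every point is classified correctly
        i = wrong[0]                   # assumption: always pick the first misclassified point
        w += alpha * y[i] * X[i]       # PLA update scaled by alpha
        updates += 1
    return w, updates

rng = np.random.default_rng(0)
N = 100
# Points in [-1, 1]^2 with the bias coordinate x0 = 1 prepended.
X = np.column_stack([np.ones(N), rng.uniform(-1, 1, (N, 2))])
w_target = np.array([0.1, 1.0, -0.5])  # made-up target line
y = np.sign(X @ w_target)

# Linear regression (least squares) weights, used as the PLA starting point.
w_lin = np.linalg.lstsq(X, y, rcond=None)[0]

# Zero start: alpha only rescales w, so the update count does not change.
_, n_zero_a = pla(X, y, np.zeros(3), alpha=1.0)
_, n_zero_b = pla(X, y, np.zeros(3), alpha=0.25)
print("zero start:", n_zero_a, n_zero_b)

# Regression start: the update count can now depend on alpha.
for alpha in (4.0, 1.0, 0.25, 0.0625):
    _, n = pla(X, y, w_lin, alpha)
    print(f"regression start, alpha={alpha}: {n} updates")
```

One way to read this: multiplying both the initial weights and alpha by the same factor leaves the sequence of PLA updates unchanged, so what matters is the size of alpha relative to the size of the regression weights. With zero initial weights there is nothing to compare alpha against, which is why it drops out.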
