If I take the output of linear regression as the initial W, I usually have only a few misclassified points (1-5 out of 100). But it does not help PLA, because the first update, which adds Xi to W, destabilizes the "almost good" value of W. I can even invert the sign of the L.R. output and it doesn't affect the number of iterations, even though the number of misclassified points on the first iteration changes from 1-5 to 95-99 (I verified this).

But if I use (L.R. output)*N instead, it does help PLA: it greatly reduces the number of iterations.

Is using the linear regression output as the initial W supposed to **greatly** change the number of iterations? If yes, does that mean I should look for other bugs in my code?

Should the output of L.R. be used as is, or should it be multiplied to make it "stronger"?
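
For reference, here is a minimal sketch of the experiment described above, not my exact code. It assumes 100 points drawn uniformly from [-1, 1]^2, a random target line, the pseudo-inverse solution for L.R., and a PLA that picks a random misclassified point on each step; N is taken to be the number of points. The names `make_data`, `linear_regression`, and `pla` are just illustrative. One thing the sketch makes visible: sign(W·X) does not change when W is multiplied by a positive constant, so W and N*W misclassify exactly the same points, yet every PLA update adds a vector of the same length, so it moves a short W proportionally more than a long one.

```python
import numpy as np

def make_data(n, rng):
    # Assumed setup: points uniform in [-1, 1]^2 with a leading bias coordinate of 1.
    X = np.column_stack([np.ones(n), rng.uniform(-1, 1, size=(n, 2))])
    # Random target line through two random points p and q.
    p, q = rng.uniform(-1, 1, size=(2, 2))
    w_true = np.array([p[0] * q[1] - p[1] * q[0], p[1] - q[1], q[0] - p[0]])
    y = np.sign(X @ w_true)
    y[y == 0] = 1.0
    return X, y

def linear_regression(X, y):
    # Least-squares weights via the pseudo-inverse.
    return np.linalg.pinv(X) @ y

def pla(X, y, w0, rng, max_iter=100_000):
    # Perceptron learning algorithm starting from w0.
    # Returns the final weights and the number of updates performed.
    w = np.asarray(w0, dtype=float).copy()
    for it in range(max_iter):
        mis = np.flatnonzero(np.sign(X @ w) != y)
        if mis.size == 0:
            return w, it
        i = rng.choice(mis)   # random misclassified point
        w += y[i] * X[i]      # update size is the same whatever the length of w
    return w, max_iter

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 100
    X, y = make_data(n, rng)
    w_lr = linear_regression(X, y)
    starts = {
        "zeros": np.zeros(3),
        "L.R.": w_lr,
        "-L.R.": -w_lr,
        "L.R. * N": n * w_lr,
    }
    for name, w0 in starts.items():
        n_mis = int(np.sum(np.sign(X @ w0) != y))
        _, iters = pla(X, y, w0, rng)
        print(f"{name:9s} misclassified at start: {n_mis:3d}  PLA updates: {iters}")
```

Iteration counts vary a lot from one data set to another, so a comparison like this is only meaningful when averaged over many runs with different seeds.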