LFD Book Forum (http://book.caltech.edu/bookforum/index.php)
-   Homework 2 (http://book.caltech.edu/bookforum/forumdisplay.php?f=131)
-   -   Regression then PLA (http://book.caltech.edu/bookforum/showthread.php?t=2152)

 gah44 10-15-2012 07:11 PM

Regression then PLA

In the Homework 1 section there is discussion showing that PLA is
independent of alpha, as only the ratio to w0 counts.

It seems, though, that when the regression weights are used as the
initial weights for the PLA, alpha becomes important again.

If alpha is too big, the initial weights don't do much; if it is too
small, the PLA converges slowly.
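
A minimal sketch of this effect, assuming alpha is the step size in the update w <- w + alpha*y*x. The data setup (`make_data`), the starting vector `w_start`, and the rule of always updating on the first misclassified point are illustrative assumptions, not part of the homework statement. From zero weights the update count should not depend on alpha; from nonzero weights it can.

```python
import random

def pla(points, labels, w_init, alpha=1.0, max_iters=100_000):
    """Run PLA with step size alpha; return (weights, number of updates).

    Always updates on the first misclassified point, so runs are deterministic.
    """
    w = list(w_init)
    for updates in range(max_iters):
        for x, y in zip(points, labels):
            s = sum(wi * xi for wi, xi in zip(w, x))
            if y * s <= 0:  # misclassified (or exactly on the boundary)
                w = [wi + alpha * y * xi for wi, xi in zip(w, x)]
                break
        else:
            return w, updates  # no misclassified point left
    return w, max_iters  # gave up after max_iters updates

def make_data(n, rng):
    """Hypothetical setup: n points in [-1, 1]^2 labeled by a random line."""
    (x1, y1), (x2, y2) = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(2)]
    points, labels = [], []
    for _ in range(n):
        px, py = rng.uniform(-1, 1), rng.uniform(-1, 1)
        side = (x2 - x1) * (py - y1) - (y2 - y1) * (px - x1)
        points.append((1.0, px, py))  # leading 1.0 is the bias coordinate
        labels.append(1 if side > 0 else -1)
    return points, labels

rng = random.Random(0)
points, labels = make_data(20, rng)
w_start = [0.3, -0.2, 0.5]  # arbitrary nonzero starting weights (assumption)

for alpha in (0.25, 1.0, 4.0):
    _, from_zero = pla(points, labels, [0.0, 0.0, 0.0], alpha)
    _, from_start = pla(points, labels, w_start, alpha)
    print(f"alpha={alpha}: {from_zero} updates from zero, {from_start} from w_start")
```

With zero initial weights, every weight vector along the way is just alpha times the alpha = 1 vector, so the same points are misclassified at every step. That argument breaks as soon as the starting weights are nonzero.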

 yaser 10-15-2012 07:39 PM

Re: Regression then PLA

Quote:
 Originally Posted by gah44 (Post 6402) In the Homework 1 section there is discussion showing that PLA is independent of alpha, as only the ratio to w0 counts. It seems, though, when using regression before PLA, as initial weights for the PLA, that alpha is important again. Too big, and the initial weights don't do much, too small and it converges slow.

Good point. :)

 Anne Paulson 01-17-2013 09:08 AM

Re: Regression then PLA

I'm running into this as well. Maybe I have a bug, but here is what I'm finding: the regression finds an almost perfect line, usually with very few points misclassified, yet when I give the regression weights to PLA as initial weights, the PLA line bounces all over the place before settling down.

Scaling the regression weights up by a factor of 10 or 100 would speed up the PLA a lot, I think, by preventing the PLA update from moving the weights so much. That would have a similar effect to using a small alpha. But we're not supposed to do either thing, right?
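
To spell out why the two would be equivalent, here is a short derivation. It assumes alpha is the step size in the update w <- w + alpha*y*x, that c > 0 is the scale factor applied to the regression weights w(0), and that (x_t, y_t) is the misclassified point used at update t.

```latex
\begin{align*}
\text{step size } \alpha:\quad
  \mathbf{w}(k) &= \mathbf{w}(0) + \alpha \sum_{t=1}^{k} y_t \mathbf{x}_t, \\
\text{scaled start, } \alpha = 1:\quad
  \mathbf{w}'(k) &= c\,\mathbf{w}(0) + \sum_{t=1}^{k} y_t \mathbf{x}_t
                 = c \left[ \mathbf{w}(0) + \frac{1}{c} \sum_{t=1}^{k} y_t \mathbf{x}_t \right].
\end{align*}
```

Because sign(c w.x) = sign(w.x) for c > 0, both runs misclassify the same points at every step, provided they use the same rule for picking a point. Under those assumptions, scaling the initial weights by c with alpha = 1 gives the same sequence of lines as keeping the weights and using alpha = 1/c.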

 yaser 01-17-2013 09:27 AM

Re: Regression then PLA

Quote:
 Originally Posted by Anne Paulson (Post 8775) I'm coming up against this as well. Maybe I have a bug, but I'm finding that even though the regression finds an almost perfect line with, usually, very few points misclassified, I give the weights from the regression to PLA as initial weights and the PLA line bounces all over the place before settling down. Scaling the regression weights up by a factor of 10 or 100 would speed up the PLA a lot, I think, by preventing the PLA update from moving the weights so much. That would have a similar effect to using a small alpha. But we're not supposed to do either thing, right?

You are right, there is no scaling in Problem 7. Here, and in all homework problems, you are encouraged to explore outside the statement of the problem, like you have done here, but the choice of answer should follow the problem statement.

 gah44 01-20-2013 01:47 AM

Re: Regression then PLA

It seems to me that starting PLA from the regression solution gives either 0 iterations (if no points are misclassified) or about as many iterations as PLA takes without the regression solution.

So what we are really counting is how often it gives 0 iterations, but yes, we have to follow the problem statement.
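
One way to check this is a small experiment, sketched below. Everything here is an assumption made for illustration: the data setup (`make_data`), the normal-equation solver, the random choice of misclassified point, and the run counts. It reports how often the regression start needs 0 updates, and compares the update counts of the remaining runs with PLA started from zero weights.

```python
import random

def make_data(n, rng):
    """Hypothetical setup: n points in [-1, 1]^2 labeled by a random line."""
    (x1, y1), (x2, y2) = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(2)]
    points, labels = [], []
    for _ in range(n):
        px, py = rng.uniform(-1, 1), rng.uniform(-1, 1)
        side = (x2 - x1) * (py - y1) - (y2 - y1) * (px - x1)
        points.append((1.0, px, py))  # leading 1.0 is the bias coordinate
        labels.append(1 if side > 0 else -1)
    return points, labels

def solve3(a, b):
    """Solve the 3x3 system a w = b by Gauss-Jordan elimination with pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [u - f * v for u, v in zip(m[r], m[col])]
    return [m[i][3] / m[i][i] for i in range(3)]

def linreg(points, labels):
    """Least-squares weights from the normal equations (X^T X) w = X^T y."""
    xtx = [[sum(x[i] * x[j] for x in points) for j in range(3)] for i in range(3)]
    xty = [sum(x[i] * y for x, y in zip(points, labels)) for i in range(3)]
    return solve3(xtx, xty)

def misclassified(w, points, labels):
    """Return the (x, y) pairs that w gets wrong (or puts on the boundary)."""
    return [(x, y) for x, y in zip(points, labels)
            if y * sum(wi * xi for wi, xi in zip(w, x)) <= 0]

def pla(points, labels, w_init, rng, max_iters=100_000):
    """PLA (step size 1) updating on a randomly chosen misclassified point."""
    w = list(w_init)
    for updates in range(max_iters):
        bad = misclassified(w, points, labels)
        if not bad:
            return w, updates
        x, y = rng.choice(bad)
        w = [wi + y * xi for wi, xi in zip(w, x)]
    return w, max_iters  # gave up after max_iters updates

def experiment(n_points=10, runs=200, seed=0):
    rng = random.Random(seed)
    warm_counts, cold_counts = [], []
    for _ in range(runs):
        points, labels = make_data(n_points, rng)
        warm_counts.append(pla(points, labels, linreg(points, labels), rng)[1])
        cold_counts.append(pla(points, labels, [0.0, 0.0, 0.0], rng)[1])
    nonzero = [k for k in warm_counts if k > 0]
    return {
        "zero_fraction": (runs - len(nonzero)) / runs,
        "mean_warm_nonzero": sum(nonzero) / len(nonzero) if nonzero else 0.0,
        "mean_cold": sum(cold_counts) / runs,
    }

print(experiment())
```

If the claim above holds, `mean_warm_nonzero` should come out comparable to `mean_cold`, and the overall average with the regression start is then driven mostly by `zero_fraction`.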

 Michael Reach 04-10-2013 12:17 PM

Re: Regression then PLA

Huh - I haven't done this problem yet, but this is really interesting. I had thought the answer to this question would be, like, 1, but now I see that the first step is likely to ruin the advantage that the linear regression gave you. The PLA isn't initially very subtle; the weights eventually get big and you have to wait till the size of the adjustments becomes small relative to the weights for the fine tuning.
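
A rough sketch of that point: how far a single PLA update w + y*x turns the weight vector depends on the size of x relative to w. The weights and the misclassified point below are hypothetical values chosen for illustration, with weights of order 1 standing in for a regression solution.

```python
import math

def angle_between(u, v):
    """Angle in radians between vectors u and v."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return math.acos(max(-1.0, min(1.0, dot / (norm_u * norm_v))))

def rotation_after_update(w, x, y):
    """Angle (radians) by which one PLA update w + y*x turns the weight vector."""
    updated = [wi + y * xi for wi, xi in zip(w, x)]
    return angle_between(w, updated)

w = [0.1, 0.8, -0.6]        # hypothetical weights of order 1
x, y = (1.0, 0.5, 0.9), 1   # hypothetical misclassified point: y * (w . x) <= 0

for scale in (1, 10, 100):
    scaled = [scale * wi for wi in w]
    turn = math.degrees(rotation_after_update(scaled, x, y))
    print(f"weights scaled by {scale}: one update turns w by {turn:.2f} degrees")
```

When the update is shorter than the weight vector, the turn is at most arcsin(|x| / |w|), so the adjustments only become fine once the weights have grown large relative to the data points.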

 Rahul Sinha 04-10-2013 12:26 PM

Re: Regression then PLA

Quote:
 Originally Posted by Michael Reach (Post 10324) Huh - I haven't done this problem yet, but this is really interesting. I had thought the answer to this question would be, like, 1, but now I see that the first step is likely to ruin the advantage that the linear regression gave you. The PLA isn't initially very subtle; the weights eventually get big and you have to wait till the size of the adjustments becomes small relative to the weights for the fine tuning.

For 10 data points, it might very well converge quickly! My simulation converges at the speed of light!
1000 data points is another story.

 All times are GMT -7.

The contents of this forum are to be used ONLY by readers of the Learning From Data book by Yaser S. Abu-Mostafa, Malik Magdon-Ismail, and Hsuan-Tien Lin, and participants in the Learning From Data MOOC by Yaser S. Abu-Mostafa. No part of these contents is to be communicated or made accessible to ANY other person or entity.