LFD Book Forum


Michael Reach 04-07-2013 08:48 PM

PLA computing speed
 
I'm a little concerned by how long my computations are taking (in R) on my home computer: ten minutes or more for the 100 points 1000 times. Is that normal, or is this just the tip of the iceberg, and the later assignments will take years?

yaser 04-07-2013 09:11 PM

Re: PLA computing speed
 
Quote:

Originally Posted by Michael Reach (Post 10216)
is this just the tip of the iceberg, and the later assignments will take years?

No danger of that!

Elroch 04-08-2013 05:15 PM

Re: PLA computing speed
 
Quote:

Originally Posted by Michael Reach (Post 10216)
I'm a little concerned by how long my computations are taking (in R) on my home computer: ten minutes or more for the 100 points 1000 times. Is that normal, or is this just the tip of the iceberg, and the later assignments will take years?

If it's any consolation, I'm using R as well, and I believe my program was slower than yours!

IsidroHidalgo 04-08-2013 05:27 PM

Re: PLA computing speed
 
My 1000 iterations of N=10 points took 45 min... R running on a notebook :(
I'm choosing the first misclassified point. I wonder if taking the point closest to f(x) in each iteration can improve the algorithm's speed.

kafar 04-08-2013 08:29 PM

Re: PLA computing speed
 
You probably want to switch to C++. It took just a couple of minutes to run 10k N=100 runs :D and I didn't really optimize the code.

pyguy 04-08-2013 11:17 PM

Re: PLA computing speed
 
I've been using Python and NumPy. My solution for N=100 with 1000 iterations was taking 4 minutes previously, and changing to PyPy and NumPyPy took it down to just 5 seconds. NumPyPy support isn't 100% yet, but it's getting there and the performance benefits of PyPy can be very nice without having to change your code.

jlaurentum 04-09-2013 07:53 AM

Re: PLA computing speed
 
Hello.

I used R too. I did 500 iterations for N=100 and it took forever. I used a while loop for the main part of the PLA. R is very inefficient with while loops and I don't see an easy way to vectorize the code. I also have the same question as Isidro - can the misclassification point chosen at each iteration affect the speed? Is there a better choice than just taking the first misclassified point you bump into?

Oh and btw, you guys did vectorize the classification for the N points with a good old sapply, didn't you?

IsidroHidalgo 04-09-2013 08:06 AM

Re: PLA computing speed
 
I got the code vectorized and it's great: 2 min for 1000 iterations with N=10 and 3 min for another 1000 with N=100.
jlaurentum: use apply to vectorize the sign computation: sign(apply(s.points, 1, "%*%", omega))

jlaurentum 04-09-2013 08:14 AM

Re: PLA computing speed
 
Isidro:

I used sapply on the sign function with an inner product argument. However, my code is very slow because of this loop (in pseudocode):

while there are misclassified points:
    update the weight vector according to the correct output for that misclassified point
    predict all the outputs according to the new weight vector (in just one instruction with the sapply)
end while


My problem with speed is due to the while. Or maybe taking the first misclassified point is not the best thing to do...
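
For reference, the loop described in the pseudocode above can be sketched as follows. This is a minimal illustration in plain Python rather than R, under assumptions not stated in the thread: points are drawn uniformly from [-1, 1] x [-1, 1], the target is a random line through two such points, weights start at zero, and the first misclassified point is used for each update. The names make_dataset, pla and max_updates are hypothetical.

```python
import random


def dot(w, x):
    """Inner product of two equal-length sequences."""
    return sum(wi * xi for wi, xi in zip(w, x))


def sign(v):
    """Return +1, -1, or 0 (zero counts as misclassified below)."""
    return (v > 0) - (v < 0)


def make_dataset(n, rng):
    """Assumed setup: random target line in [-1, 1]^2, n labelled points."""
    x1, y1, x2, y2 = (rng.uniform(-1, 1) for _ in range(4))
    # Line through (x1, y1) and (x2, y2) written as c + a*x + b*y = 0.
    a, b = y2 - y1, x1 - x2
    c = -(a * x1 + b * y1)
    target = (c, a, b)
    # Each point carries a leading 1.0 so the bias is part of the weights.
    points = [(1.0, rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(n)]
    labels = [1 if dot(target, p) > 0 else -1 for p in points]
    return points, labels


def pla(points, labels, max_updates=100_000):
    """Perceptron loop: update on the first misclassified point, then
    re-predict all outputs with the new weights, until none are wrong."""
    w = [0.0, 0.0, 0.0]
    updates = 0
    while updates < max_updates:
        predictions = [sign(dot(w, x)) for x in points]
        wrong = [i for i, (p, y) in enumerate(zip(predictions, labels)) if p != y]
        if not wrong:
            break
        i = wrong[0]
        w = [wi + labels[i] * xi for wi, xi in zip(w, points[i])]
        updates += 1
    return w, updates


if __name__ == "__main__":
    rng = random.Random(0)
    points, labels = make_dataset(100, rng)
    w, updates = pla(points, labels)
    print("updates until convergence:", updates)
```

The cost per pass is one prediction over all N points plus one weight update, so the total running time is driven by the number of updates, not by the while construct itself.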

Elroch 04-09-2013 08:21 AM

Re: PLA computing speed
 
Quote:

Originally Posted by jlaurentum (Post 10269)
Hello.

I used R too. I did 500 iterations for N=100 and it took forever. I used a while loop for the main part of the PLA. R is very inefficient with while loops and I don't see an easy way to vectorize the code. I also have the same question as Isidro - can the misclassification point chosen at each iteration affect the speed? Is there a better choice than just taking the first misclassified point you bump into?

Oh and btw, you guys did vectorize the classification for the N points with a good old sapply, didn't you?

Good point about sapply!

In answer to your question, I have observed empirically that the way you choose the misclassified point has quite a large effect on the convergence. Beyond the scope of the actual problem, we could attempt to improve on randomisation by using the information in the value of w.x when sign(w.x) is wrong. But should we prefer large modulus of w.x or small?
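
One way to explore that question empirically is to count updates under different selection rules. The sketch below is in plain Python rather than R and makes several assumptions that are not from the thread: the data setup (uniform points in [-1, 1] x [-1, 1], random target line), zero initial weights, and the four rules listed in RULES. All names (RULES, run_pla, compare) are hypothetical, and the sketch makes no claim about which rule wins.

```python
import random


def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))


def sign(v):
    return (v > 0) - (v < 0)


def make_dataset(n, rng):
    """Assumed setup: random target line in [-1, 1]^2, n labelled points."""
    x1, y1, x2, y2 = (rng.uniform(-1, 1) for _ in range(4))
    a, b = y2 - y1, x1 - x2
    target = (-(a * x1 + b * y1), a, b)
    points = [(1.0, rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(n)]
    labels = [1 if dot(target, p) > 0 else -1 for p in points]
    return points, labels


# Each rule picks one index from `wrong` (the misclassified indices),
# optionally using scores[i] = w.x for point i.
RULES = {
    "first": lambda wrong, scores, rng: wrong[0],
    "random": lambda wrong, scores, rng: rng.choice(wrong),
    "largest |w.x|": lambda wrong, scores, rng: max(wrong, key=lambda i: abs(scores[i])),
    "smallest |w.x|": lambda wrong, scores, rng: min(wrong, key=lambda i: abs(scores[i])),
}


def run_pla(points, labels, rule, rng, max_updates=100_000):
    """Run PLA with the given selection rule; return (weights, updates)."""
    w = [0.0, 0.0, 0.0]
    for updates in range(max_updates):
        scores = [dot(w, x) for x in points]
        wrong = [i for i, (s, y) in enumerate(zip(scores, labels)) if sign(s) != y]
        if not wrong:
            return w, updates
        i = rule(wrong, scores, rng)
        w = [wi + labels[i] * xi for wi, xi in zip(w, points[i])]
    return w, max_updates


def compare(n_points, n_runs, seed):
    """Mean number of updates per rule over n_runs random datasets."""
    rng = random.Random(seed)
    totals = {name: 0 for name in RULES}
    for _ in range(n_runs):
        points, labels = make_dataset(n_points, rng)
        for name, rule in RULES.items():
            _, updates = run_pla(points, labels, rule, rng)
            totals[name] += updates
    return {name: total / n_runs for name, total in totals.items()}


if __name__ == "__main__":
    for name, mean_updates in compare(n_points=100, n_runs=20, seed=0).items():
        print(f"{name:>15}: {mean_updates:.1f} updates on average")
```

Every rule is run on the same datasets, so differences in the averages reflect the selection rule and not the data. More runs would be needed before reading much into small gaps.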

