LFD Book Forum > Course Discussions > Online LFD course > Homework 1

#31  04-10-2012, 12:01 AM
mbell
Junior Member
Join Date: Apr 2012
Posts: 3
Re: Perceptron Learning Algorithm

Quote:
Originally Posted by virginiatraweek@gmail.com
How do you calculate the desired output (d_j), as shown on the Wikipedia page?
d_j is the same as y_j in our formulation - it's the known output value for x_j

A word of warning to others (although it sounds like everyone else on the forums is already past this point): you do in fact need to include w_0 (with x_{i,0}=1). I spent quite a while wrestling with my non-converging PLA until I realized this was the issue, and then some more time wondering why my P(f(x) != g(x)) values were so weird, until I realized that I'd failed to account for the extra w_0 and was computing f(x) using a 2D x instead of the appropriate 3D version...

I'd interpreted the value of w_0 from the in-class example as a threshold set by the bank for a credit-worthy customer, but now I realize that in fact it's one of the learned parameters, which makes perfect sense in retrospect.
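
For concreteness, here is a minimal sketch of the augmentation I mean (variable names are my own, not from the homework):

Code:
import numpy as np

# hypothetical setup: raw 2-D inputs plus a weight vector learned by PLA
X_raw = np.random.uniform(-1, 1, size=(100, 2))  # 100 points in [-1,1]^2
X = np.hstack([np.ones((100, 1)), X_raw])        # prepend x_0 = 1: points are now 3-D
w = np.array([0.5, -1.0, 2.0])                   # (w_0, w_1, w_2); w_0 is learned, not a fixed threshold
g = np.sign(np.dot(X, w))                        # evaluate g on the augmented 3-D points

The key point is that g must always be evaluated on the augmented 3-D points, including w_0, when estimating P(f(x) != g(x)).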

#32  04-10-2012, 05:58 AM
rukacity
Member
Join Date: Apr 2012
Posts: 21
Re: Perceptron Learning Algorithm

While running through the data set, I first collect all the misclassified points and then choose one at random to adjust the weights. The other approach would be to stop at the first misclassified point and correct the weights there. Which is common practice?
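
In code, the two variants differ only in how the misclassified point is picked; a rough sketch (function and variable names are my own, assuming X holds the augmented points row-wise and y the ±1 labels):

Code:
import numpy as np

def pla_step(w, X, y, pick_random=True):
    # indices of all currently misclassified points
    wrong = np.where(np.sign(np.dot(X, w)) != y)[0]
    if len(wrong) == 0:
        return w, True                                # converged
    i = np.random.choice(wrong) if pick_random else wrong[0]
    return w + y[i] * X[i], False                     # PLA update on the chosen point

Either choice converges on separable data; they generally just take different numbers of steps.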

#33  04-10-2012, 01:42 PM
mariovela
Junior Member
Join Date: Apr 2012
Posts: 5
Re: Perceptron Learning Algorithm

I chose to correct the weights at the first misclassified point, increment the iteration count, and restart the scan from the first point... it works fast so far... but I'm wondering if I should instead continue scanning without restarting, stopping only when every point satisfies h(x_n)=y_n...
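
For what it's worth, the continue-without-restarting variant is just a change of loop structure; a sketch (my own names, assuming X holds the augmented points and y the ±1 labels):

Code:
import numpy as np

def pla_continue(w, X, y, max_updates=10000):
    # keep sweeping through the data, updating in place, without restarting
    while max_updates > 0:
        changed = False
        for i in range(len(y)):
            if np.sign(np.dot(X[i], w)) != y[i]:
                w = w + y[i] * X[i]                  # PLA update, then keep scanning
                max_updates -= 1
                changed = True
        if not changed:
            break                                    # a full clean pass means convergence
    return w

Both stopping rules amount to the same thing in the end: you are done exactly when a full pass over the data makes no update.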

#34  04-11-2012, 02:36 PM
shockwavephysics
Member
Join Date: Apr 2012
Posts: 17
Re: Perceptron Learning Algorithm

Quote:
Originally Posted by yaser
Let me use the book notation to avoid confusion. You have two points {\bf x}_1 and {\bf x}_2 (which you called a1 and a2) and their target outputs (which you called assignment) are f({\bf x}_1)=+1 and f({\bf x}_2)=-1.

Either point, call it just {\bf x} for simplicity, is a vector that has d components {\bf x}=(x_1,x_2,\cdots,x_d). Notice that bold {\bf x} denotes a full data point, while italic x denotes a component in that data point. We add a constant 1 component to each data point {\bf x} and call the component x_0 to simplify the expression for the perceptron. If the weight vector of the perceptron is {\bf w}=(w_0,w_1,w_2,\cdots,w_d) (where w_0 takes care of the threshold value of that perceptron), then the perceptron implements

{\rm sign}(w_0x_0+w_1x_1+w_2x_2+\cdots+w_dx_d)

where {\rm sign}(\cdot) returns +1 if its argument is positive and -1 if its argument is negative.

Example: Say the first data point is {\bf x}_1=(3,4) (two-dimensional, so d=2). Add the constant component x_0=1 and you have {\bf x}_1=(1,3,4). Therefore, the perceptron's output on this point is {\rm sign}(w_0+3w_1+4w_2). If this formula returns -1, which is different from the target output +1, the PLA adjusts the values of the weights to make the perceptron output agree with the target output for this point {\bf x}. It uses the specific PLA update rule to achieve that.
I have been trying to figure out why updating using w -> w + y_n x_n works at all. I looked up the relevant section in the text, and there is a series of questions for the student that hint at the answer. I followed that logic to its conclusion, and it does seem to show that updating in that way always gives a w that is better (for the misclassified point) than the previous w. However, I cannot figure out how one comes up with this formulation in the first place. Is there a reference to a derivation I can read?

#35  04-11-2012, 03:20 PM
julien
Junior Member
Join Date: Apr 2012
Location: São Paulo, Brazil
Posts: 5
Re: Perceptron Learning Algorithm

I think it would be helpful to see a detailed example with the first few steps used to initialize and train the perceptron, or a complete example if anybody is willing to share their code.

I was unable to implement the perceptron correctly before the deadline, and after spending an additional five hours on this exercise today, I still don't have a proper implementation. Programming is not the issue (I'm a developer); my issue is how to apply the theory.

I feel it's important to code this algorithm successfully, so I can apply the theory we will learn in the next few lectures.

#36  04-12-2012, 08:43 AM
mephistoyourfriend
Junior Member
Join Date: Apr 2012
Posts: 1
Re: Perceptron Learning Algorithm

Quote:
Originally Posted by shockwavephysics
I have been trying to figure out why updating using w -> w + y_n x_n works at all. [...] Is there a reference to a derivation I can read?
The scalar product between x and w+x is always greater than the scalar product between x and w (unless x is the zero vector), which is obvious once you express it in terms of the angle between the vectors and their lengths. Similarly, the product <x, w-x> is always less than <x, w>. Since the sign of the scalar product is the only thing you're interested in, changing the sign from -1 to +1 is achieved by making the scalar product more positive (or rather, less negative), which you do by adding x to w. It's not guaranteed that you can accomplish this in one step (if the length of x is smaller than the length of w and the angle between them is 180 degrees, you certainly can't), but given a sufficient number of steps, you will be able to nudge it into submission. That's how I see it, at least.
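
The same argument in one line: for a misclassified point ({\bf x}_n, y_n), the update {\bf w} \to {\bf w}+y_n{\bf x}_n gives

y_n\langle{\bf w}+y_n{\bf x}_n,\,{\bf x}_n\rangle = y_n\langle{\bf w},{\bf x}_n\rangle+\|{\bf x}_n\|^2 > y_n\langle{\bf w},{\bf x}_n\rangle

(using y_n^2=1), so the quantity whose sign needs to become positive strictly increases with every update.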

#37  04-15-2012, 04:43 PM
zsero
Junior Member
Join Date: Apr 2012
Posts: 6
Re: Perceptron Learning Algorithm

Quote:
Originally Posted by mephistoyourfriend
The scalar product between x and w+x is always greater than the scalar product between x and w (unless x is the zero vector). [...] Given a sufficient number of steps, you will be able to nudge it into submission.
There is a point here that I think is not clearly explained. It took me a long time to realize that it's not true that exactly one point gets corrected in each step. The true statement is that at most one point gets corrected in each step. The illustration in the lecture, as well as various comments in this forum, would suggest that the vector addition always "corrects" a point. That is not true: the vector addition always "tunes" the weights so that the chosen point gets better, but that doesn't mean the point is classified correctly after the step.
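
A tiny numeric example (numbers are mine): take w = (0, 0, -10) and a misclassified x = (1, 1, 1) with y = +1. Before the update, <w, x> = -10; after it, w + x = (1, 1, -9) and <w + x, x> = -7. Better, but still negative, so the point is still misclassified.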

#38  04-15-2012, 04:47 PM
zsero
Junior Member
Join Date: Apr 2012
Posts: 6
Re: Perceptron Learning Algorithm

Quote:
Originally Posted by julien
I think it would be helpful to see a detailed example with the first few steps used to initialize and train the perceptron, or a complete example if anybody is willing to share their code. [...]
I've written the algorithm just based on the lecture and it worked. Here is the "core" of the algorithm (written in Python / NumPy):

Code:
import numpy as np

# points: list of augmented data points (1, x1, x2)
# targetFunction(p): returns the target label +1 or -1 for point p
w = np.zeros(3)
done = False

while not done:
    wrongpoints = 0
    for p in points:
        if np.sign(np.dot(w, p)) != targetFunction(p):
            # PLA update on the first misclassified point found
            w = w + targetFunction(p) * p
            wrongpoints += 1
            break
    if wrongpoints == 0:
        done = True
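
(A note for anyone trying to run the snippet above: points and targetFunction live in the full gists below; a minimal stand-in, with my own choice of a random target line, could look like this:)

Code:
import numpy as np

# two random points in [-1,1]^2 define the target line
a, b = np.random.uniform(-1, 1, (2, 2))

def targetFunction(p):
    # p = (1, x1, x2); +1 if (x1, x2) lies on one side of the line through a and b, -1 on the other
    return np.sign((b[0] - a[0]) * (p[2] - a[1]) - (b[1] - a[1]) * (p[1] - a[0]))

points = [np.array([1.0, x1, x2]) for x1, x2 in np.random.uniform(-1, 1, (100, 2))]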
If anyone is interested in the Python implementation, here is my full code with plotting:
https://gist.github.com/2395395

and the one for the experiments:
https://gist.github.com/2395409

#39  04-17-2012, 01:33 PM
rukacity
Member
Join Date: Apr 2012
Posts: 21
Re: Perceptron Learning Algorithm

Quote:
Originally Posted by zsero
I've written the algorithm just based on the lecture and it worked. Here is the "core" of the algorithm (written in Python / NumPy): [...]
Argh! I am new to Python and did not know about NumPy. It looks so much more concise than my code, which has all the vector multiplication and whatnot written out. One thing I noticed is that you keep adding to the weights for each wrong point, whereas Yaser had mentioned adjusting them only once per iteration.

Correction: My bad! I missed the 'break' statement.

#40  04-17-2012, 01:35 PM
zsero
Junior Member
Join Date: Apr 2012
Posts: 6
Re: Perceptron Learning Algorithm

Quote:
Originally Posted by rukacity
One thing I noticed is that you keep adding to the weights for each wrong point, whereas Yaser had mentioned adjusting them only once per iteration.
No, that's what the break statement does. Once it finds a wrong point, it corrects the weights, breaks out of the for loop, and then the whole while loop starts over.

Yes, NumPy is the best thing ever made for Python! Pseudoinverse of a matrix? np.linalg.pinv :-)
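
For example (made-up data, just to show the call):

Code:
import numpy as np

X = np.random.randn(100, 3)        # made-up data matrix
y = np.sign(np.random.randn(100))  # made-up +/-1 labels
w = np.dot(np.linalg.pinv(X), y)   # least-squares weights via the pseudoinverse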