LFD Book Forum Perceptron Learning Algorithm

#31
04-10-2012, 01:01 AM
 mbell Junior Member Join Date: Apr 2012 Posts: 3
Re: Perceptron Learning Algorithm

Quote:
 Originally Posted by virginiatraweek@gmail.com How do you calculate the desired output (dj), as shown on the wikipedia page?
d_j is the same as y_j in our formulation - it's the known output value for x_j

A word of warning to others (although it sounds like everyone else on the forums is already past this point): You do in fact need to include w_0 (with x_i,0 = 1). I spent quite a while wrestling with my non-converging PLA until I realized this was the issue (and then some more time wondering why my P(f(x) != g(x)) values were so weird, until I realized that I'd failed to account for the extra w_0 when computing f(x) using a 2D x instead of the appropriate 3D version...)
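For anyone hitting the same wall, here is a minimal sketch (the helper names `augment` and `h` are my own) of augmenting a 2D point with the constant x_0 = 1 so that w_0 is learned like any other weight:

```python
import numpy as np

def augment(x):
    """Prepend the constant component x_0 = 1 to a 2D point, giving a 3D vector."""
    return np.concatenate(([1.0], x))

def h(w, x_aug):
    """Perceptron output on an *augmented* point: sign(w . x)."""
    return np.sign(np.dot(w, x_aug))

x = np.array([0.5, -0.2])        # raw 2D point
w = np.array([1.0, 2.0, 3.0])    # three weights: w_0, w_1, w_2
print(h(w, augment(x)))          # evaluates sign(w_0 + w_1*0.5 + w_2*(-0.2)) = sign(1.4)
```

The key point is that both training and evaluation of f vs. g must use the same 3D augmented representation.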

I'd interpreted the value of w_0 from the in-class example as a threshold set by the bank for a credit-worthy customer, but now I realize that in fact it's one of the learned parameters, which makes perfect sense in retrospect.
#32
04-10-2012, 06:58 AM
 rukacity Member Join Date: Apr 2012 Posts: 21
Re: Perceptron Learning Algorithm

While running through the data set, I first collect all the misclassified points and then choose one at random to adjust the weights. The other approach would be to stop at the first misclassified point and correct the weights there. Which is the usual practice?
#33
04-10-2012, 02:42 PM
 mariovela Junior Member Join Date: Apr 2012 Posts: 5
Re: Perceptron Learning Algorithm

I chose to correct the weights at the first misclassified point, increment the iteration count, and restart the scan from the first point... it works fast so far... but I'm wondering if I should instead continue through the remaining points without restarting, stopping only when every point satisfies h(x_n) = y_n...
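Both variants converge on separable data. For comparison, a sketch of the continue-without-restarting version (the toy points and labels below are made up for illustration):

```python
import numpy as np

def pla_continue(points, labels, max_iters=10000):
    """PLA that keeps scanning forward instead of restarting:
    after fixing a misclassified point it moves on to the next point,
    and only stops once a full pass finds no errors."""
    w = np.zeros(points.shape[1])
    for _ in range(max_iters):
        errors = 0
        for x, y in zip(points, labels):
            if np.sign(np.dot(w, x)) != y:
                w = w + y * x          # PLA update on this point
                errors += 1
        if errors == 0:                # a clean pass: all points classified
            return w
    return w

# tiny separable example (points already augmented with x_0 = 1)
pts = np.array([[1.0, 2.0, 1.0], [1.0, -1.0, -2.0]])
lbl = np.array([1, -1])
w = pla_continue(pts, lbl)
```

Either scanning order terminates for linearly separable data; they just visit misclassified points in a different sequence.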
#34
04-11-2012, 03:36 PM
 shockwavephysics Member Join Date: Apr 2012 Posts: 17
Re: Perceptron Learning Algorithm

Quote:
 Originally Posted by yaser Let me use the book notation to avoid confusion. You have two points x_1 and x_2 (which you called a1 and a2) and their target outputs (which you called assignment) are y_1 and y_2. Either point, call it x just for simplicity, is a vector that has components x_1, ..., x_d. Notice that bold x_n denotes a full data point, while italic x_i denotes a component in that data point. We add a constant 1 component to each data point and call it the x_0 component to simplify the expression for the perceptron. If the weight vector of the perceptron is w = (w_0, w_1, ..., w_d) (where w_0 takes care of the threshold value of that perceptron), then the perceptron implements h(x) = sign(w . x), where sign() returns +1 if its argument is positive and returns -1 if its argument is negative. Example: Say the first data point is x = (x_1, x_2) (two dimensional, so d = 2). Add the constant component x_0 = 1 and you have x = (1, x_1, x_2). Therefore, the perceptron's output on this point is sign(w_0 + w_1 x_1 + w_2 x_2). If this formula returns a value different from the target output y for this point, the PLA adjusts the values of the weights trying to make the perceptron output agree with the target output for this point. It uses the specific PLA update rule w <- w + y x to achieve that.
I have been trying to figure out why updating using w -> w + y_n * x_n works at all. I looked up the relevant section in the text, and there is a series of questions for the student that hint at the answer. I followed that logic to its conclusion, and it does seem to show that updating in that way will always give a w that is better (for the misclassified point) than the previous w. However, I cannot figure out how one comes up with this formulation in the first place. Is there a reference to a derivation I can read?
#35
04-11-2012, 04:20 PM
 julien Junior Member Join Date: Apr 2012 Location: São Paulo, Brazil Posts: 5
Re: Perceptron Learning Algorithm

I think it would be interesting to obtain a detailed example with the first few steps used to initialize and train the perceptron. Or a complete example if anybody is willing to share their code.

I was unable to correctly implement the perceptron before the deadline, and after spending an additional 5 hours on this exercise today, I still don't have a proper implementation.
Programming is not the issue (I'm a developer); my issue is how to apply the theory.

I feel it's important to get this algorithm working correctly, so I can apply the theory we will learn in the next few lectures.
#36
04-12-2012, 09:43 AM
 mephistoyourfriend Junior Member Join Date: Apr 2012 Posts: 1
Re: Perceptron Learning Algorithm

Quote:
 Originally Posted by shockwavephysics I have been trying to figure out why updating using w -> w + y_n * x_n works at all. I looked up the relevant section in the text, and there is a series of questions for the student that hint at the answer. I followed that logic to its conclusion, and it does seem to show that updating in that way will always give a w that is better (for the misclassified point) than the previous w. However, I cannot figure out how one comes up with this formulation in the first place. Is there a reference to a derivation I can read?
The scalar product between x and w+x is always greater than the scalar product between x and w (unless x is a zero vector), which is obvious once you express it in terms of angles between the vectors and their lengths. Similarly, the product <x,w-x> is always less than <x,w>. Since the sign of the scalar product is the only thing you're interested in, then changing the sign from -1 to +1 is achieved by making the scalar product more positive (rather, less negative), which you do by adding x to w. It's not guaranteed that you can accomplish that in one step (if the length of x is smaller than the length of w and the angle between them is 180 degrees, you certainly can't), but given a sufficient number of steps, you will be able to nudge it into submission. That's how I see it at least.
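That inequality can be checked numerically: the improvement in the scalar product is always exactly ||x||^2. A throwaway sketch (arbitrary random vectors, nothing specific to the homework):

```python
import numpy as np

rng = np.random.default_rng(0)
w = rng.normal(size=3)   # arbitrary current weight vector
x = rng.normal(size=3)   # arbitrary (nonzero) data point

# <x, w + x> - <x, w> = <x, x> = ||x||^2, which is > 0 for any nonzero x
gain = np.dot(x, w + x) - np.dot(x, w)
print(np.isclose(gain, np.dot(x, x)))   # True
```

So each update moves the scalar product toward the correct sign by a fixed amount ||x||^2, which is why repeated updates eventually "nudge it into submission."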
#37
04-15-2012, 05:43 PM
 zsero Junior Member Join Date: Apr 2012 Posts: 6
Re: Perceptron Learning Algorithm

Quote:
 Originally Posted by mephistoyourfriend The scalar product between x and w+x is always greater than the scalar product between x and w (unless x is a zero vector), which is obvious once you express it in terms of angles between the vectors and their lengths. Similarly, the product <x,w-x> is always less than <x,w>. Since the sign of the scalar product is the only thing you're interested in, then changing the sign from -1 to +1 is achieved by making the scalar product more positive (rather, less negative), which you do by adding x to w. It's not guaranteed that you can accomplish that in one step (if the length of x is smaller than the length of w and the angle between them is 180 degrees, you certainly can't), but given a sufficient number of steps, you will be able to nudge it into submission. That's how I see it at least.
There is a point that I think is not clearly explained. It took me a long time to realize that it's not true that exactly one point gets corrected in each step. The true statement is that at most one point gets corrected in each step. The illustration in the lecture, as well as various comments in this forum, would suggest that the vector addition always "corrects" a point. That is not true. The vector addition always "tunes" the weights so that one point gets "better", but it doesn't mean that point will be classified correctly after that step.
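A concrete (made-up) case showing this: the update improves the point's score but does not flip its sign in one step.

```python
import numpy as np

w = np.array([-10.0, 0.0])
x = np.array([1.0, 0.0])
y = 1                          # target output, but sign(w . x) = -1: misclassified

w = w + y * x                  # one PLA update
print(np.dot(w, x))            # -9.0: closer to positive than -10.0, but...
print(np.sign(np.dot(w, x)))   # -1.0: the point is still misclassified
```

It would take ten more updates on this point alone before the sign finally flips.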
#38
04-15-2012, 05:47 PM
 zsero Junior Member Join Date: Apr 2012 Posts: 6
Re: Perceptron Learning Algorithm

Quote:
 Originally Posted by julien I think it would be interesting to obtain a detailed example with the first few steps used to initialize and train the perceptron. Or a complete example if anybody is willing to share their code. I was unable to correctly implement the perceptron before the deadline, and after spending an additional 5 hours on this exercise today, I still don't have a proper implementation. Programming is not the issue (I'm a developer); my issue is how to apply the theory. I feel it's important to get this algorithm working correctly, so I can apply the theory we will learn in the next few lectures.
I've written the algorithm just based on the lecture and it worked. Here is the "core" of the algorithm (written in Python / Numpy):

Code:
w = np.zeros(3)
done = False

while not done:
    wrongpoints = 0
    for p in points:
        if np.sign(np.dot(w, p)) != targetFunction(p):
            w = np.add(w, targetFunction(p) * p)
            wrongpoints += 1
            break  # restart the scan after each correction
    if wrongpoints == 0:
        done = True
If anyone is interested in the Python implementation, here is my full code with plotting:
https://gist.github.com/2395395

and the one for the experiments:
https://gist.github.com/2395409
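The snippet above assumes `points` and `targetFunction` already exist. One way they might be set up for this homework (the line-through-two-random-points construction is my own choice, not from zsero's code):

```python
import numpy as np

rng = np.random.default_rng(42)

# a random target line through two points a, b in [-1, 1]^2
a, b = rng.uniform(-1, 1, size=(2, 2))

def targetFunction(p):
    """+1 if the augmented point p = (1, x1, x2) lies on one side
    of the line through a and b, -1 otherwise."""
    _, x1, x2 = p
    cross = (b[0] - a[0]) * (x2 - a[1]) - (b[1] - a[1]) * (x1 - a[0])
    return 1.0 if cross > 0 else -1.0

# N random points, each augmented with the constant x_0 = 1
N = 20
raw = rng.uniform(-1, 1, size=(N, 2))
points = np.hstack([np.ones((N, 1)), raw])
```

Because the targets come from a true line, the data is linearly separable and PLA is guaranteed to terminate on it.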
#39
04-17-2012, 02:33 PM
 rukacity Member Join Date: Apr 2012 Posts: 21
Re: Perceptron Learning Algorithm

Quote:
 Originally Posted by zsero I've written the algorithm just based on the lecture and it worked. Here is the "core" of the algorithm (written in Python / Numpy): Code: w = np.zeros( 3 ) done = False while not done: wrongpoints = 0 for p in points: if np.sign( np.dot(w, p) ) != targetFunction( p ): w = np.add( w, targetFunction( p ) * p ) wrongpoints += 1 break if wrongpoints == 0: done = True If anyone is interested in the Python implementation, here is my full code with plotting: https://gist.github.com/2395395 and the one for the experiments: https://gist.github.com/2395409
Argh! I am new to Python and did not know about numpy. It looks so much more concise than my code that has all the vector multiplication and what not. One thing I noticed is that you keep adding to weights for each wrongpoint whereas Yaser had mentioned to adjust it only once per iteration.

Correction: My bad! Missed the 'break' statement
#40
04-17-2012, 02:35 PM
 zsero Junior Member Join Date: Apr 2012 Posts: 6
Re: Perceptron Learning Algorithm

Quote:
 Originally Posted by rukacity Argh! I am new to Python and did not know about numpy. It looks so much more concise than my code that has all the vector multiplication and what not. One thing I noticed is that you keep adding to weights for each wrongpoint whereas Yaser had mentioned to adjust it only once per iteration.
No, that's what the break statement does. Once it finds a wrong point, it corrects the weights, breaks out of the for loop, and then the outer while loop starts the scan again from the first point.

Yes, numpy is the best thing ever made for Python! Pseudoinverse of a matrix? np.linalg.pinv :-)
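A quick sanity check (toy numbers) that `np.linalg.pinv` really does give the least-squares solution, matching `np.linalg.lstsq`:

```python
import numpy as np

X = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])
y = np.array([1.0, 2.0, 3.0])

w = np.linalg.pinv(X) @ y                          # weights via the pseudoinverse
w_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)    # weights via least squares
print(np.allclose(w, w_lstsq))                     # True
```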
