PLA  Need Guidance
Greetings!
I am working on the Perceptron part of the homework, and having spent several hours on it, I'd like to know if I am proceeding in the right direction: 1) My implementation converges in 'N' iterations. This looks rather fishy. Any comments would be appreciated. (Otherwise I may have to start over :( maybe in a different programming language) 2) I don't understand the Pr( f(x) != g(x) ) expression  what exactly does this mean? Once the algorithm has converged, presumable f(x) matches g(x) on all data, so the difference is zero. Thanks. Samir 
Re: PLA  Need Guidance
If we try to evaluate Pr(f(x)!=g(x)) experimentaly how many random verification points should we use to get a significant answear?
I am tempted to believe that Hoeffding's inequality is applicable in this case to a single experiment but since we are averaging out over very many experiments I'm not sure on how to choose the amount of those verification data points (I ultimately worked with 10000 per experiment just to be sure). 
Re: PLA  Need Guidance
Quote:

Re: PLA  Need Guidance
How would you determine f(x) == g(x) exactly  since the set of possible hypotheses is infinite (3 reals), wouldn't Pr(f(x) != g(x)) == 1? Obviously you could choose some arbitrary epsilon but then that wouldn't be "exactly."

Re: PLA  Need Guidance
Quote:

Re: PLA  Need Guidance
Quote:

Re: PLA  Need Guidance
I still don't understand this Pr() function. Given two (linear) functions f and g, what is the Pr() of f and g?
Thanks...Neil 
Re: PLA  Need Guidance
Quote:
BTW, anyone who wants to refresh some of the prerequisite material for the course, here are some recommendations: http://book.caltech.edu/bookforum/showthread.php?t=3720 
Re: PLA  Need Guidance
Sir
If I understand correctly, we are using N = 10 training points out of a randomly generated x points according to the target function f for perceptron learning and Pr(f(x) != g(x)) should be calculated considering all x points and not just the training data. Am I right ? 
Re: PLA  Need Guidance
Quote:

Re: PLA  Need Guidance
For Pr(f(x) != g(x)), you would want to generate a lot of random points. However, I would not include the training data for this calculation. f & g would always agree on this training data, and wouldn't represent Pr(f(x) != g(x)) fairly.

Re: PLA  Need Guidance
Quote:
Can you kindly explain how we can calculate this number. How can we ensure that the number is "sufficiently large" Thanks, Bipin 
Re: PLA  Need Guidance
Hi all,
it would be great if any of you could tell me if my understanding of this algorithm is correct. 1)pick a target function(f)(i.e a line in the [1,1] * [1,1] plane) 2)generate a set of training examples of the form ((x_d,y_d),output) (I am confused from this point onwards,please correct me if I am wrong:) ) 3)we initially set H as a vector of 0.0s 4)apply the perceptron iteration till the h function 'linearly seperates' all the data points.(we call this g). 5)test this g on a set of new data points (test_data,f(test_data)). 6)compare this with h and calculate error probability. 
Re: PLA  Need Guidance
Quote:

Re: PLA  Need Guidance
Quote:

Re: PLA  Need Guidance
thank you professor:)so in reality, step 1 represents the given problem.

Re: PLA  Need Guidance
Hi Prof. Yaser,
I need help in implementation of PLA in JAVA . How i did it is generate two random points , and calculate the line in ax+by = 1 format. After this i generate 100 random points and calculate yn of each point and put them in a 100x3 matrix. I start with a weight vector of size 2 ( 1 for calculating a , other for b) with initial values of zero. Now i loop through the random point matrix and apply the PLA formula for the misclassified points . The program stops when there are no misclassified points. The problem i am facing is for 100 points its not converging quite frequently . And for 10 points it is converging in 23 iterations. So this is my problem professor , can you olease help me in catching what i am missing. Thanks, Trk. 
Re: PLA  Need Guidance
The weight vector should include a bias term. In this case dimension of input is 2, so weight vector will be of size 3. Right now you have ax + by = 1, while "1" never gets updated. I suspect when there are small number of pts, the algorithm could converge without the bias term, but with more pts (requiring higher accuracy for no pts to be misclassifed), it doesn't.

Re: PLA  Need Guidance
Quote:
Each line can crosses two of the sides of the square. (I suppose it could also go right through a corner, but not likely). Handling all the possible combinations of the two lines is a lot of work. In another thread I discussed how I did it, only counting, and computing the area of, cases where both lines go through the top and bottom. That is about 30% in my tests. By symmetry, there should also be 30% where both go through the left and right sides of the square The remaining cases might have a little less area, but I based my answer on just the lines going through the top and bottom of the square. Seemed more interesting than the choose random point method. 
Re: PLA  Need Guidance
Hi butterscotch,
Thanks for the tip now its converging consistently, but it is taking much lesser iterations , for 10 training points avg iterations are 1.5 and for 100 touchpoints it is around 10. These values are wrong according to the answer sheet.Can you help me with this. now i am using ax+by = c line equation. The bias term i am initiating it to the constant term of the line that i created initially from two random points. For misclassified points bias term is w = w+/ 1. Is this correct or still am i missing something. Thanks, Trk 
Re: PLA  Need Guidance
Does your program print out the number of iterations and probability of error after every individual trial?
If not, you might want to add that so that you can watch it calculate and compare those numbers to the mean you get at the end. It will help narrow down whether the problem is in your algorithm or a simple math error. I.e., if you're getting 1.5 for the average iterations per trial, then most of your individual trials should have numbers like 1 or 2 iterations. 
Re: PLA  Need Guidance
Hi all,
I did a little mistake in check for convergence , now i corrected it still its taking far lesser iterations to converge . I am checking how the final values seperate the trainig Points, it is matching perfectly . WHAT I THINK IS THAT I AM INITIATING W0 TO THE CONSTANT TERM OF THE ORIGINAL LINE so the algo is taking lesser iterations . But If i initiate it to 0. It is not converging. MY CONCERN IS IF W0 IS INITIATED TO 0. THEN IT IS NOT CONVERGING. SHOULD I INITIATE W0 WITH SOME RANDOM NUMBER . Fellow students please Help me here . I am stuck on this. 
Re: PLA  Need Guidance
w0 is definitely supposed to be initialized to zero, so there's a logic error in your code somewhere. If whether or not you converge depends on the initial value, you're probably not actually updating w0. w0 should update every iteration.
It would be helpful if you can run a 1trial test and manually examine the results of every iteration (see what the new weights are, how many points are currently misclassified.) This is really easy to do in the language I'm using but I'm not familiar with Java. You could probably print out the values or graph them visually. Either way, changing the trials from 1000 to 1 and stepping through the algorithm may show you an obvious source of error. Then once one trial gives you the results you're supposed to get, doing multiple trials should work out cleanly. If you still can't figure out what's going wrong, then you could post here the output of the program for one trial. 
Re: PLA  Need Guidance
I think I've got the PLA working correctly, but I'm not sure that I can see how w0 ends up being anything other than 1, 0 or 1. Is this correct?
if w = w + xn*yn, but x0 always is 1 and yn is 1 or 1, then w0 is only ever going to change by 1 or 1. No? (Correction: w0 only ever changes by 1 or 1) 
Re: PLA  Need Guidance
Quote:

All times are GMT 7. The time now is 03:21 PM. 
Powered by vBulletin® Version 3.8.3
Copyright ©2000  2020, Jelsoft Enterprises Ltd.
The contents of this forum are to be used ONLY by readers of the Learning From Data book by Yaser S. AbuMostafa, Malik MagdonIsmail, and HsuanTien Lin, and participants in the Learning From Data MOOC by Yaser S. AbuMostafa. No part of these contents is to be communicated or made accessible to ANY other person or entity.