LFD Book Forum  

LFD Book Forum > Course Discussions > Online LFD course > Homework 1

#1 | 07-11-2012, 04:48 PM
samirbajaj (Member, joined Jul 2012, Silicon Valley, 48 posts)
PLA - Need Guidance

Greetings!

I am working on the Perceptron part of the homework, and having spent several hours on it, I'd like to know if I am proceeding in the right direction:

1) My implementation converges in 'N' iterations. This looks rather fishy. Any comments would be appreciated. (Otherwise I may have to start over :-( maybe in a different programming language)

2) I don't understand the Pr( f(x) != g(x) ) expression -- what exactly does this mean? Once the algorithm has converged, presumably f(x) matches g(x) on all data, so the difference is zero.


Thanks.

-Samir
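For reference, a minimal sketch of the PLA being discussed (the function names and data layout are my own choices, assuming points in the plane with a leading 1 absorbing the bias term):

```python
import random

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def sign(v):
    # PLA convention: treat 0 as -1 so every point gets a definite label
    return 1 if v > 0 else -1

def pla(points, labels, max_iters=10000):
    """Perceptron Learning Algorithm.

    Each point is (1, x1, x2); the leading 1 absorbs the bias term.
    Returns a weight vector w with sign(dot(w, p)) == label for every
    training point, provided the data are linearly separable."""
    w = [0.0, 0.0, 0.0]
    for _ in range(max_iters):
        misclassified = [i for i, p in enumerate(points)
                         if sign(dot(w, p)) != labels[i]]
        if not misclassified:
            return w  # converged: all training points classified correctly
        i = random.choice(misclassified)  # pick a misclassified point at random
        w = [wj + labels[i] * xj for wj, xj in zip(w, points[i])]
    return w
```

With this setup, the iteration count varies with the random data and the random choice of misclassified point, so converging in exactly N iterations every time would indeed be suspicious.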
#2 | 07-11-2012, 06:13 PM
yaser (Caltech, joined Aug 2009, Pasadena, California, USA, 1,477 posts)
Re: PLA - Need Guidance

Quote:
Originally Posted by samirbajaj
I don't understand the Pr( f(x) != g(x) ) expression -- what exactly does this mean? Once the algorithm has converged, presumably f(x) matches g(x) on all data, so the difference is zero.
On all data, yes. However, the probability is with respect to x over the entire input space, not restricted to x being in the finite data set used for training.
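That distinction can be checked numerically. A sketch (the function name and sample size are my own choices): estimate the disagreement by drawing fresh points from the input space, not from the training set.

```python
import random

def estimate_disagreement(f, g, n_test=10000):
    """Estimate Pr[f(x) != g(x)] by drawing fresh points uniformly
    from [-1, 1] x [-1, 1] -- not from the training set, where a
    converged PLA hypothesis agrees with the target by construction."""
    disagree = 0
    for _ in range(n_test):
        x = (random.uniform(-1, 1), random.uniform(-1, 1))
        if f(x) != g(x):
            disagree += 1
    return disagree / n_test
```

On the training points the estimate would be exactly zero after convergence; on fresh points it reflects how far g still is from f.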
__________________
Where everyone thinks alike, no one thinks very much
#3 | 07-12-2012, 08:30 AM
jakvas (Member, joined Jul 2012, 17 posts)
Re: PLA - Need Guidance

If we try to evaluate Pr( f(x) != g(x) ) experimentally, how many random verification points should we use to get a significant answer?

I am tempted to believe that Hoeffding's inequality is applicable in this case to a single experiment, but since we are averaging over very many experiments, I'm not sure how to choose the number of verification data points (I ultimately worked with 10000 per experiment just to be sure).
#4 | 07-12-2012, 10:56 AM
yaser (Caltech, joined Aug 2009, Pasadena, California, USA, 1,477 posts)
Re: PLA - Need Guidance

Quote:
Originally Posted by jakvas
I am tempted to believe that Hoeffding's inequality is applicable in this case to a single experiment, but since we are averaging over very many experiments, I'm not sure how to choose the number of verification data points (I ultimately worked with 10000 per experiment just to be sure).
Indeed, the average helps smooth out statistical fluctuations. Your choice of 10000 points is pretty safe.
#5 | 07-16-2012, 10:19 PM
jtwang (Junior Member, joined Jul 2012, 1 post)
Re: PLA - Need Guidance

How would you determine f(x) == g(x) exactly - since the set of possible hypotheses is infinite (3 reals), wouldn't Pr(f(x) != g(x)) == 1? Obviously you could choose some arbitrary epsilon but then that wouldn't be "exactly."
#6 | 07-16-2012, 10:39 PM
yaser (Caltech, joined Aug 2009, Pasadena, California, USA, 1,477 posts)
Re: PLA - Need Guidance

Quote:
Originally Posted by jtwang
How would you determine f(x) == g(x) exactly - since the set of possible hypotheses is infinite (3 reals), wouldn't Pr(f(x) != g(x)) == 1? Obviously you could choose some arbitrary epsilon but then that wouldn't be "exactly."
f(x) = g(x) is evaluated per point x. It may be true for some x's and false for others, hence the notion of the probability that it is true (probability with respect to x). We are not saying that f is identically equal to g.
#7 | 01-15-2013, 08:20 PM
gah44 (Invited Guest, joined Jul 2012, Seattle, WA, 153 posts)
Re: PLA - Need Guidance

Quote:
Originally Posted by jtwang
How would you determine f(x) == g(x) exactly - since the set of possible hypotheses is infinite (3 reals), wouldn't Pr(f(x) != g(x)) == 1? Obviously you could choose some arbitrary epsilon but then that wouldn't be "exactly."
There are two lines: the original line that determines the separation between +1 and -1, and the line determined by the PLA. The question asks what fraction of the space is classified differently by the two lines. If they don't cross inside the square, that is the area between the two lines (divided by 4, the total area of the square). If they do cross, it is the area of two triangles (or, sometimes, quadrilaterals).

Each line crosses two of the sides of the square. (I suppose it could also go right through a corner, but that is unlikely.) Handling all the possible combinations of the two lines is a lot of work.

In another thread I discussed how I did it, only counting, and computing the area of, the cases where both lines go through the top and bottom. That is about 30% of cases in my tests. By symmetry, there should also be about 30% where both go through the left and right sides of the square. The remaining cases might have a little less area, but I based my answer on just the lines going through the top and bottom of the square. It seemed more interesting than the random-point method.
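A sketch of the top-and-bottom case described above (my own parametrization, assuming the square is [-1,1] x [-1,1] and each line is written as x = a + b*y, which is possible exactly when the line crosses the top and bottom edges):

```python
def disagreement_fraction(a1, b1, a2, b2):
    """Fraction of the square [-1,1]^2 lying between the lines
    x = a1 + b1*y and x = a2 + b2*y (both assumed to cross the top
    and bottom edges). The area between them is the integral over y
    of |x1(y) - x2(y)|, divided by 4, the square's total area."""
    da, db = a1 - a2, b1 - b2

    def antideriv(y):
        # antiderivative of the signed gap d(y) = da + db*y
        return da * y + 0.5 * db * y * y

    if db != 0 and -1 < -da / db < 1:
        # lines cross inside the square: two triangles
        y0 = -da / db
        area = abs(antideriv(y0) - antideriv(-1)) + abs(antideriv(1) - antideriv(y0))
    else:
        # no crossing inside the square: one trapezoid/quadrilateral
        area = abs(antideriv(1) - antideriv(-1))
    return area / 4.0
```

For example, the vertical lines x = 0.5 and x = -0.5 enclose half the square, and so do the crossing lines x = y and x = -y.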
#8 | 01-14-2013, 08:18 PM
vbipin (Member, joined Jan 2013, Shanghai, 18 posts)
Re: PLA - Need Guidance

Quote:
Originally Posted by yaser
Indeed, the average helps smooth out statistical fluctuations. Your choice of 10000 points is pretty safe.
Dear Professor,

Can you kindly explain how we can calculate this number? How can we ensure that the number is "sufficiently large"?

Thanks,
Bipin
#9 | 01-14-2013, 08:40 PM
yaser (Caltech, joined Aug 2009, Pasadena, California, USA, 1,477 posts)
Re: PLA - Need Guidance

Quote:
Originally Posted by vbipin
Dear Professor,

Can you kindly explain how we can calculate this number? How can we ensure that the number is "sufficiently large"?

Thanks,
Bipin
The probability for a single random point to be misclassified is, say, μ. Therefore the variance for one point (1 if misclassified, 0 if classified correctly) is μ(1-μ), which is at most 0.25 independently of μ. If you average the misclassification values of 10000 points, the expected value will be μ (which is what you want) and the variance will be at most 0.25/10000 (because of independence). The standard deviation, which is the square root of this variance, gives you an indication of the "error bar" around the expected value that you are likely to get in your estimate. In the multiple-choice setup, we want the error bar to be small enough to make it highly unlikely that your estimate will take you away from the correct answer to the nearest incorrect answer. This is why 10000 is "sufficiently large" in this case.
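The arithmetic above, as a small sketch (the function name is mine):

```python
import math

def error_bar(n_points, mu=0.5):
    """Worst-case standard deviation of the average of n_points i.i.d.
    0/1 misclassification indicators. The per-point variance mu*(1-mu)
    is maximized at mu = 0.5, where it equals 0.25."""
    var_one = mu * (1.0 - mu)        # at most 0.25
    var_avg = var_one / n_points     # independence divides variance by n
    return math.sqrt(var_avg)
```

With 10000 points the worst-case error bar is sqrt(0.25/10000) = 0.005, far smaller than the spacing between the multiple-choice answers.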
#10 | 01-08-2013, 03:15 PM
dobrokot (Junior Member, joined Jan 2013, 3 posts)
Re: PLA - Need Guidance

Quote:
Originally Posted by jakvas
I'm not sure on how to choose the amount of those verification data points (I ultimately worked with 10000 per experiment just to be sure).
The Hoeffding inequality given in the same lesson can help choose the number of points. The event g(x) != f(x) can be thought of as drawing a red marble.
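One way to make that concrete (a sketch; the epsilon and delta values are my own): solving the Hoeffding bound 2*exp(-2*eps^2*N) <= delta for N gives the number of verification points needed for a given tolerance eps and confidence 1 - delta.

```python
import math

def hoeffding_sample_size(eps, delta):
    """Smallest N with 2*exp(-2*eps^2*N) <= delta, so the sample
    estimate of Pr[f(x) != g(x)] is within eps of the true value
    with probability at least 1 - delta."""
    return math.ceil(math.log(2.0 / delta) / (2.0 * eps * eps))
```

For example, eps = 0.01 and delta = 0.05 give N = 18445, which is the same ballpark as the 10000 points used above for a slightly looser tolerance.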

Tags
convergence, iterations, perceptron, pla





The contents of this forum are to be used ONLY by readers of the Learning From Data book by Yaser S. Abu-Mostafa, Malik Magdon-Ismail, and Hsuan-Tien Lin, and participants in the Learning From Data MOOC by Yaser S. Abu-Mostafa. No part of these contents is to be communicated or made accessible to ANY other person or entity.