LFD Book Forum

LFD Book Forum (http://book.caltech.edu/bookforum/index.php)
-   Homework 1 (http://book.caltech.edu/bookforum/forumdisplay.php?f=130)
-   -   Computing Pr(f(x) != g(x)) (http://book.caltech.edu/bookforum/showthread.php?t=3833)

gah44 01-11-2013 04:37 PM

Computing Pr(f(x) != g(x))
 
I am not sure how others compute Pr(f(x) != g(x)).

It isn't so hard to do analytically, but there are a lot of cases to consider, depending on which boundary of the square the lines cross.

I now have one that just does the cases where both lines intersect the top and bottom of the square. By symmetry, the cases that intersect the left and right sides should be the same.

That leaves out the ones where lines intersect top or bottom and side. Those might have smaller area (probability) but maybe not too much smaller.

So far, it is close enough to one of the answers that I will go for that one. The other cases are enough harder to write that I won't try them.

butterscotch 01-12-2013 01:01 AM

Re: Computing Pr(f(x) != g(x))
 
Yes as you mentioned, covering all the cases of integrations can be tricky.
Another effective method is generating many random points over the X space and simply counting when f and g disagree over those points.

gah44 01-12-2013 04:46 AM

Re: Computing Pr(f(x) != g(x))
 
Quote:

Originally Posted by butterscotch (Post 8585)
Yes as you mentioned, covering all the cases of integrations can be tricky.
Another effective method is generating many random points over the X space and simply counting when f and g disagree over those points.

Yes that is another way. Somehow it doesn't seem as satisfying as the analytical solution, even an approximate analytical solution.

As the lines get closer, it takes more and more random points to get enough points to disagree.

Also, as the lines get closer, it is more likely that both go through the same edge of the square, and so the easier to compute integrations are more and more significant.

For the 100 point case, top and bottom gets about 30% of the trials. By symmetry, the left and right case should give the same answer. I separately compute the case where the lines cross inside the square and when they don't. (I could double the trials to make up for the not counted left-right.)

Then there are four (symmetric) cases where both lines go through a top or bottom side and a left or right side. That should get the majority of the rest of the cases.

palmipede 01-14-2013 06:46 PM

Re: Computing Pr(f(x) != g(x))
 
The distinction between exact and stochastic methods is not air tight because exact methods often have to take care of round-off errors and these may accumulate to noticeable quantities.

Complexity is the main issue in my opinion, the more complex (whether the code or the data) the more likely an implementation is buggy and the longer it takes to make it correct. I got shot for that reason just today. :o

vbipin 01-14-2013 07:52 PM

Re: Computing Pr(f(x) != g(x))
 
I computed the probabilities by approximating it by taking a large set of points. Here is what I did:

After the run of PLA we have the g function ( in the form of weights )
We also have the f ( since we generated it )
Now create a large number ( L ) of fresh random points in the space
Check how many of these points have f(x) != g(x); call it 'Error'

Compute the probability = Error/L

Note:
I don't know if this method is correct or not.
I don't know what is the value of L that is "sufficiently large";
I used L = 1000000
The computed probabilities will very ( You may want to take an average )
I got the answers correct :) ( May be by luck )

If any of you tried the approximate computing method, can you please share your ideas.

Thanks,
Bipin

palmipede 01-14-2013 08:19 PM

Re: Computing Pr(f(x) != g(x))
 
Hi Bipin,

All methods will give similar probability estimates, when programmed correctly. I prefer exact direct methods because complex stochastic methods tend to be very hard to debug, but the perceptron isn't complex.

I normally code complex algorithms in two different ways in order check the numbers. This time I got sloppy, only implemented a direct method and got spanked for it. It is fair.

You may be able to make your sampling more efficient using recursion which can also be used for relaxing an exact method, but it won't make much of a difference for this assignment even with N=1000.

Suhas Patil 01-14-2013 09:15 PM

Re: Computing Pr(f(x) != g(x))
 
Quote:

Originally Posted by vbipin (Post 8661)
I computed the probabilities by approximating it by taking a large set of points. Here is what I did:

After the run of PLA we have the g function ( in the form of weights )
We also have the f ( since we generated it )
Now create a large number ( L ) of fresh random points in the space
Check how many of these points have f(x) != g(x); call it 'Error'

Compute the probability = Error/L

Note:
I don't know if this method is correct or not.
I don't know what is the value of L that is "sufficiently large";
I used L = 1000000
The computed probabilities will very ( You may want to take an average )
I got the answers correct :) ( May be by luck )

If any of you tried the approximate computing method, can you please share your ideas.

Thanks,
Bipin

This looks the right way....I also got the answers right with this approach.

bwtaylor 01-15-2013 04:24 AM

Re: Computing Pr(f(x) != g(x))
 
I used a uniform grid of points instead of randomly selected points. Specifically, the 40401 points whose coordinates each range from -1.00 to 1.00 by 0.01 steps. The answers vary by orders of magnitude and are an average over N anyway, so this is more than adequate.

gah44 01-15-2013 03:34 PM

Re: Computing Pr(f(x) != g(x))
 
Hmm. You could also do a 2D Gaussian Quadrature.

GQ uses a specific non-uniform set of points that minimizes the error in a 2N degree polynomial.

(For a square, you usually consider each variable separately, but there are also some 2D arrays of points.)


All times are GMT -7. The time now is 03:28 AM.

Powered by vBulletin® Version 3.8.3
Copyright ©2000 - 2020, Jelsoft Enterprises Ltd.
The contents of this forum are to be used ONLY by readers of the Learning From Data book by Yaser S. Abu-Mostafa, Malik Magdon-Ismail, and Hsuan-Tien Lin, and participants in the Learning From Data MOOC by Yaser S. Abu-Mostafa. No part of these contents is to be communicated or made accessible to ANY other person or entity.