LFD Book Forum  

Go Back   LFD Book Forum > Course Discussions > Online LFD course > Homework 1

Reply
 
Thread Tools Display Modes
  #1  
Old 07-17-2012, 04:24 PM
nellie nellie is offline
Junior Member
 
Join Date: Jul 2012
Posts: 4
Post Probability calculation in Q8 and Q10

I correctly approximated the P(f(x) != g(x)) by generating a lot of sample points. However, how do you calculate the probability exactly? Thank you in advance for your help!
Reply With Quote
  #2  
Old 07-17-2012, 07:22 PM
yaser's Avatar
yaser yaser is offline
Caltech
 
Join Date: Aug 2009
Location: Pasadena, California, USA
Posts: 1,477
Default Re: Probability calculation in Q8 and Q10

Quote:
Originally Posted by nellie View Post
I correctly approximated the P(f(x) != g(x)) by generating a lot of sample points. However, how do you calculate the probability exactly? Thank you in advance for your help!
You can compute the total area of misclassification divided by 4 (the total area of the input space). The area of misclassifcation will be that of points for which f and g give different values. In one of the theory lectures, there will be a figure depicting this area. It is easier to compute it numerically as you did.
__________________
Where everyone thinks alike, no one thinks very much
Reply With Quote
  #3  
Old 10-10-2012, 08:19 AM
ketchers ketchers is offline
Junior Member
 
Join Date: Oct 2012
Posts: 5
Default Re: Probability calculation in Q8 and Q10

I would be curious on how to calculate - or estimate - the average number of iterations for convergence - as well as the probability - perhaps these are really the same question in disguise. It is a very simple procedure, but there is enough random elements to it so that it leaves the realm of what I am familiar with calculating:

As I understand it ...

1) choose N random points in the unit square
2) choose a random line

This gives the target

3) start with w(0) = [0;0]
4) run algorithm where in computing w(n+1) a random selection from the mis-classified points is used. (perhaps the random choice here is not-relevant)


A pointer to where a similar calculation is made would be fine. Thanks!
Reply With Quote
  #4  
Old 10-10-2012, 08:49 AM
yaser's Avatar
yaser yaser is offline
Caltech
 
Join Date: Aug 2009
Location: Pasadena, California, USA
Posts: 1,477
Default Re: Probability calculation in Q8 and Q10

Quote:
Originally Posted by ketchers View Post
I would be curious on how to calculate - or estimate - the average number of iterations for convergence - as well as the probability
Calculating these quantities analytically is not tractable. Estimating them using Monte Carlo methods, i.e., by running many random instances of the problem and averaging, is what we are after here. As you point out, there are many sources of randomness and some will result in significant variation. However, repeating the experiment a large number of times will overcome that variance. The numbers given for this problem were chosen to achieve that.
__________________
Where everyone thinks alike, no one thinks very much
Reply With Quote
  #5  
Old 10-15-2012, 01:59 PM
gah44 gah44 is offline
Invited Guest
 
Join Date: Jul 2012
Location: Seattle, WA
Posts: 153
Default Re: Probability calculation in Q8 and Q10

Quote:
Originally Posted by yaser View Post
Calculating these quantities analytically is not tractable. Estimating them using Monte Carlo methods, i.e., by running many random instances of the problem and averaging, is what we are after here. As you point out, there are many sources of randomness and some will result in significant variation. However, repeating the experiment a large number of times will overcome that variance. The numbers given for this problem were chosen to achieve that.
It should be possible to give an order of magnitude estimate, though if that was easy then you wouldn't have to write the program to answer the problem.

It looks a little similar to the relaxation algorithms for solving partial differential equations. In that case, the theory is well studied.

Even without knowing that, you could find the scaling law. How the number of iterations changes, on the average, with the number of points. But once you have the MC program, it is easy enough to change the number of points.
Reply With Quote
  #6  
Old 10-15-2012, 02:04 PM
gah44 gah44 is offline
Invited Guest
 
Join Date: Jul 2012
Location: Seattle, WA
Posts: 153
Default Re: Probability calculation in Q8 and Q10

Quote:
Originally Posted by yaser View Post
You can compute the total area of misclassification divided by 4 (the total area of the input space). The area of misclassifcation will be that of points for which f and g give different values. In one of the theory lectures, there will be a figure depicting this area. It is easier to compute it numerically as you did.
The area is a sum of two figures that are either triangles or quadrilaterals. You have to figure out the intersection of each of the lines with the region boundary, then remember which is in and which is out, and compute the area of the triangle or quadrilateral. Getting all the different possible cases right is what would take time. (I didn't try actually doing it, just figuring out how I might do it.)
Reply With Quote
  #7  
Old 04-06-2013, 06:12 PM
mvellon mvellon is offline
Junior Member
 
Join Date: Apr 2013
Posts: 9
Default Re: Probability calculation in Q8 and Q10

Your value of g, assuming it doesn't exactly match f, will be a line with a slope and intercept close, though not exactly that of f. Imagine then, two intersecting lines. The area in between the lines correspond to values in X that g misclassifies. The ratio between that area and that bounded by the region [-1,-1]x[1,1] (area 4) will be the probability of error. Alas, calculating the area between the lines is ugly. The problem would have been much simpler if the domain for X had been points inside the circle at (0,0) with radius 1.
Reply With Quote
  #8  
Old 09-02-2015, 04:06 AM
henry2015 henry2015 is offline
Member
 
Join Date: Aug 2015
Posts: 31
Default Re: Probability calculation in Q8 and Q10

Quote:
Originally Posted by yaser View Post
You can compute the total area of misclassification divided by 4 (the total area of the input space). The area of misclassifcation will be that of points for which f and g give different values. In one of the theory lectures, there will be a figure depicting this area. It is easier to compute it numerically as you did.
Hi, could you elaborate why we need to divide the total area of misclassification by 4?

If the input space is 2D, I can see that there are two triangles formed by f(x) and g(x), and the total area of the two triangles include all the errors. So I thought the P(f(x) != g(x)) = area of error / total area of input space...no?

Thanks in advance.
Reply With Quote
  #9  
Old 09-02-2015, 09:22 PM
yaser's Avatar
yaser yaser is offline
Caltech
 
Join Date: Aug 2009
Location: Pasadena, California, USA
Posts: 1,477
Default Re: Probability calculation in Q8 and Q10

Quote:
Originally Posted by henry2015 View Post
Hi, could you elaborate why we need to divide the total area of misclassification by 4?
The total area of the input space is 4 and that corresponds to total probability 1, so one needs to normalize areas by 4 in order to get the corresponding probabilities.
__________________
Where everyone thinks alike, no one thinks very much
Reply With Quote
  #10  
Old 09-02-2015, 09:48 PM
henry2015 henry2015 is offline
Member
 
Join Date: Aug 2015
Posts: 31
Default Re: Probability calculation in Q8 and Q10

Argh, I see.

Missed the boundaries stated in the problem. My bad.

Thanks.
Reply With Quote
Reply

Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off

Forum Jump


All times are GMT -7. The time now is 11:30 PM.


Powered by vBulletin® Version 3.8.3
Copyright ©2000 - 2019, Jelsoft Enterprises Ltd.
The contents of this forum are to be used ONLY by readers of the Learning From Data book by Yaser S. Abu-Mostafa, Malik Magdon-Ismail, and Hsuan-Tien Lin, and participants in the Learning From Data MOOC by Yaser S. Abu-Mostafa. No part of these contents is to be communicated or made accessible to ANY other person or entity.