LFD Book Forum (http://book.caltech.edu/bookforum/index.php)
-   The Final (http://book.caltech.edu/bookforum/forumdisplay.php?f=138)

 DeanS 09-14-2012 03:29 PM

Q19

I was wondering if the problem assumes that some learning has been done to determine P(D|h=f) for some population or if the person with the heart attack is the only person in D. Obviously, I may not understand Bayesian analysis.

 yaser 09-14-2012 04:37 PM

Re: Q20

Quote:
 Originally Posted by DeanS (Post 5286) I was wondering if the problem assumes that some learning has been done to determine P(D|h=f) for some population or if the person with the heart attack is the only person in D. Obviously, I may not understand Bayesian analysis.
The set D is the set of available data points, so in this case it is that one person with a heart attack. This problem will help you understand the Bayesian reasoning better.

 DeanS 09-15-2012 08:17 AM

Re: Q20

Thank you very much for the quick reply. This has been an amazing course!!

 fgpancorbo 09-16-2012 09:56 PM

Re: Q20

I am still a bit confused about the setup of the problem. Is it correct to assume that what we are trying to determine is the underlying probability of somebody picked at random from the population to have a heart attack out of a single sample? If so, shouldn't P(D|h=f) be relevant? If a single point is all we have (call it the binary variable X, equal to 1 if the patient has a heart attack and 0 if he doesn't), that would be the probability of generating a single point with a patient having a heart attack, given the underlying probability that a person has a heart attack, right? In that case, the posterior is going to have two cases, P(h=f|X=1) and P(h=f|X=0). The question refers only to the case X=1, right?

 yaser 09-16-2012 10:12 PM

Re: Q20

Quote:
 Originally Posted by fgpancorbo (Post 5383) Is it correct to assume that what we are trying to determine is the underlying probability of somebody picked at random from the population to have a heart attack out of a single sample?

It should be "based on" rather than "out of". A source of confusion here is that f is a probability, but then we have a probability distribution over f. Let us just call f the fraction of heart attacks in the population. Then the problem is addressing the probability distribution of that fraction: is the fraction more likely to be 0.1 or 0.5 or 0.9, etc.? The prior is that the fraction is equally likely to be anything (uniform probability). The problem then asks how this probability is modified if we get a sample of a single patient and they happen to have a heart attack.
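
The update described above can be written out as a short derivation. This is only a sketch under the thread's setup: f is the fraction of heart attacks in the population, X = 1 denotes the single sampled patient having a heart attack, and the notation p(f) for the prior density is an assumption introduced here (uniform, so p(f) = 1 on [0, 1]).

```latex
p(f \mid X = 1)
  = \frac{P(X = 1 \mid f)\, p(f)}{\int_0^1 P(X = 1 \mid f')\, p(f')\, df'}
  = \frac{f \cdot 1}{\int_0^1 f'\, df'}
  = \frac{f}{1/2}
  = 2f, \qquad 0 \le f \le 1 .
```

Under these assumptions the posterior is no longer flat: larger fractions become proportionally more probable after the one observed heart attack.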

 fgpancorbo 09-16-2012 10:44 PM

Re: Q20

Quote:
 Originally Posted by yaser (Post 5385) (emphasis added) It should be "based on" rather than "out of". A source of confusion here is that f is a probability, but then we have a probability distribution over f. Let us just call f the fraction of heart attacks in the population. Then the problem is addressing the probability distribution of that fraction: is the fraction more likely to be 0.1 or 0.5 or 0.9, etc.? The prior is that the fraction is equally likely to be anything (uniform probability). The problem then asks how this probability is modified if we get a sample of a single patient and they happen to have a heart attack.
I see. If my understanding is correct, I think that I can safely assume that P(D|h=f), in which D is made of a single random variable, say X, has a Bernoulli distribution whose parameter is the underlying heart-attack probability. Is that right?

 yaser 09-16-2012 10:59 PM

Re: Q20

Quote:
 Originally Posted by fgpancorbo (Post 5392) I see. If my understanding is correct, I think that I can safely assume that P(D|h=f), in which D is made of a single random variable, say X, has a Bernoulli distribution whose parameter is the underlying heart-attack probability. Is that right?
Right. In the problem's notation, that parameter is f.

 ilya19 03-19-2013 08:04 AM

Re: Q20

If I understand the problem correctly, P(X=1) is independent of P(h=f). Correct?

 Haowen 03-19-2013 09:36 AM

Re: Q20

Quote:
 Originally Posted by ilya19 (Post 10027) If I understand the problem correctly, P(X=1) is independent of P(h=f). Correct?
P(X=1) is defined over the full joint distribution, i.e. P(X=1) = sum over h of P(X=1|h) P(h). The h is marginalized out by the summation. However, that doesn't mean that X and h are independent.

The reason why you can ignore P(X=1) is because in Bayesian analysis you usually don't care about the absolute probability of the dataset since it is just a constant that all of your hypotheses are divided by, equally, so it doesn't affect which hypothesis is a-posteriori most probable.
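
A small sketch of both points above: marginalizing h out to get P(X=1), and the fact that dividing by P(X=1) does not change which hypothesis is most probable a-posteriori. The three candidate hypotheses and the non-uniform prior are hypothetical values chosen only for illustration; they are not from the problem.

```python
from fractions import Fraction

# Hypothetical discrete hypothesis set and prior (illustrative values only).
hypotheses = [Fraction(1, 10), Fraction(1, 2), Fraction(9, 10)]
prior = [Fraction(1, 2), Fraction(3, 10), Fraction(1, 5)]

# P(X=1) = sum over h of P(X=1 | h) * P(h), where P(X=1 | h) = h.
p_x1 = sum(h * p for h, p in zip(hypotheses, prior))

# Numerators of Bayes' rule, then the normalized posterior.
unnormalized = [h * p for h, p in zip(hypotheses, prior)]
posterior = [u / p_x1 for u in unnormalized]

# The most probable hypothesis is the same with or without the division.
best_unnormalized = max(range(len(hypotheses)), key=lambda i: unnormalized[i])
best_posterior = max(range(len(hypotheses)), key=lambda i: posterior[i])

print("P(X=1) =", p_x1)
print("most probable h:", hypotheses[best_posterior])
```

In this toy setup P(X=1|h) differs from P(X=1) for every h, which is one way to see that X and h are not independent even though h no longer appears in P(X=1).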

 boulis 03-19-2013 04:09 PM

Re: Q20

There are some fine points here. We have to use terms with their exact meaning; using them loosely can create misunderstandings.
At a first reading I thought that Haowen's answer was not correct, and that ilya's remark was not correct either. On a second reading, Haowen's answer is correct, but I am not sure that it answers the initial question, since the initial question/remark by ilya was ambiguous. Let me explain.

When we talk about independence, we can talk about it in probability terms, where we have specific rules for random variables being independent, or we can talk about it in looser, everyday terms, when we want to express that something affects something else.

Ilya's question is expressed loosely. It talks about independence of probabilities, not of random events. It can be taken with several different meanings.
1) If you are really asking whether X=1 and h=f are independent events, then we can clearly say they are not. The choice of h clearly affects the probability of X=1. More specifically, the choice of h is the probability of X=1.
2) If you are asking whether the distribution of h=f affects P(X=1) for all possible h, (which can be taken as the more literal interpretation of what you are asking) then again: yes there is a connection and Haowen gives you the formula.
3) If you are asking in general "should we care about calculating the value for P(X=1)", then Haowen gives you the answer again.
4) If you are asking whether the event X=1 affects the probability of h=f, it depends on whether you are referring to the a-priori or the a-posteriori probability. It does not affect the a-priori and it does affect the a-posteriori (and Q20 asks how).

 sptripathi 06-09-2013 10:33 AM

Re: Q20

I need help verifying whether the understanding below is correct.

The Bayesian:
P(h=f | D) = P(D | h=f) * P(h=f) / P(D)

For this Q, we are given:
P(h=f) is uniform in [0,1]
D: one-person-with-heart-attack
Pick f = c (constant)

To simplify, I assume that h and f are discrete random variables with 10 possible values (0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0),
each equally likely with P=1/10. Essentially, this simplification turns P(h=f), which is actually a pdf, into a pmf.

Now:

P (D | h=f)
= Pr( one-person-with-heart-attack | h=f )
= Probability of one-person-with-heart-attack, given (h=f)
= c

( because if h=f were given, then the Prob of one picked person getting heart-attack is c, as defined by f )

Plug in above to get:
P(h=f | D) = c * P(h=f) / P(D)

Does the above sound correct?
Also, is P(D) = 1 in this case?

Thanks.
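
The discretized setup in this post can be checked numerically. The sketch below takes the post's assumptions at face value (ten equally likely values 0.1, ..., 1.0 and P(D | h=f) = c); the variable names are introduced here for illustration only.

```python
from fractions import Fraction

# The ten candidate values 0.1, 0.2, ..., 1.0 from the post, each with prior 1/10.
values = [Fraction(k, 10) for k in range(1, 11)]
prior = Fraction(1, 10)

# P(D) = sum over c of P(D | h=f=c) * P(h=f=c) = sum over c of c * (1/10).
p_d = sum(c * prior for c in values)

# P(h=f=c | D) = c * P(h=f=c) / P(D) for each candidate c.
posterior = {c: c * prior / p_d for c in values}

print("P(D) =", p_d, "=", float(p_d))
for c in values:
    print(f"c = {float(c):.1f}  posterior = {float(posterior[c]):.4f}")
```

Under this discretization P(D) comes out to 11/20 = 0.55 rather than 1, and the posterior grows in proportion to c.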

 Dorian 06-09-2013 02:15 PM

Re: Q20

I find this exercise simple but very useful. If one thinks of the series of following measurements (1s and 0s for heart attack or not) one can clearly form an idea how this transforms step-by-step from a uniform distribution to a Bernoulli one.

Does this mean that this example represents one of those cases where the initial prior is irrelevant and we can safely use it for learning? Also, is this some form of reinforcement learning?

thanks,
Dorian.

 yaser 06-09-2013 05:23 PM

Re: Q20

Quote:
 Originally Posted by Dorian (Post 11063) Does this mean that this example represents one of those cases where the initial prior is irrelevant and we can safely use it for learning? Also, is this some form of reinforcement learning?
In this case, with a sufficient number of examples, the prior indeed fades away. Noisy examples blur the line between supervised and reinforcement learning somewhat, as the information provided by the output is less definitive than in the noiseless case.
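
One way to see the prior fading is to compare posterior means under two different priors as the sample grows. The sketch below relies on the standard Beta-Bernoulli conjugacy (a uniform prior is Beta(1, 1)); the skewed Beta(20, 2) prior and the 30% observed heart-attack rate are hypothetical choices made only for illustration.

```python
def posterior_mean(a, b, heart_attacks, n):
    """Mean of the Beta(a + k, b + n - k) posterior for a Beta(a, b) prior,
    after observing k = heart_attacks positives among n patients."""
    return (a + heart_attacks) / (a + b + n)


def gap(n):
    """Difference between the posterior means under the two priors."""
    k = 3 * n // 10                         # hypothetical data: 30% heart attacks
    uniform = posterior_mean(1, 1, k, n)    # uniform prior, Beta(1, 1)
    skewed = posterior_mean(20, 2, k, n)    # strongly skewed prior (assumed)
    return abs(uniform - skewed)


for n in (10, 100, 1000, 10000):
    print(f"n = {n:5d}  gap between posterior means = {gap(n):.4f}")
```

With few examples the two priors give very different estimates; as n grows, the gap shrinks toward zero.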

 nkatz 06-10-2013 08:36 PM

Re: Q20

I am very confused by this problem. Perhaps these questions will help:
Is P(D|h=f) a function of D or h or both? It looks to me like it's a function of D, but we need to convert it to a function of h to get the posterior... Is this correct?

 yaser 06-10-2013 09:21 PM

Re: Q20

Quote:
 Originally Posted by nkatz (Post 11097) Is P(D|h=f) a function of D or h or both?
Let us first clarify the notions. The data set D has one data point in it, which is either 1 (heart attack) or 0 (no heart attack). Being a function of D means being a function of that value (1 or 0), so indeed P(D|h=f) is a function of D, and it so happens in this problem that the value is fixed at 1. The probability P(D|h=f) is also a function of h (which happens to be the same as f according to what we are conditioning on).

Since D is fixed, this leaves P(D|h=f) as a function of h only.
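
As a minimal sketch of this point, the likelihood can be written as a function of two arguments and then evaluated with the data value held fixed. The function name and the sample values of h are illustrative assumptions, not part of the problem.

```python
def likelihood(x, h):
    """P(D | h = f) when D is a single observation x
    (1 = heart attack, 0 = no heart attack)."""
    if x not in (0, 1):
        raise ValueError("x must be 0 or 1")
    return h if x == 1 else 1 - h


# With the observation fixed at x = 1, only the dependence on h remains.
for h in (0.1, 0.5, 0.9):
    print(f"h = {h}  P(D | h = f) = {likelihood(1, h)}")
```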

 Elroch 06-11-2013 03:46 AM

Re: Q20

There is an analogy that may be enlightening, which I thought of because of the presentation of the first part of this course.

Suppose you have a large number of urns each containing a large number of black and white balls in varying proportions. You are told how many urns there are with each proportion.

Then you go up to one of the urns and take out a ball which you find is black. The question is how likely it is that specific urn has each particular fraction of black balls.
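
The urn analogy can be worked through exactly for a small case. The sketch below assumes the urn is chosen uniformly at random and the ball is drawn uniformly at random from it; the urn counts are hypothetical numbers picked for illustration.

```python
from fractions import Fraction

# Hypothetical inventory: fraction of black balls -> number of urns with that fraction.
urn_counts = {Fraction(1, 4): 3, Fraction(1, 2): 2, Fraction(3, 4): 1}


def posterior_given_black(urn_counts):
    """Probability of each black-ball fraction, given that one ball drawn
    from a randomly chosen urn turned out to be black."""
    total = sum(urn_counts.values())
    # Prior P(fraction) times likelihood P(black | fraction) = fraction.
    weights = {p: Fraction(n, total) * p for p, n in urn_counts.items()}
    evidence = sum(weights.values())        # P(black), marginalized over urns
    return {p: w / evidence for p, w in weights.items()}


posterior = posterior_given_black(urn_counts)
for p, prob in posterior.items():
    print(f"fraction black = {p}  posterior = {prob}")
```

The urn counts play the role of the prior, and the observed black ball plays the role of the single patient with a heart attack.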
