LFD Book Forum

DeanS 09-14-2012 03:29 PM

Q19
 
I was wondering if the problem assumes that some learning has been done to determine P(D|h=f) for some population or if the person with the heart attack is the only person in D. Obviously, I may not understand Bayesian analysis.

yaser 09-14-2012 04:37 PM

Re: Q20
 
Quote:

Originally Posted by DeanS (Post 5286)
I was wondering if the problem assumes that some learning has been done to determine P(D|h=f) for some population or if the person with the heart attack is the only person in D. Obviously, I may not understand Bayesian analysis.

The set \mathcal{D} is the set of available data points, so in this case it is that one person with a heart attack. This problem will help you understand the Bayesian reasoning better.

DeanS 09-15-2012 08:17 AM

Re: Q20
 
Thank you very much for the quick reply. This has been an amazing course!!

fgpancorbo 09-16-2012 09:56 PM

Re: Q20
 
I am still a bit confused about the setup of the problem. Is it correct to assume that what we are trying to determine is the underlying probability of somebody picked at random from the population to have a heart attack out of a single sample? If so, shouldn't P(\mathcal{D}|h=f) be relevant? If a single point is all we have, call it the binary variable x (equal to 1 if the patient has a heart attack, 0 if he doesn't), that would be the probability of generating a single point with a patient having a heart attack, given the underlying probability that a person has a heart attack, right? In that case, the posterior is going to have two cases, P(h=f|x=1) and P(h=f|x=0). The question refers only to the case P(h=f|x=1), right?

yaser 09-16-2012 10:12 PM

Re: Q20
 
Quote:

Originally Posted by fgpancorbo (Post 5383)
Is it correct to assume that what we are trying to determine is the underlying probability of somebody picked at random from the population to have a heart attack out of a single sample?

(emphasis added)

It should be "based on" rather than "out of". A source of confusion here is that f is a probability, but then we have a probability distribution over f. Let us just call f the fraction of heart attacks in the population. Then the problem addresses the probability distribution of that fraction: is the fraction more likely to be 0.1, or 0.5, or 0.9, etc.? The prior is that the fraction is equally likely to be anything (uniform probability). The problem then asks how this probability is modified if we get a sample of a single patient and they happen to have a heart attack.

If I have not answered your question, please ask again perhaps in those terms.

fgpancorbo 09-16-2012 10:44 PM

Re: Q20
 
Quote:

Originally Posted by yaser (Post 5385)
(emphasis added)
It should be "based on" rather than "out of". A source of confusion here is that f is a probability, but then we have a probability distribution over f. Let us just call f the fraction of heart attacks in the population. Then the problem addresses the probability distribution of that fraction: is the fraction more likely to be 0.1, or 0.5, or 0.9, etc.? The prior is that the fraction is equally likely to be anything (uniform probability). The problem then asks how this probability is modified if we get a sample of a single patient and they happen to have a heart attack.

I see. If my understanding is correct, I think that I can safely assume that P(\mathcal{D}|h=f), in which \mathcal{D} is made of a single random variable say x, has a Bernoulli distribution with parameter p=f. Is that right?

yaser 09-16-2012 10:59 PM

Re: Q20
 
Quote:

Originally Posted by fgpancorbo (Post 5392)
I see. If my understanding is correct, I think that I can safely assume that P(\mathcal{D}|h=f), in which \mathcal{D} is made of a single random variable say x, has a Bernoulli distribution with parameter p=f. Is that right?

Right. In terms of h, that would be parameter p=h.
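
For readers following along, the Bernoulli likelihood confirmed above can be combined with the uniform prior described earlier in the thread. The lines below are only a sketch under that assumption (prior density P(h) = 1 on [0, 1]); the symbol h' is a dummy integration variable introduced here for illustration.

```latex
% Likelihood of a single observation x in {0, 1}, given h:
P(\mathcal{D} \mid h) = h^{x}\,(1-h)^{1-x}

% Bayes' rule for x = 1, with the uniform prior P(h) = 1 on [0, 1]:
P(h \mid x=1)
  = \frac{P(x=1 \mid h)\,P(h)}{\int_0^1 P(x=1 \mid h')\,P(h')\,dh'}
  = \frac{h \cdot 1}{\int_0^1 h'\,dh'}
  = \frac{h}{1/2}
  = 2h, \qquad 0 \le h \le 1
```

Under these assumptions the denominator is P(x=1) = 1/2, and the posterior density puts more weight on larger values of h than the flat prior did.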

ilya19 03-19-2013 08:04 AM

Re: Q20
 
If I understand the problem correctly, P(X=1) is independent of P(h=f). Correct?

Haowen 03-19-2013 09:36 AM

Re: Q20
 
Quote:

Originally Posted by ilya19 (Post 10027)
If I understand the problem correctly, P(X=1) is independent of P(h=f). Correct?

P(X=1) is defined over the full joint distribution, i.e. P(X=1)=\sum_h P(X=1,h). The h is marginalized out by the summation. However, this doesn't mean that X and h are independent.

You can ignore P(X=1) because in Bayesian analysis you usually don't care about the absolute probability of the dataset: it is just a constant that all of your hypotheses are divided by, equally, so it doesn't affect which hypothesis is a-posteriori most probable.
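
A small numerical sketch may make the point about the normalizing constant concrete. It assumes the uniform prior and Bernoulli likelihood discussed in this thread, discretizes h on a grid, and uses a trapezoid rule for the integral that replaces the sum when h is continuous; the variable names and the grid size are illustrative choices, not part of the original problem.

```python
# Candidate values for h on a grid over [0, 1] (grid size is an arbitrary choice).
n = 1000
grid = [i / n for i in range(n + 1)]

prior = [1.0 for _ in grid]        # assumed uniform prior density on [0, 1]
likelihood = [h for h in grid]     # P(X=1 | h) = h for a Bernoulli(h) sample
unnormalized = [l * p for l, p in zip(likelihood, prior)]

# Trapezoid-rule approximation of P(X=1), the integral of likelihood * prior.
evidence = sum((unnormalized[i] + unnormalized[i + 1]) / 2 for i in range(n)) / n
posterior = [u / evidence for u in unnormalized]

# Index of the most probable h, with and without dividing by P(X=1).
best_unnormalized = max(range(n + 1), key=lambda i: unnormalized[i])
best_posterior = max(range(n + 1), key=lambda i: posterior[i])

print(evidence, grid[best_unnormalized], grid[best_posterior])
```

Dividing by `evidence` rescales every grid point by the same factor, so the two argmax indices coincide; the constant matters only if you need the posterior to integrate to 1.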

boulis 03-19-2013 04:09 PM

Re: Q20
 
There are some fine points here. We have to use terms with their exact meaning; using them loosely can create misunderstandings.
At a first reading I thought that Haowen's answer was not correct, and that ilya's remark was not correct either. On a second reading, Haowen's answer is correct, but I am not sure that it answers the initial question, since the initial question/remark by ilya was ambiguous. Let me explain.

When we talk about independence, we can talk about it in probability terms, where we have specific rules for random variables being independent, or we can talk about it in looser, everyday terms, when we want to express that something affects something else.

Ilya's question is expressed loosely. It talks about independence of probabilities, not of random events. It can be taken with several different meanings.
1) If you are really asking whether X=1 and h=f are independent events, then we can clearly say they are not. The choice of h clearly affects the probability of X=1. More specifically, the chosen h is exactly the probability of X=1 given that h.
2) If you are asking whether the distribution over h affects P(X=1) (which can be taken as the more literal interpretation of what you are asking), then again yes: there is a connection, and Haowen gives you the formula.
3) If you are asking in general "should we care about calculating the value of P(X=1)", then Haowen gives you the answer again.
4) If you are asking whether the event X=1 affects the probability of h=f, it depends on whether you are referring to the a-priori or the a-posteriori probability. It does not affect the a-priori, and it does affect the a-posteriori (and Q20 asks how).
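
Points 1 and 4 above can be checked with a quick simulation. This is only a sketch under the thread's assumptions (h drawn from a uniform prior, then a single patient X drawn as Bernoulli(h)); the sample size, the seed, and the threshold 0.5 used to define an event about h are arbitrary illustrative choices.

```python
import random

random.seed(0)      # fixed seed so the run is repeatable
n = 200_000         # number of simulated (h, X) pairs; arbitrary choice

# Draw h from the assumed uniform prior, then one patient X ~ Bernoulli(h).
samples = []
for _ in range(n):
    h = random.random()
    x = 1 if random.random() < h else 0
    samples.append((h, x))

# Point 1: P(X=1) versus P(X=1 | h > 0.5). If X and h were independent,
# these two estimates would agree up to sampling noise.
p_x1 = sum(x for _, x in samples) / n
high = [x for h, x in samples if h > 0.5]
p_x1_given_high = sum(high) / len(high)

# Point 4: the mean of h before and after conditioning on X=1.
prior_mean = sum(h for h, _ in samples) / n
h_given_x1 = [h for h, x in samples if x == 1]
posterior_mean = sum(h_given_x1) / len(h_given_x1)

print(p_x1, p_x1_given_high, prior_mean, posterior_mean)
```

The estimates of P(X=1) and P(X=1 | h > 0.5) differ clearly, which is the dependence described in point 1, and the mean of h shifts upward once we condition on X=1, which is the a-posteriori effect described in point 4.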

