LFD Book Forum (http://book.caltech.edu/bookforum/index.php)
-   Chapter 1 - The Learning Problem (http://book.caltech.edu/bookforum/forumdisplay.php?f=108)
-   -   Exercise 1.11 (http://book.caltech.edu/bookforum/showthread.php?t=4472)

tatung2112 02-07-2014 06:07 AM

Exercise 1.11
 
Thank you, Prof. Yaser. Your book is really easy to follow. I started it just a week ago, and I am trying to finish every exercise in the book.

About Exercise 1.11, I don't know where to check the answers, so I am posting them here. Could you please tell me whether my answers are right or wrong? Is there any place where I can check my exercise answers by myself?

Ex 1.11:
Dataset D of 25 training examples.
X = R, Y = {-1, +1}
H = {h1, h2} where h1 = +1, h2 = -1
Learning algorithms:
S - choose the hypothesis that agrees the most with D
C - choose the other hypothesis deliberately
P[f(x) = +1] = p

(a) Can S produce a hypothesis that is guaranteed to perform better than random on any point outside D?
Answer: No. Outside D, f can take on any value, so no hypothesis can be guaranteed to beat random guessing on points outside D.

Assume in the following that all examples in D have y_n = +1.
(b) Is it possible that the hypothesis that C produces turns out to be better than the hypothesis that S produces?
Answer: Yes. For example, if p is small, h_2 is the better hypothesis outside D even though every training example agrees with h_1.

(c) If p = 0.9, what is the probability that S will produce a better hypothesis than C?
Answer: P[P(S_y = f) > P(C_y = f)], where S_y is the output hypothesis of S and C_y is the output hypothesis of C.
+ Since all y_n = +1, S_y = +1. Moreover, P[f(x) = +1] = 0.9, so P(S_y = f) = 0.9.
+ We have P(C_y = +1) = 0.5, P(C_y = -1) = 0.5, P[f(x) = +1] = 0.9, P[f(x) = -1] = 0.1,
so P[C_y = f] = 0.5*0.9 + 0.5*0.1 = 0.5.
Since 0.9 > 0.5, P[P(S_y = f) > P(C_y = f)] = 1.

(d) Is there any value of p for which it is more likely than not that C will produce a better hypothesis than S?
Answer: p < 0.5
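
Reasoning: since all y_n = +1, S always picks h_1 and C always picks h_2, so

E_{out}(h_1) = P[f(x) = -1] = 1 - p, \qquad E_{out}(h_2) = P[f(x) = +1] = p,

and C produces the better hypothesis exactly when p < 1 - p, i.e. when p < 0.5.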

I am not sure whether my answers to (a) and (c) are in conflict.

Thank You and Best Regards,

yaser 02-08-2014 05:17 AM

Re: Exercise 1.11
 
Quote:

Originally Posted by tatung2112 (Post 11644)
I am not sure whether my answers to (a) and (c) are in conflict.

Your answers to (a) and (c) are both correct. They are not in conflict since (a) is asking a deterministic question while (c) is asking a probabilistic question.

tatung2112 02-12-2014 04:50 PM

Re: Exercise 1.11
 
Prof. Yaser, thank you very much for your reply. I will keep studying. Thank you!

BojanVujatovic 07-10-2014 03:31 PM

Re: Exercise 1.11
 
Quote:

Originally Posted by tatung2112 (Post 11644)
(c) If p = 0.9, what is the probability that S will produce a better hypothesis than C?
Answer: P[P(S_y = f) > P(C_y = f)], where S_y is the output hypothesis of S and C_y is the output hypothesis of C.
+ Since all y_n = +1, S_y = +1. Moreover, P[f(x) = +1] = 0.9, so P(S_y = f) = 0.9.
+ We have P(C_y = +1) = 0.5, P(C_y = -1) = 0.5, P[f(x) = +1] = 0.9, P[f(x) = -1] = 0.1,
so P[C_y = f] = 0.5*0.9 + 0.5*0.1 = 0.5.
Since 0.9 > 0.5, P[P(S_y = f) > P(C_y = f)] = 1.

Can you please elaborate on why P(C_y = -1) = 0.5? I cannot understand that part.
Here is my reasoning for part (c): the event "S produces a better hypothesis than C" means that E_{out}(S(\mathcal{D})) is smaller than E_{out}(C(\mathcal{D})), so

P\left[E_{out}(S(\mathcal{D})) < E_{out}(C(\mathcal{D}))\right]
= P\left[E_{out}(h_1) < E_{out}(h_2)\right]
= P\left[P\left[f(x) \neq h_1\right] < P\left[f(x) \neq h_2\right]\right]
= P\left[P\left[f(x) = -1\right] < P\left[f(x) = +1\right]\right]
= P\left[1 - p < p\right] = P\left[0.1 < 0.9\right] = 1

Andrew87 12-17-2015 08:57 AM

Re: Exercise 1.11
 
Hi,

According to the first post, the answer to question (d) is p < 0.5, but I can't understand why.

Intuitively, my answer is that there is no value of p that makes C probabilistically better than S. That is because S tries to minimize the error on the training data, which should reflect the true distribution. In this case, C does better than S only if (the majority of the examples are +1 GIVEN p < 0.5) OR (the majority of the examples are -1 GIVEN p > 0.5). However, both cases are less probable than the ones for which S works better. As a result, there is no value of p that reverses the situation.
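
One way to check this intuition is a quick simulation. Here is a minimal sketch in Python, under the "random D" reading where all 25 labels are drawn i.i.d. with P[y = +1] = p (the function name prob_C_better is just illustrative):

import random

def prob_C_better(p, n=25, trials=100_000):
    """Estimate P[C's hypothesis beats S's] when the n training labels
    are drawn i.i.d. with P[y = +1] = p."""
    wins = 0
    for _ in range(trials):
        plus = sum(random.random() < p for _ in range(n))  # number of +1 labels in D
        s_pick = +1 if 2 * plus >= n else -1  # S agrees with the majority of D
        c_pick = -s_pick                      # C deliberately picks the other hypothesis
        e_s = (1 - p) if s_pick == +1 else p  # E_out(h) = P[f(x) != h]
        e_c = (1 - p) if c_pick == +1 else p
        wins += e_c < e_s                     # count strict improvements only
    return wins / trials

for p in (0.1, 0.3, 0.5, 0.7, 0.9):
    print(p, prob_C_better(p))  # in this sketch the estimate stays below 0.5 for every p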

Am I right?

MaciekLeks 02-02-2016 05:47 AM

Re: Exercise 1.11
 
Quote:

Originally Posted by Andrew87 (Post 12229)
Hi,

According to the first post, the answer to question (d) is p < 0.5, but I can't understand why.

Intuitively, my answer is that there is no value of p that makes C probabilistically better than S. That is because S tries to minimize the error on the training data, which should reflect the true distribution. In this case, C does better than S only if (the majority of the examples are +1 GIVEN p < 0.5) OR (the majority of the examples are -1 GIVEN p > 0.5). However, both cases are less probable than the ones for which S works better. As a result, there is no value of p that reverses the situation.

Am I right?

Referring to point (d): the crucial part is the assumption that all y_n = +1 (see point (b)); then S always chooses h_1 and C always chooses h_2.


MaciekLeks 02-03-2016 05:21 AM

Re: Exercise 1.11
 
"(a) Can S produce a hypothesis that is guaranteed to perform better than random on any point outside D?"

Can anyone give me some tips on this part of the exercise:
(1) Should we calculate something to be sure that S does or does not guarantee beating a random result? If so, any tip on how to deal with this deterministic question is appreciated.
(2) Does "any point" in this context mean "every point" or "some point"?

henry2015 05-29-2016 08:44 AM

Re: Exercise 1.11
 
Quote:

Originally Posted by yaser (Post 11645)
Your answers to (a) and (c) are both correct. They are not in conflict since (a) is asking a deterministic question while (c) is asking a probabilistic question.

For part c, I thought:

Given p = 0.9, h1 is a better hypothesis than h2.

Hence, the probability that S produces a better hypothesis than C is essentially the probability that S picks h1, since C will pick whichever hypothesis S doesn't pick.

In other words, P[S produces a better hypothesis than C] = P[S picks h1 based on the 25 training examples].

S will pick h1 if 13 or more of the 25 training examples give +1, so we have:
P[S picks h1]
= P[13 or more out of 25 training examples give +1]
= \sum_{k = 13}^{25}\binom{25}{k}(.9)^{k}(.1)^{25-k}
= 0.9999998379165839813935344
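
This tail sum is easy to check numerically, for instance with a few lines of Python (a minimal standard-library sketch):

from math import comb

# P[13 or more of 25 examples give +1] when P[y = +1] = 0.9,
# i.e. the probability that S picks h1 in the reading above.
p = 0.9
prob_S_picks_h1 = sum(comb(25, k) * p**k * (1 - p)**(25 - k) for k in range(13, 26))
print(prob_S_picks_h1)  # ~0.9999998379, matching the value above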

It is quite different from tatung2112's explanation for c.

Could you comment further?

Thanks!

henry2015 06-04-2016 07:41 AM

Re: Exercise 1.11
 
Quote:

Originally Posted by henry2015 (Post 12373)
For part c, I thought:

Given p = 0.9, h1 is a better hypothesis than h2.

Hence, the probability that S produces a better hypothesis than C is essentially the probability that S picks h1, since C will pick whichever hypothesis S doesn't pick.

In other words, P[S produces a better hypothesis than C] = P[S picks h1 based on the 25 training examples].

S will pick h1 if 13 or more of the 25 training examples give +1, so we have:
P[S picks h1]
= P[13 or more out of 25 training examples give +1]
= \sum_{k = 13}^{25}\binom{25}{k}(.9)^{k}(.1)^{25-k}
= 0.9999998379165839813935344

It is quite different from tatung2112's explanation for c.

Could you comment further?

Thanks!

I just noticed that the formula in my post is actually one form of the formula in Problem 1.7...

Now, I am even more confused.

htlin 06-04-2016 04:04 PM

Re: Exercise 1.11
 
Quote:

Originally Posted by henry2015 (Post 12382)
I just noticed that the formula in my post is actually one form of the formula in Problem 1.7...

Now, I am even more confused.

I think henry2015's detailed steps are the right way to go, while Professor Yaser's earlier comments were just highlighting that (a) and (c) do not conflict with each other. Thanks for asking.

