LFD Book Forum  

  #1  
02-07-2014, 06:07 AM
tatung2112
Junior Member

Join Date: Feb 2014
Posts: 4
Exercise 1.11

Thank you Prof. Yaser. Your book is really easy to follow. I started it a week ago and I am trying to finish every exercise in the book.

I don't know where to check my answers to Exercise 1.11, so I am posting them here. Could you please tell me whether they are right or wrong? Is there any place where I can check my answers to the exercises by myself?

Ex 1.11:
Dataset D of 25 training examples.
X = R, Y = {-1, +1}
H = {h1, h2}, where h1 is the constant +1 function and h2 is the constant -1 function
Learning algorithms:
S (smart) - chooses the hypothesis that agrees the most with D
C (crazy) - deliberately chooses the other hypothesis
P[f(x) = +1] = p

(a) Can S produce a hypothesis that is guaranteed to perform better than random on any point outside D?
Answer: No

Assume for the rest of the exercise that all the examples in D have yn = +1.
(b) Is it possible that the hypothesis that C produces turns out to be better than the hypothesis that S produces?
Answer: Yes

(c) If p = 0.9, what is the probability that S will produce a better hypothesis than C?
Answer: P[P(Sy = f) > P(Cy = f)] where Sy is the output hypothesis of S, Cy is the output hypothesis of C
+ Since yn = +1, Sy = +1. Moreover, P[f(x) = +1] = 0.9 --> P(Sy = f) = 0.9
+ We have, P(Cy = +1) = 0.5, P(Cy = -1) = 0.5, P[f(x) = +1] = 0.9, P[f(x) = -1] = 0.1
--> P[Cy = f] = 0.5*0.9 + 0.5*0.1 = 0.5
Since 0.9 > 0.5, P[P(Sy = f) > P(Cy = f)] = 1

(d) Is there any value of p for which it is more likely than not that C will produce a better hypothesis than S?
Answer: p < 0.5

I am not sure whether my answers to (a) and (c) conflict.

Thank You and Best Regards,
  #2  
02-08-2014, 05:17 AM
yaser
Caltech

Join Date: Aug 2009
Location: Pasadena, California, USA
Posts: 1,472
Re: Exercise 1.11

Quote:
Originally Posted by tatung2112
I am not sure whether my answers to (a) and (c) conflict.
Your answers to (a) and (c) are both correct. They are not in conflict since (a) is asking a deterministic question while (c) is asking a probabilistic question.
__________________
Where everyone thinks alike, no one thinks very much
  #3  
02-12-2014, 04:50 PM
tatung2112
Junior Member

Join Date: Feb 2014
Posts: 4
Re: Exercise 1.11

Prof. Yaser, thank you very much for your reply. I will keep studying. Thank you!
  #4  
07-10-2014, 03:31 PM
BojanVujatovic
Member

Join Date: Jan 2013
Posts: 13
Re: Exercise 1.11

Quote:
Originally Posted by tatung2112
(c) If p = 0.9, what is the probability that S will produce a better hypothesis than C?
Answer: P[P(Sy = f) > P(Cy = f)] where Sy is the output hypothesis of S, Cy is the output hypothesis of C
+ Since yn = +1, Sy = +1. Moreover, P[f(x) = +1] = 0.9 --> P(Sy = f) = 0.9
+ We have, P(Cy = +1) = 0.5, P(Cy = -1) = 0.5, P[f(x) = +1] = 0.9, P[f(x) = -1] = 0.1
--> P[Cy = f] = 0.5*0.9 + 0.5*0.1 = 0.5
Since 0.9 > 0.5, P[P(Sy = f) > P(Cy = f)] = 1
Can you please elaborate on why P(Cy = -1) = 0.5? I cannot understand that part.
Here is my reasoning for part (c): the event "S produces a better hypothesis than C" means that E_{out}(S(\mathcal{D})) is smaller than E_{out}(C(\mathcal{D})), so

P\left[E_{out}(S(\mathcal{D})) < E_{out}(C(\mathcal{D}))\right]
= P\left[E_{out}(h_1) < E_{out}(h_2)\right]
= P\left[P\left[f(x) \neq h_1(x)\right] < P\left[f(x) \neq h_2(x)\right]\right]
= P\left[P\left[f(x) = -1\right] < P\left[f(x) = +1\right]\right]
= P\left[1 - p < p\right] = P\left[0.1 < 0.9\right] = 1
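
A minimal numeric sketch of this comparison, assuming (as in part (b)) that every y_n = +1, so S returns h1 and C returns h2. The helper name `e_out_constant` is hypothetical and does not come from the book:

```python
def e_out_constant(h_value: int, p: float) -> float:
    """Out-of-sample error of the constant hypothesis h(x) = h_value
    when the target satisfies P[f(x) = +1] = p."""
    if h_value not in (+1, -1):
        raise ValueError("h_value must be +1 or -1")
    # h = +1 is wrong exactly when f(x) = -1; h = -1 is wrong when f(x) = +1.
    return 1.0 - p if h_value == +1 else p


p = 0.9
e_h1 = e_out_constant(+1, p)  # S's pick when every y_n = +1
e_h2 = e_out_constant(-1, p)  # C's pick (the other hypothesis)
s_is_better = e_h1 < e_h2     # a fixed fact once p is fixed, not a random event
print(e_h1, e_h2, s_is_better)
```

Once D is fixed to all +1 labels, nothing random is left in the comparison, which is why the probability in this reading is either 0 or 1.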
  #5  
12-17-2015, 08:57 AM
Andrew87
Junior Member

Join Date: Feb 2015
Posts: 6
Re: Exercise 1.11

Hi,

Regarding the first post, I can't understand why the answer to question (d) is p < 0.5.

Intuitively, my answer is that there is no value of p that makes C probabilistically better than S. That is because S tries to minimize the error on the training data, which should reflect the true distribution. In this case, C does better than S only if
(the majority of the examples are +1 GIVEN p < 0.5) OR (the majority of the examples are -1 GIVEN p > 0.5). However, both cases are less probable than the ones in which S works better. As a result, there is no value of p that reverses the situation.

Am I right?
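
One way to test this intuition numerically is sketched below. It assumes that the 25 labels in D are drawn i.i.d. with P[y_n = +1] = p, which is the reading used in this post; it deliberately does not condition on all y_n = +1 as part (b) of the exercise does. The function name `prob_s_beats_c` and the grid of p values are illustrative choices only:

```python
from math import comb


def prob_s_beats_c(p: float, n: int = 25) -> float:
    """P[S's pick has strictly lower E_out than C's pick] for i.i.d. D, n odd."""
    if p == 0.5:
        return 0.0  # both hypotheses have E_out = 0.5, so neither is better
    threshold = n // 2 + 1  # majority of +1 labels: 13 when n = 25
    prob_majority_plus = sum(
        comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(threshold, n + 1)
    )
    # h1 is the better hypothesis when p > 0.5; S picks h1 on a +1 majority.
    return prob_majority_plus if p > 0.5 else 1.0 - prob_majority_plus


grid = [i / 100 for i in range(101)]
# For p != 0.5 exactly one of S, C is better, so P[C better] = 1 - P[S better].
c_more_likely_better = [
    p for p in grid if p != 0.5 and 1.0 - prob_s_beats_c(p) > 0.5
]
print(c_more_likely_better)
```

Under this unconditioned reading the list comes out empty on the grid, which matches the intuition above. The result depends entirely on the i.i.d. assumption stated in the lead-in.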
  #6  
02-02-2016, 05:47 AM
MaciekLeks
Member

Join Date: Jan 2016
Location: Katowice, Upper Silesia, Poland
Posts: 17
Re: Exercise 1.11

Quote:
Originally Posted by Andrew87
Hi,

Regarding the first post, I can't understand why the answer to question (d) is p < 0.5.

Intuitively, my answer is that there is no value of p that makes C probabilistically better than S. That is because S tries to minimize the error on the training data, which should reflect the true distribution. In this case, C does better than S only if
(the majority of the examples are +1 GIVEN p < 0.5) OR (the majority of the examples are -1 GIVEN p > 0.5). However, both cases are less probable than the ones in which S works better. As a result, there is no value of p that reverses the situation.

Am I right?
Referring to point (d): the crucial part is the assumption that y_n = +1 for every example in D (see point (b)). Under that assumption, S always chooses h_1 and C always chooses h_2.
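
With S fixed to h_1 and C fixed to h_2, the comparison in (d) reduces to a single inequality. A short worked version, assuming E_{out}(h) = P[f(x) \neq h(x)] and P[f(x) = +1] = p as stated in the exercise:

```latex
E_{out}(h_1) = P\left[f(x) = -1\right] = 1 - p, \qquad
E_{out}(h_2) = P\left[f(x) = +1\right] = p

E_{out}(h_2) < E_{out}(h_1) \iff p < 1 - p \iff p < \tfrac{1}{2}
```

So, under the all-(+1) assumption, C's hypothesis is the better one exactly when p < 0.5, and the two tie at p = 0.5.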

  #7  
02-03-2016, 05:21 AM
MaciekLeks
Member

Join Date: Jan 2016
Location: Katowice, Upper Silesia, Poland
Posts: 17
Re: Exercise 1.11

"(a) Can S produce a hypothesis that is guaranteed to perform better than random on any point outside D?"

Can anyone give me some tips on this part of the exercise:
(1) Should we calculate something to show whether S is guaranteed (or not guaranteed) to beat a random result? If so, any tip on how to handle this deterministic question is appreciated.
(2) Does "any point" in this context mean "every point" or "some point"?
  #8  
05-29-2016, 08:44 AM
henry2015
Member

Join Date: Aug 2015
Posts: 29
Re: Exercise 1.11

Quote:
Originally Posted by yaser
Your answers to (a) and (c) are both correct. They are not in conflict since (a) is asking a deterministic question while (c) is asking a probabilistic question.
For part c, I thought:

Given p = 0.9, h1 is a better hypothesis than h2.

Hence, the probability that S produces a better hypothesis than C is essentially the probability that S picks h1, since C will pick whichever hypothesis S doesn't pick.

In other words, P[S produces a better hypothesis than C] = P[S picks h1 based on the 25 training examples].

S will pick h1 if at least 13 out of 25 training examples give +1, so we will have:
P[S picks h1]
= P[13 or more out of 25 training examples give +1]
= \sum_{k = 13}^{25}\binom{25}{k}(.9)^{k}(.1)^{25-k}
= 0.9999998379165839813935344
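
The sum can be checked exactly with a short script. This is only a sketch: it assumes, as this post does, that the 25 labels are drawn i.i.d. with P[y_n = +1] = p (it does not condition on all y_n = +1), and the function name `prob_s_picks_h1` is made up for illustration:

```python
from fractions import Fraction
from math import comb


def prob_s_picks_h1(p: Fraction, n: int = 25) -> Fraction:
    """P[more than half of n i.i.d. labels are +1] when P[y = +1] = p (n odd)."""
    threshold = n // 2 + 1  # 13 when n = 25
    return sum(
        (comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(threshold, n + 1)),
        Fraction(0),
    )


exact = prob_s_picks_h1(Fraction(9, 10))  # exact rational value of the sum
print(float(exact))  # compare with the decimal value quoted above
```

Using `Fraction` keeps the binomial sum exact until the final conversion to a float, which avoids any doubt about rounding in the many trailing digits.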

It is quite different from tatung2112's explanation for c.

Could you comment further?

Thanks!

Last edited by henry2015; 05-29-2016 at 08:48 AM. Reason: fixing latex syntax
  #9  
06-04-2016, 07:41 AM
henry2015
Member

Join Date: Aug 2015
Posts: 29
Re: Exercise 1.11

Quote:
Originally Posted by henry2015
For part c, I thought:

Given p = 0.9, h1 is a better hypothesis than h2.

Hence, the probability that S produces a better hypothesis than C is essentially the probability that S picks h1, since C will pick whichever hypothesis S doesn't pick.

In other words, P[S produces a better hypothesis than C] = P[S picks h1 based on the 25 training examples].

S will pick h1 if at least 13 out of 25 training examples give +1, so we will have:
P[S picks h1]
= P[13 or more out of 25 training examples give +1]
= \sum_{k = 13}^{25}\binom{25}{k}(.9)^{k}(.1)^{25-k}
= 0.9999998379165839813935344

It is quite different from tatung2112's explanation for c.

Could you comment further?

Thanks!
I just noticed that the formula in my post actually is one form of the formula in Problem 1.7...

Now, I am even more confused.
  #10  
06-04-2016, 04:04 PM
htlin
NTU

Join Date: Aug 2009
Location: Taipei, Taiwan
Posts: 558
Re: Exercise 1.11

Quote:
Originally Posted by henry2015
I just noticed that the formula in my post actually is one form of the formula in Problem 1.7...

Now, I am even more confused.
I think henry2015's detailed steps are the right way to go, while Yaser's old comments are just highlighting that (a) and (c) do not conflict with each other. Thanks for asking.
__________________
When one teaches, two learn.
All times are GMT -7.


The contents of this forum are to be used ONLY by readers of the Learning From Data book by Yaser S. Abu-Mostafa, Malik Magdon-Ismail, and Hsuan-Tien Lin, and participants in the Learning From Data MOOC by Yaser S. Abu-Mostafa. No part of these contents is to be communicated or made accessible to ANY other person or entity.