LFD Book Forum (http://book.caltech.edu/bookforum/index.php)
-   Chapter 1 - The Learning Problem (http://book.caltech.edu/bookforum/forumdisplay.php?f=108)
-   -   Problem 1.9 (http://book.caltech.edu/bookforum/showthread.php?t=4625)

kongweihan 09-29-2015 02:14 AM

Problem 1.9
 
I'm working through this problem and stuck on (b).

Since P[u_n \geq \alpha] \leq e^{-s\alpha}U(s), we get
\prod_{n=1}^N P[u_n \geq \alpha] \leq (e^{-s\alpha}U(s))^N

We also know \prod_{n=1}^N P[u_n \geq \alpha] \leq P[\sum_{n=1}^N u_n \geq N\alpha] = P[u \geq \alpha]

Both sides of the desired inequality are bigger than this common term, so I don't see how these two inequalities can lead to the desired conclusion. What did I miss?

Also, in (c), why do we want to minimize with respect to s and use that in (d)?

MaciekLeks 03-21-2016 08:29 AM

Re: Problem 1.9
 
Quote:

Originally Posted by kongweihan (Post 12075)
I'm working through this problem and stuck on (b).

Since P[u_n \geq \alpha] \leq e^{-s\alpha}U(s), we get
\prod_{n=1}^N P[u_n \geq \alpha] \leq (e^{-s\alpha}U(s))^N

We also know \prod_{n=1}^N P[u_n \geq \alpha] \leq P[\sum_{n=1}^N u_n \geq N\alpha] = P[u \geq \alpha]

Both sides of the desired inequality are bigger than this common term, so I don't see how these two inequalities can lead to the desired conclusion. What did I miss?

Also, in (c), why do we want to minimize with respect to s and use that in (d)?


How do you know that \prod_{n=1}^N P[u_n \geq \alpha] \leq (e^{-s\alpha}U(s))^N? I think the problem in your proof is that you assumed the Problem 1.9(b) inequality carries over to the joint probability.

To prove (b), I went this way:

1. I used Markov Inequality \mathbb{P}[u\geq\alpha]\leq\frac{\mathbb{E}_{u}[u]}{\alpha}

2. Problem 1.9(a) gave me this: \mathbb{P}[t\geq\alpha]=\mathbb{P}[e^{sNt}\geq e^{sN\alpha}]\leq\frac{\mathbb{E}[e^{sNt}]}{e^{sN\alpha}}, hence \mathbb{P}[u\geq\alpha]\leq\frac{\mathbb{E}_{u}[e^{sNu}]}{e^{sN\alpha}}

Using this, the rest of the proof is quite nice to carry out.
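
To fill in the factorization step (a sketch, assuming the u_n are independent and U(s) = \mathbb{E}_{u_n}[e^{su_n}], as in the problem statement):

\mathbb{E}_{u}[e^{sNu}] = \mathbb{E}[e^{s\sum_{n=1}^N u_n}] = \mathbb{E}[\prod_{n=1}^N e^{su_n}] = \prod_{n=1}^N \mathbb{E}[e^{su_n}] = U(s)^N

so that \mathbb{P}[u\geq\alpha] \leq e^{-sN\alpha}U(s)^N = (e^{-s\alpha}U(s))^N, which is the desired bound.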

waleed 05-12-2016 03:18 AM

Re: Problem 1.9
 
Quote:

Originally Posted by MaciekLeks (Post 12298)
How do you know that \prod_{n=1}^N P[u_n \geq \alpha] \leq (e^{-s\alpha}U(s))^N? I think the problem in your proof is that you assumed the Problem 1.9(b) inequality carries over to the joint probability.

To prove (b), I went this way:

1. I used Markov Inequality \mathbb{P}[u\geq\alpha]\leq\frac{\mathbb{E}_{u}[u]}{\alpha}

2. Problem 1.9(a) gave me this: \mathbb{P}[t\geq\alpha]=\mathbb{P}[e^{sNt}\geq e^{sN\alpha}]\leq\frac{\mathbb{E}[e^{sNt}]}{e^{sN\alpha}}, hence \mathbb{P}[u\geq\alpha]\leq\frac{\mathbb{E}_{u}[e^{sNu}]}{e^{sN\alpha}}

Using this, the rest of the proof is quite nice to carry out.

I don't think that condition is right.

svend 09-17-2016 11:43 AM

Re: Problem 1.9
 
Here's my take on Problem 1.9, part (b), which follows the same lines as MaciekLeks's description above.

We have:

\begin{split}
P(u \geq \alpha) & = P(\frac{1}{N}\sum_n x_n \geq \alpha) \\
& = P(\sum_n x_n \geq N\alpha) \\
& = P(e^{s\sum_n x_n} \geq e^{sN\alpha}),
\end{split}

since e^{st} is monotonically increasing in t (for s > 0).

Also, e^{st} is non-negative for all t, so Markov's inequality applies:

\begin{split}
P(e^{s \sum_n x_n} \geq e^{s N \alpha}) & \leq \frac{E(e^{s \sum_n x_n})}{e^{s N \alpha}} \\
& = \frac{E(\prod_n e^{s x_n})}{e^{s N \alpha}} \\
& = \frac{\prod_n E(e^{s x_n})}{e^{s N \alpha}}
\end{split}

The last line holds because the x_n are independent.

From there it directly follows that

P(u \geq \alpha ) \leq U(s)^N e^{-s N \alpha}
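
As a sanity check, the bound can also be compared against simulation. A minimal sketch, assuming the fair-coin setup of part (c) (so U(s) = (1 + e^s)/2) and using the minimizing s = \ln{\frac{\alpha}{1-\alpha}} that comes up later in the thread; numpy and the variable names here are my own choices:

import numpy as np

# Empirical check of P(u >= alpha) <= (e^{-s*alpha} U(s))^N
# for the fair-coin case: P[x_n = 0] = P[x_n = 1] = 1/2,
# so U(s) = E[e^{s x_n}] = (1 + e^s) / 2.
rng = np.random.default_rng(0)
N, trials, alpha = 20, 200_000, 0.7

# u is the mean of N fair coin flips, repeated over many trials
u = rng.integers(0, 2, size=(trials, N)).mean(axis=1)
empirical = (u >= alpha).mean()

# Chernoff bound evaluated at the minimizing s = ln(alpha / (1 - alpha))
s = np.log(alpha / (1 - alpha))
bound = (np.exp(-s * alpha) * (1 + np.exp(s)) / 2) ** N

print(f"empirical P(u >= {alpha}) = {empirical:.5f}")  # about 0.058
print(f"Chernoff bound            = {bound:.5f}")      # about 0.193

The empirical frequency should come out below the bound for any alpha > 1/2.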

k_sze 03-03-2017 08:53 AM

Re: Problem 1.9
 
Quote:

Originally Posted by kongweihan (Post 12075)
Also, in (c), why do we want to minimize with respect to s and use that in (d)?

I also have trouble understanding (c).

Actually I don't even know how to tackle it. I think I'll need a lot of hand-holding through this one because my math has gotten really rusty since I left school (I'm 34).

k_sze 03-03-2017 09:25 AM

Re: Problem 1.9
 
Quote:

Originally Posted by k_sze (Post 12596)
I also have trouble understanding (c).

Actually I don't even know how to tackle it. I think I'll need a lot of hand-holding through this one because my math has gotten really rusty since I left school (I'm 34).

Will I need to summon notions such as "moment generating function" for part (c) of this problem?

Sebasti 03-06-2017 04:38 AM

Re: Problem 1.9
 
Quote:

Originally Posted by k_sze (Post 12596)
Actually I don't even know how to tackle it. I think I'll need a lot of hand-holding through this one because my math got really rusty since I left school

So I'm not the only one. Gee, thanks ;)

k_sze 03-15-2017 07:11 AM

Re: Problem 1.9
 
Quote:

Originally Posted by k_sze (Post 12597)
Will I need to summon notions such as "moment generating function" for part (c) of this problem?

So I did end up using the moment generating function. And I think the answer to (c) is s = \ln{\frac{\alpha}{1-\alpha}}, using calculus.

But now I'm stuck at (d).

Directly substituting \alpha = \mathbb{E}(u) + \epsilon is probably wrong? Because e^{-s\alpha}U(s) can be simplified to the point where no logarithm appears (unless I made a really big mistake).
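
In case it helps anyone, here is a sketch of that minimization under the fair-coin assumption of part (c), where U(s) = \mathbb{E}[e^{su_n}] = \frac{1+e^s}{2}:

\frac{d}{ds}\left[e^{-s\alpha}\,\frac{1+e^s}{2}\right] = \frac{e^{-s\alpha}}{2}\left((1-\alpha)e^s - \alpha\right) = 0 \iff e^s = \frac{\alpha}{1-\alpha},

which gives s = \ln{\frac{\alpha}{1-\alpha}} as above.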

k_sze 03-16-2017 10:36 AM

Re: Problem 1.9
 
Quote:

Originally Posted by k_sze (Post 12606)
So I did end up using the moment generating function. And I think the answer to (c) is s = \ln{\frac{\alpha}{1-\alpha}}, using calculus.

But now I'm stuck at (d).

Directly substituting \alpha = \mathbb{E}(u) + \epsilon is probably wrong? Because e^{-s\alpha}U(s) can be simplified to the point where no logarithm appears (unless I made a really big mistake).

*sigh*

I did end up getting \beta by substituting \mathbb{E}(u) + \epsilon for \alpha, but only after simplifying e^{-s\alpha}U(s) all the way down, until there is no more e or \ln; otherwise I get two powers of 2 with no obvious way to combine them.

So now the remaining hurdle is to prove that \beta > 0.

Yay
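
For reference, a sketch of that simplification (assuming the fair-coin setup, so \mathbb{E}(u) = \frac{1}{2} and U(s) = \frac{1+e^s}{2}): substituting s = \ln{\frac{\alpha}{1-\alpha}},

e^{-s\alpha}U(s) = \left(\frac{1-\alpha}{\alpha}\right)^{\alpha}\frac{1}{2(1-\alpha)} = 2^{-\left(1+\alpha\log_2{\alpha}+(1-\alpha)\log_2{(1-\alpha)}\right)},

so with \alpha = \frac{1}{2}+\epsilon the bound (e^{-s\alpha}U(s))^N becomes 2^{-\beta N} with \beta = 1+\left(\frac{1}{2}+\epsilon\right)\log_2{\left(\frac{1}{2}+\epsilon\right)}+\left(\frac{1}{2}-\epsilon\right)\log_2{\left(\frac{1}{2}-\epsilon\right)}.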

k_sze 04-01-2017 08:19 AM

Re: Problem 1.9
 
Quote:

Originally Posted by k_sze (Post 12607)
*sigh*

I did end up getting \beta by substituting \mathbb{E}(u) + \epsilon for \alpha, but only after simplifying e^{-s\alpha}U(s) all the way down, until there is no more e or \ln; otherwise I get two powers of 2 with no obvious way to combine them.

So now the remaining hurdle is to prove that \beta > 0.

Yay

I finally worked out the proof for \beta > 0. Apparently using Jensen's inequality is supposed to be the simple way, but I simply don't get Jensen's inequality, so I used some brute-force calculus (first and second derivatives, finding the minimum, etc.).
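
For anyone else who gets stuck there, a sketch of the Jensen route (assuming \beta = 1+\alpha\log_2{\alpha}+(1-\alpha)\log_2{(1-\alpha)} with \alpha = \frac{1}{2}+\epsilon, as above): write \beta = 1 - H(\alpha), where

H(\alpha) = \alpha\log_2{\frac{1}{\alpha}} + (1-\alpha)\log_2{\frac{1}{1-\alpha}}.

Since \log_2 is strictly concave, Jensen's inequality gives

H(\alpha) \leq \log_2{\left(\alpha\cdot\frac{1}{\alpha} + (1-\alpha)\cdot\frac{1}{1-\alpha}\right)} = \log_2{2} = 1,

with equality only if \frac{1}{\alpha} = \frac{1}{1-\alpha}, i.e. \alpha = \frac{1}{2}. So for \epsilon > 0 we have H(\alpha) < 1 and hence \beta > 0.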

