LFD Book Forum (http://book.caltech.edu/bookforum/index.php)
-   Chapter 1 - The Learning Problem (http://book.caltech.edu/bookforum/forumdisplay.php?f=108)
-   -   Problem 1.9 (http://book.caltech.edu/bookforum/showthread.php?t=4625)

kongweihan 09-29-2015 02:14 AM

Problem 1.9
 
I'm working through this problem and stuck on (b).

Since P[u_n \geq \alpha] \leq e^{-s\alpha}U(s), we get
\prod_{n=1}^N P[u_n \geq \alpha] \leq (e^{-s\alpha}U(s))^N

We also know \prod_{n=1}^N P[u_n \geq \alpha] \leq P[\sum_{n=1}^N u_n \geq N\alpha] = P[u \geq \alpha]

Both terms in the desired inequality are bigger than the common term, so I don't see how these two inequalities can lead to the desired conclusion. What did I miss?

Also, in (c), why do we want to minimize with respect to s and use that in (d)?

MaciekLeks 03-21-2016 08:29 AM

Re: Problem 1.9
 
Quote:

Originally Posted by kongweihan (Post 12075)
I'm working through this problem and stuck on (b).

Since P[u_n \geq \alpha] \leq e^{-s\alpha}U(s), we get
\prod_{n=1}^N P[u_n \geq \alpha] \leq (e^{-s\alpha}U(s))^N

We also know \prod_{n=1}^N P[u_n \geq \alpha] \leq P[\sum_{n=1}^N u_n \geq N\alpha] = P[u \geq \alpha]

Both terms in the desired inequality are bigger than the common term, so I don't see how these two inequalities can lead to the desired conclusion. What did I miss?

Also, in (c), why do we want to minimize with respect to s and use that in (d)?


How do you know that \prod_{n=1}^N P[u_n \geq \alpha] \leq (e^{-s\alpha}U(s))^N? I think the problem in your proof is that you assumed the joint probability works with the Problem 1.9(b) inequality.

To prove (b) I went this way:

1. I used the Markov inequality: \mathbb{P}[u\geq\alpha]\leq\frac{\mathbb{E}_{u}[u]}{\alpha}

2. Problem 1.9(a) gave me this: \mathbb{P}[t\geq\alpha]=\mathbb{P}[e^{sNt}\geq e^{sN\alpha}]\leq\frac{\mathbb{E}[e^{sNt}]}{e^{sN\alpha}}, hence \mathbb{P}[u\geq\alpha]\leq\frac{\mathbb{E}_{u}[e^{sNu}]}{e^{sN\alpha}}

Using this, the rest of the proof is quite nice to carry out.
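
For completeness, the remaining step is the factorization over the independent u_n (a sketch, using the definition U(s)=\mathbb{E}_{u_n}[e^{su_n}]):

\mathbb{E}_{u}[e^{sNu}]=\mathbb{E}\left[e^{s\sum_{n=1}^{N}u_n}\right]=\prod_{n=1}^{N}\mathbb{E}[e^{su_n}]=U(s)^N

so \mathbb{P}[u\geq\alpha]\leq e^{-sN\alpha}U(s)^N=\left(e^{-s\alpha}U(s)\right)^N.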

waleed 05-12-2016 03:18 AM

Re: Problem 1.9
 
Quote:

Originally Posted by MaciekLeks (Post 12298)
How do you know that \prod_{n=1}^N P[u_n \geq \alpha] \leq (e^{-s\alpha}U(s))^N? I think the problem in your proof is that you assumed the joint probability works with the Problem 1.9(b) inequality.

To prove (b) I went this way:

1. I used the Markov inequality: \mathbb{P}[u\geq\alpha]\leq\frac{\mathbb{E}_{u}[u]}{\alpha}

2. Problem 1.9(a) gave me this: \mathbb{P}[t\geq\alpha]=\mathbb{P}[e^{sNt}\geq e^{sN\alpha}]\leq\frac{\mathbb{E}[e^{sNt}]}{e^{sN\alpha}}, hence \mathbb{P}[u\geq\alpha]\leq\frac{\mathbb{E}_{u}[e^{sNu}]}{e^{sN\alpha}}

Using this, the rest of the proof is quite nice to carry out.

I don't think that condition is right.

svend 09-17-2016 11:43 AM

Re: Problem 1.9
 
Here's my take on Problem 1.9, part (b), which follows the same lines as MaciekLeks's description above.

We have:

\begin{split}
P(u \geq \alpha ) & = P(\frac{1}{N}\sum_n x_n \geq \alpha)  \\
 & = P(\sum_n x_n \geq N \alpha) \\
 & = P(e^{s \sum_n x_n} \geq e^{s N \alpha})
\end{split}

The last equality holds since e^{s t} is monotonically increasing in t (for s > 0).

Also, e^{s t} is non-negative for all t, so Markov's inequality applies:

\begin{split}
P(e^{s \sum_n x_n} \geq e^{s N \alpha}) & \leq \frac{E(e^{s \sum_n x_n})}{e^{s N \alpha}} \\
& = \frac{E(\prod_n e^{s x_n})}{e^{s N \alpha}} \\
& = \frac{\prod_n E(e^{s x_n})}{e^{s N \alpha}}
\end{split}

The last line holds since the x_n are independent.

From there it directly follows that

P(u \geq \alpha ) \leq U(s)^N e^{-s N \alpha}
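
As a numeric sanity check of this bound, here is a minimal Python sketch under the fair-coin setup of part (c); N, alpha, and the trial count are illustrative choices, not from the problem:

Code:

import math
import random

N = 100
alpha = 0.6
s = math.log(alpha / (1 - alpha))   # the minimizing s from part (c)
U = (1 + math.exp(s)) / 2           # U(s) = E[e^{s x_n}] for fair-coin x_n in {0, 1}

bound = (math.exp(-s * alpha) * U) ** N   # (e^{-s alpha} U(s))^N from part (b)

# Estimate P(u >= alpha) by simulation and compare against the bound.
trials = 20000
hits = sum(
    sum(random.randint(0, 1) for _ in range(N)) >= N * alpha
    for _ in range(trials)
)
print(hits / trials, "<=", bound)

On a typical run the empirical frequency lands well below the bound, as it should.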

k_sze 03-03-2017 08:53 AM

Re: Problem 1.9
 
Quote:

Originally Posted by kongweihan (Post 12075)
Also, in (c), why do we want to minimize with respect to s and use that in (d)?

I also have trouble understanding (c).

Actually I don't even know how to tackle it. I think I'll need a lot of hand-holding through this one because my math got really rusty since I left school (I'm 34).

k_sze 03-03-2017 09:25 AM

Re: Problem 1.9
 
Quote:

Originally Posted by k_sze (Post 12596)
I also have trouble understanding (c).

Actually I don't even know how to tackle it. I think I'll need a lot of hand-holding through this one because my math got really rusty since I left school (I'm 34).

Will I need to summon notions such as "moment generating function" for part (c) of this problem?

Sebasti 03-06-2017 04:38 AM

Re: Problem 1.9
 
Quote:

Originally Posted by k_sze (Post 12596)
Actually I don't even know how to tackle it. I think I'll need a lot of hand-holding through this one because my math got really rusty since I left school

So I'm not the only one. Gee, thanks ;)

k_sze 03-15-2017 07:11 AM

Re: Problem 1.9
 
Quote:

Originally Posted by k_sze (Post 12597)
Will I need to summon notions such as "moment generating function" for part (c) of this problem?

So I did end up using the moment generating function. And I think the answer to (c) is s = \ln{\frac{\alpha}{1-\alpha}}, using calculus.

But now I'm stuck at (d).

Directly substituting \alpha = \mathbb{E}(u) + \epsilon is probably wrong? Because e^{-s\alpha}U(s) can be simplified to the point where no logarithm appears (unless I made a really big mistake).
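
For reference, the calculus behind (c) under the fair-coin setup (a sketch, assuming u_n \in \{0,1\} with probability \frac12 each, so U(s)=\mathbb{E}[e^{su_n}]=\frac{1+e^s}{2}):

\frac{d}{ds}\left[e^{-s\alpha}\,\frac{1+e^s}{2}\right]=\frac{e^{-s\alpha}}{2}\left((1-\alpha)e^s-\alpha\right)=0 \implies e^s=\frac{\alpha}{1-\alpha} \implies s=\ln\frac{\alpha}{1-\alpha}.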

k_sze 03-16-2017 10:36 AM

Re: Problem 1.9
 
Quote:

Originally Posted by k_sze (Post 12606)
So I did end up using the moment generating function. And I think the answer to (c) is s = \ln{\frac{\alpha}{1-\alpha}}, using calculus.

But now I'm stuck at (d).

Directly substituting \alpha = \mathbb{E}(u) + \epsilon is probably wrong? Because e^{-s\alpha}U(s) can be simplified to the point where no logarithm appears (unless I made a really big mistake).

*sigh*

I did end up getting \beta by substituting \mathbb{E}(u) + \epsilon for \alpha, but only after simplifying e^{-s\alpha}U(s) all the way down, until there is no more e or \ln; otherwise I get two powers of 2 with no obvious way to combine them.

So now the remaining hurdle is to prove that \beta > 0.

Yay
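
For anyone following along, the simplification in question (a sketch; plugging s=\ln\frac{\alpha}{1-\alpha} into e^{-s\alpha}U(s) with U(s)=\frac{1+e^s}{2}):

e^{-s\alpha}U(s)=\left(\frac{\alpha}{1-\alpha}\right)^{-\alpha}\cdot\frac{1}{2(1-\alpha)}=\frac{1}{2}\,\alpha^{-\alpha}(1-\alpha)^{-(1-\alpha)}=2^{-\beta}

with \beta=1+\alpha\log_2\alpha+(1-\alpha)\log_2(1-\alpha), where \alpha=\mathbb{E}(u)+\epsilon=\frac12+\epsilon.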

k_sze 04-01-2017 08:19 AM

Re: Problem 1.9
 
Quote:

Originally Posted by k_sze (Post 12607)
*sigh*

I did end up getting \beta by substituting \mathbb{E}(u) + \epsilon for \alpha, but only after simplifying e^{-s\alpha}U(s) all the way down, until there is no more e or \ln; otherwise I get two powers of 2 with no obvious way to combine them.

So now the remaining hurdle is to prove that \beta > 0.

Yay

I finally worked out the proof for \beta > 0. Apparently using Jensen's inequality is supposed to be the simple way, but I simply don't get Jensen's inequality, so I used some brute-force calculus (first and second derivatives, find the minimum, etc.).
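
A sketch of that brute-force route, writing \beta as a function of \epsilon (with \alpha=\frac12+\epsilon and 0<\epsilon<\frac12, per the simplification above):

\beta(\epsilon)=1+\left(\tfrac12+\epsilon\right)\log_2\left(\tfrac12+\epsilon\right)+\left(\tfrac12-\epsilon\right)\log_2\left(\tfrac12-\epsilon\right)

Then \beta(0)=0 and \beta'(\epsilon)=\log_2\frac{1/2+\epsilon}{1/2-\epsilon}>0 for \epsilon>0, so \beta(\epsilon)>0 whenever 0<\epsilon<\frac12.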

RicLouRiv 06-23-2017 04:42 PM

Re: Problem 1.9
 
Quick question on Part D...

I agree that s = ln(a / (1-a)) is the value of s that minimizes e^(-sa)U(s), where I'm using "a" to represent "alpha."

I think that, if you plug this value of s in, you get 2^(-b), where I'm using "b" to represent "beta."

So, 2^(-b) <= e^(-sa)U(s) for every s, by definition of minimization.

Then, raising to the power of N, we get:

2^(-bN) <= (e^(-sa)U(s))^N, but this inequality goes the wrong way.

Any tips?

RicLouRiv 06-24-2017 09:36 AM

Re: Problem 1.9
 
Ah, maybe it's because the inequality holds for any s, so it must hold for the minimizing s.
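
In symbols: since \mathbb{P}[u\geq\alpha]\leq\left(e^{-s\alpha}U(s)\right)^N holds for every s>0, it holds in particular at the minimizer, giving \mathbb{P}[u\geq\alpha]\leq\min_{s>0}\left(e^{-s\alpha}U(s)\right)^N=2^{-\beta N}.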

htlin 06-29-2017 10:00 PM

Re: Problem 1.9
 
Quote:

Originally Posted by RicLouRiv (Post 12685)
Ah, maybe it's because the inequality holds for any s, so it must hold for the minimizing s.

You got it. :-)

mygame182 10-22-2017 04:19 AM

Re: Problem 1.9
 
Quote:

Originally Posted by RicLouRiv (Post 12685)
Ah, maybe it's because the inequality holds for any s, so it must hold for the minimizing s.

Excuse me, I can understand your last reply, but I'm confused by this one.
More tips please.

When 2^(-b) is the minimum of e^(-sa)U(s), I find that 1-a = 1/2 + e (e means epsilon), so P[u>=a] = P[u >= 1/2 - e], but not P[u >= 1/2 + e].

mygame182 10-23-2017 08:23 AM

Re: Problem 1.9
 
When 2^(-b) is equal to the minimum of e^(-sa)U(s), I try assuming that 1-a = 1/2 - e, so that a = 1/2 + e and P[u>=a] = P[u>=1/2+e].

According to (b), P[u>=1/2+e] = P[u>=a] <= (e^(-sa)U(s))^N for any s, in particular for the s = ln(a / (1-a)) that minimizes e^(-sa)U(s).

Hence P[u>=1/2+e] <= 2^(-bN).

don slowik 11-14-2017 03:14 PM

Re: Problem 1.9
 
That's correct. But it doesn't get us anywhere.

don slowik 11-14-2017 03:18 PM

Re: Problem 1.9
 
Ahem, that was a reply to kongweihan's OP.
It is correct, as shown by its first line ("Since..."), but it ends up barking up the wrong tree.
There is a nice description of the Chernoff bound on Wikipedia; it follows a path similar to MaciekLeks's.

Ulyssesyang 11-09-2018 07:02 PM

Re: Problem 1.9
 
Quote:

Originally Posted by svend (Post 12434)
Here's my take on Problem 1.9, part (b), which follows the same lines as MaciekLeks's description above.

We have:

\begin{split}
P(u \geq \alpha ) & = P(\frac{1}{N}\sum_n x_n \geq \alpha)  \\
 & = P(\sum_n x_n \geq N \alpha) \\
 & = P(e^{s \sum_n x_n} \geq e^{s N \alpha})
\end{split}

The last equality holds since e^{s t} is monotonically increasing in t (for s > 0).

Also, e^{s t} is non-negative for all t, so Markov's inequality applies:

\begin{split}
P(e^{s \sum_n x_n} \geq e^{s N \alpha}) & \leq \frac{E(e^{s \sum_n x_n})}{e^{s N \alpha}} \\
& = \frac{E(\prod_n e^{s x_n})}{e^{s N \alpha}} \\
& = \frac{\prod_n E(e^{s x_n})}{e^{s N \alpha}}
\end{split}

The last line holds since the x_n are independent.

From there it directly follows that

P(u \geq \alpha ) \leq U(s)^N e^{-s N \alpha}

I just think you should note that there are two expectations: one is with respect to e^{su_n}, while the other is with respect to u_n. Of course, you can refer to the law of the unconscious statistician to show that both are the same.
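
In symbols, for a discrete u_n (a brief restatement of that law): \mathbb{E}_{u_n}[e^{su_n}]=\sum_{u}e^{su}\,\mathbb{P}[u_n=u], so the expectation of e^{su_n} is computed directly from the distribution of u_n.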

