LFD Book Forum (http://book.caltech.edu/bookforum/index.php)
-   Chapter 3 - The Linear Model (http://book.caltech.edu/bookforum/forumdisplay.php?f=110)
-   -   Exercise 3.4 (http://book.caltech.edu/bookforum/showthread.php?t=4353)

xuewei4d 06-14-2013 09:08 AM

Exercise 3.4
 
:clueless:

I didn't get the correct answer to Exercise 3.4(c).

For Exercise 3.4(b), I think the answer would be E_{\text{in}}(\mathbf{w}_{\text{lin}})=\frac{1}{N}\epsilon^TH\epsilon

For Exercise 3.4(c), by independence of the different \epsilon_i, I have \mathbb{E}_{\mathcal{D}}[E_{\text{in}}(\mathbf{w}_{\text{lin}})] = \frac{1}{N} \sum_i H_{ii} \mathbb{E}(\epsilon_i^2) = \frac{\sigma^2}{N}(d+1).

Where am I wrong?
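
In case it is useful for checking a candidate closed form, here is a small Monte Carlo sketch that estimates \mathbb{E}_{\mathcal{D}}[E_{\text{in}}(\mathbf{w}_{\text{lin}})] directly for a noisy linear target. The choices N = 50, d = 5, \sigma = 0.5, the uniform inputs, and the helper one_trial are all my own illustrative assumptions, not part of the exercise:

import numpy as np

# Monte Carlo estimate of E_D[E_in(w_lin)] for a noisy linear target.
# Arbitrary illustrative choices (not from the exercise): N, d, sigma below.
rng = np.random.default_rng(0)
N, d, sigma = 50, 5, 0.5
w_star = rng.standard_normal(d + 1)            # fixed target weights

def one_trial():
    # inputs with a bias coordinate x_0 = 1
    X = np.hstack([np.ones((N, 1)), rng.uniform(-1.0, 1.0, size=(N, d))])
    eps = sigma * rng.standard_normal(N)        # i.i.d. zero-mean noise
    y = X @ w_star + eps                        # noisy targets
    w_lin = np.linalg.solve(X.T @ X, X.T @ y)   # least-squares solution
    r = X @ w_lin - y                           # in-sample residuals
    return (r @ r) / N                          # E_in(w_lin)

estimate = np.mean([one_trial() for _ in range(20000)])
print("empirical estimate of E[E_in]:", round(estimate, 4))
print("compare with your closed form evaluated at these N, d, sigma")

Plugging the same N, d and \sigma into whatever closed form you derive should reproduce the printed number up to Monte Carlo error.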

htlin 06-17-2013 07:27 PM

Re: Exercise 3.4
 
You can consider double-checking your answer to 3.4(b). Hope this helps.

i_need_some_help 10-06-2013 01:16 PM

Re: Exercise 3.4
 
I am not sure how to approach part (a). Are we supposed to explain why that in-sample estimate intuitively makes sense, or (algebraically) manipulate expressions given earlier into it?

magdon 10-06-2013 08:18 PM

Re: Exercise 3.4
 
Algebraically manipulate earlier expressions and you should get 3.4(a). It is essentially a restatement of \hat{\mathbf{y}}=X{\mathbf{w}}_{lin}.
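
For reference, the expressions being manipulated (all stated earlier in the chapter, assuming X^TX is invertible) are

\mathbf{w}_{\text{lin}} = (X^TX)^{-1}X^T\mathbf{y}, \qquad H = X(X^TX)^{-1}X^T, \qquad \mathbf{y} = X\mathbf{w}^* + \mathbf{\epsilon},

and substituting the first and third of these into \hat{\mathbf{y}}=X\mathbf{w}_{\text{lin}} and collecting the H gives the form asked for in 3.4(a).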

Sweater Monkey 10-06-2013 11:00 PM

Re: Exercise 3.4
 
I'm not sure if I'm going about part (e) correctly.

I'm under the impression that E_{\text{test}}(\mathbf{w}_{\text{lin}})=\frac{1}{N}||X{\mathbf{w}}_{lin}-\mathbf{y'}||^2

where \hat{\mathbf{y}}=X{\mathbf{w}}_{lin}=X\mathbf{w}^*+H\mathbf{\epsilon} as derived earlier
and \mathbf{y'}=\mathbf{w}^{*T}\mathbf{x}+\mathbf{\epsilon'}=X\mathbf{w}^*+\mathbf{\epsilon'}

This led me to \frac{1}{N}||H\mathbf{\epsilon}-\mathbf{\epsilon'}||^2

I carried out the expansion of this expression and then simplified it into the relevant terms, but my final answer is \sigma^2(1+(d+1)) because the N term cancels out.

Am I starting out correctly up until this expansion, or is my thought process off from the start? And if I am heading in the right direction, is there any obvious reason that I may be expanding the expression incorrectly? Any help would be greatly appreciated.

ddas2 10-07-2013 12:46 AM

Re: Exercise 3.4
 
I got \mathbf{y'}=\mathbf{y}-\mathbf{\epsilon}+\mathbf{\epsilon'}, and therefore \hat{\mathbf{y}}-\mathbf{y'}=H\mathbf{\epsilon}-\mathbf{\epsilon'}.

magdon 10-07-2013 04:53 AM

Re: Exercise 3.4
 
You got it mostly right. Your error is in assuming that both terms, the H term and the one without the H, give an N to cancel the N in the denominator. One term gives an N and the other gives a (d+1).
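
To spell this out a little: using H^T=H and H^2=H from Exercise 3.3, the expansion is

||H\mathbf{\epsilon}-\mathbf{\epsilon'}||^2 = \mathbf{\epsilon}^TH\mathbf{\epsilon} - 2\,\mathbf{\epsilon'}^TH\mathbf{\epsilon} + \mathbf{\epsilon'}^T\mathbf{\epsilon'},

and in expectation the cross term vanishes because \mathbf{\epsilon} and \mathbf{\epsilon'} are independent and zero-mean. One of the two surviving terms is where the N comes from, and the other is where the (d+1) comes from.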


Quote:

Originally Posted by Sweater Monkey (Post 11541)
I'm not sure if I'm going about part (e) correctly.

I'm under the impression that E_{\text{test}}(\mathbf{w}_{\text{lin}})=\frac{1}{N}||X{\mathbf{w}}_{lin}-\mathbf{y'}||^2

where \hat{\mathbf{y}}=X{\mathbf{w}}_{lin}=X\mathbf{w}^*+H\mathbf{\epsilon} as derived earlier
and \mathbf{y'}=\mathbf{w}^{*T}\mathbf{x}+\mathbf{\epsilon'}=X\mathbf{w}^*+\mathbf{\epsilon'}

This led me to \frac{1}{N}||H\mathbf{\epsilon}-\mathbf{\epsilon'}||^2

I carried out the expansion of this expression and then simplified it into the relevant terms, but my final answer is \sigma^2(1+(d+1)) because the N term cancels out.

Am I starting out correctly up until this expansion, or is my thought process off from the start? And if I am heading in the right direction, is there any obvious reason that I may be expanding the expression incorrectly? Any help would be greatly appreciated.


Sweater Monkey 10-07-2013 08:09 AM

Re: Exercise 3.4
 
Quote:

Originally Posted by magdon (Post 11544)
You got it mostly right. Your error is in assuming that both terms, the H term and the one without the H, give an N to cancel the N in the denominator. One term gives an N and the other gives a (d+1).

Yes, I realize that only one term should have the N, so the issue must be in how I'm expanding the expression.

I think my problem is how I'm looking at the trace of the \mathbf{\epsilon}\mathbf{\epsilon}^T matrix.

I'm under the impression that \mathbf{\epsilon}\mathbf{\epsilon}^T produces an NxN matrix with a diagonal of all \sigma^2 values (in expectation) and 0 elsewhere. I come to this conclusion because the \epsilon_i are all independent, so the expectation of the product of any two different ones should be zero, while the expectation of any \epsilon_i\epsilon_i should be the variance \sigma^2. So the trace of this matrix should come out to N\sigma^2, shouldn't it? :clueless:

aaoam 10-07-2013 08:18 AM

Re: Exercise 3.4
 
I'm having a bit of difficulty with 3.4(b). I take \hat{\mathbf{y}} - \mathbf{y} and multiply by (XX^T)^{-1}XX^T, which ends up reducing the expression to just H\epsilon. However, I then can't see how to use 3.3(c) in the simplification, which makes me think I did something wrong. Can somebody give me a pointer?

Also, it'd be great if there were instructions somewhere on how to post in math mode. Perhaps I just missed them?

magdon 10-07-2013 08:19 AM

Re: Exercise 3.4
 
Yes, that is right. You have to be more careful but use similar reasoning with

\epsilon^TH\epsilon
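
The underlying identity (a standard fact for zero-mean noise with independent components of variance \sigma^2, and any fixed NxN matrix A) is

\mathbb{E}[\mathbf{\epsilon}^TA\mathbf{\epsilon}] = \sum_{i,j}A_{ij}\,\mathbb{E}[\epsilon_i\epsilon_j] = \sigma^2\,\text{trace}(A),

so with A = H this term contributes \sigma^2\,\text{trace}(H) = \sigma^2(d+1), using trace(H)=d+1 from Exercise 3.3, rather than N\sigma^2.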

Quote:

Originally Posted by Sweater Monkey (Post 11545)
Yes, I realize that only one term should have the N, so the issue must be in how I'm expanding the expression.

I think my problem is how I'm looking at the trace of the \mathbf{\epsilon}\mathbf{\epsilon}^T matrix.

I'm under the impression that \mathbf{\epsilon}\mathbf{\epsilon}^T produces an NxN matrix with a diagonal of all \sigma^2 values (in expectation) and 0 elsewhere. I come to this conclusion because the \epsilon_i are all independent, so the expectation of the product of any two different ones should be zero, while the expectation of any \epsilon_i\epsilon_i should be the variance \sigma^2. So the trace of this matrix should come out to N\sigma^2, shouldn't it? :clueless:


