LFD Book Forum (http://book.caltech.edu/bookforum/index.php)
-   Homework 6 (http://book.caltech.edu/bookforum/forumdisplay.php?f=135)
-   -   Homework#6 Q3 (http://book.caltech.edu/bookforum/showthread.php?t=1012)

invis 08-15-2012 12:36 PM

Homework#6 Q3
 
I have some problems with understanding the question #3.

OK, we have a formula for linear regression with regularization:
w_{reg} = (Z^T Z - \lambda I)^{-1} Z^T y

But I can't make sense of this part:
"add the term \frac{\lambda}{N} \sum_{i=0}^{7} w_i^2 to the squared in-sample error"

Which w am I supposed to use in this formula? The w without regularization? And why add it to the squared error? :confused:

yaser 08-15-2012 02:30 PM

Re: Homework#6 Q3
 
Quote:

Originally Posted by invis (Post 4095)
Which w am I supposed to use in this formula? The w without regularization? And why add it to the squared error? :confused:

The solution w_{\rm reg} = (Z^T Z+ \lambda I)^{-1} Z^T y (notice the plus sign) comes from solving the case of augmented error where the regularization term is added to the in-sample mean-squared error.
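
For concreteness, here is a minimal NumPy sketch of that solution (Z, y, and lam are stand-ins for the transformed data matrix, the target vector, and \lambda):

Code:

import numpy as np

def regularized_weights(Z, y, lam):
    # w_reg = (Z^T Z + lambda I)^{-1} Z^T y
    # Solving the linear system is numerically preferable to forming the inverse.
    return np.linalg.solve(Z.T @ Z + lam * np.eye(Z.shape[1]), Z.T @ y)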

invis 08-16-2012 03:22 AM

Re: Homework#6 Q3
 
Just one more question :) ... for now.

Why is there no identity matrix when we compute w_{lin}? We do the same kind of "division" to isolate w from Z^T Z w.

But the identity matrix appears when we "divide" (Z^T Z w + \lambda w) by w. That confuses me.

yaser 08-16-2012 04:34 AM

Re: Homework#6 Q3
 
Quote:

Originally Posted by invis (Post 4102)
Just one more question :) ... for now.

Why is there no identity matrix when we compute w_{lin}? We do the same kind of "division" to isolate w from Z^T Z w.

But the identity matrix appears when we "divide" (Z^T Z w + \lambda w) by w. That confuses me.

The "divide by" is in fact multiplying by a matrix inverse. In the case of augmented error, in order to put things in terms of a matrix, we write Z^T Z w + \lambda w as Z^T Z w + \lambda I w, then factor out Z^T Z + \lambda I which becomes the matrix to be inverted. In the linear regression case, there is no I term (or you can think of it as \lambda=0 killing that term).

ESRogs 08-20-2012 08:32 AM

Re: Homework#6 Q3
 
In the lecture, the formula for w_{reg} was derived for the case where the weights were being multiplied by Legendre polynomials. Is it still valid when the set of transformations is different, such as those specified in the problem?

yaser 08-20-2012 01:01 PM

Re: Homework#6 Q3
 
Quote:

Originally Posted by ESRogs (Post 4182)
In the lecture, the formula for w_{reg} was derived for the case where the weights were being multiplied by Legendre polynomials. Is it still valid when the set of transformations is different, such as those specified in the problem?

The formula is valid for any nonlinear transformation into a space {\cal Z}.
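
For example, here is a sketch with a made-up transformation (not the one specified in the problem):

Code:

import numpy as np

# Hypothetical feature map: any fixed set of functions of x defines a valid Z.
def transform(X):
    x1, x2 = X[:, 0], X[:, 1]
    return np.column_stack([np.ones(len(X)), x1, x2, x1 * x2])

# With Z = transform(X), w_reg = (Z^T Z + lambda I)^{-1} Z^T y applies unchanged.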

melipone 02-14-2013 06:16 PM

Re: Homework#6 Q3
 
Quote:

Originally Posted by yaser (Post 4097)
The solution w_{\rm reg} = (Z^T Z+ \lambda I)^{-1} Z^T y (notice the plus sign) comes from solving the case of augmented error where the regularization term is added to the in-sample mean-squared error.

I am confused. Are we supposed to rederive the result of Slide 11?

yaser 02-14-2013 06:38 PM

Re: Homework#6 Q3
 
Quote:

Originally Posted by melipone (Post 9393)
I am confused. Are we supposed to rederive the result of Slide 11?

No rederivation needed. You can always use the results given in the lectures.

Kekeli 05-13-2013 04:07 AM

Re: Homework#6 Q3
 
My answers for any value of K are order(s) of magnitude higher than the options.

w_{reg} = (Z^T Z + \lambda I)^{-1} Z^T y

As a Python noob, perhaps someone can confirm that I need linalg.inv of the first term in parentheses, because we cannot use the pinv method when the weight decay term is present.
Thanks.

jlaurentum 05-13-2013 06:48 PM

Re: Homework#6 Q3
 
Kekeli:

I don't know about Python, but in R I was using the "chol2inv" function from the "Matrix" package to find the matrix inverse. It turns out this wasn't the right tool for the job. I ended up using "solve" from the base package to find the inverse. So in R, I used the following functions:
  1. t(M) for the transpose of a matrix M
  2. %*% for matrix-matrix or matrix-vector multiplication (or inner product)
  3. diag(...) to generate a diagonal matrix, as needed for the \lambda I part
  4. solve(M) to find the inverse of a matrix M
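
For the Python side of the question, a rough NumPy equivalent of the above (with illustrative stand-in data; np.linalg.solve plays the role of R's solve):

Code:

import numpy as np

# Stand-in data so the snippet runs; replace with the real Z, y, and lambda.
rng = np.random.default_rng(0)
Z = rng.standard_normal((25, 8))
y = rng.standard_normal(25)
lam = 1e-3

A = Z.T @ Z + lam * np.eye(Z.shape[1])  # t(), %*%, and diag(...) in one step
w_reg = np.linalg.solve(A, Z.T @ y)     # solves A w = Z^T y without an explicit inverse
# np.linalg.inv(A) @ Z.T @ y also works, but an explicit inverse is less stable.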

Kekeli 05-14-2013 04:20 AM

Re: Homework#6 Q3
 
Thank you for confirming the flow.
It turns out that when transforming the test and training set vectors, I was leaving the bias term in, so x1 and x2 were screwy before creating the polynomial terms. Eep.

Michael Reach 05-14-2013 07:14 AM

Re: Homework#6 Q3
 
Kekeli, could you explain this? What do you mean by the bias term?
Thanks. (I'm still trying to debug my own code.)

Kekeli 05-16-2013 03:58 AM

Re: Homework#6 Q3
 
Michael, I'm sorry I didn't see this until now.

While testing the file I/O, I had inserted 1.0 into the vector of input variables after reading the contents of "in.dta". Then, when doing the nonlinear transformation, I was reading x1 as 1.0 and x2 as x1. Not surprisingly, the linear regression, with or without weight decay, didn't return the correct estimates for E_{in} and E_{out}.

In the end, print statements saved the day. I just wish I'd started with the transformation and not everywhere else in the code.
