LFD Book Forum

Course Discussions > Online LFD course > Homework 6

#1
02-15-2013, 08:59 AM
melipone
Senior Member
Join Date: Jan 2013
Posts: 72
Question on regularization for logistic regression

We have done regularization for linear regression. How do we get the gradients with regularization for logistic regression?
#2
02-15-2013, 09:43 AM
yaser
Caltech
Join Date: Aug 2009
Location: Pasadena, California, USA
Posts: 1,474
Re: Question on regularization for logistic regression

Quote:
Originally Posted by melipone
We have done regularization for linear regression. How do we get the gradients with regularization for logistic regression?
For linear regression, both the unregularized and the (weight decay) regularized cases had closed-form solutions. For logistic regression, both are handled with an iterative method such as gradient descent. You write down the error measure, add the regularization term, and then carry out gradient descent (with respect to {\bf w}) on this augmented error. The gradient is the sum of the gradient of the original error term given in the lecture and the gradient of the weight-decay term; since the weight-decay term is quadratic in {\bf w}, its gradient is linear in {\bf w}.
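
In symbols, a sketch using the cross-entropy error measure from the lecture and the \frac{\lambda}{2N} scaling that appears later in this thread:

E_{\text{aug}}({\bf w}) = \frac{1}{N}\sum_{n=1}^{N}\ln\left(1+e^{-y_n {\bf w}^T {\bf x}_n}\right) + \frac{\lambda}{2N}\,{\bf w}^T{\bf w}

\nabla E_{\text{aug}}({\bf w}) = -\frac{1}{N}\sum_{n=1}^{N}\frac{y_n {\bf x}_n}{1+e^{\,y_n {\bf w}^T {\bf x}_n}} + \frac{\lambda}{N}\,{\bf w}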
__________________
Where everyone thinks alike, no one thinks very much
#3
02-16-2013, 12:00 PM
melipone
Senior Member
Join Date: Jan 2013
Posts: 72
Re: Question on regularization for logistic regression

Thanks. Okay, so if I take the derivative of \frac{\lambda}{2N}w^Tw for the regularization, I just add \frac{\lambda}{N}w to the gradient in the update of each weight in stochastic gradient descent. Is that correct?

I was also looking into L1 and L2 regularization. That would be L2 regularization above. My understanding is that L1 regularization would just add a penalty term to the gradient regardless of the weight itself. Is my understanding correct?

TIA
#4
02-16-2013, 09:49 PM
yaser
Caltech
Join Date: Aug 2009
Location: Pasadena, California, USA
Posts: 1,474
Re: Question on regularization for logistic regression

Quote:
Originally Posted by melipone
Thanks. Okay, so if I take the derivative of \frac{\lambda}{2N}w^Tw for the regularization, I just add \frac{\lambda}{N}w to the gradient in the update of each weight in stochastic gradient descent. Is that correct?

I was also looking into L1 and L2 regularization. That would be L2 regularization above. My understanding is that L1 regularization would just add a penalty term to the gradient regardless of the weight itself. Is my understanding correct?

TIA
Indeed, you add the linear term to get the new gradient. L2 and L1 define the regularization term based on the squared value and the absolute value of the weights, respectively. What is added to the gradient is the derivative of that: a term linear in w for L2, and a term of constant magnitude \frac{\lambda}{N}\,\mathrm{sign}(w) for L1, which still depends on the sign of each weight.
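
For concreteness, here is a minimal sketch of one stochastic gradient descent update under each penalty. This is illustrative Python, not from the course material; it assumes the \frac{\lambda}{2N}w^Tw and \frac{\lambda}{N}\sum_i|w_i| conventions above, and sgd_step and eta are hypothetical names.

Code:
import numpy as np

def sgd_step(w, x, y, eta, lam, N, penalty="l2"):
    # One SGD update for regularized logistic regression.
    # w: weight vector; x: one input vector; y: its label in {-1, +1};
    # eta: learning rate; lam: lambda; N: training-set size, so the
    # L2 penalty matches (lambda/2N) w^T w as used in this thread.
    grad = -y * x / (1.0 + np.exp(y * np.dot(w, x)))  # gradient of ln(1 + e^{-y w.x})
    if penalty == "l2":
        grad += (lam / N) * w           # derivative of (lambda/2N) w^T w
    elif penalty == "l1":
        grad += (lam / N) * np.sign(w)  # derivative of (lambda/N) sum |w_i|
    return w - eta * grad

Note that with penalty="l1" the added term has fixed magnitude \frac{\lambda}{N} per weight but flips with the sign of that weight, which is the distinction made above.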
__________________
Where everyone thinks alike, no one thinks very much
Reply With Quote