LFD Book Forum Q7 - understanding co-ordinate descent

#1
05-06-2013, 01:10 AM
 bargava Junior Member Join Date: Apr 2013 Posts: 8
Q7 - understanding co-ordinate descent

I didn't entirely understand what co-ordinate descent meant. This is what I believe it to be: instead of descending "simultaneously" along all the co-ordinates as in gradient descent (in this example, both u and v), we first descend along u, find the new u, and then find v. So, when computing v, the new value of u is to be used. Am I right?
#2
05-06-2013, 02:06 AM
 yaser Caltech Join Date: Aug 2009 Location: Pasadena, California, USA Posts: 1,478
Re: Q7 - understanding co-ordinate descent

Quote:
 Originally Posted by bargava I didn't entirely understand what co-ordinate descent meant. This is what I believe it to be: Instead of descending "simultaneously" along all the co-ordinates as in gradient descent(in this eg: both u and v), we first descend along u, find the new u and then find v. So, when computing v, the new value of u is to be used. Am I right?
Correct. After each update along one coordinate, you compute the derivative at the new point, then descend along the other coordinate. This is not an efficient method, and is meant for comparison with gradient descent.
__________________
Where everyone thinks alike, no one thinks very much
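A minimal sketch of the scheme confirmed above: one "iteration" is a step along u followed by a step along v, with the v-derivative evaluated at the new u. The error surface, learning rate, and starting point below are assumptions taken from the standard homework setup (E(u,v) = (u e^v - 2v e^{-u})^2, eta = 0.1, start at (1,1)); the thread itself does not quote them.

```python
import math

# Assumed homework setup (not quoted in the thread):
# E(u, v) = (u*e^v - 2*v*e^(-u))^2, eta = 0.1, start at (u, v) = (1, 1).

def E(u, v):
    return (u * math.exp(v) - 2 * v * math.exp(-u)) ** 2

def dE_du(u, v):
    # chain rule: 2 * f * df/du, where f = u*e^v - 2*v*e^(-u)
    return 2 * (u * math.exp(v) - 2 * v * math.exp(-u)) * (math.exp(v) + 2 * v * math.exp(-u))

def dE_dv(u, v):
    # chain rule: 2 * f * df/dv
    return 2 * (u * math.exp(v) - 2 * v * math.exp(-u)) * (u * math.exp(v) - 2 * math.exp(-u))

eta = 0.1
u, v = 1.0, 1.0
for _ in range(15):           # one full iteration = a u-step, then a v-step
    u -= eta * dE_du(u, v)    # descend along u at the current point
    v -= eta * dE_dv(u, v)    # descend along v at the NEW u
print(u, v, E(u, v))
```

Note that the error after 15 full iterations stays orders of magnitude above what plain gradient descent achieves on the same surface, which is the comparison the question is after.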
#3
05-07-2013, 12:12 AM
 Darcy Daugela Junior Member Join Date: Apr 2013 Location: Edmonton Posts: 3
Re: Q7 - understanding co-ordinate descent

I am struggling to understand what I did wrong on this question.

The instructions are clear, I followed the method above (I think?), and my answers to the related questions (5 and 6) were correct, but my answer to question 7 is far, far less than the correct answer. I reached the target level of accuracy in only 5 iterations (instead of 15), so I must have a serious problem with my algorithm.

I am wondering if I understand the term "only to reduce error". I took this to mean that after each step I recalculate the error, and if the error increased I do not apply the update. This sped up convergence significantly.

While researching why I got this answer wrong, I ran across some conflicting references suggesting that "coordinate descent" can be a much more efficient algorithm than GD, because of tricks that reuse parts of the calculation. I'm not sure what to think.
#4
05-07-2013, 12:55 AM
 yaser Caltech Join Date: Aug 2009 Location: Pasadena, California, USA Posts: 1,478
Re: Q7 - understanding co-ordinate descent

Quote:
 Originally Posted by Darcy Daugela I am wondering if I understand the term "only to reduce error". I took this to mean that after each step I recalculate the error, and if the error increased I do not apply the update. This helped rapid convergence significantly.
I see where the misunderstanding is. The word 'only' is meant to qualify the previous part: 'move along the u coordinate only to reduce the error.' Having said that, evaluating the error and then undoing the step is not indicated, given the part that follows: '(assume first-order approximation holds like in gradient descent).'
__________________
Where everyone thinks alike, no one thinks very much
