Q7  understanding coordinate descent
I don't entirely understand what coordinate descent means. This is what I believe it to be: instead of descending "simultaneously" along all the coordinates as in gradient descent (in this example, both u and v), we first descend along u to find the new u, and then descend along v. So, when computing the new v, the new value of u is to be used. Am I right?
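
To make the distinction in the question concrete, here is a minimal sketch in Python. The error surface `error(u, v)` and its partial derivatives are hypothetical, chosen only for illustration, and are not the function from the homework. The function names and the learning rate `eta` are likewise assumptions. The sketch shows one gradient descent step, where both partial derivatives are evaluated at the old point, next to one coordinate descent iteration, where the v step is evaluated at the already updated u.

```python
def error(u, v):
    # Hypothetical convex error surface, for illustration only.
    return (u - 1.0) ** 2 + 2.0 * (v + 0.5) ** 2 + u * v


def grad_u(u, v):
    # Partial derivative of error with respect to u.
    return 2.0 * (u - 1.0) + v


def grad_v(u, v):
    # Partial derivative of error with respect to v.
    return 4.0 * (v + 0.5) + u


def gradient_descent_step(u, v, eta):
    # Both partial derivatives are evaluated at the OLD (u, v).
    du = grad_u(u, v)
    dv = grad_v(u, v)
    return u - eta * du, v - eta * dv


def coordinate_descent_iteration(u, v, eta):
    # Step 1: move along u only.
    u = u - eta * grad_u(u, v)
    # Step 2: move along v only, evaluated at the NEW u.
    v = v - eta * grad_v(u, v)
    return u, v


if __name__ == "__main__":
    print(gradient_descent_step(1.0, 1.0, 0.1))
    print(coordinate_descent_iteration(1.0, 1.0, 0.1))
```

On this toy surface the two updates agree in u but differ slightly in v, because the second coordinate step sees the updated u.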

Re: Q7  understanding coordinate descent
I am struggling to understand what I did wrong on this question.
The instructions are clear, I followed the method above (I think?), and my answers to the related questions (5 and 6) were correct, but my answer to question 7 is far smaller than the correct answer. I reached the level of accuracy in the answer in only 5 iterations (instead of 15), so I must have a serious problem with my algorithm. I am wondering whether I understand the phrase "only to reduce error". I took this to mean that after each step I recalculate the error, and if the error increased I do not apply the update. This sped up convergence significantly. While researching why I got this answer wrong, I ran across some conflicting references suggesting that "coordinate descent" can be a much more efficient algorithm than GD because of some tricks to reuse parts of the calculation. I'm not sure what to think.
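
The two readings described in this post can be compared with a small sketch. Everything here is an assumption made for illustration: the error surface is a hypothetical quadratic (not the homework function), and `run`, `eta`, and `reject_increases` are made-up names. With `reject_increases=True`, a coordinate step is applied only if it does not raise the error, which is the reading described above. With `reject_increases=False`, every step is applied unconditionally. The sketch does not say which reading the question intends; it only shows that the two can produce different trajectories.

```python
def error(u, v):
    # Hypothetical convex error surface, for illustration only.
    return (u - 1.0) ** 2 + 2.0 * (v + 0.5) ** 2 + u * v


def grad_u(u, v):
    return 2.0 * (u - 1.0) + v


def grad_v(u, v):
    return 4.0 * (v + 0.5) + u


def run(u, v, eta, iterations, reject_increases):
    """Coordinate descent; one iteration = one u step, then one v step."""
    history = [error(u, v)]
    for _ in range(iterations):
        # Candidate step along u.
        cand = u - eta * grad_u(u, v)
        if not reject_increases or error(cand, v) <= error(u, v):
            u = cand
        # Candidate step along v, using the current (possibly updated) u.
        cand = v - eta * grad_v(u, v)
        if not reject_increases or error(u, cand) <= error(u, v):
            v = cand
        history.append(error(u, v))
    return u, v, history


if __name__ == "__main__":
    for eta in (0.1, 0.6):
        _, _, plain = run(1.0, 1.0, eta, 5, reject_increases=False)
        _, _, guarded = run(1.0, 1.0, eta, 5, reject_increases=True)
        print(eta, plain[-1], guarded[-1])
```

On this toy surface the two variants coincide for a small step size, since no step is ever rejected, and diverge for a step size large enough that the v step overshoots. If an implementation with the rejection rule converges much faster than expected, trying the unconditional variant is a cheap check.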
The contents of this forum are to be used ONLY by readers of the Learning From Data book by Yaser S. Abu-Mostafa, Malik Magdon-Ismail, and Hsuan-Tien Lin, and participants in the Learning From Data MOOC by Yaser S. Abu-Mostafa. No part of these contents is to be communicated or made accessible to ANY other person or entity.