Q7 - understanding co-ordinate descent

Originally Posted by Darcy Daugela
I am wondering if I understand the term "only to reduce error". I took this to mean that after each step I recalculate the error, and if the error increased I do not apply the update. This helped rapid convergence significantly.
I see where the misunderstanding is. The word 'only' qualifies the preceding part: 'move along the u coordinate only to reduce the error.' That is, the step changes u alone; it does not mean the step is kept only if the error goes down. Having said that, evaluating the error and then undoing the step is not indicated, given the part that follows: '(assume first-order approximation holds like in gradient descent).'
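To make the reading above concrete, here is a minimal sketch in Python. The error surface, its partial derivatives, the starting point, and the names error, grad_u, grad_v and coordinate_descent are all assumptions chosen for illustration; they are not the ones from the homework problem. Each iteration moves along u only, then re-evaluates the gradient and moves along v only, and no step is ever checked against the error or undone.

```python
def error(u, v):
    # Hypothetical error surface, used only for illustration
    # (NOT the function from the homework problem).
    return (u - 1.0) ** 2 + 2.0 * (v + 2.0) ** 2


def grad_u(u, v):
    # Partial derivative of the hypothetical error with respect to u.
    return 2.0 * (u - 1.0)


def grad_v(u, v):
    # Partial derivative of the hypothetical error with respect to v.
    return 4.0 * (v + 2.0)


def coordinate_descent(u, v, eta=0.1, iterations=15):
    for _ in range(iterations):
        # Step 1: move along the u coordinate only. The first-order
        # approximation is trusted, so the step is applied without
        # re-checking the error or undoing the update.
        u = u - eta * grad_u(u, v)
        # Step 2: re-evaluate the gradient at the new point and move
        # along the v coordinate only, again without an error check.
        v = v - eta * grad_v(u, v)
    return u, v


# Assumed starting point (0, 0) and learning rate 0.1.
u_final, v_final = coordinate_descent(0.0, 0.0, eta=0.1, iterations=50)
print(u_final, v_final, error(u_final, v_final))
```

With a small enough learning rate the first-order approximation already makes each step reduce the error, which is why the explicit check-and-undo is not part of the procedure.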
