Quote:
Originally Posted by barbacot
When choosing the minimum as the point where the derivative (gradient) is zero, how can we be sure that we don't run over a local minimum, or worse, a local maximum?
Does the quadratic form of the error measure ensure Ein does not have such local maxima and minima, just one global minimum? Indeed, it seems that grad(Ein) = 0 has only one solution. Is this really the case?
If it really is the case, then choosing another error measure could yield such local minima/maxima. How can we avoid getting stuck in one of those points?

It is indeed a global minimum in this case. One easy way to see that, as you observe, is that grad(Ein) = 0 has only one solution (when the matrix is invertible). Since the quadratic form is always nonnegative, it has to have a minimum, and the gradient must be zero at that minimum, so the single solution must be the unique, global minimum.
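To make that concrete, here is a small sketch in plain Python. It assumes a two-parameter linear model y = w0 + w1*x with mean squared error, and the names `e_in` and `solve_normal_equations` and the synthetic data are made up for illustration only. It solves grad(Ein) = 0 directly and then checks that moving away from that solution in random directions only increases Ein.

```python
import random

def e_in(w, xs, ys):
    """Mean squared error of the line y = w[0] + w[1]*x on the data."""
    return sum((w[0] + w[1] * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def solve_normal_equations(xs, ys):
    """Solve (X^T X) w = X^T y for the 2-parameter model using Cramer's rule."""
    n = len(xs)
    sx = sum(xs)
    sxx = sum(x * x for x in xs)
    sy = sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    det = n * sxx - sx * sx  # determinant of X^T X
    if det == 0:
        raise ValueError("X^T X is not invertible, so the solution is not unique")
    w0 = (sxx * sy - sx * sxy) / det
    w1 = (n * sxy - sx * sy) / det
    return [w0, w1]

# Synthetic data (an assumption for this sketch): y = 2 + 3x plus small noise.
rng = random.Random(0)
xs = [rng.uniform(-1.0, 1.0) for _ in range(50)]
ys = [2.0 + 3.0 * x + rng.gauss(0.0, 0.1) for x in xs]

w_star = solve_normal_equations(xs, ys)
best_error = e_in(w_star, xs, ys)

# Perturb the solution in 100 random directions; every perturbed point
# should have a strictly larger in-sample error.
all_worse = all(
    e_in([w_star[0] + rng.uniform(-1, 1), w_star[1] + rng.uniform(-1, 1)], xs, ys)
    > best_error
    for _ in range(100)
)
print(w_star, best_error, all_worse)
```

This is only a numerical sanity check, not a proof; the argument above is what actually establishes the global minimum.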
Other error measures may result in local minima and often have no closed-form solution. The situation is tackled with methods like gradient descent, together with heuristics to reduce the impact of local minima. Finding the global minimum is a general problem in optimization that is unlikely to have an exact, tractable solution since it is NP-hard in the general case.
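As an illustration of getting stuck, here is a minimal sketch of gradient descent on a made-up one-dimensional error surface with two minima. The function f, the learning rate, the step count, and the list of starting points are all assumptions chosen for this example. Restarting from several points and keeping the best result is just one simple heuristic of the kind mentioned above, and it does not guarantee finding the global minimum in general.

```python
def f(w):
    """A hypothetical non-convex error surface with one local and one global minimum."""
    return w ** 4 - 3 * w ** 2 + w

def grad_f(w):
    """Derivative of f."""
    return 4 * w ** 3 - 6 * w + 1

def gradient_descent(w, lr=0.01, steps=2000):
    """Plain fixed-step gradient descent starting from w."""
    for _ in range(steps):
        w = w - lr * grad_f(w)
    return w

# A single run started on the right-hand side settles in the local minimum.
stuck = gradient_descent(1.5)

# Heuristic: restart from several starting points and keep the lowest error.
starts = [-2.0, -1.0, 0.0, 1.0, 2.0]
candidates = [gradient_descent(w0) for w0 in starts]
best = min(candidates, key=f)

print("single run:", stuck, f(stuck))
print("best of restarts:", best, f(best))
```

The single run ends near w = 1.13, while the best restart ends near w = -1.30, where the error is lower. Which starting points reach the better minimum depends entirely on the shape of the surface, which is why this stays a heuristic.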