Quote:
Originally Posted by Kais_M
Look at the complex parameter as a 2D vector of real numbers and compute the derivative wrt that vector. This is why the # of parameters doubles. Is this an "engineering" solution, or is it really mathematically correct?

Say you apply the same principle as in GD: you move through the parameter space by a fixed-length step, in the direction that gives you the biggest change in your objective function under a linear approximation. If you take the size of a complex step to be its Euclidean size (the magnitude of a complex number, measured as the square root of the sum of its squared real and imaginary parts), then the approach quoted above is the principled implementation of GD.
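To make that concrete, here is a minimal sketch in Python/NumPy. Everything in it is my own illustration, not something from the quoted post: the toy objective f(z) = |z - target|^2, the function names, and the step sizes are all assumptions. It takes the gradient with respect to (Re z, Im z), moves a fixed Euclidean length against it, and compares the result with a brute-force search over all complex steps of that same length.

```python
import numpy as np

def objective(z, target):
    # Toy real-valued objective of one complex parameter (an assumption).
    return abs(z - target) ** 2

def real_gradient(z, target):
    # Gradient with respect to the 2D real vector (Re z, Im z).
    d = z - target
    return np.array([2.0 * d.real, 2.0 * d.imag])

def steepest_step(z, target, step_len):
    # Fixed-length step against the gradient, length measured as
    # sqrt(dx**2 + dy**2), i.e. the magnitude of the complex step.
    g = real_gradient(z, target)
    norm = np.linalg.norm(g)
    if norm == 0.0:
        return z  # already at a stationary point
    dx, dy = -step_len * g / norm
    return z + complex(dx, dy)

def best_step_by_search(z, target, step_len, n_angles=3600):
    # Brute force: try many complex steps of the same magnitude and
    # keep the one with the lowest objective value.
    angles = np.linspace(0.0, 2.0 * np.pi, n_angles, endpoint=False)
    candidates = z + step_len * np.exp(1j * angles)
    values = np.abs(candidates - target) ** 2
    return candidates[np.argmin(values)]

def gradient_descent(z0, target, lr=0.1, n_iter=100):
    # Ordinary GD on (Re z, Im z), with the update repacked as a complex number.
    z = z0
    for _ in range(n_iter):
        gx, gy = real_gradient(z, target)
        z = z - lr * complex(gx, gy)
    return z

if __name__ == "__main__":
    z0, target = 2 + 1j, -1 + 3j
    print("gradient step:", steepest_step(z0, target, 0.01))
    print("search step:  ", best_step_by_search(z0, target, 0.01))
    print("GD result:    ", gradient_descent(z0, target))
```

The two single-step results should agree up to the angular resolution of the search, which is the sense in which the 2D-real-vector gradient is the steepest direction under the Euclidean step size. For a real-valued objective, the complex number gx + i*gy is also twice the Wirtinger derivative with respect to the conjugate of z, so the two descriptions give the same update direction.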