#1




Gradient Descent on complex parameters (weights)
Is it possible to use gradient descent (GD) to optimize a complex parameter vector? The model is still linear with a mean squared error measure, but the parameters are complex rather than real. I did some derivation, but I am not sure my derivatives with respect to complex numbers are correct. I would like to hear how people here have dealt (or would deal) with this problem.
Many thanks,
#2




Re: Gradient Descent on complex parameters (weights)
Quote:
__________________
Where everyone thinks alike, no one thinks very much 
#3




Re: Gradient Descent on complex parameters (weights)
A quality function used to compare options, such as an error function, requires at the very least a total order on its range. It would be reasonable for this order to be topological rather than explicitly metric: this would do no harm to the ideas of minima or of comparisons between errors. [One indication of this was in the lectures, where an error function was replaced by its logarithm, in the knowledge that this would preserve the total order.]
But as Yaser indicates, complex numbers lack such an order (at least a natural one), so they can't be the range of an error function.
#4




Re: Gradient Descent on complex parameters (weights)
Thank you for the quick reply. I am using a real error measure, the sum of squared errors, but it is a function of complex parameters. When deriving the equations for the error and the update rule for gradient descent, you hit a point (unless I'm making the same mistake every time) where you have to compute the derivative with respect to a complex parameter. I do not have any intuition for that. It seems that Dr. Yaser is saying that you have to treat the complex parameter as a 2D vector of real numbers and compute the derivative with respect to that vector, which is why the number of parameters doubles. Is this an "engineering" solution, or is it really mathematically correct? There seems to be much more to this than meets the eye.
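For what it's worth, here is a minimal sketch of the "2D vector of real numbers" idea for a linear model with a mean squared error, assuming NumPy. The function name, the learning rate, and the step count are my own choices for illustration, not anything from the lectures. Each complex weight w = a + ib becomes two real parameters (a, b), and ordinary real-valued GD is run on the stacked vector.

```python
import numpy as np

def fit_complex_linear_gd(X, y, lr=0.05, steps=2000):
    """Gradient descent on E(w) = mean |X w - y|^2 with complex w.

    Hypothetical helper: w = a + i b is handled as the real vector [a; b].
    """
    n, d = X.shape
    # X w = (Xr a - Xi b) + i (Xi a + Xr b), so the equivalent real
    # design matrix acting on [a; b] is [[Xr, -Xi], [Xi, Xr]].
    A = np.block([[X.real, -X.imag], [X.imag, X.real]])
    t = np.concatenate([y.real, y.imag])
    p = np.zeros(2 * d)  # twice as many real parameters as complex ones
    for _ in range(steps):
        residual = A @ p - t
        grad = 2.0 * A.T @ residual / n  # ordinary real gradient
        p -= lr * grad
    return p[:d] + 1j * p[d:]  # repack into complex weights
```

The error stays real-valued throughout, so nothing about comparing errors or finding a minimum changes; only the bookkeeping of the parameters does.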

#5




Re: Gradient Descent on complex parameters (weights)
Quote:

#6




Re: Gradient Descent on complex parameters (weights)
Actually, there is a multiplication of complex numbers: one complex number is a parameter we are trying to optimize, and the other is the data. The data is represented in the Fourier domain, which is why it is complex. When you take the derivative with respect to the complex parameter and propagate it through the formula for the sum of squared errors, you eventually have to differentiate the product of the complex parameter and the complex data with respect to the complex parameter. For example, the complex parameters could be values of a transfer function, and the complex data the Fourier transform of a real signal.
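To make that step concrete, here is a worked derivation for a single term of the sum, in my own notation (x for the complex data, y for the complex target, w = a + ib for the parameter; none of these symbols come from the thread). The product wx is perfectly differentiable in w, but the squared error involves a conjugate and is real-valued, so it has no ordinary complex derivative; differentiating with respect to the real and imaginary parts separately still works.

```latex
\begin{aligned}
e &= y - w x, \qquad w = a + i b, \qquad E = |e|^2 = e\,\bar{e} \\
\frac{\partial e}{\partial a} &= -x, \qquad \frac{\partial e}{\partial b} = -i x \\
\frac{\partial E}{\partial a} &= -x\,\bar{e} - e\,\bar{x} = -2\,\mathrm{Re}(\bar{x}\,e) \\
\frac{\partial E}{\partial b} &= -i x\,\bar{e} + i\,e\,\bar{x} = -2\,\mathrm{Im}(\bar{x}\,e) \\
\frac{\partial E}{\partial a} + i\,\frac{\partial E}{\partial b} &= -2\,\bar{x}\,(y - w x)
\end{aligned}
```

So, under these assumptions, a gradient descent step with learning rate \eta can be written back in complex form as w \leftarrow w + 2\eta\,\bar{x}\,(y - w x), with the conjugate of the data appearing where the real-valued case would have the data itself.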

#7




Re: Gradient Descent on complex parameters (weights)
Say you apply the same principle of GD: you move in the parameter space by a fixed-length step (in the direction that gives you the biggest change in your objective function, under a linear approximation). If you take the size of a complex step to be its Euclidean size (the magnitude of a complex number, measured as the square root of the sum of its squared real and imaginary parts), then the approach quoted above would be the principled implementation of GD.
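As a rough numerical check of this, the sketch below (assuming NumPy; all function names are hypothetical) packs the partial derivatives with respect to the real and imaginary parts into one complex vector and compares that against finite differences taken along the real and imaginary directions. With the Euclidean step size, moving against this packed gradient is the steepest-descent step.

```python
import numpy as np

def sse(w, X, y):
    """Real-valued sum of squared errors for complex w, X, y."""
    r = X @ w - y
    return float(np.sum(np.abs(r) ** 2))

def complex_gradient(w, X, y):
    """dE/da + i dE/db for w = a + i b, packed as one complex vector."""
    return 2.0 * X.conj().T @ (X @ w - y)

def numerical_gradient(w, X, y, h=1e-6):
    """Central differences along the real and imaginary axes of each weight."""
    g = np.zeros_like(w, dtype=complex)
    for k in range(w.size):
        e = np.zeros_like(w, dtype=complex)
        e[k] = h
        d_real = (sse(w + e, X, y) - sse(w - e, X, y)) / (2 * h)
        d_imag = (sse(w + 1j * e, X, y) - sse(w - 1j * e, X, y)) / (2 * h)
        g[k] = d_real + 1j * d_imag
    return g

def gd_step(w, X, y, lr):
    """One GD step; |step| is the Euclidean length over real and imaginary parts."""
    return w - lr * complex_gradient(w, X, y)
```

If the two gradients agree on random data, the complex-looking update is just the real 2D update written more compactly.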
#8




Re: Gradient Descent on complex parameters (weights)
Quote:
If you have , you know everything about the function regardless of whether you think of as a complex number or not. Specifically, you know the relative value at one point to another and the minimum. So you can choose to forget it ever was a complex function, think of it as a real function and do the optimisation you want. This is enough, right? 

