Gradient Descent on complex parameters (weights)
Is it possible to use gradient descent to optimize a complex parameter vector? It is still a linear model with a mean-squared-error measure, but the parameters are complex rather than real. I did some derivation, but I am not sure my derivatives with respect to complex numbers are correct. I would like to hear from people here how they dealt (or would deal) with this problem.
Many thanks,
Re: Gradient Descent on complex parameters (weights)
A quality function for comparing options, such as an error function, requires at the very least a total order on its range. It would be reasonable for this order to be topological rather than explicitly metric: this would do no harm to the ideas of minima, or to comparisons between errors. [One indication of this was in the lectures, where an error function was replaced by its logarithm, in the knowledge that this would preserve the total order.]
But as Yaser indicates, complex numbers lack such an order (at least a natural one), so they cannot be the range of an error function.
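To spell out the order-preservation point in symbols (my notation, not the original poster's): any strictly increasing transformation of a real-valued error leaves the comparisons, and hence the minimisers, untouched.

```latex
% A strictly increasing g : \mathbb{R} \to \mathbb{R} preserves the order
% of error values, and therefore the minimisers:
E(\mathbf{w}_1) < E(\mathbf{w}_2)
  \;\Longleftrightarrow\;
  g\bigl(E(\mathbf{w}_1)\bigr) < g\bigl(E(\mathbf{w}_2)\bigr),
\qquad
\arg\min_{\mathbf{w}} g\bigl(E(\mathbf{w})\bigr) = \arg\min_{\mathbf{w}} E(\mathbf{w}).
% The log-of-error from the lectures is the case g = \log (valid for E > 0).
```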
Re: Gradient Descent on complex parameters (weights)
Thank you for the quick reply. I am using a real error measure, the sum of squared errors, but it is a function of complex parameters. When deriving the equations for the error and the update rule for gradient descent, you reach a point (unless I am making the same mistake every time) where you have to compute the derivative with respect to a complex parameter. I do not have any intuition for that. It seems Dr. Yaser is saying that you should view the complex parameter as a 2D vector of real numbers and compute the derivative with respect to that vector, which is why the number of parameters doubles. Is this an "engineering" solution, or is it really mathematically correct? There seems to be much more to this than meets the eye.
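To make the "doubling" view concrete, here is a minimal numerical sketch (all names and data invented for illustration): a linear model with one complex weight, fitted by gradient descent on its real and imaginary parts.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy problem (values invented for illustration):
# y_n ~ w_true * x_n with a single complex weight w_true.
N = 200
x = rng.standard_normal(N) + 1j * rng.standard_normal(N)
w_true = 2.0 - 0.5j
y = w_true * x + 0.01 * (rng.standard_normal(N) + 1j * rng.standard_normal(N))

# View the complex weight w = a + ib as two real parameters (a, b).
# E(a, b) = sum_n |y_n - (a + ib) x_n|^2 is then a real function of two
# real variables, and ordinary gradient descent applies.
a, b = 0.0, 0.0
eta = 0.05 / N
for _ in range(500):
    r = y - (a + 1j * b) * x                         # complex residuals
    grad_a = -2.0 * np.sum(np.real(np.conj(x) * r))  # dE/da
    grad_b = -2.0 * np.sum(np.imag(np.conj(x) * r))  # dE/db
    a -= eta * grad_a
    b -= eta * grad_b

print(a + 1j * b)  # converges to (approximately) w_true
```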
Re: Gradient Descent on complex parameters (weights)
Actually, there is multiplication of complex numbers: one complex number is a parameter we are trying to optimize, and the other is the data. The data is represented in the Fourier domain, which is why it is complex. When taking the derivative with respect to the complex parameter and propagating it inside the formula for the sum of squared errors, you eventually have to take the derivative of the complex parameter multiplied by the complex data, with respect to the complex parameter. For example, the complex parameters could be the values of a transfer function, and the complex data the Fourier transform of a real signal.
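For the record, that derivative has a standard resolution in the Wirtinger (CR) calculus, where $w$ and $\overline{w}$ are treated as independent variables; this is textbook complex-LMS material rather than anything from the posts above. A sketch:

```latex
% Sum of squared errors with complex weight w and complex data (x_n, y_n):
E(w) \;=\; \sum_n \lvert y_n - w x_n \rvert^2
      \;=\; \sum_n (y_n - w x_n)\,\overline{(y_n - w x_n)} .
% Treating w and \overline{w} as independent (Wirtinger calculus), the
% conjugate derivative gives the descent direction:
\frac{\partial E}{\partial \overline{w}} \;=\; -\sum_n \overline{x_n}\,(y_n - w x_n),
% so the gradient-descent update (conventional factor of 2 absorbed into \eta) is
w \;\leftarrow\; w + \eta \sum_n \overline{x_n}\,(y_n - w x_n),
% which coincides with descending on the real and imaginary parts of w
% separately -- this is why the number of real parameters doubles.
```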
Re: Gradient Descent on complex parameters (weights)
If you have a real-valued error function $E(\mathbf{w})$ of complex parameters $\mathbf{w}$, you already know everything the optimisation needs. Specifically, you know the relative value at one point compared to another, and the minimum. So you can choose to forget it ever was a complex function, think of it as a real function, and do the optimisation you want. This is enough, right?
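In code, "forgetting it was complex" can be this direct. A minimal sketch (setup, names, and sizes all invented for illustration), using a vector of complex weights such as the per-frequency values of the transfer function mentioned earlier:

```python
import numpy as np

rng = np.random.default_rng(1)

# Invented toy setup: per-frequency transfer function H (complex weight
# vector), input spectra X, observed output spectra Y = H_true * X + noise.
n_samples, d = 200, 64
X = rng.standard_normal((n_samples, d)) + 1j * rng.standard_normal((n_samples, d))
H_true = rng.standard_normal(d) + 1j * rng.standard_normal(d)
noise = 0.01 * (rng.standard_normal((n_samples, d)) + 1j * rng.standard_normal((n_samples, d)))
Y = H_true * X + noise

# E(H) = sum |Y - H X|^2 is real-valued, so it has well-defined minima even
# though H is complex: descending along the conjugate derivative of E is
# exactly gradient descent on the 2d real parameters (Re H, Im H).
H = np.zeros(d, dtype=complex)
eta = 1e-3
for _ in range(1000):
    R = Y - H * X                           # residual spectra
    grad = -np.sum(np.conj(X) * R, axis=0)  # dE/d(conj H), one value per bin
    H -= eta * grad

print(np.max(np.abs(H - H_true)))  # should be close to zero
```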