LFD Book Forum: Gradient Descent on complex parameters (weights)

#1
05-29-2013, 01:03 PM
 Kais_M Junior Member Join Date: Apr 2013 Posts: 3
Gradient Descent on complex parameters (weights)

Is it possible to use GD to optimize a complex parameter vector? It is still a linear model with a mean-squared-error measure, but the parameters are complex rather than real. I did some derivation, but I'm not sure my derivatives with respect to complex numbers are correct. I would like to hear from people here how they dealt (or would deal) with this problem.

many thanks,
#2
05-29-2013, 01:53 PM
 yaser Caltech Join Date: Aug 2009 Location: Pasadena, California, USA Posts: 1,478
Re: Gradient Descent on complex parameters (weights)

Quote:
 Originally Posted by Kais_M Is it possible to use GD to optimize a complex parameter vector? It is still a linear model with a mean-squared-error measure, but the parameters are complex rather than real. I did some derivation, but I'm not sure my derivatives with respect to complex numbers are correct. I would like to hear from people here how they dealt (or would deal) with this problem. Many thanks,
If the error function itself is real-valued, then this can be done by considering every complex parameter as two parameters (real and imaginary parts) and carrying out GD with respect to these (twice as many) real parameters. If the error function is complex, then there needs to be a definition of what the objective is since a "minimum complex number" is not a well-defined notion.
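To make this concrete, here is a minimal sketch in Python/NumPy of GD with every complex weight treated as two real parameters. The setup (data matrix X, targets y, step size eta, iteration count) is a hypothetical illustration, not something from the thread; the two real partial derivatives are packed back into one complex vector purely for bookkeeping.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: complex data X, complex targets y, and the
# real-valued error E(w) = ||Xw - y||^2 over complex weights w.
N, d = 100, 5
X = rng.standard_normal((N, d)) + 1j * rng.standard_normal((N, d))
w_true = rng.standard_normal(d) + 1j * rng.standard_normal(d)
y = X @ w_true

def error(w):
    r = X @ w - y
    return np.real(np.vdot(r, r))  # ||Xw - y||^2 is always real

def gradient(w):
    # Partials with respect to the real and imaginary parts of w,
    # packed into one complex vector g = 2 X^H (Xw - y):
    #   Re(g) = dE/d(Re w),  Im(g) = dE/d(Im w)
    return 2 * (X.conj().T @ (X @ w - y))

w = np.zeros(d, dtype=complex)
eta = 0.001
for _ in range(2000):
    w = w - eta * gradient(w)  # updates all 2d real parameters at once

print(error(w))  # should be near zero
```

The complex update w - eta * g is exactly ordinary GD on the 2d real parameters, with real and imaginary parts updated independently; the complex notation is just a compact way of writing it.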
__________________
Where everyone thinks alike, no one thinks very much
#3
05-29-2013, 04:26 PM
 Elroch Invited Guest Join Date: Mar 2013 Posts: 143
Re: Gradient Descent on complex parameters (weights)

Using a function to compare the quality of options, as an error function does, requires at the very least a total order on its range. It would be reasonable for this order to be topological rather than explicitly metric: that would do no harm to the notions of minima or of comparisons between errors. [One indication of this appeared in the lectures, where an error function was replaced by its logarithm, in the knowledge that this preserves the total order.]

But as Yaser indicates, complex numbers lack such an order (at least a natural one), so they cannot be the range of an error function.
#4
05-30-2013, 08:35 AM
 Kais_M Junior Member Join Date: Apr 2013 Posts: 3
Re: Gradient Descent on complex parameters (weights)

Thank you for the quick reply. I am using a real error measure, the sum of squared errors, but it is a function of complex parameters. When deriving the equations for the error and the update rule for gradient descent, you hit a point (unless I am making the same mistake every time) where you have to compute the derivative with respect to a complex parameter. I do not have any intuition for that. It seems Dr. Yaser is saying that you have to look at the complex parameter as a 2D vector of real numbers and compute the derivative with respect to that vector, which is why the number of parameters doubles. Is this an "engineering" solution, or is it really mathematically correct? There seems to be much more to this than meets the eye.
#5
05-30-2013, 09:25 AM
 Elroch Invited Guest Join Date: Mar 2013 Posts: 143
Re: Gradient Descent on complex parameters (weights)

Quote:
 Originally Posted by Kais_M Thank you for the quick reply. I am using a real error measure, the sum of squared errors, but it is a function of complex parameters. When deriving the equations for the error and the update rule for gradient descent, you hit a point (unless I am making the same mistake every time) where you have to compute the derivative with respect to a complex parameter. I do not have any intuition for that. It seems Dr. Yaser is saying that you have to look at the complex parameter as a 2D vector of real numbers and compute the derivative with respect to that vector, which is why the number of parameters doubles. Is this an "engineering" solution, or is it really mathematically correct? There seems to be much more to this than meets the eye.
Don't worry, it's just as simple as it appears. For this purpose, a complex parameter is simply two real parameters, since there is no multiplication by complex numbers involved.

#6
05-30-2013, 09:39 AM
 Kais_M Junior Member Join Date: Apr 2013 Posts: 3
Re: Gradient Descent on complex parameters (weights)

Quote:
 Originally Posted by Elroch For this purpose, a complex parameter is simply two real parameters, since there is no multiplication by complex numbers involved.
Actually, there is multiplication of complex numbers: one complex number is the parameter we are trying to optimize, and the other is the data. The data is represented in the Fourier domain, which is why it is complex. When you take the derivative with respect to the complex parameter and propagate it inside the formula for the sum of squared errors, you eventually have to take the derivative of the complex parameter multiplied by the complex data, with respect to the complex parameter. For example, the complex parameters could be values of a transfer function, and the complex data the Fourier transform of a real signal.
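For what it's worth, the multiplication does not break the two-real-parameters view; the derivative just has to be taken with respect to the real and imaginary parts separately. Here is a worked sketch for a single complex weight w = u + iv, a complex input x, and a complex target y (placeholder symbols, not from the thread), with the real error E = |wx - y|^2:

```latex
% Sketch: one data point, r = wx - y, so E = r \bar{r} is real.
\begin{aligned}
E(u,v) &= r\,\bar{r}, \qquad r = (u + iv)\,x - y,\\
\frac{\partial E}{\partial u} &= x\bar{r} + \bar{x}r = 2\,\mathrm{Re}(\bar{x}r),\\
\frac{\partial E}{\partial v} &= ix\bar{r} - i\bar{x}r = 2\,\mathrm{Im}(\bar{x}r).
\end{aligned}
```

Both partials are real; the multiplication by the complex data x simply shows up through the conjugate factor \bar{x}.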
#7
05-30-2013, 12:02 PM
 yaser Caltech Join Date: Aug 2009 Location: Pasadena, California, USA Posts: 1,478
Re: Gradient Descent on complex parameters (weights)

Quote:
 Originally Posted by Kais_M look at the complex parameter as a 2D vector of real numbers and compute the derivative with respect to that vector, which is why the number of parameters doubles. Is this an "engineering" solution, or is it really mathematically correct?
Say you apply the same principle of GD: you move in the parameter space by a fixed-length step, in the direction that gives you the biggest change in your objective function under a linear approximation. If you take the size of a complex step to be its Euclidean size (the magnitude of a complex number, i.e., the square root of the sum of its squared real and imaginary parts), then the approach quoted above is the principled implementation of GD.
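In symbols (a sketch under the Euclidean convention just described): writing each weight as w_k = u_k + i v_k, the fixed-length step that most decreases E to first order is the ordinary negative gradient in the 2d real coordinates, repackaged as one complex number per weight:

```latex
% Steepest descent under the Euclidean step size
% |\Delta w_k|^2 = (\Delta u_k)^2 + (\Delta v_k)^2.
\Delta w_k \;\propto\; -\left( \frac{\partial E}{\partial u_k}
                             + i\,\frac{\partial E}{\partial v_k} \right)
```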
__________________
Where everyone thinks alike, no one thinks very much
