05-06-2013, 01:11 PM
yaser
Join Date: Aug 2009
Location: Pasadena, California, USA
Posts: 1,478
Re: Q8 implementation question

Originally Posted by pyguy
I'm trying to implement stochastic gradient descent for logistic regression, and the gradient formula I'm using is:

\nabla e_n(\mathbf{w}) = \frac{-y_n \mathbf{x}_n}{1 + e^{y_n \mathbf{w}^{\rm T} \mathbf{x}_n}}

I'm running the experiment with what I believe to be the correct inputs, but I'm not getting the output I expect, so I'm retracing my steps to see where I went wrong. One part of the formula I was uncertain about is \mathbf{w}^{\rm T}\mathbf{x}_n. Aren't \mathbf{w} and \mathbf{x}_n both 1x3 row vectors? If I transpose \mathbf{w} and multiply, I get a 3x3 matrix, which doesn't make sense in this calculation, so for now I'm multiplying them as if they were just 1x3 row vectors.
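The convention is that vectors are column vectors, so \mathbf{w} and \mathbf{x}_n are both 3x1. Then \mathbf{w}^{\rm T} is 1x3, and \mathbf{w}^{\rm T}\mathbf{x}_n is a (1x3)(3x1) product, i.e. a scalar. It is an inner product, not an outer product.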
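To make the shapes concrete, here is a minimal plain-Python sketch of the difference between the two products, and of the gradient formula quoted above. The helper names (inner, outer, grad_e_n) and the sample values are assumptions made up for illustration; they are not from the homework.

```python
import math

def inner(w, x):
    """Inner product w^T x of two equal-length vectors: a single scalar."""
    return sum(wi * xi for wi, xi in zip(w, x))

def outer(w, x):
    """Outer product w x^T: a len(w)-by-len(x) matrix (not what the formula uses)."""
    return [[wi * xj for xj in x] for wi in w]

def grad_e_n(w, x_n, y_n):
    """Single-point gradient: -y_n x_n / (1 + exp(y_n w^T x_n))."""
    scale = -y_n / (1.0 + math.exp(y_n * inner(w, x_n)))
    return [scale * xi for xi in x_n]

# Hypothetical values, chosen only for illustration.
w = [0.0, 0.0, 0.0]
x_n = [1.0, 2.0, -1.0]   # first coordinate is the constant 1
y_n = 1.0

s = inner(w, x_n)          # a scalar, as the exponent requires
m = outer(w, x_n)          # a 3x3 matrix, which cannot go in the exponent
g = grad_e_n(w, x_n, y_n)  # a vector with the same length as w
```

With the weights at zero the exponent is 0, so the gradient is just -y_n x_n / 2. The result g has one component per weight, which is what the update to \mathbf{w} needs.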