I'm trying to implement stochastic gradient descent for linear regression, and the update formula I'm using is:

I'm running the experiment with what I believe are the correct inputs, but I'm not getting the output I expect, so I'm retracing my steps to see where I went wrong. Looking at the formula, the one part I was uncertain about was the

part. Aren't

and

both 1x3 row vectors? If I transpose

, and multiply, I'd get a 3x3 matrix which didn't make sense to me in the calculation, so I'm essentially multiplying them right now as if they were just 1x3 row vectors.
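
To make the shape question concrete, here is a minimal sketch in plain Python. The names `w` and `x`, their values, and the helper functions are hypothetical stand-ins for the two 1x3 row vectors, not taken from the formula. It compares transposing the first vector (3x1 times 1x3), transposing the second (1x3 times 3x1), and the element-wise multiply-and-sum I'm doing now.

```python
def transpose(m):
    """Swap rows and columns of a matrix stored as a list of rows."""
    return [list(row) for row in zip(*m)]

def matmul(a, b):
    """Plain matrix product; inner dimensions must agree."""
    assert len(a[0]) == len(b), "inner dimensions do not match"
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def shape(m):
    """Return (rows, columns) of a matrix stored as a list of rows."""
    return (len(m), len(m[0]))

# Hypothetical 1x3 row vectors (values chosen only for illustration).
w = [[1.0, 2.0, 3.0]]
x = [[4.0, 5.0, 6.0]]

# Transposing the FIRST row vector: (3x1) times (1x3) is an outer product.
outer = matmul(transpose(w), x)

# Transposing the SECOND row vector: (1x3) times (3x1) is an inner product.
inner = matmul(w, transpose(x))

# Element-wise multiply and sum, treating both as flat length-3 vectors.
dot = sum(wi * xi for wi, xi in zip(w[0], x[0]))

print(shape(outer))               # (3, 3)
print(shape(inner), inner[0][0])  # (1, 1) 32.0
print(dot)                        # 32.0
```

If the formula assumes column vectors, which is the usual textbook convention but an assumption on my part, then the transposed product corresponds to the 1x1 case here, and it gives the same number as the element-wise multiply-and-sum.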