#1
Let's say we are using SGD with the gradient function
-(yn*xn)/(1 + e^(yn*w*xn)), where xn and w are 3-element vectors. When I evaluate this function, can I evaluate it 3 times, once for each corresponding x[i] and w[i], to get a 3-element gradient vector, and then update w from that vector? Something like:

Code:

    double gradient(double yn, double xn, double wn) {
        double num = -1.0*yn*xn;
        double denom = 1 + exp(yn*wn*xn);
        return num / denom;
    }

    double g0 = gradient(yn, 1.0, w0);
    double g1 = gradient(yn, x1n, w1);
    double g2 = gradient(yn, x2n, w2);

    w0 = w0 - eta*g0;
    w1 = w1 - eta*g1;
    w2 = w2 - eta*g2;
#2
I think your approach is wrong for the denominator. My understanding is that w*x is an inner product that yields a scalar, so the only vector part of the gradient is the yn*xn on the top of the fraction.
(In particular, all 3 w and x terms are used in the exponential in the denominator for each of the 3 values in the gradient, while the numerator uses a different x for each of the 3 terms.)
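
Written out with the inner product explicit, the point above might be expressed as follows. This is only a restatement of the formula from the first post; the component index i and the convention that the first component of x_n is 1 (the bias term, as in the posted code) are assumptions made for illustration.

```latex
\[
g_{n,i} \;=\; -\,\frac{y_n\, x_{n,i}}{1 + e^{\,y_n \mathbf{w}^{\mathsf T}\mathbf{x}_n}},
\qquad
\mathbf{w}^{\mathsf T}\mathbf{x}_n \;=\; \sum_{j=0}^{2} w_j\, x_{n,j},
\qquad i = 0, 1, 2 .
\]
```

The denominator is the same scalar for all three components; only the factor x_{n,i} in the numerator changes with i.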
#3
Basically you have to calculate the partial derivative of the error function with respect to each dimension of the w vector.
Then update each dimension of w by subtracting eta times its partial derivative, with every partial derivative evaluated at the original (not yet updated) w vector. This will help: http://www.youtube.com/watch?v=U7HQ_...hannel&list=UL
#4
Holland, I think you are right. I thought the same thing, so I tried it with the whole dot product in the denominator, and I am getting a much better final set of w values.
Instead of just plugging in the individual w[i] and x[i] values, I now pass in the full dot product:

Code:

    /* One gradient component: xn is the matching feature, wtx the full dot product. */
    double altgradient(double yn, double xn, double wtx) {
        double num = -1.0*yn*xn;
        double denom = 1 + exp(yn*wtx);
        return num / denom;
    }

    /* w . x for one data point, with x0 = 1 for the bias term. */
    double calcWTX(double x1, double x2, double w0, double w1, double w2) {
        double sum = w0 + w1*x1 + w2*x2;
        return sum;
    }

    double wtx = calcWTX(x1n, x2n, w0, w1, w2);
    double g0 = altgradient(yn, 1.0, wtx);
    double g1 = altgradient(yn, x1n, wtx);
    double g2 = altgradient(yn, x2n, wtx);

Thanks!
#5
I realized that the numerator term (y_n*x_n) is a 3 x n matrix (one column per data point) that doesn't change for a particular data set, so I evaluate it once before starting the logistic regression loop. Inside the loop, only the denominator has to be re-evaluated, because w changes on every iteration.