#1




Evaluating a gradient function with vectors
Let's say we are using SGD with the gradient function
(yn*xn)/(1 + e^(yn*w*xn)), where xn and w are 3-element vectors. When I evaluate this function, can I evaluate it 3 times, once for each corresponding pair x[i] and w[i], to get a 3-element gradient vector, and then update w from that vector? Something like:
Code:
double gradient(double yn, double xn, double wn) {
    double num = 1.0*yn*xn;
    double denom = 1 + exp(yn*wn*xn);
    return num / denom;
}

double g0 = gradient(yn, 1.0, w0);
double g1 = gradient(yn, x1n, w1);
double g2 = gradient(yn, x2n, w2);

w0 = w0 - eta*g0;
w1 = w1 - eta*g1;
w2 = w2 - eta*g2;
#2




Re: Evaluating a gradient function with vectors
I think your approach is wrong for the denominator. My understanding is that w*x is an inner product, which yields a scalar, so the only vector part of the gradient is the yn*xn in the numerator.
(In particular, all 3 w and x terms are used in the exponential in the denominator for each of the 3 components of the gradient, whereas the numerator uses a different x for each component.)
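Written out component by component, the point about the shared denominator might look like the following. This is only a sketch: it takes the expression from the first post as given, and the indexing x_{n,0} = 1 for the constant coordinate is an assumption based on the posted code.

```latex
g_i \;=\; \frac{y_n\, x_{n,i}}{1 + e^{\,y_n \mathbf{w}^{\mathsf T}\mathbf{x}_n}},
\qquad
\mathbf{w}^{\mathsf T}\mathbf{x}_n \;=\; w_0 x_{n,0} + w_1 x_{n,1} + w_2 x_{n,2},
\qquad i = 0, 1, 2 .
```

Only the numerator changes with i; the exponent is the same scalar for all three components.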
#3




Re: Evaluating a gradient function with vectors
Basically, you have to calculate the partial derivative of the error function with respect to each dimension of the w vector.
Then update each dimension of w using its partial derivative, the original w vector, and eta. This will help: http://www.youtube.com/watch?v=U7HQ_...hannel&list=UL
#4




Re: Evaluating a gradient function with vectors
Holland, I think you are right. I thought the same thing, and I ended up trying it with the whole dot product in the denominator, and I am getting a much better final set of w values.
Instead of just plugging in individual values in the exponential, I go ahead and calculate the whole dot product and use the same denominator for each element of the gradient. Code:
double altgradient(double yn, double xn, double wtx) {
    double num = 1.0*yn*xn;
    double denom = 1 + exp(yn*wtx);
    return num / denom;
}

double calcWTX(double x1, double x2, double w0, double w1, double w2) {
    double sum = w0 + w1*x1 + w2*x2;
    return sum;
}

double wtx = calcWTX(x1n, x2n, w0, w1, w2);
double g0 = altgradient(yn, 1.0, wtx);
double g1 = altgradient(yn, x1n, wtx);
double g2 = altgradient(yn, x2n, wtx);

Thanks!
#5




Re: Evaluating a gradient function with vectors
I realized that the numerator term (y_n*x_n) is a size 3 x n matrix that doesn't change for a particular dataset. So I evaluate it before I start the logistic regression loop. Inside the loop, the denominator has to be evaluated because w is changing each time.


