Re: Basic logistic regression question
Okay, I think I see how that works, but I'm still struggling to understand Q8. In the question I set the weights to 0. Then, the first time through the loop, I calculate the gradient of E_in using the formula in step 3. Because w is all zeros, the denominator ends up as 1 + e^0 = 2. The numerator is at most ±1, so the largest each component of the gradient can be is ±0.5.
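If it helps, here's roughly what I'm doing for the gradient in numpy. The data here is made up (I can't post the real set), and I'm assuming the usual form of the gradient, -(1/N) * sum_n y_n x_n / (1 + e^(y_n w·x_n)):

```python
import numpy as np

# Made-up data standing in for my real set: 100 points, 3 features, labels in {-1, +1}.
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(100, 3))
y = rng.choice([-1.0, 1.0], size=100)

w = np.zeros(3)  # weights initialized to zero, as in the question

# With w = 0, every y_n * w.x_n is 0, so each exp term is e^0 = 1 and each denominator is 2.
grad = -np.mean((y[:, None] * X) / (1 + np.exp(y * (X @ w)))[:, None], axis=0)

# Since |y_n * x_nj| <= 1 here, no gradient component can exceed 0.5 in magnitude.
print(grad, np.abs(grad).max())
```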
Then in step 4 I update the weights:

w(1) = w(0) - learningRate * gradient
w(1) = (0, 0, 0) - 0.01 * (0.5, 0.5, 0.5)
w(1) = (-0.005, -0.005, -0.005)
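Numerically, that update step is just this (a quick numpy sketch of the arithmetic above; note the update subtracts the gradient, so the components come out negative, but the size of the change is the same either way):

```python
import numpy as np

learning_rate = 0.01             # the learning rate from the question
w0 = np.zeros(3)                 # w(0) = (0, 0, 0)
gradient = np.array([0.5, 0.5, 0.5])  # worst-case gradient at w = 0

w1 = w0 - learning_rate * gradient    # w(1) = w(0) - eta * gradient
print(w1)                             # [-0.005 -0.005 -0.005]
```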
Now the question says to stop the algorithm when ||w(t+1) - w(t)|| < 0.01. So:

sqrt(0.005^2 + 0.005^2 + 0.005^2) ≈ 0.0087
So based on the values above the algorithm will stop after the first iteration because the difference in weights is < 0.01.
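And here's the stopping check the way I'm computing it (squaring makes the sign of the weight components irrelevant):

```python
import numpy as np

w0 = np.zeros(3)
w1 = np.array([-0.005, -0.005, -0.005])  # weights after the first update

# ||w(1) - w(0)|| = sqrt(3 * 0.005^2) ~= 0.0087, which is already below 0.01
diff = np.linalg.norm(w1 - w0)
print(diff, diff < 0.01)
```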
Have I misunderstood the gradient of E_in formula? Or am I calculating my error incorrectly? I've tried batch gradient descent and get the results above (I have 100 data points, but the change still ends up less than 0.01). I've also tried stochastic gradient descent and have similar problems.
I've watched lecture 9 a couple of times now and seem to understand how the algorithm works, but I guess my understanding isn't complete.
Any suggestions would be most appreciated.
