*ANSWER* q8/9
I got both of these questions wrong and I'm not sure what I've done wrong. Is anyone who got these right willing to post their code, so that I can compare it with mine and work out where I went wrong? (I could post my code, but it seems cruel to ask others to understand it.)
Any language is fine, but I'd prefer Python or C/C++/Java. R or Octave are okay too, though.
Re: * answer * q8/9
Quote:

Re: * answer * q8/9
This is my code in Octave (it is not correct but maybe you could help me find what is wrong):
Code:
Nonetheless, I'm missing something that I just can't quite pin down. Any help is appreciated.
Re: * answer * q8/9
I think I fixed the problem. I was confused and generated new random points for each epoch, when what was really required was a permutation of the original training points.
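To illustrate the difference, here is a minimal Python sketch of the per-epoch permutation idea: the training set stays fixed and only the visiting order changes. The function name `epoch_orders` and its parameters are made up for this example and are not from anyone's posted code.

```python
import random

def epoch_orders(num_points, num_epochs, seed=None):
    """Yield one random visiting order per epoch over the SAME fixed indices.

    Hypothetical helper: the data set itself is never regenerated,
    only the order in which its points are visited.
    """
    rng = random.Random(seed)
    indices = list(range(num_points))
    for _ in range(num_epochs):
        rng.shuffle(indices)   # in-place permutation of the fixed index list
        yield list(indices)    # copy, so callers cannot disturb later epochs

# Usage: every epoch touches each of the 5 training points exactly once.
for order in epoch_orders(5, 3, seed=0):
    print(order)
```

Each printed list is a reordering of the indices 0 to 4, so no point is dropped or duplicated within an epoch.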
This is my fix:
Code:
function [N, theta, Eout] = trainLogisticRegression(eta, mfunc, bfunc, numPointsPerEpoch)
Eout1 is giving me an average of 0.3, which is not the required answer. Eout2 is giving me an average of 0.09, which is close to the final answer. I'm wondering if there is still something wrong with what I'm doing...
Re: *ANSWER* q8/9
Thanks for the suggestion Elroch. Here are the steps I am following:
1) Generate a random set of data points, with values between -1 and +1. I am certain that this bit of code is working correctly.
2) Set weight = [0,0,0], eta = 0.01.
3) Do the stochastic gradient descent.
3a) Shuffle the 100 data points. (The shuffling is definitely working.)
3b) On the first shuffled data point, get the gradient using the initial weight. To calculate the gradient I am using the code below; could this be wrong?
Code:
def gradient_descent(weight, x, y):
Code:
error = gradient_descent(weight, x, y)
3d) I repeat 3b/3c for all data points, using the updated weight for each new data point.
4) Once I have updated the weight based on all the data points, I compare the final weight of the iteration with the initial weight of the iteration.
4a) To compare the weights I use the following function, which finds the sqrt of the sum of the differences squared. Perhaps this is wrong?
Code:
def calc_error(new_weights, old_weights):
4c) If the error is still too large, go to 3 and use the new weight as the new initial weight.
Well, if anyone can spot what mistake I've made, or even something that doesn't look right, then please say something.
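For comparison, the steps above can be sketched in plain Python roughly as follows. This is not the poster's code (the bodies of the posted functions did not survive in the thread). It assumes the usual logistic-regression cross-entropy error ln(1 + exp(-y w·x)) for a single point, inputs of the form [1, x1, x2], and the 0.01 stopping threshold mentioned elsewhere in the thread. The names `point_gradient`, `weight_change` and `sgd_logistic`, and the `max_epochs` safety cap, are invented for this example.

```python
import math
import random

def point_gradient(w, x, y):
    """Gradient of ln(1 + exp(-y * w.x)) with respect to w (assumed error measure)."""
    s = y * sum(wi * xi for wi, xi in zip(w, x))
    scale = -y / (1.0 + math.exp(s))
    return [scale * xi for xi in x]

def weight_change(new_w, old_w):
    """Euclidean distance between two weight vectors (step 4a)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(new_w, old_w)))

def sgd_logistic(points, labels, eta=0.01, tol=0.01, max_epochs=100000, rng=None):
    """Run SGD epochs until the weights move less than tol over a whole epoch."""
    rng = rng or random.Random()
    w = [0.0, 0.0, 0.0]                      # step 2: start from zero weights
    order = list(range(len(points)))
    for epoch in range(1, max_epochs + 1):
        w_start = list(w)                    # weights at the start of this epoch
        rng.shuffle(order)                   # step 3a: permute the fixed data set
        for i in order:                      # steps 3b-3d: one update per point
            g = point_gradient(w, points[i], labels[i])
            w = [wi - eta * gi for wi, gi in zip(w, g)]
        if weight_change(w, w_start) < tol:  # step 4: compare epoch start and end
            return w, epoch
    return w, max_epochs                     # cap reached without meeting tol
```

Two things worth checking against your own version: the weights are updated after every single point (not once per epoch), and the comparison in step 4 uses the weights from the start of the current epoch, not from the very first epoch.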
Re: * answer * q8/9
Quote:

Re: *ANSWER* q8/9
Hi arcticblue, my R code yields the expected average number of epochs / out-of-sample error, now that (thanks to you) I've implemented the correct exit condition for the SGD. Have you found the issue with your implementation? The approach you're outlining above seems correct. What kind of results are you getting? I'll have a look at your code if you post it.

Re: * answer * q8/9
Hi apbarraza, the problem seems to be with the way you compute the magnitude of the difference between the weight vector at the beginning and at the end of each epoch. Use norm (with p = "fro") instead of abs. Also Eout1 is the way to go. Good luck!
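A small Python sketch of why the two tests can disagree. The original Octave loop is not shown in the thread, so this assumes it tested `abs(theta - theta_last) > 0.01` on the whole vector; in Octave a vector condition counts as true only when every element is true. The numbers below are made up for illustration. For a vector, the Frobenius norm is the same as the ordinary Euclidean norm.

```python
import math

def l2_norm(v):
    """Euclidean norm; for a vector this equals the Frobenius norm."""
    return math.sqrt(sum(c * c for c in v))

# Made-up weights at the start and end of one epoch.
theta_last = [0.0, 0.0, 0.0]
theta = [0.008, 0.008, 0.0]
diff = [a - b for a, b in zip(theta, theta_last)]

# Norm-based test: keep iterating while the whole vector moved more than 0.01.
norm_says_continue = l2_norm(diff) > 0.01

# Assumed abs-based test: true only if EVERY component moved more than 0.01.
abs_says_continue = all(abs(c) > 0.01 for c in diff)

print(norm_says_continue, abs_says_continue)
```

Here the norm of the difference is about 0.0113, so the norm test keeps going, while the elementwise test would already have stopped.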

Re: *ANSWER* q8/9
arcticblue, both the description of what you intended to do and code fragments 3b, 3c and 4a look fine to me. Sherlock Holmes' famous maxim must apply:
"When you have eliminated the impossible, whatever remains, however improbable, must be the truth." [i.e. it must be in what you haven't posted]
Re: * answer * q8/9
Quote:
YES!! Thank you sooo much. I have been banging my head and I can't believe I missed this. Changed the while to:
Code:
while ((norm(theta - theta_last, "fro") > 0.01))
Thank you.
The contents of this forum are to be used ONLY by readers of the Learning From Data book by Yaser S. Abu-Mostafa, Malik Magdon-Ismail, and Hsuan-Tien Lin, and participants in the Learning From Data MOOC by Yaser S. Abu-Mostafa. No part of these contents is to be communicated or made accessible to ANY other person or entity.