#1
Hi, can anybody send me their R code for Q8/9, or take a look at mine (appended below)? My gradient descent never runs for more than 1 epoch. I can't figure out what I'm doing wrong; I've pored over my code for so long that I'm feeling cross-eyed.
Code:
# Learning from Data HW5 (items 8 and 9)
# Logistic Regression with Stochastic Gradient Descent
nIter = 100; nIn = 100; nOut = 10000; eta = 0.01;
eOut = rep(NA, nIter);
nEpochs = rep(NA, nIter);
set.seed(123);

for (i in (1:nIter)) {
  # Initialization of f(x) and of matching In-sample
  ...
  nEpoch = nEpoch + 1;
}}

# E(Eout)
avgEOut = mean(eOut);
# E(NEpoch)
avgNEpoch = mean(nEpochs);
#2
Hi Catherine,
It would help if you could wrap your code in code tags; the tags can be found with the other formatting options when posting a message on the forum. That will make your code easier for us to read, and it will also get rid of the sad-face smiley, although I thought it was appropriate.
#3
Sorry, I hadn't noticed the advanced editing option. This is the first time I've posted anything to the forum.
#4
Welcome aboard.
__________________
Where everyone thinks alike, no one thinks very much
#5
I think the following bit of code has a bug. You are updating w and w0 every time through the loop and then comparing the two w's at the end. I believe you should be comparing the value of w from before the loop to the value of w at the end of the loop.
So the first time through you have w = [0,0,0]; then you loop through all the points, updating the weights. After that, you compare w = [0,0,0] with the weights you end up with at the end of the loop, and you use those updated weights as the initial weights for the next pass. Code:
for (j in permuted.ind) {
  w0 = w;
  v = (-YIn[j] * XIn[j,]) / (1 + exp(YIn[j] * (XIn[j,] %*% w0)));
  w = w0 - eta * v;
}
# Stop gradient descent as soon as delta(w, w0) < 0.01
if (sqrt(sum((w - w0)^2)) < 0.01) break()
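Untested, but I'd expect the fix to look something like this (keeping your variable names, and writing the inner product with sum() so everything stays a plain vector):
Code:
# Snapshot the weights once, before the epoch starts
w0 = w;
for (j in permuted.ind) {
  v = (-YIn[j] * XIn[j,]) / (1 + exp(YIn[j] * sum(XIn[j,] * w)));
  w = w - eta * v;
}
# w0 now holds the weights from the start of the epoch, so this
# compares whole epochs rather than consecutive SGD steps
if (sqrt(sum((w - w0)^2)) < 0.01) break()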
#6
Good spotting! Thank you so much, arcticblue! I had simply misread the definition of the exit condition.
#7
Hello all:
I answered questions 8 and 9 incorrectly. My R code ran, but it gives wrong values. Here it is: Code:
#Q8
#we need a function for determining where a point falls
#given the 2 points p1 and p2 that determine the line
out <- function(p1,p2,p) {
  m <- (p2-p1)[2]/(p2-p1)[1]
  if (p[2] < m*(p[1]-p1[1])+p1[2]) return(-1)
  else return(+1)
}

#function for generating the training data. Size=N
generate_training_set <- function(p1,p2,N) {
  tm <- matrix(data=runif(min=-1,max=1,n=2*N),nrow=N,ncol=2)
  y <- sapply(1:N,function(i) out(p1,p2,tm[i,]))
  return( as.matrix(data.frame(x0=rep(1,N),x1=tm[,1],x2=tm[,2],y=y)) )
}

N <- 100
eta <- 0.1
runs <- 100

#the number of epochs and the Eout for each run
#will be stored in the respective vectors
epochs <- numeric(0); Eouts <- numeric(0)

for (r in 1:runs) {
  #Generate the 2 points for the line
  p1 <- runif(min=-1,max=+1,n=2)
  p2 <- runif(min=-1,max=+1,n=2)
  #generate the training set of size N
  training_data <- generate_training_set(p1,p2,N)
  w <- c(0,0,0); wp <- c(1,1,1); epoch <- 0
  while (sqrt(sum((w-wp)^2)) > 0.01) {
    wp <- w
    perm <- sample(1:N,size=N)
    for (j in 1:N) {
      i <- perm[j]
      grad <- (-training_data[i,4])*training_data[i,1:3] /
              (1+exp(training_data[i,4]*w%*%training_data[i,1:3]))
      w <- w - eta*grad
    }
    epoch <- epoch + 1
  }
  epochs <- c(epochs,epoch)
  #Evaluate Eout, generate a new test set 10 times larger
  test_data <- generate_training_set(p1,p2,N*10)
  s <- sapply(1:(N*10),function(i)
         log(1+exp(test_data[i,1:3]%*%w*(-test_data[i,4]))))
  Eout <- mean(s)
  Eouts <- c(Eouts,Eout)
  print(paste(r,epoch,Eout)) #so I can see what the program is doing
}
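One thing the script above doesn't do is average the per-run results; the Q8/Q9 answers would then come from something along the lines of:
Code:
#Averages over the runs: expected number of epochs and expected Eout
mean(epochs)
mean(Eouts)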
#8
Yes, I now get average number of epochs = 340.17 and average error out = 0.1036295 after correcting the exit condition for the gradient descent:
Code:
# Gradient descent
repeat {
  wbase = w;
  nEpoch = nEpoch + 1;
  # Permute the in-sample points and update the weight vector with each data point in turn
  permuted.ind = sample(nIn, nIn, replace=FALSE);
  for (j in permuted.ind) {
    v = (-YIn[j] * XIn[j,]) / (1 + exp(YIn[j] * (XIn[j,] %*% w)));
    w = w - eta * v;
  }
  # Stop gradient descent as soon as delta(w, wbase) < 0.01
  if (sqrt(sum((w - wbase)^2)) < 0.01) break()
}
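For anyone comparing numbers: the avgEOut above is the cross-entropy error averaged over a fresh out-of-sample set of nOut points, along the lines of the following (XOut and YOut here stand for the out-of-sample inputs and labels; substitute whatever names your script uses):
Code:
# Cross-entropy error on nOut fresh test points
# (XOut: nOut x 3 input matrix, YOut: labels in {-1, +1})
eOut[i] = mean(log(1 + exp(-YOut * drop(XOut %*% w))));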
#9
Oops! Thanks for pointing that out, catherine! That's how I blew 2 points on that quiz!