LFD Book Forum (http://book.caltech.edu/bookforum/index.php)
-   Homework 5 (http://book.caltech.edu/bookforum/forumdisplay.php?f=134)

 catherine 05-07-2013 10:59 PM

*ANSWER* HW5 Q8/9 - Help with R code

Hi, can anybody send me their R code for Q 8/9 or look at mine (appended below)? My gradient descent never runs for more than 1 epoch. I can't figure out what I'm doing wrong, I've pored over my code for so long I'm feeling cross-eyed

# Learning from Data HW5 (items 8 and 9)
# Logistic Regression with Stochastic Gradient Descent

nIter = 100;
nIn = 100;
nOut = 10000;
eta = 0.01;
eOut = rep(NA, nIter);
nEpochs = rep(NA, nIter);

set.seed(123);
for (i in (1:nIter)) {
# Initialization of f(x) and of matching In-sample
x1 = runif(min=-1,max=1,2);
x2 = runif(min=-1,max=1,2);
f = lm(x2 ~ x1);
dataIn = data.frame(x0 = rep(1,nIn), x1 = runif(min=-1,max=1,nIn), x2 = runif(min=-1,max=1,nIn));
XIn = as.matrix(dataIn);
YIn = ifelse((dataIn\$x2 > predict(f,newdata=dataIn)),1,-1);
w = c(0,0,0);
nEpoch = 0;

repeat {
nEpoch = nEpoch +1;
# Permutate In-sample and update weight vector with each data point in turn
permuted.ind = sample(nIn, nIn, replace=FALSE);
for (j in permuted.ind) {
w0 = w;
v = (-YIn[j] * XIn[j,]) / (1 + exp(YIn[j] * (XIn[j,] %*% w0)));
w = w0 - eta * v;
}
# Stop gradient descent as soon as delta(w, w0) < 0.01
if (sqrt(sum((w - w0)^2)) < 0.01) break()
}
# Save number of epoch cycles required to reach exit condition
nEpochs[i] = nEpoch;

# Initalize Out-sample
outCoords = runif(min=-1,max=1,2*nOut);
outX1.ind = sample(2*nOut,nOut);
outX1 = outCoords[outX1.ind];
outX2 = outCoords[-outX1.ind];
dataOut = data.frame(x0 = rep(1,nOut), x1 = outX1, x2 = outX2);
XOut = as.matrix(dataOut);
YOut = ifelse((dataOut\$x2 > predict(f,newdata=dataOut)),1,-1);
# Compute and store out-sample error matching weight vector w obtained through gradient descent
eOut[i] = sum(sapply(1 + sapply(-1 * YOut * (XOut %*% w), exp), log)) / nOut;
}
# E(Eout)
avgEOut = mean(eOut);
# E(NEpoch)
avgNEpoch = mean(nEpochs);

 arcticblue 05-07-2013 11:20 PM

Re: *ANSWER* HW5 Q8/9 - Help with R code

Hi Catherine,

It would help if you can wrap your code up in the code tags. The tags can be found with the other formatting options when posting a message on the forum. This will make it easier for us to read and remove the sad face, although I thought it was appropriate.

 catherine 05-07-2013 11:53 PM

Re: *ANSWER* HW5 Q8/9 - Help with R code

Sorry, I hadn't noticed the advanced editing option. First time I'm posting something to the forum.

 yaser 05-07-2013 11:54 PM

Re: *ANSWER* HW5 Q8/9 - Help with R code

Quote:
 Originally Posted by catherine (Post 10766) First time I'm posting something to the forum.
Welcome aboard.

 arcticblue 05-08-2013 02:31 AM

Re: *ANSWER* HW5 Q8/9 - Help with R code

I think the following bit of code has a bug. You are updating w and w0 every time through the loop and then comparing the two w's at the end. I believe you should be comparing the value of w from before the loop to the value of w at the end of the loop.

So the first time you have w=[0,0,0] then you loop through all points updating the weight. Then after that you compare w=[0,0,0] with weight you have at the end of the loop. Then you use the updated weight as the initial weight for the next time through the loop.

Code:

for (j in permuted.ind) {
w0 = w;
v = (-YIn[j] * XIn[j,]) / (1 + exp(YIn[j] * (XIn[j,] %*% w0)));
w = w0 - eta * v;
}
# Stop gradient descent as soon as delta(w, w0) < 0.01
if (sqrt(sum((w - w0)^2)) < 0.01) break()

 catherine 05-08-2013 04:04 AM

Re: *ANSWER* HW5 Q8/9 - Help with R code

Good spotting! Thank you so much arcticblue! I had simply misread the definition of the exit condition :)

 jlaurentum 05-08-2013 06:49 AM

Re: *ANSWER* HW5 Q8/9 - Help with R code

Hello all:

I answered questions 8 and 9 incorrectly. My R code ran, but it is giving wrong values. Here it is:

Code:

#Q8
#we need a function for determining where a point falls
#given the 2 points p1 and p2 that determine the line

out <- function(p1,p2,p) {
m <- (p2-p1)[2]/(p2-p1)[1]
if (p[2] < m*(p[1]-p1[1])+p1[2]) return(-1) else return(+1)
}

#function for generating the training data. Size=N

generate_training_set <- function(p1,p2,N) {
tm <- matrix(data=runif(min=-1,max=1,n=2*N),nrow=N,ncol=2)
y <- sapply(1:N,function(i) out(p1,p2,tm[i,]))
return( as.matrix(data.frame(x0=rep(1,N),x1=tm[,1],x2=tm[,2],y=y)) )
}

N <- 100
eta <- 0.1
runs <- 100
#the number of epochs and the Eout for each run
#will be stored in  the respective vectors
epochs <- numeric(0); Eouts <- numeric(0)
for (r in 1:runs) {
#Generate the 2 points for the line
p1 <- runif(min=-1,max=+1,n=2)
p2 <- runif(min=-1,max=+1,n=2)
#generate the training set of size N
training_data <- generate_training_set(p1,p2,N)
w <- c(0,0,0); wp <- c(1,1,1); epoch <- 0
while (sqrt(sum((w-wp)^2)) > 0.01) {
wp <- w
perm <- sample(1:N,size=N)
for (j in 1:N) {
i <- perm[j]
(1+exp(training_data[i,4]*w%*%training_data[i,1:3]))
}
epoch <- epoch + 1
}
epochs <- c(epochs,epoch)
#Evaluate Eout, generate a new test set 10 times larger
test_data <- generate_training_set(p1,p2,N*10)
s <- sapply(1:(N*10),function(i) log(1+exp(test_data[i,1:3]%*%w*(-test_data[i,4]))) )
Eout <- mean(s)
Eouts <- c(Eouts,Eout)
print(paste(r,epoch,Eout))    #so I can see what the program is doing

}

Did you manage to get the answers correct, Catherine?

 catherine 05-08-2013 06:25 PM

Re: *ANSWER* HW5 Q8/9 - Help with R code

Yes, I now get average number of epochs = 340.17 and average error out = 0.1036295 after correcting the exit condition for the gradient descent:

Code:

repeat {
wbase = w;
nEpoch = nEpoch +1;
# Permutate In-sample and update weight vector with each data point in turn
permuted.ind = sample(nIn, nIn, replace=FALSE);
for (j in permuted.ind) {
v = (-YIn[j] * XIn[j,]) / (1 + exp(YIn[j] * (XIn[j,] %*% w)));
w = w - eta * v;
}
# Stop gradient descent as soon as delta(w, wbase) < 0.01
if (sqrt(sum((w - wbase)^2)) < 0.01) break()
}

There is nothing wrong with your code ... except for the eta value you're using: 0.1 instead of 0.01. Sorry ...

 jlaurentum 05-09-2013 09:32 AM

Re: *ANSWER* HW5 Q8/9 - Help with R code

OOps! Thanks for pointing that out, catherine! That's how I blew away 2 points on that quiz!

 All times are GMT -7. The time now is 09:05 PM.