LFD Book Forum  

#1
05-13-2013, 11:08 PM
Michael Reach
Senior Member
Join Date: Apr 2013
Location: Baltimore, Maryland, USA
Posts: 71
*ANSWER* questions w linear regression & weight decay

I have been running the weight decay examples in Q2-6, but I haven't seen any real improvement in the out-of-sample error compared to no regularization at all. Is that just a feature of this particular problem, or should I recheck my calculations?

Unfortunately (or not), the answers I've been getting do appear among the multiple-choice options.
#2
05-14-2013, 12:05 AM
Ziad Hatahet
Member
Join Date: Apr 2013
Location: San Francisco, CA
Posts: 23
Re: *ANSWER* questions w linear regression & weight decay

You should be seeing a change in the out-of-sample error when you vary k (for certain values of k at least). Are you using classification error as your error measure?
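For concreteness, here is a minimal sketch of that error measure in R (with Z the feature matrix, w the weight vector, and y the ±1 labels — names assumed, use whatever you have):
Code:
# classification error: fraction of points whose predicted sign
# disagrees with the label
mean(sign(Z %*% w) != y)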
#3
05-14-2013, 06:29 AM
jlaurentum
Member
Join Date: Apr 2013
Location: Venezuela
Posts: 41
Re: *ANSWER* questions w linear regression & weight decay

Are you using the correct formula for the one-step solution?

I was using (Z^T Z - \lambda I)\ldots instead of (Z^T Z + \lambda I)\ldots, so regularization didn't make sense at all. I caught the error because I saw in another post that Professor Yaser corrected a student on the plus sign.
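For reference, the corrected one-step solution is w_{reg} = (Z^T Z + \lambda I)^{-1} Z^T y, which reduces to plain linear regression when \lambda = 0.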
#4
05-14-2013, 06:54 AM
Michael Reach
Senior Member
Join Date: Apr 2013
Location: Baltimore, Maryland, USA
Posts: 71
Re: *ANSWER* questions w linear regression & weight decay

Quote:
Originally Posted by Ziad Hatahet
You should be seeing a change in the out-of-sample error when you vary k (for certain values of k at least). Are you using classification error as your error measure?
Well, I am seeing a change, just not really a reduction. Some are the same, some are bigger. I was expecting a dramatic drop in the out-of-sample error.

And yes, I have been using classification error, but that is a good point - I started out using the regression residuals and such, but at least I caught that mistake.
#5
05-14-2013, 12:56 PM
Michael Reach
Senior Member
Join Date: Apr 2013
Location: Baltimore, Maryland, USA
Posts: 71
Re: *ANSWER* questions w linear regression & weight decay

As I suspected, all my answers on these were wrong. Does anyone have code (R if possible) to share that I could use for comparison? I suspect my problem was something dumb; even the original linear regression was wrong, and I had compared that one against the answer from R's lm() function.
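(The lm() check I mean is roughly this sketch, where Z is the nonlinear feature matrix including the bias column and Y is the vector of labels:)
Code:
# unregularized one-step solution vs. lm(); the -1 drops lm()'s own
# intercept, since Z already carries a bias column
w_direct <- solve(t(Z) %*% Z) %*% t(Z) %*% Y
w_lm <- coef(lm(Y ~ Z - 1))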

I'm especially concerned since HW 7 uses all the same data again - so I really need to track this down.
#6
05-14-2013, 03:10 PM
Elroch
Invited Guest
Join Date: Mar 2013
Posts: 143
Re: *ANSWER* questions w linear regression & weight decay

Quote:
Originally Posted by Michael Reach
As I suspected, all my answers on these were wrong. Does anyone have code (R if possible) to share that I could use for comparison? I suspect my problem was something dumb; even the original linear regression was wrong, and I had compared that one against the answer from R's lm() function.

I'm especially concerned since HW 7 uses all the same data again - so I really need to track this down.
Are you using lambda from 0.001 to 1000? I suppose it might be possible to forget to compute the power. If you use this range, the added term in the matrix described in this thread can hardly fail to have a significant effect.
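(That is, the grid is powers of ten, not the exponents themselves — a one-line sketch:)
Code:
# lambda = 10^k for k = -3..3, not lambda = k
lambdas <- 10^(-3:3)   # 0.001, 0.01, 0.1, 1, 10, 100, 1000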
#7
05-14-2013, 04:06 PM
jlaurentum
Member
Join Date: Apr 2013
Location: Venezuela
Posts: 41
Re: *ANSWER* questions w linear regression & weight decay

Michael:

Here you go:

Code:
#READ IN THE FILES.
datos1 <- read.table("in.dta")
names(datos1) <- c("X1","X2","Y")
datos2 <- read.table("out.dta")
names(datos2) <- c("X1","X2","Y")
#FOR THE FOLLOWING QUESTIONS, SET UP THE MATRICES OF NONLINEAR FEATURES
Z <- with(datos1,
          cbind(rep(1, nrow(datos1)), X1, X2,
                X1^2, X2^2, X1*X2, abs(X1-X2), abs(X1+X2)))
Z <- as.matrix(Z)
Zout <- with(datos2,
             cbind(rep(1, nrow(datos2)), X1, X2,
                   X1^2, X2^2, X1*X2, abs(X1-X2), abs(X1+X2)))
Zout <- as.matrix(Zout)
#NOW FIT WITH WEIGHT DECAY USING LAMBDA = 10^-3
lambda <- 10^(-3)

#ONE-STEP SOLUTION: w = (Z^T Z + lambda*I)^{-1} Z^T y
M <- t(Z) %*% Z + lambda * diag(ncol(Z))
w <- solve(M) %*% t(Z) %*% datos1$Y
#IN-SAMPLE AND OUT-OF-SAMPLE CLASSIFICATION ERRORS
Ym <- as.numeric(sign(Z %*% w))
Ein <- mean(datos1$Y != Ym)
Ym <- as.numeric(sign(Zout %*% w))
Eout <- mean(datos2$Y != Ym)
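To cover the other values of k in these questions, the same fit can be wrapped in a loop — a sketch reusing the objects above:
Code:
#E_OUT FOR EACH k, REUSING Z, Zout, datos1 AND datos2 FROM ABOVE
eout.by.k <- sapply(-3:3, function(k) {
  w <- solve(t(Z) %*% Z + 10^k * diag(ncol(Z))) %*% t(Z) %*% datos1$Y
  mean(datos2$Y != as.numeric(sign(Zout %*% w)))
})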
#8
05-16-2013, 07:16 PM
Michael Reach
Senior Member
Join Date: Apr 2013
Location: Baltimore, Maryland, USA
Posts: 71
Re: *ANSWER* questions w linear regression & weight decay

Thanks!

Yes, Elroch, I used the full range of lambda. I think my mistake is elsewhere.
#9
05-17-2013, 04:28 PM
Elroch
Invited Guest
Join Date: Mar 2013
Posts: 143
Re: *ANSWER* questions w linear regression & weight decay

Quote:
Originally Posted by Michael Reach
Thanks!

Yes, Elroch, I used the full range of lambda. I think my mistake is elsewhere.
If you're like me, you've probably made a silly error which has nothing to do with understanding the method.

OK, I'm going to expose most of my insult to the art of programming for these questions. Don't use it as a style guide (especially that nasty bit of unvectorised code; also, I suspect the as.matrix calls may be superfluous). The data format should be clear, I hope.
Code:
WeightDecayLinearRegressionSolver <- function(inputs, outputs, lambda) {
  # inputs is a matrix of feature vectors, one row per point, whose first
  # column is the bias co-ordinate
  # outputs is a vector providing a real-valued function of those points
  if (isTRUE(all.equal(var(outputs), 0))) {
    # The completely degenerate case, which occurs when trying to classify
    # data of a single class: constant weight on the bias, zero elsewhere
    result <- c(outputs[1], rep(0, length(inputs[1,]) - 1))
  }
  else {
    result <- PseudoInverse(t(as.matrix(inputs)) %*% as.matrix(inputs) +
                            diag(rep(lambda, length(inputs[1,])))) %*%
              t(as.matrix(inputs)) %*% outputs
  }
  result
}

PseudoInverse <- function(mat) {
  tmat <- t(as.matrix(mat))
  solve(tmat %*% as.matrix(mat)) %*% tmat  # solve() is base R's matrix inverse
}

ClassificationError <- function(actual, predicted) {
  # fraction of points where the +/-1 labels disagree
  result <- 0
  for (i in 1:length(actual)) {
    if (abs(actual[i] - predicted[i]) > 0.5) {
      result <- result + 1
    }
  }
  result / length(actual)
}
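A minimal usage sketch (assuming the Z, Zout, datos1 and datos2 objects from the earlier post in this thread):
Code:
# hypothetical usage, with lambda = 10^-3
w <- WeightDecayLinearRegressionSolver(Z, datos1$Y, 10^(-3))
ClassificationError(datos1$Y, sign(Z %*% w))      # E_in
ClassificationError(datos2$Y, sign(Zout %*% w))   # E_out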
#10
05-19-2013, 07:21 AM
warren
Junior Member
Join Date: May 2013
Location: Tulsa, OK
Posts: 4
Re: *ANSWER* questions w linear regression & weight decay

I am really stuck starting with problem 2 on homework 6. I want to find out where I went wrong before I start on homework 7, since I got 3/10 on homework 6. Is there anybody here who reads Clojure who can tell me where I went wrong?
Code:
(ns hw6.core
  (:require [clojure.java.io :as io]
            [clatrix.core :as m] ))

(defn pseudo-inverse [M]
  (m/* (m/i (m/* (m/t M) M)) (m/t M)))

(defn read-dataset [url]
  (m/matrix (with-open [r (io/reader url)]
              (doall (map
                      (comp
                       (partial map read-string)
                       (partial re-seq #"\S+"))
                      (line-seq r))))))

(defn augment-dataset [M]
  (let [[x1s x2s] (m/cols M)
        [n] (m/size M)]
    (m/hstack (m/ones n 1)
              x1s
              x2s
              (m/mult x1s x1s)
              (m/mult x2s x2s)
              (m/mult x1s x2s)
              (m/abs (m/- x1s x2s))
              (m/abs (m/+ x1s x2s)))))

(defn ys [M]
  (let [[_ _ ys] (m/cols M)] ys))

(defn read-in-sample []
  (read-dataset "http://work.caltech.edu/data/in.dta"))

(defn read-out-of-sample []
  (read-dataset "http://work.caltech.edu/data/out.dta"))

(defn read-setup []
  (let [in (read-in-sample)
        out (read-out-of-sample)]
    {
     :x-ins (augment-dataset in)
     :x-outs (augment-dataset out)
     :y-ins (ys in)
     :y-outs (ys out)
     }
    ))

(defn weights [probset]
  (m/* (pseudo-inverse (:x-ins probset)) (:y-ins probset)))

(defn e-in [probset]
  (let [the-diff (m/- (m/* (:x-ins probset) (weights probset)) (:y-ins probset))
        matches (count (filter (partial > 0.5) (m/mult the-diff the-diff)))
        n (count (m/rows the-diff))]
    (/ (- n matches) n)))

(defn e-out [probset]
  (let [the-diff (m/- (m/* (:x-outs probset) (weights probset)) (:y-outs probset))
        matches (count (filter (partial > 0.5) (m/mult the-diff the-diff)))
        n (count (m/rows the-diff))]
    (/ (- n matches) n)))

(defn problem-6-2-eval [x y]
  (+ (* (- 3/35 x)
        (- 3/35 x))
     (* (- 21/125 y)
        (- 21/125 y))))

;;--------------------------------------------------------------------------

hw6.core> (seq (weights (read-setup)))
(-1.6470670613492875 -0.14505926927976592 0.10154120500179364 -2.032968443227123 -1.8280437313439264 2.4815294496056963 4.158938609024668 0.31651714084678323)
hw6.core> (e-in (read-setup))
3/35
hw6.core> (e-out (read-setup))
21/125
hw6.core> (problem-6-2-eval 0.03, 0.08)
0.010848081632653063
hw6.core> (problem-6-2-eval 0.03, 0.10)
0.007728081632653061
hw6.core> (problem-6-2-eval 0.04, 0.09)
0.008173795918367349
hw6.core> (problem-6-2-eval 0.04, 0.11)
0.005453795918367348
hw6.core> (problem-6-2-eval 0.05, 0.10)
0.005899510204081633
hw6.core>
This seems to show that none of the answers for problem 2 is terribly close in Euclidean distance, with D the closest among them. According to the answer key, the correct answer is A.

Many, many thanks in advance to whoever can straighten me out!