  #10  
Old 05-19-2013, 08:21 AM
warren
Junior Member
 
Join Date: May 2013
Location: Tulsa, OK
Posts: 4
Re: *ANSWER* questions w linear regression & weight decay

I am really stuck starting with problem 2 on homework 6. I want to find out where I went wrong before I start on homework 7, since I got 3/10 on homework 6. Is there anybody here who reads Clojure who can tell me where I went wrong?
Code:
(ns hw6.core
  (:require [clojure.java.io :as io]
            [clatrix.core :as m]))

;; Pseudo-inverse for a matrix M with full column rank: (M^T M)^-1 M^T
(defn pseudo-inverse [M]
  (m/* (m/i (m/* (m/t M) M)) (m/t M)))

(defn read-dataset [url]
  (m/matrix (with-open [r (io/reader url)]
              (doall (map
                      (comp
                       (partial map read-string)
                       (partial re-seq #"\S+"))
                      (line-seq r))))))

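;; Nonlinear transform of each point (x1, x2) into
;; (1, x1, x2, x1^2, x2^2, x1*x2, |x1 - x2|, |x1 + x2|)
(defn augment-dataset [M]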
  (let [[x1s x2s] (m/cols M)
        [n] (m/size M)]
    (m/hstack (m/ones n 1)
              x1s
              x2s
              (m/mult x1s x1s)
              (m/mult x2s x2s)
              (m/mult x1s x2s)
              (m/abs (m/- x1s x2s))
              (m/abs (m/+ x1s x2s)))))

(defn ys [M]
  (let [[_ _ ys] (m/cols M)] ys))

(defn read-in-sample []
  (read-dataset "http://work.caltech.edu/data/in.dta"))

(defn read-out-of-sample []
  (read-dataset "http://work.caltech.edu/data/out.dta"))

(defn read-setup []
  (let [in (read-in-sample)
        out (read-out-of-sample)]
    {:x-ins (augment-dataset in)
     :x-outs (augment-dataset out)
     :y-ins (ys in)
     :y-outs (ys out)}))

(defn weights [probset]
  (m/* (pseudo-inverse (:x-ins probset)) (:y-ins probset)))

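;; In-sample error as computed here: the fraction of points whose squared
;; residual (prediction minus label) is 0.5 or more.
(defn e-in [probset]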
  (let [the-diff (m/- (m/* (:x-ins probset) (weights probset)) (:y-ins probset))
        matches (count (filter (partial > 0.5) (m/mult the-diff the-diff)))
        n (count (m/rows the-diff))]
    (/ (- n matches) n)))

;; Out-of-sample error, computed the same way as e-in but on the test set.
(defn e-out [probset]
  (let [the-diff (m/- (m/* (:x-outs probset) (weights probset)) (:y-outs probset))
        matches (count (filter (partial > 0.5) (m/mult the-diff the-diff)))
        n (count (m/rows the-diff))]
    (/ (- n matches) n)))

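;; Squared Euclidean distance from the answer choice (x, y) to the computed
;; errors (3/35, 21/125).
(defn problem-6-2-eval [x y]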
  (+ (* (- 3/35 x)
        (- 3/35 x))
     (* (- 21/125 y)
        (- 21/125 y))))

;;--------------------------------------------------------------------------

hw6.core> (seq (weights (read-setup)))
(-1.6470670613492875 -0.14505926927976592 0.10154120500179364 -2.032968443227123 -1.8280437313439264 2.4815294496056963 4.158938609024668 0.31651714084678323)
hw6.core> (e-in (read-setup))
3/35
hw6.core> (e-out (read-setup))
21/125
hw6.core> (problem-6-2-eval 0.03, 0.08)
0.010848081632653063
hw6.core> (problem-6-2-eval 0.03, 0.10)
0.007728081632653061
hw6.core> (problem-6-2-eval 0.04, 0.09)
0.008173795918367349
hw6.core> (problem-6-2-eval 0.04, 0.11)
0.005453795918367348
hw6.core> (problem-6-2-eval 0.05, 0.10)
0.005899510204081633
hw6.core>
This seems to show that none of the answer choices for problem 2 is terribly close in Euclidean distance (problem-6-2-eval returns the squared distance, which gives the same ranking), but that D is the closest among them. According to the answer key, the correct answer is A.
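
One thing that may be worth checking against the code above: e-in and e-out count a point as correct when its squared residual is below 0.5, whereas a classification error is normally counted by comparing the sign of the prediction with the label. Below is a minimal sketch of the two counts side by side. It assumes the labels are +1/-1 and uses plain sequences rather than clatrix matrices; `sign`, `classification-error`, `squared-threshold-error` and the sample numbers are hypothetical and are not part of hw6.core or the data files.

```clojure
;; Hypothetical helpers for illustration only; not part of hw6.core.

(defn sign
  "Returns 1.0 for non-negative x and -1.0 otherwise."
  [x]
  (if (neg? x) -1.0 1.0))

(defn classification-error
  "Fraction of points whose predicted sign differs from the label's sign.
  Assumes labels are +1/-1."
  [predictions labels]
  (let [misses (count (filter true?
                              (map (fn [p y] (not= (sign p) (sign y)))
                                   predictions labels)))]
    (/ misses (count labels))))

(defn squared-threshold-error
  "Fraction of points whose squared residual is 0.5 or more.
  This mirrors the counting rule used by e-in and e-out in the post."
  [predictions labels]
  (let [sq-residuals (map (fn [p y] (* (- p y) (- p y))) predictions labels)
        matches (count (filter (partial > 0.5) sq-residuals))
        n (count labels)]
    (/ (- n matches) n)))

;; Made-up predictions and labels, chosen so the two rules disagree.
(def sample-predictions [2.3 -0.1 0.4 -1.7])
(def sample-labels [1.0 -1.0 -1.0 -1.0])

(classification-error sample-predictions sample-labels)    ; => 1/4
(squared-threshold-error sample-predictions sample-labels) ; => 3/4
```

On this made-up data the two rules give quite different error rates: a confident correct prediction such as 2.3 for a label of +1 has a large residual, so the threshold rule counts it as a miss even though its sign is right.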

Many, many thanks in advance to whoever can straighten me out!