LFD Book Forum (http://book.caltech.edu/bookforum/index.php)
-   Homework 4 (http://book.caltech.edu/bookforum/forumdisplay.php?f=133)
-   -   *ANSWER* HW4 #4: graphical hint (ok, more than a hint) (http://book.caltech.edu/bookforum/showthread.php?t=4253)

dlammerts 04-29-2013 06:36 PM

*ANSWER* HW4 #4: graphical hint (ok, more than a hint)
 
Based on 1,000 runs. Solid straight line is the resulting average hypothesis.

http://www.venturephilosopher.com/?attachment_id=136

Dirk

jcmorales1564 04-30-2013 03:25 AM

Re: *ANSWER* HW4 #4: graphical hint (ok, more than a hint)
 
Very nice plot. You can clearly see that the hypotheses are restricted to rotation about the origin. If you used R or Matlab (or Octave), could you please post the code to generate the plot?

Thank you.

Juan

jlevy 04-30-2013 03:49 AM

Re: *ANSWER* HW4 #4: graphical hint (ok, more than a hint)
 
Quote:

Originally Posted by jcmorales1564 (Post 10665)
Very nice plot. You can clearly see that the hypotheses are restricted to rotation about the origin.

Not a big surprise, since the hypothesis h(x) = ax has a zero intercept.
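
For readers who want to check this numerically, here is a minimal R sketch (not from the original posts). It assumes a least-squares fit of h(x) = ax to two points, for which the slope has the closed form a = (x1*y1 + x2*y2) / (x1^2 + x2^2), and compares that with lm() using a zero-intercept formula. The variable names and the seed are illustrative assumptions.

```r
# Two sample points on f(x) = sin(pi*x); the seed is arbitrary
set.seed(1)
x <- runif(2, -1, 1)
y <- sin(pi * x)

# Closed-form least-squares slope for a line through the origin
a_closed <- sum(x * y) / sum(x^2)

# Same fit via lm(), forcing the intercept to zero with "0 +"
a_lm <- unname(coef(lm(y ~ 0 + x)))

# The two slopes should agree up to floating-point error
all.equal(a_closed, a_lm)
```

Since the only free parameter is the slope, each data set can only rotate the line about the origin, which is what the plot shows.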

dlammerts 04-30-2013 08:47 AM

Re: *ANSWER* HW4 #4: graphical hint (ok, more than a hint)
 
Juan et al.,
Here is the R code. It plots the graph, prints slope and intercept, and also calculates bias and variance. You will see three code sections - one for each of the hypothesis classes discussed in class and in this homework:
h(x) = b (class)
h(x) = ax+b (class)
h(x) = ax (HW)

Dirk
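
For reference, the quantities that the code estimates can be written out as follows. This is a summary added here, not part of the original post; it assumes the lecture's notation, with g^(D) the hypothesis fitted to data set D, K the number of simulated data sets, and x uniform on [-1, 1] (which is where the factor 1/2 in the code comes from).

```latex
\begin{align*}
\bar{g}(x) &= \mathbb{E}_{\mathcal{D}}\big[g^{(\mathcal{D})}(x)\big]
  \approx \frac{1}{K}\sum_{k=1}^{K} g^{(\mathcal{D}_k)}(x) \\
\text{bias} &= \mathbb{E}_{x}\big[(\bar{g}(x) - f(x))^2\big]
  = \frac{1}{2}\int_{-1}^{1} \big(\bar{g}(x) - \sin(\pi x)\big)^2 \, dx \\
\text{var} &= \mathbb{E}_{\mathcal{D}}\Big[\mathbb{E}_{x}\big[(g^{(\mathcal{D})}(x) - \bar{g}(x))^2\big]\Big]
  \approx \frac{1}{K}\sum_{k=1}^{K} \frac{1}{2}\int_{-1}^{1} \big(g^{(\mathcal{D}_k)}(x) - \bar{g}(x)\big)^2 \, dx
\end{align*}
```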

#############################################

# Caltech Machine Learning April 2013
# HW4 - Bias and variance (problems #4-7)

# Clear workspace
rm(list = ls())

# Set seed for random number generation
set.seed(2013)

# Set sample size
N_sample = 1000

# Create sample and estimate expected value for the hypothesis using h(x) = b (same as in lecture example 1)
# Initialize variable to store regression parameter for regression h(x) = ax+b (slope a will be forced to zero)
a <- NA
b <- NA
variance <- NA
# Plot f(x)
plot.new()
# Plot f(x) = sin(pi*x)
curve(sin(pi*x), -1, 1, main='h(x) = b', col="black")
for (i in 1:N_sample) {
x_values <- runif(2, -1.0, 1.0)
data_set <- data.frame(x = x_values, y = sin(pi * x_values))
a[i] <- 0.0
b[i] <- 0.5 * (data_set[1,"y"] + data_set[2,"y"])
# Calculate variance for the specific hypothesis for this particular data set using g_bar(x) = 0.0
# The fitted hypothesis a[i]*x + b[i] is parenthesized so that it is subtracted as a whole
integrand <- function(x) {(0.0 - (a[i]*x + b[i]))^2}
# Dividing definite integral by 2 to calculate expected value relative to x since range of x is [-1,1]
variance[i] <- 0.5 * (integrate(integrand,-1,1)$value)
# Plot current h(x)
abline(b[i], a[i], col="grey75")
}
# Plot expected value for hypothesis g
abline(mean(b), mean(a), col="black")
# Plot f(x) = sin(pi*x) again to overlay on top of chart
curve(sin(pi*x), -1, 1, add=TRUE, col="black")
# Print average regression parameters as expected value for hypothesis g
mean(a)
mean(b)
# Calculate bias using g_bar(x) = 0.0
integrand <- function(x) {(0.0-sin(pi*x))^2}
# Dividing definite integral by 2 to calculate expected value relative to x since range of x is [-1,1]
bias <- 0.5 * (integrate(integrand,-1,1)$value)
bias
# Calculate variance using g_bar(x) = 0.0
# Average over data set specific variances
mean(variance)


# Create sample and estimate expected value for the hypothesis using h(x) = ax+b (same as in lecture example 2)
# Initialize variable to store regression parameter for regression h(x) = ax+b
a <- NA
b <- NA
variance <- NA
# Plot f(x)
plot.new()
# Plot f(x) = sin(pi*x)
curve(sin(pi*x), -1, 1, main='h(x) = ax+b', col="black")
for (i in 1:N_sample) {
x_values <- runif(2, -1.0, 1.0)
data_set <- data.frame(x = x_values, y = sin(pi * x_values))
regression_params <- lm(formula = data_set$y ~ data_set$x)
a[i] <- regression_params$coefficients[2]
b[i] <- regression_params$coefficients[1]
# Calculate variance for the specific hypothesis for this particular data set using g_bar(x) = 0.77*x (hard-coded slope)
# The fitted hypothesis a[i]*x + b[i] must be subtracted as a whole, hence the parentheses
integrand <- function(x) {(0.77*x - (a[i]*x + b[i]))^2}
# Dividing definite integral by 2 to calculate expected value relative to x since range of x is [-1,1]
variance[i] <- 0.5 * (integrate(integrand,-1,1)$value)
# Plot current h(x)
abline(b[i], a[i], col="grey75")
}
# Plot expected value for hypothesis g
abline(mean(b), mean(a), col="black")
# Plot f(x) = sin(pi*x) again to overlay on top of chart
curve(sin(pi*x), -1, 1, add=TRUE, col="black")
# Print average regression parameters as expected value for hypothesis g
mean(a)
mean(b)
# Calculate bias using g_bar(x) = 0.77*x
integrand <- function(x) {(0.77*x-sin(pi*x))^2}
# Dividing definite integral by 2 to calculate expected value relative to x since range of x is [-1,1]
bias <- 0.5 * (integrate(integrand,-1,1)$value)
bias
# Calculate variance using g_bar(x) = 0.77*x
# Average over data set specific variances
mean(variance)


# Create sample and estimate expected value for the hypothesis using h(x) = ax (HW problem)
# Initialize variable to store regression parameter for regression h(x) = ax+b (intercept b will be forced to zero)
a <- NA
b <- NA
variance <- NA
# Plot f(x)
plot.new()
# Plot f(x) = sin(pi*x)
curve(sin(pi*x), -1, 1, main='h(x) = ax', col="black")
for (i in 1:N_sample) {
x_values <- runif(2, -1.0, 1.0)
data_set <- data.frame(x = x_values, y = sin(pi * x_values))
# Force intercept to zero through y ~ 0 + x model term
regression_params <- lm(formula = data_set$y ~ 0 + data_set$x)
a[i] <- regression_params$coefficients
b[i] <- 0.0
# Calculate variance for the specific hypothesis for this particular data set using g_bar(x) = 1.40*x (hard-coded slope)
# The fitted hypothesis a[i]*x + b[i] is parenthesized so that it is subtracted as a whole
integrand <- function(x) {(1.40*x - (a[i]*x + b[i]))^2}
# Dividing definite integral by 2 to calculate expected value relative to x since range of x is [-1,1]
variance[i] <- 0.5 * (integrate(integrand,-1,1)$value)
# Plot current h(x)
abline(b[i], a[i], col="grey75")
}
# Plot expected value for hypothesis g
abline(mean(b), mean(a), col="black")
# Plot f(x) = sin(pi*x) again to overlay on top of chart
curve(sin(pi*x), -1, 1, add=TRUE, col="black")
# Print average regression parameters as expected value for hypothesis g
mean(a)
mean(b)
# Calculate bias using g_bar(x) = 1.40*x
integrand <- function(x) {(1.40*x-sin(pi*x))^2}
# Dividing definite integral by 2 to calculate expected value relative to x since range of x is [-1,1]
bias <- 0.5 * (integrate(integrand,-1,1)$value)
bias
# Calculate variance using g_bar(x) = 1.40*x
# Average over data set specific variances
mean(variance)
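
A possible variation for the h(x) = ax case, sketched here as an addition and not part of the code above: instead of hard-coding the slope of g_bar, estimate it as the mean of the simulated slopes and reuse that value for both bias and variance. This sketch swaps in the closed-form slope sum(x*y)/sum(x^2) in place of lm(), and uses E[x^2] = 1/3 for x uniform on [-1, 1] in place of integrate() for the variance. The names n_runs, a_bar, bias_hat and var_hat are illustrative assumptions.

```r
set.seed(2013)
n_runs <- 1000

# Each row holds one data set of two points drawn uniformly from [-1, 1]
x <- matrix(runif(2 * n_runs, -1, 1), ncol = 2)
y <- sin(pi * x)

# Least-squares slope through the origin for every data set at once
a <- rowSums(x * y) / rowSums(x^2)

# Slope of the average hypothesis g_bar(x) = a_bar * x
a_bar <- mean(a)

# Bias: E_x[(g_bar(x) - f(x))^2]; the factor 0.5 is the uniform density on [-1, 1]
bias_hat <- 0.5 * integrate(function(x) (a_bar * x - sin(pi * x))^2, -1, 1)$value

# Variance: E_D[E_x[((a - a_bar) * x)^2]] = E_D[(a - a_bar)^2] * E[x^2], with E[x^2] = 1/3
var_hat <- mean((a - a_bar)^2) / 3

c(a_bar = a_bar, bias = bias_hat, variance = var_hat)
```

Because the random draws are consumed in a different order than in the loop-based version, the printed numbers will not match it digit for digit, even with the same seed.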

jcmorales1564 05-06-2013 01:37 AM

Re: *ANSWER* HW4 #4: graphical hint (ok, more than a hint)
 
Dirk,

Thanks very much! It works like a charm.

Juan


The contents of this forum are to be used ONLY by readers of the Learning From Data book by Yaser S. Abu-Mostafa, Malik Magdon-Ismail, and Hsuan-Tien Lin, and participants in the Learning From Data MOOC by Yaser S. Abu-Mostafa. No part of these contents is to be communicated or made accessible to ANY other person or entity.