Here are the old homework problems (they are not required in this course, and no technical support is provided for them).

Backpropagation:

Following the class notes, implement the backpropagation algorithm that takes as input a network architecture (d^(0), d^(1), ..., d^(L)) and a set of examples (x_n, y_n), n = 1, ..., N, where x_n ∈ R^(d^(0)) and y_n ∈ R, and produces as output the network weights. The algorithm should perform gradient descent on one example at a time, but should also keep track of the average error over all the examples in each epoch. Try your algorithm on the data set in

http://work.caltech.edu/train.dat

(the first two columns are the input and the third column is the output).
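The per-example gradient descent loop can be sketched roughly as follows. This is a minimal illustration, not the official solution: it assumes tanh activations and squared error (as in the class notes), a bias unit prepended to every non-output layer, and a toy data set in place of train.dat (which would be loaded with np.loadtxt).

```python
# Minimal sketch of the assignment's setup: tanh units, squared error,
# gradient descent on one example at a time, and the average error over
# all examples tracked for each epoch. Weight layout and hyperparameters
# here are illustrative assumptions, not the assignment's specification.
import numpy as np

def init_weights(arch, scale, rng):
    """arch = (d0, d1, ..., dL); weights[l] maps layer l (plus bias) to layer l+1."""
    return [rng.uniform(-scale, scale, size=(arch[l - 1] + 1, arch[l]))
            for l in range(1, len(arch))]

def forward(x, weights):
    """Return the activations of every layer; non-final layers get a bias unit of 1."""
    xs = [np.concatenate(([1.0], x))]
    for l, W in enumerate(weights):
        a = np.tanh(xs[-1] @ W)
        if l < len(weights) - 1:
            a = np.concatenate(([1.0], a))
        xs.append(a)
    return xs

def backprop_epoch(X, y, weights, eta):
    """One epoch of per-example gradient descent; returns the epoch's mean squared error."""
    total = 0.0
    for x_n, y_n in zip(X, y):
        xs = forward(x_n, weights)
        out = xs[-1]
        total += np.sum((out - y_n) ** 2)
        # Output-layer sensitivity for squared error with a tanh output.
        delta = 2.0 * (out - y_n) * (1.0 - out ** 2)
        for l in range(len(weights) - 1, -1, -1):
            grad = np.outer(xs[l], delta)
            if l > 0:
                a = xs[l][1:]                               # drop the bias unit
                delta = (weights[l][1:] @ delta) * (1.0 - a ** 2)
            weights[l] -= eta * grad                        # update after delta is computed
    return total / len(X)

# Demo on toy data standing in for train.dat (2 inputs, 1 output):
rng = np.random.default_rng(0)
W = init_weights((2, 3, 1), scale=0.2, rng=rng)
X = rng.uniform(-1, 1, size=(20, 2))
y = np.sign(X[:, 0] * X[:, 1]).reshape(-1, 1)
errs = [backprop_epoch(X, y, W, eta=0.1) for _ in range(200)]
```

Note that the weights of each layer are updated only after the sensitivity of the layer below has been computed from the old values, so one pass through the loop is a correct single-example gradient step.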

Test the convergence behavior for architectures with one hidden layer (L = 2) and 1 to 5 neurons (d^(1) = 1, ..., 5), with combinations of the following parameters:

(i) The initial weight values chosen independently and randomly from the range (-0.02, 0.02), the range (-0.2, 0.2), or the range (-2, 2).

(ii) The learning rate η fixed at each of three values.

(iii) A sufficient number of epochs to get the training error to converge (within reason).

Turn in your code and a single parameter combination that resulted in good convergence for the above architectures.
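Enumerating the parameter combinations above is mechanical; one way to organize the experiments is sketched below. Everything here is a hypothetical scaffold: `train_once` is a stand-in for your own backpropagation routine, and the learning-rate values are placeholders (the assignment's fixed rates), not prescribed ones.

```python
# Sketch of an experiment driver over the parameter grid in (i)-(ii) and
# the five hidden-layer sizes. `train_once` is a hypothetical stand-in for
# your backpropagation routine; the learning rates are placeholder values.
import itertools

weight_scales = [0.02, 0.2, 2.0]       # half-widths of the three initial-weight ranges
learning_rates = [0.01, 0.1, 1.0]      # placeholder values for the three fixed rates
hidden_sizes = [1, 2, 3, 4, 5]

def train_once(hidden, scale, eta, epochs=1000):
    """Stand-in: run backprop and return the final average training error."""
    raise NotImplementedError          # plug in your implementation here

results = {}
for m, scale, eta in itertools.product(hidden_sizes, weight_scales, learning_rates):
    try:
        results[(m, scale, eta)] = train_once(m, scale, eta)
    except NotImplementedError:
        pass                           # skipped until train_once is filled in
```

Sorting `results` by final error then points at a single combination that converges well across all five architectures.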

Generalization:

Using your backpropagation program and the data from the above problem, train different neural networks with L = 2 (an input layer, one `hidden' layer, and an output layer), where the number of neurons in the hidden layer is 1, 2, 3, 4, or 5. Use the following out-of-sample data to test your networks:

http://work.caltech.edu/test.dat
Plot the training and test errors for each network as a function of the epoch number (hence the `intermediate' networks are evaluated using the test data, but the test data is not used in the backpropagation).

Repeat the experiment by reversing the roles of the training and test sets (you may need to readjust the parameter combination from the previous problem), and plot the training and test errors again. Briefly analyze the results you get.
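The requested figure can be produced along the following lines. The error histories below are synthetic placeholders purely to make the sketch runnable; in the assignment they would be the per-epoch training and test errors recorded while training (with the test error computed by a forward pass only, never used in the weight updates).

```python
# Sketch of the requested plot: training and test error versus epoch.
# The two curves here are synthetic placeholders; in the assignment they
# are recorded during training, evaluating each intermediate network on
# both data sets (the test set via forward passes only).
import numpy as np
import matplotlib
matplotlib.use("Agg")                          # render without a display
import matplotlib.pyplot as plt

epochs = np.arange(1, 501)
train_err = 1.0 / np.sqrt(epochs)              # placeholder history
test_err = 1.0 / np.sqrt(epochs) + 0.05        # placeholder history

plt.plot(epochs, train_err, label="training error")
plt.plot(epochs, test_err, label="test error")
plt.xlabel("epoch")
plt.ylabel("average error")
plt.legend()
plt.savefig("errors.png")
```

One plot per hidden-layer size (1 through 5) makes the comparison between training and test behavior easy to read.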