Quote:
Originally Posted by tcristo
I noticed that when running Linear Regression on a training data set and then running the PLA on the same data, initialized with the Linear Regression weights, the Learning Rate (Alpha) of the PLA seems to significantly affect the rate of convergence. I am assuming that the optimal size of alpha is directly related to the size of the residual errors from the Linear Regression.
Is there a way to model this mathematically such that the Alpha parameter can automatically be calculated for each training set?

For PLA, I cannot recall any such result. For some more general models like Neural Networks, there are efforts (in terms of optimization) toward adaptively changing the learning rate during training. BTW, I think the homework problem asks you to use no learning rate at all (or, equivalently, a naive fixed choice of alpha). Hope this helps.
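To make the setup in the question concrete, here is a minimal sketch (function names and data are my own, not from the thread) of initializing PLA with the least-squares weights and scaling the perceptron update by an alpha parameter. One thing worth noting: if PLA starts from the zero vector, alpha merely rescales the weights and does not change which points get updated; it only matters when starting from nonzero weights such as the regression fit, which is exactly the situation described above.

```python
import numpy as np

def linear_regression_weights(X, y):
    # One-shot least-squares fit via the pseudo-inverse: w = X^+ y
    return np.linalg.pinv(X) @ y

def pla(X, y, w, alpha=1.0, max_iters=10000):
    # Perceptron Learning Algorithm: repeatedly pick a misclassified
    # point and apply w <- w + alpha * y_n * x_n.
    for it in range(max_iters):
        preds = np.sign(X @ w)
        wrong = np.nonzero(preds != y)[0]
        if wrong.size == 0:
            return w, it                  # converged: all points correct
        n = wrong[0]                      # deterministic pick for reproducibility
        w = w + alpha * y[n] * X[n]
    return w, max_iters

# Toy linearly separable data with a bias column; points too close to
# the true boundary are dropped so PLA converges quickly.
rng = np.random.default_rng(0)
X = np.hstack([np.ones((200, 1)), rng.uniform(-1, 1, (200, 2))])
w_true = np.array([0.1, 1.0, -1.0])
margin = X @ w_true
keep = np.abs(margin) > 0.1
X, y = X[keep], np.sign(margin[keep])

w0 = linear_regression_weights(X, y)      # warm start from regression
w_final, iters = pla(X, y, w0, alpha=1.0)
```

Trying different alpha values on the same (X, y, w0) is then a one-line experiment, which is one pragmatic way to probe the relationship the question asks about, short of a closed-form answer.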