Quote:
Originally Posted by ripande
1. I calculated the value of "a" for which the which minimizes the least square for two points ( x, sin(pi*x) ), x being between 1 and 1.
2. Repeated the above for 100 times and hence got 100 values of "a"

These steps are correct (with $[-1, 1]$ instead of $[1, 1]$ in step 1) in calculating the final hypothesis $g^{(\mathcal{D})}$ for 100 different data sets $\mathcal{D}$.
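As a rough sketch of steps 1 and 2, assuming the hypothesis is $h(x) = ax$ and the target is $\sin(\pi x)$ on $[-1, 1]$: minimizing $\sum_i (a x_i - y_i)^2$ gives $a = \sum_i x_i y_i / \sum_i x_i^2$. The function name `fit_slope`, the seed, and the variable names below are only illustrative choices, not anything from the original posts.

```python
import numpy as np

def fit_slope(x, y):
    """Least-squares slope a for h(x) = a*x (a line through the origin)."""
    return float(np.dot(x, y) / np.dot(x, x))

rng = np.random.default_rng(0)   # arbitrary seed, only for reproducibility
n_runs = 100                     # number of data sets, as in step 2

slopes = np.empty(n_runs)
for i in range(n_runs):
    x = rng.uniform(-1.0, 1.0, size=2)   # two training points per data set
    y = np.sin(np.pi * x)                # noiseless target values
    slopes[i] = fit_slope(x, y)          # one value of "a" per data set
```

Each entry of `slopes` plays the role of one final hypothesis $g^{(\mathcal{D})}(x) = ax$.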
Quote:
3. Then I chose a fresh point x3 between [1, 1] and calculated the value of y3 = a*x3 for all 100 points

This step evaluates $g^{(\mathcal{D})}(x_3)$ for each $\mathcal{D}$ in the 100 runs. If $x_3$ is fixed for all 100 runs, this step can be used to evaluate the bias and variance at the point $x_3$ (namely $\text{bias}(x_3)$ and $\text{var}(x_3)$).
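To make the fixed-point case concrete, here is a minimal sketch that estimates the bias and variance at a single point, using the usual definitions $\text{bias}(x_3) = (\bar{g}(x_3) - f(x_3))^2$ and $\text{var}(x_3) = \mathbb{E}_{\mathcal{D}}[(g^{(\mathcal{D})}(x_3) - \bar{g}(x_3))^2]$ with the expectation replaced by an average over 100 runs. The value `x3 = 0.5` and the seed are arbitrary assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)   # arbitrary seed
x3 = 0.5                         # hypothetical test point, held fixed for all runs
n_runs = 100

preds = np.empty(n_runs)
for i in range(n_runs):
    x = rng.uniform(-1.0, 1.0, size=2)   # a fresh two-point data set
    y = np.sin(np.pi * x)
    a = np.dot(x, y) / np.dot(x, x)      # least-squares slope for h(x) = a*x
    preds[i] = a * x3                    # prediction of this run's hypothesis at x3

g_bar_x3 = preds.mean()                          # estimate of the average hypothesis at x3
bias_x3 = (g_bar_x3 - np.sin(np.pi * x3)) ** 2   # squared gap to the target at x3
var_x3 = np.mean((preds - g_bar_x3) ** 2)        # spread of the predictions at x3
```

With only 100 runs these are noisy estimates; averaging them over many test points would approximate the overall bias and variance.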
Quote:
4. Calculated average value of y3 for 100 points, say y_avg.

If the 100 points are the same $x_3$ with different $\mathcal{D}$, then the average approximates $\bar{g}(x_3)$. If the points are different, I am not sure about the utility of this quantity for the calculation of bias and variance.
Quote:
5. Calculated "a" for avg hypothesis as : y_avg/x3

You already have the different values of $a$ for different data sets $\mathcal{D}$ (these are the values of $a$ that you used to calculate $y_3$ from $x_3$). Because the formula for the hypothesis is linear in $a$, you can directly calculate $a$ of the average hypothesis by averaging all the $a$'s. What you are suggesting is equivalent in this case.
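The equivalence is easy to check numerically. The sketch below assumes the same setup as the quoted steps ($h(x) = ax$, two points per data set, a fixed nonzero $x_3$); the variable names, the seed, and `x3 = 0.5` are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)   # arbitrary seed
x3 = 0.5                         # hypothetical fixed test point (must be nonzero)

# 100 data sets of two points each, handled in one vectorized step.
X = rng.uniform(-1.0, 1.0, size=(100, 2))
Y = np.sin(np.pi * X)
a = (X * Y).sum(axis=1) / (X * X).sum(axis=1)   # one slope per data set

y3 = a * x3                      # step 3: predictions at x3
a_from_avg = y3.mean() / x3      # step 5: y_avg / x3
a_bar = a.mean()                 # direct average of the slopes

# Same number up to floating-point rounding, because a*x3 is linear in a.
print(np.isclose(a_from_avg, a_bar))
```

Averaging the $a$'s directly has the advantage that it does not depend on any particular choice of $x_3$.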