Quote:
Originally Posted by ripande
1. I calculated the value of "a" that minimizes the least-squares error for two points ( x, sin(pi*x) ), x being between -1 and 1.
2. Repeated the above for 100 times and hence got 100 values of "a"
|
These steps are correct (with

instead of

in step 1) in calculating the final hypothesis $g^{(\mathcal{D})}$ for 100 different data sets $\mathcal{D}$.
Quote:
3. Then I chose a fresh point x3 between [-1, 1] and calculated the value of y3 = a*x3 for all 100 points
|
This step evaluates $g^{(\mathcal{D})}(x_3)$ for each $\mathcal{D}$ in the 100 runs. If $x_3$ is fixed for all 100 runs, this step can be used to evaluate the bias and variance at the point $x_3$ (namely $\mathrm{bias}(x_3)$ and $\mathrm{var}(x_3)$).
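For reference, here is a sketch of the pointwise quantities, assuming the usual bias-variance definitions with target $f(x) = \sin(\pi x)$ and average hypothesis $\bar{g}$ (the notation is an assumption, not taken from the quoted post):

$$
\bar{g}(x) = \mathbb{E}_{\mathcal{D}}\!\left[ g^{(\mathcal{D})}(x) \right], \qquad
\mathrm{bias}(x) = \left( \bar{g}(x) - f(x) \right)^2, \qquad
\mathrm{var}(x) = \mathbb{E}_{\mathcal{D}}\!\left[ \left( g^{(\mathcal{D})}(x) - \bar{g}(x) \right)^2 \right]
$$

With 100 runs, each expectation over $\mathcal{D}$ is approximated by the sample mean over the 100 data sets.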
Quote:
4. Calculated average value of y3 for 100 points, say y_avg.
|
If the 100 points are the same $x_3$ with different $\mathcal{D}$, then the average approximates $\bar{g}(x_3)$. If the points are different, I am not sure about the utility of this quantity for the calculation of bias and variance.
Quote:
5. Calculated "a" for avg hypothesis as : y_avg/x3
|
You already have the different values of $a$ for different data sets $\mathcal{D}$ (these are the values of $a$ that you used to calculate $y_3$ from $x_3$). Because the formula for the hypothesis is linear in $a$, you can directly calculate $a$ of the average hypothesis by averaging all the $a$'s. What you are suggesting is equivalent in this case.
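
The equivalence can be checked numerically. The following is a minimal sketch, assuming NumPy, the hypothesis h(x) = a*x fitted to two points by least squares, and a hypothetical fixed test point x3 = 0.5. The seed and variable names are illustrative choices, not part of the original procedure.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, an arbitrary choice for repeatability
n_runs = 100

def fit_a(x, y):
    # Least-squares slope for a line through the origin:
    # minimizing sum((a*x_i - y_i)^2) gives a = sum(x_i*y_i) / sum(x_i^2).
    return np.dot(x, y) / np.dot(x, x)

# Steps 1-2: one data set of two points per run, one slope per data set.
xs = rng.uniform(-1.0, 1.0, size=(n_runs, 2))
ys = np.sin(np.pi * xs)
a = np.array([fit_a(x, y) for x, y in zip(xs, ys)])

# Step 3: evaluate every fitted hypothesis at the SAME fixed point x3.
x3 = 0.5  # hypothetical test point; any nonzero value in [-1, 1] works
y3 = a * x3

# Step 4: average prediction at x3, an estimate of g_bar(x3).
y_avg = y3.mean()

# Step 5, two ways: divide the average prediction by x3, or average the slopes.
a_from_y = y_avg / x3
a_bar = a.mean()

# Pointwise bias and variance estimates at x3.
bias_x3 = (a_bar * x3 - np.sin(np.pi * x3)) ** 2
var_x3 = np.mean((y3 - y_avg) ** 2)

print("a_bar from averaging slopes:", a_bar)
print("a_bar from y_avg / x3:      ", a_from_y)
print("bias(x3), var(x3):          ", bias_x3, var_x3)
```

The two estimates of the average slope agree up to floating-point rounding, because y3 is linear in a. With only 100 runs the bias and variance values are noisy estimates, and they describe the single point x3 only, not the expectation over all x.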