(a) For this problem, if you are given a linear hypothesis, it should be possible to compute Eout analytically. However, if you computed it on a test set T instead, that is fine too.
(b) Yes. It is also true that Etest = bias + var. Why? Because we showed this holds for every x.
(c) The var is computed using the same data sets from which you learned and computed the average function. The average variance is computed over the distribution of the inputs; in case you use a test set, the average is taken over the test set. Just like bias(x), var(x) is also a function of x, one that captures how variable your prediction is at a point x. You take all your predictions on x, learned from the different data sets, and compute the variance of those (just like you take the average of those to get the average function).
Remember that the only purpose of the test set or the input distribution P(x) is to compute an average over x of all these quantities. If you had a single test point, as discussed in class, everything works there too.
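To see how these pieces fit together, here is a small simulation sketch. The setup is hypothetical (target f(x) = sin(pi*x), two-point data sets, and a hypothesis set h(x) = a*x fit by least squares; all names and choices are illustrative, not the specific problem above), but the averaging is exactly as described: var(x) uses the predictions from the K learned hypotheses, and bias, var, and Eout are then averaged over the test set.

```python
import numpy as np

rng = np.random.default_rng(0)

def f(x):
    return np.sin(np.pi * x)           # hypothetical target function

K, N_train = 1000, 2                   # K data sets, each of two points
x_test = rng.uniform(-1, 1, 500)       # test set standing in for P(x)

# Learn g_k from each data set D_k: least-squares fit of h(x) = a*x
preds = np.empty((K, x_test.size))
for k in range(K):
    x_d = rng.uniform(-1, 1, N_train)
    y_d = f(x_d)
    a = (x_d @ y_d) / (x_d @ x_d)      # closed-form slope for h(x) = a*x
    preds[k] = a * x_test

g_bar = preds.mean(axis=0)             # average function, evaluated at each test x

bias_x = (g_bar - f(x_test)) ** 2      # bias(x): squared gap from average function to target
var_x = preds.var(axis=0)              # var(x): spread of the K predictions at x

bias = bias_x.mean()                   # now average over the test set (i.e., over P(x))
var = var_x.mean()
e_out = ((preds - f(x_test)) ** 2).mean()

print(bias, var, e_out)                # bias + var should match e_out
```

The identity Eout = bias + var holds pointwise at each x (mean squared error decomposes around the mean prediction), so it survives the averaging over the test set; the printout lets you check the two sides agree up to floating-point error.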
Quote:
Originally Posted by mileschen
Though I have solved this problem, I am still a little bit confused.
(a) Eout: is it the test error Etest, computed on the test data set T of size N, of a particular hypothesis g that is learned from a particular training data set D (of two points)?
(b) Should the bias be computed based on the same test data set T? That is, bias = Ex[bias(x)] = 1/N * sum((g_x(xi) - f(xi))^2) for each xi in T, where g_x() is the average function.
(c) Should the var be computed based on the K data sets used to learn the average function g_x(), and on the test data set T? That is, var = Ex[var(x)] = 1/N * sum[1/K * sum((gk(xi) - g_x(xi))^2)].
Should Eout, bias, and var all be computed based on the same test data set?
