Quote:
Originally Posted by yaser
First, just to make sure, the inside 'E' is not an expectation, but the value of the insample error that corresponds to the weight vector . The (outside) expected value is with respect to the training data set, and it means the average value (of the insample error) as you train with different data sets.

The training data has d dimensions in the x's. If one ignored some of those dimensions and did linear regression with a reduced number d' < d of dimensions, one would presumably get a larger insample error than when using all d dimensions.
Why, then, does the expected insample error, averaged over all data sets, increase with the number of dimensions?
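
To make the comparison concrete, here is a minimal sketch of what I mean by dropping dimensions. Everything in it is my own assumption for illustration, not something from the lecture: a noisy linear target, Gaussian inputs, and the names in_sample_errors, N, d, and sigma. It fits least squares on one fixed data set using only the first d' coordinates (plus a constant term), for d' = 0, ..., d, and records the insample error each time.

```python
import numpy as np

def in_sample_errors(N=50, d=8, sigma=0.5, seed=0):
    """Insample squared error of least squares using the first d' inputs, d' = 0..d.

    Assumed toy setup: y = w_true . x + Gaussian noise, with x_0 = 1 as the constant term.
    """
    rng = np.random.default_rng(seed)
    # N points: a constant coordinate followed by d Gaussian coordinates.
    X = np.hstack([np.ones((N, 1)), rng.standard_normal((N, d))])
    w_true = rng.standard_normal(d + 1)
    y = X @ w_true + sigma * rng.standard_normal(N)

    errors = []
    for k in range(d + 1):
        Xk = X[:, :k + 1]                           # keep the constant and the first k inputs
        w, *_ = np.linalg.lstsq(Xk, y, rcond=None)  # least-squares weights on the reduced inputs
        errors.append(float(np.mean((Xk @ w - y) ** 2)))
    return errors

errors = in_sample_errors()
for k, e in enumerate(errors):
    print(f"d' = {k}: insample error = {e:.4f}")
```

On a fixed data set the reduced fit can never beat the full fit insample, since the weights of the d' model are also available to the larger model with the extra weights set to zero. Averaging over many seeds would give an estimate of the expected insample error for each d'.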