Originally Posted by magdon
This is because there is some "best" h^* and for any N, the final output g will be "scattered" around this h^*, sometimes predicting above h^* on a particular x and sometimes below, on average giving the prediction of h^*. This results in \bar g being approximately h^* for any N.
Just one more question: in the quote above, when you said there is some "best" h^*, did you mean the best hypothesis in the current hypothesis set \cal H under the current error measure, independent of N? For example, if \cal H consists of linear models and the error measure is mean squared error, would h^* be the LMMSE (linear minimum mean squared error) estimate? Thanks a lot!
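To make the question concrete, here is a minimal numerical sketch of how I understand the claim. Everything specific in it is my own assumption and not from the quoted post: the target f(x) = sin(pi x), inputs drawn uniformly from [-1, 1], noiseless data, \cal H being lines fitted by least squares, and the helper names `best_line` and `average_line`. It approximates h^* by a least-squares fit on a dense grid, and \bar g by averaging the fitted coefficients over many datasets of size N. Averaging coefficients is the same as averaging the hypotheses pointwise here, because a line is linear in its parameters.

```python
import numpy as np

def f(x):
    # Assumed target function (illustrative choice, not from the original post).
    return np.sin(np.pi * x)

def best_line():
    # Approximate h^*: the least-squares line against f on a dense grid,
    # standing in for the minimum-MSE line under uniform inputs on [-1, 1].
    grid = np.linspace(-1.0, 1.0, 20001)
    return np.polyfit(grid, f(grid), 1)  # [slope, intercept]

def average_line(n_points, n_datasets=2000, seed=0):
    # Approximate \bar g: fit a line to each random dataset of size n_points,
    # then average the fitted coefficients across datasets.
    rng = np.random.default_rng(seed)
    coeffs = np.empty((n_datasets, 2))
    for i in range(n_datasets):
        x = rng.uniform(-1.0, 1.0, size=n_points)
        coeffs[i] = np.polyfit(x, f(x), 1)
    return coeffs.mean(axis=0)

h_star = best_line()
results = {n: average_line(n) for n in (5, 20, 100)}

for n, g_bar in results.items():
    print(f"N={n:4d}  g_bar: slope={g_bar[0]:.3f}, intercept={g_bar[1]:.3f}   "
          f"h_star: slope={h_star[0]:.3f}, intercept={h_star[1]:.3f}")
```

If my reading is right, the averaged line should sit close to the grid-fitted h^* for each N, with the match possibly looser for small N, which would fit the "approximately h^*" wording in the quote.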
