Quote:
Originally Posted by yaser
To answer the first question, if you choose to omit some of the input variables, you will indeed get a larger (at least not smaller) in-sample error. Not sure I understand the second question, but having different training sets does not change the number of input variables. It is a hypothetical situation where you assume the availability of different data sets on the same variables.
My bad on the second question -- I had a typo in my handwritten expression for the expectation. The corrected expression does show the expected in-sample error decreasing as d increases.
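
For anyone who wants to check this numerically, here is a minimal sketch, assuming linear regression with squared error (the thread does not say which model is meant). It fits least squares on the first d input variables of synthetic data and averages the in-sample error over many independently drawn training sets. The function names, sample size, noise level, and number of trials are all made up for illustration.

```python
import numpy as np

def in_sample_errors(n_samples=50, d_max=8, noise_std=0.5, seed=0):
    """Return E_in for least-squares fits using the first d inputs, d = 0..d_max.

    Assumption: a noisy linear target on synthetic Gaussian inputs.
    """
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n_samples, d_max))
    # Prepend the constant coordinate x_0 = 1.
    X_full = np.hstack([np.ones((n_samples, 1)), X])
    w_true = rng.normal(size=d_max + 1)
    y = X_full @ w_true + noise_std * rng.normal(size=n_samples)

    errors = []
    for d in range(d_max + 1):
        # Keep only the bias and the first d input variables (omit the rest).
        X_d = X_full[:, : d + 1]
        w, *_ = np.linalg.lstsq(X_d, y, rcond=None)
        errors.append(float(np.mean((X_d @ w - y) ** 2)))
    return errors

def expected_in_sample_errors(n_trials=200, **kwargs):
    """Average E_in over n_trials independently generated training sets."""
    runs = [in_sample_errors(seed=s, **kwargs) for s in range(n_trials)]
    return np.mean(runs, axis=0)

if __name__ == "__main__":
    for d, e in enumerate(expected_in_sample_errors()):
        print(f"d = {d}: average E_in = {e:.4f}")
```

Because each smaller model is a special case of the larger one (set the extra weights to zero), the error in every single run cannot go up when a variable is added, so the average cannot either.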