Quote:
Originally Posted by binchen.bin@gmail.com
On question 4, I tried fitting each pair of sample points with (i) h = ax, and (ii) h = ax+b. I found that hypothesis (i) gave me an average of "a" quite different from any of the answers, but hypothesis (ii) gave me an average of "a" very close to one of the answer options, and the average of b is very close to 0. If the average of b is 0 in (ii), why are the averages of a different in (i) and (ii)? Can anyone help me explain this?
Let me address why they can be different. The model $h = ax + b$ can fit both points in the training set, $(x_1, y_1)$ and $(x_2, y_2)$, perfectly, while the model $h = ax$ finds a compromise that minimizes the mean-squared error on those points. Because of this, the two models produce different fits, and those fits can have different averages. Symmetry dictates that $b$ will average to zero for the first model, but that does not mean that the average $a$ should be the same as for the second model.
Having said that, you should get an answer that matches one of the 5 choices when you fit the $h = ax$ model properly.
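To see the effect numerically, here is a minimal simulation sketch. It makes several assumptions that are not from this thread: the target is a stand-in odd function, f(x) = x^3, with inputs drawn uniformly from [-1, 1] (this is not the homework's target, so the numbers will not match the answer choices), and the function name `average_fits` is made up for illustration. For two points, the line $h = ax + b$ passes through both, while the least-squares fit of $h = ax$ has slope $a = (x_1 y_1 + x_2 y_2) / (x_1^2 + x_2^2)$.

```python
import numpy as np

def average_fits(f, n_trials=200_000, seed=0):
    """Average the fitted parameters over many random two-point data sets.

    f is a hypothetical target function (an assumption for illustration).
    Returns (mean a for h = ax + b, mean b for h = ax + b, mean a for h = ax).
    """
    rng = np.random.default_rng(seed)
    # Each row is one training set of two inputs, uniform on [-1, 1].
    x = rng.uniform(-1.0, 1.0, size=(n_trials, 2))
    y = f(x)

    # h = ax + b: the unique line through both points (zero training error).
    a_line = (y[:, 1] - y[:, 0]) / (x[:, 1] - x[:, 0])
    b_line = y[:, 0] - a_line * x[:, 0]

    # h = ax: least-squares slope for a line forced through the origin.
    a_origin = (x * y).sum(axis=1) / (x ** 2).sum(axis=1)

    return a_line.mean(), b_line.mean(), a_origin.mean()

# Stand-in odd target; replace with the target from the problem statement.
mean_a_line, mean_b_line, mean_a_origin = average_fits(lambda x: x ** 3)
print("h = ax + b: average a =", mean_a_line, " average b =", mean_b_line)
print("h = ax:     average a =", mean_a_origin)
```

With this stand-in target, the average $b$ comes out near zero, as symmetry suggests, yet the two average slopes differ noticeably, because forcing each individual fit through the origin changes the slope on every data set, not just on average.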