Thanks very much for the detailed reply!

This is a helpful distinction. The idea of being "led astray" has also been nice for intuition.

This makes perfect sense as well, and is how I had been thinking of the major impact of deterministic noise in causing overfitting. What spurred me to think about this is in fact the exercise on page 125, and the hint that, as $\mathcal{H}$ becomes more complex, there are two factors affecting overfitting. The bias/variance trade-off -- and thus the indirect impact of deterministic noise -- is clear, but that deterministic noise (bias) would *directly* cause overfitting is a little confusing.

What I am curious about is how we can be "led astray" if $\mathcal{H}$ and $f$ must stay fixed, and in my mind, I keep coming back to the precise definition of $\bar{g}$: if $N$ (size of training data set) is very small, variance will suffer, but also $\bar{g}$ will differ from the best hypothesis in $\mathcal{H}$, leading to higher deterministic noise; if $N$ is big enough, $\bar{g}$ will match the best hypothesis closely, and both variance and deterministic noise will shrink. So, even in cases of very large deterministic noise, if $N$ is very big and gives us a near-perfect shape of the target function, we are not "led astray" at all (and indeed $E_{\text{in}}$ would track $E_{\text{out}}$ very well). It seems like that wiggle room in the deterministic noise tracks a bigger change in the variance. Does this make sense?
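
To make the question concrete, here is a minimal sketch of the kind of experiment I have in mind. Everything in it is my own assumption rather than anything taken from the exercise: the target is sin(pi x) on [-1, 1] with no stochastic noise, the hypothesis set is all lines a x + b, inputs are uniform, and the function names (`fit_line`, `bias_variance`) are just illustrative. It estimates the average hypothesis (the mean of the fitted lines over many data sets of size N), then the bias and variance, and compares the bias against the error of the best line in the hypothesis set, which for this toy setup works out to 1/2 - 3/pi^2.

```python
import math
import random

# Assumed toy setup: noiseless target, uniform inputs on [-1, 1].
def target(x):
    return math.sin(math.pi * x)

# Mean squared error of the best line (slope 3/pi, intercept 0) for this
# target: the deterministic-noise floor under the assumptions above.
BEST_LINE_ERROR = 0.5 - 3.0 / math.pi ** 2


def fit_line(xs, ys):
    """Ordinary least-squares line through the points; returns (slope, intercept).

    Assumes the x values are not all identical (true with probability 1 here).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def bias_variance(n_points, n_datasets=2000, seed=0):
    """Monte Carlo estimate of (average slope, bias, variance) for data sets of size n_points."""
    rng = random.Random(seed)
    fits = []
    for _ in range(n_datasets):
        xs = [rng.uniform(-1.0, 1.0) for _ in range(n_points)]
        fits.append(fit_line(xs, [target(x) for x in xs]))

    # Average hypothesis: average the fitted coefficients over all data sets.
    a_bar = sum(a for a, _ in fits) / n_datasets
    b_bar = sum(b for _, b in fits) / n_datasets

    # Midpoint grid on [-1, 1] to approximate expectations over x.
    grid = [-1.0 + (i + 0.5) * 0.01 for i in range(200)]
    bias = sum((a_bar * x + b_bar - target(x)) ** 2 for x in grid) / len(grid)
    variance = sum(
        ((a - a_bar) * x + (b - b_bar)) ** 2 for a, b in fits for x in grid
    ) / (n_datasets * len(grid))
    return a_bar, bias, variance


if __name__ == "__main__":
    print("error of best line:", round(BEST_LINE_ERROR, 4))
    for n in (2, 5, 100):
        a_bar, bias, variance = bias_variance(n)
        print(n, round(a_bar, 3), round(bias, 4), round(variance, 4))
```

If I have set this up correctly, the bias can only come down to the best-line error as N grows and cannot go below it, while the variance has much more room to shrink, which is the pattern I am asking about.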