Quote:
Originally Posted by elkka
I am also confused about this term "in general". Does it mean in absolutely any situation? Or does it mean in most situations? Or in all reasonable situations, excluding cases where we try to fit 10th-degree polynomials to 10 points, as in this lecture's example?
mikesakiandcp, I think N has to do with deterministic noise, at least as described in the lecture. Yes, it is the ability of the hypothesis set to fit the target function, measured as the expected difference between the "best" hypothesis and the target. But the way we defined the expected hypothesis, as an expectation over an infinite number of data sets of a specific size N, it depends on N very much. Slide 14 of Lecture 11 illustrates the connection.

You are right, N is related to the deterministic noise. What I meant to say is that we have no control over N: it is simply the number of inputs in the training set we are given. Given a fixed training set (and thus a fixed N), we are interested in how well the hypothesis set can approximate the target function.
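The "expected hypothesis over many data sets of size N" idea from the lecture can be sketched numerically. This is a minimal illustration, not from the thread: the target sin(x), the sampling interval, the polynomial degrees, and the trial count are all my assumptions. It averages many fits on data sets of a fixed size N to approximate the expected hypothesis, then measures its squared gap from the target (the deterministic-noise/bias term), showing that the gap depends on the hypothesis set's richness.

```python
import numpy as np

rng = np.random.default_rng(0)
f = np.sin  # hypothetical noiseless target function (an assumption)

def avg_bias(degree, N, trials=200):
    """Approximate the bias term: average polynomial fits of the given
    degree over many data sets of size N (the 'expected hypothesis'),
    then return its mean squared gap from the target on a test grid."""
    xs_test = np.linspace(-np.pi, np.pi, 100)
    fits = []
    for _ in range(trials):
        x = rng.uniform(-np.pi, np.pi, N)
        coeffs = np.polyfit(x, f(x), degree)   # fit noiseless target values
        fits.append(np.polyval(coeffs, xs_test))
    g_bar = np.mean(fits, axis=0)              # expected hypothesis g-bar
    return np.mean((g_bar - f(xs_test)) ** 2)

# A richer hypothesis set (higher degree) leaves a smaller gap between
# the expected hypothesis and the target, for the same fixed N.
print(avg_bias(degree=1, N=5))
print(avg_bias(degree=3, N=5))
```

Note that g-bar itself is built from fits to data sets of size N, which is why the deterministic noise, as defined via the expected hypothesis, is tied to N even though we cannot choose N.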