Re: question target distribution
My understanding is that the noisy target is y, and we want to decompose it into a deterministic component and noise. The equations from the lectures simply define the noise as the difference between the noisy target and the deterministic target (i.e. y - f(x)). We can't really say much more about what the noise is, because it depends on the situation in which you're applying the learning algorithm.
As for the examples in the lectures and the assignment where we have to add noise to a deterministic target function, we're only doing so because the sample data we generate come from the deterministic target function and would therefore contain no noise. But I don't think any real-world situation would present a training set with no noise, so we need to manually add noise to the "ideal" data we generated ourselves in order to mimic real-world data.
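
To make that concrete, here is a rough sketch of what I mean by adding noise to "ideal" data. The linear target, the Gaussian noise, and the names target and make_noisy_sample are all my own assumptions for illustration; the assignment may use a different target and a different kind of noise.

```python
import random


def target(x):
    """Hypothetical deterministic target f(x); any fixed function would do."""
    return 2.0 * x + 1.0


def make_noisy_sample(n, sigma, seed=0):
    """Draw n inputs, then set y = f(x) + noise.

    The noise is assumed to be zero-mean Gaussian with standard
    deviation sigma; that choice is an illustration, not a requirement.
    """
    rng = random.Random(seed)
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    ys = [target(x) + rng.gauss(0.0, sigma) for x in xs]
    return xs, ys


xs, ys = make_noisy_sample(n=1000, sigma=0.5)

# The noise component is recovered as y - f(x), as in the decomposition above.
noise = [y - target(x) for x, y in zip(xs, ys)]
mean_noise = sum(noise) / len(noise)
print(f"mean noise over {len(noise)} points: {mean_noise:.3f}")
```

Here we can recover the noise exactly only because we wrote f ourselves. With real data f is unknown, so y - f(x) is something we reason about rather than compute.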
