Questions on the target distribution and other topics from Lecture 4
I don't understand the target function mentioned on slide 16: why is the probability of y given x used to represent the target distribution, instead of the probability of y alone? The same question applies to the deterministic target. My second question is: how does the smoothness of the squared-error curve affect the error measure? Finally, is there any empirical principle for choosing among alternative error measures, namely, in which cases are plausible measures preferable to friendly ones? All comments are appreciated.
