Hoeffding inequality and noisy targets
I found the move from learning a deterministic target function to learning a probability distribution to be a big jump. The book's treatment of this concept was too fast for me and not detailed enough. The "intuitive" justification of Hoeffding in this case was also not clear to me at all. Hoeffding seems to be a tricky concept, in the sense that its application is prone to error if one is not careful. Is there a more step-by-step explanation of this section somewhere?
One starter question in this regard: in the basic Hoeffding derivation, we used a binary classifier, i.e. the target function returns +/-1 (or possibly one of several labels, for a multiclass classifier). In the noisy target case, should the understanding be that it returns a number 'p' signifying the probability of +1 at x?
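To make the question concrete, here is a minimal Python sketch of one possible reading of the noisy-target setup. It is not taken from the book. The functions `p_plus` and `h`, the uniform input distribution, and the numbers 0.9 and 0.2 are all made up for illustration. The idea being tested is that for a fixed hypothesis, the event "h(x) disagrees with y" is still a yes/no outcome when (x, y) is drawn with y sampled from P(y | x), so the same Hoeffding bound 2·exp(-2·eps²·N) should apply to |E_in - E_out|.

```python
import math
import random

def p_plus(x):
    # Hypothetical noisy target: P(y = +1 | x). Values chosen only for illustration.
    return 0.9 if x > 0.5 else 0.2

def h(x):
    # A fixed hypothesis, chosen before any data is seen.
    return 1 if x > 0.5 else -1

def draw_example(rng):
    # Assumption: x is uniform on [0, 1); y is then sampled from P(y | x).
    x = rng.random()
    y = 1 if rng.random() < p_plus(x) else -1
    return x, y

def in_sample_error(n, rng):
    # Fraction of n freshly drawn noisy examples on which h disagrees with y.
    mistakes = 0
    for _ in range(n):
        x, y = draw_example(rng)
        if h(x) != y:
            mistakes += 1
    return mistakes / n

def hoeffding_bound(eps, n):
    # Right-hand side of the Hoeffding inequality for a single fixed hypothesis.
    return 2.0 * math.exp(-2.0 * eps * eps * n)

def violation_rate(eps, n, trials, rng, e_out):
    # Fraction of independent samples of size n with |E_in - E_out| > eps.
    bad = sum(1 for _ in range(trials)
              if abs(in_sample_error(n, rng) - e_out) > eps)
    return bad / trials

# Out-of-sample error of h under this toy distribution:
# half the inputs have x > 0.5 (h says +1, wrong with probability 1 - 0.9),
# the other half have h saying -1 (wrong with probability 0.2).
E_OUT = 0.5 * (1 - 0.9) + 0.5 * 0.2

if __name__ == "__main__":
    rng = random.Random(0)
    eps, n, trials = 0.1, 200, 1000
    print("E_out:", E_OUT)
    print("Hoeffding bound:", hoeffding_bound(eps, n))
    print("observed violation rate:", violation_rate(eps, n, trials, rng, E_OUT))
```

If this reading is right, the observed violation rate should stay below the bound for any choice of `p_plus`, since the bound only depends on eps and N, not on how noisy the target is.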
