Quote:
Originally Posted by samirbajaj
I don't understand the Pr( f(x) != g(x) ) expression. What exactly does this mean? Once the algorithm has converged, presumably f(x) matches g(x) on all data, so the difference is zero.

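On all the training data, yes. However, the probability is taken with respect to x
drawn from the entire input space, not restricted to x
being in the finite data set used for training. So g can agree with f on every training point and still have Pr( f(x) != g(x) ) > 0.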
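
To make the distinction concrete, here is a minimal sketch in Python. Everything in it is an illustrative assumption and not part of the original question: the input space is the interval [-1, 1] with a uniform distribution, the target f is a threshold at 0, the four training points are made up, and g is a threshold placed midway between the closest negative and positive training points.

```python
import random

# Assumed target function f: +1 for x > 0, -1 otherwise.
def f(x):
    return 1 if x > 0 else -1

# Assumed finite training set drawn from the input space [-1, 1].
train_x = [-0.9, -0.4, 0.3, 0.8]

# Hypothesis g: a threshold halfway between the largest negative-labelled
# point and the smallest positive-labelled point. It fits the training data.
neg_max = max(x for x in train_x if f(x) == -1)
pos_min = min(x for x in train_x if f(x) == 1)
threshold = (neg_max + pos_min) / 2  # about -0.05 for these points

def g(x):
    return 1 if x > threshold else -1

# Fraction of training points where f and g disagree.
in_sample_error = sum(f(x) != g(x) for x in train_x) / len(train_x)

# Monte Carlo estimate of Pr( f(x) != g(x) ) with x uniform on [-1, 1].
# f and g disagree only for x between the threshold and 0, so the exact
# value in this setup is 0.05 / 2 = 0.025.
rng = random.Random(0)
n = 100_000
samples = [rng.uniform(-1, 1) for _ in range(n)]
out_of_sample_error = sum(f(x) != g(x) for x in samples) / n

print("disagreement on training data:", in_sample_error)
print("estimated Pr( f(x) != g(x) ):", out_of_sample_error)
```

The disagreement on the training data is zero, while the estimate over the whole input space is small but nonzero, because g and f differ on a region that none of the training points happened to land in.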