Quote:
Originally Posted by GraceLAX
Thanks for the clarification. That helped quite a bit.
If an algorithm always converges would the Pr(f(x) ne g(x)) = 0?

I think the probability depends on the amount of training data. As our hypothesis function h converges towards the target function f, the iteration stops as soon as h agrees with all of the training values (y) on the training data (x). That can happen before h actually reaches f, simply because the sample training data is already satisfied by the final h, which is g. So convergence only means g matches f on the training data, not that Pr(f(x) ne g(x)) = 0.
For example: if the target function was a 45-degree straight line (x2 = x1), and there was only one training point, say ((x1(1), x2(1)), y(1)) = ((1, 2), +1), and if our first hypothesis function was the horizontal axis itself, the Perceptron Algorithm would stop in its first iteration, because the horizontal axis already classifies that point correctly. It would then take the horizontal axis as g, even though the target function is actually the 45-degree line through the origin. But if we increase the number of data points, the hypothesis function is forced closer to the target function, with more iterations.
But there will always be some discrepancy, I guess, unless you are very lucky.
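
Here is a rough Python sketch of the example above, in case it helps. It is only an illustration under my own assumptions: inputs are drawn uniformly from the square [-1, 1] x [-1, 1], points above the line x2 = x1 are labelled +1, the starting weights (0, 0, 1) stand for the horizontal axis, the update always picks the first misclassified point, and the number of updates is capped. The function names are made up for this post. Under the uniform assumption, the horizontal axis and the 45-degree line disagree on two triangles of area 0.5 each, so the one-point g should disagree with f on about a quarter of the square.

```python
import random

def sign(v):
    # Treat 0 as -1 so every point gets a definite label.
    return 1 if v > 0 else -1

def target(x1, x2):
    # Target f: the 45-degree line x2 = x1; points above it are +1.
    return sign(x2 - x1)

def predict(w, x1, x2):
    # Hypothesis h(x) = sign(w0 + w1*x1 + w2*x2).
    return sign(w[0] + w[1] * x1 + w[2] * x2)

def pla(data, w_init, max_updates=10_000):
    # Perceptron updates until no training point is misclassified
    # (or until the assumed cap on updates is reached).
    w = list(w_init)
    updates = 0
    while updates < max_updates:
        wrong = [(x1, x2, y) for (x1, x2, y) in data
                 if predict(w, x1, x2) != y]
        if not wrong:
            break
        x1, x2, y = wrong[0]  # first misclassified point
        w[0] += y
        w[1] += y * x1
        w[2] += y * x2
        updates += 1
    return w, updates

def disagreement(w, rng, n=20_000):
    # Monte Carlo estimate of Pr(f(x) != g(x)) for uniform x in the square.
    miss = 0
    for _ in range(n):
        x1, x2 = rng.uniform(-1, 1), rng.uniform(-1, 1)
        if predict(w, x1, x2) != target(x1, x2):
            miss += 1
    return miss / n

rng = random.Random(0)
w_start = [0.0, 0.0, 1.0]  # horizontal axis: h(x) = sign(x2)

# One training point ((1, 2), +1): already classified correctly, so PLA stops.
g1, updates1 = pla([(1.0, 2.0, +1)], w_start)
err1 = disagreement(g1, rng)

# 100 training points labelled by the target function.
points = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(100)]
data = [(x1, x2, target(x1, x2)) for (x1, x2) in points]
g2, updates2 = pla(data, w_start)
err2 = disagreement(g2, rng)

print("1 point:    updates =", updates1, " estimated Pr(f != g) =", err1)
print("100 points: updates =", updates2, " estimated Pr(f != g) =", err2)
```

With one point the algorithm makes no updates at all and g stays on the horizontal axis. With 100 points the estimated disagreement should come out much smaller, but I would not expect it to be exactly zero.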