because $\bar{g}(x)$ is defined as the expectation of g(x) with respect to data sets. The average over a finite number of data sets only approximates this expectation.
Yes, $\bar{g}$ is not a valid hypothesis: it may not be in your hypothesis set, and it may not even be binary. It is never used as a classifier. It only represents "what would happen on average after learning", and this abstract function plays a role in defining the bias in the bias-variance decomposition.
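To make this concrete, here is a minimal sketch of estimating the average function by averaging over many data sets. Everything in it is an assumption chosen for illustration, not something from the thread: the target sign(x) on [-1, 1], the threshold ("stump") hypothesis set, the midpoint learning rule, and the names `target`, `learn` and `g_bar_estimate`.

```python
import random

def target(x):
    # Hypothetical target f(x) = sign(x), chosen only for illustration.
    return 1 if x > 0 else -1

def learn(data):
    """Return the threshold theta of the learned stump g(x) = sign(x - theta)."""
    neg = [x for x, y in data if y == -1]
    pos = [x for x, y in data if y == +1]
    left = max(neg) if neg else -1.0    # fall back to the left edge of the domain
    right = min(pos) if pos else 1.0    # fall back to the right edge of the domain
    return (left + right) / 2.0         # midpoint between the two classes

def g_bar_estimate(x, num_datasets=1000, n_points=2, seed=0):
    """Average the learned g(x) over many independently drawn data sets."""
    rng = random.Random(seed)
    total = 0
    for _ in range(num_datasets):
        xs = [rng.uniform(-1.0, 1.0) for _ in range(n_points)]
        data = [(xi, target(xi)) for xi in xs]
        theta = learn(data)
        total += 1 if x > theta else -1  # value of this data set's g at x
    return total / num_datasets

if __name__ == "__main__":
    for x in (-0.9, -0.1, 0.1, 0.9):
        print(x, g_bar_estimate(x))
```

Every individual g here is binary, but near the boundary the average lands strictly between -1 and +1, so the averaged function is not itself a stump. Increasing `num_datasets` only brings the estimate closer to the expectation; it stays a finite-sample approximation.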
Quote:
Originally Posted by Newbrict
I think because it's computed over a finite set of points, whereas the actual value for $\bar{g}$ is an exact solution
