For classification, a maximum-likelihood error is not easy to define for a hypothesis that returns a hard ±1 output. However, you can define a maximum-likelihood error for logistic regression. You can compute the weights that maximize the likelihood in logistic regression, and indeed these weights will be quite good in-sample for the classification problem, where the in-sample error is the number of mistakes. Unfortunately, however, there is no known relationship between the number of mistakes made by the logistic regression solution and the minimum number of mistakes possible.
In general (for example, with the linear perceptron), there is no known algorithm to efficiently compute the hypothesis that makes the minimum number of mistakes in-sample.
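To make the point above concrete, here is a minimal sketch (not from the original discussion; the data and learning rate are my own assumptions): we maximize the logistic-regression likelihood by gradient descent on the cross-entropy error, then measure the 0/1 in-sample error of the resulting linear classifier sign(w·x). The likelihood is easy to optimize, but nothing guarantees the mistake count it yields is the minimum achievable.

```python
# Sketch: logistic regression fit by maximizing the likelihood (equivalently,
# minimizing cross-entropy error), then scored on the 0/1 classification error.
import numpy as np

rng = np.random.default_rng(0)

# Synthetic, roughly linearly separable data with labels y in {-1, +1}.
N, d = 200, 2
X = np.hstack([np.ones((N, 1)), rng.normal(size=(N, d))])  # bias column + features
w_true = np.array([0.5, 2.0, -1.0])                        # hypothetical target weights
y = np.sign(X @ w_true + 0.3 * rng.normal(size=N))         # noisy linear labels

def neg_log_likelihood(w):
    # Cross-entropy error: (1/N) * sum ln(1 + exp(-y_n * w.x_n))
    return np.mean(np.log1p(np.exp(-y * (X @ w))))

def gradient(w):
    # Derivative of ln(1 + exp(-y * w.x)) with respect to w, averaged over the sample.
    s = -y / (1.0 + np.exp(y * (X @ w)))
    return (X * s[:, None]).mean(axis=0)

w = np.zeros(3)
for _ in range(2000):          # fixed-step gradient descent on the cross-entropy error
    w -= 0.1 * gradient(w)

# 0/1 in-sample error of the classifier sign(w.x) induced by the ML weights.
mistakes = int(np.sum(np.sign(X @ w) != y))
print(mistakes, neg_log_likelihood(w))
```

The maximum-likelihood weights typically classify well in-sample here, but the mistake count is a by-product of optimizing a different (smooth) error measure, which is exactly why no bound ties it to the minimum number of mistakes.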
Quote:
Originally Posted by hsolo
The learning bounds we learnt in the course relate the generalization error of h, where h is the best hypothesis in terms of in-sample error, to the VC dimension.
In practice, the best hypothesis h' is very often computed/estimated using maximum likelihood.
Is there a connection between the in-sample error of h' (the maximum-likelihood hypothesis) and the minimum in-sample error possible?
