For classification, the maximum likelihood error is not easy to define for a hypothesis that returns ±1. However, you can define a maximum likelihood error for logistic regression. You can compute the weights that maximize the likelihood in logistic regression, and these weights will in fact be quite good in-sample for the classification problem, where the in-sample error is the number of mistakes. Unfortunately, there is no known relationship between the number of mistakes made by the logistic regression solution and the minimum number of mistakes possible.
In general, even for the simple linear perceptron, there is no known algorithm that efficiently computes the hypothesis making the minimum number of in-sample mistakes.
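As a rough illustration of the point above, here is a minimal sketch (not from the original post; the synthetic data and function names are assumptions) that fits logistic regression weights by maximizing the likelihood with plain gradient descent, then measures the in-sample 0/1 error of the resulting weights used as a classifier sign(w·x):

[code]
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic_mle(X, y, lr=0.1, n_iters=5000):
    """Maximize the likelihood for labels y in {-1, +1} and inputs X
    (bias column already included). Returns the weight vector w."""
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(n_iters):
        # Gradient of the average cross-entropy error
        # E(w) = (1/N) sum_n ln(1 + exp(-y_n w.x_n))
        grad = -np.mean((y * X.T) * sigmoid(-y * (X @ w)), axis=1)
        w -= lr * grad
    return w

def in_sample_mistakes(w, X, y):
    """Number of in-sample classification mistakes made by sign(w.x)."""
    return int(np.sum(np.sign(X @ w) != y))

# Toy usage on synthetic data (illustrative only)
rng = np.random.default_rng(0)
X = np.hstack([np.ones((100, 1)), rng.normal(size=(100, 2))])  # bias + 2 features
true_w = np.array([0.5, 2.0, -1.0])
y = np.sign(X @ true_w + 0.5 * rng.normal(size=100))

w_mle = fit_logistic_mle(X, y)
print("in-sample mistakes:", in_sample_mistakes(w_mle, X, y))
[/code]

The mistake count this prints is typically small on such data, but nothing in the procedure guarantees it is the minimum achievable by any linear hypothesis.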
Quote:
Originally Posted by hsolo
The learning bounds we learnt in the course relate the generalization error of h, where h is the best hypothesis in terms of in-sample error, to the VC dimension.
In practice, very often the best hypothesis h' is computed/estimated using maximum likelihood.
Is there a connection between the in-sample error of h' (the max likelihood hypothesis) and the minimum in-sample error possible?