Not a direct follow-up to this question, but I was wondering how multinomial softmax (the generalization of logistic regression to multiple classes) compares with SVMs for multiclass problems in practice.
One difference is that softmax directly gives class probabilities, whereas SVM probabilities are 'indirect'.
How does this comparison pan out in practice?
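
To make the "direct" versus "indirect" point concrete, here is a minimal sketch in plain Python. The scores are made-up numbers for a single example over three classes, not output from any real model. Softmax turns the raw scores into probabilities in one step; an SVM would stop at the raw scores (margins), and probabilities would typically come from a separate calibration step fitted afterwards, such as Platt scaling.

```python
import math

def softmax(scores):
    """Map raw class scores to probabilities that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical raw scores w_k . x + b_k for one example, three classes.
scores = [2.0, 1.0, -1.0]

# Softmax: the probabilities come straight out of the model.
probs = softmax(scores)

# SVM-style prediction: only the argmax of the raw scores is used;
# the scores themselves are margins, not probabilities.
predicted = max(range(len(scores)), key=lambda k: scores[k])
```

Both approaches pick the same class here, since softmax preserves the ordering of the scores; the difference is only in whether the numbers can be read as probabilities.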

In my experience, one-versus-all linear SVMs and multinomial logistic regression are somewhat comparable in practice.
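
One plausible way to see why the two tend to behave similarly: both are linear models scoring each class, and they differ mainly in the loss applied to those scores. The sketch below is an illustration under my own assumptions, not a benchmark. The function names and score vectors are hypothetical, and the hinge loss shown is the plain one-versus-all form with margin 1.

```python
import math

def softmax_loss(scores, y):
    """Multinomial logistic (cross-entropy) loss for true class y."""
    m = max(scores)  # log-sum-exp with max subtraction for stability
    log_sum = m + math.log(sum(math.exp(s - m) for s in scores))
    return log_sum - scores[y]

def ova_hinge_loss(scores, y):
    """Sum of binary hinge losses, one per class (one-versus-all)."""
    total = 0.0
    for k, s in enumerate(scores):
        target = 1.0 if k == y else -1.0  # +1 for the true class, -1 otherwise
        total += max(0.0, 1.0 - target * s)
    return total

# Hypothetical score vectors; the true class is 0 in both cases.
confident = [3.0, -2.0, -2.5]   # true class well separated
wrong = [-1.0, 2.0, -0.5]       # class 1 scores highest

for name, scores in [("confident", confident), ("wrong", wrong)]:
    print(name, softmax_loss(scores, 0), ova_hinge_loss(scores, 0))
```

On the well-separated example the hinge loss is exactly zero while the softmax loss is small but still positive; on the misclassified example both are large. Both losses push the scores in the same direction, which is consistent with the two methods giving similar accuracy on many problems.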