
LFD Book Forum (http://book.caltech.edu/bookforum/index.php)
-   General Discussion of Machine Learning (http://book.caltech.edu/bookforum/forumdisplay.php?f=105)
-   -   SVMs, kernels and logistic regression (http://book.caltech.edu/bookforum/showthread.php?t=4349)

Elroch 06-11-2013 01:47 PM

SVMs, kernels and logistic regression
 
This course has been great for getting us to think about the issues that arise in applying machine learning techniques. Sometimes this has led to realisations, other times to unresolved questions! I'd like to mention a few for open discussion.

SVMs were what got me here in a sense. After enthusiastic experimentation with neural nets over the years, I got the impression in the last year that SVMs were something even better. I found my way to LIBSVM and while I managed to get some encouraging results, I also came to the conclusion that I didn't really know what I was doing, and needed to learn. I was right. [I am sure there is no connection with the fact that most of my silly errors in this course seem to have been with things like forgetting to pass a parameter to LIBSVM or misreading the number of support vectors - my fault entirely!]

One issue I have with SVMs is that what they typically do differs from what I at one time assumed: I thought there was a hidden linear component in the kernel which would allow an arbitrary linear rescaling of the inputs before the kernel function is applied.

The reason I am interested in this idea is that in some applications it is far from clear that all dimensions are equally important. Typically, the inputs get rescaled according to their range, and then the kernel is applied under the assumption that the range in every dimension is of identical relevance to the way the target function varies. That is certainly a reasonable default, but it is not difficult to imagine examples where you would want high resolution in some dimensions and low resolution in others. So the idea of a kernel which combines a linear rescaling with a Gaussian transform, giving a richer set of hypotheses, is interesting. Of course this comes at the cost of greater power, but perhaps some form of regularization and cross-validation would tame it.
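
For concreteness, here is a rough sketch of the kind of rescaled kernel I have in mind (assuming NumPy and scikit-learn are available; the scale values below are just placeholders one would presumably cross-validate):

Code:

import numpy as np

def scaled_rbf_kernel(X1, X2, scales, gamma=1.0):
    """Gaussian (RBF) kernel applied after a per-dimension linear rescaling.

    `scales` stretches or shrinks each input dimension before the usual
    squared-distance computation, so dimensions with larger scales get
    higher "resolution" in the resulting kernel.
    """
    A = X1 * scales  # rescale columns of the first point set
    B = X2 * scales  # rescale columns of the second point set
    # pairwise squared Euclidean distances between rescaled points
    sq_dists = (np.sum(A**2, axis=1)[:, None]
                + np.sum(B**2, axis=1)[None, :]
                - 2.0 * A @ B.T)
    return np.exp(-gamma * sq_dists)

# Hypothetical usage with scikit-learn's SVC and a precomputed kernel:
# from sklearn.svm import SVC
# scales = np.array([1.0, 0.1, 5.0])   # placeholder per-dimension scales
# K_train = scaled_rbf_kernel(X_train, X_train, scales)
# clf = SVC(kernel="precomputed", C=1.0).fit(K_train, y_train)
# K_test = scaled_rbf_kernel(X_test, X_train, scales)
# y_pred = clf.predict(K_test)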

Secondly, I have an intuitive feeling that the reason SVMs give good results may be that they are a computationally efficient approximation to what I have learnt is called kernel logistic regression. The idea occurred to me a while back, and it is good to see that some very smart people are working on it. I'll be very interested to see how the import vector machine concept develops, and whether it might even prove somewhat superior to SVMs as a general tool. It's nice to have a probabilistic interpretation and multi-class classification built in. [At present I suspect the choice mainly comes down to computational demands, with both methods learning very well - is this so?]
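
To make the comparison concrete, here is a rough sketch of kernel logistic regression done naively by gradient descent on the dual coefficients (assuming NumPy; this is only meant to illustrate the objective, not to be an efficient implementation):

Code:

import numpy as np

def rbf_kernel(X1, X2, gamma=1.0):
    sq = (np.sum(X1**2, axis=1)[:, None]
          + np.sum(X2**2, axis=1)[None, :]
          - 2.0 * X1 @ X2.T)
    return np.exp(-gamma * sq)

def train_klr(X, y, gamma=1.0, lam=1e-2, lr=0.1, steps=500):
    """Kernel logistic regression via gradient descent on dual coefficients.

    Model: f(x) = sum_i alpha_i K(x, x_i), with labels y in {-1, +1}.
    Objective: mean logistic loss + (lam/2) * alpha^T K alpha.
    """
    K = rbf_kernel(X, X, gamma)
    alpha = np.zeros(len(y))
    for _ in range(steps):
        margins = y * (K @ alpha)
        # d/dm of log(1 + exp(-m)) is -sigmoid(-m)
        grad_loss = K @ (-y / (1.0 + np.exp(margins))) / len(y)
        alpha -= lr * (grad_loss + lam * (K @ alpha))
    return alpha

def predict_proba(X_new, X_train, alpha, gamma=1.0):
    """P(y = +1 | x): the probabilistic output that KLR provides directly."""
    scores = rbf_kernel(X_new, X_train, gamma) @ alpha
    return 1.0 / (1.0 + np.exp(-scores))

Unlike the SVM solution, essentially every alpha_i is non-zero here, which is exactly where the extra computational cost comes from.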

Any comments, especially from those familiar with the way the field is developing, will be most welcome.

htlin 06-12-2013 03:32 PM

Re: SVMs, kernels and logistic regression
 
Quote:

Originally Posted by Elroch (Post 11128)
One issue I have with SVMs is that what they typically do differs from what I at one time assumed: I thought there was a hidden linear component in the kernel which would allow an arbitrary linear rescaling of the inputs before the kernel function is applied.

The reason I am interested in this idea is that in some applications it is far from clear that all dimensions are equally important. Typically, the inputs get rescaled according to their range, and then the kernel is applied under the assumption that the range in every dimension is of identical relevance to the way the target function varies. That is certainly a reasonable default, but it is not difficult to imagine examples where you would want high resolution in some dimensions and low resolution in others. So the idea of a kernel which combines a linear rescaling with a Gaussian transform, giving a richer set of hypotheses, is interesting. Of course this comes at the cost of greater power, but perhaps some form of regularization and cross-validation would tame it.

There is a rich literature of ongoing work on multiple kernel learning (MKL) that may match your thoughts here. MKL learns a convex combination of kernels, which equivalently rescales some of the transforms hidden under the kernels. From my limited experience, though, it is very difficult to control the greater power in MKL.
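
To illustrate the idea (just a sketch assuming NumPy and scikit-learn; real MKL algorithms such as SimpleMKL learn the weights jointly with the classifier, whereas here they are treated as plain hyperparameters):

Code:

import numpy as np

def combined_kernel(kernel_matrices, weights):
    """Convex combination K = sum_m w_m * K_m with w_m >= 0 and sum w_m = 1."""
    w = np.asarray(weights, dtype=float)
    assert np.all(w >= 0) and np.isclose(w.sum(), 1.0)
    return sum(wm * Km for wm, Km in zip(w, kernel_matrices))

# Hypothetical usage with a linear and an RBF base kernel:
# from sklearn.svm import SVC
# from sklearn.metrics.pairwise import linear_kernel, rbf_kernel
# K_list = [linear_kernel(X_train), rbf_kernel(X_train, gamma=0.5)]
# K = combined_kernel(K_list, weights=[0.3, 0.7])
# clf = SVC(kernel="precomputed", C=1.0).fit(K, y_train)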

Quote:

Originally Posted by Elroch (Post 11128)
Secondly, I have an intuitive feeling that the reason SVMs give good results may be that they are a computationally efficient approximation to what I have learnt is called kernel logistic regression. The idea occurred to me a while back, and it is good to see that some very smart people are working on it. I'll be very interested to see how the import vector machine concept develops, and whether it might even prove somewhat superior to SVMs as a general tool. It's nice to have a probabilistic interpretation and multi-class classification built in. [At present I suspect the choice mainly comes down to computational demands, with both methods learning very well - is this so?]

From my experience, SVM, KLR and other related approaches indeed lead to similar performance *if well-tuned*. I don't think any one approach is particularly "superior" to the others in terms of practical performance. The "if well-tuned" part is a pretty big assumption, though. Over the past ten years, the development of many tools, along with the effort put into making large-scale training computationally feasible, has made SVMs easier to tune than the other approaches. That may be one important reason for the success of SVMs.
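
In practice, "well-tuned" usually means something like a cross-validated search over the SVM parameters. A minimal sketch, assuming scikit-learn:

Code:

from sklearn.svm import SVC
from sklearn.model_selection import GridSearchCV

# Grid values below are placeholders; sensible ranges depend on the data scale.
param_grid = {
    "C":     [0.01, 0.1, 1, 10, 100],    # soft-margin trade-off
    "gamma": [0.001, 0.01, 0.1, 1, 10],  # RBF kernel width
}
search = GridSearchCV(SVC(kernel="rbf"), param_grid, cv=5)
# search.fit(X_train, y_train)           # X_train, y_train assumed available
# print(search.best_params_, search.best_score_)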

Hope this helps.

Elroch 06-13-2013 10:22 AM

Re: SVMs, kernels and logistic regression
 
Thanks for drawing my attention to multiple kernel learning. Yet another interesting machine learning theme to read up on!


