LFD Book Forum  

General Discussion of Machine Learning
#1
06-11-2013, 01:47 PM
Elroch (Invited Guest)
Join Date: Mar 2013
Posts: 143

SVMs, kernels and logistic regression

This course has been great for getting us to think about issues surrounding the application of machine learning techniques. Sometimes this has led to realisations, other times to unresolved questions! I'd like to mention a few for open discussion.

SVMs were what got me here in a sense. After enthusiastic experimentation with neural nets over the years, I got the impression in the last year that SVMs were something even better. I found my way to LIBSVM, and while I managed to get some encouraging results, I also came to the conclusion that I didn't really know what I was doing and needed to learn. I was right. [I am sure there is no connection with the fact that most of my silly errors in this course seem to have been with things like forgetting to pass a parameter to LIBSVM or misreading the number of support vectors - my fault entirely!]

One issue I have with SVMs is that what they typically do differs from what I at one time assumed. I thought there was a hidden linear component in the kernel which would allow an arbitrary linear rescaling of the inputs before the application of the kernel function.

The reason I am interested in this idea is that it is far from clear in some applications that all dimensions are equally important. Typically, the inputs get rescaled depending on their range, and then the kernel is applied under the assumption that the range in every dimension is of identical relevance to the way the target function varies. That is certainly a reasonable default, but it is not difficult to imagine examples where one would want high resolution in some dimensions and low resolution in others. So the idea of a kernel which combines a linear rescaling with a Gaussian transform, giving more hypotheses, is interesting. Of course this comes at the cost of greater power, but perhaps some form of regularization and cross-validation would tame it.
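
To make this concrete, here is a rough sketch (in Python with numpy) of the kind of kernel I mean: a Gaussian kernel in which each dimension is first rescaled by its own length scale. The function name and the length-scale values are just my own illustrative choices, not tuned ones.

Code:
import numpy as np

def ard_gaussian_kernel(X1, X2, length_scales):
    # Rescale each dimension by its own length scale, then apply the
    # usual Gaussian kernel: K[i, j] = exp(-||(x_i - x_j) / l||^2 / 2).
    X1s = X1 / length_scales
    X2s = X2 / length_scales
    sq_dists = (np.sum(X1s**2, axis=1)[:, None]
                + np.sum(X2s**2, axis=1)[None, :]
                - 2.0 * X1s @ X2s.T)
    return np.exp(-0.5 * np.maximum(sq_dists, 0.0))

# Hypothetical usage: high resolution in dimension 0, low in dimension 1.
X = np.random.randn(20, 2)
K = ard_gaussian_kernel(X, X, length_scales=np.array([0.1, 10.0]))

As far as I can tell, a Gram matrix like this can be passed to LIBSVM through its precomputed-kernel option, so the rest of the SVM machinery would be unchanged; the hard part is choosing the length scales, presumably by cross-validation.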

Secondly, I have an intuitive feeling that the reason SVMs give good results may be that they are a computationally efficient approximation to what I have learnt is called kernel logistic regression. The concept occurred to me a while back, and it is good to see that some very smart people are working on it. I'll be very interested to see how the import vector machine concept develops, and whether it might even be somewhat superior to SVMs as a general tool. It's nice to have a probabilistic interpretation and multi-class classification as intrinsic features. [At present I suspect it mainly comes down to computational demands, with both methods giving very good learning - is this so?]
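
To show what I mean, here is a minimal sketch of kernel logistic regression by plain gradient descent (my own toy implementation; the step size, iteration count and regularization value are arbitrary illustrative choices). It uses the same kernel expansion f(x) = sum_j alpha_j K(x, x_j) as the SVM, but replaces the hinge loss with the logistic loss, which is what buys the probabilistic output, at the cost of losing the sparsity of the support vectors.

Code:
import numpy as np

def fit_klr(K, y, lam=1.0, lr=0.1, n_iters=1000):
    # K: (n, n) Gram matrix on the training set; y: labels in {-1, +1}.
    # Minimizes mean logistic loss + (lam / 2) * alpha' K alpha.
    n = len(y)
    alpha = np.zeros(n)
    for _ in range(n_iters):
        f = K @ alpha                       # in-sample scores
        g = -y / (1.0 + np.exp(y * f))      # d(logistic loss)/d(score)
        alpha -= lr * (K @ g / n + lam * (K @ alpha))
    return alpha

def predict_proba(K_new, alpha):
    # K_new rows: kernel values between new points and the training points.
    # The logistic link gives P(y = +1 | x) directly.
    return 1.0 / (1.0 + np.exp(-(K_new @ alpha)))

Unlike the SVM solution, the alphas here are generally all nonzero, which is where, as I understand it, the import vector machine comes in: it selects a small subset of "import" points to keep the computation manageable.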

Any comments, especially from those familiar with the way the field is developing, will be most welcome.
#2
06-12-2013, 03:32 PM
htlin (NTU)
Join Date: Aug 2009
Location: Taipei, Taiwan
Posts: 601

Re: SVMs, kernels and logistic regression

Quote:
Originally Posted by Elroch
One issue I have with SVMs is that what they typically do differs from what I at one time assumed. I thought there was a hidden linear component in the kernel which would allow an arbitrary linear rescaling of the inputs before the application of the kernel function.

The reason I am interested in this idea is that it is far from clear in some applications that all dimensions are equally important. Typically, the inputs get rescaled depending on their range, and then the kernel is applied under the assumption that the range in every dimension is of identical relevance to the way the target function varies. That is certainly a reasonable default, but it is not difficult to imagine examples where one would want high resolution in some dimensions and low resolution in others. So the idea of a kernel which combines a linear rescaling with a Gaussian transform, giving more hypotheses, is interesting. Of course this comes at the cost of greater power, but perhaps some form of regularization and cross-validation would tame it.
There is a rich literature of ongoing work on multiple-kernel learning (MKL) that may match your thoughts here. MKL learns a convex combination of kernels, which equivalently rescales some of the transforms hidden under the kernels. From my limited experience, it is very difficult to control the greater power in MKL, though.
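
For instance, the core object in MKL is just a convex combination of base kernels. Below is a minimal sketch (in Python, assuming numpy and scikit-learn are available; SVC wraps LIBSVM). The weights beta are fixed placeholders here, whereas actual MKL solvers optimize them jointly with the classifier, which is where the difficulty of controlling the power shows up.

Code:
import numpy as np
from sklearn.svm import SVC

def combined_kernel(kernels, beta):
    # Convex combination of Gram matrices: beta_m >= 0, normalized to sum to 1.
    beta = np.asarray(beta, dtype=float)
    return sum(b * K for b, K in zip(beta / beta.sum(), kernels))

# Toy data and two base kernels: one linear, one Gaussian.
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 3))
y = np.where(X[:, 0] + X[:, 1] > 0, 1, -1)

K_lin = X @ X.T
sq = np.sum(X**2, 1)[:, None] + np.sum(X**2, 1)[None, :] - 2 * X @ X.T
K_rbf = np.exp(-0.5 * sq)

K = combined_kernel([K_lin, K_rbf], beta=[0.3, 0.7])
clf = SVC(kernel="precomputed", C=1.0).fit(K, y)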

Quote:
Originally Posted by Elroch
Secondly, I have an intuitive feeling that the reason SVMs give good results may be that they are a computationally efficient approximation to what I have learnt is called kernel logistic regression. The concept occurred to me a while back, and it is good to see that some very smart people are working on it. I'll be very interested to see how the import vector machine concept develops, and whether it might even be somewhat superior to SVMs as a general tool. It's nice to have a probabilistic interpretation and multi-class classification as intrinsic features. [At present I suspect it mainly comes down to computational demands, with both methods giving very good learning - is this so?]
From my experience, SVM, KLR and other related approaches indeed lead to similar performance *if well-tuned*. I don't think any approach is particularly "superior" to the others in terms of practical performance. The "if well-tuned" is a pretty big assumption, though. In the past ten years, the development of many tools, along with the efforts in making large-scale training computationally feasible, has made it easier to tune SVMs than the other approaches. That may be one important reason for the success of the SVM.
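
To be concrete, "well-tuned" usually means at least a cross-validated search over the soft-margin parameter C and the kernel parameter (gamma for the Gaussian kernel). A minimal sketch with scikit-learn, whose SVC wraps LIBSVM (the toy data and grid values are illustrative only):

Code:
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Toy two-class data; in practice X, y come from your application.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 2))
y = np.where(X[:, 0] * X[:, 1] > 0, 1, -1)

# 5-fold cross-validated grid search over C and gamma.
param_grid = {"C": [0.1, 1, 10, 100], "gamma": [0.01, 0.1, 1, 10]}
search = GridSearchCV(SVC(kernel="rbf"), param_grid, cv=5)
search.fit(X, y)
print(search.best_params_, search.best_score_)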

Hope this helps.
__________________
When one teaches, two learn.
#3
06-13-2013, 10:22 AM
Elroch (Invited Guest)
Join Date: Mar 2013
Posts: 143

Re: SVMs, kernels and logistic regression

Thanks for drawing my attention to multiple kernel learning. Yet another interesting machine learning theme to read up on!