LFD Book Forum on the right track?

#1
03-14-2013, 06:18 PM
 Sendai Member Join Date: Jan 2013 Location: Minnesota Posts: 29
on the right track?

Since most of us are probably writing our own regular RBF implementations by hand, I thought it would be helpful to compare results on a couple of simple test cases to make sure our implementations are correct.

data set = (0, 0) (0, 1) (1, 0) (1, 1)
labels = 1 -1 -1 1
centers = (0, 0.2) (1, 0.7)

Case 1:
γ = 1

weights = (-7.848, 7.397, 7.823)
(first weight is the bias)

Case 2:
γ = 100

weights = (-1.0, 109.196, 16206.168)
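For anyone who wants to reproduce these numbers, here is a minimal sketch. It assumes the basis function is exp(-γ‖x − μ‖²) and that the weights come from an unregularized least-squares fit with a leading bias column; the function names `rbf_features` and `fit_rbf` are illustrative and not taken from the course material.

```python
import numpy as np

def rbf_features(X, centers, gamma):
    """Feature matrix with rows [1, exp(-gamma * ||x - mu_k||^2) for each center]."""
    # Squared distances between every point and every center, shape (N, K).
    sq_dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    # Prepend a column of ones so that the first weight acts as the bias.
    return np.hstack([np.ones((X.shape[0], 1)), np.exp(-gamma * sq_dist)])

def fit_rbf(X, y, centers, gamma):
    """Least-squares weights (bias first) for fixed, given centers."""
    Phi = rbf_features(X, centers, gamma)
    w, *_ = np.linalg.lstsq(Phi, y, rcond=None)
    return w

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([1, -1, -1, 1], dtype=float)
centers = np.array([[0, 0.2], [1, 0.7]])

for gamma in (1, 100):
    w = fit_rbf(X, y, centers, gamma)
    # In-sample classification error: fraction of points with the wrong sign.
    e_in = np.mean(np.sign(rbf_features(X, centers, gamma) @ w) != y)
    print(f"gamma={gamma}: weights={np.round(w, 3)}, Ein={e_in}")
```

Under these assumptions the fitted weights should agree with the two cases above up to rounding.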
#2
03-14-2013, 09:44 PM
 hemphill Member Join Date: Jan 2013 Posts: 18
Re: on the right track?

I concur with those results.
#3
03-14-2013, 11:00 PM
 jdreaver Junior Member Join Date: Jan 2013 Location: Riverside, CA Posts: 8
Re: on the right track?

I agree with those results as well.
#4
03-14-2013, 11:29 PM
 ripande Senior Member Join Date: Jan 2013 Posts: 71
Re: on the right track?

My results concur with these figures too.
#5
03-15-2013, 05:42 PM
 kirill Member Join Date: Jan 2013 Posts: 14
Re: on the right track?

Same results here.
#6
03-15-2013, 08:00 PM
 boulis Member Join Date: Feb 2013 Location: Sydney, Australia Posts: 29
Re: on the right track?

Quote:
 Originally Posted by Sendai
 Since most of us are probably writing our own regular RBF implementations by hand, I thought it would be helpful to compare results on a couple of simple test cases to make sure our implementations are correct.

 data set = (0, 0) (0, 1) (1, 0) (1, 1)
 labels = 1 -1 -1 1
 centers = (0, 0.2) (1, 0.7)

I assume you mean that the centres are given as such and they are not computed by Lloyd's algorithm.
I believe that Lloyd's algorithm would produce unstable results with this configuration of points and number of centers.
Indeed, these are the 3 different cases that I get from running my algorithm (the results depend on the random starting point of course):

centres = [[ 0., 0.], [ 0.66666667, 0.66666667]]
clusters = [[[0, 0]], [[0, 1], [1, 0], [1, 1]]]

centres = [[ 0.66666667, 0.33333333], [ 0., 1.]]
clusters = [[[0, 0], [1, 0], [1, 1]], [[0, 1]]]

centres = [[ 1. , 0.5], [ 0. , 0.5]]
clusters = [[[1, 0], [1, 1]], [[0, 0], [0, 1]]]

These seem pretty reasonable to me. Do other people get the same results?
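A minimal sketch of Lloyd's algorithm on these four points may make the comparison easier. The starting centres below are hand-picked for illustration (chosen to avoid distance ties) rather than random, and keeping the old centre for an empty cluster is just one possible convention.

```python
def lloyd(points, centres, max_iter=100):
    """Run Lloyd's algorithm from the given starting centres.

    Returns (centres, clusters). An empty cluster keeps its previous centre.
    """
    centres = [tuple(c) for c in centres]
    for _ in range(max_iter):
        # Assignment step: each point joins the cluster of its nearest centre.
        clusters = [[] for _ in centres]
        for p in points:
            dists = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centres]
            clusters[dists.index(min(dists))].append(p)
        # Update step: move each centre to the mean of its cluster.
        new_centres = [
            tuple(sum(coord) / len(cl) for coord in zip(*cl)) if cl else c
            for cl, c in zip(clusters, centres)
        ]
        if new_centres == centres:  # converged: no centre moved
            break
        centres = new_centres
    return centres, clusters

points = [(0, 0), (0, 1), (1, 0), (1, 1)]
# Illustrative starting centres, one per configuration listed above.
for start in ([(0, 0), (0.9, 0.9)], [(0.9, 0.1), (0, 1)], [(1, 0.5), (0, 0.5)]):
    centres, clusters = lloyd(points, start)
    print(centres, clusters)
```

Each of the three starts settles into one of the three configurations listed above, which is consistent with the outcome depending on the starting point.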
#7
03-16-2013, 06:23 PM
 ripande Senior Member Join Date: Jan 2013 Posts: 71
Re: on the right track?

I have run my algorithm on the last dataset that you provided, and I get the same results.
#8
03-16-2013, 06:28 PM
 heer2351 Member Join Date: Feb 2013 Posts: 13
Re: on the right track?

I get the same weights, but I do not understand how you get an Ein of 0 with these points and clusters.

I would say points 1 and 2 belong to center 1, and these points have opposite labels. The same goes for points 3 and 4, which belong to center 2. So how can Ein be zero?
#9
03-17-2013, 02:16 AM
 boulis Member Join Date: Feb 2013 Location: Sydney, Australia Posts: 29
Re: on the right track?

Quote:
 Originally Posted by heer2351
 I get the same weights. But do not understand how you get to an Ein of 0 with these points and clusters.

 I would say point 1 and 2 belong to center 1, these points have opposite labels. Same for point 3 and 4 which belong to center 2. So how can Ein be zero?

I get the same weights too if I take the centres given as input. And I also get Ein = 0.

It is not strange to have Ein = 0, and I am not sure why you are confused about it. You are correct that the clusters are as you describe them, but the weighted sum of the RBFs gives the right sign at each point (and in fact almost the right value, not just the sign). The key in this case, I believe, is the bias (w0 or b). It is negative, so the field starts out negative everywhere. The two RBFs then work to make it positive. The points close to the centres are affected most and become positive; the other two stay negative.
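The argument can be checked numerically with a short sketch. It plugs in the Case 1 weights from the first post and assumes the basis function has the form exp(-γ‖x − μ‖²) with γ = 1; the helper name `rbf` is illustrative.

```python
import math

points = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = [1, -1, -1, 1]
centres = [(0, 0.2), (1, 0.7)]
bias, w1, w2 = -7.848, 7.397, 7.823  # Case 1 weights from the first post
gamma = 1

def rbf(x, mu):
    """Gaussian bump exp(-gamma * ||x - mu||^2) (assumed form)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, mu)))

# Signal = negative bias plus the two positive, weighted bumps.
signals = [bias + w1 * rbf(p, centres[0]) + w2 * rbf(p, centres[1]) for p in points]
for p, y, s in zip(points, labels, signals):
    print(f"x={p}  label={y:+d}  signal={s:+.3f}")
```

The signal comes out positive only at (0, 0) and (1, 1), the two points nearest the centres, so every sign matches its label.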

Can someone please revisit my comment earlier? (That these are not the centres when Lloyd's algorithm is applied.)

One more question: Q14 asks for the percentage of runs in which the data are not separable by the RBF kernel (i.e. hard-margin SVM with RBF kernel). In about 1000 runs I never encountered non-separable data. Is this normal? I am using libsvm, so it is harder to make a mistake about it, but who knows. I have checked that it identifies non-separable data correctly (the 4-point example given in this thread is one such case).
#10
03-17-2013, 10:33 AM
 heer2351 Member Join Date: Feb 2013 Posts: 13
Re: on the right track?

Thanks for your reply. I reviewed the lecture again and spotted my thinking error.

I changed my code and it now gives Ein = 0 for the examples. However, I still have doubts about whether my code is correct, because when I use Lloyd's algorithm with these examples my Ein is never zero. As a matter of fact, my Ein is rather large all the time; I am trying to find the flaw in my code.

What Ein values do you get in these examples when using Lloyd's algorithm?

I agree that the centers are not calculated according to Lloyd's algorithm, and I also agree with your centers. However, I think there are more possible centers than the ones you list, for example:

Code:
[1  , 1]   [1/3, 1/3]
[1/2, 1]   [1/2, 0]
[1  , 0]   [1/3, 2/3]
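One way to check pairs like these is to test whether a single assignment-and-update step of Lloyd's algorithm leaves the centers unchanged. The sketch below reads each row above as one pair of centers (an assumption about the layout); the function name `is_lloyd_fixed_point` is illustrative.

```python
def is_lloyd_fixed_point(points, centres, tol=1e-9):
    """True if one assignment + update step leaves every centre where it is."""
    clusters = [[] for _ in centres]
    for p in points:
        dists = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centres]
        clusters[dists.index(min(dists))].append(p)
    for cl, c in zip(clusters, centres):
        if not cl:
            return False  # an empty cluster is treated as not converged here
        mean = [sum(coord) / len(cl) for coord in zip(*cl)]
        if any(abs(m - v) > tol for m, v in zip(mean, c)):
            return False
    return True

points = [(0, 0), (0, 1), (1, 0), (1, 1)]
candidates = [
    [(1, 1), (1/3, 1/3)],
    [(1/2, 1), (1/2, 0)],
    [(1, 0), (1/3, 2/3)],
]
for centres in candidates:
    print(centres, is_lloyd_fixed_point(points, centres))

# The centres given in the first post, for comparison.
print(is_lloyd_fixed_point(points, [(0, 0.2), (1, 0.7)]))
```

All three pairs pass this check, while the centers from the first post do not, which fits the view that those were supplied by hand.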
I have the same experience with LibSVM: Ein is zero all the time. I attribute this to the fact that this is hard-margin SVM, and only the number of support vectors goes up. We are basically overfitting, which shows in the Eout.
