LFD Book Forum on the right track?

#11
03-17-2013, 09:45 AM
 Sendai Member Join Date: Jan 2013 Location: Minnesota Posts: 29
Re: on the right track?

Quote:
 Originally Posted by boulis
 I assume you mean that the centres are given as such and they are not computed by Lloyd's algorithm.
Correct.

Quote:
 centres = [[ 0., 0.], [ 0.66666667, 0.66666667]]
 clusters = [[[0, 0]], [[0, 1], [1, 0], [1, 1]]]

 centres = [[ 0.66666667, 0.33333333], [ 0., 1.]]
 clusters = [[[0, 0], [1, 0], [1, 1]], [[0, 1]]]

 centres = [[ 1. , 0.5], [ 0. , 0.5]]
 clusters = [[[1, 0], [1, 1]], [[0, 0], [0, 1]]]
I get these too, plus the ones with x1 and x2 reversed.
#12
03-17-2013, 09:48 AM
 heer2351 Member Join Date: Feb 2013 Posts: 13
Re: on the right track?

@Sendai

What values for Ein do you get when running your algorithm with Lloyd's centers?

I think I am still somewhere off the track.
#13
03-17-2013, 10:03 AM
 Sendai Member Join Date: Jan 2013 Location: Minnesota Posts: 29
Re: on the right track?

centres = [[ 0., 0.], [ 0.66666667, 0.66666667]]
Ein = 0

centres = [[ 0.66666667, 0.33333333], [ 0., 1.]]
Ein = 0

centres = [[ 1. , 0.5], [ 0. , 0.5]]
Ein = 0.5
(The weights are all zero -- RBF can't break the symmetry and fails.)
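
To see why the fit collapses for this last configuration: with centres (1, 0.5) and (0, 0.5), the two points that share the same x1 get identical RBF features but carry opposite labels, so a least-squares fit cannot do better than predicting 0 for both. A minimal sketch, assuming the Gaussian form exp(-gamma * ||x - mu||^2), the data set and labels from the test cases in this thread, and gamma = 1 (the argument does not depend on gamma). The name `rbf_features` is illustrative only.

```python
import math

points = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = [1, -1, -1, 1]
centres = [(1.0, 0.5), (0.0, 0.5)]
gamma = 1.0  # assumed; any positive value leads to the same conclusion

def rbf_features(x, centres, gamma):
    """Return [1, exp(-gamma*||x-mu_1||^2), exp(-gamma*||x-mu_2||^2)]."""
    feats = [1.0]  # bias term
    for mu in centres:
        d2 = sum((xi - mi) ** 2 for xi, mi in zip(x, mu))
        feats.append(math.exp(-gamma * d2))
    return feats

features = [rbf_features(p, centres, gamma) for p in points]
for p, y, f in zip(points, labels, features):
    print(p, y, [round(v, 6) for v in f])

# (0,0) and (0,1) share one feature vector; so do (1,0) and (1,1).
# Each pair carries labels +1 and -1, so no weight vector can separate them.
```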
#14
03-17-2013, 03:38 PM
 Anne Paulson Senior Member Join Date: Jan 2013 Location: Silicon Valley Posts: 52
Re: on the right track?

I had a difficult bug to find here, in R. Beware, other R users. I ran the first example, and it was just fine. I ran the second example, and got nothing like the right answer. What? said me.

I had been using my own linear regression program, since it was conveniently to hand. But I knew it worked. So I tried the built-in R linear regression, which gave the correct answer. I noticed that in my own linear regression, I was calling ginv, the pseudo-inverse function. I called solve, the actual inverse function (why you get the inverse of a square matrix by calling "solve" is beyond me, but I digress). That worked.

Finally I realized that our little example is quite a nasty matrix. The tolerance for ginv was something like 1E-8. Not small enough! Once I lowered the tolerance, everything was dandy.
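
For readers not using R, the same trap can be sketched in Python. This is only a guess at the setup, not the code described above: it assumes the second test case from this thread (gamma = 100), assumes the pseudo-inverse is applied to the square matrix Phi^T Phi, and uses NumPy's `numpy.linalg.pinv` with its `rcond` cutoff as a stand-in for ginv's tolerance. The helper names `fit` and `e_in` are made up for the example.

```python
import numpy as np

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([1, -1, -1, 1], dtype=float)
centres = np.array([[0.0, 0.2], [1.0, 0.7]])
gamma = 100.0  # case 2 from the thread

# Phi has a bias column followed by exp(-gamma * ||x - mu_k||^2) per centre.
d2 = ((X[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
Phi = np.hstack([np.ones((len(X), 1)), np.exp(-gamma * d2)])

A = Phi.T @ Phi  # normal-equation matrix; badly conditioned for this gamma

def fit(rcond):
    """Weights from pinv(Phi^T Phi) (Phi^T y) with the given relative cutoff."""
    return np.linalg.pinv(A, rcond=rcond) @ (Phi.T @ y)

def e_in(w):
    """Fraction of training points whose predicted sign differs from the label."""
    return float(np.mean(np.sign(Phi @ w) != y))

w_loose = fit(1e-8)    # cutoff of the size mentioned above
w_tight = fit(1e-15)   # much smaller cutoff

print("cond(A) =", np.linalg.cond(A))
print("rcond=1e-8 :", w_loose, "Ein =", e_in(w_loose))
print("rcond=1e-15:", w_tight, "Ein =", e_in(w_tight))
```

With the looser cutoff, the smallest singular value of Phi^T Phi falls below the threshold and is discarded, so the weight on the second centre is effectively lost. With the tighter cutoff the weights should come out close to the case 2 values quoted in this thread.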
#15
03-17-2013, 04:29 PM
 boulis Member Join Date: Feb 2013 Location: Sydney, Australia Posts: 29
Re: on the right track?

Thanks heer2351 and Sendai. Indeed there are 6 different centre configurations; the ones you gave complete the picture. You can also see this with paper and pencil.
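
The count of six can also be checked by brute force. The sketch below is one way to do it, under the assumption that a "configuration" means a split of the four points into two non-empty clusters that Lloyd's algorithm would leave unchanged, i.e. every point is strictly closer to the centroid of its own cluster. The function names are illustrative.

```python
from fractions import Fraction
from itertools import combinations

points = [(0, 0), (0, 1), (1, 0), (1, 1)]

def centroid(cluster):
    """Exact centroid of a list of 2-D points."""
    n = len(cluster)
    return tuple(Fraction(sum(p[i] for p in cluster), n) for i in range(2))

def dist2(p, c):
    """Squared Euclidean distance."""
    return sum((pi - ci) ** 2 for pi, ci in zip(p, c))

def is_stable(cluster_a, cluster_b):
    """True if every point is strictly closer to its own cluster's centroid."""
    ca, cb = centroid(cluster_a), centroid(cluster_b)
    return (all(dist2(p, ca) < dist2(p, cb) for p in cluster_a)
            and all(dist2(p, cb) < dist2(p, ca) for p in cluster_b))

stable = []
# Keep points[0] in the first cluster so each unordered split is seen once.
rest = points[1:]
for k in range(len(rest)):  # k further points join points[0]
    for extra in combinations(rest, k):
        a = [points[0], *extra]
        b = [p for p in points if p not in a]
        if is_stable(a, b):
            stable.append((centroid(a), centroid(b)))

for ca, cb in stable:
    print([float(v) for v in ca], [float(v) for v in cb])
print(len(stable), "stable configurations")
```

Of the seven possible splits, the only one rejected is the diagonal 2-2 split, where both centroids land on (0.5, 0.5) and every point is tied.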

Quote:
 Originally Posted by heer2351
 @Sendai
 What values for Ein do you get when running your algorithm with Lloyds centers?
 I think I am still somewhere off the track
With this 4-point example the only options for Ein are 0, 0.25, and 0.5.
If you get anything else, you know your code has a bug.

I have not tried it with my code, but Sendai's results seem correct. So you should get Ein = 0 for the centres that have 1/3 or 2/3 in them (when the clusters are 3-1 points) and Ein = 0.5 when the centres have 1/2 in them (the case where the clusters are 2-2 points).

I also think that it might be impossible to get Ein = 0.25 for any centre configuration, provided that the weights are chosen optimally (i.e., using the pseudo-inverse method). So you either make no mistakes, or 2 points are misclassified.
#16
03-18-2013, 06:21 AM
 melipone Senior Member Join Date: Jan 2013 Posts: 72
Re: on the right track?

Geez, I don't get the same weights! I do get Ein=0 though. Could you post your phi matrix so that I can locate my error? Thanks.

Quote:
 Originally Posted by Sendai
 Since probably most of us are writing our own regular RBF implementations by hand, I thought it would be helpful to compare the results of a couple of simple test cases to make sure our implementations are correct.

 data set = (0, 0) (0, 1) (1, 0) (1, 1)
 labels = 1 -1 -1 1
 centers = (0, 0.2) (1, 0.7)

 Case 1: γ = 1
 weights = (-7.848, 7.397, 7.823) (first weight is the bias)

 Case 2: γ = 100
 weights = (-1.0, 109.196, 16206.168)
#17
03-18-2013, 09:04 AM
 Sendai Member Join Date: Jan 2013 Location: Minnesota Posts: 29
Re: on the right track?

Quote:
 Originally Posted by melipone
 Geez, I don't get the same weights! I do get Ein=0 though. Could you post your phi matrix so that I can locate my error? Thanks.
Here's the phi matrix I get for case 1:

Code:
[[ 1.          0.96078944  0.22537266]
[ 1.          0.35345468  0.61262639]
[ 1.          0.52729242  0.33621649]
[ 1.          0.19398004  0.91393119]]
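
For anyone comparing against their own code, the matrix above can be reproduced with a few lines of plain Python. This is a sketch, assuming each row is [1, exp(-gamma*||x - mu_1||^2), exp(-gamma*||x - mu_2||^2)] with gamma = 1. Note that the posted rows correspond to the points in the order (0,0), (1,0), (0,1), (1,1), which differs from the order in which the data set was listed. The labels come out the same either way, since both middle points are labelled -1. The helper name `phi_row` is illustrative.

```python
import math

# Row order chosen to match the matrix posted above.
points = [(0, 0), (1, 0), (0, 1), (1, 1)]
labels = [1, -1, -1, 1]
centers = [(0.0, 0.2), (1.0, 0.7)]
gamma = 1.0  # case 1

def phi_row(x):
    """Bias term followed by one Gaussian feature per centre."""
    row = [1.0]
    for mu in centers:
        d2 = sum((xi - mi) ** 2 for xi, mi in zip(x, mu))  # squared norm, no sqrt
        row.append(math.exp(-gamma * d2))
    return row

phi = [phi_row(p) for p in points]
for row in phi:
    print(" ".join(f"{v:.8f}" for v in row))

# Plug in the case 1 weights quoted in the thread (bias first) and count errors.
w = (-7.848, 7.397, 7.823)
signals = [sum(wi * fi for wi, fi in zip(w, row)) for row in phi]
e_in = sum((s > 0) != (y > 0) for s, y in zip(signals, labels)) / len(labels)
print("Ein =", e_in)
```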
#18
03-18-2013, 04:53 PM
 heer2351 Member Join Date: Feb 2013 Posts: 13
Re: on the right track?

Sendai, thanks for starting this thread. It enabled me to find a flaw in my code and, even better, fix it.

All answers I found with my fixed code were correct.

Special thanks to boulis for his first response; I was overthinking the problem.
#19
03-18-2013, 06:22 PM
 melipone Senior Member Join Date: Jan 2013 Posts: 72
Re: on the right track?

Ah ha, I see how you got that, but I think you need to take the square root after you add up the squared differences between each data point and the cluster centers. You did not take the square root. IMHO you have to.

Quote:
 Originally Posted by Sendai
 Here's the phi matrix I get for case 1:
 Code:
 [[ 1.          0.96078944  0.22537266]
 [ 1.          0.35345468  0.61262639]
 [ 1.          0.52729242  0.33621649]
 [ 1.          0.19398004  0.91393119]]
#20
03-18-2013, 08:41 PM
 boulis Member Join Date: Feb 2013 Location: Sydney, Australia Posts: 29
Re: on the right track?

Quote:
 Originally Posted by melipone
 Ah ha, I see how you got that but I think that you need to take the square root after you add up the distance of each data point to the cluster centers. You did not take the square root. I think you have to IMHO.
If you look at the formula, you'll notice that the norm is squared, so there is no need to take the square root in the first place.
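
For reference, here is the formula written out, assuming the regular RBF form with a bias b and K Gaussian terms (the symbols are the usual ones and are not copied from this thread):

```latex
h(\mathbf{x}) = \operatorname{sign}\!\left( \sum_{k=1}^{K} w_k \exp\!\left(-\gamma \,\lVert \mathbf{x} - \boldsymbol{\mu}_k \rVert^{2}\right) + b \right),
\qquad
\lVert \mathbf{x} - \boldsymbol{\mu}_k \rVert^{2} = \sum_{i} \left(x_i - \mu_{k,i}\right)^{2}
```

As a quick check with x = (0, 0), mu = (0, 0.2) and gamma = 1: the squared norm is 0.04 and exp(-0.04) ≈ 0.9608, which matches the first non-bias entry of the phi matrix posted earlier. Taking the square root first would give exp(-0.2) ≈ 0.8187 instead.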

All times are GMT -7.