LFD Book Forum (http://book.caltech.edu/bookforum/index.php)
-   The Final (http://book.caltech.edu/bookforum/forumdisplay.php?f=138)
-   -   on the right track? (http://book.caltech.edu/bookforum/showthread.php?t=4104)

 Sendai 03-17-2013 10:45 AM

Re: on the right track?

Quote:
 Originally Posted by boulis (Post 9942)
 I assume you mean that the centres are given as such and they are not computed by Lloyd's algorithm.
Correct.

Quote:
 centres = [[ 0., 0.], [ 0.66666667, 0.66666667]]
 clusters = [[[0, 0]], [[0, 1], [1, 0], [1, 1]]]
 centres = [[ 0.66666667, 0.33333333], [ 0., 1.]]
 clusters = [[[0, 0], [1, 0], [1, 1]], [[0, 1]]]
 centres = [[ 1. , 0.5], [ 0. , 0.5]]
 clusters = [[[1, 0], [1, 1]], [[0, 0], [0, 1]]]
I get these too, plus the ones with x1 and x2 reversed.
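
A minimal sketch (Python/NumPy, not taken from the thread) of how one might check that the quoted configurations are stable under Lloyd's algorithm: run a single assign-then-average step from each set of centres and confirm the centres do not move. The helper name `lloyd_step` is hypothetical.

```python
import numpy as np

def lloyd_step(X, centres):
    """One Lloyd iteration: assign each point to its nearest centre, then average."""
    # Squared Euclidean distance from every point to every centre, shape (N, K).
    sq_dist = ((X[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
    nearest = sq_dist.argmin(axis=1)
    # Assumes no cluster ends up empty, which holds for the configurations below.
    new_centres = np.array(
        [X[nearest == k].mean(axis=0) for k in range(len(centres))]
    )
    return nearest, new_centres

# The four data points used throughout the thread.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)

# The three centre configurations quoted above.
configurations = [
    np.array([[0.0, 0.0], [2 / 3, 2 / 3]]),
    np.array([[2 / 3, 1 / 3], [0.0, 1.0]]),
    np.array([[1.0, 0.5], [0.0, 0.5]]),
]

is_fixed_point = []
for centres in configurations:
    nearest, new_centres = lloyd_step(X, centres)
    # A configuration is stable if re-averaging leaves the centres unchanged.
    is_fixed_point.append(bool(np.allclose(new_centres, centres)))
    print(centres.tolist(), "assignments:", nearest.tolist(), "stable:", is_fixed_point[-1])
```

The same check can be applied to the mirrored configurations (x1 and x2 swapped) by swapping the columns of each centre array.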

 heer2351 03-17-2013 10:48 AM

Re: on the right track?

@Sendai

What values for Ein do you get when running your algorithm with Lloyd's centers?

I think I am still somewhere off the track :(

 Sendai 03-17-2013 11:03 AM

Re: on the right track?

centres = [[ 0., 0.], [ 0.66666667, 0.66666667]]
Ein = 0

centres = [[ 0.66666667, 0.33333333], [ 0., 1.]]
Ein = 0

centres = [[ 1. , 0.5], [ 0. , 0.5]]
Ein = 0.5
(The weights are all zeros -- RBF can't break the symmetry and fails.)
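
A rough sketch of how these three results could be reproduced with a regular RBF model fitted by pseudo-inverse. Two things here are assumptions, not stated in this post: the labels are taken to be the (1, -1, -1, 1) labels from the thread's test case, and gamma is set to 1 purely for illustration. The explicit `rcond` cutoff is there because in the 2-2 case the two RBF columns add up to a constant, so the design matrix is rank deficient.

```python
import numpy as np

def rbf_design_matrix(X, centres, gamma):
    """Bias column followed by exp(-gamma * ||x - mu||^2) for each centre."""
    sq_dist = ((X[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
    return np.hstack([np.ones((len(X), 1)), np.exp(-gamma * sq_dist)])

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([1, -1, -1, 1], dtype=float)  # assumed labels (XOR pattern)
gamma = 1.0                                # assumed value, for illustration only

def fit(centres):
    Phi = rbf_design_matrix(X, centres, gamma)
    # Minimum-norm least-squares weights; the cutoff discards the
    # numerically-zero singular value that appears in the 2-2 case.
    w = np.linalg.pinv(Phi, rcond=1e-10) @ y
    return Phi, w

Phi_a, w_a = fit(np.array([[0.0, 0.0], [2 / 3, 2 / 3]]))  # 1-3 clusters
Phi_b, w_b = fit(np.array([[2 / 3, 1 / 3], [0.0, 1.0]]))  # 3-1 clusters
Phi_c, w_c = fit(np.array([[1.0, 0.5], [0.0, 0.5]]))      # 2-2 clusters

# Fraction of misclassified points for the two asymmetric configurations.
ein_a = float(np.mean(np.sign(Phi_a @ w_a) != y))
ein_b = float(np.mean(np.sign(Phi_b @ w_b) != y))
print("Ein (1-3):", ein_a, " Ein (3-1):", ein_b)
print("weights (2-2):", w_c)
```

In the 2-2 case every column of the design matrix is orthogonal to the label vector, so the least-squares weights vanish whatever gamma is used; the reported Ein then depends only on how a zero output is classified.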

 Anne Paulson 03-17-2013 04:38 PM

Re: on the right track?

I had a difficult bug to find here, in R. Beware, other R users. I ran the first example, and it was just fine. I ran the second example, and got nothing like the right answer. What? said me.

I had been using my own linear regression program, since it was conveniently to hand. But I knew it worked. So I tried the builtin R linear regression, which gave the correct answer. I noticed that in my own linear regression, I was calling ginv, the pseudo-inverse function. I called solve, the actual inverse function (why you get the inverse of a square matrix by calling "solve" is beyond me, but I digress). That worked.

Finally I realized that our little example is quite a nasty matrix. The tolerance for ginv was something like 1E-8. Not small enough! Once I lowered the tolerance, everything was dandy.
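
A sketch of the same pitfall in Python/NumPy rather than R, using the thread's second test case (gamma = 100). It assumes the pseudo-inverse was applied to the square matrix Phi^T Phi, which the post does not say explicitly, and it uses a hand-written helper (`truncated_pinv`, a hypothetical name) in place of R's `ginv`, with a relative cutoff of 1e-8 standing in for "something like 1E-8".

```python
import numpy as np

def rbf_design_matrix(X, centers, gamma):
    """Bias column followed by exp(-gamma * ||x - mu||^2) for each center."""
    sq_dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.hstack([np.ones((len(X), 1)), np.exp(-gamma * sq_dist)])

def truncated_pinv(A, rel_tol):
    """Pseudo-inverse via SVD; singular values <= rel_tol * largest are dropped."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    keep = s > rel_tol * s.max()
    s_inv = np.zeros_like(s)
    s_inv[keep] = 1.0 / s[keep]
    return (Vt.T * s_inv) @ U.T

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([1, -1, -1, 1], dtype=float)
centers = np.array([[0.0, 0.2], [1.0, 0.7]])

Phi = rbf_design_matrix(X, centers, gamma=100.0)
G = Phi.T @ Phi  # square but badly conditioned

# Same normal equations, two different cutoffs for "zero" singular values.
w_loose = truncated_pinv(G, 1e-8) @ (Phi.T @ y)
w_tight = truncated_pinv(G, 1e-15) @ (Phi.T @ y)

print("singular values of Phi^T Phi:", np.linalg.svd(G, compute_uv=False))
print("cutoff 1e-8 :", w_loose)
print("cutoff 1e-15:", w_tight)
```

With the looser cutoff the smallest singular value is treated as zero, so the weights should no longer match the ones posted for case 2; with the tighter cutoff they should. Applying the pseudo-inverse to Phi directly instead of Phi^T Phi also avoids squaring the condition number.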

 boulis 03-17-2013 05:29 PM

Re: on the right track?

Thanks heer2351 and Sendai. Indeed there are 6 different centre configurations; the ones you gave complete the picture. You can also see this with paper and pencil.

Quote:
 Originally Posted by heer2351 (Post 9974)
 @Sendai What values for Ein do you get when running your algorithm with Lloyd's centers? I think I am still somewhere off the track :(
With this 4-point example, the only options for Ein are 0, 0.25, and 0.5.
If you get anything else, you know your code has a bug.

I have not tried it with my code, but Sendai's results seem correct. So you should get Ein = 0 for the centres that have 1/3 or 2/3 in them (when the clusters are 3-1 points) and Ein = 0.5 when the centres have 1/2 in them (the case where the clusters are 2-2 points).

I also think that it might be impossible to get Ein = 0.25 for any centre configuration, provided that the weights are chosen optimally (i.e., using the pseudo-inverse method). So you either make no mistakes, or 2 points are misclassified.

 melipone 03-18-2013 07:21 AM

Re: on the right track?

Geez, I don't get the same weights! I do get Ein=0 though. Could you post your phi matrix so that I can locate my error? thanks.

Quote:
 Originally Posted by Sendai (Post 9922)
 Since probably most of us are writing our own regular RBF implementations by hand, I thought it would be helpful to compare the results of a couple of simple test cases to make sure our implementations are correct.
 data set = (0, 0) (0, 1) (1, 0) (1, 1)
 labels = 1 -1 -1 1
 centers = (0, 0.2) (1, 0.7)
 Case 1: γ = 1
 weights = (-7.848, 7.397, 7.823) (first weight is the bias)
 Case 2: γ = 100
 weights = (-1.0, 109.196, 16206.168)

 Sendai 03-18-2013 10:04 AM

Re: on the right track?

Quote:
 Originally Posted by melipone (Post 9996)
 Geez, I don't get the same weights! I do get Ein=0 though. Could you post your phi matrix so that I can locate my error? thanks.
Here's the phi matrix I get for case 1:

Code:

[[ 1.          0.96078944  0.22537266]
 [ 1.          0.35345468  0.61262639]
 [ 1.          0.52729242  0.33621649]
 [ 1.          0.19398004  0.91393119]]

 heer2351 03-18-2013 05:53 PM

Re: on the right track?

Sendai, thanks for starting this thread; it enabled me to find a flaw in my code and, even better, fix it :)

All answers I found with my fixed code were correct.

Special thanks to boulis for his first response, I was overthinking the problem.

 melipone 03-18-2013 07:22 PM

Re: on the right track?

Ah ha, I see how you got that but I think that you need to take the square root after you add up the distance of each data point to the cluster centers. You did not take the square root. I think you have to IMHO.

Quote:
 Originally Posted by Sendai (Post 9997)
 Here's the phi matrix I get for case 1:
 Code:
 [[ 1.          0.96078944  0.22537266]
  [ 1.          0.35345468  0.61262639]
  [ 1.          0.52729242  0.33621649]
  [ 1.          0.19398004  0.91393119]]

 boulis 03-18-2013 09:41 PM

Re: on the right track?

Quote:
 Originally Posted by melipone (Post 10008)
 Ah ha, I see how you got that but I think that you need to take the square root after you add up the distance of each data point to the cluster centers. You did not take the square root. I think you have to IMHO.
If you see the formula, you'll notice that the norm is getting squared. So no need to take the square root in the first place.
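
To make the squared-norm point concrete, here is a small sketch that builds the case 1 phi matrix both ways, with exp(-gamma * ||x - mu||^2) and with the square root taken first. It assumes gamma = 1 and orders the data points as (0,0), (1,0), (0,1), (1,1), an order chosen here only because it reproduces the rows of the posted matrix.

```python
import numpy as np

# Row order chosen to line up with the posted phi matrix (an assumption).
X = np.array([[0, 0], [1, 0], [0, 1], [1, 1]], dtype=float)
y = np.array([1, -1, -1, 1], dtype=float)
centers = np.array([[0.0, 0.2], [1.0, 0.7]])
gamma = 1.0

# Squared Euclidean distances, shape (4 points, 2 centers).
sq_dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
bias = np.ones((len(X), 1))

# Squared norm in the exponent, as in the RBF formula.
Phi_squared = np.hstack([bias, np.exp(-gamma * sq_dist)])
# Plain (unsquared) distance in the exponent, for comparison.
Phi_sqrt = np.hstack([bias, np.exp(-gamma * np.sqrt(sq_dist))])

# Least-squares weights for each variant (first weight is the bias).
w_squared = np.linalg.lstsq(Phi_squared, y, rcond=None)[0]
w_sqrt = np.linalg.lstsq(Phi_sqrt, y, rcond=None)[0]

print(Phi_squared)
print("weights, squared norm:", w_squared)
print("weights, with sqrt   :", w_sqrt)
```

Only the squared-norm variant should reproduce the posted phi matrix and the case 1 weights; the square-root variant is a different kernel and gives different weights.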

All times are GMT -7.