LFD Book Forum  

  #11  
Old 03-17-2013, 10:45 AM
Sendai Sendai is offline
Member
 
Join Date: Jan 2013
Location: Minnesota
Posts: 29
Default Re: on the right track?

Quote:
Originally Posted by boulis View Post
I assume you mean that the centres are given as such and they are not computed by Lloyd's algorithm.
Correct.

Quote:
centres = [[ 0., 0.], [ 0.66666667, 0.66666667]]
clusters = [[[0, 0]], [[0, 1], [1, 0], [1, 1]]]

centres = [[ 0.66666667, 0.33333333], [ 0., 1.]]
clusters = [[[0, 0], [1, 0], [1, 1]], [[0, 1]]]

centres = [[ 1. , 0.5], [ 0. , 0.5]]
clusters = [[[1, 0], [1, 1]], [[0, 0], [0, 1]]]
I get these too, plus the ones with x1 and x2 reversed.
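A quick way to check a listed configuration is to confirm that it is a fixed point of Lloyd's algorithm: one more assign-and-average step should leave the centres unchanged. The sketch below assumes the four data points (0,0), (0,1), (1,0), (1,1); the helper name `lloyd_step` is made up for illustration.

```python
import numpy as np

# The four data points of the toy example (assumed)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)

def lloyd_step(X, centres):
    """One Lloyd iteration: assign points to the nearest centre, then average."""
    # Squared distances, shape (n_points, n_centres)
    d2 = ((X[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
    nearest = d2.argmin(axis=1)
    # Assumes no cluster ends up empty, which holds for the configurations below
    return np.array([X[nearest == k].mean(axis=0) for k in range(len(centres))])

configs = [
    np.array([[0.0, 0.0], [2 / 3, 2 / 3]]),
    np.array([[2 / 3, 1 / 3], [0.0, 1.0]]),
    np.array([[1.0, 0.5], [0.0, 0.5]]),
]

# A configuration is a fixed point if the step returns the same centres
is_fixed = [bool(np.allclose(lloyd_step(X, c), c)) for c in configs]
print(is_fixed)
```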
  #12  
Old 03-17-2013, 10:48 AM
heer2351 heer2351 is offline
Member
 
Join Date: Feb 2013
Posts: 13
Default Re: on the right track?

@Sendai

What values for Ein do you get when running your algorithm with Lloyd's centers?

I think I am still somewhere off the track
  #13  
Old 03-17-2013, 11:03 AM
Sendai Sendai is offline
Member
 
Join Date: Jan 2013
Location: Minnesota
Posts: 29
Default Re: on the right track?

centres = [[ 0., 0.], [ 0.66666667, 0.66666667]]
Ein = 0

centres = [[ 0.66666667, 0.33333333], [ 0., 1.]]
Ein = 0

centres = [[ 1. , 0.5], [ 0. , 0.5]]
Ein = 0.5
(The weights are all zeros -- RBF can't break the symmetry and fails.)
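The all-zero weights in the symmetric case can be checked directly. The sketch below assumes the labels 1, -1, -1, 1 for the points (0,0), (0,1), (1,0), (1,1) and picks gamma = 1 arbitrarily (the symmetry argument does not depend on gamma). Every column of the feature matrix is orthogonal to the labels, so the least-squares weights vanish. In this configuration the two RBF columns sum to a constant multiple of the bias column, so the matrix is rank deficient; an explicit `rcond` cutoff is passed to `np.linalg.pinv` so the numerically zero singular value is discarded.

```python
import numpy as np

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([1, -1, -1, 1], dtype=float)      # assumed labels
centres = np.array([[1.0, 0.5], [0.0, 0.5]])   # the 2-2 cluster configuration
gamma = 1.0                                    # assumed; any positive value behaves the same

# Feature matrix: bias column plus exp(-gamma * ||x - mu||^2) per centre
d2 = ((X[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
Phi = np.hstack([np.ones((len(X), 1)), np.exp(-gamma * d2)])

# Each column of Phi is orthogonal to y, so Phi^T y is (numerically) zero
corr = Phi.T @ y

# Pseudo-inverse weights; rcond=1e-10 drops the numerically zero singular value
w = np.linalg.pinv(Phi, rcond=1e-10) @ y
print(corr, w)
```

With weights this close to zero the sign of each output is not meaningful, so the reported error depends on how a zero output is classified.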
  #14  
Old 03-17-2013, 04:38 PM
Anne Paulson Anne Paulson is offline
Senior Member
 
Join Date: Jan 2013
Location: Silicon Valley
Posts: 52
Default Re: on the right track?

I had a difficult bug to find here, in R. Beware, other R users. I ran the first example, and it was just fine. I ran the second example, and got nothing like the right answer. What? said me.

I had been using my own linear regression program, since it was conveniently to hand. But I knew it worked. So I tried the builtin R linear regression, which gave the correct answer. I noticed that in my own linear regression, I was calling ginv, the pseudo-inverse function. I called solve, the actual inverse function (why you get the inverse of a square matrix by calling "solve" is beyond me, but I digress). That worked.

Finally I realized that our little example is quite a nasty matrix. The tolerance for ginv was something like 1E-8. Not small enough! Once I lowered the tolerance, everything was dandy.
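The same effect can be sketched in Python, with `np.linalg.pinv` and its `rcond` cutoff standing in for R's `ginv` tolerance. This is only one way such a failure can arise: it assumes "the second example" is the gamma = 100 test case from the start of the thread (centers (0, 0.2) and (1, 0.7), labels 1, -1, -1, 1), and that the pseudo-inverse is applied to the Gram matrix Phi^T Phi, which squares the singular values. Neither assumption is confirmed by the post.

```python
import numpy as np

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([1, -1, -1, 1], dtype=float)
centres = np.array([[0.0, 0.2], [1.0, 0.7]])
gamma = 100.0

# Feature matrix: bias column plus exp(-gamma * ||x - mu||^2) per centre
d2 = ((X[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
Phi = np.hstack([np.ones((len(X), 1)), np.exp(-gamma * d2)])

def e_in(w):
    """Fraction of training points whose predicted sign differs from the label."""
    return float(np.mean(np.sign(Phi @ w) != y))

# Tight cutoff on Phi itself: all three singular values are kept
w_tight = np.linalg.pinv(Phi, rcond=1e-15) @ y

# Loose cutoff on the Gram matrix (assumed setup): its smallest singular value,
# relative to the largest, falls below 1e-8 and is treated as zero
G = Phi.T @ Phi
w_loose = np.linalg.pinv(G, rcond=1e-8) @ Phi.T @ y

print(w_tight, e_in(w_tight))
print(w_loose, e_in(w_loose))
```

The very large third weight in the well-conditioned solution is the hint that the matrix is nasty: one RBF column is tiny everywhere, so its singular direction is easy to discard by accident.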
  #15  
Old 03-17-2013, 05:29 PM
boulis boulis is offline
Member
 
Join Date: Feb 2013
Location: Sydney, Australia
Posts: 29
Default Re: on the right track?

Thanks heer2351 and Sendai. Indeed there are 6 different centre configurations; the ones you gave complete the picture. You can also see this with paper and pencil.

Quote:
Originally Posted by heer2351 View Post
@Sendai

What values for Ein do you get when running your algorithm with Lloyd's centers?
I think I am still somewhere off the track
With this 4-point example the only options for Ein are: 0, 0.25, 0.5
Anything else you get, you know your code has a bug.

I have not tried it with my code, but Sendai's results seem correct. So you should get Ein = 0 for the centres that have 1/3 or 2/3 in them (when the clusters are 3-1 points) and Ein = 0.5 when the centres have 1/2 in them (the case where the clusters are 2-2 points).

I also think that it might be impossible to get Ein = 0.25 for any centre configuration, provided that the weights are chosen optimally (i.e., using the pseudo-inverse method). So you either make no mistakes, or 2 points are misclassified.
  #16  
Old 03-18-2013, 07:21 AM
melipone melipone is offline
Senior Member
 
Join Date: Jan 2013
Posts: 72
Default Re: on the right track?

Geez, I don't get the same weights! I do get Ein=0 though. Could you post your phi matrix so that I can locate my error? thanks.

Quote:
Originally Posted by Sendai View Post
Since most of us are probably writing our own regular RBF implementations by hand, I thought it would be helpful to compare the results of a couple of simple test cases to make sure our implementations are correct.

data set = (0, 0) (0, 1) (1, 0) (1, 1)
labels = 1 -1 -1 1
centers = (0, 0.2) (1, 0.7)

Case 1:
\gamma=1
E_{in} = 0
weights = (-7.848, 7.397, 7.823)
(first weight is the bias)

Case 2:
\gamma=100
E_{in} = 0
weights = (-1.0, 109.196, 16206.168)
  #17  
Old 03-18-2013, 10:04 AM
Sendai Sendai is offline
Member
 
Join Date: Jan 2013
Location: Minnesota
Posts: 29
Default Re: on the right track?

Quote:
Originally Posted by melipone View Post
Geez, I don't get the same weights! I do get Ein=0 though. Could you post your phi matrix so that I can locate my error? thanks.
Here's the phi matrix I get for case 1:

Code:
[[ 1.          0.96078944  0.22537266]
 [ 1.          0.35345468  0.61262639]
 [ 1.          0.52729242  0.33621649]
 [ 1.          0.19398004  0.91393119]]
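A minimal sketch that reproduces this matrix and the case 1 weights, assuming the kernel is exp(-gamma * ||x - mu||^2) with a leading bias column. The rows of the posted matrix correspond to the points in the order (0,0), (1,0), (0,1), (1,1), so that order is used here; since both middle points carry the label -1, the weights are the same either way.

```python
import numpy as np

# Row order chosen to match the posted matrix: (0,0), (1,0), (0,1), (1,1)
X = np.array([[0, 0], [1, 0], [0, 1], [1, 1]], dtype=float)
y = np.array([1, -1, -1, 1], dtype=float)
centres = np.array([[0.0, 0.2], [1.0, 0.7]])
gamma = 1.0

# Squared distances from every point to every centre, shape (4, 2)
d2 = ((X[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)

# Bias column followed by one RBF column per centre
Phi = np.hstack([np.ones((len(X), 1)), np.exp(-gamma * d2)])

# Least-squares weights (equivalent to the pseudo-inverse solution)
w, *_ = np.linalg.lstsq(Phi, y, rcond=None)
e_in = float(np.mean(np.sign(Phi @ w) != y))

print(np.round(Phi, 8))
print(np.round(w, 3), e_in)
```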
  #18  
Old 03-18-2013, 05:53 PM
heer2351 heer2351 is offline
Member
 
Join Date: Feb 2013
Posts: 13
Default Re: on the right track?

Sendai thanks for starting this thread, it enabled me to find a flaw in my code and even better fix it

All answers I found with my fixed code were correct.

Special thanks to boulis for his first response, I was overthinking the problem.
  #19  
Old 03-18-2013, 07:22 PM
melipone melipone is offline
Senior Member
 
Join Date: Jan 2013
Posts: 72
Default Re: on the right track?

Ah ha, I see how you got that but I think that you need to take the square root after you add up the distance of each data point to the cluster centers. You did not take the square root. I think you have to IMHO.

Quote:
Originally Posted by Sendai View Post
Here's the phi matrix I get for case 1:

Code:
[[ 1.          0.96078944  0.22537266]
 [ 1.          0.35345468  0.61262639]
 [ 1.          0.52729242  0.33621649]
 [ 1.          0.19398004  0.91393119]]
  #20  
Old 03-18-2013, 09:41 PM
boulis boulis is offline
Member
 
Join Date: Feb 2013
Location: Sydney, Australia
Posts: 29
Default Re: on the right track?

Quote:
Originally Posted by melipone View Post
Ah ha, I see how you got that but I think that you need to take the square root after you add up the distance of each data point to the cluster centers. You did not take the square root. I think you have to IMHO.
If you look at the formula, you'll notice that the norm is squared, so there is no need to take the square root in the first place.
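To make the point concrete, here is a small sketch for the first data point (0, 0) and the first center (0, 0.2) with gamma = 1, assuming the kernel exp(-gamma * ||x - mu||^2). The squared norm is just the sum of squared coordinate differences; the variable names are illustrative.

```python
import numpy as np

x = np.array([0.0, 0.0])    # first data point
mu = np.array([0.0, 0.2])   # first center
gamma = 1.0

diff = x - mu
sq_norm = float(diff @ diff)   # ||x - mu||^2: sum of squared differences, no sqrt

# Squared norm in the exponent, as in the formula
phi_squared = float(np.exp(-gamma * sq_norm))

# What you would get by taking the square root first (plain norm in the exponent)
phi_unsquared = float(np.exp(-gamma * np.sqrt(sq_norm)))

print(phi_squared, phi_unsquared)
```

The squared version agrees with the first RBF entry of the posted phi matrix (0.96078944); the unsquared version does not.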
The contents of this forum are to be used ONLY by readers of the Learning From Data book by Yaser S. Abu-Mostafa, Malik Magdon-Ismail, and Hsuan-Tien Lin, and participants in the Learning From Data MOOC by Yaser S. Abu-Mostafa. No part of these contents is to be communicated or made accessible to ANY other person or entity.