LFD Book Forum

LFD Book Forum (http://book.caltech.edu/bookforum/index.php)
-   Chapter 2 - Training versus Testing (http://book.caltech.edu/bookforum/forumdisplay.php?f=109)
-   -   Page 47 / Lecture 6 (10 min) Partitioning the table (http://book.caltech.edu/bookforum/showthread.php?t=4225)

markact 04-20-2013 08:44 PM

Page 47 / Lecture 6 (10 min) Partitioning the table
 
I am struggling with one of more simpler bits of chapter 2.

On page 47 (Lecture 6 - 9 ~ 12 mins), I do not understanding how the table is constructed.

"Some dichotomies on these N-1 points appear only once" Okay, so to me, this implies we have 2^(n-1) rows for the first set (S1).

"The remaining dichotomies on the first n-1 appear twice, once with +1 and once with -1" How can this be true...?

To me, "the remaining dichotomies on the first n-1" would mean (2^n) less (2^n-1) = 2^n-1. But these appear twice! So we have 2*(2^n-1) = 2^n for set S2.

This seems as though we are counting twice.

S1 = 2^n-1
S2 = 2^n

The total of S1 + S2 is greater than 2^n.

Surely, I must be missing something really simple here. But what...? Please help.

Thanks,

Mark

yaser 04-20-2013 10:06 PM

Re: Page 47 / Lecture 6 (10 min) Partitioning the table
 
Quote:

Originally Posted by markact (Post 10516)
"Some dichotomies on these N-1 points appear only once" Okay, so to me, this implies we have 2^(n-1) rows for the first set (S1).

At most 2^{n-1} since it's not necessary that all of them are there. The fact that there may be that many does not play a role in the argument.

Quote:

"The remaining dichotomies on the first n-1 appear twice, once with +1 and once with -1" How can this be true...?
The +1 and -1 are the entries in the last (n^{\rm th}) column. The dichotomies on the first n-1 columns appear twice as they appear once with +1 and once with -1 in the last column.

markact 04-20-2013 11:56 PM

Re: Page 47 / Lecture 6 (10 min) Partitioning the table
 
Thank you very much for replying.

I read the answers and think that I understand the points being made in regards to the maximum being 2^(n-1).

But on reflection, perhaps my misunderstanding could have been expressed more clearly. So, I am sorry about the following tables, but perhaps these help to demonstrate what I am missing in regards to partitioning the original set into 3 disjoint sets.

The following assumes a space of 4 points - a maximum of 16 dichotomies. I have labelled each row with its base 10 equivalence.

Here is alpha (N-1 appears once, with XN either 1 or 0)

X1 X2 X3 XN ID
0 0 0 0 0
0 0 1 1 3
0 1 0 0 4
0 1 1 1 7
1 0 0 0 8
1 0 1 1 11
1 1 0 0 12
1 1 1 1 15

We are left with 8 rows remaining If these have XN being either 1 or -1, then these 8 rows are split into the following two partitions (S2+ and S2-).


[S2+]
X1 X2 X3 XN ID
0 0 0 1 1
0 1 0 1 5
1 0 0 1 9
1 1 0 1 13


[S2-]
X1 X2 X3 XN ID
0 0 1 0 2
0 1 1 0 6
1 0 1 0 10
1 1 1 0 14

So, given the partitions above, I seem to get that X1, X2 appears twice but this is not N-1 appearing twice (i.e. X1, X2, X3).

I understand we are attempting to describe situations were there is a break point and all the combinations listed above will not exist.

But given the instructions in the book and using a nice-and-neat 2n construction, I do not understand how the exhaustive and exclusive 3 sets are derived.

Thanks in advance,

Mark

yaser 04-21-2013 12:36 AM

Re: Page 47 / Lecture 6 (10 min) Partitioning the table
 
Quote:

Originally Posted by markact (Post 10520)
these help to demonstrate what I am missing in regards to partitioning the original set into 3 disjoint sets.

The following assumes a space of 4 points - a maximum of 16 dichotomies. I have labelled each row with its base 10 equivalence.

Here is alpha (N-1 appears once, with XN either 1 or 0)

X1 X2 X3 XN ID
0 0 0 0 0
0 0 1 1 3
0 1 0 0 4
0 1 1 1 7
1 0 0 0 8
1 0 1 1 11
1 1 0 0 12
1 1 1 1 15

We are left with 8 rows remaining If these have XN being either 1 or -1, then these 8 rows are split into the following two partitions (S2+ and S2-).


[S2+]
X1 X2 X3 XN ID
0 0 0 1 1
0 1 0 1 5
1 0 0 1 9
1 1 0 1 13


[S2-]
X1 X2 X3 XN ID
0 0 1 0 2
0 1 1 0 6
1 0 1 0 10
1 1 1 0 14

Thank you for the detailed example. I can now see where the misunderstanding is. The set S_1 contains all the patterns that appear once, and none of the patterns that appear twice, of the original matrix. Therefore, if all the above patterns are indeed in that matrix, then the decimal pattern 0 should not be in S_1 since the decimal pattern 1 makes that pattern appear twice (on the first 3 columns). Similarly, the set S_2 contains all patterns that appear twice, and none that appear once.

markact 04-21-2013 03:00 AM

Re: Page 47 / Lecture 6 (10 min) Partitioning the table
 
Got it!!

Thank you very much for your patience.


All times are GMT -7. The time now is 06:07 PM.

Powered by vBulletin® Version 3.8.3
Copyright ©2000 - 2019, Jelsoft Enterprises Ltd.
The contents of this forum are to be used ONLY by readers of the Learning From Data book by Yaser S. Abu-Mostafa, Malik Magdon-Ismail, and Hsuan-Tien Lin, and participants in the Learning From Data MOOC by Yaser S. Abu-Mostafa. No part of these contents is to be communicated or made accessible to ANY other person or entity.