LFD Book Forum  

#1
04-18-2013, 06:25 AM
Elroch
Invited Guest

Join Date: Mar 2013
Posts: 143
VC dimension puzzle

Pondering the loose relationship between parametrisation of a hypothesis set and VC dimension d_{VC} (or the minimum break point, d_{VC} + 1) led me to the following example and puzzle.

Let the set of points be the natural numbers \{1, 2, 3, ...\}.

Let elements of the hypothesis set be made up of alternating intervals of the same size, like a 1-dimensional checkerboard with varying scale:

H_n = \{1, 2, ..., n\} \cup \{2n+1, 2n+2, ..., 3n\} \cup \{4n+1, 4n+2, ..., 5n\} \cup ...

What is the VC dimension of this hypothesis set?

[there is also a continuous version on the real line, but all the structure is in this simplified version]
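
For anyone who wants to experiment, here is a minimal sketch of the hypothesis set in Python. The function names `h` and `members` are made up for illustration; the only thing taken from the definition above is that H_n consists of the blocks {2jn+1, ..., (2j+1)n}, so x belongs to H_n exactly when the 0-based block index (x - 1) // n is even.

```python
def h(n, x):
    """Return +1 if x lies in H_n, else -1 (n and x are positive integers)."""
    if n < 1 or x < 1:
        raise ValueError("n and x must be positive integers")
    # Blocks of length n alternate in/out, starting with "in" at x = 1.
    return 1 if ((x - 1) // n) % 2 == 0 else -1


def members(n, limit):
    """List the elements of H_n that are <= limit."""
    return [x for x in range(1, limit + 1) if h(n, x) == 1]


print(members(3, 20))
```

Calling `members(n, limit)` for a few values of n is a quick way to see the checkerboard pattern at different scales.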
#2
04-18-2013, 04:15 PM
Michael Reach
Senior Member

Join Date: Apr 2013
Location: Baltimore, Maryland, USA
Posts: 71
Re: A little puzzle

I'm not quite clear on the rules. Can the alternating segments hold either all the +1s _or_ all the -1s? If not, I imagine the VC dimension is 0.

If you allow both, I guess the first pattern you can't achieve is +1, -1, -1, so the VC dimension would be 2.

Or do you allow them to be shifted as well? Then you'd get that one by starting at 2. But I don't think you could do +1, -1, +1, +1.
#3
04-18-2013, 04:44 PM
Elroch
Invited Guest

Join Date: Mar 2013
Posts: 143
Re: A little puzzle

Forgive me if it wasn't clear. There is exactly one hypothesis for each positive integer n, as described in the first post.

Intuitively, each hypothesis (i.e. permitted subset of \{1, 2, 3, ...\}) is an alternating sequence of "black" and "white" intervals of equal size starting at 1, and continuing indefinitely. "Black" = +1 = inclusion in the set, and "white" = -1 = exclusion from the set.

ok?

Do remember you have great freedom as to how to choose a set of N points. Use it well.

A good attack might be to try to shatter sets for increasing N from N=1 upward.
#4
04-18-2013, 06:41 PM
Michael Reach
Senior Member

Join Date: Apr 2013
Location: Baltimore, Maryland, USA
Posts: 71
Re: A little puzzle

Ah - not the first time today I was tripped up by the rule that we only need to find a single set of x_n's that can be shattered. I keep forgetting and looking for some set of x_n's which can't be.

So here's a method that will work for any N, and I don't even need all of your hypotheses; it's enough just to use the ones where n=2^k. Write Hk for the hypothesis with n=2^k:
H0=+1 for 1,3,5,7,...
H1=+1 for 1,2,5,6,9,10,...
H2=+1 for 1,2,3,4,9,10,11,12,17,18,19,20,...
H3=+1 for 1,2,3,4,5,6,7,8,17,18,19,...
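
The four lists above can be reproduced with a few lines of Python. This is only a sketch: `in_H` and `first_members` are hypothetical helper names, and H0 to H3 are read here as the hypotheses with interval length n = 2^k.

```python
def in_H(n, x):
    """True if x is in the hypothesis with interval length n."""
    return ((x - 1) // n) % 2 == 0


def first_members(k, limit=20):
    """Elements up to `limit` of the hypothesis with n = 2**k."""
    n = 2 ** k
    return [x for x in range(1, limit + 1) if in_H(n, x)]


for k in range(4):
    print(f"H{k} (n={2 ** k}):", first_members(k))
```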

The easy way to see it is to use binary notation, applied to x-1 rather than x, because the intervals start at 1 and not at 0. Hk (the hypothesis with n=2^k) is +1 on x exactly when floor((x-1)/2^k) is even, so:
H0=+1 for any x where the last digit of x-1 is 0
H1=+1 for any x where the second-to-last digit of x-1 is 0
H2=+1 for any x where the third-to-last digit of x-1 is 0, etc.

Now the growth function is 2^N for every N, so the VC dimension is infinite. Here's how we build a set x1,...,xN that we can shatter.

For N, take the set Z of every combination of N 1s and 0s, and use them to build the binary digits of x1-1,...,xN-1. These will be very big numbers indeed, with 2^N digits or so.
Counting digits from the last (least significant) one, the first digit of xi-1 is the ith digit of the first element of Z. If that element is 000000...0 (N 0s), the ith digit is zero for every i, so all of the numbers x1-1,...,xN-1 get a 0 in that position.
If the second element of Z is 000000..01 (might as well keep them in order), the next digit of x1-1,...,xN-1 will be 0, 0, ..., 0, and 1 respectively. And so forth, for all 2^N digits.

These N numbers are shattered by H0 through H(2^N-1), as H0 reads off the first of these digits, H1 the second, and so on (giving +1 for a 0 and -1 for a 1), and the digits cover every possibility in Z.
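
The construction can be checked by brute force for small N. The sketch below assumes the bit rule that x is in the hypothesis with n = 2^k exactly when bit k of x - 1 is 0, so it writes the patterns of Z into the bits of x_i - 1 and then adds 1. The names `label`, `shattered_points` and `dichotomies` are illustrative only.

```python
from itertools import product


def label(k, x):
    """Label given to x by the hypothesis with n = 2**k: +1 or -1."""
    return 1 if ((x - 1) >> k) & 1 == 0 else -1


def shattered_points(N):
    """Build x_1..x_N whose bit columns run through all 2**N patterns in Z."""
    Z = list(product([0, 1], repeat=N))  # Z[j] is the j-th 0/1 pattern
    # Bit j of (x_i - 1) is entry i of Z[j]; add 1 to get x_i itself.
    return [1 + sum(Z[j][i] << j for j in range(len(Z))) for i in range(N)]


def dichotomies(points):
    """Label patterns produced on the points by n = 2**k, k = 0 .. 2**N - 1."""
    N = len(points)
    return {tuple(label(k, x) for x in points) for k in range(2 ** N)}


for N in range(1, 5):
    pts = shattered_points(N)
    print(N, pts, len(dichotomies(pts)) == 2 ** N)
```

Only 2^N of the hypotheses are used, one per bit position, which is why the points need about 2^N binary digits each.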

Way to go - this is a neat problem.
#5
04-18-2013, 06:43 PM
Elroch
Invited Guest

Join Date: Mar 2013
Posts: 143
Re: A little puzzle

Thank you and well done!

Exactly the solution I found.
#6
04-18-2013, 09:21 PM
Michael Reach
Senior Member

Join Date: Apr 2013
Location: Baltimore, Maryland, USA
Posts: 71
Re: A little puzzle

This problem makes me wonder if the VC dimension formula for the growth function isn't (sometimes?) far too restrictive. It's true that the growth function here is 2^N, and therefore our problem is maybe un-learnable. But is that really true? Just because we could find one wacko example with numbers 2^50 digits long, does that mean that our hypotheses are really much too general? Actually, this set of hypotheses is awfully restricted, and you (probably) can hardly do anything with them for almost all sets of x1,...,xn. Maybe we should have a modified version of the VC formula, where the bound works for almost all sets of x1,...,xn. Is it still true that the growth will be polynomial, almost always?
#7
04-18-2013, 09:41 PM
yaser
Caltech

Join Date: Aug 2009
Location: Pasadena, California, USA
Posts: 1,477
Re: A little puzzle

Quote:
Originally Posted by Michael Reach
This problem makes me wonder if the VC dimension formula for the growth function isn't (sometimes?) far too restrictive. It's true that the growth function here is 2^N, and therefore our problem is maybe un-learnable. But is that really true? Just because we could find one wacko example with numbers 2^50 digits long, does that mean that our hypotheses are really much too general? Actually, this set of hypotheses is awfully restricted, and you (probably) can hardly do anything with them for almost all sets of x1,...,xn. Maybe we should have a modified version of the VC formula, where the bound works for almost all sets of x1,...,xn. Is it still true that the growth will be polynomial, almost always?
Indeed, the VC dimension addresses a worst-case scenario. The advantage is that it is applicable regardless of what the input probability distribution turns out to be, so 'learnable' means guaranteed to be learnable. The disadvantage is that it may be unduly pessimistic in many practical cases. There is a modified version of this analysis based on expected values. It is more involved technically, and can be found in Vapnik's book.
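
As a reminder of where the worst case enters, here are the standard definitions written out, in the usual Learning From Data notation (m_H for the growth function, X for the input space). They are added for reference and are not part of the reply above. The maximum over point sets is what makes the quantity independent of the input distribution.

```latex
m_{\mathcal{H}}(N) = \max_{x_1,\ldots,x_N \in \mathcal{X}}
  \bigl|\{\, (h(x_1),\ldots,h(x_N)) : h \in \mathcal{H} \,\}\bigr|,
\qquad
d_{VC}(\mathcal{H}) = \max\{\, N : m_{\mathcal{H}}(N) = 2^N \,\}
```

If m_H(N) = 2^N for every N, as in this puzzle, the maximum does not exist and d_VC is infinite.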
__________________
Where everyone thinks alike, no one thinks very much
#8
04-19-2013, 08:00 AM
Michael Reach
Senior Member

Join Date: Apr 2013
Location: Baltimore, Maryland, USA
Posts: 71
Re: A little puzzle

Ah - I see that in your book (footnote p. 51), the case of convex regions is mentioned as an example where an "estimated" growth bound works. I'm guessing that's because even though points on the rim of a circle and such can be shattered, "almost every" set of N points is going to have some on the interior, which cannot be shattered.
#9
04-20-2013, 03:59 AM
Elroch
Invited Guest

Join Date: Mar 2013
Posts: 143
Re: A little puzzle

In this example, Michael, it's not really correct to think of the shattered sets of points as being atypical. Almost all integers are very large! For example, given N, if you pick a number M and choose a set of N points randomly in [M, 2M], the probability of the points not being shattered by this hypothesis set will tend to zero as M tends to infinity. [exercise for the reader]
#10
04-20-2013, 11:40 AM
Elroch
Invited Guest

Join Date: Mar 2013
Posts: 143
Re: A little puzzle

On reflection, it makes more sense to prove the slightly stronger result where you pick N points randomly (with uniform distribution) from the first M points, i.e. by picking M big enough, you can ensure the probability of the points not being shattered is as small as you wish.
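
The claim can be explored numerically, though a simulation is of course not a proof. The sketch below is a rough Monte Carlo estimate under stated assumptions: the N points are drawn without replacement from 1..M (repeated points could never be shattered), and `is_shattered` and `estimate_shatter_rate` are made-up names. Only n up to the largest point needs checking, because any larger n labels every point +1.

```python
import random


def label(n, x):
    """Label given to x by H_n: +1 if x is in H_n, else -1."""
    return 1 if ((x - 1) // n) % 2 == 0 else -1


def is_shattered(points):
    """True if the hypotheses H_1, H_2, ... realise all 2**N labelings."""
    top = max(points)
    # Every H_n with n >= max(points) labels all points +1, so n <= top suffices.
    seen = {tuple(label(n, x) for x in points) for n in range(1, top + 1)}
    return len(seen) == 2 ** len(points)


def estimate_shatter_rate(N, M, trials=200, seed=0):
    """Fraction of random N-point subsets of 1..M that are shattered."""
    rng = random.Random(seed)
    hits = sum(is_shattered(rng.sample(range(1, M + 1), N)) for _ in range(trials))
    return hits / trials


for M in (10, 100, 1000):
    print(M, estimate_shatter_rate(N=2, M=M))
```

Larger N needs much larger M before shattering becomes likely, so expect the brute-force check to get slow quickly.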
All times are GMT -7.

The contents of this forum are to be used ONLY by readers of the Learning From Data book by Yaser S. Abu-Mostafa, Malik Magdon-Ismail, and Hsuan-Tien Lin, and participants in the Learning From Data MOOC by Yaser S. Abu-Mostafa. No part of these contents is to be communicated or made accessible to ANY other person or entity.