Thread: Breaking Point
  #5  
04-15-2016, 09:02 PM
ntvy95
Re: Breaking Point

Quote:
Originally Posted by lfdid
Definition 2.3 on p. 45 of the LFD book says that "if NO data set of size k can be shattered by H, then k is the break point for H."

My understanding is that it should read: "if there is a data set of size k such that it can NOT be shattered by H, then k is the break point for H".

Is this correct?

Many thanks!
I don't think so. In the positive rays example (pages 43-44), the book also says:

Quote:
Notice that if we picked N points where some of the points coincided (which is allowed), we will get less than N + 1 dichotomies. This does not affect the value of m_{H}(N) since it is defined based on the maximum number of dichotomies.
In the Positive intervals example, we have derived:

m_{H}(N) = \frac{1}{2}N^{2} + \frac{1}{2}N + 1

We observe that not every value of k gives m_{H}(k) < 2^{k}. Indeed, for integers k \geq 1:

m_{H}(k) = \frac{1}{2}k^{2} + \frac{1}{2}k + 1 = 2^{k} \Leftrightarrow k \leq 2

This means that for k \leq 2 there is at least one data set of size k that the hypothesis set H is able to shatter (in other words, on which H generates all 2^{k} dichotomies). However, for any data set of size k > 2, there is no way for H to generate all 2^{k} dichotomies.

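For example, if k = 3, the hypothesis set H is able to generate at most 7 dichotomies (while 2^{3} = 8). However, even when k = 2, if the two points coincide (both have the same value of x), there is no way for H to generate 2^{2} = 4 dichotomies on those points. Under the proposed reading, k = 2 would therefore already count as a break point even though m_{H}(2) = 4 = 2^{2}; under the book's definition, the smallest break point here is k = 3.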
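
If you want to check these counts yourself, here is a minimal Python sketch that enumerates the dichotomies positive intervals can generate on a given set of points. It is only an illustration, not something from the book: the function names (interval_dichotomies, growth_formula, is_shattered) are my own, and I assume a hypothesis returns +1 strictly inside an open interval (a, b) and -1 elsewhere.

```python
from itertools import combinations


def interval_dichotomies(points):
    """Return the set of dichotomies positive intervals generate on `points`.

    Assumption for this sketch: h(x) = +1 if a < x < b, and -1 otherwise.
    """
    if not points:
        return {()}
    xs = sorted(set(points))
    # One candidate end point in each gap between distinct values,
    # plus one to the left of all points and one to the right.
    cuts = [xs[0] - 1.0]
    cuts += [(lo + hi) / 2.0 for lo, hi in zip(xs, xs[1:])]
    cuts += [xs[-1] + 1.0]
    # The interval that contains no point gives the all -1 dichotomy.
    dichotomies = {tuple(-1 for _ in points)}
    for a, b in combinations(cuts, 2):
        dichotomies.add(tuple(+1 if a < x < b else -1 for x in points))
    return dichotomies


def growth_formula(n):
    """m_H(N) = N^2/2 + N/2 + 1 for positive intervals."""
    return n * (n + 1) // 2 + 1


def is_shattered(points):
    """True if all 2^k dichotomies are generated on these k points."""
    return len(interval_dichotomies(points)) == 2 ** len(points)


for k in range(1, 5):
    distinct = [float(i) for i in range(k)]
    print(k, len(interval_dichotomies(distinct)), growth_formula(k), 2 ** k)

print(is_shattered([1.0, 2.0]))  # two distinct points
print(is_shattered([1.0, 1.0]))  # two coincident points
```

With distinct points the enumerated count matches the formula for each k, and it falls below 2^{k} from k = 3 on. The last two lines show the k = 2 case: two distinct points are shattered, while two coincident points are not, which is why "some data set of size k cannot be shattered" is too weak a condition for a break point.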