Exercise 3.15
I have a question about (b)
(a) since all x are mapped onto a line. A perceptron cannot generate all separations for 3 data points. But it works for 2 data points. Therefore d_vc (H_k) = 2.
(b) Suppose N data points have d bits. In order to let one perceptron generates all 2^N separations, we could let every separations shows up in one dimension, i.e. one bit. Since one half of separations equals another half, i.e. the separation where the first data point is separated from others could be represented as 0 1 1 1 ... , or 1 0 0 0 ... . Therefore
[ math ] 2^N/2 \le d [ /math ]
i.e.
[ math ] d_vc \le \log_2d + 1 [/math]
