On page 20, the book tries to connect the bin model to the learning problem. Here's how I understand it:

1) Introduce a hypothesis h

2) We can now compare h(x) and f(x) for all x in X. Think of the bin as completely sealed, so we don't see the colors, yet it contains some red and some green data points.

3) Due to h, we are introducing some probability to the bin X.

4) Now grab N data points from the bin X and look at them. We know their colors now. If most of them are red, we know we need to fix our hypothesis h, because it is extremely unlikely to get a lot of red if h is very close to f (as asserted by Hoeffding's inequality).
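
To make step 4 concrete, here is a minimal Python sketch of the sampling, written under assumptions that are mine and not taken from the book. The function names are hypothetical, `mu` stands for the (normally unknown) fraction of red points in the bin, meaning points where h(x) differs from f(x), and each of the N draws is assumed to be independent. It estimates how often the sample fraction of red differs from `mu` by more than `eps` and sets that next to the Hoeffding bound 2*exp(-2*eps^2*N).

```python
import math
import random


def sample_red_fraction(mu, n, rng):
    """Draw n points independently; each is red with probability mu.

    Returns the fraction of red points in the sample (nu in the bin model).
    """
    reds = sum(1 for _ in range(n) if rng.random() < mu)
    return reds / n


def hoeffding_bound(eps, n):
    """Hoeffding upper bound on P[|nu - mu| > eps] for sample size n."""
    return 2.0 * math.exp(-2.0 * eps * eps * n)


def estimate_deviation_prob(mu, n, eps, trials, seed=0):
    """Estimate P[|nu - mu| > eps] by repeating the sampling experiment."""
    rng = random.Random(seed)
    bad = sum(
        1
        for _ in range(trials)
        if abs(sample_red_fraction(mu, n, rng) - mu) > eps
    )
    return bad / trials


if __name__ == "__main__":
    mu, n, eps = 0.1, 100, 0.1  # illustrative values, chosen arbitrarily
    print("estimated:", estimate_deviation_prob(mu, n, eps, trials=2000))
    print("bound:    ", hoeffding_bound(eps, n))
```

In this sketch `mu` is fixed before any sampling happens, and the only calls to the random number generator are in the draws of the N points.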

This makes a lot of sense to me. However, I also watched video lecture 2, and it seems that my understanding is incorrect. The professor seems to say that the probability is introduced in order to generate the sample data points, not because of the hypothesis h. This is the complete opposite of my understanding, because I think the probability comes from the fact that we introduce a hypothesis h.

Can someone please help me understand this? Thanks!
