It is a subtle point, so let me try to explain it in the terms you outlined. Let us take the sample $\mathcal{D}$ (your sample, renamed just to follow the book notation). Now evaluate $E_{\text{in}}(h)$ for all hypotheses $h$ in your model $\mathcal{H}$. We didn't start at one $h$ and move to another. We just evaluated $E_{\text{in}}(h)$ for all $h \in \mathcal{H}$. The question is, does the Hoeffding inequality apply to each of these $h$'s by itself? The answer is clearly yes, since each of them could in principle be the hypothesis you started with (the one you singled out in your question).
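For reference, here is the single-hypothesis statement being invoked, written in what I take to be the book notation. The symbols $N$ (sample size), $\epsilon$ (tolerance), and $E_{\text{out}}(h)$ (out-of-sample error) are assumptions on my part, since they are not spelled out in this reply. For any one $h$ that is fixed before the sample is drawn:

$$
\mathbb{P}\left[\,\left|E_{\text{in}}(h) - E_{\text{out}}(h)\right| > \epsilon\,\right] \;\le\; 2e^{-2\epsilon^{2}N}
$$

The probability is over the random draw of the sample, which is why the bound holds for each $h$ individually as long as $h$ does not depend on that sample.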

Hoeffding states what the probabilities are before the sample is drawn. When you choose one of these hypotheses because of its small $E_{\text{in}}$, as in the scenario you point out, the probability that applies now is **conditioned** on the sample having small $E_{\text{in}}$. We can try to get a conditional version of Hoeffding to deal with the situation, or we can try to get a version of Hoeffding that applies regardless of which $h$ we choose and how we choose it. The latter is what we did using the union bound.
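If it helps to see the difference numerically, here is a minimal simulation sketch. Everything in it is a toy assumption of mine and not something from the book: each hypothesis is modeled as a fair coin, so its out-of-sample error is 0.5 and its in-sample error is the fraction of heads in `sample_size` flips, and the names `simulate`, `num_hypotheses`, `sample_size`, and `epsilon` are hypothetical. It estimates how often the in-sample error deviates from 0.5 by more than `epsilon`, once for a hypothesis fixed in advance and once for the hypothesis picked for its smallest in-sample error, and returns both next to the single-hypothesis bound and the union bound.

```python
import numpy as np


def simulate(num_hypotheses=30, sample_size=20, epsilon=0.33,
             trials=20000, seed=0):
    """Compare a fixed hypothesis with one selected for small in-sample error.

    Toy assumption: every hypothesis has out-of-sample error 0.5, and its
    in-sample error is the fraction of heads in `sample_size` fair coin flips.
    """
    rng = np.random.default_rng(seed)
    e_out = 0.5

    # e_in[t, m] = in-sample error of hypothesis m on the sample of trial t.
    heads = rng.binomial(sample_size, e_out, size=(trials, num_hypotheses))
    e_in = heads / sample_size

    # Hypothesis chosen BEFORE the sample is drawn (index 0, arbitrarily).
    fixed_bad = np.mean(np.abs(e_in[:, 0] - e_out) > epsilon)

    # Hypothesis chosen AFTER seeing the sample, for its smallest e_in.
    selected_bad = np.mean(np.abs(e_in.min(axis=1) - e_out) > epsilon)

    # Single-hypothesis Hoeffding bound and its union-bound version,
    # capped at 1 because it bounds a probability.
    hoeffding = 2 * np.exp(-2 * epsilon**2 * sample_size)
    union = min(1.0, num_hypotheses * hoeffding)
    return fixed_bad, selected_bad, hoeffding, union


if __name__ == "__main__":
    fixed_bad, selected_bad, hoeffding, union = simulate()
    print(f"fixed hypothesis deviation rate:    {fixed_bad:.4f}")
    print(f"selected hypothesis deviation rate: {selected_bad:.4f}")
    print(f"single-hypothesis Hoeffding bound:  {hoeffding:.4f}")
    print(f"union bound over all hypotheses:    {union:.4f}")
```

With these settings, the fixed hypothesis should stay within the single-hypothesis bound, the selected one should exceed it, and both should stay within the union bound.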

Finally, taking the example you illustrated, any hypothesis you use has to be in $\mathcal{H}$ (which is decided before the sample is drawn). The one you constructed is not guaranteed to be in $\mathcal{H}$. Of course, you can guarantee that it is in $\mathcal{H}$ by taking $\mathcal{H}$ to be the set of all possible hypotheses, but in this case the number of hypotheses $M$ is thoroughly infinite, and the multiple-bin Hoeffding bound does not guarantee anything at all.
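To put numbers on that last remark, here is a small sketch that evaluates the multiple-bin bound, assuming it takes the form $2Me^{-2\epsilon^{2}N}$ with $M$ the number of hypotheses, $N$ the sample size, and $\epsilon$ the tolerance. The function name `union_bound` and the particular values of $N$ and $\epsilon$ are mine, chosen only for illustration. The result is capped at 1 because a bound on a probability that exceeds 1 carries no information.

```python
import math


def union_bound(num_hypotheses, sample_size, epsilon):
    """Multiple-bin Hoeffding bound 2*M*exp(-2*eps^2*N), capped at 1."""
    raw = 2 * num_hypotheses * math.exp(-2 * epsilon**2 * sample_size)
    return min(1.0, raw)


if __name__ == "__main__":
    # Growing M with N and epsilon held fixed; math.inf stands in for
    # "the set of all possible hypotheses".
    for m in (1, 100, 10**6, 10**9, math.inf):
        print(m, union_bound(m, sample_size=1000, epsilon=0.1))
```

The bound grows linearly in $M$ until it hits the cap, and for infinite $M$ it sits at 1 no matter how large $N$ is.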