Okay, I understand that the h function is what we're after. There can be many h functions in the hypothesis space H, and we're after the one that models the data best. The f function has been repeatedly stated to be unknown. What I don't understand is the equation on page 21. How can the "fraction of D where f and h disagree" be a meaningful quantity when f is considered unknown? We can't test whether they disagree if f is unknown to begin with. I hope someone can clarify this for me.
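
To make the question concrete, here is a minimal sketch of how I picture that fraction being computed. It rests on an assumption of mine, not something I have confirmed from the text: that the data set D comes as pairs (x_n, y_n) where each label y_n is the value f(x_n), so f is only ever needed on the points in D. The names `in_sample_error`, `D`, and `h` are made up for illustration, and the toy data is invented.

```python
def in_sample_error(h, data):
    """Fraction of the examples in `data` on which h disagrees with the given label."""
    disagreements = sum(1 for x, y in data if h(x) != y)
    return disagreements / len(data)


# Hypothetical toy data set D: pairs (x_n, y_n).
# Assumption: each y_n is the observed value f(x_n); f itself is never called.
D = [(-2.0, -1), (-0.5, -1), (0.3, +1), (1.5, +1), (2.0, -1)]


# Hypothetical hypothesis h from H: the sign of x.
def h(x):
    return 1 if x > 0 else -1


# h disagrees with the label only on the last point, so this prints 0.2.
print(in_sample_error(h, D))
```

If that reading is right, the disagreement is only checked on the points in D, where the labels are given, and f stays unknown everywhere else. Is that what the equation means?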