I am confused about the answer to Q6. I can see that all the choices have the same score, but I assumed that not all of them would be valid hypotheses. I don't see how the opposite of XOR can agree with the 5 points in the dataset D. Isn't the hypothesis required to match those points?
You are right that in learning we try to match the training points. The message of this problem is that, regardless of whether you are doing something intelligent or otherwise, there is nothing that can be learned outside the training sample in a deterministic sense. To make the message crisp, the problem includes some 'crazy' training schemes in the mix.
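To see why every choice ends up with the same score, here is a minimal sketch in Python. It rests on assumptions that are not spelled out in this thread: that there are 3 points outside D (as there would be with 3 Boolean inputs and 5 training points), that every labeling of those points is a possible target, and that the score is the number of targets agreeing on exactly k outside points, weighted by k and summed. The names `agreement_counts` and `score` are made up for illustration.

```python
from itertools import product

def agreement_counts(prediction):
    """Count, for each k, how many possible targets agree with
    `prediction` on exactly k of the points outside D.

    `prediction` is a tuple of 0/1 labels, one per outside point.
    Every 0/1 labeling of those points is treated as a possible target
    (an assumption: the training data says nothing about them).
    """
    n = len(prediction)
    counts = [0] * (n + 1)
    for target in product((0, 1), repeat=n):
        k = sum(p == t for p, t in zip(prediction, target))
        counts[k] += 1
    return counts

def score(prediction):
    """Assumed scoring rule: sum over k of (targets agreeing on k points) * k."""
    return sum(k * c for k, c in enumerate(agreement_counts(prediction)))

# Try every possible prediction on the 3 outside points, which covers
# constant hypotheses, XOR, its opposite, and anything else.
scores = {p: score(p) for p in product((0, 1), repeat=3)}
for prediction, s in scores.items():
    print(prediction, agreement_counts(prediction), s)
```

Under these assumptions, what a hypothesis does on the 5 points in D never enters the calculation. Only its 3 outside predictions do, and flipping any of them just relabels which targets it agrees with, so the counts stay the same.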
