Quote:
Originally Posted by goodrm
From slide 9 (lecture 2), this seems to infer that the training samples are generated from the probability distribution. Am I correct in my understanding?
Before probability was introduced, it seemed that we were discussing training samples as representing actual customer data such as (salary, years in residence, years at job, etc.).
I am having some trouble understanding how the probability assumes points on x and what this really means. Is this sort of like a random generator?

Your understanding is correct. The inputs of the data set, $\mathbf{x}_1, \dots, \mathbf{x}_N$, are now assumed to be generated by some probability distribution. This does not change the nature of the $\mathbf{x}_n$'s; they are still customer data in the credit example. The only change is that those customers are now assumed to be pulled from the population according to some distribution, and that same distribution will also be used to generate the new customers on which the learned hypothesis is tested.
The assumption is benign since no restrictions are made about what kind of distribution it is, and we don't even need to know what the distribution is. It is just a mechanism that allows us to invoke probabilistic analysis, which is needed to establish the feasibility of learning as discussed in the lecture.
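To make the "random generator" picture concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration only: the distribution inside `draw_customer`, the target `f`, and the hypothesis `h` are all made up, and none of them come from the lecture. The point is just that the training customers and the fresh test customers are drawn from the same distribution, and that the analysis never needs to look inside that distribution.

```python
import random

rng = random.Random(0)  # fixed seed so the sketch is repeatable

def draw_customer():
    # Hypothetical distribution P over inputs x = (salary in $1000s, years in residence).
    # The learner never needs to know this function; it only needs P to exist.
    salary = rng.lognormvariate(4.0, 0.5)
    years = rng.expovariate(1 / 5.0)
    return (salary, years)

def f(x):
    # Hypothetical unknown target: +1 means approve credit, -1 means deny.
    salary, years = x
    return 1 if salary + 4 * years > 70 else -1

def h(x):
    # One fixed hypothesis, chosen before looking at any data.
    salary, years = x
    return 1 if salary > 60 else -1

def error_rate(sample):
    # Fraction of points in the sample where h disagrees with f.
    return sum(h(x) != f(x) for x in sample) / len(sample)

# Training inputs x_1, ..., x_N and fresh test inputs come from the same P.
train = [draw_customer() for _ in range(1000)]
fresh = [draw_customer() for _ in range(100_000)]

e_in = error_rate(train)            # in-sample error of h
e_out_estimate = error_rate(fresh)  # estimate of the out-of-sample error of h

print(f"in-sample error:        {e_in:.3f}")
print(f"out-of-sample estimate: {e_out_estimate:.3f}")
```

If you swap the body of `draw_customer` for any other distribution, the two printed numbers change, but for a fixed `h` they should still track each other. That is the sense in which the assumption is benign.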