04-13-2012, 12:12 AM
sandpjain
Meaning of the variables y1, y2, ..., yN?

(Note: I sent this question directly to Professor Abu-Mostafa, and he suggested that I post the question and his reply to the forum)

Given the data set

(x1,y1), (x2,y2), ..., (xN,yN)

where, in the example of a credit card application, x1, ..., xN might mean income, age, debt, etc.,

and where Y = h(x1, x2, x3, ..., xN), with h a candidate hypothesis, is a binary function mapping the vector X to the scalar +1 or -1,

what do the values y1, y2, y3, ..., yN in the data set stand for, how are they determined, and what is their role in the mapping h as defined above?

Reply by Professor Yaser S. Abu-Mostafa:
The source of confusion is that bold-face x (let's call it xx) stands for a full vector, which is the total input (all the information about a particular credit-card applicant), while italic x (let's call it x) stands for a single coordinate of the input (salary or years in residence, for example).

Therefore, xx_1,...,xx_N are different customers, while x_1,...,x_d are coordinates of the same customer. The notation N for number of examples and d for dimensionality of the input space is standard in the course.

With this in mind, y_1,...,y_N are simply the credit behavior of the N different customers (whether each of them was a good or a bad credit customer).
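
To make the notation concrete, here is a minimal sketch in Python. Everything in it is an assumption made up for illustration: the four customers, the three coordinates (salary, years in residence, debt), and the weights inside h are hypothetical and not taken from the course. It only shows that each row of X is one customer xx_n with d coordinates, that y holds one label per customer, and that a candidate hypothesis h takes a whole vector xx and returns +1 or -1.

```python
# Hypothetical toy data set: N = 4 customers, d = 3 coordinates each
# (salary in $1000s, years in residence, outstanding debt in $1000s).
X = [
    [60.0, 5.0, 10.0],   # xx_1: all the information about customer 1
    [25.0, 1.0, 30.0],   # xx_2
    [90.0, 10.0, 5.0],   # xx_3
    [40.0, 2.0, 45.0],   # xx_4
]

# y_n is the observed credit behavior of customer n:
# +1 = good credit customer, -1 = bad credit customer.
y = [+1, -1, +1, -1]

N = len(X)     # number of examples (customers)
d = len(X[0])  # dimensionality of the input space


def h(xx, w=(1.0, 2.0, -1.5), b=-20.0):
    """An illustrative candidate hypothesis (made-up weights):
    maps one customer's full vector xx to +1 or -1."""
    s = sum(wi * xi for wi, xi in zip(w, xx)) + b
    return 1 if s > 0 else -1


# h is applied to one customer at a time; the labels y_n are what
# its outputs are compared against.
predictions = [h(xx) for xx in X]
in_sample_error = sum(p != t for p, t in zip(predictions, y)) / N
```

Note that h never takes y as an input. The y_n come from the historical records, and their role is to judge how well a candidate h agrees with the observed behavior on the N examples.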