Meaning of the variables y1, y2, ..., yN?
(Note: I sent this question directly to Professor Abu-Mostafa, and he suggested that I post the question and his reply to the forum.)
Question: >>> Given the data set (x1,y1), (x2,y2), ..., (xN,yN), where in the case of, for example, a credit card application the x1 ... xN might mean income, age, debt, etc., and where Y = h(x1, x2, x3, ..., xN) is a binary function mapping the vector X to the scalar +1 or -1, with h a candidate hypothesis: what do the values y1, y2, y3, ..., yN in the data set stand for, how are they determined, and what is their role in the mapping h as defined above? <<<

Reply by Professor Yaser S. Abu-Mostafa: >>> The source of confusion is that boldface x (let's call it xx) stands for a full vector, which is the total input (all the information about a particular credit-card applicant), while italic x (let's call it x) stands for a single coordinate (salary or years in residence, for example) in the input. Therefore, xx_1,...,xx_N are different customers, while x_1,...,x_d are coordinates of the same customer. The notation N for number of examples and d for dimensionality of the input space is standard in the course. With this in mind, y_1,...,y_N are simply the credit behavior of the N different customers (whether each of them was a good or a bad credit customer). <<<
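
To make the notation in the reply concrete, here is a minimal Python sketch. Everything in it is made up for illustration and is not from the course: the four customers, the three coordinates (salary, years in residence, debt), the weights, the threshold, and the convention that +1 means a good credit customer and -1 a bad one. Each row of X is one full input vector xx_n, each entry within a row is one coordinate x_i, and y holds one recorded outcome per customer.

```python
# Hypothetical toy data set: N = 4 customers, d = 3 coordinates each.
# Assumed coordinate order: salary, years in residence, debt.
X = [
    [52000.0, 3.0, 12000.0],   # xx_1: all information about customer 1
    [31000.0, 1.0, 20000.0],   # xx_2
    [78000.0, 10.0, 5000.0],   # xx_3
    [24000.0, 0.5, 15000.0],   # xx_4
]

# y_n is the observed credit behavior of customer n (assumed: +1 good, -1 bad).
# These labels come from historical records; they are not computed by h.
y = [+1, -1, +1, -1]

N = len(X)      # number of examples (customers)
d = len(X[0])   # dimensionality of the input space (coordinates per customer)


def h(xx):
    """A made-up candidate hypothesis: threshold on a weighted sum of coordinates."""
    w = [1.0, 2000.0, -1.5]   # illustrative weights, one per coordinate
    threshold = 20000.0       # illustrative threshold
    score = sum(w_i * x_i for w_i, x_i in zip(w, xx)) - threshold
    return 1 if score > 0 else -1


# h takes ONE customer vector xx_n (d numbers) and returns +1 or -1.
predictions = [h(xx) for xx in X]

# The y_n play the role of targets: we compare h(xx_n) against y_n.
agreement = sum(p == t for p, t in zip(predictions, y))
print(f"N = {N}, d = {d}, h agrees with y on {agreement} of {N} customers")
```

Note that h is applied to one customer at a time, so its argument has d entries, not N. The values y_1,...,y_N are never inputs to h; they are what the output of h is checked against.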

