LFD Book Forum (http://book.caltech.edu/bookforum/index.php)

 sandpjain 04-13-2012 01:12 AM

Meaning of the variables y1,y2,..yN etc.?

(Note: I sent this question directly to Professor Abu-Mostafa, and he suggested that I post the question and his reply to the forum)

Question:
>>>
Given the data set

(x1,y1), (x2,y2), ..., (xN,yN)

where, for example in a credit card application, x1 ... xN might mean income, age, debt, etc.

and where Y = h(x1,x2,x3,...,xN), with Y a binary function mapping the vector X to the scalar +1 or -1 and h a candidate hypothesis,

what do the values y1,y2,y3...yN in the data set stand for, how are they determined, and what is their role in the mapping h as defined above?
<<<

Reply by Professor Yaser S. Abu-Mostafa:
>>>
The source of confusion is that bold-face x (let's call it xx) stands for a full vector, the total input (all the information about a particular credit-card applicant), while italic x (let's call it x) stands for a single coordinate of the input (salary or years in residence, for example).

Therefore, xx_1,...,xx_N are different customers, while x_1,...,x_d are coordinates of the same customer. The notation N for number of examples and d for dimensionality of the input space is standard in the course.

With this in mind, y_1,...,y_N are simply the credit behavior of the N different customers (whether each of them was a good or a bad credit customer).
<<<
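
To make the notation in the reply concrete, here is a minimal sketch in Python. The customer values, the weights, and the threshold are all made up for illustration, and the function h is just one hypothetical candidate hypothesis, not anything taken from the book. Each row of X is one customer xx_n, each entry in a row is one coordinate x_i, and y holds the observed credit behavior y_n.

```python
# Hypothetical toy data: N = 3 customers, d = 2 coordinates each
# (salary in thousands, years in residence). All values are invented.
X = [
    [50.0, 2.0],    # xx_1: first customer
    [120.0, 10.0],  # xx_2: second customer
    [30.0, 1.0],    # xx_3: third customer
]
y = [-1, +1, -1]    # y_n: observed credit behavior (+1 good, -1 bad)

N = len(X)     # number of examples (customers)
d = len(X[0])  # dimensionality of the input space

def h(xx, w=(0.01, 0.1), threshold=1.5):
    """Illustrative candidate hypothesis: +1 if the weighted sum of the
    coordinates exceeds the threshold, else -1. Weights are assumptions."""
    score = sum(w_i * x_i for w_i, x_i in zip(w, xx))
    return 1 if score > threshold else -1

# h takes ONE full input vector xx_n and returns one label in {+1, -1}.
predictions = [h(xx) for xx in X]

# Learning compares h(xx_n) with the known y_n on each of the N examples.
agreement = [p == y_n for p, y_n in zip(predictions, y)]
```

Note that h is applied to one customer at a time, so its arguments are the d coordinates of that customer, not the N customers of the data set.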
