LFD Book Forum PLA vs Linear Regression

#1
04-10-2012, 03:18 PM
 jg2012 Junior Member Join Date: Apr 2012 Posts: 7
PLA vs Linear Regression

Can we ask week 2 lecture questions on the Linear Model here?

Building a Perceptron classifier in the week 1 homework helped me see how the weight vector, w, defines the decision boundary (or line). For two-dimensional input, w0 + w1*x1 + w2*x2 = 0 is the equation for the decision line.

But when doing linear regression it is more like fitting a (hyper-)line to data points, which seems orthogonal to a decision boundary. I'm confused on how to reconcile these points of view. Does training w with PLA yield roughly the same results as training w using linear regression in which the y values are set to -1,1 for the two classes?

Here's what I'm thinking so far: Maybe linear regression with x1, x2, and y={-1,1} finds the equation of a plane that 'passes through' _all_ of the data best (in the least squares sense), _not_ a plane that separates the two classes. Then where the plane passes through y=0, maybe that would be similar to the line found from PLA. But if this interpretation is right, wouldn't the w vector for linear regression have 4 numbers for this example (so it defines a plane), whereas w for PLA only had three?

(If this view is right then I can see how, as mentioned in class, training examples of one class that lie far from the boundary pull the plane towards that class, moving the decision line further into that class in an unwanted way...)
#2
04-10-2012, 05:15 PM
 magdon RPI Join Date: Aug 2009 Location: Troy, NY, USA. Posts: 595
Re: PLA vs Linear Regression

Yes, questions relating to the book/course can be asked here at any time. This is a thought-provoking question.

Indeed there are several things to think about.

1) There is a difference between the classification function learned by PLA and the classification boundary (a line) which separates +1 from -1. The classification function attaches a value (±1) to every point in the input space. The classification function learned by PLA is a halfspace of +1 and a halfspace of -1. It is this classification function that is analogous to the learned linear regression function, which also attaches a value to every point in the input space -- except that this linear regression function attaches not just ±1 but any real value.

2) Yes, setting the linear regression function to 0 generates a linear boundary that 'separates' the region where the regression function is positive from the region where it is negative. In fact one use of regression is to classify the space into +1 and -1 in exactly this way.
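
To make the two points above concrete, here is a minimal sketch in Python with NumPy. It is not from the book or the homework: the toy data, the target weights w_true, the margin filter and all function names are assumptions chosen for illustration. It fits the same ±1 labels once with PLA and once with least-squares regression, then classifies with the sign of w0 + w1*x1 + w2*x2 in both cases. Note that both weight vectors have three components: the regression plane lives in (x1, x2, y) space, but y is the output and gets no weight of its own.

```python
import numpy as np

def make_data(n=200, margin=0.1, seed=0):
    """Toy separable data. Each row of X is (1, x1, x2); labels are +1 or -1."""
    rng = np.random.default_rng(seed)
    X = np.hstack([np.ones((n, 1)), rng.uniform(-1, 1, size=(n, 2))])
    w_true = np.array([0.1, 1.0, -1.0])  # hypothetical target line
    s = X @ w_true
    keep = np.abs(s) >= margin           # drop points too close to the line
    return X[keep], np.sign(s[keep])

def classify(X, w):
    """Halfspace classifier: +1 where w.x > 0, otherwise -1."""
    return np.where(X @ w > 0, 1.0, -1.0)

def linear_regression(X, y):
    """Least-squares fit of the real-valued function w.x to the +/-1 targets."""
    return np.linalg.pinv(X) @ y

def pla(X, y, max_iters=10_000):
    """Perceptron learning algorithm, starting from w = 0."""
    w = np.zeros(X.shape[1])
    for _ in range(max_iters):
        wrong = np.flatnonzero(classify(X, w) != y)
        if wrong.size == 0:
            break                        # every point is classified correctly
        i = wrong[0]
        w = w + y[i] * X[i]              # PLA update on one misclassified point
    return w

X, y = make_data()
w_lin = linear_regression(X, y)
w_pla = pla(X, y)
err_lin = np.mean(classify(X, w_lin) != y)
err_pla = np.mean(classify(X, w_pla) != y)

print("regression w:", w_lin, "in-sample error:", err_lin)
print("PLA w:       ", w_pla, "in-sample error:", err_pla)
```

On separable data PLA stops only once it has zero in-sample error, whereas the regression weights minimize squared error and so need not separate the classes perfectly, even though their sign often gives a similar boundary.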

Hope this helps. You may also find that Problem 3.13 in the book provides an interesting link between classification in 2-d and regression in 1-d.

__________________
Have faith in probability


All times are GMT -7.