 LFD Book Forum Classifying Handwritten Digits: 1 vs. 5

#1
 nahgnaw Junior Member Join Date: Aug 2012 Posts: 4 Classifying Handwritten Digits: 1 vs. 5

I don't quite understand the first classification method given by the problem: "Linear Regression for classification followed by pocket for improvement". Since the weight returned by linear regression is an analytically optimal result, how can the pocket algorithm improve it?
#2 magdon RPI Join Date: Aug 2009 Location: Troy, NY, USA. Posts: 595 Re: Classifying Handwritten Digits: 1 vs. 5

It is only analytically optimal for regression. It can be suboptimal for classification.
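For intuition, here is a minimal sketch (Python/NumPy on made-up synthetic data, not the course's digits features): the pseudo-inverse solution minimizes the squared error of w·x against the ±1 labels, while the classification error counts sign disagreements, a different objective, so the analytic solution need not minimize it.

```python
import numpy as np

# Synthetic 2-D data -- an assumption for illustration only,
# NOT the digits features from the problem.
rng = np.random.default_rng(0)
X_pos = rng.normal([2.0, 2.0], 0.5, (25, 2))        # class +1
X_neg = rng.normal([0.0, 0.0], 0.5, (25, 2))        # class -1
X = np.c_[np.ones(50), np.vstack([X_pos, X_neg])]   # prepend bias x0 = 1
y = np.r_[np.ones(25), -np.ones(25)]

# Analytic least-squares solution: minimizes sum((X w - y)^2) ...
w = np.linalg.pinv(X) @ y

# ... but classification measures the fraction of sign disagreements,
# which this w is not guaranteed to minimize.
E_in = np.mean(np.sign(X @ w) != y)
```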

Quote:
 Originally Posted by nahgnaw I don't quite understand the first classification method given by the problem: "Linear Regression for classification followed by pocket for improvement". Since the weight returned by linear regression is an analytically optimal result, how can the pocket algorithm improve it?
__________________
Have faith in probability
#3
 rpistu Member Join Date: Oct 2012 Posts: 10 Re: Classifying Handwritten Digits: 1 vs. 5

Hi Professor, you said that the weight vector w learned from Linear Regression could be suboptimal for classification. However, after running the pocket algorithm for 1,000,000 iterations, w still did not change, which would mean that the w learned from Linear Regression is optimal. Is that true? Maybe I made some mistake.
#4
 nahgnaw Junior Member Join Date: Aug 2012 Posts: 4 Re: Classifying Handwritten Digits: 1 vs. 5

The pocket algorithm is indeed able to improve on linear regression. Mine decreased the in-sample error from 0.8% to around 0.4%.
#5
 mileschen Member Join Date: Sep 2012 Posts: 11 Re: Classifying Handwritten Digits: 1 vs. 5

Do you set the w learned from Linear Regression as the initial w for the Pocket Algorithm? I did that, but saw no improvement. Maybe I made some mistake.
#6
 nahgnaw Junior Member Join Date: Aug 2012 Posts: 4 Re: Classifying Handwritten Digits: 1 vs. 5

Quote:
 Originally Posted by mileschen Do you set the w learned from Linear Regression as the initial w for the Pocket Algorithm? I did that, but saw no improvement. Maybe I made some mistake.
Yes, I did. You should probably look at your implementation of the pocket algorithm. I also got no improvement at first, but then I tweaked the code a little and it worked.
#7 magdon RPI Join Date: Aug 2009 Location: Troy, NY, USA. Posts: 595 Re: Classifying Handwritten Digits: 1 vs. 5

Any one of these three can happen:

1) the linear regression weights are optimal
2) the linear regression weights are not optimal and the PLA/Pocket algorithm can improve the weights.
3) the linear regression weights are not optimal and the PLA/Pocket algorithm cannot improve the weights.

In practice, we will not know which case we are in because actually finding the optimal weights is an NP-hard combinatorial optimization problem.

However, no matter which case we are in, other than some extra CPU cycles, there is no harm done in running the pocket algorithm on the regression weights to see if they can be improved.
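The "regression then pocket" pipeline can be sketched as follows; this is a generic Python/NumPy illustration on synthetic data (an assumption, not the digits experiment itself). The key property is that pocket keeps the best weights seen so far, so it can never return something worse than its starting point:

```python
import numpy as np

def pocket(X, y, w0, max_iters=1000, seed=1):
    """PLA updates starting from w0, keeping the best weights seen so far.

    X has a leading bias column, y is a vector of +/-1 labels, and w0 is
    the starting point (here, the linear regression solution).
    """
    rng = np.random.default_rng(seed)
    err = lambda w: np.mean(np.sign(X @ w) != y)
    w, best_w, best_e = w0.copy(), w0.copy(), err(w0)
    for _ in range(max_iters):
        mis = np.flatnonzero(np.sign(X @ w) != y)
        if mis.size == 0:
            break                       # data perfectly separated
        i = rng.choice(mis)             # random misclassified point
        w = w + y[i] * X[i]             # standard PLA update
        if err(w) < best_e:             # pocket it only if E_in improves
            best_w, best_e = w.copy(), err(w)
    return best_w, best_e

# Synthetic run (an assumption -- not the digits data):
rng = np.random.default_rng(0)
X = np.c_[np.ones(60), rng.normal(size=(60, 2))]
y = np.where(X[:, 1] + X[:, 2] > 0, 1.0, -1.0)

w_reg = np.linalg.pinv(X) @ y           # step 1: linear regression
w_pkt, e_pkt = pocket(X, y, w_reg)      # step 2: pocket on top
```

By construction the pocket run ends no worse than the regression weights, which is exactly why running it costs nothing but CPU cycles.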

Quote:
 Originally Posted by rpistu Hi Professor, you said that the weight vector w learned from Linear Regression could be suboptimal for classification. However, after running the pocket algorithm for 1,000,000 iterations, w still did not change, which would mean that the w learned from Linear Regression is optimal. Is that true? Maybe I made some mistake.
__________________
Have faith in probability
#8
 alanericy Junior Member Join Date: Oct 2013 Posts: 5 Re: Classifying Handwritten Digits: 1 vs. 5

Quote:
 Originally Posted by magdon Any one of these three can happen: 1) the linear regression weights are optimal 2) the linear regression weights are not optimal and the PLA/Pocket algorithm can improve the weights. 3) the linear regression weights are not optimal and the PLA/Pocket algorithm cannot improve the weights. In practice, we will not know which case we are in because actually finding the optimal weights is an NP-hard combinatorial optimization problem. However, no matter which case we are in, other than some extra CPU cycles, there is no harm done in running the pocket algorithm on the regression weights to see if they can be improved.
Hi Professor, how do we plot the separator with the training data if we use logistic regression for classification via gradient descent? That way we can compute a probability for every point in the figure, but how do we plot a separator for them?
#9 magdon RPI Join Date: Aug 2009 Location: Troy, NY, USA. Posts: 595 Re: Classifying Handwritten Digits: 1 vs. 5

You can use the weights produced by logistic regression for classification.
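Concretely (a sketch with made-up weights, not output from the actual experiment): the logistic regression hypothesis outputs θ(w·x), and θ(s) ≥ 1/2 exactly when s ≥ 0, so the separator is the line w·x = 0, just as for the other linear models.

```python
import numpy as np

# Hypothetical learned weights (w0, w1, w2) -- for illustration only.
w0, w1, w2 = -1.0, 2.0, 3.0

# Classify by thresholding the probability at 1/2, which is the same
# as taking the sign of the signal w . x:
def classify(a, b):
    s = w0 + w1 * a + w2 * b
    return 1 if s > 0 else -1

# To draw the separator, solve w0 + w1*x1 + w2*x2 = 0 for x2:
x1 = np.linspace(-2, 2, 100)
x2 = -(w0 + w1 * x1) / w2     # pass (x1, x2) to plt.plot(...) over the data
```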

Quote:
 Originally Posted by alanericy Hi Professor, how do we plot the separator with the training data if we use logistic regression for classification via gradient descent? That way we can compute a probability for every point in the figure, but how do we plot a separator for them?
__________________
Have faith in probability

