LFD Book Forum (http://book.caltech.edu/bookforum/index.php)
-   Chapter 3 - The Linear Model (http://book.caltech.edu/bookforum/forumdisplay.php?f=110)
-   -   Classifying Handwritten Digits: 1 vs. 5 (http://book.caltech.edu/bookforum/showthread.php?t=2063)

nahgnaw 10-10-2012 09:50 PM

Classifying Handwritten Digits: 1 vs. 5
 
I don't quite understand the first classification method given by the problem: "Linear Regression for classification followed by pocket for improvement". Since the weight returned by linear regression is an analytically optimal result, how can the pocket algorithm improve it?

magdon 10-11-2012 09:11 AM

Re: Classifying Handwritten Digits: 1 vs. 5
 
It is only analytically optimal for regression. It can be suboptimal for classification.
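A small illustration of this point, as a sketch on made-up one-dimensional data. The data set, the hand-picked separator w_sep, and the helper name classification_error are assumptions for illustration only, not part of the problem. The least-squares weights minimize squared error, yet they can misclassify a point that another linear separator gets right.

```python
import numpy as np

# Made-up 1D data: label is -1 for x < 0 and +1 for x > 0.
# The far-away point x = 40 pulls the least-squares fit toward itself.
x = np.array([-1.0, -0.5, 0.2, 1.0, 40.0])
y = np.array([-1.0, -1.0, 1.0, 1.0, 1.0])
X = np.column_stack([np.ones_like(x), x])  # each row is (1, x)

def classification_error(w, X, y):
    """Fraction of points where sign(w^T x) disagrees with y."""
    return float(np.mean(np.sign(X @ w) != y))

# Linear regression: the pseudo-inverse solution minimizes squared error.
w_lin = np.linalg.pinv(X) @ y

# A hand-picked separator (an assumption): sign(x) fits every label here.
w_sep = np.array([0.0, 1.0])

err_lin = classification_error(w_lin, X, y)
err_sep = classification_error(w_sep, X, y)
sq_lin = float(np.mean((X @ w_lin - y) ** 2))
sq_sep = float(np.mean((X @ w_sep - y) ** 2))

print("classification error:", err_lin, err_sep)
print("squared error:       ", sq_lin, sq_sep)
```

On this data w_lin should have the smaller squared error while w_sep has the smaller classification error, which is the gap the pocket algorithm can try to close.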

rpistu 10-12-2012 12:53 AM

Re: Classifying Handwritten Digits: 1 vs. 5
 
Hi Professor, you said that the weight vector w learned by linear regression could be suboptimal for classification. However, after running the pocket algorithm for 1,000,000 iterations, w still has not changed. Does that mean the w learned by linear regression is optimal? Maybe I made a mistake.

nahgnaw 10-12-2012 01:44 PM

Re: Classifying Handwritten Digits: 1 vs. 5
 
The pocket algorithm indeed is able to improve the linear regression. Mine decreased the in-sample error from 0.8% to around 0.4%.

mileschen 10-12-2012 02:15 PM

Re: Classifying Handwritten Digits: 1 vs. 5
 
Did you set the w learned by linear regression as the initial w for the pocket algorithm? I did that, but saw no improvement. Maybe I made a mistake.

magdon 10-12-2012 02:16 PM

Re: Classifying Handwritten Digits: 1 vs. 5
 
Any one of these three can happen:

1) the linear regression weights are optimal
2) the linear regression weights are not optimal and the PLA/Pocket algorithm can improve the weights.
3) the linear regression weights are not optimal and the PLA/Pocket algorithm cannot improve the weights.

In practice, we will not know which case we are in because actually finding the optimal weights is an NP-hard combinatorial optimization problem.

However, no matter which case we are in, other than some extra CPU cycles, there is no harm done in running the pocket algorithm on the regression weights to see if they can be improved.
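A minimal sketch of this procedure, assuming NumPy and a synthetic noisy data set (the data, the function names, and the iteration budget are all illustrative assumptions, not the assignment's setup): the pocket algorithm starts from the regression weights and keeps the best weights seen so far, so its in-sample classification error can never end up worse than where it started.

```python
import numpy as np

def classification_error(w, X, y):
    """Fraction of points where sign(w^T x) disagrees with y."""
    return float(np.mean(np.sign(X @ w) != y))

def pocket(X, y, w_init, max_iters=1000, seed=0):
    """PLA updates from w_init, keeping the best weights 'in the pocket'."""
    rng = np.random.default_rng(seed)
    w = np.array(w_init, dtype=float)
    best_w, best_err = w.copy(), classification_error(w, X, y)
    for _ in range(max_iters):
        wrong = np.flatnonzero(np.sign(X @ w) != y)
        if wrong.size == 0:
            break  # current weights separate the data
        i = rng.choice(wrong)          # pick a random misclassified point
        w = w + y[i] * X[i]            # standard PLA update
        err = classification_error(w, X, y)
        if err < best_err:             # only replace the pocket on improvement
            best_w, best_err = w.copy(), err
    return best_w, best_err

# Synthetic, noisy, roughly linear data (an assumption for illustration).
rng = np.random.default_rng(1)
N = 200
features = rng.normal(size=(N, 2))
y = np.sign(features[:, 0] + 0.5 * features[:, 1] + 0.3 * rng.normal(size=N))
X = np.column_stack([np.ones(N), features])

w_lin = np.linalg.pinv(X) @ y                  # regression weights
err_lin = classification_error(w_lin, X, y)
w_pocket, err_pocket = pocket(X, y, w_lin)     # pocket started from w_lin
print(err_lin, err_pocket)
```

Whether err_pocket ends up strictly below err_lin depends on the data and the random choices, which matches the three cases listed above.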


Quote:

Originally Posted by rpistu (Post 6301)
Hi Professor, you said that the weight vector w learned by linear regression could be suboptimal for classification. However, after running the pocket algorithm for 1,000,000 iterations, w still has not changed. Does that mean the w learned by linear regression is optimal? Maybe I made a mistake.


nahgnaw 10-12-2012 02:18 PM

Re: Classifying Handwritten Digits: 1 vs. 5
 
Quote:

Originally Posted by mileschen (Post 6324)
Did you set the w learned by linear regression as the initial w for the pocket algorithm? I did that, but saw no improvement. Maybe I made a mistake.

Yes, I did. You should probably look at your implementation of the pocket algorithm. I also got no improvement at first, but then I messed around with the code a little bit, and it worked.

rpistu 10-12-2012 05:31 PM

Re: Classifying Handwritten Digits: 1 vs. 5
 
Should the in-sample error use the squared error formula, as in linear regression, or the binary classification error?

magdon 10-12-2012 06:23 PM

Re: Classifying Handwritten Digits: 1 vs. 5
 
Binary classification error.
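For reference, the two in-sample error measures being contrasted can be written as follows for N data points (x_n, y_n) with y_n in {-1, +1}. The superscript labels "sq" and "class" are notation chosen here for illustration, and the double brackets evaluate to 1 when the statement inside is true and 0 otherwise.

```latex
E_{\text{in}}^{\text{sq}}(w) = \frac{1}{N}\sum_{n=1}^{N}\left(w^{T}x_n - y_n\right)^{2},
\qquad
E_{\text{in}}^{\text{class}}(w) = \frac{1}{N}\sum_{n=1}^{N}
  \left[\!\left[\,\operatorname{sign}(w^{T}x_n) \neq y_n\,\right]\!\right]
```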

rpistu 10-13-2012 02:11 AM

Re: Classifying Handwritten Digits: 1 vs. 5
 
How do I plot the training and test data together with the separators learned using a 3rd order polynomial transform? The 3rd order polynomial hypothesis is not a simple formula in the two features, so how can it be plotted on two-dimensional axes?

magdon 10-13-2012 06:15 AM

Re: Classifying Handwritten Digits: 1 vs. 5
 
This thread has a response that might help:

http://book.caltech.edu/bookforum/showthread.php?t=2101
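One common approach, which is not necessarily what the linked thread describes, is to evaluate the hypothesis on a grid of points in the two-feature plane and draw the contour where its value is zero. The sketch below assumes NumPy and matplotlib; the function names and the ordering of the ten polynomial terms are assumptions and must match whatever transform was used to learn w.

```python
import numpy as np

def poly3_transform(x1, x2):
    """Third-order polynomial transform of two features (10 terms).
    The term ordering is an assumption; it must match the one used in training."""
    return np.stack([
        np.ones_like(x1), x1, x2,
        x1**2, x1 * x2, x2**2,
        x1**3, x1**2 * x2, x1 * x2**2, x2**3,
    ], axis=-1)

def decision_grid(w, x1_range, x2_range, steps=200):
    """Evaluate w^T z(x) on a regular grid covering the plotting window."""
    g1, g2 = np.meshgrid(np.linspace(*x1_range, steps),
                         np.linspace(*x2_range, steps))
    return g1, g2, poly3_transform(g1, g2) @ w

def plot_separator(w, X_raw, y, x1_range, x2_range):
    """Scatter the two classes and draw the curve where w^T z(x) = 0."""
    import matplotlib.pyplot as plt  # imported here so the grid code runs without it
    g1, g2, values = decision_grid(w, x1_range, x2_range)
    plt.scatter(X_raw[y > 0, 0], X_raw[y > 0, 1], marker="o", label="+1")
    plt.scatter(X_raw[y < 0, 0], X_raw[y < 0, 1], marker="x", label="-1")
    plt.contour(g1, g2, values, levels=[0.0])  # the separator is the zero level set
    plt.legend()
    plt.show()
```

Calling plot_separator once with the training data and once with the test data, using the same w, gives the two plots asked about.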


admas 10-03-2013 08:00 PM

Re: Classifying Handwritten Digits: 1 vs. 5
 
Hello Professor Magdon,

I have a slightly different question related to plotting.
When I am asked to "familiarize yourself with the data by giving a plot of two
of the digit images", what do we mean by plotting the data? Are we referring to
generating a digit image from the greyscale value vector? Or are you referring to
somehow plotting the numerical values in the vector, itself? I have a feeling that it is
the former, but I am not familiar with image generation from greyscale pixels.

Any guidance you could provide would be greatly appreciated.

magdon 10-04-2013 09:58 AM

Re: Classifying Handwritten Digits: 1 vs. 5
 
If you go to www.amlbook.com, click on 'supporting material' on the right, and then scroll down to the 'Data' section, you will find some information that can be of use. In particular, there is MATLAB code for plotting the digit images, which takes the matrix of grayscale values and plots an image. This can help in developing your own code and utilities.
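For those not using MATLAB, a rough Python equivalent might look like the sketch below. It assumes each data row holds the digit label followed by 256 grayscale values for a 16x16 image; that layout, the function names, and the colormap are assumptions, so check them against the data description on the site.

```python
import numpy as np

def row_to_image(row, side=16):
    """Split one data row into (label, side x side pixel matrix).
    Assumes the row is: label, then side*side grayscale values."""
    row = np.asarray(row, dtype=float)
    label = int(row[0])
    pixels = row[1:1 + side * side]
    return label, pixels.reshape(side, side)

def show_digit(row):
    """Display one digit as a grayscale image."""
    import matplotlib.pyplot as plt  # imported here so the reshape code runs without it
    label, image = row_to_image(row)
    plt.imshow(image, cmap="gray")  # colormap choice is an assumption
    plt.title(f"digit {label}")
    plt.axis("off")
    plt.show()
```

If the image appears inverted or rotated, the colormap or the reshape order probably differs from what the data file uses.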

admas 10-05-2013 04:13 PM

Re: Classifying Handwritten Digits: 1 vs. 5
 
Thank you for your assistance. That helps me greatly.

alanericy 10-15-2013 08:49 PM

Re: Classifying Handwritten Digits: 1 vs. 5
 
Quote:

Originally Posted by magdon (Post 6325)
Any one of these three can happen:

1) the linear regression weights are optimal
2) the linear regression weights are not optimal and the PLA/Pocket algorithm can improve the weights.
3) the linear regression weights are not optimal and the PLA/Pocket algorithm cannot improve the weights.

In practice, we will not know which case we are in because actually finding the optimal weights is an NP-hard combinatorial optimization problem.

However, no matter which case we are in, other than some extra CPU cycles, there is no harm done in running the pocket algorithm on the regression weights to see if they can be improved.

Hi Professor, how do I plot the separator with the training data if I use logistic regression, trained by gradient descent, for classification? This way we can compute the probability for every point in the figure, but how do we plot a separator?

magdon 10-16-2013 08:29 AM

Re: Classifying Handwritten Digits: 1 vs. 5
 
You can use the weights produced by logistic regression for classification.

admas 10-16-2013 10:09 AM

Re: Classifying Handwritten Digits: 1 vs. 5
 
Hello. I have a question about the digits assignment. Are we supposed to use the two
features separately, {1, feature i}, or have an input vector consisting of {1, feature 1, feature 2}?

alanericy 10-16-2013 01:36 PM

Re: Classifying Handwritten Digits: 1 vs. 5
 
Thanks for your reply. In logistic regression, will the separator still be linear? And should we use a fixed or a variable step size?

magdon 10-17-2013 11:46 AM

Re: Classifying Handwritten Digits: 1 vs. 5
 
Use the single input vector {1, feature 1, feature 2}.
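In code, this amounts to stacking a column of ones next to the two feature columns. A minimal sketch, assuming NumPy and made-up feature values (the array names and numbers are placeholders):

```python
import numpy as np

# Made-up values for the two features of three digits (placeholders only).
feature1 = np.array([0.3, 0.5, 0.1])
feature2 = np.array([-1.2, -3.0, -0.4])

# Each row of X is the input vector (1, feature 1, feature 2).
X = np.column_stack([np.ones_like(feature1), feature1, feature2])
print(X.shape)
```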

Quote:

Originally Posted by admas (Post 11573)
Hello. I have a question about the digits assignment. Are we supposed to use the two
features separately, {1, feature i}, or have an input vector consisting of {1, feature 1, feature 2}?


magdon 10-17-2013 11:47 AM

Re: Classifying Handwritten Digits: 1 vs. 5
 
The classification function is still sign(w^Tx) which is linear. The weights w are obtained using logistic regression.
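A sketch of how this could look in practice, assuming NumPy, a tiny made-up data set, and a fixed step size (the step size, iteration count, and function names are illustrative assumptions, not a recommendation from this thread). With inputs (1, x1, x2), the separator is the line w0 + w1*x1 + w2*x2 = 0, which can be plotted by solving for x2.

```python
import numpy as np

def logistic_regression(X, y, eta=0.1, iters=1000):
    """Gradient descent on the mean of ln(1 + exp(-y w^T x)), fixed step size."""
    w = np.zeros(X.shape[1])
    for _ in range(iters):
        margins = y * (X @ w)
        # Gradient of the logistic error with respect to w.
        grad = -np.mean((y / (1.0 + np.exp(margins)))[:, None] * X, axis=0)
        w -= eta * grad
    return w

def boundary_x2(w, x1):
    """x2 on the line w0 + w1*x1 + w2*x2 = 0 (requires w[2] != 0)."""
    return -(w[0] + w[1] * x1) / w[2]

# Tiny made-up data set with two well-separated classes.
features = np.array([[2, 2], [3, 1], [2, 3],
                     [-2, -2], [-3, -1], [-2, -3]], dtype=float)
y = np.array([1, 1, 1, -1, -1, -1], dtype=float)
X = np.column_stack([np.ones(len(y)), features])

w = logistic_regression(X, y)
predictions = np.sign(X @ w)   # classify with sign(w^T x)
print(w, predictions)
```

To draw the separator, evaluate boundary_x2 at the two ends of the x1 axis and connect the points with a straight line.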

Quote:

Originally Posted by alanericy (Post 11574)
Thanks for your reply. In logistic regression, will the separator still be linear? And should we use a fixed or a variable step size?



All times are GMT -7.

The contents of this forum are to be used ONLY by readers of the Learning From Data book by Yaser S. Abu-Mostafa, Malik Magdon-Ismail, and Hsuan-Tien Lin, and participants in the Learning From Data MOOC by Yaser S. Abu-Mostafa. No part of these contents is to be communicated or made accessible to ANY other person or entity.