04-12-2016, 09:41 PM
galo
Q2 - Classifier only correctly predicts a few classes

I can't seem to get Q2 right. I'm using the support vector classifier from the sklearn package (svm.SVC) in Python. I've set the parameters to the required values, but Ein (1 - recall in the output) is far too high for most classes. I don't think pandas is the cause, but I still converted the class labels to int, since pandas was reading them as float64 by default.

Code:
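import pandas as pd
from sklearn import svm, metrics

# filepath is the path to the training data file (defined elsewhere).
# Columns are separated by runs of whitespace and there is no header row.
train_df = pd.read_csv(
    filepath,
    sep=r"\s+",
    header=None,
)
train_df.columns = ["Digit", "Intensity", "Symmetry"]
# pandas reads the labels as float64; convert them to int.
train_df["Digit"] = train_df["Digit"].astype(int)

# Second-degree polynomial kernel: (gamma * <x, x'> + coef0) ** degree
clf = svm.SVC(
    C=0.01,
    kernel="poly",
    degree=2,
    gamma=1.0,
    coef0=1.0,
)

# Features are the two numeric columns; the target is the digit label.
# (.ix has been removed from pandas, so select the columns by name.)
X = train_df[["Intensity", "Symmetry"]].values
y = train_df["Digit"].values

clf.fit(X, y)

# Evaluate on the training set itself (in-sample performance).
expected = y
predicted = clf.predict(X)

print("Classification report for classifier %s:\n%s\n"
      % (clf, metrics.classification_report(expected, predicted)))
print("Confusion matrix:\n%s" % metrics.confusion_matrix(expected, predicted))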
No 5s or 8s are predicted correctly, very few 4s and 6s, and only some 3s and 7s. This seems very strange.

Can someone show me what I'm doing wrong?

Output:

Code:
Classification report for classifier SVC(C=0.01, cache_size=200, class_weight=None, coef0=1.0,
decision_function_shape=None, degree=2.0, gamma=1.0, kernel='poly',
max_iter=-1, probability=False, random_state=None, shrinking=True,
tol=0.001, verbose=False):
             precision    recall  f1-score   support

          0       0.54      0.83      0.65      1194
          1       0.93      0.96      0.95      1005
          2       0.22      0.55      0.31       731
          3       0.27      0.12      0.17       658
          4       0.12      0.02      0.04       652
          5       0.00      0.00      0.00       556
          6       0.09      0.00      0.00       664
          7       0.26      0.16      0.19       645
          8       0.00      0.00      0.00       542
          9       0.21      0.56      0.30       644

avg / total       0.32      0.40      0.33      7291


Confusion matrix:
[[987  41  65  45   2   0   0   5   0  49]
 [ 35 969   0   0   0   0   0   0   0   1]
 [ 65   3 404  48  16   0   0  48   0 147]
 [204   1 165  79  17   0   0  22   0 170]
 [ 79   9 163  11  16   0   1  81   0 292]
 [ 14   1 361  21  17   0   3  51   0  88]
 [ 38   0 282  35  22   0   1  53   0 233]
 [ 22   1 178   5  29   0   2 100   0 308]
 [298  16  89  26   1   0   2   8   0 102]
 [ 83   0 133  24  17   0   2  23   0 362]]
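
For reference, here is a minimal sketch of how the recall column and the "1 - recall" figure relate to the confusion matrix. It assumes the matrix is laid out the way sklearn's confusion_matrix documents it (rows are true digits, columns are predicted digits) and simply copies the numbers printed above; the variable names are mine, not part of the assignment.

```python
import numpy as np

# Confusion matrix copied from the output above
# (rows = true digit, columns = predicted digit).
cm = np.array([
    [987,  41,  65,  45,   2,   0,   0,   5,   0,  49],
    [ 35, 969,   0,   0,   0,   0,   0,   0,   0,   1],
    [ 65,   3, 404,  48,  16,   0,   0,  48,   0, 147],
    [204,   1, 165,  79,  17,   0,   0,  22,   0, 170],
    [ 79,   9, 163,  11,  16,   0,   1,  81,   0, 292],
    [ 14,   1, 361,  21,  17,   0,   3,  51,   0,  88],
    [ 38,   0, 282,  35,  22,   0,   1,  53,   0, 233],
    [ 22,   1, 178,   5,  29,   0,   2, 100,   0, 308],
    [298,  16,  89,  26,   1,   0,   2,   8,   0, 102],
    [ 83,   0, 133,  24,  17,   0,   2,  23,   0, 362],
])

support = cm.sum(axis=1)            # true samples per digit (row sums)
recall = np.diag(cm) / support      # fraction of each digit predicted correctly
error_rate = 1.0 - recall           # the "1 - recall" quantity per digit
predicted_counts = cm.sum(axis=0)   # how often each digit is predicted (column sums)

for digit in range(10):
    print("digit %d: recall=%.2f  1-recall=%.2f  times predicted=%d"
          % (digit, recall[digit], error_rate[digit], predicted_counts[digit]))

# Digits whose column is all zeros are never output by the classifier.
never_predicted = [d for d in range(10) if predicted_counts[d] == 0]
print("never predicted:", never_predicted)
```

The all-zero columns show that the classifier never outputs a 5 or an 8 for any training point, which is why their precision and recall are both 0.00 in the report.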

Last edited by galo; 04-13-2016 at 12:13 AM. Reason: title