LFD Book Forum  

Go Back   LFD Book Forum > Course Discussions > Online LFD course > Homework 7

Reply
 
Thread Tools Display Modes
  #41  
Old 05-22-2012, 09:23 AM
elkka elkka is offline
Invited Guest
 
Join Date: Apr 2012
Posts: 57
Default Re: code 3 in Octave qp optimizer

I have tweaked my trick, and now consider it perfectly legal. I think it mght be interesting to those who had the same problem.

So, first, I modified quadratic matrix H by adding to it a diagonal matrix with very small diagonal values. Specifically, I used 10^(-15). I passed otherwise unchanged problem to qp. The result was some \alpha_0, which I immediately used as an initial value for the original optimization problem (with original H).

The result. The first optimization converged in under 200 steps for N=10 in 100% cases. It converged in 99.99% cases for N=100. The latter optimization frequently took just 1 iteration, though occasionally I saw it take more than 100. It did converge in 100% cases. My E_in became consistently 0, and my b's differed by magnitude of the order less than 10^(-8) in 99.99% cases (and I don't believe the outlier influenced my averages all that much).

I believe the trick is legal, because all I do is I find a better initial value of \alpha, but then still solve the same optimization problem.
Reply With Quote
  #42  
Old 05-22-2012, 09:31 AM
markland markland is offline
Member
 
Join Date: Apr 2012
Posts: 30
Default Re: What Quadratic Programming Package?

Thank you! That did the trick for me.
Reply With Quote
  #43  
Old 05-22-2012, 01:17 PM
kurts kurts is offline
Invited Guest
 
Join Date: Apr 2012
Location: Portland, OR
Posts: 70
Default Re: What Quadratic Programming Package?

Thanks for the tip, elkka!

That made a big difference in the quality of my results - in every single case, my "b"'s are perfectly aligned. Unfortunately, it also made a big difference in the answer I submitted

It's a bit annoying that a technical, implementation package-specific tweak can have such a big effect and cause one to submit the wrong answer when one has otherwise done all the correct setup and formulation of the problem, but I suppose this is yet another example of the challenges in designing homework problems for an online audience.

The exercise was definitely worth it, I learned quite a bit from this problem, not the least of which is how to use a program like Octave - and as my wife has to keep reminding me, I'm not actually getting a *grade* in this class...
Reply With Quote
  #44  
Old 05-22-2012, 02:05 PM
Hillbilly Hillbilly is offline
Invited Guest
 
Join Date: Mar 2012
Location: West Virginia
Posts: 45
Default Re: What Quadratic Programming Package?

Quote:
Originally Posted by kurts View Post
and as my wife has to keep reminding me, I'm not actually getting a *grade* in this class...
My wife is following me on the scoreboard. If it weren't for that tangible evidence of accomplishing something, she probably wouldn't put up with me spending so much time on this class. I've tried explaining machine learning to her, but her take-away is, "it's about predicting things with computers." Thank goodness for the scoreboard, or I'd be gone already!
Reply With Quote
  #45  
Old 05-22-2012, 02:22 PM
sakumar sakumar is offline
Member
 
Join Date: Apr 2012
Posts: 40
Default Re: What Quadratic Programming Package?

Dang, elkka! That was a hell of a trick!

qp/SVM now works like a champ.

Any intuition into why this works? When I first tried your suggestion, I initially just used the normal H to get alpha0 and then fed alpha0 to qp. It made no difference. When I added in the tiny diagonal matrix, it was like night and day.
Reply With Quote
  #46  
Old 05-22-2012, 02:37 PM
elkka elkka is offline
Invited Guest
 
Join Date: Apr 2012
Posts: 57
Default Re: What Quadratic Programming Package?

Well, matrix H seems to be degenerate in a fundamental way. Not only is its determinant 0, but its rank is 2 (for both N=10 and N=100). Having a 10*10 matrix of rank 2 is usually a computationally bad thing, let alone a 100*100 matrix of rank 2. So adding just a little bit of a well-behaved matrix to it seems to help computationally. Without knowing the exact algorithm behind qp it is hard to be more specific.

I am surprised though how little it took to improve things. And I am not at all 100% sure it is the correct approach. Let's wait for the answers.
Reply With Quote
  #47  
Old 05-23-2012, 12:25 AM
sakumar sakumar is offline
Member
 
Join Date: Apr 2012
Posts: 40
Default Re: What Quadratic Programming Package?

Shown below are two runs with Octave qp, each run twice with the same data. Figure 1 is with @elkka's initialization suggestion. Figure 2 shows the results when the initial alpha is all ones.

The support vector points are circled.

With the right initialization (Figure 1), SVM/QP nails the support vectors every time. Without the initialization, it is less than impressive, only occasionally showing an optimum solution.

Figure 1 always has 3 points circled. Figure 2 is between 3 and 4. Also, the runtimes with the right initialization are a lot less -- obviously because it converges instead of running for max-iterations.




Reply With Quote
  #48  
Old 05-23-2012, 12:50 AM
kurts kurts is offline
Invited Guest
 
Join Date: Apr 2012
Location: Portland, OR
Posts: 70
Default Re: What Quadratic Programming Package?

sakumar,
I am an Octave newbie - can you show me the commands you used to generate those plots?
Reply With Quote
  #49  
Old 05-23-2012, 01:02 AM
sakumar sakumar is offline
Member
 
Join Date: Apr 2012
Posts: 40
Default Re: What Quadratic Programming Package?

Quote:
Originally Posted by kurts View Post
sakumar,
I am an Octave newbie - can you show me the commands you used to generate those plots?
Sure. Here it is. It is hacked up a bit because I cut and paste to get Figure 1 and Figure 2. It just runs one iteration and generates two plots.

Code:
iterations = 1;
n = 100;

test_n = 10000; % number of points used to determine Eout

svm_threshold = 10^(-5); % if alpha value is greater than this, count it as a support vector


function w = GetWFrom2Points(x, y)
    w(3) = - 1.0;
    w(2) = (x(2) - y(2)) / (x(1) - y(1));
    w(1) = x(2) - w(2) * x(1);
end

X = zeros(3,n);
testX = zeros(3, test_n);


min_val_vec = zeros(n, 1);
min_val_vec = svm_threshold;

svm_winner_count = 0;
sv_count = 0;

for m = 1:iterations

    svm_fraction_incorrect = 0.0;
    perceptron_fraction_incorrect = 0.0;

    do % we need to check to make sure the n points are not all on one side of the line
       % asked to discard datapoint if that happens and start again
        a = unifrnd(-1.0, 1.0, 1, 2);
        b = unifrnd(-1.0, 1.0, 1, 2);

        fw = GetWFrom2Points(a, b); % This is the target function
        fw = fw';

        % now generate n points -- these are the in-sample points
        X(2:3,1:n) = unifrnd(-1.0, 1.0, 2, n);
        X(1,1:n) = 1.0;

        % get true value for in-sample points
        ty = sign(fw'*X);
    until abs(sum(ty)) != n % if all the y's added up equal to n, it means all the points are on one side of the line


    % now we run the perceptron
    % initialize hypothesis ("all-zero vector")
    w = zeros(3,1);

    while (1)
        y = sign(w'*X);
        comp = (y == ty);

        if (comp) % if all comp values are 1, then we have converged
            break;
        end

        % get random mismatched index
        ilist = find(!comp);
        m_indx = ilist(unidrnd(sum(!comp)));

        w = w + ty(m_indx)*X(:, m_indx);
    end

    % now get probability that f(x) is not equal to g(x)
    testX(2:3,1:test_n) = unifrnd(-1.0, 1.0, 2, test_n);
    testX(1,1:test_n) = 1.0;

    test_ty = sign(fw'*testX);
    test_y = sign(w'*testX);

    comp = (test_y == test_ty);

    perceptron_fraction_incorrect += sum(!comp)/test_n;


    % Now on to the quadratic programming. Am going to try the "qp" function in Octave

    % initialize alpha0 vector to zeros (??)
    alpha0 = ones(n, 1);

    % the "q" matrix
    q = ones(n,1)*-1;

    % the A matrix (actually vector)
    A = ty;

    % the b value
    b = 0;

    % the lb vector (all zeros)
    lb = zeros(n, 1);

    X_svm = X(2:3, :);
    % build the H (called Q in lecture) matrix
    H = (ty'*ty).*(X_svm'*X_svm);

    tempH = H + eye(100).*10^-15;
    [alpha0, obj, info, lambda] = qp([], tempH, q, A, b, lb, []);

    options.MaxIter = 1000;
    [alpha, obj, info, lambda] = qp(alpha0, H, q, A, b, lb, [], options);

    'INFO.INFO'
    info.info

    % count number of support vectors
    comp = ge(alpha, min_val_vec);
    sv_index = find(comp);
    num_sv = sum(comp);

    sv_count += num_sv;


    svm_w = zeros(3, 1);

    svm_w(2) = sum(X_svm(1, :).*(alpha'.*ty));
    svm_w(3) = sum(X_svm(2, :).*(alpha'.*ty));

    % get index of the maximum alpha
    [maxval, maxindx] = max(alpha);

    svm_w(1) = 1/ty(maxindx) - (svm_w(2:3,1))'*X_svm(:, maxindx);


    negmark = lt(ty, min_val_vec);
    negindex = find(negmark);

    posmark = gt(ty, min_val_vec);
    posindex = find(posmark);

    figure(1);
    plot(X(2,negindex), X(3,negindex), "@11", X(2, posindex), X(3, posindex), "@22", X(2, sv_index), X(3, sv_index), "o");

    printf("NUMBER OF SUPPORT VECTORS %i\n", length(sv_index));

    options.MaxIter = 1000;
    [alpha, obj, info, lambda] = qp([], H, q, A, b, lb, [], options);

    'INFO.INFO'
    info.info

    % count number of support vectors
    comp = ge(alpha, min_val_vec);
    sv_index = find(comp);
    num_sv = sum(comp);

    sv_count += num_sv;


    svm_w = zeros(3, 1);

    svm_w(2) = sum(X_svm(1, :).*(alpha'.*ty));
    svm_w(3) = sum(X_svm(2, :).*(alpha'.*ty));

    % get index of the minimum alpha
    [maxval, maxindx] = max(alpha);

    svm_w(1) = 1/ty(maxindx) - (svm_w(2:3,1))'*X_svm(:, maxindx);


    negmark = lt(ty, min_val_vec);
    negindex = find(negmark);

    posmark = gt(ty, min_val_vec);
    posindex = find(posmark);

    figure(2);
    plot(X(2,negindex), X(3,negindex), "@11", X(2, posindex), X(3, posindex), "@22", X(2, sv_index), X(3, sv_index), "o");

    printf("NUMBER OF SUPPORT VECTORS %i\n", length(sv_index));


end
Reply With Quote
  #50  
Old 05-23-2012, 01:25 AM
kurts kurts is offline
Invited Guest
 
Join Date: Apr 2012
Location: Portland, OR
Posts: 70
Default Re: What Quadratic Programming Package?

Thanks!!
Reply With Quote
Reply

Tags
hw7, package, quadratic programming

Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off

Forum Jump


All times are GMT -7. The time now is 12:58 PM.


Powered by vBulletin® Version 3.8.3
Copyright ©2000 - 2019, Jelsoft Enterprises Ltd.
The contents of this forum are to be used ONLY by readers of the Learning From Data book by Yaser S. Abu-Mostafa, Malik Magdon-Ismail, and Hsuan-Tien Lin, and participants in the Learning From Data MOOC by Yaser S. Abu-Mostafa. No part of these contents is to be communicated or made accessible to ANY other person or entity.