#41




Re: code 3 in Octave qp optimizer
I have tweaked my trick, and now consider it perfectly legal. I think it mght be interesting to those who had the same problem.
So, first, I modified quadratic matrix H by adding to it a diagonal matrix with very small diagonal values. Specifically, I used 10^(15). I passed otherwise unchanged problem to qp. The result was some , which I immediately used as an initial value for the original optimization problem (with original H). The result. The first optimization converged in under 200 steps for N=10 in 100% cases. It converged in 99.99% cases for N=100. The latter optimization frequently took just 1 iteration, though occasionally I saw it take more than 100. It did converge in 100% cases. My E_in became consistently 0, and my b's differed by magnitude of the order less than 10^(8) in 99.99% cases (and I don't believe the outlier influenced my averages all that much). I believe the trick is legal, because all I do is I find a better initial value of , but then still solve the same optimization problem. 
#42




Re: What Quadratic Programming Package?
Thank you! That did the trick for me.

#43




Re: What Quadratic Programming Package?
Thanks for the tip, elkka!
That made a big difference in the quality of my results  in every single case, my "b"'s are perfectly aligned. Unfortunately, it also made a big difference in the answer I submitted It's a bit annoying that a technical, implementation packagespecific tweak can have such a big effect and cause one to submit the wrong answer when one has otherwise done all the correct setup and formulation of the problem, but I suppose this is yet another example of the challenges in designing homework problems for an online audience. The exercise was definitely worth it, I learned quite a bit from this problem, not the least of which is how to use a program like Octave  and as my wife has to keep reminding me, I'm not actually getting a *grade* in this class... 
#44




Re: What Quadratic Programming Package?
My wife is following me on the scoreboard. If it weren't for that tangible evidence of accomplishing something, she probably wouldn't put up with me spending so much time on this class. I've tried explaining machine learning to her, but her takeaway is, "it's about predicting things with computers." Thank goodness for the scoreboard, or I'd be gone already!

#45




Re: What Quadratic Programming Package?
Dang, elkka! That was a hell of a trick!
qp/SVM now works like a champ. Any intuition into why this works? When I first tried your suggestion, I initially just used the normal H to get alpha0 and then fed alpha0 to qp. It made no difference. When I added in the tiny diagonal matrix, it was like night and day. 
#46




Re: What Quadratic Programming Package?
Well, matrix H seems to be degenerate in a fundamental way. Not only is its determinant 0, but its rank is 2 (for both N=10 and N=100). Having a 10*10 matrix of rank 2 is usually a computationally bad thing, let alone a 100*100 matrix of rank 2. So adding just a little bit of a wellbehaved matrix to it seems to help computationally. Without knowing the exact algorithm behind qp it is hard to be more specific.
I am surprised though how little it took to improve things. And I am not at all 100% sure it is the correct approach. Let's wait for the answers. 
#47




Re: What Quadratic Programming Package?
Shown below are two runs with Octave qp, each run twice with the same data. Figure 1 is with @elkka's initialization suggestion. Figure 2 shows the results when the initial alpha is all ones.
The support vector points are circled. With the right initialization (Figure 1), SVM/QP nails the support vectors every time. Without the initialization, it is less than impressive, only occasionally showing an optimum solution. Figure 1 always has 3 points circled. Figure 2 is between 3 and 4. Also, the runtimes with the right initialization are a lot less  obviously because it converges instead of running for maxiterations. 
#48




Re: What Quadratic Programming Package?
sakumar,
I am an Octave newbie  can you show me the commands you used to generate those plots? 
#49




Re: What Quadratic Programming Package?
Quote:
Code:
iterations = 1; n = 100; test_n = 10000; % number of points used to determine Eout svm_threshold = 10^(5); % if alpha value is greater than this, count it as a support vector function w = GetWFrom2Points(x, y) w(3) =  1.0; w(2) = (x(2)  y(2)) / (x(1)  y(1)); w(1) = x(2)  w(2) * x(1); end X = zeros(3,n); testX = zeros(3, test_n); min_val_vec = zeros(n, 1); min_val_vec = svm_threshold; svm_winner_count = 0; sv_count = 0; for m = 1:iterations svm_fraction_incorrect = 0.0; perceptron_fraction_incorrect = 0.0; do % we need to check to make sure the n points are not all on one side of the line % asked to discard datapoint if that happens and start again a = unifrnd(1.0, 1.0, 1, 2); b = unifrnd(1.0, 1.0, 1, 2); fw = GetWFrom2Points(a, b); % This is the target function fw = fw'; % now generate n points  these are the insample points X(2:3,1:n) = unifrnd(1.0, 1.0, 2, n); X(1,1:n) = 1.0; % get true value for insample points ty = sign(fw'*X); until abs(sum(ty)) != n % if all the y's added up equal to n, it means all the points are on one side of the line % now we run the perceptron % initialize hypothesis ("allzero vector") w = zeros(3,1); while (1) y = sign(w'*X); comp = (y == ty); if (comp) % if all comp values are 1, then we have converged break; end % get random mismatched index ilist = find(!comp); m_indx = ilist(unidrnd(sum(!comp))); w = w + ty(m_indx)*X(:, m_indx); end % now get probability that f(x) is not equal to g(x) testX(2:3,1:test_n) = unifrnd(1.0, 1.0, 2, test_n); testX(1,1:test_n) = 1.0; test_ty = sign(fw'*testX); test_y = sign(w'*testX); comp = (test_y == test_ty); perceptron_fraction_incorrect += sum(!comp)/test_n; % Now on to the quadratic programming. Am going to try the "qp" function in Octave % initialize alpha0 vector to zeros (??) alpha0 = ones(n, 1); % the "q" matrix q = ones(n,1)*1; % the A matrix (actually vector) A = ty; % the b value b = 0; % the lb vector (all zeros) lb = zeros(n, 1); X_svm = X(2:3, :); % build the H (called Q in lecture) matrix H = (ty'*ty).*(X_svm'*X_svm); tempH = H + eye(100).*10^15; [alpha0, obj, info, lambda] = qp([], tempH, q, A, b, lb, []); options.MaxIter = 1000; [alpha, obj, info, lambda] = qp(alpha0, H, q, A, b, lb, [], options); 'INFO.INFO' info.info % count number of support vectors comp = ge(alpha, min_val_vec); sv_index = find(comp); num_sv = sum(comp); sv_count += num_sv; svm_w = zeros(3, 1); svm_w(2) = sum(X_svm(1, :).*(alpha'.*ty)); svm_w(3) = sum(X_svm(2, :).*(alpha'.*ty)); % get index of the maximum alpha [maxval, maxindx] = max(alpha); svm_w(1) = 1/ty(maxindx)  (svm_w(2:3,1))'*X_svm(:, maxindx); negmark = lt(ty, min_val_vec); negindex = find(negmark); posmark = gt(ty, min_val_vec); posindex = find(posmark); figure(1); plot(X(2,negindex), X(3,negindex), "@11", X(2, posindex), X(3, posindex), "@22", X(2, sv_index), X(3, sv_index), "o"); printf("NUMBER OF SUPPORT VECTORS %i\n", length(sv_index)); options.MaxIter = 1000; [alpha, obj, info, lambda] = qp([], H, q, A, b, lb, [], options); 'INFO.INFO' info.info % count number of support vectors comp = ge(alpha, min_val_vec); sv_index = find(comp); num_sv = sum(comp); sv_count += num_sv; svm_w = zeros(3, 1); svm_w(2) = sum(X_svm(1, :).*(alpha'.*ty)); svm_w(3) = sum(X_svm(2, :).*(alpha'.*ty)); % get index of the minimum alpha [maxval, maxindx] = max(alpha); svm_w(1) = 1/ty(maxindx)  (svm_w(2:3,1))'*X_svm(:, maxindx); negmark = lt(ty, min_val_vec); negindex = find(negmark); posmark = gt(ty, min_val_vec); posindex = find(posmark); figure(2); plot(X(2,negindex), X(3,negindex), "@11", X(2, posindex), X(3, posindex), "@22", X(2, sv_index), X(3, sv_index), "o"); printf("NUMBER OF SUPPORT VECTORS %i\n", length(sv_index)); end 
#50




Re: What Quadratic Programming Package?
Thanks!!

Tags 
hw7, package, quadratic programming 
Thread Tools  
Display Modes  

