LFD Book Forum HW2.5: does location of split boundaries matter?

#1
04-12-2012, 10:36 PM
 learnaholic Member Join Date: Apr 2012 Posts: 22
HW2.5: does location of split boundaries matter?

Hi,

First, a bit of background. I'm a 41-year-old software engineer who is loooong out of school and would like to learn a bit about machine learning. Having said that:

1) What programming languages/tools are people generally using for their homework experiments? I saw one reference to MATLAB, but I'm not sure I can get that. I'm not using anything visual right now, which makes things a bit tougher, so if someone has a recommendation, that would be great.

2) Re: hw2-5: If I understood the terminology correctly, I'm supposed to pick 2 random points in the x,y plane which are enclosed within the square (-1, -1), (-1, 1), (1, 1), and (1, -1). I then connect the dots, get a line, and that is my separator, and then I obtain all my data and run different samples based on this line.

My problem though is this: I've noticed in my program that if the area is split 50/50 (I can force this by making the line pass through (0,0)), I obtain one solution, but when the area has a much different split (say 80/20), I get a different answer. Is this expected, or do I have a bug in my program?
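
One way to check how lopsided a given separator is: the sketch below builds a line through two random points in the square and estimates what fraction of the square falls on its +1 side. The function names (`line_through`, `random_target`, `label`, `positive_fraction`) are made up for illustration and are not part of the assignment.

```python
import random

def line_through(p, q):
    """Return (w0, w1, w2) so that w0 + w1*x + w2*y = 0 is the line through p and q."""
    (x1, y1), (x2, y2) = p, q
    a = y2 - y1            # coefficient of x
    b = -(x2 - x1)         # coefficient of y
    c = -(a * x1 + b * y1) # constant term, chosen so p lies on the line
    return (c, a, b)

def random_target(rng):
    """Pick two uniform random points in [-1, 1] x [-1, 1] and return their line."""
    p = (rng.uniform(-1, 1), rng.uniform(-1, 1))
    q = (rng.uniform(-1, 1), rng.uniform(-1, 1))
    return line_through(p, q)

def label(w, x, y):
    """Classify a point as +1 or -1 depending on which side of the line it is on."""
    return 1 if w[0] + w[1] * x + w[2] * y > 0 else -1

def positive_fraction(w, rng, n=20000):
    """Monte Carlo estimate of the share of the square labelled +1."""
    hits = sum(1 for _ in range(n)
               if label(w, rng.uniform(-1, 1), rng.uniform(-1, 1)) == 1)
    return hits / n

if __name__ == "__main__":
    rng = random.Random()
    w = random_target(rng)
    print("weights:", w, "share labelled +1:", positive_fraction(w, rng))
```

Printing the share for a handful of random lines shows how often the split is far from 50/50.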

#2
04-13-2012, 06:54 AM
 elkka Invited Guest Join Date: Apr 2012 Posts: 57
Re: HW2.5: does location of split boundaries matter?

I use Octave, which is a free, open-source analog of MATLAB.

I was also wondering about the dependence of my answer on the particular function f. The question is worded in such a way that it seems we only need to determine f once, and then generate 1000 random data sets and 1000 approximations g for that one f. If this is indeed the case, I also observe significant differences in the average in-sample and out-of-sample errors depending on f.

But if I randomly generate 1000 f's, form one random data set for each, and calculate one approximation g for each, the variability in the average results practically disappears.
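
The two readings can be compared side by side. The sketch below assumes the learning algorithm is linear regression used for classification (the later posts about matrix inversion suggest this, but the thread never states it) and uses N = 100 points per run, which is also an assumption. All function names are made up for illustration.

```python
import random

def make_target(rng):
    """Weights (w0, w1, w2) of the line through two random points in the square."""
    x1, y1, x2, y2 = (rng.uniform(-1, 1) for _ in range(4))
    a, b = y2 - y1, x1 - x2
    return (-(a * x1 + b * y1), a, b)

def sign(w, p):
    return 1 if w[0] + w[1] * p[0] + w[2] * p[1] > 0 else -1

def solve3(A, b):
    """Solve a 3x3 linear system by Gauss-Jordan elimination with partial pivoting."""
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(3):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [mr - factor * mc for mr, mc in zip(M[r], M[col])]
    return [M[i][3] / M[i][i] for i in range(3)]

def linreg(points, labels):
    """Least-squares weights from the normal equations (X^T X) w = X^T y."""
    X = [(1.0, x, y) for x, y in points]
    XtX = [[sum(r[i] * r[j] for r in X) for j in range(3)] for i in range(3)]
    Xty = [sum(r[i] * t for r, t in zip(X, labels)) for i in range(3)]
    return solve3(XtX, Xty)

def e_in(f, rng, n=100):
    """In-sample classification error of the regression fit on one random data set."""
    pts = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(n)]
    ys = [sign(f, p) for p in pts]
    g = linreg(pts, ys)
    return sum(sign(g, p) != y for p, y in zip(pts, ys)) / n

def average_e_in(rng, runs=1000, n=100, fixed_target=None):
    """Average E_in over runs; draws a fresh f per run unless fixed_target is given."""
    total = 0.0
    for _ in range(runs):
        f = fixed_target if fixed_target is not None else make_target(rng)
        total += e_in(f, rng, n)
    return total / runs

if __name__ == "__main__":
    rng = random.Random()
    print("fresh f each run:", average_e_in(rng))
    for _ in range(3):  # a few different fixed targets, to see the spread
        print("one fixed f:     ", average_e_in(rng, fixed_target=make_target(rng)))
```

Running it a few times shows whether the fixed-f averages move around more than the fresh-f average does.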
#3
04-13-2012, 09:02 AM
 SamK52 Member Join Date: Apr 2012 Posts: 25
Re: HW2.5: does location of split boundaries matter?

1) I recommend GNU Octave, which is (a) free, (b) mostly compatible with MATLAB, (c) command-line oriented, (d) multi-platform and (e) used in several Coursera.org courses, including their upcoming Machine Learning class. I just picked it up last week and I found it pretty easy once I got over the initial learning curve.

I found this tutorial pretty useful:

Octave-Matlab-Tutorial

2) I think you got it right, and I would not be surprised if the results differ when the proportion between the two areas varies. That should not be a problem, since the assignment calls for running multiple experiments (each with a different separator line and different training points).
#4
04-13-2012, 11:15 AM
 learnaholic Member Join Date: Apr 2012 Posts: 22
Re: HW2.5: does location of split boundaries matter?

I kinda expected the difference in results too based on some points made in lecture 3. So I'm not surprised by this.

But I have the same issue as elkka. In the precursor part of the problem, it says to set up the boundary. The "rinse, lather, repeat" step comes only after this one-time boundary step is set up.

Without hearing anything from the instructors, I'm going to randomize the boundary in the N=1000 steps and test my results (I haven't done this yet). But I hope I get some clarification!

Thanks!
#5
04-13-2012, 11:55 AM
 learnaholic Member Join Date: Apr 2012 Posts: 22
Re: HW2.5: does location of split boundaries matter?

Hmmm...I just ran the tests where I randomized where the boundary line was before I ran each N sample. I now get a consistent value (which I was sure I would), but the value isn't what I would consider to be "close" to any of the answers given.

Sure, I could choose the closest one, but either I have a bug in my Python script (non-GUI Python, lol) or I'm misinterpreting the question.

I probably should take the time to learn Octave, but sadly I don't even know MATLAB and it'll be a bit tough to get this assignment done on time. Oh well. At least I'll get to learn stuff.
#6
04-13-2012, 12:09 PM
 Tyler Junior Member Join Date: Apr 2012 Posts: 9
Re: HW2.5: does location of split boundaries matter?

For the first assignment I just used Excel. LibreOffice Calc is an open-source alternative if you don't like Microsoft. I don't know if this will be sophisticated enough to do the other assignments.
#7
04-13-2012, 05:17 PM
 ManUtd Junior Member Join Date: Apr 2012 Posts: 6
Re: HW2.5: does location of split boundaries matter?

You can use either R or Octave.
#8
04-14-2012, 07:38 AM
 alfansome Member Join Date: Apr 2012 Posts: 35
Re: HW2.5: does location of split boundaries matter?

I wrote a Java program for the first homework assignment (various classes for the problems) using the Eclipse IDE. It worked pretty well, although for the PLA exercise there were some subtle bugs in the logic that were hard to track down. I ended up writing a second, visual program for this exercise: it shows a panel with the target function and the random points. You can step through the iterations or just let it run. If you step through, you will see the current hypothesis function (drawn as a line based on the current weights) and which points are misclassified at each step.

Not sure about the new homework, as Java doesn't supply much in the way of matrix support, but I'm looking for matrix packages.
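
The step-through idea does not need a GUI. Below is a rough Python sketch (not the Java program described here) of a perceptron learning loop written as a generator, so each iteration exposes the current weights and the indices of the misclassified points. The names `classify` and `pla_steps` and the choice of a random misclassified point per update are assumptions for illustration.

```python
import random

def classify(w, p):
    """Current hypothesis: sign of w0 + w1*x + w2*y."""
    return 1 if w[0] + w[1] * p[0] + w[2] * p[1] > 0 else -1

def pla_steps(points, labels, rng, max_iters=10000):
    """Yield (step, weights, misclassified indices) before each PLA update."""
    w = [0.0, 0.0, 0.0]
    for step in range(max_iters):
        wrong = [i for i, (p, y) in enumerate(zip(points, labels))
                 if classify(w, p) != y]
        yield step, tuple(w), wrong
        if not wrong:
            return  # converged: every point is classified correctly
        i = rng.choice(wrong)  # pick one misclassified point at random
        (x, y), t = points[i], labels[i]
        # Perceptron update: w <- w + t * (1, x, y)
        w[0] += t
        w[1] += t * x
        w[2] += t * y

if __name__ == "__main__":
    pts = [(-0.5, -0.5), (0.5, 0.5), (-0.8, 0.1), (0.7, -0.2)]
    ys = [-1, 1, -1, 1]
    for step, w, wrong in pla_steps(pts, ys, random.Random()):
        print(step, w, "misclassified:", wrong)
```

Printing the misclassified set at every step makes it easier to spot the kind of subtle logic bugs mentioned above.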
#9
04-15-2012, 12:34 AM
 kurts Invited Guest Join Date: Apr 2012 Location: Portland, OR Posts: 70
Re: HW2.5: does location of split boundaries matter?

I also had trouble with the matrix calculations. I have access to Matlab, but I don't know how to use it. I prefer to do these exercises in Objective-C and use the iPhone simulator, and fortunately the iPhone SDK has the CBLAS and LAPACK libraries available, which are widely available for gcc as well.

I used the cblas_dgemm function for matrix multiplication and the combination of dgetrf_ and dgetri_ (from LAPACK) for calculating the inverse of a matrix.

It took a while to figure out what values to put into the multitude of parameters that these functions require, but I managed it. It sure took a lot less time than writing out the matrix operations from scratch, though!
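
For anyone without LAPACK at hand, here is what those two operations compute, as a plain-Python sketch. It uses Gauss-Jordan elimination rather than the LU factorization that dgetrf_/dgetri_ perform, so it is a stand-in, not a port. The `pseudo_inverse` helper assumes the matrix has full column rank, and all function names are made up for illustration.

```python
def transpose(A):
    return [list(row) for row in zip(*A)]

def matmul(A, B):
    """Plain matrix product of two lists-of-rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def inverse(A):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    # Augment A with the identity matrix: [A | I]
    M = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
         for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular (or nearly so)")
        M[col], M[pivot] = M[pivot], M[col]
        p = M[col][col]
        M[col] = [v / p for v in M[col]]  # scale the pivot row to 1
        for r in range(n):
            if r != col:
                f = M[r][col]
                M[r] = [vr - f * vc for vr, vc in zip(M[r], M[col])]
    return [row[n:] for row in M]  # right half now holds the inverse

def pseudo_inverse(X):
    """(X^T X)^-1 X^T, valid when X has full column rank."""
    Xt = transpose(X)
    return matmul(inverse(matmul(Xt, X)), Xt)

if __name__ == "__main__":
    print(inverse([[4, 7], [2, 6]]))
    print(pseudo_inverse([[1, 0], [1, 1], [1, 2]]))
```

For small matrices like the ones in these homework problems this is fast enough; a tuned library is the better choice for anything larger.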
#10
04-15-2012, 03:43 AM
 qitiq Junior Member Join Date: Apr 2012 Location: Belgium Posts: 5
Re: HW2.5: does location of split boundaries matter?

I am using Python for the homework.
I learned it while taking Udacity's AI for Robotics course.
Prof. Thrun from Udacity provided the routines for a matrix class. Very helpful for doing inversions.

For visualisation I use Pylab in Python.
