#1




HW2.5: does location of split boundaries matter?
Hi,
First, a bit of background. I'm a 41-year-old software engineer who is loooong out of school and would like to learn a bit about machine learning. Having said that:
1) What programming languages/tools are people generally using for their homework experiments? I saw one reference to MATLAB, but I'm not sure I can get that. I'm not using anything visual right now, which makes things a bit tougher, so if someone has a recommendation, that would be great.
2) Re: HW2.5: If I understood the terminology correctly, I'm supposed to pick 2 random points in the x,y plane that lie within the square with corners (-1, -1), (-1, 1), (1, -1), and (1, 1). I then connect the dots, get a line, and that line is my separator; I then generate all my data and run the different samples against it. My problem is this: I've noticed in my program that if the line splits the area 50/50 (I can force this by making the line pass through (0,0)), I obtain one answer, but when the split is very different (say 80/20), I get a different answer. Is this expected, or do I have a bug in my program?
Thanks in advance
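For anyone who wants to check the setup described in 2), here is a minimal Python sketch of it. It assumes the square is [-1, 1] x [-1, 1] and that points are drawn uniformly; the function names (`random_target`, `label`, `positive_fraction`) are made up for illustration and are not part of the assignment. It also estimates how unevenly a given line splits the square, which is the 50/50 versus 80/20 effect in question.

```python
import random

def random_target(rng):
    """Weights (w0, w1, w2) of the line through two random points of the square."""
    x1, y1 = rng.uniform(-1, 1), rng.uniform(-1, 1)
    x2, y2 = rng.uniform(-1, 1), rng.uniform(-1, 1)
    # Line through (x1, y1) and (x2, y2) written as a*x + b*y + c = 0.
    a = y2 - y1
    b = x1 - x2
    c = -(a * x1 + b * y1)
    return (c, a, b)

def label(w, x, y):
    """Return +1 or -1 depending on which side of the line (x, y) falls."""
    return 1 if w[0] + w[1] * x + w[2] * y > 0 else -1

def positive_fraction(w, rng, n=10000):
    """Monte Carlo estimate of the share of the square labelled +1."""
    hits = sum(label(w, rng.uniform(-1, 1), rng.uniform(-1, 1)) == 1
               for _ in range(n))
    return hits / n

rng = random.Random(0)  # fixed seed only so that reruns are repeatable
w = random_target(rng)
print("share of the square labelled +1:", positive_fraction(w, rng))
```

Printing this share for a few different seeds is a quick way to see whether the runs that give different answers are also the ones with a lopsided split.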
#2




Re: HW2.5: does location of split boundaries matter?
I use Octave, which is a free, open-source analogue of MATLAB.
I was also wondering about the dependence of my answer on the particular function f. The question is worded in such a way that it seems we only need to determine f once, and then generate 1000 random data sets and 1000 approximations g for that one f. If this is indeed the case, I also observe significant differences in the average in-sample and out-of-sample errors depending on f. But if I randomly generate 1000 f's, form one random data set for each, and calculate one approximation g per f, the variability in the average results practically disappears.
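The two readings of the question can be compared side by side with a short Python sketch. Everything here is an assumption made for illustration: the thread does not say which algorithm produces g, so least-squares linear regression is used as a stand-in; N = 100 points per data set is a guess; only the in-sample error is measured; and all function names are hypothetical.

```python
import random

def random_target(rng):
    # Line through two random points of [-1, 1] x [-1, 1], as weights (w0, w1, w2).
    x1, y1, x2, y2 = (rng.uniform(-1, 1) for _ in range(4))
    a, b = y2 - y1, x1 - x2
    return (-(a * x1 + b * y1), a, b)

def sign(w, p):
    return 1 if w[0] + w[1] * p[0] + w[2] * p[1] > 0 else -1

def det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def fit_linear_regression(points, labels):
    # Solve the normal equations (X^T X) w = X^T y with Cramer's rule (3 unknowns).
    rows = [(1.0, x, y) for x, y in points]
    A = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    b = [sum(r[i] * t for r, t in zip(rows, labels)) for i in range(3)]
    d = det3(A)
    w = []
    for k in range(3):
        # Replace column k of A by b, as Cramer's rule prescribes.
        Ak = [[b[i] if j == k else A[i][j] for j in range(3)] for i in range(3)]
        w.append(det3(Ak) / d)
    return tuple(w)

def in_sample_error(f, rng, n=100):
    pts = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(n)]
    ys = [sign(f, p) for p in pts]
    g = fit_linear_regression(pts, ys)
    return sum(sign(g, p) != t for p, t in zip(pts, ys)) / n

def average_error(rng, runs=1000, fresh_target=True):
    f = random_target(rng)
    total = 0.0
    for _ in range(runs):
        if fresh_target:
            f = random_target(rng)  # new f for every data set
        total += in_sample_error(f, rng)
    return total / runs

rng = random.Random(0)
print("fresh f each run:", average_error(rng, fresh_target=True))
for trial in range(3):
    print("one fixed f:     ", average_error(rng, fresh_target=False))
```

If the observation above holds, the "one fixed f" averages should differ noticeably from one trial to the next, while repeating the "fresh f each run" average should give nearly the same number each time.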
#3




Re: HW2.5: does location of split boundaries matter?
1) I recommend GNU Octave, which is (a) free, (b) mostly compatible with MATLAB, (c) command-line oriented, (d) multi-platform and (e) used in several Coursera.org courses, including their upcoming Machine Learning class. I just picked it up last week and I found it pretty easy once I got over the initial learning curve.
I found this tutorial pretty useful: OctaveMatlabTutorial
2) I think you got it right, and I would not be surprised if the results are different when the proportion between the two areas varies. That should not be a problem, since the assignment calls for running multiple experiments (each with a different separator line and different training points).
#4




Re: HW2.5: does location of split boundaries matter?
I kinda expected the difference in results too based on some points made in lecture 3. So I'm not surprised by this.
But I have the same issue as ellka. In the precursor part of the problem, it says to set up the boundary. The "rinse, lather, repeat" step comes only after this one-time boundary setup. Without hearing anything from the instructors, I'm going to randomize the boundary in each of the 1000 runs and test my results (I haven't done this yet). But I hope I get some clarification! Thanks!
#5




Re: HW2.5: does location of split boundaries matter?
Hmmm...I just ran the tests where I randomized where the boundary line was before I ran each N sample. I now get a consistent value (which I was sure I would), but the value isn't what I would consider to be "close" to any of the answers given.
Sure, I could choose the closest one, but either I have a bug in my Python script (non-GUI Python, lol) or I'm misinterpreting the question. I probably should take the time to learn Octave, but sadly I don't even know MATLAB and it'll be a bit tough to get this assignment done on time. Oh well. At least I'll get to learn stuff.
#6




Re: HW2.5: does location of split boundaries matter?
For the first assignment I just used Excel. LibreOffice Calc is an open-source alternative if you don't like Microsoft. I don't know if a spreadsheet will be sophisticated enough for the other assignments.

#7




Re: HW2.5: does location of split boundaries matter?
You can use either R or Octave.

#8




Re: HW2.5: does location of split boundaries matter?
I am glad you asked this question; the responses have been helpful.
I wrote a Java program for the first homework assignment (various classes for the problems) using the Eclipse IDE. It worked pretty well, although for the PLA exercise there were some subtle bugs in the logic that were hard to track down. I ended up writing a second program for this exercise which is visual: it shows a panel with the target function and the random points. You can step through the iterations or just have it run. If you step through, you will see the current hypothesis function (drawn as a line based on the current weights) and which points are misclassified at each step. Not sure about the new homework, as Java doesn't supply much in the way of matrix support, but I'm looking for matrix packages.
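The step-through idea carries over to a text-only setup as well. Below is a rough Python sketch, not the Java program described above: a generator that yields the current weights and the indices of the misclassified points before each perceptron update, so each iteration can be inspected or printed. The target line, the point count and the name `pla_steps` are hypothetical choices for the example.

```python
import random

def predict(w, p):
    # Points exactly on the line are treated as -1 in this sketch.
    return 1 if w[0] + w[1] * p[0] + w[2] * p[1] > 0 else -1

def pla_steps(points, labels, rng, max_iters=1000):
    """Yield (weights, misclassified_indices) before each PLA update."""
    w = [0.0, 0.0, 0.0]
    for _ in range(max_iters):
        wrong = [i for i, (p, t) in enumerate(zip(points, labels))
                 if predict(w, p) != t]
        yield tuple(w), wrong
        if not wrong:
            return  # every point is classified correctly
        i = rng.choice(wrong)  # pick one misclassified point at random
        x, t = points[i], labels[i]
        # Perceptron update: w <- w + t * (1, x, y)
        w[0] += t
        w[1] += t * x[0]
        w[2] += t * x[1]

rng = random.Random(0)
target = (0.1, 1.0, -1.0)  # hypothetical target line, chosen for the demo
points = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(10)]
labels = [predict(target, p) for p in points]
for step, (w, wrong) in enumerate(pla_steps(points, labels, rng)):
    print(step, w, "misclassified:", wrong)
```

Printing the misclassified set at every step makes it easier to spot logic bugs such as updating on a point that is already classified correctly.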
#9




Re: HW2.5: does location of split boundaries matter?
I also had trouble with the matrix calculations. I have access to MATLAB, but I don't know how to use it. I prefer to do these exercises in Objective-C using the iPhone simulator, and fortunately the iPhone SDK includes the CBLAS and LAPACK libraries, which are widely available for gcc as well.
I used the cblas_dgemm function for matrix multiplication and the combination of dgetrf_ and dgetri_ (from LAPACK) for calculating the inverse of a matrix. It took a while to figure out what values to put into the multitude of parameters that these functions require, but I managed it. It sure took a lot less time than writing out the matrix operations from scratch, though! 
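For comparison, here is what the "from scratch" route looks like in plain Python for small matrices. This is only a sketch: it uses Gauss-Jordan elimination with partial pivoting rather than the LU factorization that dgetrf_ and dgetri_ are built on, and the final step, which combines the product and the inverse into (X^T X)^-1 X^T to get regression weights, is an assumption about what the matrix operations were needed for. The helper names are hypothetical.

```python
def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def inverse(a):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    # Build the augmented matrix [A | I].
    aug = [list(map(float, row)) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        # Pick the row with the largest entry in this column as the pivot.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular or nearly singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * pv for v, pv in zip(aug[r], aug[col])]
    # The right half of the augmented matrix now holds the inverse.
    return [row[n:] for row in aug]

def pseudo_inverse(x):
    """(X^T X)^-1 X^T, valid when the columns of X are linearly independent."""
    xt = transpose(x)
    return matmul(inverse(matmul(xt, x)), xt)

# Tiny example: four points (with a leading 1 for the bias term) and targets.
X = [[1.0, 0.0, 0.0], [1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [1.0, 1.0, 1.0]]
y = [[0.0], [1.0], [2.0], [3.0]]
w = matmul(pseudo_inverse(X), y)
print(w)
```

For anything beyond toy sizes a tuned library such as the CBLAS/LAPACK routines mentioned above is the better choice; the sketch is only meant to show what those calls compute.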
#10




Re: HW2.5: does location of split boundaries matter?
I am using Python for the homework.
I learned it while taking Udacity's AI for robotics course. Prof. Thrun from Udacity provided the routines for a matrix class, which is very helpful for doing inversions. For visualisation I use Pylab in Python.