Dear Yaser,

Thanks a lot for your very interesting and useful course.

I have a question on HW1, Q9.

I implemented PLA. The implementation looks correct: I answered Q7, Q8, and Q10 correctly. More importantly, I can see from a visualization (see below) that the PLA does its job. Here, the dots are the data, the green line corresponds to the target function f, and the 1000 yellow lines correspond to the hypotheses g_i, i = 1, ..., 1000.

In the first case, the average number of iterations is k1 (relatively small); in the second case it is k2 (relatively large), with k2 approximately equal to 10*k1. Based on the first case, I would have to choose one of the 5 possible answers, while based on the second case I would have to choose another. My (submitted) choice turned out to be wrong.

On the other hand, from the above figures, it is intuitively clear that finding g in the second case is more difficult (the yellow area is smaller) and, therefore, that the PLA should take more iterations to converge. In general, isn't it true that the number of iterations depends significantly on the data? Say, it could be k, but it could also be 10*k, depending on the sample x1,...,xN?
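To make the question concrete, here is a minimal sketch of the kind of experiment I mean. It is not my actual code, and several choices in it are assumptions: points drawn uniformly from [-1, 1]^2, a target line through two random points, weights starting at zero, and a randomly picked misclassified point at each update. The function names (random_target, pla, run_experiment) and the max_iter cap are made up for illustration.

```python
import numpy as np

def random_target(rng):
    """Assumed target: the line through two uniform points in [-1, 1]^2,
    returned as weights (w0, w1, w2) for inputs (1, x1, x2)."""
    p, q = rng.uniform(-1, 1, size=(2, 2))
    a = q[1] - p[1]
    b = -(q[0] - p[0])
    c = -(a * p[0] + b * p[1])
    return np.array([c, a, b])

def pla(X, y, rng, max_iter=100_000):
    """Run the PLA from w = 0. Returns (w, number of updates).
    Stops early at max_iter updates as a safety cap (an assumption)."""
    w = np.zeros(X.shape[1])
    for k in range(max_iter):
        # sign(0) = 0 never equals +1 or -1, so such points count as misclassified
        misclassified = np.flatnonzero(np.sign(X @ w) != y)
        if misclassified.size == 0:
            return w, k
        i = rng.choice(misclassified)
        w = w + y[i] * X[i]
    return w, max_iter

def run_experiment(n_points, n_runs, seed=0):
    """Iteration counts over n_runs independent (target, sample) pairs."""
    rng = np.random.default_rng(seed)
    counts = np.empty(n_runs, dtype=int)
    for r in range(n_runs):
        f = random_target(rng)
        X = np.column_stack([np.ones(n_points),
                             rng.uniform(-1, 1, size=(n_points, 2))])
        y = np.where(X @ f >= 0, 1.0, -1.0)
        _, counts[r] = pla(X, y, rng)
    return counts
```

Calling, for example, counts = run_experiment(100, 1000) and then comparing counts.mean() with np.percentile(counts, [10, 50, 90]) would show how far the count for a single sample can sit from the average over many samples.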

Am I missing something?