Quote:
Originally Posted by yaser
That would be f (the target function). The symbol g is reserved for the final hypothesis that the PLA will produce (which should be close to f).
The initial function must be a perceptron rather than a random assignment of ±1's.
If you start with a zero weight vector, and take sign(0) = 0, pick any point for the first iteration. When you update the weight vector, the update uses the target y_n, so that won't be zero.
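To make the zero-start case concrete, here is a minimal sketch of PLA in plain Python. It is only an illustration under assumptions that are not from the course materials: the target weights `F_WEIGHTS`, the 10 random points in [-1, 1] x [-1, 1], the seeds, and the helper names `sign`, `dot`, and `pla` are all made up for this example. It assumes the convention sign(0) = 0, so every point counts as misclassified on the first iteration.

```python
import random

def sign(v):
    # Assumed convention: sign(0) = 0, which never matches a +1/-1 label.
    return (v > 0) - (v < 0)

def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

def pla(points, labels, max_updates=100_000, seed=0):
    """Run PLA from a zero weight vector; return (weights, update count)."""
    rng = random.Random(seed)
    w = [0.0] * len(points[0])
    for updates in range(max_updates):
        # Indices of points the current perceptron gets wrong.
        wrong = [i for i, x in enumerate(points) if sign(dot(w, x)) != labels[i]]
        if not wrong:
            return w, updates
        n = rng.choice(wrong)
        # The update uses the target label y_n, not the current output,
        # so the weights become nonzero after the first step.
        w = [wi + labels[n] * xi for wi, xi in zip(w, points[n])]
    raise RuntimeError("no convergence within max_updates")

# Hypothetical target f and training set (illustrative values only).
F_WEIGHTS = [0.1, 1.0, -1.0]
data_rng = random.Random(1)
points = [[1.0, data_rng.uniform(-1, 1), data_rng.uniform(-1, 1)] for _ in range(10)]
labels = [1 if dot(F_WEIGHTS, x) > 0 else -1 for x in points]

w, updates = pla(points, labels)
print("final weights:", w, "after", updates, "updates")
```

Each point carries a constant first coordinate of 1, so after the first update the bias weight is already +1 or -1 and the perceptron is no longer the zero function.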

Thanks for the clarification. That helped quite a bit.
I think it would be interesting if we can all input our actual numbers and
you later show a histogram of what people entered on their homework
solutions. ;)
I'm shocked by the speed with which PLA converges. I never would have
guessed that until I actually coded it up. This is a very interesting and
intellectually satisfying exercise!
I'm having a hard time deciding how to answer the multiple-choice Q 7-10.
The answer depends on whether I use log or linear scaling.
Aren't CS algorithm efficiencies usually classified in log scaling?
Or am I overthinking this?
If an algorithm always converges, would Pr(f(x) ≠ g(x)) = 0?
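One way to probe that last question numerically: convergence of PLA only guarantees agreement with f on the training points, so the disagreement probability has to be measured on fresh points. Below is a rough Monte Carlo sketch. The two weight vectors `f_w` and `g_w`, the uniform sampling over [-1, 1] x [-1, 1], and the function names are assumptions chosen for illustration, not values from the homework.

```python
import random

def classify(weights, x1, x2):
    # Perceptron output with a bias term; ties are sent to -1 here.
    v = weights[0] + weights[1] * x1 + weights[2] * x2
    return 1 if v > 0 else -1

def estimate_disagreement(f_w, g_w, n_samples=10_000, seed=0):
    """Estimate Pr(f(x) != g(x)) by sampling fresh points uniformly."""
    rng = random.Random(seed)
    differ = 0
    for _ in range(n_samples):
        x1, x2 = rng.uniform(-1, 1), rng.uniform(-1, 1)
        if classify(f_w, x1, x2) != classify(g_w, x1, x2):
            differ += 1
    return differ / n_samples

# Hypothetical target f and a nearby hypothesis g (illustrative values).
f_w = [0.1, 1.0, -1.0]
g_w = [0.0, 1.0, -1.0]

estimate = estimate_disagreement(f_w, g_w)
print("estimated Pr(f(x) != g(x)):", estimate)
```

With these made-up weights the two lines are parallel and the disagreement region is a thin strip between them, so the estimate comes out small but not zero. A g that fits every training point can still sit in that situation.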