Perceptron: should w_0 (bias) be updated?
The zeroth term is just a clever way to simplify the notation by folding the threshold/bias into the sum as one more term. The value of the threshold/bias, however, is not an observed quantity; it was chosen. So I am assuming that when updating the weights, we should NOT update the zeroth weight (the threshold/bias). Is this correct?
Thanks, Fred 
Re: Perceptron: should w_0 (bias) be updated?
Actually, w0 never converges to anything meaningful using the w = w + y*x update rule, because x0 = 1 for all the data points and y is constrained to be either -1 or +1... so w0 just flips between 1 and zero forever without converging.
If you print out your w0 values as the PLA algorithm progresses, you can see this happening.
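
A minimal sketch of the experiment suggested above, recording w0 after every update. The data setup (target line through two random points in [-1, 1]^2, N = 20, w starting at zero, a randomly chosen misclassified point per update) and all function names are assumptions for illustration, not taken from the posts or the homework.

```python
import random

def sign(v):
    # sign(0) is 0, so the all-zero starting w misclassifies every point
    return (v > 0) - (v < 0)

def make_data(n, rng):
    # Assumed setup: target line through two random points in [-1, 1]^2
    ax, ay, bx, by = (rng.uniform(-1, 1) for _ in range(4))
    f = (ax * by - ay * bx, ay - by, bx - ax)  # f0 + f1*x1 + f2*x2 = 0 on the line
    xs = [(1.0, rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(n)]  # x0 = 1
    ys = [1 if sum(fj * xj for fj, xj in zip(f, x)) >= 0 else -1 for x in xs]
    return xs, ys

def pla_trace(xs, ys, rng, max_iters=100000):
    # Update ALL weights, including w0, with w = w + y*x
    w = [0.0, 0.0, 0.0]
    trace = [w[0]]  # w0 before any update, then after each update
    for _ in range(max_iters):
        wrong = [i for i, x in enumerate(xs)
                 if sign(sum(wj * xj for wj, xj in zip(w, x))) != ys[i]]
        if not wrong:
            return w, trace, True
        i = rng.choice(wrong)
        w = [wj + ys[i] * xj for wj, xj in zip(w, xs[i])]
        trace.append(w[0])
    return w, trace, False  # gave up after max_iters updates

rng = random.Random(0)
xs, ys = make_data(20, rng)
w, trace, converged = pla_trace(xs, ys, rng)
print("w0 after each update:", trace)
print("final w:", w, "converged:", converged)
```

Because x0 = 1 and y is -1 or +1, every update moves w0 by exactly one unit; the trace shows which way it moved each time.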
Re: Perceptron: should w_0 (bias) be updated?
An interesting question would be what the distribution of the values w0 can reach looks like. I don't think it's a simple random walk, because I believe the farther out it gets, the less likely it is to step farther from 0. Maybe a Gaussian? That doesn't seem exact either; looking at a few of these, they seem to be skewed rather than symmetric, but I didn't look at a lot of samples, so this could just be normal random variation.
Re: Perceptron: should w_0 (bias) be updated?
Yeah, I saw the same thing as I kept experimenting with it.
So the problem is that w0 does a random walk over the integers, without ever converging to a meaningful value, at least if you use a starting value of 0. Since w1 and w2 determine the orientation of the dividing line between the positive and negative points, and w0 determines its location relative to the origin, it seems to me that this update rule can never find a good solution if the true dividing line does not pass through (x1=0, x2=0).
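
On the geometry mentioned above: the dividing line is w0 + w1*x1 + w2*x2 = 0, and its distance from the origin is |w0| / sqrt(w1^2 + w2^2). So the location depends on w0 relative to the size of (w1, w2), not on w0 alone. A small sketch, with a hypothetical helper name:

```python
import math

def line_offset(w):
    """Distance from the origin to the line w0 + w1*x1 + w2*x2 = 0."""
    w0, w1, w2 = w
    return abs(w0) / math.hypot(w1, w2)

# Scaling the whole vector leaves the line where it was
print(line_offset((1, 3, 4)))
print(line_offset((2, 6, 8)))
# Same w0, larger (w1, w2): the line sits closer to the origin
print(line_offset((1, 30, 40)))
```

Since w1 and w2 also change with every update, integer-sized steps in w0 do not by themselves fix where the line can end up.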
Re: Perceptron: should w_0 (bias) be updated?
Re: Perceptron: should w_0 (bias) be updated?
Right, so the obvious solution is to normalize w after the update. This causes w0 to converge along with w1 and w2. I actually implemented this in my solution for the homework submission, but it affects the number of iterations required for a given initial dividing line, depending on how far that line is from the origin. In general, after implementing normalization, the number of iterations required for convergence went up. As a result I got different answers for 7 and 9.

Re: Perceptron: should w_0 (bias) be updated?
It is a mistake to talk about w0 converging on its own. It is the vector w as a whole that converges. w should not be normalized after each update, because doing so alters the relative scale of the error adjustments performed with each iteration. I suspect that this could result in cases where convergence would fail to occur even for a linearly separable training set.
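
One way to compare the two variants discussed above is to run both on the same data and count updates. This is only a sketch: the data setup, the seeds, the `normalize` flag, and the 5000-update cap are assumptions for illustration, and it does not settle whether the normalized variant always converges.

```python
import math
import random

def sign(v):
    return (v > 0) - (v < 0)  # sign(0) is 0

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def make_data(n, rng):
    # Assumed setup: target line through two random points in [-1, 1]^2
    ax, ay, bx, by = (rng.uniform(-1, 1) for _ in range(4))
    f = (ax * by - ay * bx, ay - by, bx - ax)
    xs = [(1.0, rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(n)]
    ys = [1 if dot(f, x) >= 0 else -1 for x in xs]
    return xs, ys

def pla(xs, ys, rng, normalize=False, max_iters=5000):
    """Return (w, number of updates, converged flag)."""
    w = [0.0, 0.0, 0.0]
    for t in range(max_iters):
        wrong = [i for i, x in enumerate(xs) if sign(dot(w, x)) != ys[i]]
        if not wrong:
            return w, t, True
        i = rng.choice(wrong)
        w = [wj + ys[i] * xj for wj, xj in zip(w, xs[i])]
        if normalize:
            # Rescale w to unit length; the next step y*x is then
            # large relative to w, unlike in the plain rule
            norm = math.sqrt(dot(w, w))
            if norm > 0:
                w = [wj / norm for wj in w]
    return w, max_iters, False

xs, ys = make_data(10, random.Random(1))
for flag in (False, True):
    w, updates, ok = pla(xs, ys, random.Random(2), normalize=flag)
    print("normalize =", flag, "updates =", updates, "converged =", ok)
```

Both runs use the same seed for picking misclassified points, so differences in the update count come from the normalization step.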

Re: Perceptron: should w_0 (bias) be updated?
I found that for N=100, the average number of iterations to get the final w was around 2000. On 4 out of 1000 cases, the number of iterations exceeded 100,000. Did others find similar behavior? Or is there a bug in my code?
For N=10, it took about 30 iterations. 
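
For comparing numbers like these, here is a rough sketch of how the average could be measured. The target construction, the random choice of misclassified point, the seeds, and the cap on updates are all assumptions; a different setup (for example, always picking the first misclassified point) can give different averages.

```python
import random

def sign(v):
    return (v > 0) - (v < 0)  # sign(0) is 0

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def make_data(n, rng):
    # Assumed setup: target line through two random points in [-1, 1]^2
    ax, ay, bx, by = (rng.uniform(-1, 1) for _ in range(4))
    f = (ax * by - ay * bx, ay - by, bx - ax)
    xs = [(1.0, rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(n)]
    ys = [1 if dot(f, x) >= 0 else -1 for x in xs]
    return xs, ys

def count_updates(xs, ys, rng, max_iters):
    # Plain PLA from w = 0; returns the number of updates performed
    w = [0.0, 0.0, 0.0]
    for t in range(max_iters):
        wrong = [i for i, x in enumerate(xs) if sign(dot(w, x)) != ys[i]]
        if not wrong:
            return t
        i = rng.choice(wrong)
        w = [wj + ys[i] * xj for wj, xj in zip(w, xs[i])]
    return max_iters  # capped; this run did not finish

def average_updates(n, runs, seed=0, max_iters=100000):
    rng = random.Random(seed)
    total = 0
    for _ in range(runs):
        xs, ys = make_data(n, rng)  # fresh target and points each run
        total += count_updates(xs, ys, rng, max_iters)
    return total / runs

print("N=10, average updates:", average_updates(10, runs=100))
```

Calling `average_updates(100, runs=...)` gives the N=100 figure; in pure Python that is noticeably slower, so a smaller `runs` value is a reasonable first check.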
Re: Perceptron: should w_0 (bias) be updated?
Since I've not seen your code, I cannot say with certainty that it has a defect; however, the results you have reported indicate an extremely high probability that such is the case. It is also possible, though highly improbable, that you were extraordinarily unlucky and repeatedly drew random values that skewed your results.

The contents of this forum are to be used ONLY by readers of the Learning From Data book by Yaser S. Abu-Mostafa, Malik Magdon-Ismail, and Hsuan-Tien Lin, and participants in the Learning From Data MOOC by Yaser S. Abu-Mostafa. No part of these contents is to be communicated or made accessible to ANY other person or entity.