Thanks for the insightful replies! Indeed, I was choosing the first misclassified point every time. I updated my code to choose one at random instead and produced the following results:

When the misclassified point is chosen at random, the tail past 1000 is visibly smaller and the mean is much closer to the correct answer of 100.
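
For anyone comparing the two selection rules, here is a minimal sketch of the difference, assuming a standard perceptron-style update loop. The actual code is not shown in this thread, so every name here (`run_perceptron`, `misclassified_indices`, the `pick` argument) is hypothetical and only meant to illustrate the idea:

```python
import random


def misclassified_indices(w, points, labels):
    """Indices i where label * (w . x) <= 0, i.e. the point is not yet on the correct side."""
    return [
        i
        for i, (x, y) in enumerate(zip(points, labels))
        if y * sum(wi * xi for wi, xi in zip(w, x)) <= 0
    ]


def run_perceptron(points, labels, pick="random", rng=None, max_iters=100_000):
    """Return (weights, number of updates).

    pick="first"  -> always update on the lowest-index misclassified point
    pick="random" -> update on a uniformly random misclassified point
    Each point is assumed to carry a leading 1 as its bias coordinate.
    """
    rng = rng or random.Random()
    w = [0.0] * len(points[0])
    for updates in range(max_iters):
        bad = misclassified_indices(w, points, labels)
        if not bad:
            return w, updates  # everything is classified correctly
        i = bad[0] if pick == "first" else rng.choice(bad)
        # Perceptron update: move w toward (or away from) the chosen point.
        w = [wi + labels[i] * xi for wi, xi in zip(w, points[i])]
    return w, max_iters  # gave up without converging
```

The only line that differs between the two variants is the one that picks `i`. Running both over many random data sets and comparing the distribution of `updates` is one way to reproduce the kind of comparison described above.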