Quote:
Originally Posted by yaser
True. If we were applying batch mode, permutation would not change anything since the weight update is done at the end of the epoch and takes all the examples into consideration regardless of the order in which they were presented. In stochastic gradient descent, however, the update is done after each example, so the order changes the outcome. These permutations ensure that the order is randomized so we get the benefits of randomness that were mentioned briefly in Lecture 9.

I was just about to delete my post as I figured out my error. I missed the "stochastic" part and had not yet watched Lecture 10. That's what I get for trying to solve the homework before learning all the material =) Thanks so much for your quick reply.
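
For anyone else who trips over the same thing, here is a small sketch of the point in the quoted reply. It is not the homework setup: the one-weight squared-error model, the four data points, the learning rate, and the function names `batch_epoch` and `sgd_epoch` are all made up for illustration. It runs one epoch over the same examples in two different orders, once in batch mode and once with stochastic updates.

```python
def gradient(w, x, y):
    # Gradient of the squared error (w*x - y)**2 with respect to the weight w.
    return 2.0 * (w * x - y) * x

def batch_epoch(w, data, lr):
    # Batch mode: average the gradient over all examples, then update once.
    g = sum(gradient(w, x, y) for x, y in data) / len(data)
    return w - lr * g

def sgd_epoch(w, data, lr):
    # Stochastic mode: update after every example, in the order presented.
    for x, y in data:
        w = w - lr * gradient(w, x, y)
    return w

# Hypothetical toy data; the second ordering is just the first one reversed.
data = [(1.0, 2.0), (2.0, 3.0), (3.0, 7.0), (4.0, 7.5)]
permuted = list(reversed(data))

lr = 0.01  # assumed learning rate
batch_a = batch_epoch(0.0, data, lr)
batch_b = batch_epoch(0.0, permuted, lr)
sgd_a = sgd_epoch(0.0, data, lr)
sgd_b = sgd_epoch(0.0, permuted, lr)

print("batch:", batch_a, batch_b)  # same weight up to floating-point rounding
print("sgd:  ", sgd_a, sgd_b)      # weights differ between the two orderings
```

The batch result depends only on the sum of the per-example gradients, so the ordering drops out. In the stochastic version each gradient is evaluated at a weight that already reflects the earlier examples, which is why a fresh permutation each epoch matters.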