Quote:
Originally Posted by mvellon
Each epoch will require a set of N "executions" of the SGD algorithm, running through all the points of its permuted data set. In each of these executions I start with the weight vector initialized to zeros.
In each of these executions, you start with the weight vector that came out of the previous execution. Only the initial weights at the very beginning of the algorithm are set to zero.
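To make the carry-over concrete, here is a minimal sketch in plain Python. It assumes a logistic-regression error with labels of +1/-1, a fixed learning rate `eta`, and a fixed number of epochs; those choices, the function names `sgd_epoch` and `run_sgd`, and the toy data are all made up for illustration and are not taken from the original problem.

```python
import math
import random

def sgd_epoch(w, data, eta, rng):
    """One epoch: visit every point once, in a freshly permuted order."""
    order = list(range(len(data)))
    rng.shuffle(order)
    for i in order:
        x, y = data[i]
        s = y * sum(wj * xj for wj, xj in zip(w, x))
        # Gradient of ln(1 + exp(-y * w.x)) w.r.t. w is (-y / (1 + exp(y * w.x))) * x
        scale = -y / (1.0 + math.exp(s))
        # The updated w is what the next point (the next "execution") starts from
        w = [wj - eta * scale * xj for wj, xj in zip(w, x)]
    return w

def run_sgd(data, eta=0.1, epochs=50, seed=0):
    rng = random.Random(seed)
    w = [0.0] * len(data[0][0])  # zeros only here, once, at the very beginning
    for _ in range(epochs):
        # Each epoch starts from the weights the previous epoch produced
        w = sgd_epoch(w, data, eta, rng)
    return w

# Hypothetical toy data: (x, y) with a leading bias coordinate of 1.0
toy_data = [
    ([1.0, 2.0, 1.0], 1),
    ([1.0, 1.5, 2.0], 1),
    ([1.0, -1.0, -1.5], -1),
    ([1.0, -2.0, -0.5], -1),
]
w_final = run_sgd(toy_data)
```

Note that `w` is set to zeros in exactly one place, outside both loops. Resetting it inside `sgd_epoch` or inside the loop over points would throw away everything learned so far.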
Quote:
I fear I made a wrong turn in Albuquerque.