LFD Book Forum (http://book.caltech.edu/bookforum/index.php)
-   Homework 5 (http://book.caltech.edu/bookforum/forumdisplay.php?f=134)
-   -   Hw5 q8 data permutation (http://book.caltech.edu/bookforum/showthread.php?t=4263)

marek 05-04-2013 06:03 PM

Hw5 q8 data permutation
 
I must be missing something, but I do not understand why we permute the data.

\nabla E_{in} = -\frac{1}{N}\sum_{n=1}^N \frac{y_n x_n}{1+e^{y_n w^{\top} x_n}} treats each data point separately, but then sums them all up. Thus, even if we do permute the data points, in the end it all gets combined together in this sum. What am I overlooking?

yaser 05-04-2013 07:21 PM

Re: Hw5 q8 data permutation
 
Quote:

Originally Posted by marek (Post 10699)
I must be missing something, but I do not understand why we permute the data.

\nabla E_{in} = -\frac{1}{N}\sum_{n=1}^N \frac{y_n x_n}{1+e^{y_n w^{\top} x_n}} treats each data point separately, but then sums them all up. Thus, even if we do permute the data points, in the end it all gets combined together in this sum. What am I overlooking?

True. If we were applying batch mode, permutation would not change anything, since the weight update is done at the end of the epoch and takes all the examples into consideration regardless of the order in which they were presented. In stochastic gradient descent, however, the update is done after each example, so the order changes the outcome. These permutations ensure that the order is randomized, so we get the benefits of randomness that were mentioned briefly in Lecture 9.
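
To make the distinction concrete, here is a minimal sketch in Python/NumPy of the two update modes (the function names, the learning rate eta, and the use of NumPy are illustrative choices, not part of the homework):

import numpy as np

def gradient(w, x, y):
    # Gradient of the logistic error on a single example (x, y):
    # -y*x / (1 + exp(y * w.x))
    return -y * x / (1.0 + np.exp(y * np.dot(w, x)))

def batch_step(w, X, Y, eta=0.01):
    # Batch mode: average the gradient over ALL examples, then update once.
    # Permuting the examples cannot change anything here; the sum is the same.
    g = np.mean([gradient(w, x, y) for x, y in zip(X, Y)], axis=0)
    return w - eta * g

def sgd_epoch(w, X, Y, eta=0.01, rng=None):
    # Stochastic mode: update after EACH example, so the order matters.
    # Permuting the data at the start of every epoch randomizes that order.
    rng = rng if rng is not None else np.random.default_rng()
    for n in rng.permutation(len(Y)):
        w = w - eta * gradient(w, X[n], Y[n])
    return w

After one epoch, batch_step produces the same w no matter how the examples are ordered, while the w returned by sgd_epoch depends on the visiting order, which is why it is reshuffled at the start of every epoch.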

marek 05-04-2013 08:03 PM

Re: Hw5 q8 data permutation
 
Quote:

Originally Posted by yaser (Post 10700)
True. If we were applying batch mode, permutation would not change anything, since the weight update is done at the end of the epoch and takes all the examples into consideration regardless of the order in which they were presented. In stochastic gradient descent, however, the update is done after each example, so the order changes the outcome. These permutations ensure that the order is randomized, so we get the benefits of randomness that were mentioned briefly in Lecture 9.

I was just about to delete my post as I figured out my error. I missed the "stochastic" part and had not yet watched Lecture 10. That's what I get for trying to solve the homework before learning all the material =) Thanks so much for your quick reply!

