LFD Book Forum (http://book.caltech.edu/bookforum/index.php)
-   Homework 6 (http://book.caltech.edu/bookforum/forumdisplay.php?f=135)
-   -   Clarification on HW6-Q8 (http://book.caltech.edu/bookforum/showthread.php?t=1014)

TonySuarez 08-15-2012 10:39 PM

Clarification on HW6-Q8
 
Q8 asks for the value "...closest to the total number of operations required in a single iteration of backpropagation (using SGD on one data point)".

My question is whether we should count only the backpropagation step or one complete iteration of the backpropagation algorithm, which includes the forward propagation, the backpropagation, and the SGD update steps.

Thanks in advance.

TS

yaser 08-15-2012 10:48 PM

Re: Clarification on HW6-Q8
 
Quote:

Originally Posted by TonySuarez (Post 4099)
Q8 asks for the value "...closest to the total number of operations required in a single iteration of backpropagation (using SGD on one data point)".

My question is whether we should count only the backpropagation step or one complete iteration of the backpropagation algorithm, which includes the forward propagation, the backpropagation, and the SGD update steps.

Forward, backward, and update.

dvs79 08-16-2012 04:49 AM

Re: Clarification on HW6-Q8
 
Did I understand correctly that the dimensions of the input and hidden layers are given without the constant term? (So that in the input layer, e.g., we have 5 + 1 (const) nodes.)

yaser 08-16-2012 04:53 AM

Re: Clarification on HW6-Q8
 
Quote:

Originally Posted by dvs79 (Post 4104)
Did I understand correctly that the dimensions of the input and hidden layers are given without the constant term? (So that in the input layer, e.g., we have 5 + 1 (const) nodes.)

Correct, as the convention has the index i going from 0 to d^{(l-1)}.
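
As a rough illustration of this indexing convention, here is a minimal Python sketch that counts the weights w_{ij}^{(l)} in each layer when i runs from 0 to d^{(l-1)} and j runs from 1 to d^{(l)}. The function name `weights_per_layer` is hypothetical, and the layer sizes 5, 3, 1 are the ones discussed in this thread.

```python
def weights_per_layer(d):
    """Count the weights feeding each layer.

    d[l] is the number of non-constant units in layer l (d[0] = inputs).
    """
    # i runs over 0..d[l-1] (constant node included), j over 1..d[l].
    return [(d[l - 1] + 1) * d[l] for l in range(1, len(d))]


print(weights_per_layer([5, 3, 1]))  # [18, 4]
```

The constant node adds one incoming weight per unit of the next layer, which is why the counts are (5 + 1) * 3 and (3 + 1) * 1.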

TonySuarez 08-16-2012 06:49 AM

Re: Clarification on HW6-Q8
 
Thank you very much, Professor, for answering "on-the-fly" (as always).
TS

invis 08-17-2012 02:05 AM

Re: Clarification on HW6-Q8
 
Only products of the form ... w_{ij}^{(l)} \delta_j^{(l)} ... count as operations.

On the backward pass we have: \delta_i^{(l-1)} = (1-(x_i^{(l-1)})^2) \sum_{j=1}^{d^{(l)}} w_{ij}^{(l)} \delta_j^{(l)}

d^{(2)}=1 and d^{(1)}=3, plus 1 (constant) for layer (1). So we have the same 4 operations for the 2-1 layer, and obviously 18 operations for the 1-0 layer. Or do we not need to compute \delta for the constants?

dvs79 08-17-2012 07:07 AM

Re: Clarification on HW6-Q8
 
Yes, we don't need to:

1. compute delta for the output (because it doesn't need any of the products that count as operations in this particular task)

2. compute deltas for the constants (because they're constants)

3. compute deltas for the inputs (because they're just features (x), and delta is the derivative of the error with respect to s)

So for computing deltas you only need 3 operations.
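
The three exclusions above can be sketched as a small count in Python. This is only an illustration under the counting rule discussed in this thread; the function names are hypothetical, and the total assumes one counted product per weight in the forward pass and one per weight in the update, which is an assumption about the counting rule rather than something stated here.

```python
def delta_products(d):
    """Count the w_ij * delta_j products needed to backpropagate the deltas.

    d[l] is the number of non-constant units in layer l (d[0] = inputs).
    """
    total = 0
    # Deltas are propagated into hidden layers only: the output delta uses
    # no counted product, and inputs and constant nodes get no delta.
    for l in range(len(d) - 1, 1, -1):
        total += d[l - 1] * d[l]  # i = 1..d[l-1], j = 1..d[l]
    return total


def total_products(d):
    """Forward + backward + update, ASSUMING one product per weight
    in the forward pass and one per weight in the update."""
    weights = sum((d[l - 1] + 1) * d[l] for l in range(1, len(d)))
    return weights + delta_products(d) + weights


print(delta_products([5, 3, 1]))  # 3
print(total_products([5, 3, 1]))  # 47 = 22 + 3 + 22
```

For the sizes in this thread, the only deltas that cost counted products are the three for the hidden units, each needing a single product because d^{(2)} = 1.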

invis 08-17-2012 07:27 AM

Re: Clarification on HW6-Q8
 
Thank you very much! English isn't my native language, and it's hard for me to catch 100% of the information from the video. Thanks for clarifying!

marek 05-11-2013 02:26 PM

Re: Clarification on HW6-Q8
 
Quote:

Originally Posted by dvs79 (Post 4135)
Yes, we don't need to:

1. compute delta for the output (because it doesn't need any of the products that count as operations in this particular task)

2. compute deltas for the constants (because they're constants)

3. compute deltas for the inputs (because they're just features (x), and delta is the derivative of the error with respect to s)

So for computing deltas you only need 3 operations.

Thank you! Your answer preempted one of my questions, specifically the one about the calculation of the final-layer delta, as technically none of it seems to count as an operation under our definition.


All times are GMT -7.
The contents of this forum are to be used ONLY by readers of the Learning From Data book by Yaser S. Abu-Mostafa, Malik Magdon-Ismail, and Hsuan-Tien Lin, and participants in the Learning From Data MOOC by Yaser S. Abu-Mostafa. No part of these contents is to be communicated or made accessible to ANY other person or entity.