#1
Q8 asks for the answer "...closest to the total number of operations required in a single iteration of backpropagation (using SGD on one data point)".
My question is whether we should count only the backpropagation step or one complete iteration of the backpropagation algorithm, which includes forward propagation, backpropagation, and the SGD update step. Thanks in advance. TS
#2
Quote:
__________________
Where everyone thinks alike, no one thinks very much
#3
Did I understand correctly that the dimensions of the input and hidden layers are given without the constant term? (So that the input layer, for example, has 5 + 1 (constant) nodes.)
#4
Quote:
#5
Thank you very much, Professor, for answering "on the fly" (as always).
TS
#6
#7
Yes, we don't need to:
1. compute the delta for the output (because it doesn't require any of the products that count as operations in this particular task);
2. compute deltas for the constants (because they're constants);
3. compute deltas for the inputs (because they're just features (x), and a delta is a derivative of the error with respect to s).

So computing the deltas takes only 3 operations.
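
The count above can be checked with a short sketch. It assumes a fully connected network with 5 inputs, 3 hidden units, and 1 output, with those sizes given without the constant nodes (the reading raised earlier in the thread). It also assumes that the only operations counted are products of the form w·x (forward pass), w·δ (backward pass), and x·δ (weight update). The layer sizes and the function name `count_operations` are illustrative assumptions, not quoted from the homework text.

```python
def count_operations(dims):
    """Count products for one SGD step on one data point.

    dims: layer sizes WITHOUT the constant node (assumed convention),
          e.g. [5, 3, 1] for input, hidden, output.
    Returns (forward, backward, update) product counts.
    """
    num_layers = len(dims)

    # Forward pass: one w*x product per weight; each layer's input
    # includes the constant node, hence the "+ 1".
    forward = sum((dims[l - 1] + 1) * dims[l] for l in range(1, num_layers))

    # Backward pass: w*delta products are needed only for non-constant
    # hidden units. No products for the output delta, none for constants,
    # none for the input features.
    backward = sum(dims[l] * dims[l + 1] for l in range(1, num_layers - 1))

    # Weight update: one x*delta product per weight (same count as forward).
    update = forward

    return forward, backward, update


forward, backward, update = count_operations([5, 3, 1])
print("forward: ", forward)    # 22
print("backward:", backward)   # 3
print("update:  ", update)     # 22
print("total:   ", forward + backward + update)  # 47
```

Under these assumptions the backward pass contributes exactly the 3 products mentioned above, and the three phases together come to 47. Whether the question intends all three phases or only the backward pass is the point raised in post #1.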
#8
Thank you very much! English isn't my native language, and it's hard for me to catch 100% of the information from the video. Thanks for clarifying!
#9
Quote:
Tags: clarification, hw6-q7