LFD Book Forum  

#1
08-15-2012, 10:39 PM
TonySuarez
Member
Join Date: Jul 2012
Location: Lisboa, Portugal
Posts: 35
Clarification on HW6-Q8

Q8 asks for the value "...closest to the total number of operations required in a single iteration of backpropagation (using SGD on one data point)".

My question is whether we should count only the backpropagation step or, instead, one complete iteration of the backpropagation algorithm, which includes the forward propagation, the backpropagation, and the SGD update steps.

Thanks in advance.

TS
#2
08-15-2012, 10:48 PM
yaser
Caltech
Join Date: Aug 2009
Location: Pasadena, California, USA
Posts: 1,476
Re: Clarification on HW6-Q8

Quote:
Originally Posted by TonySuarez
Q8 asks for the value "...closest to the total number of operations required in a single iteration of backpropagation (using SGD on one data point)".

My question is whether we should count only the backpropagation step or, instead, one complete iteration of the backpropagation algorithm, which includes the forward propagation, the backpropagation, and the SGD update steps.

Forward, backward, and update.
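
For readers who want to see the three phases side by side, here is a minimal sketch of one SGD iteration on a single data point. It is an illustration under stated assumptions, not the course's reference code: it assumes tanh units (consistent with the (1-(x_i^{(l-1)})^2) factor in the delta recursion quoted later in this thread), a single output with squared error, and a hypothetical function name `sgd_iteration` with a hypothetical learning rate `eta`.

```python
import math


def sgd_iteration(weights, x, y, eta=0.1):
    """One SGD step on a single point (x, y) for a tanh network.

    weights[l-1][i][j] plays the role of w_{ij}^{(l)}; row i = 0 multiplies
    the constant 1. Assumes one output and squared error e = (output - y)^2.
    Returns new weights; the input weights are left unchanged.
    """
    # 1. Forward: compute every layer's outputs, prepending the constant 1.
    xs = [[1.0] + list(x)]
    for W in weights:
        prev = xs[-1]
        signals = [sum(W[i][j] * prev[i] for i in range(len(prev)))
                   for j in range(len(W[0]))]
        xs.append([1.0] + [math.tanh(s) for s in signals])

    # 2. Backward: output delta first, then hidden deltas (constants skipped).
    out = xs[-1][1:]
    deltas = [[2.0 * (o - y) * (1.0 - o * o) for o in out]]
    for l in range(len(weights) - 1, 0, -1):
        W, nxt = weights[l], deltas[0]
        deltas.insert(0, [
            (1.0 - xs[l][i] ** 2) * sum(W[i][j] * nxt[j] for j in range(len(nxt)))
            for i in range(1, len(xs[l]))
        ])

    # 3. Update: w_ij <- w_ij - eta * x_i * delta_j.
    return [
        [[W[i][j] - eta * xs[l][i] * deltas[l][j] for j in range(len(W[0]))]
         for i in range(len(W))]
        for l, W in enumerate(weights)
    ]


# Usage with the layer sizes discussed in this thread (5 inputs, 3 hidden, 1 output).
dims = [5, 3, 1]
weights = [[[0.1] * dims[l] for _ in range(dims[l - 1] + 1)]
           for l in range(1, len(dims))]
new_weights = sgd_iteration(weights, x=[0.5, -1.0, 0.25, 0.0, 1.0], y=1.0)
print([(len(W), len(W[0])) for W in new_weights])  # [(6, 3), (4, 1)]
```

The input values and the initial weight 0.1 are arbitrary placeholders chosen only to make the sketch runnable.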
__________________
Where everyone thinks alike, no one thinks very much
#3
08-16-2012, 04:49 AM
dvs79
Member
Join Date: Jul 2012
Location: Moscow, Russia
Posts: 24
Re: Clarification on HW6-Q8

Did I understand correctly that the dimensions of the input and hidden layers are given without the constant term? (So that the input layer, e.g., has 5 + 1 (constant) nodes.)
#4
08-16-2012, 04:53 AM
yaser
Caltech
Join Date: Aug 2009
Location: Pasadena, California, USA
Posts: 1,476
Re: Clarification on HW6-Q8

Quote:
Originally Posted by dvs79
Did I understand correctly that the dimensions of the input and hidden layers are given without the constant term? (So that the input layer, e.g., has 5 + 1 (constant) nodes.)

Correct, as the convention has the index i going from 0 to d^{(l-1)}.
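
As a small illustration of this indexing convention, the sketch below lists the shape of each weight matrix when i runs from 0 to d^{(l-1)} and j runs from 1 to d^{(l)}. The helper name `weight_shapes` is hypothetical, and the layer sizes 5, 3, 1 are taken from the numbers mentioned elsewhere in this thread.

```python
def weight_shapes(dims):
    """Return (rows, cols) of each weight matrix W^(l), l = 1..L.

    dims = [d^(0), ..., d^(L)] lists layer sizes WITHOUT the constant node.
    Rows follow i = 0..d^(l-1) (row 0 is the constant); columns follow j = 1..d^(l).
    """
    return [(dims[l - 1] + 1, dims[l]) for l in range(1, len(dims))]


shapes = weight_shapes([5, 3, 1])     # layer sizes assumed from this thread
print(shapes)                         # [(6, 3), (4, 1)]
print(sum(r * c for r, c in shapes))  # 22 weights in total
```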
#5
08-16-2012, 06:49 AM
TonySuarez
Member
Join Date: Jul 2012
Location: Lisboa, Portugal
Posts: 35
Re: Clarification on HW6-Q8

Thank you very much, Professor, for answering "on the fly" (as always).
TS
#6
08-17-2012, 02:05 AM
invis
Senior Member
Join Date: Jul 2012
Posts: 50
Re: Clarification on HW6-Q8

Only products of the form ... w_{ij}^{(l)} \delta_j^{(l)} ... count as operations.

On the backward pass we have: \delta_i^{(l-1)} = (1-(x_i^{(l-1)})^2) \sum_{j=1}^{d^{(l)}} w_{ij}^{(l)} \delta_j^{(l)}

Here d^{(2)}=1 and d^{(1)}=3, plus 1 (constant) for layer 1. So we have the same 4 operations for the 2-1 layer, and obviously 18 operations for the 1-0 layer. Or do we not need to compute \delta for the constants?
#7
08-17-2012, 07:07 AM
dvs79
Member
Join Date: Jul 2012
Location: Moscow, Russia
Posts: 24
Re: Clarification on HW6-Q8

Yes. For this count, we don't need to include:

1. the delta for the output (computing it uses none of the products that count as operations in this particular task)

2. deltas for the constants (because they're constants)

3. deltas for the inputs (because they're just features (x), and delta is the derivative of the error with respect to s)

So computing the deltas takes only 3 operations.
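
The count can be checked with a short sketch. It tallies only the w_{ij}^{(l)} \delta_j^{(l)} products in the backward recursion; the function name `delta_products` and its two flags are hypothetical, added only to contrast this count with the 4 + 18 tally from the previous post.

```python
def delta_products(dims, include_constants=False, include_inputs=False):
    """Count the w * delta products needed to backpropagate the deltas.

    dims = [d^(0), ..., d^(L)] lists layer sizes without the constant nodes.
    The output delta itself is not counted, since it uses no such product.
    """
    total = 0
    for l in range(len(dims) - 1, 0, -1):   # deltas of layer l-1 from layer l
        if l - 1 == 0 and not include_inputs:
            continue                        # inputs are features, no delta needed
        nodes = dims[l - 1] + (1 if include_constants else 0)
        total += nodes * dims[l]            # one product per (i, j) pair
    return total


dims = [5, 3, 1]
print(delta_products(dims))  # 3
print(delta_products(dims, include_constants=True, include_inputs=True))  # 22 = 4 + 18
```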
#8
08-17-2012, 07:27 AM
invis
Senior Member
Join Date: Jul 2012
Posts: 50
Re: Clarification on HW6-Q8

Thank you very much! English isn't my native language, and it's hard for me to catch 100% of the information from the video. Thanks for clarifying!
#9
05-11-2013, 02:26 PM
marek
Member
Join Date: Apr 2013
Posts: 31
Re: Clarification on HW6-Q8

Quote:
Originally Posted by dvs79
Yes. For this count, we don't need to include:

1. the delta for the output (computing it uses none of the products that count as operations in this particular task)

2. deltas for the constants (because they're constants)

3. deltas for the inputs (because they're just features (x), and delta is the derivative of the error with respect to s)

So computing the deltas takes only 3 operations.

Thank you! Your answer preempted one of my questions, specifically the one about the calculation of the final-layer delta, since technically none of it seems to count as an operation under our definition.
Tags
clarification, hw6-q7

All times are GMT -7.

The contents of this forum are to be used ONLY by readers of the Learning From Data book by Yaser S. Abu-Mostafa, Malik Magdon-Ismail, and Hsuan-Tien Lin, and participants in the Learning From Data MOOC by Yaser S. Abu-Mostafa. No part of these contents is to be communicated or made accessible to ANY other person or entity.