LFD Book Forum


GB449 09-21-2013 01:09 AM

*ANSWER* Hw6 q10
I get 522. If we assume two hidden layers with 18 units each, we have four layers in total. From L0 to L1 there are 10*18 = 180 weights, from L1 to L2 there are 18*18 = 324 weights, and from L2 to L3 there are 18 weights. Total = 522.

I went with two layers of equal size as an educated guess, based on this fact about numbers: for two positive numbers whose sum is a constant, the product is largest when the two numbers are equal (e.g. if x + y = 10, the maximum of x*y is 5*5 = 25).

Since the answer is 510, I seem to be making a mistake, but I am not sure what it is.

foruhar 08-09-2014 07:05 PM

Re: *ANSWER* Hw6 q10
I got the same result, which is not among the provided options.

yaser 08-10-2014 07:53 PM

Re: *ANSWER* Hw6 q10
Maybe take a look at


Post 14 seems to have resolved the issue for a number of participants.

foruhar 08-17-2014 12:30 PM

Re: *ANSWER* Hw6 q10
Thanks a lot professor! I got the right answer.

The error comes from the fact that the constant node in each hidden layer does not receive any incoming weights, so it must be left out when counting the weights going into that layer.
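To make the difference concrete, here is a minimal Python sketch that counts the weights both ways for the 18-18 arrangement. It assumes that the 10 input units and the 36 hidden units include each layer's constant node and that the output layer has no constant node; the function names are made up for illustration.

```python
def naive_count(layers):
    """Count weights as if every unit, constants included, had incoming weights."""
    return sum(a * b for a, b in zip(layers, layers[1:]))


def corrected_count(layers):
    """Count weights when each hidden layer's constant node receives no inputs.

    `layers` lists the units per layer with constants included; the last
    entry is the output layer, which is assumed to have no constant node.
    """
    total = 0
    for idx in range(1, len(layers)):
        is_output = idx == len(layers) - 1
        # A hidden layer has one constant node that takes no incoming weights.
        receivers = layers[idx] if is_output else layers[idx] - 1
        total += layers[idx - 1] * receivers
    return total


layers = [10, 18, 18, 1]
print(naive_count(layers))      # 522
print(corrected_count(layers))  # 494
```

So even with the corrected count, the even 18-18 split does not reach 510; the split itself also has to change.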

Tackling the general problem: assuming two hidden layers with m and n nodes (constant nodes included) and i input nodes, the goal is to

maximize: i*(m-1) + m*(n-1) + n*1

subject to m + n = s, where s is the total number of hidden nodes allowed (36 in this case).
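The objective can also be checked by brute force. The following is a small sketch, not a definitive solution: it assumes each hidden layer needs its constant node plus at least one other unit (so m, n >= 2), and the function names are hypothetical.

```python
def two_layer_weights(i, m, n):
    """Weights for i inputs and hidden layers of m and n units (constants included)."""
    return i * (m - 1) + m * (n - 1) + n * 1


def best_two_layer_split(i, s):
    """Try every split m + n = s with m, n >= 2 and return (weights, m, n)."""
    candidates = (
        (two_layer_weights(i, m, s - m), m, s - m)
        for m in range(2, s - 1)  # m = 2 .. s - 2, so n = s - m >= 2
    )
    return max(candidates)


print(best_two_layer_split(10, 36))  # (510, 22, 14)
```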

I think it is intuitive that two layers beat more layers when maximizing the number of weights:
- First, adding more layers creates more wasted constant nodes.
- Second, even if there were no wasted nodes, we would be creating smaller factors and summing their products instead of multiplying the larger numbers directly. That is, adding layers cannot increase the number of weights among the hidden layers (it only prevents some nodes from connecting to others by putting them more than one layer apart), and it cannot increase the number of connections to the input and output nodes.
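This intuition can be tested by enumerating every way to spread the hidden nodes over a given number of layers. The sketch below assumes 10 inputs, 36 hidden nodes, a single output, and at least two units per hidden layer (the constant plus one more); the helper names are illustrative only.

```python
def count_weights(inputs, hidden):
    """Weights for `inputs` input units and hidden layer sizes `hidden`.

    All sizes include the constant node; a single output unit is assumed.
    """
    sizes = [inputs, *hidden]
    # Each hidden layer's constant node receives nothing, hence (cur - 1).
    total = sum(prev * (cur - 1) for prev, cur in zip(sizes, sizes[1:]))
    return total + sizes[-1]  # last hidden layer feeds the single output


def splits(total, parts, smallest=2):
    """Yield every ordered way to write `total` as `parts` sizes >= smallest."""
    if parts == 1:
        if total >= smallest:
            yield (total,)
        return
    for first in range(smallest, total - smallest * (parts - 1) + 1):
        for rest in splits(total - first, parts - 1, smallest):
            yield (first, *rest)


def best_by_depth(inputs, hidden_units, max_layers):
    """Map each number of hidden layers to its best (weights, layout)."""
    return {
        depth: max((count_weights(inputs, h), h) for h in splits(hidden_units, depth))
        for depth in range(1, max_layers + 1)
    }


for depth, (weights, layout) in best_by_depth(10, 36, 4).items():
    print(depth, weights, layout)
```

Under these assumptions the best counts for one to four hidden layers are 386, 510, 467 and 426, so two layers come out ahead.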

In general (using my dusty calculus), substituting m = s - n, differentiating with respect to n, and setting the derivative to zero, I get:

n = (s + 2 - i)/2

where n is the number of nodes in the second hidden layer, s is the total number of hidden nodes, and i is the number of input nodes.
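For reference, the intermediate steps can be written out as follows. W(n) is just a label introduced here for the objective after eliminating m; it does not appear elsewhere in the thread.

```latex
\begin{aligned}
W(n) &= i\,(s-n-1) + (s-n)(n-1) + n \\
     &= -n^{2} + (s+2-i)\,n + (is - i - s), \\
\frac{dW}{dn} &= -2n + (s+2-i) = 0
\;\Longrightarrow\; n = \frac{s+2-i}{2}, \qquad m = s - n = \frac{s-2+i}{2}.
\end{aligned}
```

With i = 10 and s = 36 this gives n = 14 and m = 22, and the objective evaluates to 10*21 + 22*13 + 14 = 510.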

It also seems to indicate that if s + 2 is less than or equal to i, having one hidden layer instead of two is the way to go.
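A quick numerical check of that observation, as a sketch only: it uses a made-up case with i = 40 inputs and s = 36 hidden nodes (so s + 2 <= i), and assumes each hidden layer has at least two units. The function names are hypothetical.

```python
def one_layer_weights(i, s):
    """Weights with all s hidden nodes (constant included) in one layer."""
    return i * (s - 1) + s


def best_two_layer_weights(i, s):
    """Largest weight count over all two-layer splits m + n = s with m, n >= 2."""
    return max(
        i * (m - 1) + m * (s - m - 1) + (s - m)
        for m in range(2, s - 1)
    )


i, s = 40, 36
print(one_layer_weights(i, s))       # 1436
print(best_two_layer_weights(i, s))  # 1356
```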

allenyin 08-25-2015 05:27 PM

Re: *ANSWER* Hw6 q10
The answer still doesn't make sense to me...even when I agree with your approach.

If the arrangement that gives the most weights is (10)-18-18-(1), with the hidden nodes distributed evenly, then your formula (which I agree is correct) would give:


This is still not the answer given (510). What am I doing wrong?
