Re: *ANSWER* Hw6 q10
Thanks a lot professor! I got the right answer.
The error is introduced by the fact that the constant node does not receive any input weights.
Tackling the general problem: Assuming two hidden layers, with m and n nodes, and i input nodes, the goal is to
maximize: (i*(m-1)) + (m*(n-1)) + (n*1)
given m + n = s (s is the total number of hidden nodes allowed, in this case 36)
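A quick brute-force check of this objective, as a minimal sketch. The function names are mine, and i = 10 is an assumption taken from the homework statement rather than from this post; node counts include each layer's constant node, as in the expression above.

```python
def total_weights(i, m, n):
    """Weight count for i inputs, hidden layers of m and n nodes, one output.

    Every count includes that layer's constant node, which receives
    no incoming weights (hence the m - 1 and n - 1 factors).
    """
    return i * (m - 1) + m * (n - 1) + n * 1


def best_split(i, s):
    """Try every split m + n = s and return (m, n, weights) for the best one."""
    candidates = [(s - n, n, total_weights(i, s - n, n)) for n in range(1, s)]
    return max(candidates, key=lambda c: c[2])


# Assumed values: i = 10 inputs (from the homework), s = 36 hidden nodes.
m, n, w = best_split(10, 36)
print(m, n, w)
```

For those values the search picks m = 22 and n = 14.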
I think it is intuitive to use 2 layers rather than more when maximizing the number of weights:
 First, adding more layers creates more wasted constant nodes.
 Second, even if there were no wasted nodes, we would be creating smaller multipliers and then summing them up instead of multiplying them directly. That is, adding layers cannot increase the number of weights among the hidden layers (it only deprives some nodes of connections to others by putting them more than one layer apart), and it cannot increase the number of connections to the input and output nodes.
In general (using my dusty calculus), substituting m = s - n, differentiating with respect to n, and setting the derivative to zero, I get:
n = (s + 2 - i)/2
Where n is the number of nodes in the second layer, s is the sum total of hidden nodes and i is the number of input nodes.
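For anyone who wants the intermediate steps, here is how I would write the derivation out. This is a sketch that treats n as a continuous variable; the name W for the weight count is my own, and the numeric check assumes i = 10 from the homework statement.

```latex
\begin{align*}
W(n) &= i\,(s - n - 1) + (s - n)(n - 1) + n \\
     &= -n^2 + (s + 2 - i)\,n + (is - i - s) \\
\frac{dW}{dn} &= -2n + (s + 2 - i) = 0
  \quad\Longrightarrow\quad n = \frac{s + 2 - i}{2} \\
\frac{d^2W}{dn^2} &= -2 < 0 \quad \text{(so this is a maximum)} \\[4pt]
\text{With } s = 36,\ i = 10:&\quad n = 14,\quad m = s - n = 22,\quad
  W = 10 \cdot 21 + 22 \cdot 13 + 14 = 510
\end{align*}
```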
It also seems to indicate that if s + 2 is less than or equal to i, having one layer instead of two is the way to go.
