Maybe I'm misreading the question, but I would have thought the minimum number of weights is zero. It says we can count the constants as units, so why can't I build an extremely dumb neural network with 36 constant units strung together sequentially as the only hidden units: 36 hidden layers of one unit each? Each layer is "fully connected to the layer above it" in the only sense any layer ever is fully connected: that is, except for the constants, which take no incoming connections!

Or maybe the number of weights would be 1, since the last hidden (constant) unit still has to connect to the output unit through a real weight.
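To make the counting argument concrete, here is a minimal sketch. It assumes the convention described above: a constant unit has no incoming weighted connections, so a chain of single-constant hidden layers contributes zero weights, and the only possible weight is the output unit's incoming connection. The function name and parameters are my own illustration, not anything from the question.

```python
def count_weights(n_hidden_layers=36, output_needs_weight=True):
    """Count connection weights in a chain of single-constant hidden layers.

    Assumption: constant units take no incoming weighted connections,
    and the constants themselves are not counted as weights.
    """
    # Every hidden layer is a lone constant unit, so each layer-to-layer
    # "fully connected" link contributes 0 weights.
    hidden_weights = 0 * n_hidden_layers
    # The output unit is not a constant, so its single incoming connection
    # from the last constant unit is a genuine weight (if we require one).
    output_weights = 1 if output_needs_weight else 0
    return hidden_weights + output_weights

print(count_weights())                           # 1
print(count_weights(output_needs_weight=False))  # 0
```

Under these assumptions the answer is either 0 or 1, depending entirely on whether the output unit's incoming connection counts.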