LFD Book Forum


Michael Reach 05-10-2013 12:40 PM

Question 9 - minimum # of weights
 
Maybe I'm misreading the question, but I would have thought the minimum number of weights is zero. It says we can count the constants x_0^{(l)} as a unit, so why can't I build an extremely dumb neural network with 36 constants strung together sequentially as the only hidden units, giving 36 hidden layers? Each layer is "fully connected to the layer above it", in the sense that any layer is ever fully connected: that is, except for the constants!
Or maybe the number of weights would be 1, as the last hidden (constant) unit would have to connect to the output unit.

Elroch 05-10-2013 01:17 PM

Re: Question 9 - minimum # of weights
 
I think I see your point (as long as you are not assuming there are any inputs to the constant bias nodes), but it is clear that if only biases were connected to each layer, the network would not be connected in a practical sense (as there is no information passing between layers) or in a topological sense. You know the answer. :)
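
To make the counting convention in this exchange concrete, here is a minimal sketch of a weight counter. It assumes that each layer's unit count includes its constant x_0^{(l)}, that the constant receives no incoming weights, and that the output layer has no constant. The function name and the example layer sizes are illustrative only and are not the numbers from the homework.

```python
def count_weights(units_per_layer):
    """Count weights in a fully connected feed-forward network.

    units_per_layer lists the number of units in each layer, from the
    input layer to the output layer. Assumption: every layer except the
    output includes one constant (bias) unit in its count, and that
    constant receives no incoming weights.
    """
    if len(units_per_layer) < 2:
        raise ValueError("need at least an input layer and an output layer")
    last = len(units_per_layer) - 1
    total = 0
    for l in range(1, len(units_per_layer)):
        senders = units_per_layer[l - 1]  # all units below, constant included
        # Drop the constant from the receiving side in hidden layers.
        receivers = units_per_layer[l] - (1 if l < last else 0)
        total += senders * receivers
    return total


# 3 input units (constant included), 4 hidden units (constant included),
# 1 output unit: 3 * 3 + 4 * 1 weights.
print(count_weights([3, 4, 1]))

# Hidden layers that hold nothing but their constant contribute no
# incoming weights, which is the degenerate chain described in the first post.
print(count_weights([3, 1, 1, 1]))
```

Under these assumptions the second call counts only the single connection into the output unit, which matches the "1 weight" reading of the constant-only chain.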

Moobb 05-13-2013 02:57 PM

Re: Question 9 - minimum # of weights
 
In the lecture on neural networks it is mentioned that the number of weights serves as a reference for the VC dimension of the network. Linking to this question, is there any guidance on how to construct a neural net? I am thinking about the balance between the number of hidden layers and the number of units per layer. My intuition is that distributing the units close to equally across the layers increases the VC dimension, so it trades more expressiveness for a larger generalisation error? In practice, would one then decide based on a generalisation error analysis?

yaser 05-13-2013 03:34 PM

Re: Question 9 - minimum # of weights
 
Quote:

Originally Posted by Moobb (Post 10811)
is there any guidance towards how to construct a neural net? I am thinking about the balance between the number of hidden layers and the number of units per layer.

Excellent question. From a theoretical point of view, it is difficult to do an exact analysis. There are bits and pieces of practical results. For instance, there was a tendency to use shallow neural networks for almost two decades, quite successfully in some applications such as computational finance. More recently, deeper neural networks have shown promise in certain applications such as computer vision. The performance seems to be domain-specific, but the jury is still out.

Elroch 05-13-2013 07:02 PM

Re: Question 9 - minimum # of weights
 
I recall reading that the range of functions that can be approximated to any given accuracy by multi-layer networks is the same as the range achievable by networks with just 2 hidden layers. Networks with one hidden layer, however, are limited to approximating a more restricted (but still rather general) class of functions, which, on checking, I find consists of continuous functions on compact subsets of \mathbb{R}^n.

Of course this doesn't preclude networks with a greater number of hidden layers being better in some other definable sense. [Thinking of natural neural networks such as those in our brains, it is natural for these to be very deep, using multiple levels of processing feeding into each other].

Regarding the design of neural networks, I've experimented with them on several occasions over several years and have applied rules of thumb. Besides generally limiting the number of hidden layers to 2, one rule concerns how much data you need to justify a neural network of a given complexity. While it is normal to use validation to stop training when overfitting occurs, I suspect there is no advantage to having lots of neurons if stopping occurs too early to make good use of them.

One practical formula is:

N_{hidden} \leq \frac{N_{examples} \cdot E_{tolerance}}{N_{in} + N_{out}}

where E_{tolerance} is the tolerable error.

Sorry I can't locate where this came from: perhaps someone else knows?
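
As a rough illustration of how this rule of thumb might be applied, the sketch below evaluates the right-hand side of the formula and rounds it down to a whole number of hidden units. The function names and the sample numbers are made up for the example, and the formula itself is taken as stated above, without a known source.

```python
import math


def hidden_unit_bound(n_examples, e_tolerance, n_in, n_out):
    """Right-hand side of the rule of thumb:
    N_examples * E_tolerance / (N_in + N_out)."""
    if n_in + n_out <= 0:
        raise ValueError("n_in + n_out must be positive")
    return n_examples * e_tolerance / (n_in + n_out)


def max_hidden_units(n_examples, e_tolerance, n_in, n_out):
    """Largest whole number of hidden units that satisfies the bound."""
    return math.floor(hidden_unit_bound(n_examples, e_tolerance, n_in, n_out))


# Hypothetical numbers: 1000 examples, tolerable error 0.1,
# 10 inputs and 1 output.
print(max_hidden_units(1000, 0.1, 10, 1))  # prints 9
```

The bound grows linearly with the number of examples, so under this rule doubling the data doubles the number of hidden units one could justify at the same tolerable error.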

There is also theoretical work on estimating the VC-dimensions of NNs, such as "Vapnik-Chervonenkis Dimension of Neural Nets" by Peter L. Bartlett.

Moobb 05-15-2013 12:01 AM

Re: Question 9 - minimum # of weights
 
Many thanks for your answers, that's really helpful and interesting. I have googled the reference Elroch posted and found quite a few more references; I am going through them at the moment!

Elroch 05-22-2013 04:22 AM

Re: Question 9 - minimum # of weights
 
You might find this reference interesting as well (especially as regards Yaser's comment) :)

http://yann.lecun.com/exdb/mnist/index.html

