LFD Book Forum > Course Discussions > Online LFD course > Homework 6
#1  05-10-2013, 11:40 AM
Michael Reach
Senior Member
Join Date: Apr 2013
Location: Baltimore, Maryland, USA
Posts: 71

Question 9 - minimum # of weights

Maybe I'm misreading the question, but I would have thought the minimum number of weights is zero. It says we can count each constant x_0^{(l)} as a unit, so why can't I build an extremely dumb neural network with the 36 constants strung together sequentially as the only hidden units: 36 hidden layers of one constant each? Each layer is "fully connected to the layer above it" in the only sense that any layer is ever fully connected: that is, fully connected except for the constants, which receive no inputs!
Or maybe the number of weights would be 1, since the last hidden (constant) unit would still have to connect to the output unit.
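For reference, a minimal sketch (my own, not from the homework) of how weights are counted under the usual convention: every non-constant unit in layer l receives one weight from each of the d^{(l-1)} units plus the constant unit of layer l-1, while the constant units themselves receive none. The layer sizes below are made-up numbers:

```python
def count_weights(layer_sizes):
    # layer_sizes lists the non-constant units per layer, inputs first,
    # output last; each layer except the output also has a constant unit
    # feeding forward, hence the +1 below.
    return sum((d_prev + 1) * d_next
               for d_prev, d_next in zip(layer_sizes, layer_sizes[1:]))

# Hypothetical architecture: 9 inputs, one hidden layer of 36 units, 1 output
print(count_weights([9, 36, 1]))  # (9+1)*36 + (36+1)*1 = 397
```

Under this convention the constant units receive no incoming weights, which is exactly what the question above is probing.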
#2  05-10-2013, 12:17 PM
Elroch
Invited Guest
Join Date: Mar 2013
Posts: 143

Re: Question 9 - minimum # of weights

I think I see your point (as long as you are not assuming there are any inputs to the constant bias nodes), but it is clear that if only biases were connected to each layer, the network would not be connected in a practical sense (as there is no information passing between layers) or in a topological sense. You know the answer.
#3  05-13-2013, 01:57 PM
Moobb
Junior Member
Join Date: Apr 2013
Posts: 9

Re: Question 9 - minimum # of weights

In the lecture on neural networks it is mentioned that the number of weights serves as a rough measure of the VC dimension of the network. Linking to this question, is there any guidance on how to construct a neural net? I am thinking about the balance between the number of hidden layers and the number of units per layer. My intuition is that distributing the units nearly equally across the layers increases the VC dimension, so more expressiveness at the cost of a larger generalisation error? In practice, would one then decide based on a generalisation error analysis?
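To make that intuition concrete, here is a quick comparison (my own made-up numbers, not the homework's), reusing the weight-counting convention sketched in the first post: hold the total of 36 hidden units fixed and see how different splits across layers change the weight count, and hence the rough VC dimension:

```python
def count_weights(layer_sizes):
    # (d_prev + 1) * d_next weights between consecutive layers;
    # the +1 is the constant unit in the earlier layer.
    return sum((a + 1) * b for a, b in zip(layer_sizes, layer_sizes[1:]))

# 9 inputs, 1 output, 36 hidden units split in different ways
for hidden in ([36], [18, 18], [12, 12, 12], [2] * 18):
    print(hidden, count_weights([9] + hidden + [1]))
# [36] -> 397, [18, 18] -> 541, [12, 12, 12] -> 445, [2]*18 -> 125
```

A roughly even split over a small number of layers gives many more weights than a very deep, very thin stack of the same total size.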
#4  05-13-2013, 02:34 PM
yaser
Caltech
Join Date: Aug 2009
Location: Pasadena, California, USA
Posts: 1,474

Re: Question 9 - minimum # of weights

Quote:
Originally Posted by Moobb
is there any guidance on how to construct a neural net? I am thinking about the balance between the number of hidden layers and the number of units per layer.
Excellent question. From a theoretical point of view, it is difficult to do an exact analysis. There are bits and pieces of practical results. For instance, there was a tendency to use shallow neural networks for almost two decades, quite successfully in some applications such as computational finance. More recently, deeper neural networks have shown promise in certain applications such as computer vision. The performance seems to be domain-specific, but the jury is still out.
__________________
Where everyone thinks alike, no one thinks very much
#5  05-13-2013, 06:02 PM
Elroch
Invited Guest
Join Date: Mar 2013
Posts: 143

Re: Question 9 - minimum # of weights

I believe I recall reading that the class of functions which can be approximated to any given accuracy by multi-layer networks is the same as the class achievable by networks with just 2 hidden layers. Networks with a single hidden layer, however, are limited to approximating a more restricted (but still rather general) class of functions, which (on checking) consists of the continuous functions on compact subsets of \mathbb{R}^n.
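As a toy illustration of the one-hidden-layer case, here is a sketch of my own (not from any of the references): a single layer of tanh units with randomly fixed hidden weights, and output weights fit by least squares, approximating a continuous function on a compact interval:

```python
import numpy as np

rng = np.random.default_rng(0)
n_hidden = 50
x = np.linspace(-np.pi, np.pi, 200).reshape(-1, 1)
y = np.sin(x).ravel()                       # target: continuous on [-pi, pi]

W = rng.normal(scale=2.0, size=(1, n_hidden))   # hidden-layer weights (fixed)
b = rng.normal(scale=2.0, size=n_hidden)        # hidden-layer biases (fixed)
H = np.tanh(x @ W + b)                          # hidden-unit outputs

w_out, *_ = np.linalg.lstsq(H, y, rcond=None)   # fit output weights only
print("max |error| =", np.abs(H @ w_out - y).max())
```

Increasing n_hidden drives the maximum error down, in line with the approximation result above.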

Of course this doesn't preclude networks with a greater number of hidden layers being better in some other definable sense. [Thinking of natural neural networks such as those in our brains, it is natural for these to be very deep, using multiple levels of processing feeding into each other].

Regarding design of neural networks, I've experimented with them on several occasions over several years and have applied rules of thumb for design. As well as generally limiting hidden layers to 2, one idea concerns how much data you need to justify using a certain complexity of neural network. While it is normal to use validation to stop training when overfitting occurs, I suspect there is no advantage to having lots of neurons if stopping occurs too early to make good use of them.

One practical formula is:

N_{\text{hidden}} \leq \frac{N_{\text{examples}} \cdot E_{\text{tolerance}}}{N_{\text{in}} + N_{\text{out}}}

where E_{\text{tolerance}} is the tolerable error.
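To see what the formula gives, here are some made-up numbers (my example, not from the original source):

```python
# 1000 examples, tolerable error 10%, 10 inputs, 1 output (all made up)
n_examples, e_tolerance = 1000, 0.10
n_in, n_out = 10, 1
print(n_examples * e_tolerance / (n_in + n_out))  # ~9.1, so at most ~9 hidden units
```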

Sorry I can't locate where this came from: perhaps someone else knows?

There is also theoretical work on estimating the VC-dimensions of NNs, such as "Vapnik-Chervonenkis Dimension of Neural Nets" by Peter L. Bartlett.
#6  05-14-2013, 11:01 PM
Moobb
Junior Member
Join Date: Apr 2013
Posts: 9

Re: Question 9 - minimum # of weights

Many thanks for your answers, that's really helpful and interesting. I googled the reference Elroch posted and found quite a few related papers; going through them at the moment!
#7  05-22-2013, 03:22 AM
Elroch
Invited Guest
Join Date: Mar 2013
Posts: 143

Re: Question 9 - minimum # of weights

You might find this reference interesting as well (especially with regard to Yaser's comment):

http://yann.lecun.com/exdb/mnist/index.html