Quote:
Originally Posted by hhprogram
Follow-up on this question: if the final hypothesis set learned by the ensemble has weights on all the original individual hypothesis sets, does that mean its VC dimension is that of the union of all the individual hypothesis sets?
It seems that, in general, ensemble learning might run into the VC dimension / generalization problem (i.e., similar to 'snooping', when you try a model, see that it doesn't perform well, and then try another model, etc.). But since it is used a lot in practice, I'm curious to learn why it doesn't suffer from generalization problems. After doing a little research: is it because the individual hypotheses used in ensemble learning are generally relatively simple and thus have a low VC dimension (and also perform OK but not great by themselves), so that combining simple models together doesn't make the VC dimension too ridiculous? Thanks

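In general, the ensemble's hypothesis set might be bigger than the union. For instance, a linear ensemble hypothesis set includes each individual hypothesis as a special case (put all the weight on that one hypothesis), so it contains the union. Its VC dimension is therefore at least as large as that of the union, and it can be strictly larger. Hope this helps.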
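To make this concrete, here is a small brute-force sketch (not from the original discussion). It assumes a toy setup chosen purely for illustration: four fixed "positive ray" hypotheses on the real line, three sample points, and a linear ensemble whose weights are drawn from a small integer grid. The names `ray`, `base_dichotomies`, and `ensemble_dichotomies` are hypothetical. It counts how many distinct labelings (dichotomies) of the points the union of base hypotheses can produce, versus the sign of a weighted sum of them.

```python
from itertools import product

# Toy setup (all values are illustrative assumptions).
points = [1.0, 2.0, 3.0]
thresholds = [0.5, 1.5, 2.5, 3.5]

def ray(a):
    """Positive ray: h(x) = +1 if x > a, else -1."""
    return lambda x: 1 if x > a else -1

# The "union" here is just this finite collection of base hypotheses.
base = [ray(a) for a in thresholds]

def base_dichotomies(points):
    """Labelings of the points achievable by a single base hypothesis."""
    return {tuple(h(x) for x in points) for h in base}

def ensemble_dichotomies(points, weight_grid=(-2, -1, 0, 1, 2)):
    """Labelings achievable by sign(sum_i w_i * h_i(x)) over a weight grid."""
    found = set()
    for w in product(weight_grid, repeat=len(base)):
        scores = [sum(wi * h(x) for wi, h in zip(w, base)) for x in points]
        if any(s == 0 for s in scores):
            continue  # skip ties so the sign is always well defined
        found.add(tuple(1 if s > 0 else -1 for s in scores))
    return found

base_d = base_dichotomies(points)
ens_d = ensemble_dichotomies(points)

print("base dichotomies:    ", len(base_d), "of", 2 ** len(points))
print("ensemble dichotomies:", len(ens_d), "of", 2 ** len(points))
print("ensemble contains base:", base_d <= ens_d)
```

In this toy setup the base hypotheses realize only 4 of the 8 possible labelings of the three points, while the linear ensemble realizes all 8, i.e. it shatters them. Negative weights are what make the difference: they let the ensemble produce labelings such as (+1, -1, +1) that no single positive ray can.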