
#1




A Modification to the Learning Diagram
How does the learning problem change if the training samples are drawn from an indexed set of distributions? That is, suppose our training samples, x and y, are drawn from one of k distributions indexed by theta.
Suppose I am trying to classify groups of pixels in images. I have 10 images that I can draw groups of pixels from. The images are indexed by theta, with k=10. How do we account for the grouping of the training data? What strategies exist to build a "good" (unbiased) training set in cases like this?
#2




Re: A Modification to the Learning Diagram
I might need additional details to give a correct answer for your situation. But as I understand your problem, I would try to produce a training set that is representative of your images, perhaps sampling from all of them or from an (ideally) unbiased subset. If your sampling is representative of actual use, the usual relationships between E_in and E_out should still hold.
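One way to read "sampling from all" of the images is to draw the same number of pixel groups from each one, so that no single image dominates the training set. The sketch below is only an illustration under that assumption: the function name `sample_per_image`, the dictionary layout, and the placeholder patch strings are all hypothetical, and equal allocation is itself a choice. If some images are more typical of actual use than others, the per-image counts would need to reflect that instead.

```python
import random


def sample_per_image(patches_by_image, n_per_image, seed=0):
    """Draw the same number of patches from every image, without replacement.

    patches_by_image: dict mapping an image index to a list of patches
    (hypothetical layout, chosen for this example only).
    """
    rng = random.Random(seed)  # fixed seed so the draw is repeatable
    training_set = []
    for image_id in sorted(patches_by_image):
        patches = patches_by_image[image_id]
        if len(patches) < n_per_image:
            raise ValueError(
                f"image {image_id} has only {len(patches)} patches"
            )
        # Keep the image index with each patch so the grouping is not lost.
        for patch in rng.sample(patches, n_per_image):
            training_set.append((image_id, patch))
    return training_set


# Placeholder data: 10 images, each with 50 stand-in "patches".
patches_by_image = {
    i: [f"img{i}_patch{j}" for j in range(50)] for i in range(10)
}
train = sample_per_image(patches_by_image, n_per_image=5)
print(len(train))  # 10 images x 5 patches each = 50
```

Keeping the image index alongside each patch also makes it possible to hold out whole images for validation later, which is one way to check whether performance carries over to an image that was never sampled.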

#3




Re: A Modification to the Learning Diagram
This is an interesting example. What you describe is a restriction of the paradigm from a general distribution to one of the form you mention, which arises by mixing 10 different distributions. This additional knowledge about the nature of your problem can inform your choice of hypothesis set, and one appropriate model, tailored for situations like this, is (appropriately) called a mixture model.
I did not understand the question about the training data. Typically the training data is given. Or is your task to develop an algorithm that separates the observed 'signal' into the components coming from each image? That is called a source separation problem, and it is different from a multiclass problem. In a multiclass problem, each data point belongs to one of the classes and the goal is to determine which.
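For concreteness, a distribution that arises by mixing k = 10 component distributions is usually written as below. This is a generic sketch and not necessarily the exact form in the original post: the mixing weights \pi_i (the probability that a sample comes from image i) and the component notation P(x, y | \theta_i) are assumptions introduced here.

```latex
P(x, y) \;=\; \sum_{i=1}^{k} \pi_i \, P(x, y \mid \theta_i),
\qquad \pi_i \ge 0, \quad \sum_{i=1}^{k} \pi_i = 1, \quad k = 10 .
```

Under this reading, a training set is representative when the fraction of samples drawn from each image matches the weight \pi_i that image carries in actual use.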
__________________
Have faith in probability 

