LFD Book Forum A Modification to the Learning Diagram

#1
04-20-2012, 06:39 AM
 DASteines Junior Member Join Date: Apr 2012 Posts: 1
A Modification to the Learning Diagram

How does the learning problem change if the training samples are drawn from an indexed set of distributions? That is, suppose our training samples, x and y, are drawn from:

where

Suppose I am trying to classify groups of pixels in images. I have 10 images that I can draw groups of pixels from. The images are indexed by theta, with k=10. How do we account for the grouping of the training data? What strategies exist to build a "good" (unbiased) training set in cases like this?
#2
04-20-2012, 01:04 PM
 dudefromdayton Invited Guest Join Date: Apr 2012 Posts: 140
Re: A Modification to the Learning Diagram

I might need additional details to give a correct answer for your situation. But as I understand your problem, I would try to produce a training set that is representative of your images, perhaps sampling from all of them or from an (ideally) unbiased subset. If your sampling is representative of actual use, the usual relationships between E_in and E_out should all hold.
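As a rough illustration of sampling from all of the images, here is a minimal Python sketch that draws the same number of pixel groups from each one. It assumes every image is equally likely in actual use; if that is not the case, the per-image counts should follow the real proportions instead. The function name `sample_balanced`, the dictionary `images`, and the integer stand-ins for pixel groups are all hypothetical and not from the original post.

```python
import random


def sample_balanced(groups, per_group, seed=0):
    """Draw the same number of samples from every group (here, every image).

    groups: dict mapping a group index (e.g. an image id) to a list of samples.
    Returns a shuffled list of (group_id, sample) pairs.
    """
    rng = random.Random(seed)
    training_set = []
    for group_id, samples in sorted(groups.items()):
        # Sample without replacement so no pixel group is picked twice.
        chosen = rng.sample(samples, min(per_group, len(samples)))
        training_set.extend((group_id, s) for s in chosen)
    # Shuffle so the training set is not ordered by image.
    rng.shuffle(training_set)
    return training_set


# Hypothetical data: 10 images, each holding 50 pixel groups (integers as stand-ins).
images = {i: list(range(i * 100, i * 100 + 50)) for i in range(10)}
train = sample_balanced(images, per_group=5)
```

Keeping the image index alongside each sample makes it possible to check later whether any one image dominates the training set.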
#3
04-22-2012, 04:20 PM
 magdon RPI Join Date: Aug 2009 Location: Troy, NY, USA. Posts: 595
Re: A Modification to the Learning Diagram

This is an interesting example. What you actually describe is a restriction of the learning paradigm from a general distribution to one of the form you mention, which arises by mixing 10 different distributions. This additional knowledge about the nature of your problem can inform how you choose your hypothesis set. One appropriate model, aptly called a mixture model, is tailored for situations like this.
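To make the idea of mixing concrete, here is a minimal Python sketch of how data from a mixture of k = 10 distributions can be generated: first pick a component (an image), then draw a point from that component. The equal mixing weights and the one-dimensional Gaussian components are assumptions chosen purely for illustration; the thread does not specify either, and `sample_mixture` is a hypothetical helper.

```python
import random


def sample_mixture(weights, component_samplers, n, seed=0):
    """Draw n points from a mixture distribution.

    weights: mixing proportions, one per component.
    component_samplers: callables taking a random.Random and returning one draw.
    Returns a list of (component_index, value) pairs.
    """
    rng = random.Random(seed)
    indices = list(range(len(weights)))
    draws = []
    for _ in range(n):
        # Step 1: pick which component (image) generates this point.
        i = rng.choices(indices, weights=weights, k=1)[0]
        # Step 2: draw the point from that component's own distribution.
        draws.append((i, component_samplers[i](rng)))
    return draws


k = 10
weights = [1.0 / k] * k  # assumed: every image is equally likely
# Assumed components: unit-variance Gaussians centred at 0, 1, ..., 9.
samplers = [lambda rng, mu=float(i): rng.gauss(mu, 1.0) for i in range(k)]
data = sample_mixture(weights, samplers, n=1000)
```

In a real problem the component index is often not observed, and a mixture model is then fitted to recover both the weights and the component parameters from the data.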

I did not understand the question about the training data. Typically the training data is given. Or is your task to develop an algorithm to separate the observed 'signal' into the components coming from each image? That would be a source separation problem, which is different from a multi-class problem. In a multi-class problem, each data point belongs to one of the classes and the goal is to determine which.

__________________
Have faith in probability
