LFD Book Forum How to handle ambiguous target function, f

#1
01-12-2013, 10:52 AM
 DaveS Junior Member Join Date: Jan 2013 Posts: 2
How to handle ambiguous target function, f

There are many occasions in engineering when the unknown target function f is being designed while a system to predict f is also being designed.

Let me give a concrete example. Suppose I am building a robotic system that builds widgets. I will deploy a machine vision system on my widget machine to classify different widgets as they are being built. I plan to use supervised learning to take features derived from possible widgets and learn a target function that will predict which widget the machine vision system is looking at. Unfortunately, my machine vision system isn't the highest-priority part of the widget system. The systems engineer thinks there may be some shadows I will have to contend with, and there may be some bizarre lighting, which he hasn't really nailed down yet. In other words, I'm not yet able to generate training data from the true f (which would be the final system that my systems engineer decides on). Still, my systems engineer wants me to tell him what an optimal f would be. That is, if I could choose an f to maximize the chances that my g approximates f really well, what should that f look like? (Let's assume that learning is necessary -- that is, no matter what we do, f will not be something we can easily pin down mathematically.)

So what do I do? Suppose I go into the lab and build some mock-up machine vision systems. This amounts to constructing an f_possible that anticipates the true f. I generate samples from the mock-up, learn a g, and report back how well my g approximates f_possible. How can I use the results of this process to inform what f should be? It seems like the inverse of the supervised learning problem: I want to design my f so that g will approximate it very well.
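One way to make this mock-up loop concrete is to treat each candidate f_possible as a data-generating process, learn a g from each, and rank the candidates by how well g generalizes. Here is a minimal numpy sketch under invented assumptions: two-feature widgets modeled as Gaussian clusters, a nearest-centroid learner for g, and "lighting" scenarios whose names and numbers are all hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)

def mock_up(n, separation, noise):
    """Hypothetical f_possible: two widget types whose features form
    Gaussian clusters 'separation' apart, blurred by sensor 'noise'.
    (Numbers are illustrative assumptions, not from the thread.)"""
    y = rng.permutation(np.repeat([0, 1], n // 2))
    x = rng.normal(0.0, noise, (n, 2))
    x[:, 0] += separation * y        # class-dependent feature shift
    return x, y

def learn_and_test(separation, noise, n=1000):
    """Learn a simple g (nearest centroid) on one mock-up and report
    how well it approximates that f_possible out of sample."""
    xtr, ytr = mock_up(n, separation, noise)
    xte, yte = mock_up(n, separation, noise)
    c0, c1 = xtr[ytr == 0].mean(0), xtr[ytr == 1].mean(0)
    pred = np.linalg.norm(xte - c1, axis=1) < np.linalg.norm(xte - c0, axis=1)
    return np.mean(pred.astype(int) != yte)  # out-of-sample error

# Rank candidate designs for f by how learnable they are.
errors = {"good lighting": learn_and_test(separation=4.0, noise=1.0),
          "harsh shadows": learn_and_test(separation=1.0, noise=1.0)}
```

The candidate with the lowest out-of-sample error is the one your g is most likely to approximate well, which is one way to feed the lab results back into the design of f.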

I'm thinking out loud a bit here -- I'm trying to map this problem to the learning diagram and understand where it fits in. I appreciate any insight that might help clarify my thinking.
#2
01-12-2013, 11:58 AM
 magdon RPI Join Date: Aug 2009 Location: Troy, NY, USA. Posts: 595
Re: How to handle ambiguous target function, f

You are asking deep questions that you will address a little later in the book/course, but here is the general idea.

The easiest target functions f to learn are 'simple' ones (we call this a low deterministic noise setting). The easiest data to learn from are noiseless examples of the input-output relationship (a low stochastic noise setting). And greater success comes with a larger number of data points (intuitive). So, in designing your process:

1) Make sure the different widgets are easily distinguishable, i.e. the widgets have easily collectible features which differ significantly from one type of widget to another. This means that, based on these features, a simple function would exist that can distinguish one widget from another (low deterministic noise), even though a priori you may not be able to pin this down.

2) Make sure that your features can be collected as noiselessly as possible (so for example choose features that are easy to capture, and, to the extent possible, immune to occlusion/shadowing, rotation of the widget, perturbations in camera position, etc). To the extent that this is possible, you will have a low stochastic noise setting.

3) Make sure that you have some automated framework in which it is easy to generate labeled data, so that you can generate many training data points. For example, this might mean that you can run the machine (at least early on) in different modes, where each mode generates only one type of widget; that will make it easy to generate lots of labeled data.

If these three things are in place you have poised yourself for success.
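As a rough illustration of how conditions 2 and 3 show up numerically, here is a toy sketch assuming a made-up two-feature widget model and a nearest-centroid learner for g (every parameter below is invented for the sketch, not taken from the thread):

```python
import numpy as np

rng = np.random.default_rng(7)

def one_trial(n_train, noise, sep=2.0, n_test=2000):
    """One learning run with hypothetical two-feature widgets:
    'sep' controls condition 1 (distinguishable classes), 'noise'
    condition 2 (measurement noise), 'n_train' condition 3 (data size).
    All numbers are illustrative assumptions."""
    def sample(n):
        y = rng.permutation(np.repeat([0, 1], n // 2))
        x = rng.normal(0.0, noise, (n, 2))
        x[:, 0] += sep * y
        return x, y
    xtr, ytr = sample(n_train)
    xte, yte = sample(n_test)
    c0, c1 = xtr[ytr == 0].mean(0), xtr[ytr == 1].mean(0)  # simple g
    pred = np.linalg.norm(xte - c1, axis=1) < np.linalg.norm(xte - c0, axis=1)
    return np.mean(pred.astype(int) != yte)

def avg_error(n_train, noise, trials=200):
    """Average out-of-sample error over repeated trials."""
    return float(np.mean([one_trial(n_train, noise) for _ in range(trials)]))

err_good        = avg_error(n_train=500, noise=0.5)  # all conditions in place
err_noisy       = avg_error(n_train=500, noise=2.0)  # condition 2 violated
err_little_data = avg_error(n_train=4,   noise=0.5)  # condition 3 violated
```

In this toy setup, err_good comes out lowest, with the error rising when the features are noisier or the labeled data is scarce, mirroring the three conditions above.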

__________________
Have faith in probability
