LFD Book Forum *ANSWER* question 2

#1
04-09-2013, 10:35 AM
 arun.n Junior Member Join Date: Apr 2013 Posts: 3

I couldn't understand the reasoning behind the answer to question 2.

It's clear that detecting fraudulent credit card transactions is a machine learning problem because it satisfies the three requirements needed for a problem to be solved using machine learning.

I don't understand how finding the optimal cycle for traffic lights is a machine learning problem. Where are you going to get the data from? Are you going to experiment on busy traffic intersections to collect data? And what is the pattern?

I actually thought that modelling the time it takes for objects to fall is a machine learning problem. Yes, it can be modelled mathematically, but you can be lazy about it and let the computer come up with the model. Further, there is the issue of air friction and changing wind conditions, which may introduce some uncertainty into the mathematical model.

I know that this is a subjective question and depends on how you look at it.
#2
04-09-2013, 01:36 PM
 vikkamath Junior Member Join Date: Apr 2013 Posts: 7
Re: question 2

OK, great question. Let's look at this from the point of view of what we already know, i.e. the three points in the 'essence of machine learning'.

Problem : Optimal Cycle for Traffic Lights
1. A pattern exists: If you think about it intuitively, there should be some correlation (speaking loosely) between the amount of traffic at a light at the intersection and the amount of time that a car spends at that light. Right? If there are a lot of cars, you'd expect the light to be green for far longer than a light at which there are only one or two cars.

2. We cannot pin it down mathematically: You could, with a great amount of difficulty, try to pin it down mathematically, but heck, that's what learning's for, right? You make the machine do the dirty work and you collect the check at the end of the day.

3. We have data on it:
It's quite evident that collecting data for this kind of problem isn't much of a big deal. You could put up a bunch of cameras and have some grad students or interns write a computer vision program to count how many cars there are at a light, measure how long that light is green, and so on (this is just one example of how you might go about collecting data; it's ultimately up to you).

Problem: Calculating how much time a body takes to fall to the ground

1. A pattern exists:
Newton's laws of motion and all that have shown us that a pattern does indeed exist and that...
2. It can be described mathematically:
Since there is a closed-form solution to your problem, wouldn't you rather use that? Learning from data inherently has a generalization error associated with it. There are so many things that could possibly go wrong: you might choose the wrong hypothesis set (you might think that the relationship between the time of flight and the height of the object from the ground is linear when the height is actually quadratic in the time). Do you see what I am saying? Why would you want to settle for an approximate solution when an exact solution is well within your reach?
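To make the "wrong hypothesis set" point concrete, here is a minimal sketch, assuming a fall from rest in vacuum so that h = (1/2) g t^2 and the exact time is t = sqrt(2h/g). It fits a straight line to exact fall times over a small range of heights and then compares the line with the closed form outside that range. The training heights, the test height and the function names are illustrative assumptions, not anything from the course.

```python
import math

G = 9.81  # m/s^2, standard gravity (assumed constant, no air resistance)

def fall_time(height_m):
    """Closed-form time to fall from rest through height_m in vacuum."""
    return math.sqrt(2.0 * height_m / G)

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b

# Hypothetical training set: heights from 1 m to 20 m with noise-free times.
train_h = [float(h) for h in range(1, 21)]
train_t = [fall_time(h) for h in train_h]
a, b = fit_line(train_h, train_t)

def learned_time(height_m):
    """Prediction of the (deliberately wrong) linear hypothesis."""
    return a * height_m + b

# Compare inside and outside the range the line was fitted on.
for h in (10.0, 100.0):
    print(f"h = {h:5.1f} m: exact {fall_time(h):.2f} s, "
          f"linear model {learned_time(h):.2f} s")
```

Inside the fitted range the line is close to the exact answer, but far outside it the linear hypothesis overestimates the fall time badly, which is the kind of generalization error the closed form avoids.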
#3
04-09-2013, 04:52 PM
 arun.n Junior Member Join Date: Apr 2013 Posts: 3
Re: question 2

Thanks for the reply. However, I am still not clear on these points:

1. I do agree that there is a pattern relating the traffic flow into an intersection to the amount of time the green light should be on. But when you are doing machine learning, you need data points that look like

(traffic flow1, time1), (traffic flow2, time2) and so on.

If this is the training data, then obviously somebody arrived at these values. Because the time is set by humans, and because we are presenting these as training examples, it means that we already know the formula for computing the correct time.

2. If we don't have the exact formula for computing the traffic light times, then we might go about collecting the data by experimenting. I know that technologically it is not difficult to do, but what about the cost of experimenting with traffic lights at busy intersections? Is that something we want to do?

3. The time it takes for an object to fall to the ground can be modelled exactly only if the object is falling in a vacuum. If it's falling through the atmosphere, then it's no longer certain, because of the density of the atmosphere, wind conditions and so on. But physics is not my area and I don't want to make any strong claims on this.

4. I think it's easier to drop objects and record times than it is to conduct experiments at busy traffic junctions and drive people crazy.
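As a rough illustration of point 3, the sketch below compares the vacuum formula with a fall through air. It assumes a simple quadratic drag model, dv/dt = g (1 - (v / v_t)^2), where v_t is the terminal velocity. The drag model, the terminal velocity of 30 m/s and the function names are assumptions chosen for illustration, and wind is ignored entirely.

```python
import math

G = 9.81  # m/s^2, standard gravity

def fall_time_vacuum(height_m):
    """Closed-form fall time from rest with no air resistance."""
    return math.sqrt(2.0 * height_m / G)

def fall_time_with_drag(height_m, terminal_velocity, dt=1e-4):
    """Step dv/dt = g * (1 - (v / v_t)^2) forward in time until the
    object has fallen height_m, and return the elapsed time."""
    v = 0.0  # downward speed, m/s
    y = 0.0  # distance fallen, m
    t = 0.0  # elapsed time, s
    while y < height_m:
        v += G * (1.0 - (v / terminal_velocity) ** 2) * dt
        y += v * dt
        t += dt
    return t

height = 20.0  # metres, hypothetical drop height
v_t = 30.0     # m/s, hypothetical terminal velocity

print(f"vacuum:    {fall_time_vacuum(height):.3f} s")
print(f"with drag: {fall_time_with_drag(height, v_t):.3f} s")
```

Under these assumptions the fall with drag takes slightly longer than the vacuum formula predicts. The physics is still a model that can be written down, but its parameters (here v_t) have to be measured for each object.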

Anyway, I don't want to go on and on about this topic. I do feel that this question is quite subjective.

thanks
Arun
#4
04-09-2013, 05:21 PM
 Elroch Invited Guest Join Date: Mar 2013 Posts: 143
Re: question 2

This is a fascinating topic. I have no doubt that the answer is correct, in that this application is one suitable for machine learning, but the details are far from obvious.

Firstly, the time the lights are on for is not really a suitable target (unless the aim is merely to emulate conventionally programmed systems). This is something that is completely under the control of the system, not something to be predicted.

My first thought is to seek something to optimise. A bit of pondering leads me to the average time spent waiting by vehicles. With different traffic light schedules, this will vary, and it is a reasonable target for optimisation. [It might be improved by weighting long waits more heavily to avoid being excessively unfair to vehicles from minor roads.]
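A minimal sketch of such an objective, assuming we have a list of per-vehicle waiting times in seconds. Raising each wait to a power greater than 1 is just one possible way to weight long waits more heavily; the exponent of 2 and the example numbers are illustrative assumptions.

```python
def average_wait(wait_times_s):
    """Plain mean waiting time in seconds."""
    return sum(wait_times_s) / len(wait_times_s)

def weighted_wait(wait_times_s, power=2.0):
    """Mean of wait**power; power > 1 penalises long waits more heavily."""
    return sum(w ** power for w in wait_times_s) / len(wait_times_s)

# Hypothetical waits (seconds) observed under two schedules.
schedule_a = [10, 10, 10, 10, 110]  # main road flows, one minor-road car is stuck
schedule_b = [30, 30, 30, 30, 30]   # everyone waits the same moderate time

for name, waits in (("A", schedule_a), ("B", schedule_b)):
    print(f"schedule {name}: mean {average_wait(waits):.1f} s, "
          f"weighted {weighted_wait(waits):.1f}")
```

Both schedules have the same plain average, but the weighted cost prefers schedule B because it does not leave any single vehicle waiting for a very long time.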

But we also need to identify the set of inputs. Far from trivial, but the basic idea is to concisely represent every possible control protocol that might be considered usable. In addition, all the observed information that is available to the system must be included: this could be nothing at all, or it could include various information about vehicles reaching or leaving the junction.

As for input data there are three possibilities. The first is to use data from existing systems of any type at junctions that are similar to one of interest. The second is to use data from the junction itself and use some sort of reinforcement learning. The third is to use simulations of junctions, using realistic vehicle data, and predict the performance of a selection of control protocols. Machine learning could then be used to extend this knowledge to billions of protocols by generalisation. With a bit of luck.
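The third option can be sketched with a toy simulation. The model below is an assumption made purely for illustration: a junction with two approaches, one-second time steps, random arrivals at fixed hypothetical rates, and at most one vehicle per second leaving from whichever approach has green. It only scores a handful of fixed-cycle protocols by mean waiting time; a learning method would then try to generalise from such scores to protocols that were never simulated.

```python
import random
from collections import deque

def simulate_mean_wait(green_a_s, cycle_s, rate_a, rate_b,
                       horizon_s=3600, seed=0):
    """Mean wait (s) per vehicle at a toy two-approach junction.

    Approach A has green for the first green_a_s seconds of every cycle,
    approach B for the rest. rate_a and rate_b are per-second arrival
    probabilities.
    """
    rng = random.Random(seed)
    queue_a, queue_b = deque(), deque()  # arrival times of waiting vehicles
    total_wait = 0
    vehicles = 0
    for t in range(horizon_s):
        if rng.random() < rate_a:
            queue_a.append(t)
        if rng.random() < rate_b:
            queue_b.append(t)
        green_queue = queue_a if (t % cycle_s) < green_a_s else queue_b
        if green_queue:
            total_wait += t - green_queue.popleft()
            vehicles += 1
    # Count vehicles still queued at the end, so that starving an
    # approach is not rewarded.
    for queue in (queue_a, queue_b):
        for arrival in queue:
            total_wait += horizon_s - arrival
            vehicles += 1
    return total_wait / vehicles if vehicles else 0.0

# Hypothetical setting: 60 s cycle, busy approach A, quiet approach B.
CYCLE_S, RATE_A, RATE_B = 60, 0.3, 0.1
candidates = [10, 20, 30, 40, 50]  # seconds of green given to approach A

scores = {g: simulate_mean_wait(g, CYCLE_S, RATE_A, RATE_B) for g in candidates}
best = min(scores, key=scores.get)
for g in candidates:
    print(f"green for A = {g:2d} s: mean wait {scores[g]:7.1f} s")
print("best candidate:", best)
```

Because every candidate is run with the same seed, all protocols see the same arrival pattern, so differences in the scores come from the schedule rather than from luck in the traffic.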

I am now reminded that 23 years ago I had an interview at the Transport and Roads Research Laboratory where they asked me for my thoughts on how traffic light control systems could be designed. Given the naivety of my answer, I'm not sure how I got offered a post: in hindsight perhaps I should have accepted it rather than the one I did take.


All times are GMT -7.