LFD Book Forum Discussion of Lecture 11 "Overfitting"
#1
02-13-2013, 08:49 PM
 vikasatkin Caltech Join Date: Sep 2011 Posts: 39
Discussion of Lecture 11 "Overfitting"

Links: [Lecture 11 slides] [all slides] [Lecture 11 video]

Question: (Slide 10/23) It seems that this situation is not typical, because the data points are bunched together. Shouldn't we space the points evenly? Would we get a different result in that case?

Answer: To make sure that the general result is not a fluke, the experiment was repeated many times for each value of the noise level and each order of the polynomial. In each run the points were chosen independently according to the uniform distribution. You can see this result on slide 13/23. You may see some coincidences in the example on slide 10/23, but they were presumably averaged out.

So the short answer is: you may interpret the figure on slide 10/23 as an illustration, and slide 13/23 as the final result.
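To see how such averaging works, here is a minimal sketch. It is not the lecture's exact experiment: the stand-in target sin(pi*x), the sample sizes, and the function names are my own illustrative choices. Each run draws the points i.i.d. uniformly from [-1, 1] (as in the answer above), fits a 2nd- and a 10th-order polynomial, and the difference in out-of-sample error is averaged over many independent runs.

```python
import numpy as np

def run_trial(N=20, sigma=0.5, rng=None):
    # One run: noisy samples of a stand-in target sin(pi*x), with the
    # points chosen independently and uniformly from [-1, 1].
    if rng is None:
        rng = np.random.default_rng()
    x = rng.uniform(-1.0, 1.0, N)
    y = np.sin(np.pi * x) + sigma * rng.standard_normal(N)
    # Out-of-sample squared error of 2nd- vs 10th-order polynomial fits,
    # estimated on a dense noiseless grid.
    xt = np.linspace(-1.0, 1.0, 200)
    yt = np.sin(np.pi * xt)
    e2, e10 = (np.mean((np.polyval(np.polyfit(x, y, d), xt) - yt) ** 2)
               for d in (2, 10))
    return e10 - e2  # "overfit measure": positive when the 10th order overfits

# A single run can show coincidental patterns; averaging over many
# independent runs smooths them out.
rng = np.random.default_rng(0)
avg_overfit = np.mean([run_trial(rng=rng) for _ in range(500)])
```

Plotting any single run reproduces the kind of "illustration" on slide 10/23, while the averaged quantity corresponds in spirit to the aggregated result on slide 13/23.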
#2
02-13-2013, 09:00 PM
 vikasatkin Caltech Join Date: Sep 2011 Posts: 39
Discussion of Lecture 11 "Overfitting"

Question: (Slide 11/23) How did you generate the polynomials? How did you choose the coefficients?

Answer: Here is a technical description of the process of generating the target function and the dataset (which may be useful if you want to reproduce the pictures from slide 13/23). It is actually described in the "Learning From Data" book on p. 123 (Section 4.1.2, "Catalysts for Overfitting").

The process of generating the target function depends on two parameters: $Q_f$ (the degree of the generated polynomial) and $\sigma^2$ (the noise level). Of course, you also need $N$ --- the number of points in the dataset.

1. Take the Legendre polynomials $L_0(x), L_1(x), \ldots, L_{Q_f}(x)$. Note that they are normalized according to their value at $x = 1$ (i.e. $L_q(1) = 1$), not their average square.
2. Choose coefficients $a_0, a_1, \ldots, a_{Q_f}$ independently according to the standard normal distribution.
3. Generate $N$ points $x_1, \ldots, x_N$ (pick them randomly from $[-1, 1]$ according to the uniform distribution, independently from each other).
4. For every point generate the noise $\epsilon_n \sim N(0, 1)$.

The target is given by $f(x) = \alpha \sum_{q=0}^{Q_f} a_q L_q(x)$, and the data points by $y_n = f(x_n) + \sigma \epsilon_n$. Here $\alpha$ is a normalization constant which depends only on $Q_f$. It is chosen in such a way that the mean square value of $f$ is equal to 1 (mean with respect to both $x$ and the choices of $a_q$ we made during this process: $\mathbb{E}_{a,x}[f^2(x)] = 1$). One can compute that $\mathbb{E}_x[L_q^2(x)] = \frac{1}{2q+1}$, so $\alpha = \left( \sum_{q=0}^{Q_f} \frac{1}{2q+1} \right)^{-1/2}$.
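The recipe above can be sketched in a few lines of NumPy. This is only a sketch of the process as described in this post; the function name generate_dataset and the argument defaults are mine, not from the lecture or the book.

```python
import numpy as np
from numpy.polynomial.legendre import legval

def generate_dataset(Qf, sigma2, N, rng=None):
    """Sketch of the target/dataset generation described above."""
    if rng is None:
        rng = np.random.default_rng()
    # Step 2: coefficients a_0, ..., a_Qf i.i.d. standard normal.
    a = rng.standard_normal(Qf + 1)
    # Normalization: E_x[L_q(x)^2] = 1/(2q+1) for x uniform on [-1, 1],
    # so this alpha makes E_{a,x}[f^2] = 1.
    alpha = 1.0 / np.sqrt(np.sum(1.0 / (2.0 * np.arange(Qf + 1) + 1.0)))
    # f(x) = alpha * sum_q a_q L_q(x); legval evaluates a Legendre series.
    f = lambda x: alpha * legval(x, a)
    # Step 3: x_n uniform on [-1, 1]; step 4: noise eps_n ~ N(0, 1).
    x = rng.uniform(-1.0, 1.0, N)
    y = f(x) + np.sqrt(sigma2) * rng.standard_normal(N)
    return x, y, f
```

As a sanity check of the normalization, averaging $f^2(x)$ over many independent draws of the coefficients and of $x$ should come out close to 1.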





The contents of this forum are to be used ONLY by readers of the Learning From Data book by Yaser S. Abu-Mostafa, Malik Magdon-Ismail, and Hsuan-Tien Lin, and participants in the Learning From Data MOOC by Yaser S. Abu-Mostafa. No part of these contents is to be communicated or made accessible to ANY other person or entity.