LFD Book Forum calculation 4.8 on p.139
#1
08-06-2013, 09:14 AM
 arapmv Junior Member Join Date: Aug 2013 Location: Baltimore Posts: 4
calculation 4.8 on p.139

Hello,

I have a question regarding the calculation (4.8) on p.139 in the book. The final hypothesis $g^-$ depends, albeit indirectly, on the choice of the validation set $\mathcal{D}_{val}$. Indeed, by construction we train on the complement of $\mathcal{D}_{val}$ in $\mathcal{D}$, which makes $g^-$ dependent on the choice of $\mathcal{D}_{val}$. The derivation (4.8) appears to rely on the assumption that $g^-$ is independent of $\mathcal{D}_{val}$.

Does anyone have any comments on this?
#2
08-07-2013, 10:48 AM
 magdon RPI Join Date: Aug 2009 Location: Troy, NY, USA. Posts: 597
Re: calculation 4.8 on p.139

I will try to reformulate the construction of Dval in such a way that the independence is patent, and then try to suggest where your confusion may be coming from.

Construction 1 of Dval: Randomly generate D. Randomly partition it into Dtrain (N-K points) and Dval (K points). Learn on Dtrain to obtain $g^-$ and compute Eval of $g^-$ using Dval. (This is the standard validation setting in the book.)

Construction 2 of Dval: Randomly generate N-K points to form Dtrain. Learn on Dtrain to obtain $g^-$. Now, randomly generate another K points to form Dval. Compute Eval of $g^-$ using Dval.

It is patently clear in Construction 2 that we are computing an unbiased estimate of Eout of $g^-$, essentially by definition of Eout. There is no difference between constructions 1 and 2 in terms of the Dtrain and Dval they produce (statistically). Randomly generating N points and splitting them randomly into N-K and K points is statistically equivalent to first randomly generating N-K points and then another K random points. In construction 1 you generate both Dtrain and Dval at the beginning, process Dtrain, and then test on Dval. In construction 2, you only generate Dval after you have processed Dtrain. But Dval still has the same statistical properties in both cases.
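The equivalence of the two constructions can also be checked numerically. The following is only a minimal sketch under assumptions that are not from the book or this thread: the target y = 2x + Gaussian noise with x uniform on [-1, 1], the constant-hypothesis learner, the squared error, and all names (`construction_1`, `construction_2`, `average_gap`, and so on) are illustrative choices. For this toy setup Eout of a constant hypothesis b has the closed form 4/3 + sigma^2 + b^2, so the average Eval can be compared against the average Eout directly.

```python
import random
import statistics

N, K = 20, 5          # total points and validation points (illustrative values)
NOISE_SD = 0.5        # assumed noise level of the toy target


def sample_point(rng):
    """Draw one iid point from the assumed toy distribution y = 2x + noise."""
    x = rng.uniform(-1.0, 1.0)
    y = 2.0 * x + rng.gauss(0.0, NOISE_SD)
    return x, y


def learn(train):
    """Toy learner: the constant hypothesis g^-(x) = mean of the training y's."""
    return statistics.fmean(y for _, y in train)


def e_val(b, val):
    """Validation error of the constant hypothesis b (squared error)."""
    return statistics.fmean((y - b) ** 2 for _, y in val)


def e_out(b):
    """Exact Eout of the constant hypothesis b for this toy distribution.

    E[(2x + eps - b)^2] = 4*E[x^2] + sigma^2 + b^2 with E[x^2] = 1/3.
    """
    return 4.0 / 3.0 + NOISE_SD ** 2 + b ** 2


def construction_1(rng):
    """Generate D with N points, partition it at random, train, validate."""
    data = [sample_point(rng) for _ in range(N)]
    rng.shuffle(data)
    train, val = data[:N - K], data[N - K:]
    b = learn(train)
    return e_val(b, val), e_out(b)


def construction_2(rng):
    """Generate Dtrain, train, and only then generate a fresh Dval."""
    train = [sample_point(rng) for _ in range(N - K)]
    b = learn(train)
    val = [sample_point(rng) for _ in range(K)]
    return e_val(b, val), e_out(b)


def average_gap(construction, trials, seed):
    """Return (mean Eval, mean Eout) over many independent repetitions."""
    rng = random.Random(seed)
    results = [construction(rng) for _ in range(trials)]
    mean_val = statistics.fmean(v for v, _ in results)
    mean_out = statistics.fmean(o for _, o in results)
    return mean_val, mean_out


if __name__ == "__main__":
    for name, construction in [("construction 1", construction_1),
                               ("construction 2", construction_2)]:
        mean_val, mean_out = average_gap(construction, trials=20000, seed=0)
        print(f"{name}: mean Eval = {mean_val:.3f}, mean Eout = {mean_out:.3f}")
```

With enough trials, the mean Eval and the mean Eout should agree closely within each construction, and the two constructions should give matching averages up to simulation noise.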

Now for where you may be getting subtly confused. It is true that the value of Eval will change based on what specific partition was selected, in part because $g^-$ changes and in part because Dval also changes. This means that $g^-$ depends on the partition. This is equivalent to saying (in the construction 2 setting) that $g^-$ depends on the particular Dtrain generated (no surprise there). However, $g^-$ does not depend on the contents of Dval: if you change the data points in Dval, $g^-$ will not change. The expectation in (4.8) is an expectation over the data points in Dval, and $g^-$ does not depend on those (the partition is fixed, and we are now looking at which points are in Dval).
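One way to read the steps of (4.8) is sketched below. This is a reading under stated assumptions, not a transcription of the book: the labeling of the validation points as $(x_1, y_1), \dots, (x_K, y_K)$ and the pointwise error symbol $e$ are notation chosen here for illustration. The key point is that the sum runs over the K fixed positions in Dval, while the expectation is over the random points that fill those positions, with $g^-$ held fixed.

```latex
\begin{align*}
E_{val}(g^-) &= \frac{1}{K} \sum_{n=1}^{K} e\bigl(g^-(x_n), y_n\bigr), \\
\mathbb{E}_{\mathcal{D}_{val}}\bigl[E_{val}(g^-)\bigr]
  &= \frac{1}{K} \sum_{n=1}^{K}
     \mathbb{E}_{\mathcal{D}_{val}}\bigl[e\bigl(g^-(x_n), y_n\bigr)\bigr]
     && \text{(linearity of expectation)} \\
  &= \frac{1}{K} \sum_{n=1}^{K}
     \mathbb{E}_{(x_n, y_n)}\bigl[e\bigl(g^-(x_n), y_n\bigr)\bigr]
     && \text{(the $n$-th term involves only the $n$-th point)} \\
  &= \frac{1}{K} \sum_{n=1}^{K} E_{out}(g^-)
     && \text{(definition of $E_{out}$, since $g^-$ does not depend on $(x_n, y_n)$)} \\
  &= E_{out}(g^-).
\end{align*}
```

Read this way, nothing is pulled out from under the expectation: the index set of the sum is just the positions 1 through K, which are not random.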

__________________
Have faith in probability
#3
08-09-2013, 05:29 PM
 arapmv Junior Member Join Date: Aug 2013 Location: Baltimore Posts: 4
Re: calculation 4.8 on p.139

Thank you very much for the lucid explanation!
#4
08-09-2013, 06:02 PM
 arapmv Junior Member Join Date: Aug 2013 Location: Baltimore Posts: 4
Re: calculation 4.8 on p.139

On second thought, I am still not seeing the statistical equivalence of the two constructions. The first construction will always produce two disjoint sets $\mathcal{D}_{train}$ and $\mathcal{D}_{val}$ of sizes $N-K$ and $K$, respectively. The second construction may easily produce two non-disjoint sets. Is this a problem?

Admin Edit: Replaced $ with [math] tag in math expressions.
#5
08-10-2013, 08:57 AM
 magdon RPI Join Date: Aug 2009 Location: Troy, NY, USA. Posts: 597
Re: calculation 4.8 on p.139

Not a problem. The two constructions generate identically distributed Dtrain and Dval. The partitioning approach in construction 1 generates data points with disjoint indices. That does not mean that the data points in Dtrain cannot also appear in Dval. (Remember that when you generate D, the data points are iid, so there can be repetitions.)
#6
08-11-2013, 11:14 AM
 arapmv Junior Member Join Date: Aug 2013 Location: Baltimore Posts: 4
Re: calculation 4.8 on p.139

Thanks for the explanation! I did not realize that we are doing sampling with replacement to construct the training set and the validation set. Now it is clear why the two constructions you gave are equivalent!

The first step of the calculation (4.8) on p.139 is still not sinking in somehow. It feels like in the first step we pulled out a variable (namely, $\mathcal{D}_{val}$) from under the integral sign and placed it as an indexing set for the summation. What is the correct mathematical interpretation of this step?

Thank you for your clear explanations and patience!

All times are GMT -7.
The contents of this forum are to be used ONLY by readers of the Learning From Data book by Yaser S. Abu-Mostafa, Malik Magdon-Ismail, and Hsuan-Tien Lin, and participants in the Learning From Data MOOC by Yaser S. Abu-Mostafa. No part of these contents is to be communicated or made accessible to ANY other person or entity.