LFD Book Forum  

Book Feedback - Learning From Data > Chapter 5 - Three Learning Principles

08-05-2012, 03:44 AM
rainbow
Join Date: Jul 2012
Posts: 41
Data snooping (test vs. train data)

Do I understand data snooping correctly if I treat it as an issue that concerns only the test data, that is, any case where inspecting the test data affects the learning in some way? For example:
- The test data has been used for estimation.
- The learning model is changed after evaluating its performance on the test data.

How does data snooping relate to the training data, if at all? How much can you look into this data? Is it a data snooping violation to look at the target variable y when you are doing exploratory data analysis such as PCA, or when you want to create features? For example, suppose you want to create a non-linear feature by cutting a continuous variable such as age into a discrete feature, choosing the cut points with respect to y.
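
To make the age example concrete, here is a minimal sketch of what I have in mind. It is only an illustration of the scenario, not a claim about what is or is not allowed. The data, the noisy target, the 150/50 split, and the helper names `best_cut` and `discretize` are all made up for this post. The cut point is chosen by looking at y, but only the y values of the training split; the test labels are never consulted.

```python
import random

def best_cut(ages, labels):
    """Return the age threshold with the fewest misclassifications on the
    given data, for the rule 'age >= threshold' (or its negation)."""
    best_t, best_err = None, float("inf")
    for t in sorted(set(ages)):
        err = sum((a >= t) != bool(y) for a, y in zip(ages, labels))
        err = min(err, len(ages) - err)  # also allow the flipped rule
        if err < best_err:
            best_t, best_err = t, err
    return best_t

def discretize(ages, cut):
    """Turn the continuous age into a binary feature using a fixed cut."""
    return [int(a >= cut) for a in ages]

# Hypothetical data: age in years and a noisy binary target y.
random.seed(0)
ages = [random.randint(18, 80) for _ in range(200)]
labels = [int(a >= 50) if random.random() < 0.9 else int(a < 50) for a in ages]

train_ages, test_ages = ages[:150], ages[150:]
train_y, test_y = labels[:150], labels[150:]

cut = best_cut(train_ages, train_y)      # looks at y, training split only
train_feat = discretize(train_ages, cut)
test_feat = discretize(test_ages, cut)   # same cut reused; test_y not used
```

The part I am unsure about is whether choosing `cut` this way already counts as snooping, given that y from the training data was used to design the feature before any learning took place.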
The contents of this forum are to be used ONLY by readers of the Learning From Data book by Yaser S. Abu-Mostafa, Malik Magdon-Ismail, and Hsuan-Tien Lin, and participants in the Learning From Data MOOC by Yaser S. Abu-Mostafa. No part of these contents is to be communicated or made accessible to ANY other person or entity.