LFD Book Forum (http://book.caltech.edu/bookforum/index.php)
-   Homework 6 (http://book.caltech.edu/bookforum/forumdisplay.php?f=135)
-   -   Usage of test data in early stopping (http://book.caltech.edu/bookforum/showthread.php?t=989)

rainbow 08-12-2012 10:38 AM

Usage of test data in early stopping
 
The test data is used to measure the performance of the final hypothesis g* on new data. Once you evaluate g* on the test data there is no going back: if that evaluation influences any further training, the test data is contaminated and no longer gives an unbiased estimate.

Given this, it seems strange to use the test data iteratively as in the early stopping method. Is this really test data or is it validation data (as in train + validation + test)?

Edit: I ask because in the lecture the Professor says "test set".

yaser 08-12-2012 04:06 PM

Re: Usage of test data in early stopping
 
You are right about the role of a test set. Once a decision such as early stopping is based on it, this is indeed a validation set, not a test set. As noted, when the decision is simple (a single parameter, such as when to stop), the contamination is minimal.
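
To make the distinction concrete, here is a minimal sketch of early stopping with a three-way split. It is an illustration under stated assumptions, not the setup used in the lecture: the toy target y = 2x + 1, the noise level, the set sizes, the learning rate, the patience rule, and all function names are hypothetical. The point is only that the stopping decision looks at the validation set, while the test set is evaluated once, after every decision has been made.

```python
import random

def make_data(n, rng):
    # Hypothetical toy target: y = 2x + 1 plus Gaussian noise.
    data = []
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        data.append((x, 2.0 * x + 1.0 + rng.gauss(0.0, 0.3)))
    return data

def mse(w, b, data):
    # Mean squared error of the line h(x) = w*x + b on a data set.
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)

def train_with_early_stopping(train, val, lr=0.1, max_epochs=500, patience=10):
    w, b = 0.0, 0.0
    # Best point seen so far: (validation error, w, b, epoch).
    best = (mse(w, b, val), w, b, 0)
    for epoch in range(1, max_epochs + 1):
        # One full-batch gradient descent step, using the training set only.
        gw = sum(2 * (w * x + b - y) * x for x, y in train) / len(train)
        gb = sum(2 * (w * x + b - y) for x, y in train) / len(train)
        w, b = w - lr * gw, b - lr * gb
        # The stopping decision is based on the validation set.
        e_val = mse(w, b, val)
        if e_val < best[0]:
            best = (e_val, w, b, epoch)
        elif epoch - best[3] >= patience:
            break  # no improvement for `patience` epochs
    return best

rng = random.Random(0)
train, val, test = make_data(40, rng), make_data(20, rng), make_data(200, rng)

e_val, w, b, stop_epoch = train_with_early_stopping(train, val)

# The test set is touched exactly once, after the stopping epoch is fixed.
e_test = mse(w, b, test)
print("stopped at epoch", stop_epoch, "E_val =", e_val, "E_test =", e_test)
```

In this sketch E_val is slightly optimistic, because the stopping epoch was chosen to minimize it, while E_test remains an unbiased estimate. Since only one parameter (the stopping epoch) is selected, the optimism in E_val is expected to be small.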

rainbow 08-13-2012 10:06 AM

Re: Usage of test data in early stopping
 
Thanks. Your feedback in the forum is highly appreciated.


The contents of this forum are to be used ONLY by readers of the Learning From Data book by Yaser S. Abu-Mostafa, Malik Magdon-Ismail, and Hsuan-Tien Lin, and participants in the Learning From Data MOOC by Yaser S. Abu-Mostafa. No part of these contents is to be communicated or made accessible to ANY other person or entity.