LFD Book Forum  

LFD Book Forum > Course Discussions > Online LFD course > Homework 5

08-09-2012, 04:45 AM
hashable
Junior Member
Join Date: Jul 2012
Posts: 8
Questions on Lecture 9 (Linear Models II)

1. In the example in the lecture, we were cautioned against data snooping, since looking at the data can mean we are implicitly doing some learning in our heads. My question is: is it legitimate to look at DataSet 1 to identify my predictors, and then train on DataSet 2, whose samples are entirely different from DataSet 1? The out-of-sample error would, of course, be evaluated on DataSet 3, different from both 1 and 2.
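To make the three-way split in the question concrete, here is a minimal sketch (with made-up data and hypothetical variable names) of carving one sample into three pairwise-disjoint sets: one to look at for choosing predictors, one to train on, and one reserved for the final out-of-sample estimate:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dataset: 300 samples, 5 features.
X = rng.normal(size=(300, 5))
y = rng.integers(0, 2, size=300)

# Shuffle once, then carve out three disjoint subsets.
idx = rng.permutation(len(X))
snoop_idx, train_idx, test_idx = idx[:100], idx[100:200], idx[200:]

X_snoop, y_snoop = X[snoop_idx], y[snoop_idx]   # look here to pick predictors
X_train, y_train = X[train_idx], y[train_idx]   # fit the model here
X_test,  y_test  = X[test_idx],  y[test_idx]    # estimate E_out here, once

# The index sets are pairwise disjoint, so nothing learned by eyeballing
# X_snoop can leak into the test estimate through shared samples.
assert set(snoop_idx).isdisjoint(train_idx)
assert set(train_idx).isdisjoint(test_idx)
assert set(snoop_idx).isdisjoint(test_idx)
```

The caveat (which the question itself hints at) is that the snooping set is "spent" once you have looked at it: it can no longer contribute to an unbiased error estimate.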

2. At the end of the lecture, somebody asked a question about multiclass classifiers, and the answer was that this is commonly done using either one-vs-all or one-vs-one training. My questions:
  • 2-a) For one-vs-all, we only need to build n classifiers for n classes, whereas for one-vs-one we have to build n-choose-2 classifiers, which can take much longer when there are many classes. Are there any inherent benefits to one-vs-one? If not, why use it at all, since one-vs-all is faster to train?
  • 2-b) Are there any reasons why one method is preferable over the other? E.g., does the choice affect accuracy or generalization?
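The classifier counts in 2-a can be sketched directly; the enumeration below (with an assumed n = 10 classes) lists the binary training tasks each scheme would create:

```python
from itertools import combinations

n_classes = 10

# One-vs-all: one binary classifier per class (class c vs. everything else).
ova_tasks = [(c, "rest") for c in range(n_classes)]

# One-vs-one: one binary classifier per unordered pair of classes.
ovo_tasks = list(combinations(range(n_classes), 2))

print(len(ova_tasks))  # 10
print(len(ovo_tasks))  # 45 == n_classes * (n_classes - 1) // 2
```

Note that although one-vs-one builds many more classifiers, each one is trained on only the two classes' data, so the individual problems are smaller and less imbalanced than the one-vs-all problems, which pit one class against all the rest.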

3. We used cross-entropy error for logistic regression and squared error for linear regression. It was explained that the error measure is chosen so that the math of the minimization stays tractable, and in both cases the practical interpretation was explained and seems intuitive. My questions:
  • 3-a) Does the choice of error measure affect the final hypothesis? In other words, will we get a different g depending on whether we use cross-entropy, squared, or any other error function? (Ignore the complexity of the minimization for now.)
  • 3-b) If we optimize to find g using one error function but evaluate using a different error function, will the evaluation be meaningful? E.g., use squared error to evaluate the out-of-sample performance of a logistic model built by minimizing cross-entropy error.
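As a small illustration of 3-b, nothing stops us from scoring the same probabilistic predictions with both error measures. The sketch below uses made-up predictions h(x) and labels y in {0, 1} (the equivalent 0/1 form of the lecture's cross-entropy, which there is written for y in {-1, +1}):

```python
import numpy as np

# Hypothetical probabilistic predictions h(x) in (0, 1) and true labels in {0, 1}.
h = np.array([0.9, 0.2, 0.7, 0.4])
y = np.array([1, 0, 1, 1])

# Cross-entropy (log loss): the measure logistic regression minimizes.
cross_entropy = -np.mean(y * np.log(h) + (1 - y) * np.log(1 - h))

# Squared error on the same predictions.
squared = np.mean((h - y) ** 2)

print(cross_entropy, squared)
```

Both numbers are well defined for the same g, so the evaluation is mechanically meaningful; the subtlety in 3-a/3-b is that the g that minimizes one in-sample measure need not be the minimizer of the other, and cross-entropy punishes confident wrong predictions (h near 0 when y = 1) far more harshly than squared error does.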

Tags: data snooping, error, generalization error, multi-class classifiers

The contents of this forum are to be used ONLY by readers of the Learning From Data book by Yaser S. Abu-Mostafa, Malik Magdon-Ismail, and Hsuan-Tien Lin, and participants in the Learning From Data MOOC by Yaser S. Abu-Mostafa. No part of these contents is to be communicated or made accessible to ANY other person or entity.