LFD Book Forum > Book Feedback - Learning From Data > Chapter 1 - The Learning Problem
#1 | 01-19-2013, 01:28 PM
cygnids (Member, Join Date: Jan 2013, Posts: 11)

PLA Optimization criteria

The PLA algorithm, eqn. 1.3, can be used to partition linearly separable data. What I'm curious about is what optimization criterion underlies eqn. 1.3. The figures on pp. 6-7 show that, for a 2D case, the algorithm converges to some straight-line decision boundary, and it is also qualitatively clear that many different straight lines would "work" equally well (i.e., give the same E_{in} error rate); yet PLA converges to a specific solution. The PLA algorithm seems to provide both an optimization criterion and a method of solution, with the criterion supplying the uniqueness. Can the optimization criterion underlying PLA (eqn. 1.3) be spelled out explicitly? Thank you.
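For concreteness, here is a minimal sketch of the update in eqn. 1.3 on toy data. The data generation, seed, and margin filter are my own assumptions (not from the book); the point is that a fixed pick order for the misclassified point makes the run, and hence the final separator, deterministic, even though many separators have the same E_{in}.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linearly separable data: label 2D points by a known target line,
# keeping only points a safe distance from the boundary (guarantees a margin).
w_target = np.array([0.1, 1.0, -0.5])            # (bias, w1, w2), arbitrary choice
pts = rng.uniform(-1, 1, size=(200, 2))
Xb = np.hstack([np.ones((len(pts), 1)), pts])    # prepend x0 = 1
keep = np.abs(Xb @ w_target) > 0.1
Xb, y = Xb[keep], np.sign(Xb[keep] @ w_target)

def pla(Xb, y, max_iters=100_000):
    """PLA (eqn. 1.3): while some point is misclassified, w <- w + y_n * x_n."""
    w = np.zeros(Xb.shape[1])
    for _ in range(max_iters):
        mis = np.flatnonzero(np.sign(Xb @ w) != y)
        if mis.size == 0:
            return w                             # separator found: E_in = 0
        n = mis[0]                               # fixed pick order -> one specific run
        w = w + y[n] * Xb[n]
    raise RuntimeError("did not converge")

w_final = pla(Xb, y)
assert np.all(np.sign(Xb @ w_final) == y)        # zero in-sample error
```

Rerunning with a different pick order (e.g. `mis[-1]` or a random choice) typically ends at a different, equally valid separator, which is what prompted my question about which criterion singles one out.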
__________________
The whole is simpler than the sum of its parts. - Gibbs
#2 | 01-21-2013, 03:51 PM
yaser (Caltech, Join Date: Aug 2009, Location: Pasadena, California, USA, Posts: 1,477)

Re: PLA Optimization criteria

Quote:
Originally Posted by cygnids
The PLA algorithm, eqn. 1.3, can be used to partition linearly separable data. What I'm curious about is what optimization criterion underlies eqn. 1.3. The figures on pp. 6-7 show that, for a 2D case, the algorithm converges to some straight-line decision boundary, and it is also qualitatively clear that many different straight lines would "work" equally well (i.e., give the same E_{in} error rate); yet PLA converges to a specific solution. The PLA algorithm seems to provide both an optimization criterion and a method of solution, with the criterion supplying the uniqueness. Can the optimization criterion underlying PLA (eqn. 1.3) be spelled out explicitly? Thank you.
The optimization criterion for the PLA can be viewed as an application of Stochastic Gradient Descent to a particular error measure (Exercise 3.10). This is really just an artificial way of looking at it. A genuine optimization criterion based on margins leads to support vector machines.
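To spell out the SGD view in code (a sketch under my reading of Exercise 3.10, not the book's own code): the per-point error measure there is, as I understand it, e_n(w) = max(0, -y_n wᵀx_n). For a misclassified point this has (sub)gradient -y_n x_n, so one SGD step with learning rate 1 reproduces the PLA update exactly.

```python
import numpy as np

def sgd_step(w, x, y, eta=1.0):
    # e(w) = max(0, -y * w.x): the subgradient is -y*x when the point is
    # misclassified (y * w.x <= 0) and 0 when it is correctly classified.
    if y * (w @ x) <= 0:
        w = w - eta * (-y * x)        # i.e. w + eta*y*x; eta=1 gives eqn. 1.3
    return w

w0 = np.zeros(3)
x, y = np.array([1.0, 0.5, -0.2]), 1.0

w1 = sgd_step(w0, x, y)               # misclassified -> same step as PLA
assert np.allclose(w1, w0 + y * x)

w2 = sgd_step(w1, x, y)               # now correctly classified -> no change
assert np.allclose(w2, w1)
```

This also shows why the criterion feels "artificial": the error is zero for every separating w, so it ranks all separators equally and the particular solution is determined by the trajectory of updates, not by the objective.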
__________________
Where everyone thinks alike, no one thinks very much
#3 | 01-22-2013, 01:47 PM
cygnids (Member, Join Date: Jan 2013, Posts: 11)

Re: PLA Optimization criteria

A few weeks ago I recall reading that section on SGD, but the connection with PLA somehow slipped past me. My sincere apologies. At the time, I suppose I was trying to keep my focus on ML paradigms and approaches, and much as optimization is part and parcel of ML, I tried not to get sidetracked by the finer details of optimization. Lately I've started re-reading the book more carefully, and I find myself appreciating the whole, and the subtle, even more than before! Thank you for taking the trouble to point out the section. I do appreciate it.
__________________
The whole is simpler than the sum of its parts. - Gibbs
The contents of this forum are to be used ONLY by readers of the Learning From Data book by Yaser S. Abu-Mostafa, Malik Magdon-Ismail, and Hsuan-Tien Lin, and participants in the Learning From Data MOOC by Yaser S. Abu-Mostafa. No part of these contents is to be communicated or made accessible to ANY other person or entity.