#1
The PLA algorithm (eqn. 1.3) can be used to partition linearly separable data. What I'm curious about is which optimization criterion underlies eqn. 1.3. The figures on pp. 6-7 show that in the 2D case the algorithm converges to some straight-line decision boundary, and it is also qualitatively clear that many different straight lines would "work" equally well (i.e., give the same E_{in} error rate); however, PLA converges to a specific solution. PLA seems to provide both an optimization criterion and a method of solution, and the optimization criterion provides the uniqueness. Can the optimization criterion underlying PLA (eqn. 1.3) be spelled out explicitly? Thank you.
__________________
The whole is simpler than the sum of its parts. - Gibbs
#2
Quote:
__________________
Where everyone thinks alike, no one thinks very much
#3
I recall reading that section on SGD a few weeks ago; however, the connection with PLA somehow slipped past me. My sincere apologies. I suppose I was trying to keep my focus on ML paradigms and approaches, and although optimization is part and parcel of ML, I tried not to get sidetracked by its finer details. Lately I've started re-reading the book a bit more carefully, and I find myself appreciating the whole, and the subtle, even more than before!
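For anyone else who missed it, here is a minimal sketch of the SGD connection as I understand it. It assumes the per-example error e_n(w) = max(0, -y_n w^T x_n), whose (sub)gradient is -y_n x_n on a misclassified point and zero otherwise, so an SGD step with learning rate 1 reduces to the PLA update w <- w + y_n x_n. The function names, the toy data, and the choice to treat a score of exactly zero as misclassified are my own assumptions, not anything taken from the book.

```python
import random

def dot(w, x):
    """Inner product w^T x for plain Python lists."""
    return sum(wi * xi for wi, xi in zip(w, x))

def sgd_on_perceptron_error(data, eta=1.0, max_epochs=1000, seed=0):
    """SGD on e_n(w) = max(0, -y_n * w^T x_n).

    With eta = 1 each step is the PLA update w <- w + y_n * x_n.
    Ties (y_n * w^T x_n == 0) are treated as misclassified (an assumption),
    so that the all-zero starting weights can move.
    """
    rng = random.Random(seed)
    w = [0.0] * len(data[0][0])
    for _ in range(max_epochs):
        order = list(range(len(data)))
        rng.shuffle(order)              # visit the points in random order
        updated = False
        for i in order:
            x, y = data[i]
            if y * dot(w, x) <= 0:      # nonzero (sub)gradient: -y * x
                w = [wi + eta * y * xi for wi, xi in zip(w, x)]
                updated = True
        if not updated:                 # a full pass with no update: done
            return w
    return w

# Hypothetical toy data: x = (1, x1, x2) with the bias coordinate first.
toy_data = [
    ([1.0, 2.0, 2.0], +1),
    ([1.0, 3.0, 1.0], +1),
    ([1.0, -1.0, -2.0], -1),
    ([1.0, -2.0, -1.0], -1),
]

w_final = sgd_on_perceptron_error(toy_data)
print(w_final)
```

One caveat on my original question: every separating w has zero error under this criterion, so the criterion by itself does not seem to single out one line. Which solution comes out depends on the starting weights and the order in which points are visited, so a different `seed` may give a different `w_final`.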
__________________
The whole is simpler than the sum of its parts. - Gibbs