#1
I don't quite understand Problem 3.17(b). What does it mean to minimize E1 over all possible (∆u, ∆v)? I would have thought we should instead minimize E(u+∆u, v+∆v), starting from the point (u,v) = (0,0). Does the optimal column vector [∆u, ∆v]^T correspond to v_t in the gradient descent algorithm (here, as the problem says, it is −∇E(u,v)), the norm ||(∆u,∆v)|| = 0.5 to the step size η, and (u,v) to the weight vector w? And what does it mean to compute the optimal (∆u, ∆v)?
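For reference, the correspondence asked about above can be written out as a short derivation. This is a sketch that assumes E1 is the first-order Taylor approximation of E around (u,v), as in part (a), and it uses only the Cauchy-Schwarz inequality; the symbol $\Delta$ for the column vector $[\Delta u, \Delta v]^T$ is shorthand introduced here.

```latex
\begin{align*}
\hat{E}_1(\Delta u, \Delta v)
  &= E(u,v) + \nabla E(u,v)^{T} \Delta,
  \qquad \Delta = [\Delta u, \Delta v]^{T}, \\
\nabla E(u,v)^{T} \Delta
  &\ge -\,\|\nabla E(u,v)\| \, \|\Delta\|
   = -\,0.5\, \|\nabla E(u,v)\|
  \qquad \text{when } \|\Delta\| = 0.5, \\
\Delta^{*}
  &= -\,0.5\, \frac{\nabla E(u,v)}{\|\nabla E(u,v)\|}
  \qquad \text{(equality case: } \Delta \text{ parallel to } -\nabla E(u,v)\text{)}.
\end{align*}
```

Under that reading, the unit vector $-\nabla E / \|\nabla E\|$ plays the role of v_t, the fixed length 0.5 plays the role of the step size η, and (u,v) plays the role of w. The minimization is a single constrained step, not an iteration, so no initial value for (∆u, ∆v) is needed.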
#2
__________________
Have faith in probability
#3
Yes, E1 is a function of ∆u and ∆v, but it is also a function of u and v. So what are u and v in this function? Do we still use (0, 0), as part (a) says? Also, what is the initial value of ∆u and ∆v? In the textbook, w is set to w(0) at step 0.
Further, does the norm ||(∆u,∆v)|| = 0.5 mean that at each iteration we must ensure that the values of ∆u and ∆v meet this requirement? Another point is that in the textbook we need to specify the step size η, but the problem gives no information about the step size. I don't quite understand the description of Problem 3.17(b), which is why I have so many questions. Could you clarify it for me?
#4
Yes, in this problem you can use (u,v) = (0,0) from part (a).
||(∆u,∆v)|| = 0.5 means that the step size [formula image missing]. In the chapter we considered two step sizes. First, where the step size was fixed at [formula image missing]
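To make the fixed-length step concrete, here is a minimal Python sketch of part (b) at (u,v) = (0,0). The thread does not quote the error function, so the definition of `E` below is an assumption (the form of E(u,v) as I recall it from the textbook problem); substitute the book's definition if it differs. The helper names `grad_E` and `first_order_step` are hypothetical and chosen only for this illustration.

```python
import math

def E(u, v):
    # ASSUMPTION: error surface as recalled from the textbook problem;
    # it is not stated anywhere in this thread.
    return (math.exp(u) + math.exp(2 * v) + math.exp(u * v)
            + u * u - 3 * u * v + 4 * v * v - 3 * u - 5 * v)

def grad_E(u, v):
    # Analytic partial derivatives of the assumed E above.
    dE_du = math.exp(u) + v * math.exp(u * v) + 2 * u - 3 * v - 3
    dE_dv = 2 * math.exp(2 * v) + u * math.exp(u * v) - 3 * u + 8 * v - 5
    return dE_du, dE_dv

def first_order_step(u, v, length=0.5):
    # Minimizer of the first-order approximation over all steps of the
    # given length: the negative gradient direction, rescaled to `length`.
    gu, gv = grad_E(u, v)
    norm = math.hypot(gu, gv)
    return -length * gu / norm, -length * gv / norm

du, dv = first_order_step(0.0, 0.0)
print("optimal step:", (du, dv))
print("resulting E :", E(0.0 + du, 0.0 + dv))
```

The step is computed once from the gradient at (0,0); there is no separate η to choose because the constraint ||(∆u,∆v)|| = 0.5 already fixes the step length.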
#5
I have almost understood the problem, but I still have a question: what is meant by the resulting E(u+∆u, v+∆v) in parts (b), (e-i), and (e-ii)? Is it a number or a formula?
Also, what is the difference between the two parts of (e)? One is to minimize E2, the other is to minimize E(u+∆u, v+∆v). So what is the difference between those two?
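One way to see the difference numerically is sketched below in Python. It assumes the same error function as the textbook problem as I recall it (the thread does not quote it), that (u,v) = (0,0), and that both parts of (e) restrict the step to length 0.5; all three are assumptions to check against the book. `E2_hat` is the second-order Taylor approximation of that assumed E around the origin, and `best_on_circle` is a hypothetical helper that simply scans angles on the circle.

```python
import math

def E(u, v):
    # ASSUMPTION: error surface as recalled from the textbook problem.
    return (math.exp(u) + math.exp(2 * v) + math.exp(u * v)
            + u * u - 3 * u * v + 4 * v * v - 3 * u - 5 * v)

def E2_hat(du, dv):
    # Second-order Taylor approximation of the assumed E around (0,0):
    # E(0,0) = 3, gradient = (-2, -3), Hessian = [[3, -2], [-2, 12]].
    linear = -2.0 * du - 3.0 * dv
    quadratic = 0.5 * (3.0 * du * du - 4.0 * du * dv + 12.0 * dv * dv)
    return 3.0 + linear + quadratic

def best_on_circle(f, radius=0.5, steps=20000):
    # Brute-force scan over angles; returns (min value, du, dv).
    best = None
    for k in range(steps):
        t = 2.0 * math.pi * k / steps
        du, dv = radius * math.cos(t), radius * math.sin(t)
        val = f(du, dv)
        if best is None or val < best[0]:
            best = (val, du, dv)
    return best

# Step chosen by minimizing the approximation, then evaluated on the true E.
_, du_a, dv_a = best_on_circle(E2_hat)
e_from_approx = E(du_a, dv_a)

# Step chosen by minimizing the true E directly on the same circle.
e_direct, du_d, dv_d = best_on_circle(E)

print("step from E2_hat:", (du_a, dv_a), "-> E =", e_from_approx)
print("step from E     :", (du_d, dv_d), "-> E =", e_direct)
```

In both cases the "resulting E" is a single number: the true E evaluated at (u+∆u, v+∆v) for the chosen step. The two steps differ only in which function was used to pick the direction, so the value obtained by minimizing E directly can never be worse than the one obtained through the approximation.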