Quote:
Originally Posted by arcticblue
I am also a little unsure about exactly how this equation works:
Obviously, the more negative the exponent is, the closer E_out is to zero, which is good. So is w supposed to be normalized? I presume so, because otherwise I could just scale w up and E_out would become very small. And if it is normalized, then the values I'm getting for E_in and E_out are both much greater than any of the options. (Maybe it's meant to be like that; if so, it's quite unnerving.)

No normalization. The value of w is determined iteratively by the specific algorithm given in the lecture. If w 'agrees' with all the training examples, then indeed the algorithm will try to scale it up to push the value of the logistic function closer to a hard threshold. When you evaluate the quoted formula on a test set, w is frozen; no scaling or any other change to it is allowed.
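To see why the normalization question comes up, here is a minimal sketch (toy data and names are my own, and it assumes the cross-entropy error E(w) = (1/N) Σ ln(1 + exp(−y_n wᵀx_n)) from the lecture): if w separates the training data, scaling it up does shrink the error, which is exactly the behavior described above. At test time, though, w is simply held fixed.

```python
import numpy as np

def cross_entropy_error(w, X, y):
    # E(w) = (1/N) * sum_n ln(1 + exp(-y_n * w . x_n))
    # log1p is used for numerical stability when the exponent is small
    return np.mean(np.log1p(np.exp(-y * (X @ w))))

# Hypothetical toy data, linearly separable, with a constant-1 first coordinate
X = np.array([[1.0,  2.0],
              [1.0, -1.0],
              [1.0,  0.5],
              [1.0, -2.0]])
y = np.array([1.0, -1.0, 1.0, -1.0])

w = np.array([0.0, 1.0])  # this w classifies all four points correctly

e_small = cross_entropy_error(w, X, y)
e_big = cross_entropy_error(10 * w, X, y)  # same direction, scaled up
print(e_small, e_big)
assert e_big < e_small  # scaling w up shrinks the error on separable data
```

This is why an iterative algorithm keeps inflating w on separable data, and why freezing w before evaluating on the test set removes any freedom to game the error by rescaling.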