Correct again.
So let us differentiate between the theory of machine learning and its implementation on finite precision computers. In theory, if you have an infinite precision machine, then the size of the weights does not matter because it is a mathematical fact that, for positive $\alpha$, $\text{sign}(\alpha s) = \text{sign}(s)$.
In finite precision, you typically want the weights to be around 1 and the inputs rescaled to be around 1 too (this is called input preprocessing and you can read about it in eChapter 9).
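To make the two regimes concrete, here is a minimal Python sketch. The weights, inputs, and scale factors are made-up illustrative values, and `sign` and `signal` are hypothetical helper names, not anything from the book. It checks that rescaling the weights by a positive factor leaves the sign of the signal unchanged, and then shows one way finite precision can break this: when both the weights and the inputs are extremely small, their products underflow to zero in double precision and the sign is lost.

```python
def sign(s):
    """Return +1, -1, or 0 according to the sign of s."""
    return (s > 0) - (s < 0)


def signal(w, x):
    """Compute the signal s = w . x as a plain dot product."""
    return sum(wi * xi for wi, xi in zip(w, x))


# Illustrative weights and inputs (assumed values, all of order 1).
w = [0.5, -1.25, 2.0]
x = [1.0, 0.5, -0.25]
s = signal(w, x)

# Theory: for any positive alpha, sign(alpha * s) equals sign(s).
alphas = [1e-3, 1.0, 1e3]
scaled_signs = [sign(signal([a * wi for wi in w], x)) for a in alphas]

# Finite precision: if weights AND inputs are both scaled far away from 1,
# each product falls below the smallest representable double and becomes 0.
w_tiny = [1e-200 * wi for wi in w]
x_tiny = [1e-200 * xi for xi in x]
underflow_sign = sign(signal(w_tiny, x_tiny))

print("sign(s):", sign(s))
print("signs after rescaling the weights:", scaled_signs)
print("sign after underflow:", underflow_sign)
```

The moderate scale factors all give the same sign as the unscaled weights, while the extreme case collapses to 0, which is one reason to keep weights and inputs around 1 in practice.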
Quote:
Originally Posted by CountVonCount
I have the same question. Can someone help here?
From my understanding having small weights is not perfect for sign(s), since this will lead to a signal that is often around 0 and thus a small change of just one input has a high chance to lead to a completely different output, if the sign changes.
So it would be better to have big weights, thus the signal is always pushed to the big number region and the sign is more stable.
But maybe I'm just wrong here.
