This is a good paper that combines insights from optimization, hardware, and neuroscience to derive a multiplicative weight update for neural nets. It is worthwhile to try multiplicative updates in the context of modern architectures, and this paper appears to have made them competitive with existing optimizers while permitting lower-precision computation (as low as 8 bits). As far as I can tell, there is no clear advantage on current hardware, but the work serves as a good proof of concept that could help inform future hardware design. While no single insight is especially deep, the pieces are combined in an interesting and cohesive way, so the reviewers and I agree this paper is clearly above the bar for acceptance. I encourage the authors to address the reviewers' feedback in the camera-ready version.