I can only speculate, but I see two advantages to using MSE:
* MSE accommodates games with outcomes richer than win/loss. One of AlphaGo Zero's goals (I'm extrapolating from the paper) was to develop a system that is easy to apply to domains other than go.
* MSE can be used with TD(lambda)-like schemes that don't propagate information all the way back from a terminal position. The AlphaGo team may have chosen MSE for this reason while still experimenting with different learning methods, and then never revisited the decision.

For a network that plays go and is trained as AlphaGo Zero is, I don't see an a priori advantage for MSE over log loss either; the sketch below shows how closely the two conventions correspond.
On Tue, 7 Nov 2017, Petr Baudis wrote:

> Hi!
>
> Does anyone know why the AlphaGo team uses MSE on [-1,1] as the value
> output loss rather than binary cross-entropy on [0,1]? The latter is
> much more usual when training networks, since binary cross-entropy
> typically yields better results, so that's what I'm using in
> https://github.com/pasky/michi/tree/nnet for the time being. But maybe
> I'm missing some good reason to use MSE instead?
>
> Thanks,
>
> --
> Petr Baudis, Rossum
> Run before you walk! Fly before you crawl! Keep moving forward!
> If we fail, I'd rather fail really hugely. -- Moist von Lipwig