Hi! Does anyone know why the AlphaGo team uses MSE on [-1,1] as the value output loss rather than binary cross-entropy on [0,1]? I'd say the latter is far more common when training networks, as binary cross-entropy typically yields better results, so that's what I'm using in https://github.com/pasky/michi/tree/nnet for the time being, but maybe I'm missing some good reason to use MSE instead?
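For concreteness, a minimal sketch of the two options in Keras; the trunk, input shape, and layer sizes are placeholder assumptions, not what michi's nnet branch or AlphaGo actually uses:

    # Sketch only: two value-head variants, Keras-style. All shapes hypothetical.
    from tensorflow import keras
    from tensorflow.keras import layers

    features = keras.Input(shape=(19, 19, 8))  # hypothetical input planes
    trunk = layers.Conv2D(64, 3, padding="same", activation="relu")(features)
    flat = layers.Flatten()(trunk)

    # AlphaGo-style: tanh output in [-1, 1], game results as {-1, +1}, MSE loss.
    value_tanh = layers.Dense(1, activation="tanh")(flat)
    model_mse = keras.Model(features, value_tanh)
    model_mse.compile(optimizer="adam", loss="mse")

    # Alternative: sigmoid output in [0, 1] read as a win probability,
    # game results mapped to {0, 1}, binary cross-entropy loss.
    value_sigmoid = layers.Dense(1, activation="sigmoid")(flat)
    model_bce = keras.Model(features, value_sigmoid)
    model_bce.compile(optimizer="adam", loss="binary_crossentropy")

Note the two targets are just affine rescalings of each other (z' = (z+1)/2), so the choice is really about the loss gradients, not the representable information.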
Thanks,

--
Petr Baudis, Rossum
Run before you walk! Fly before you crawl! Keep moving forward!
If we fail, I'd rather fail really hugely.  -- Moist von Lipwig