I can only speculate, but I see two advantages to using MSE:

* MSE accommodates games that have more than just win/loss.  One of
  AlphaGo Zero's goals (I'm extrapolating from the paper) was to develop
  a system that was easy to apply to domains other than go.

* It can be used with TD-lambda-like schemes that don't propagate
  information all the way back from a terminal position.  The AlphaGo
  team may have chosen MSE for this reason while they were still
  experimenting with different learning methods, and then never
  revisited the decision.

For a network that plays go and is trained as AlphaGo Zero is, I don't
see an a priori advantage for MSE over log loss, either.
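For concreteness, the two losses apply to the same scalar value head up
to an affine change of scale.  Here's a minimal sketch of the
correspondence (the helper names are mine, not from the paper):

```python
import math

def mse_loss(v, z):
    """AlphaGo Zero style value loss: prediction v and outcome z both
    live on [-1, 1], with z in {-1, +1} for a pure win/loss game."""
    return (z - v) ** 2

def bce_loss(p, y):
    """Binary cross-entropy: predicted win probability p in (0, 1),
    game outcome y in {0, 1}."""
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

# The two parameterizations are related by v = 2p - 1 and z = 2y - 1,
# so either loss can be trained from the same network head.
p = 0.9              # predicted win probability
v = 2 * p - 1        # the same prediction on the [-1, 1] scale
print(mse_loss(v, 1.0))   # (1 - 0.8)^2, approx. 0.04
print(bce_loss(p, 1))     # -log(0.9), approx. 0.105
```

One reason cross-entropy is often preferred in practice: with a sigmoid
output, its gradient with respect to the pre-activation is simply p - y,
which doesn't flatten out near confident wrong predictions the way the
MSE gradient through a squashing nonlinearity can.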

On Tue, 7 Nov 2017, Petr Baudis wrote:

 Hi!

 Does anyone know why the AlphaGo team uses MSE on [-1,1] as the value
output loss rather than binary crossentropy on [0,1]?  I'd say the
latter is way more usual when training networks, as binary crossentropy
typically yields better results, so that's what I'm using in
https://github.com/pasky/michi/tree/nnet for the time being, but maybe
I'm missing some good reason to use MSE instead?

 Thanks,

--
          Petr Baudis, Rossum
  Run before you walk! Fly before you crawl! Keep moving forward!
  If we fail, I'd rather fail really hugely.  -- Moist von Lipwig
_______________________________________________
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go