On 6/12/2017 18:57, Darren Cook wrote:
>> Mastering Chess and Shogi by Self-Play with a General Reinforcement
>> Learning Algorithm
>> https://arxiv.org/pdf/1712.01815.pdf
>
> One of the changes they made (bottom of p.3) was to continuously update
> the neural net, rather than require a new network to beat it 55% of the
> time to be used. (That struck me as strange at the time, when reading
> the AlphaGoZero paper - why not just >50%?)

I read that as a simple way of establishing confidence that the
difference was statistically significant, i.e. really greater than 0.
(+35 Elo over 400 games - I don't know by heart how large the typical
error margin of 400 games is, but I think it won't be far off!)

-- GCP
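
The error margin mentioned above can be estimated with a quick back-of-the-envelope sketch. It assumes no draws, independent games, a normal approximation to the binomial, and the usual logistic Elo formula. None of this comes from the paper, and the function names are made up for illustration.

```python
import math


def elo_from_score(p):
    """Elo difference implied by an expected score p (logistic Elo model)."""
    return 400.0 * math.log10(p / (1.0 - p))


def score_interval(p, games, z=1.96):
    """Approximate 95% interval for the win rate (normal approximation).

    Assumes every game is an independent win/loss with no draws.
    """
    se = math.sqrt(p * (1.0 - p) / games)
    return p - z * se, p + z * se


threshold = 0.55  # gating threshold discussed in the thread
games = 400       # number of evaluation games discussed in the thread

low, high = score_interval(threshold, games)
elo = elo_from_score(threshold)
elo_low = elo_from_score(low)
elo_high = elo_from_score(high)

print(f"win rate {threshold:.2f} -> {elo:+.1f} Elo")
print(f"95% interval on win rate: {low:.4f} .. {high:.4f}")
print(f"95% interval on Elo: {elo_low:+.1f} .. {elo_high:+.1f}")
```

Under these assumptions, a 55% result over 400 games gives an interval whose lower end sits just above a 50% win rate (roughly +1 to +70 Elo), which fits the reading that the 55% threshold acts as a significance check.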