> Blog post:
> https://blog.janestreet.com/accelerating-self-play-learning-in-go/
> Paper: https://arxiv.org/abs/1902.10565

I read the paper and really enjoyed it: lots of different ideas being tried. I was especially satisfied to see figure 12, and the big difference that giving the network some Go features made. It would be good, though, to see figure 8 shown in terms of wall clock time, on equivalent hardware. How much extra computation do all the extra ideas add? (Maybe it is in the paper, and I missed it?)

> I found some other interesting results, too - for example contrary to
> intuition built up from earlier-generation MCTS programs in Go,
> putting significant weight on score maximization rather than only
> win/loss seems to help.

Score maximization in self-play means the program is encouraged to play more aggressively/dangerously, by creating life/death problems on the board. A player of similar strength doesn't know how to exploit the weaknesses left behind. (One of the asymmetries of Go?)

I hope you are able to continue the experiment, with more training time, to see if it flattens out or keeps improving.

Darren