For checkers, I used a naive implementation of UCT as my opening book (the "playout" being the actual game, played with the engine thinking). So at the edge of the opening book there is always a position where it will try a random move, but in the long run good opening moves will be explored more often. I think this method might work well for other games.
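Roughly, the idea could be sketched as below. This is only an illustration, not my actual checkers code: the class name `UCTBook`, the exploration constant `c`, the standard UCB1 formula, and the convention that scores lie in [0, 1] from the point of view of the side to move are all assumptions made for the example.

```python
import math
import random
from collections import defaultdict


class UCTBook:
    """Opening book that picks moves with UCB1 (hypothetical sketch)."""

    def __init__(self, c=1.0, rng=None):
        self.c = c  # exploration constant (assumed value)
        self.rng = rng or random.Random()
        # stats[position][move] = [visits, total score for the side to move]
        self.stats = defaultdict(dict)

    def select(self, position, legal_moves):
        """Return (move, still_in_book)."""
        moves = self.stats[position]
        untried = [m for m in legal_moves if m not in moves]
        if untried:
            # Edge of the book: try a random new move, then the engine
            # plays out the rest of the game by thinking normally.
            return self.rng.choice(untried), False
        total = sum(visits for visits, _ in moves.values())

        def ucb(move):
            visits, score = moves[move]
            return score / visits + self.c * math.sqrt(math.log(total) / visits)

        return max(legal_moves, key=ucb), True

    def update(self, path, result):
        """Record a finished game.

        path: list of (position, move) pairs chosen through the book, in order.
        result: game score in [0, 1] for the player who made the first move.
        """
        for position, move in path:
            entry = self.stats[position].setdefault(move, [0, 0.0])
            entry[0] += 1
            entry[1] += result
            result = 1.0 - result  # the opponent is to move at the next ply
```

Positions only need to be hashable keys here (for example a string encoding of the board). A game follows `select` until it returns `False`, is then finished by the engine, and the result is fed back with `update`.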
Álvaro.

On Mon, Jan 27, 2020 at 6:04 AM Rémi Coulom <remi.cou...@gmail.com> wrote:
>
> This is a report after my first day of training my Ataxx network:
> https://www.game-ai-forum.org/viewtopic.php?f=24&t=693
> Ataxx is played on a 7x7 board. The rules are different, but I expect 7x7 Go
> would produce similar results. 2k self-play games are more than enough to
> produce a huge strength improvement at the beginning.
>
> It would take my system less than one day to generate 285k games on a single
> GPU. But speed optimizations are probably not your biggest problem at the
> moment.
>
> As I wrote in my previous message, it is important to control the variety of
> your self-play game. In my program, I have a function to count the number of
> distinct board configurations for each move number of the self-play games.
> This way, I can ensure that the same opening is not replicated too many times.
> _______________________________________________
> Computer-go mailing list
> Computer-go@computer-go.org
> http://computer-go.org/mailman/listinfo/computer-go
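
The per-move-number count of distinct board configurations that Rémi describes could look something like the following. This is a guess at the idea, not his code: the function name `distinct_positions_per_move` and the representation of a game as a sequence of hashable board states are assumptions.

```python
from collections import defaultdict


def distinct_positions_per_move(games):
    """Count distinct boards seen at each move number.

    games: iterable of games, each a sequence of hashable board states,
    one per move number (assumed representation).
    Returns {move_number: number of distinct boards}.
    """
    seen = defaultdict(set)
    for game in games:
        for move_number, board in enumerate(game):
            seen[move_number].add(board)
    return {n: len(boards) for n, boards in sorted(seen.items())}
```

If the count at an early move number stays small relative to the number of games, the same openings are being replayed and the self-play needs more variety.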