What I did (it was a "long" time ago; I don't know if it is still used in MoGo) was to compute the m best moves every so often and, most of the time, just take the max over those m moves. m was on the order of 5, and "every so often" was an increasing function of the visit count, something like sqrt or log (I don't remember which). That speeds things up quite a lot (when the max becomes the bottleneck), and if you tune the parameters well you lose almost no strength at a fixed number of playouts.
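In (untested) Java the idea looks roughly like this; the Node interface, selectionValue() and the sqrt refresh schedule are just placeholders for illustration, not MoGo's actual code:

    import java.util.ArrayList;
    import java.util.Comparator;
    import java.util.List;

    // Sketch of the "top-m shortlist" idea: recompute the m best children with the
    // full formula only every so often, and in between take the max over those m.
    interface Node {
        long visits();
        double selectionValue();   // the full (expensive) UCT/RAVE score
        List<Node> children();
    }

    class ShortlistSelector {
        // In a real tree these fields would live in each internal node.
        static final int SHORTLIST_SIZE = 5;   // "m" above
        private List<Node> shortlist = new ArrayList<>();
        private long nextRefresh = 0;          // parent visit count at which to recompute

        Node select(Node parent) {
            if (shortlist.isEmpty() || parent.visits() >= nextRefresh) {
                // Full pass: rank all children by the expensive formula, keep the m best.
                List<Node> all = new ArrayList<>(parent.children());
                all.sort(Comparator.comparingDouble(Node::selectionValue).reversed());
                shortlist = new ArrayList<>(all.subList(0, Math.min(SHORTLIST_SIZE, all.size())));
                // Refresh interval grows with the visit count (a sqrt-like schedule).
                nextRefresh = parent.visits() + (long) Math.sqrt(parent.visits() + 1);
            }
            // Cheap pass: max over the shortlist only, instead of over all children.
            Node best = shortlist.get(0);
            for (Node n : shortlist) {
                if (n.selectionValue() > best.selectionValue()) best = n;
            }
            return best;
        }
    }

The growing refresh interval reflects the fact that the ranking of a heavily visited node changes more and more slowly, so the shortlist goes stale less often as the search progresses.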
Sylvain

2008/12/3 <[EMAIL PROTECTED]>:
> There is another speedup trick that might interest you. IIRC Lukasz Lew came
> up with it, but I forget what he called it. After calculating the selected
> move for an internal node (going through UCT and RAVE and whatnot), store
> that move. Then, for the next N visits to that node (where N is 5 or 10 or
> so), just repeat that move without having to calculate what the move would
> be.
>
> - Dave Hillis
>
> -----Original Message-----
> From: Mark Boon <[EMAIL PROTECTED]>
> To: computer-go <computer-go@computer-go.org>
> Sent: Tue, 2 Dec 2008 11:17 pm
> Subject: Re: [computer-go] Monte-Carlo Tree Search reference bot
>
> I have made some minor performance improvements and this is as far as I
> intend to take this particular project. I might make some small changes if
> necessary, but most likely I'll leave this largely unchanged from now on.
>
> I had set myself an arbitrary goal that it should do at least 20K playouts.
> But with real liberties, AMAF and a RAVE formula I got stuck in the 16K-17K
> range. According to my profiler that is mostly due to the expensive formula
> used to compare nodes, where it says it spends 25% of the total time. The
> formula I'm using is:
>
>   beta * (virtual-win-ratio + RAVE) + (1 - beta) * (win-ratio + UCT)
>
>   beta = 1 - log(parent-visits) / 20
>   UCT  = exploration-factor * sqrt( log(parent-visits) / (nr-visits + 1) )
>   RAVE = exploration-factor * sqrt( log(parent-visits) / (nr-virtual-visits + 1) )
>
> There are probably quite a few possibilities still to tune this program with
> regard to playing strength and performance. But I felt it doesn't help to
> obscure the implementation with too many details.
>
> The implementation of the search algorithm was entirely game-independent
> until I introduced AMAF. I didn't see how to fix that, as there's no getting
> around the fact that it relies on a move being represented by a single
> coordinate, which is mapped to an array that stores the statistical values.
> But strip out the AMAF part, which is a block of 30-odd lines of code, and I
> think it can be used for other games basically as-is. I did this not because
> I ever see myself programming another game, but because in my experience it
> gives a cleaner separation between modules.
>
> At 2,000 playouts it's still quite a bit weaker than the plain MC-AMAF
> refbot; it wins only about 33%. But that's probably because the 1,000-2,000
> playout range is the sweet spot for that particular type of playing engine.
> It doesn't scale from there, whereas the MCTS ref-bot is only just getting
> warmed up at 2,000 playouts.
>
> This leads me to a question. I suppose it might be of some interest to put
> this bot up on CGOS. But what parameters should I use? The main one being
> the number of playouts, naturally.
>
> Mark
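For reference, here is an (untested) sketch of the move-caching trick Dave describes above; the class and field names and N = 8 are just placeholders for illustration:

    // After computing the selected move for an internal node, reuse it for the
    // next N visits before running the full selection again.
    class CachedNode {
        static final int REUSE_VISITS = 8;   // "N is 5 or 10 or so" in the quoted post

        private int cachedMove = -1;         // -1 means nothing cached yet
        private int cachedUses = 0;

        int selectMove() {
            if (cachedMove >= 0 && cachedUses < REUSE_VISITS) {
                cachedUses++;
                return cachedMove;           // skip the expensive UCT/RAVE pass
            }
            cachedMove = computeBestMove();  // the full formula over all children
            cachedUses = 1;
            return cachedMove;
        }

        private int computeBestMove() {
            // Placeholder for the real selection: argmax of the UCT/RAVE score
            // over this node's children.
            return 0;
        }
    }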
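And a worked version of the comparison formula Mark quotes; only the formula itself comes from his post, while the method and parameter names and the +1 / max(1, ...) denominators are assumptions:

    // Mixes the AMAF/RAVE statistics with the real ones, shifting weight toward
    // the real win-ratio + UCT term as the parent accumulates visits.
    class RaveUctScore {
        static double score(double wins, long visits,
                            double virtualWins, long virtualVisits,
                            long parentVisits, double explorationFactor) {
            double logParent = Math.log(parentVisits);   // assumes parentVisits >= 1
            double beta = 1.0 - logParent / 20.0;        // beta = 1 - log(parent-visits) / 20
            double winRatio        = wins / Math.max(1, visits);
            double virtualWinRatio = virtualWins / Math.max(1, virtualVisits);
            double uct  = explorationFactor * Math.sqrt(logParent / (visits + 1));
            double rave = explorationFactor * Math.sqrt(logParent / (virtualVisits + 1));
            return beta * (virtualWinRatio + rave) + (1.0 - beta) * (winRatio + uct);
        }
    }

Note that log(parent-visits) is the same for every child of a given parent, so it only needs to be computed once per selection pass rather than once per child.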