What I did (it was a "long" time ago; I don't know if it is still used in MoGo) was to compute the m best moves every so often and, most of the time, just take the max over those m moves. m was on the order of 5, and "every so often" was an increasing function of the visit count, something like sqrt or log (I don't remember which). That speeds things up quite a lot (when the max becomes the bottleneck), and if you tune the parameters well you lose almost no strength at a fixed number of playouts.
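In (untested) Java the idea looks roughly like this; the Node interface, selectionValue() and the sqrt refresh schedule are just placeholders for illustration, not MoGo's actual code:

    import java.util.ArrayList;
    import java.util.Comparator;
    import java.util.List;

    // Sketch of the "top-m shortlist" idea: recompute the m best children with the
    // full formula only every so often, and in between take the max over those m.
    interface Node {
        long visits();
        double selectionValue();   // the full (expensive) UCT/RAVE score
        List<Node> children();
    }

    class ShortlistSelector {
        // In a real tree these fields would live in each internal node.
        static final int SHORTLIST_SIZE = 5;   // "m" above
        private List<Node> shortlist = new ArrayList<>();
        private long nextRefresh = 0;          // parent visit count at which to recompute

        Node select(Node parent) {
            if (shortlist.isEmpty() || parent.visits() >= nextRefresh) {
                // Full pass: rank all children by the expensive formula, keep the m best.
                List<Node> all = new ArrayList<>(parent.children());
                all.sort(Comparator.comparingDouble(Node::selectionValue).reversed());
                shortlist = new ArrayList<>(all.subList(0, Math.min(SHORTLIST_SIZE, all.size())));
                // Refresh interval grows with the visit count (a sqrt-like schedule).
                nextRefresh = parent.visits() + (long) Math.sqrt(parent.visits() + 1);
            }
            // Cheap pass: max over the shortlist only, instead of over all children.
            Node best = shortlist.get(0);
            for (Node n : shortlist) {
                if (n.selectionValue() > best.selectionValue()) best = n;
            }
            return best;
        }
    }

The growing refresh interval reflects the fact that the ranking of a heavily visited node changes more and more slowly, so the shortlist goes stale less often as the search progresses.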
Sylvain

2008/12/3 <[EMAIL PROTECTED]>:
> There is another speedup trick that might interest you. IIRC Lukasz Lew came
> up with it, but I forget what he called it. After calculating the selected
> move for an internal node (going through UCT and RAVE and whatnot), store
> that move. Then, for the next N visits to that node (where N is 5 or 10 or
> so), just repeat that move without having to calculate what the move would
> be.
>
> - Dave Hillis
>
> -----Original Message-----
> From: Mark Boon <[EMAIL PROTECTED]>
> To: computer-go <computer-go@computer-go.org>
> Sent: Tue, 2 Dec 2008 11:17 pm
> Subject: Re: [computer-go] Monte-Carlo Tree Search reference bot
>
> I have made some minor performance improvements and this is as far as I
> intend to take this particular project. I might make some small changes if
> necessary, but most likely I'll leave this largely unchanged from now on.
>
> I had set myself an arbitrary goal that it should do at least 20K playouts.
> But with real liberties, AMAF and a RAVE formula I got stuck in the 16K-17K
> range. According to my profiler that is mostly due to the expensive formula
> used to compare nodes, where it says it spends 25% of the total time. The
> formula I'm using is:
>
>   beta * (virtual-win-ratio + RAVE) + (1 - beta) * (win-ratio + UCT)
>
>   beta = 1 - log(parent-visits) / 20
>   UCT  = exploration-factor * sqrt( log(parent-visits) / (nr-visits + 1) )
>   RAVE = exploration-factor * sqrt( log(parent-visits) / (nr-virtual-visits + 1) )
>
> There are probably quite a few possibilities still to tune this program with
> regard to playing strength and performance. But I felt it doesn't help to
> obscure the implementation with too many details.
>
> The implementation of the search algorithm was entirely game-independent
> until I introduced AMAF. I didn't see how to fix that, as there's no getting
> around the fact that it relies on a move being represented by a single
> coordinate, which is mapped to an array that stores the statistical values.
> But strip out the AMAF part, which is a block of 30-odd lines of code, and I
> think it can be used for other games basically as-is. I did this not because
> I ever see myself programming another game, but because in my experience it
> gives a cleaner separation between modules.
>
> At 2,000 playouts it's still quite a bit weaker than the plain MC-AMAF
> refbot; it wins only about 33%. But that's probably because the 1,000-2,000
> playout range is the sweet spot for that particular type of playing engine.
> It doesn't scale from there, whereas the MCTS ref-bot is only just getting
> warmed up at 2,000 playouts.
>
> This leads me to a question. I suppose it might be of some interest to put
> this bot up on CGOS. But what parameters should I use? The main one being
> the number of playouts, naturally.
>
> Mark
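For reference, here is an (untested) sketch of the move-caching trick Dave describes above; the class and field names and N = 8 are just placeholders for illustration:

    // After computing the selected move for an internal node, reuse it for the
    // next N visits before running the full selection again.
    class CachedNode {
        static final int REUSE_VISITS = 8;   // "N is 5 or 10 or so" in the quoted post

        private int cachedMove = -1;         // -1 means nothing cached yet
        private int cachedUses = 0;

        int selectMove() {
            if (cachedMove >= 0 && cachedUses < REUSE_VISITS) {
                cachedUses++;
                return cachedMove;           // skip the expensive UCT/RAVE pass
            }
            cachedMove = computeBestMove();  // the full formula over all children
            cachedUses = 1;
            return cachedMove;
        }

        private int computeBestMove() {
            // Placeholder for the real selection: argmax of the UCT/RAVE score
            // over this node's children.
            return 0;
        }
    }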
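And a worked version of the comparison formula Mark quotes; only the formula itself comes from his post, while the method and parameter names and the +1 / max(1, ...) denominators are assumptions:

    // Mixes the AMAF/RAVE statistics with the real ones, shifting weight toward
    // the real win-ratio + UCT term as the parent accumulates visits.
    class RaveUctScore {
        static double score(double wins, long visits,
                            double virtualWins, long virtualVisits,
                            long parentVisits, double explorationFactor) {
            double logParent = Math.log(parentVisits);   // assumes parentVisits >= 1
            double beta = 1.0 - logParent / 20.0;        // beta = 1 - log(parent-visits) / 20
            double winRatio        = wins / Math.max(1, visits);
            double virtualWinRatio = virtualWins / Math.max(1, virtualVisits);
            double uct  = explorationFactor * Math.sqrt(logParent / (visits + 1));
            double rave = explorationFactor * Math.sqrt(logParent / (virtualVisits + 1));
            return beta * (virtualWinRatio + rave) + (1.0 - beta) * (winRatio + uct);
        }
    }

Note that log(parent-visits) is the same for every child of a given parent, so it only needs to be computed once per selection pass rather than once per child.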