That's going to repeat the same exact path through the tree three times, isn't it? If so, it seems like it would be more efficient to do N playouts from the
leaf after each traversal of the tree.
[EMAIL PROTECTED] wrote:
There is another speedup trick that might interest you. IIRC Lukasz Lew
came up with it, but I forget what he called it. After calculating the
selected move for an internal node (going through UCT and RAVE and
whatnot) store that move. Then, for the next N visits to that node
(where N is 5 or 10 or so), just repeat that move without having to
calculate what the move would be.
- Dave Hillis
-----Original Message-----
From: Mark Boon <[EMAIL PROTECTED]>
To: computer-go <computer-go@computer-go.org>
Sent: Tue, 2 Dec 2008 11:17 pm
Subject: Re: [computer-go] Monte-Carlo Tree Search reference bot
I have made some minor performance improvements and this is as far as I
intend to take this particular project. I might make some small changes
if necessary, but most likely I'll leave this largely unchanged from now.
I had set myself as an arbitrary goal that it should do at least 20K
playouts. But with real liberties, AMAF and a RAVE formula I got stuck
in the 16K-17K range. According to my profiler that is mostly due to the
expensive formula used to compare nodes, where it says it spends 25% of
total time. The formula I'm using is:
beta * (virtual-win-ratio + RAVE) + (1-beta) * (win-ratio + UCT)
beta = 1 - log(parent-visits) / 20
UCT = exploration-factor *sqrt( log(parent-visits) / (nr-visits+1) )
RAVE = exploration-factor *sqrt( log(parent-visits) /
(nr-virtual-visits+1) )
There are probably quite a few possibilities still to tune this program
with regards to playing strength and performance. But I felt it doesn't
help to obscure the implementation by too many details.
The implementation of the search algorithm was entirely
game-independent, until I introduced AMAF. I didn't see how to fix that,
as there's no way getting around that it's based on the fact that a move
is represented by a single coordinate, which is mapped to an array to
store the statistical values. But strip the AMAF part, which is a block
of 30 odd lines of code, and I think it can be used for other games
basically as-is. I did this not because I ever see myself program
another game, but because in my experience in doing so I get a cleaner
separation between modules.
At 2,000 playouts, it's still quite a bit weaker than the plain MC-AMAF
refbot. It wins only about 33%. But that's probably because the
1,000-2,000 playouts range is the sweet-spot for that particular type of
playing engine. It doesn't scale from there, whereas the MCTS ref-bot
only just gets warmed up with 2,000 playouts.
This leads me to a question. I suppose it might be of some interest to
put this bot up on CGOS. But what parameters to use? The main one being
the number of playouts, naturally.
Mark
_______________________________________________
computer-go mailing list
computer-go@computer-go.org <mailto:computer-go@computer-go.org>
http://www.computer-go.org/mailman/listinfo/computer-go/
------------------------------------------------------------------------
Tis the season to save your money! Get the new AOL Holiday Toolbar
<http://toolbar.aol.com/holiday/download.html?ncid=emlweusdown00000008>
for money saving offers and gift ideas.
------------------------------------------------------------------------
_______________________________________________
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/
_______________________________________________
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/