I've noticed that Orego has done very poorly in the last couple of
competitions. This is partly due to the improvements in others'
programs, but I think the fact that Orego currently doesn't have a
transposition table is crippling. Since I'm rewriting this stuff in
Java, I'm thinking about how to handle this, and would like to offer
up the following plan for all of you to poke holes in. I'm not sure
if I'm reinventing the CrazyStone structure here, but I think this
one is slightly different.
DATA STRUCTURE
The "tree" is a enormous, preallocated array of Node objects.
(Because this array is also being used as a hash table, the size of
the table should be a prime number.) This is really a directed
acyclic graph (DAG) rather than a tree, because a Node may be a child
of more than one parent.
Each Node contains the following information:
A+1 "pointers" (in some sense) to child Nodes, where A is the area of
the board. (The extra pointer is for the pass move.) Some of these
pointers (initially, all of them) may be the special value UNEXPLORED.
The number of runs through this Node so far.
The number black wins through this Node so far.
A Zobrist hash of the position represented by this Node. This must
incorporate, in addition to the locations of stones on the board, the
turn number (e.g., zero for the beginning of the game) and the number
of consecutive passes.
The turn number, stored explicitly.
A forced leaf turn number, to be explained below.
(Note that, in contrast to the current Orego structure, we won't need
tails or a free list for this one.)
GENERATING A MONTE CARLO RUN
I want to separate the process of generating a Monte Carlo run from
the process of modifying the tree for two reasons. First, this will
make multithreading easier. Second, I will be able to incorporate
recorded games into the tree by pretending that the program had
played them.
To generate a Monte Carlo run, I start at the root Node. Each move is
chosen using UCT, based on UCB1-TUNED as described on p. 5 of the
recent Gelly et al tech report on MoGo. As in the current Orego, this
is done with a double-hashing scheme and in a way that always chooses
an untried move if one exists.
There is one complication here. The number of runs through the
children might exceed the number of runs through the parent because
of transpositions. In Gelly's notation, this means Tj(n) may exceed
n. I THINK I can ignore this, as it will just give "oversampled"
moves unusually narrow confidence windows.
If we are not at a Node (because we're in an unexplored region of the
search space), if the current Node has a forced leaf turn number
greater than or equal to the root turn number, or if any child is
found that is not at the correct turn number (because of a hash
collision), the move is chosen randomly.
INCORPORATING A GAME INTO THE TREE
Again, we start at the root Node. We work down the tree, updating the
run and black win count of each node encountered. If the child
pointer we would be following is UNEXPLORED, we use the
aforementioned Zobrist hash (modulo the table size) to find a Node to
use as the child. This may be a Node that has already been
initialized, a "fresh" Node (i.e., one that we can overwrite), or a
Node that we can't overwrite.
If the child has the correct turn number, either we have been down
this path before or we have found a transposition. In either case,
continue down the tree.
If the child has a turn number that is lower than that of the root,
it is an old node and can be overwritten. We reinitialize this one
node, update its run and black win counts, and stop. Thus, every run
adds at most one node to the tree.
If the offending child is not so old that it can be overwritten, we
must leave it alone lest it mess up another part of the tree. At this
point, the parent's forced leaf turn number is set to the child's
turn number; all moves from that parent will be random until a later
point in the game when it is safe to overwrite the offending child.
I welcome your input,
Peter Drake
Assistant Professor of Computer Science
Lewis & Clark College
http://www.lclark.edu/~drake/
_______________________________________________
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/