Quoting Vlad Dumitrescu <[EMAIL PROTECTED]>:
I wonder if anybody has some data (or a theory) about how valuable a
good preordering of the children of a node would be, when UCT. This
might take the form of having nodes start with non-null values for the
'won_simulations' and 'played_simulations' fields.
I have been thinking about this but not done anything empirical about it. Here
are some thoughts: there are two ways for move ordering a) static (such as
pattern matching) and b) dynamic (killer heuristic/history heuristic as for
alpha-beta).
The problem with static move ordering that it must be really good it to be
useful, and it migh be computationally expensive if it is really good.
The problem with dynamic move ordering where the search results from
one part of
the tree is applied to a new part of the tree is the nature of UCT itself.
Alpha-beta has systematic search so when a subtree has been evaluated we can
know for sure if the move was a killer move or not. With UCT the entire
tree is
updated in an unpredictable manner, a move thats look like a killer move might
suddenly become refuted, but how would we then keep track of all places in the
tree this move has affected the move ordering? This happens in alpha-beta too
but with iterative deepening everything all nodes are visited again in an
orderly manner where fresh killer moves are used.
Or maybe this could work only if combining UCT with some kind of alpha-beta?
I used alpha-beta with MC evaluation for Viking5. Only a few 100 simulations
where used at terminal nodes for evaluation and then UCT would not be much
different from plain MC. And since evaluation is at terminal nodes we cannot
know well how to order the moves, unless static move ordering is used.
Implicitely this is done by having improvements to the simulations.
-Magnus
_______________________________________________
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/