On 5/11/07, Jason House <[EMAIL PROTECTED]> wrote:


On 5/11/07, Brian Slesinsky <[EMAIL PROTECTED]> wrote:
> > Going along with this, the numbers won't add up (although I don't know
> > if that is important.)   In other words, if you do 10,000 simulations at
> > the root, all grandchildren will add up to more (due to transpositions.)
> > If you propogate this up the tree you might come up with many more than
> > 10,000 simulations at the root.
>
> Maybe this is obvious, but it seems like the numbers do add up if you
> think about it differently?   When the program does a simulation that
> has multiple ancestors, it is effectively working on multiple
> simulations at once (counting each transposition separately.)  The
> count at each node would then be the number of virtual simulations
> done.  You could count the number of actual simulations separately and
> see what the ratio is between them at the root to see how effective
> the tables are.



Consider a node who's had one child extensively evaluated through
transpositions.  When UCT does come to visit it, the sqrt(sum(child
simulations)) will be very high.  That will artificially favor exploration
of children with less simulations.  That will have a continuing negative
impact on the exploitation.

... but then once the parent gets updated with the sum(child simulations),
UCT may simply stop searching that branch for a long time... Because it
likely had a low average win rate, and now looks to be extensively explored.
 When UCT eventually does come back to it, the problem in the previous
paragraph will still apply.

Yes, these problems are very real. That's one of the reasons why we
don't use "sum(child simulations)", but have a simulation counter at
the parent instead. The other reason is that we don't want to spend
time adding them up!

If you have explored some node 18 times and now you find a
transposition into a node that has been explored 1000 times, you use
18 as the number of simulations in the UCB formula. The only advantage
of finding the transposition is that the playouts down that path will
be more accurate, since we have a few more moves in UCT.

There are many other ways one can try to use transposition
information, but they tend to have the kind of problems that Jason
describes.

Álvaro.
_______________________________________________
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

Reply via email to