Re: [computer-go] UCT caveat (was in Explanation to MoGo paper wanted)

Jacques Basaldúa Wed, 11 Jul 2007 04:22:46 -0700

Brian Slesinsky wrote :

> When you favor defense (or attack) you may think: "This is unbiased
> since some times it favors black and other times it favors white" But
> the fact is when black is in danger at the root of the tree, it is in
> danger in most of the tree, therefore the trick gets the evaluation wrong.

Well, this is subtle enough that I don't understand it.  What are two
positions that it would compare incorrectly?

- Brian

It is not about two positions. What I am saying is that the same applies, _obviously_, if it is the other color that is in danger.

I will try to explain it better. Say the game is in a position where black
is in danger. That position is the root node. All stones in the root node
are inherited by every node below it, except when they are captured. Your
trick is intended to favor defense. Therefore, black has a higher
probability of survival than with uniformly random playouts. Since all
nodes in the tree descend from the root, they are mostly in the same
situation.
The simulation is a stochastic estimator of either the territorial value
of the game or the probability of winning (which is determined by comparing
the former with some threshold). Since you are systematically favoring
one of the players (the one who is in danger at the root is always the
same player), you are biasing the estimate. Because the variance of a
random playout is so large compared with the difference in conditional
probability (P(win | a good move) - P(win | a bad move) is a very
small number), the smallest bias is too much bias, and the program gets
weaker.
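
A rough numerical sketch of that last point, in Python. All numbers are
hypothetical and chosen only for illustration, and the assumption that the
defensive bias is larger after the bad move than after the good one is mine,
not something measured. It shows that the gap between two moves needs
thousands of playouts to rise above the noise, while a bias of similar size
flips the ranking no matter how many playouts are run.

```python
import math

def standard_error(p, n):
    """Standard error of a win-rate estimate from n independent playouts."""
    return math.sqrt(p * (1.0 - p) / n)

def playouts_needed(p_good, p_bad, z=2.0):
    """Rough number of playouts per move so that the gap p_good - p_bad
    equals about z standard errors of the difference of the two estimates."""
    delta = p_good - p_bad
    var_sum = p_good * (1.0 - p_good) + p_bad * (1.0 - p_bad)
    return math.ceil(z * z * var_sum / (delta * delta))

# Hypothetical true win rates after a good move and after a bad move.
p_good, p_bad = 0.52, 0.50

# Assumed systematic bias added by the "favor defense" playouts.
# Assumption: it helps more in the line where the group is in more danger.
bias_good, bias_bad = 0.00, 0.03

n = playouts_needed(p_good, p_bad)
gap_true = p_good - p_bad
# Expected gap under the biased playouts; more playouts do not shrink it.
gap_biased = (p_good + bias_good) - (p_bad + bias_bad)

print("playouts per move needed to see the true gap:", n)
print("standard error of one estimate at that n:", standard_error(p_bad, n))
print("true gap:", gap_true, " expected gap with bias:", gap_biased)
```

With these made-up numbers the noise can be averaged away with enough
playouts, but the biased gap has the wrong sign in expectation, so the search
converges on the worse move.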


Jacques.

_______________________________________________
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/
