I have a thought as to why MCTS bots might prefer to win by smaller margins.

Imagine a situation where a bot has a choice between moves a and b, where move a would allow a large win and move b keeps the game much closer.

Consider the MCTS tree for move a. A move that wins by a large margin usually leaves the opponent with many replies that are bad, and about equally bad. The tree below move a will then grow so that each of these replies gets about the same number of visits. Assuming the bot continues to prefer large wins further down the tree, the opponent faces many equally bad choices at each of its moves, so the resulting tree will be fairly "bushy": the search visits all of these moves more or less uniformly.

On the other hand, after move b the opponent has a more even mix of good and bad moves. MCTS will focus its attention on the good moves, of which there are fewer, so the tree will be less bushy and will therefore grow (considerably?) deeper.
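
To make the bushy-versus-narrow contrast concrete, here is a minimal sketch that runs plain UCB1 at a single opponent node and measures how widely the visits are spread. It is only an illustration under stated assumptions: the win probabilities (0.10 and 0.50), the 20 replies, the exploration constant and the helper names ucb1_visits and effective_branching are all made up for this example, and a one-level bandit is a simplification of a full MCTS tree.

```python
import math
import random


def ucb1_visits(win_probs, iterations=10000, c=1.4, seed=0):
    """Run UCB1 over the children of one node and return per-child visit counts.

    win_probs[i] is a hypothetical chance that a playout through child i
    is a win for the player choosing at this node (here: the opponent).
    """
    rng = random.Random(seed)
    n = len(win_probs)
    visits = [0] * n
    wins = [0.0] * n
    for t in range(1, iterations + 1):
        if t <= n:
            i = t - 1  # visit every child once before applying the formula
        else:
            i = max(
                range(n),
                key=lambda k: wins[k] / visits[k]
                + c * math.sqrt(math.log(t) / visits[k]),
            )
        visits[i] += 1
        if rng.random() < win_probs[i]:
            wins[i] += 1.0
    return visits


def effective_branching(visits):
    """exp(entropy) of the visit distribution: roughly how many children share the visits."""
    total = sum(visits)
    entropy = -sum(v / total * math.log(v / total) for v in visits if v > 0)
    return math.exp(entropy)


# Assumed scenario after move a: 20 opponent replies, all equally bad for the opponent.
after_a = [0.10] * 20
# Assumed scenario after move b: 3 good replies and 17 bad ones.
after_b = [0.50] * 3 + [0.10] * 17

for name, probs in (("a", after_a), ("b", after_b)):
    counts = ucb1_visits(probs)
    print(name, "effective branching:", round(effective_branching(counts), 1))
```

If the effective branching after move a comes out well above that after move b, the same number of simulations is being spread over more children per level, which is the sense in which the move a tree is bushier and the move b tree can go deeper.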

With the move b tree being deeper, most of its random playouts will start later in the game and will, presumably, be more reliable. Thus more wins will propagate up the tree, giving move b a higher win count. Finally, assuming we select the move with the highest win count (though I don't think this detail really matters), move b will be chosen, again because its win count is based on more reliable playouts that start later in the game.

What do you think?

-Richard



On 05/31/2013 03:27 AM, "Ingo Althöfer" wrote:
Hello,
especially in the early years of Monte-Carlo Go it
was often observed in games between MC(TS)-bots and humans
that bots won by the smallest possible margin, 0.5 points.
We all know that this is not a bug but a feature ;-)

For a long time it was my impression that this phenomenon
was typical only for bots-vs-humans, but not for
MC-bots vs. MC-bots. But now experiments with other games
make me believe that wins by small margins happen often also
for MC-bots against each other.

Who has experiences or explanations for this (in Go)?

Ingo.

_______________________________________________
Computer-go mailing list
[email protected]
http://dvandva.org/cgi-bin/mailman/listinfo/computer-go

