On 12/11/07, Mark Boon <[EMAIL PROTECTED]> wrote:
> Question: how do MC programs perform with a long ladder on the board?
>
> My understanding of MC is limited but thinking about it, a crucial
> long ladder would automatically make the chances of any playout
> winning 50-50, regardless of the actual outcome of the ladder.

No, 50/50 would make too much sense. It might be higher or it might be
lower, depending on whose move it is in the ladder and what style of
playouts you use, but exactly 50/50 would be like flipping a coin and
having it land on its edge. In test cases, MC-UCT evaluations tend to
cluster near 50/50 in any case, because MC-UCT, especially dumb
uniform MC-UCT, tends to be conservative about predicting the winner,
especially in 19x19 where the opportunities for luck to overwhelm the
actual advantage on the board are greater. But if you accept this as
just a moot scaling issue -- that a clearly lopsided exchange can mean
just a 2% increment in winning percentage even if read more or less
correctly -- then the numbers may not look so even after all. It's
certainly possible for MC-UCT to climb a broken ladder in a winning
position (and climbing a broken ladder in an even position is at least
half as bad as that anyhow).

I tried testing this on 19x19 using libego at 1 million playouts per
move. The behavior was not consistent, but the numbers trended in the
defender's favor as the sides played out the ladder. In one bizarre
case, the attacker played out the ladder until there were just 17
plies left, and then backed off.

Why would the attacker give up a winning ladder? It appears the MC-UCT
was never actually reading the ladder to begin with; just four or five
plies in, sometimes just a few thousand simulations were still
following the key line. 1 million playouts were not nearly enough for
that in this case; maybe 100 million would be enough, but I couldn't
test that. Also, after enough simulations, decisively inferior moves
lead to fewer excess losses than slightly inferior ones. Suppose you
have three moves available: one wins 75% of the time, one 50%, and one
25%. In the long run, the 75% move will be simulated almost all the
time. The 50% move trails the best move by half as much as the 25%
move does, and since UCT's exploration of a move falls off roughly
with the square of that gap, the 50% move will be simulated roughly
four times as often as the 25% move. Four times the simulations at
half the loss per simulation adds up to twice the excess losses of
the 25% move. That is
apropos here, because giving up on an open-field ladder once it has
been played out for a dozen moves is much more painful for the
defender than for the attacker. The longer the ladder got, the more
the evaluations trended in the defender's favor, and my best
explanation would be the fact that -- until you actually read the
ladder all the way out and find that the defender is dead -- every
move except pulling out of atari is so obviously bad that even uniform
MC-UCT did a better job of focusing on that one good move.
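
To make the three-move arithmetic above concrete, here is a minimal
Python sketch. It assumes plain UCB1 with an exploration term of
sqrt(2 ln t / n) over three moves with fixed win rates of 75%, 50%,
and 25%. It is a toy bandit, not libego's search and not a real
playout, and the names ucb1_counts and excess_losses are made up for
illustration.

```python
import math
import random


def ucb1_counts(win_rates, total_sims, seed=0):
    """Run UCB1 over moves with fixed win rates; return simulations per move.

    Assumption: each "move" is a coin with a fixed win probability, which
    stands in for a playout result. No tree is built below the root.
    """
    rng = random.Random(seed)
    counts = [0] * len(win_rates)
    wins = [0] * len(win_rates)
    for t in range(1, total_sims + 1):
        if t <= len(win_rates):
            # Try every move once so the formula below never divides by zero.
            move = t - 1
        else:
            # UCB1: observed win rate plus an exploration bonus that
            # shrinks as a move accumulates simulations.
            move = max(
                range(len(win_rates)),
                key=lambda i: wins[i] / counts[i]
                + math.sqrt(2.0 * math.log(t) / counts[i]),
            )
        counts[move] += 1
        if rng.random() < win_rates[move]:
            wins[move] += 1
    return counts


def excess_losses(win_rates, counts):
    """Expected wins given up on each move relative to the best move."""
    best = max(win_rates)
    return [(best - p) * n for p, n in zip(win_rates, counts)]


if __name__ == "__main__":
    rates = [0.75, 0.50, 0.25]
    counts = ucb1_counts(rates, 200_000)
    print("simulations per move:", counts)
    print("excess losses per move:", excess_losses(rates, counts))
```

In a run like this the 50% move should collect several times the
simulations of the 25% move, and so the larger share of the excess
losses, though the exact ratio depends on the seed and the run length.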

(Incidentally, the conservative nature of MC-UCT ratings largely
explains why maximizing winning probabilities alone is not a bad
strategy, at least in even games. The classic beginner mistake, when
you already have a clear lead in theory, is to fail to fight hard to
grab still more points as blunder insurance. But an MC-UCT evaluation
of 90% typically means a >90% probability of actually winning against
even opposition, not just a 90% likelihood of a theoretical win.
Assigning a 65% evaluation to an obvious THEORETICAL win allows plenty
of room to assign higher evaluations to even more lopsided advantages.
As Don said, when MC-UCT starts blatantly throwing away points for no
obvious reason, it's almost certainly because the game is REALLY over,
because MC-UCT's errors tend to be probabilistic instead of absolute
-- it may in effect evaluate a dead group as 75% alive, but it won't
call it 100% alive except in the rare cases when the underlying random
playout rules forbid the correct line of play.)