Re: [Computer-go] MCTS and perfect endgame

Ben Shoemaker Mon, 04 Jul 2011 21:25:32 -0700

Working through these ideas about wins, score, perfect play, etc... it is clear 
that maximizing wins is the correct basic strategy.  However, I still feel that 
incorporating the score *somehow* should improve the winning estimates and 
overall strength of MCTS.


See below for more thoughts.

>From: Don Dailey <[email protected]>
>
>> 1) According to the rules of Go, the winner is the player with the highest 
>> score, but a win is equivalent to any other win--winning by 0.5 points is 
>> enough.  So perfect play would maximize wins but not necessarily points.
>
>I think you are right.  In fact you say not necessarily but I say, "definitely 
>not",  you won't maximize points by playing to win. 

Let me put that another way--for any given game position, perfect play includes 
0 to m moves which all lead to a win.  These wins would fall in the range from 
0.5 (the lowest possible point total for a win) to n (where n is the maximum 
number of points possible for a given board size).  If the strategy of the 
perfect player is only to win, then the winning score will be distributed 
(randomly?  bell curve?) from 0.5 to n.  If the strategy of the perfect player 
is to maximize points, then the winning scores will tend to be closer to n, but 
no higher than the maximum points possible from that given game position.  I 
suppose both distributions would also be altered by the ability of the opponent 
(weaker opponent = higher scores, stronger opponent = lower scores).

>>With a perfect evaluation function, the "play to maximize points" strategy 
>>should also lead to perfect play.
>
>Another way to see this is that if you win maximally (in the point sense) you 
>also win.   So winning the most points is a more difficult goal and a superset 
>of just winning.   


I think you mean maximizing points is a _subset_ of just winning.  This makes 
sense.  Of all the winning plays, only a few would lead to maximum points.  
Human intuition tells us that playing aggressively (maximizing points) is risky 
(low probability) and is only successful against a far weaker opponent.
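
As a toy illustration of the subset relation (the move names and margins 
below are made up, and both functions are hypothetical helpers, not taken from 
any Go program): if we somehow knew the exact final margin that perfect play 
reaches after each candidate move, the point-maximizing moves would be a subset 
of the winning moves whenever at least one winning move exists.

```python
def winning_moves(outcomes):
    """Moves whose perfect-play margin is positive (a win by any amount)."""
    return {move for move, margin in outcomes.items() if margin > 0}


def max_point_moves(outcomes):
    """Moves that reach the largest perfect-play margin."""
    best = max(outcomes.values())
    return {move for move, margin in outcomes.items() if margin == best}


# Made-up margins (komi included) for four candidate moves.
outcomes = {"A": 0.5, "B": 12.5, "C": -3.5, "D": 12.5}

print(sorted(winning_moves(outcomes)))    # every move that wins
print(sorted(max_point_moves(outcomes)))  # only the moves that win by the most
```

A "play to win" perfect player may pick any move from the first set; a "play 
to maximize points" perfect player is restricted to the second.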

>Playing to win is the only strategy,  the only issue at question is how to 
>improve our estimate of winning chances and it's certainly possible that 
>figuring out how to factor in other things (such as consolation or "yose")  
>could improve our estimate.


Playing to win is certainly the best strategy.  I guess the question is: with 
MCTS, to evaluate the winning chance of a move, do you use winrate of playouts, 
the scores of playouts, some combination of the two, or perhaps some other 
information?
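
To make the three options concrete, here is a minimal sketch of how a playout 
result could be turned into the value that gets averaged at a tree node.  
Everything in it is an assumption for illustration: the names playout_reward 
and evaluate, the score_weight and scale parameters, and the tanh squashing are 
just one arbitrary way to blend the two signals, not how any particular program 
does it.  The margin is the final score from the mover's point of view with 
komi already included.

```python
import math


def playout_reward(margin, mode="win", score_weight=0.1, scale=10.0):
    """Map one playout's final margin to a reward.

    mode="win":   1 for a win, 0 for a loss (plain winrate).
    mode="score": the raw margin (plain point maximization).
    mode="blend": mostly win/loss, plus a small bounded bonus for margin.
    """
    win = 1.0 if margin > 0 else 0.0
    if mode == "win":
        return win
    if mode == "score":
        return margin
    if mode == "blend":
        # Squash the margin into (0, 1) so a huge score cannot outweigh a win.
        bonus = 0.5 * (1.0 + math.tanh(margin / scale))
        return (1.0 - score_weight) * win + score_weight * bonus
    raise ValueError("unknown mode: " + mode)


def evaluate(margins, mode="win", **kwargs):
    """Average reward over a list of playout margins for one move."""
    return sum(playout_reward(m, mode, **kwargs) for m in margins) / len(margins)


# Made-up playout margins for two candidate moves.
safe = [0.5, 1.5, 0.5, 2.5, -0.5]       # wins often, by little
risky = [30.5, 25.5, -10.5, -5.5, 40.5]  # wins less often, by a lot

for mode in ("win", "score", "blend"):
    print(mode, evaluate(safe, mode), evaluate(risky, mode))
```

With these made-up numbers, the winrate and the blend both prefer the safe 
move, while the raw score average prefers the risky one.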

>Playing to maximize wins is never the wrong strategy...  
>...counting points is misguided,  it does not improve on the estimate but 
>something else might.
>The point count by itself just doesn't tell you if you are being smart or 
>stupid.

Choosing the move with the highest score (maximizing points) is misguided, but 
by the same token, not all (estimated) "winning" moves are equivalent--for 
programs to get smarter we need better ways to distinguish the win 
probabilities (risk) of those moves.  Programs that currently treat all wins 
as equivalent are losing some close games they might otherwise win with a 
better understanding of risk.  (And perhaps opponent modeling would help.)

The concept of dynamic komi involves adjusting the score with an offset to 
differentiate otherwise equivalent winning moves.  This is one way of combining 
the "maximize wins" strategy and score information.  The problem I see is that 
some "higher score" moves are also "higher risk" moves and lead to more 
volatile positions--and more losses.  For handicap games (where the opponent is 
weaker) this may work okay, but there must be a better way to make use of 
scores against stronger opponents.
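
A rough sketch of the dynamic komi idea and of the problem described above.  
This assumes the simplest possible form: a fixed offset subtracted from each 
playout margin before deciding win or loss.  The function names and the margins 
are made up for illustration; real programs adjust the offset during the search 
in ways not shown here.

```python
def win_with_offset(margin, komi_offset):
    """Count a playout as a win only if it still wins after the offset."""
    return 1.0 if margin - komi_offset > 0 else 0.0


def winrate(margins, komi_offset=0.0):
    """Fraction of playouts won once the offset is applied."""
    return sum(win_with_offset(m, komi_offset) for m in margins) / len(margins)


# Made-up playout margins (real komi already included) for three moves.
small = [0.5, 1.5, 2.5, 0.5]         # always wins, barely
solid = [6.5, 8.5, 7.5, 5.5]         # always wins, comfortably
volatile = [20.5, 30.5, -4.5, 25.5]  # big wins, but sometimes loses

for offset in (0.0, 3.0, 10.0):
    print(offset,
          winrate(small, offset),
          winrate(solid, offset),
          winrate(volatile, offset))
```

With no offset, "small" and "solid" look identical (both win every playout).  
An offset of 3 separates them in favor of "solid".  An offset of 10 makes 
"volatile" look best even though it is the only move that actually loses some 
playouts, which is the higher-score, higher-risk problem.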

Thanks to everyone for taking the time to explain their ideas.  I really 
appreciate the in-depth and open dialogue on this list.  I hope this discussion 
may have clarified some things for others (or even sparked an idea for further 
research).


Ben.

_______________________________________________
Computer-go mailing list
[email protected]
http://dvandva.org/cgi-bin/mailman/listinfo/computer-go
