Re: [Computer-go] MCTS and perfect endgame

terry mcintyre Sun, 03 Jul 2011 20:17:30 -0700

I have several reasons for suggesting some form of the "rich men don't pick 
fights, but they don't give away points either" philosophy.

The major one is that the MCTS scoring function is imperfect; historically, 
programs have snatched defeat from the jaws of victory by letting points be 
nibbled away in yose. 

Second, it is unsatisfying to play against a program which becomes indifferent 
in the yose stage. My reaction is "what, are you phoning in your moves now?" 
This might be annoying but tolerable if the program actually had reason to be 
so sure of itself, but experience has shown that it does not; see above. 

Third, the "only wins matter" approach seems to discard a great deal of useful 
information. 

Terry McIntyre <[email protected]>

Unix/Linux Systems Administration
Taking time to do it right saves having to do it twice.

________________________________
From: Álvaro Begué <[email protected]>
To: [email protected]
Sent: Sun, July 3, 2011 10:50:50 PM
Subject: Re: [Computer-go] MCTS and perfect endgame

On Sun, Jul 3, 2011 at 10:14 PM, terry mcintyre <[email protected]> wrote:
> From: Jean-loup Gailly <[email protected]>
> To: [email protected]
> Sent: Sun, July 3, 2011 9:12:59 AM
> Subject: Re: [Computer-go] MCTS and perfect endgame
>
> Leon,
>> One of the problems (which I tested with gogui, thank you very much)
>> was losing points in the endgame when the program is winning.
> This is by design. Pachi maximises the chance of winning, not the number
> of points. But if you want Pachi to win by more points while increasing
> the risk of losing, you can simply increase the parameter val_scale. See the
> description in uct/uct.c: "How much of the game result value should be
> influenced by win size. Zero means it isn't". The default value is 0.04,
> which is the result of tuning. (If you increase val_scale above this, it
> starts losing more.)
>
> Why should this value be static? Shouldn't the behavior change when there is
> a certain win?

It should be static for a reason that is perhaps more philosophical
than practical. I view MCTS as a procedure to maximize the expected
value of a utility function (i.e., how happy I am with the result),
which is in some important sense the only rational way to make
decisions. If the utility of any win is the same, it makes sense to
simply maximize the probability of winning. If we are not happy with
the program wasting points in a favorable endgame, it must be the case
that we are happier with a win by a large margin than with a win by a
small margin, so it makes sense to build that into the reward
function, which is what val_scale does. Perhaps a sigmoid of some sort
would be a better shape, but it should not be something that changes
dynamically.
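
To make the point above concrete, here is a minimal sketch of a static reward
that mixes win/loss with win size. It is not Pachi's actual code: the function
names, the max_margin normaliser, and the sigmoid width are assumptions chosen
for illustration. Only the name val_scale and its default of 0.04 come from
the description quoted above.

```python
import math


def blended_reward(margin, val_scale=0.04, max_margin=361.0):
    """Reward in [0, 1] for one playout, from the mover's point of view.

    margin:     final score difference (positive means a win).
    val_scale:  weight given to win size; 0 means only win/loss counts.
    max_margin: hypothetical normaliser (every point on a 19x19 board).
    """
    win = 1.0 if margin > 0 else 0.0
    # Map the margin from [-max_margin, max_margin] onto [0, 1].
    clipped = max(-max_margin, min(max_margin, margin))
    size = 0.5 + 0.5 * clipped / max_margin
    # Fixed mixture: the weights never change during the game.
    return (1.0 - val_scale) * win + val_scale * size


def sigmoid_reward(margin, width=10.0):
    """Smooth alternative: a logistic curve of the margin.

    width is a hypothetical tuning knob; a smaller width approaches pure
    win/loss, a larger one rewards big margins more evenly.
    """
    return 1.0 / (1.0 + math.exp(-margin / width))
```

With val_scale = 0 the first function reduces to pure win/loss. With a small
positive val_scale a bigger win scores slightly higher than a narrow one, yet
any win still outranks any loss, so the preference for margin is expressed
once in the utility rather than switched on mid-game.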

Álvaro.
_______________________________________________
Computer-go mailing list
[email protected]
http://dvandva.org/cgi-bin/mailman/listinfo/computer-go

