Re: [computer-go] RefBot (thought-) experiments

Weston Markham Mon, 17 Nov 2008 16:53:22 -0800

On Mon, Nov 17, 2008 at 2:30 PM, Don Dailey <[EMAIL PROTECTED]> wrote:
> On Mon, 2008-11-17 at 16:04 -0200, Mark Boon wrote:
>> On another note, as an experiment I have a bot running on CGOS that
>> is the ref-bot but instead of using a fixed number of simulations I
>> use a fixed amount of time that slowly diminishes towards the end of
>> the game. The result is it does about 200K simulations per move for
>> most of the game on a single processor. Its rating is currently
>> stuck
> After 2k playouts you are already facing serious diminishing returns -
> but there is still a bit more to be had.  But after 5-10K playouts you
> get almost nothing.


I think that I have seen this sort of thing with Monte Carlo programs,
and I think it is possible to get even less than "almost nothing".
You may be getting overly-precise measurements of the Monte Carlo
values of the moves near the beginning of the game, so that the played
moves are biased toward lines of play where the Monte Carlo values are
unrealistically good.  (This could be thought of as being somewhat
analogous to a "horizon effect".)

Put another way, the Monte Carlo values do tend to accurately
distinguish the relatively good moves from the relatively poor moves
(which of course makes them very useful), but at any given position,
you can't expect them to give the best score to the best move, even in
the limit.  As you run additional playouts, you can be more and more
confident that your program has identified the move with the best
Monte Carlo value.  But suppose that there are other moves that are
equally good (or better) under perfect play (or against a particular
opponent).  Then any supposed superiority of the program's selected
move over those alternatives is entirely due to inaccuracies of the
Monte Carlo value.  So once you are running enough playouts to detect
those differences, it also becomes more likely that subsequent
positions will encounter these same inaccuracies.
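
To make the argument above concrete, here is a rough sketch of a toy model.
It is not the ref-bot and not any real playout code. Everything in it is an
assumption made up for illustration: the three moves, their "true" values,
the fixed bias that the playouts add to one of them, and the names
estimate, pick_move and selection_stats. Each playout is just a coin flip
with the move's (biased) Monte Carlo value as the win probability.

```python
import random


def estimate(mc_value, playouts, rng):
    """Fraction of won playouts; each playout is a Bernoulli(mc_value) draw."""
    wins = sum(1 for _ in range(playouts) if rng.random() < mc_value)
    return wins / playouts


def pick_move(mc_values, playouts, rng):
    """Index of the move with the highest estimated Monte Carlo value."""
    estimates = [estimate(v, playouts, rng) for v in mc_values]
    return max(range(len(mc_values)), key=lambda i: estimates[i])


def selection_stats(true_values, biases, playouts, trials, seed=0):
    """Return (how often the MC-best move is chosen, mean true value chosen)."""
    rng = random.Random(seed)
    # The Monte Carlo value is the true value plus a systematic playout bias.
    mc_values = [min(1.0, max(0.0, t + b)) for t, b in zip(true_values, biases)]
    mc_best = max(range(len(mc_values)), key=lambda i: mc_values[i])
    hits = 0
    true_total = 0.0
    for _ in range(trials):
        choice = pick_move(mc_values, playouts, rng)
        hits += choice == mc_best
        true_total += true_values[choice]
    return hits / trials, true_total / trials


# Hypothetical position: move 0 is truly best, but the playouts flatter
# move 1 by 0.05, so move 1 has the best Monte Carlo value.
TRUE_VALUES = [0.52, 0.50, 0.30]
BIASES = [0.00, 0.05, 0.00]

if __name__ == "__main__":
    for playouts in (50, 500, 5000):
        hit_rate, mean_true = selection_stats(
            TRUE_VALUES, BIASES, playouts, trials=100
        )
        print(playouts, round(hit_rate, 2), round(mean_true, 3))
```

Under these made-up numbers, adding playouts only makes the program more
certain to choose move 1, whose advantage over move 0 comes entirely from
the bias. The extra playouts buy confidence in the Monte Carlo ranking, not
a better move.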

Weston
_______________________________________________
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/
