Weston Markham wrote:
I think that I have seen this sort of thing with Monte Carlo programs, and I think it is possible to get even less than "almost nothing". You may be getting overly precise measurements of the Monte Carlo values of the moves near the beginning of the game, so that the played moves are biased toward lines of play where the Monte Carlo values are unrealistically good. (This could be thought of as somewhat analogous to a "horizon effect".)
"less than almost nothing" and "overly precise"? That doesn't make any sense to me. At any point in the simulation run, the MC value will be a noisy representation of the value at it's limit (infinite simulations).
Put another way, the Monte Carlo values do tend to accurately distinguish the relatively good moves from the relatively poor moves (which of course makes them very useful), but at any given position, you can't expect them to give the best score to the best move, even in the limit. As you run additional playouts, you can be more and more confident that your program has identified the move with the best Monte Carlo value. But suppose that there are other moves that are equally good (or better) under perfect play (or against a particular opponent). Then any supposed superiority of the program's selected move over those alternatives is entirely due to inaccuracies of the Monte Carlo value. So once you are running enough playouts to detect those differences, it also becomes more likely that subsequent positions will encounter these same inaccuracies.
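This selection bias is easy to demonstrate with a toy simulation (a hypothetical sketch, not anyone's actual engine): give several moves the *same* true win rate, estimate each from a finite number of playouts, and always pick the move with the best estimate. The estimate of the chosen move comes out systematically higher than the true value, purely because the noise happened to favor it.

```python
import random

random.seed(0)

# Assumed parameters for illustration only.
TRUE_WINRATE = 0.5   # every candidate move is equally good by construction
NUM_MOVES = 10
PLAYOUTS = 100       # playouts per candidate move
TRIALS = 2000

bias_sum = 0.0
for _ in range(TRIALS):
    # Monte Carlo estimate for each move: fraction of won random playouts.
    estimates = [
        sum(random.random() < TRUE_WINRATE for _ in range(PLAYOUTS)) / PLAYOUTS
        for _ in range(NUM_MOVES)
    ]
    # The program plays the move with the best estimated value...
    best = max(estimates)
    # ...whose estimate overstates the (identical) true value.
    bias_sum += best - TRUE_WINRATE

print(f"average optimism of the selected move: {bias_sum / TRIALS:.3f}")
```

The printed bias is positive (several percentage points with these numbers), even though no move is actually better than any other. Adding playouts shrinks the per-move noise, but whatever noise remains is exactly what decides among equally good moves, which is the inaccuracy being described above.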
No one ever alleged that pure AMAF or pure MC was infinitely scalable.

_______________________________________________
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/