I respond to various items below. Sections of the original e-mail that I'm not responding to have been deleted.

Jacques Basaldúa wrote:

> Hello Jason
>
> I think what you are trying to do can be done more easily.

I guess the key question is "what am I trying to do?"

In UCT, the next move to simulate is chosen based on an estimated probability of winning. Correcting bias in that estimate should lead to better sampling.
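
For context, here is a minimal sketch of the UCB1 selection rule that UCT builds on; the child objects and the exploration constant c below are illustrative assumptions of mine, not something from this thread:

    import math

    def select_move(children, c=1.4):
        # children: objects with .wins and .visits simulation counts.
        # c: illustrative exploration constant (an assumption, not from the thread).
        total = sum(ch.visits for ch in children)

        def ucb1(ch):
            if ch.visits == 0:
                return float("inf")  # sample unvisited moves first
            win_rate = ch.wins / ch.visits  # estimated probability of winning
            return win_rate + c * math.sqrt(math.log(total) / ch.visits)

        return max(children, key=ucb1)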

If one abruptly stops all simulations and picks the "best" move based on this estimated probability, I think this may give the optimal answer: choosing the move with the highest expectation of winning the game. It's important to note that with a peaked distribution over moves' probabilities of winning, the estimated probability of winning rises slowly. That means a move with only a few simulations will never be chosen over a move with (a good win rate and) a lot of simulations.
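
To spell out what "stop and pick the best" would look like, a sketch under my own naming (not anyone's actual bot), reusing the child fields from the sketch above:

    def pick_final_move(children):
        # After search stops, choose the move with the highest estimated
        # probability of winning (wins/visits); max(..., 1) guards division by zero.
        return max(children, key=lambda ch: ch.wins / max(ch.visits, 1))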

> C. To compare if a move is better than another, you have
> to compare _confidence intervals_, i.e. the interval in which
> p (the unknown probability) lies, computed from your
> observed p-hat, n, and a desired confidence level, say 95%.
> These intervals can be computed with methods you can
> find by searching for "Confidence Interval for a Binomial
> Proportion". The most used are the Wilson and Agresti-Coull
> intervals. These intervals include the continuity correction
> you mention in your post. Other ways of comparing
> proportions are the difference between proportions, the
> relative risk, and the odds ratio, to name a few. My "Bible"
> for this is the book "Categorical Data Analysis" by
> Alan Agresti, published by Wiley & Sons.

With lots and lots of simulations, this could lead to a conclusion such as "move a is better than move b with 95% confidence". If a bot wants to prove with high confidence that the move it has selected is better than all the others, I suspect it would have to do lots and lots of simulations, which would be impractical. I think that is the same point you made later in your e-mail. I welcome someone proving us wrong ;)
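
For what it's worth, here is a sketch of the Wilson score interval Jacques mentions, under the assumption that each simulation is an independent win/loss trial (the function name and z = 1.96 for ~95% confidence are my choices):

    import math

    def wilson_interval(wins, n, z=1.96):
        # Wilson score interval for a binomial proportion; z = 1.96 is ~95% confidence.
        if n == 0:
            return (0.0, 1.0)
        p_hat = wins / n
        denom = 1 + z * z / n
        center = (p_hat + z * z / (2 * n)) / denom
        half = (z / denom) * math.sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n))
        return (center - half, center + half)

    # e.g. wilson_interval(60, 100)   -> roughly (0.50, 0.69)
    # e.g. wilson_interval(500, 1000) -> roughly (0.47, 0.53)
    # The intervals overlap, so these counts can't separate the two moves.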

An alternative but related approach is "move a is better than move b with a p-value of xx%". Of course, I'm also not too sure how to use that result.
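
To make that concrete, here is a sketch using the textbook pooled two-proportion z-test (my choice of test, nothing the thread prescribes); it returns the two-sided p-value for the null hypothesis that moves a and b have the same true win probability:

    import math

    def two_proportion_p_value(wins_a, n_a, wins_b, n_b):
        # Two-sided p-value for H0: p_a == p_b, via a pooled z statistic.
        # Relies on the normal approximation, so n_a and n_b should be large.
        p_a, p_b = wins_a / n_a, wins_b / n_b
        pooled = (wins_a + wins_b) / (n_a + n_b)
        se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
        z = (p_a - p_b) / se
        return math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))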

> > To use these results, you must make some assumption
> > about the underlying distribution of a move's probability
> > of winning.
>
> That's the good news. You don't. There is no need to
> understand what complex mechanism produces p. Only
> that: same position == same p.

If you take a good look at your tests, they make very specific null hypotheses, which in effect make at least some assumptions about the underlying distributions (or try to wash away all distributional effects with the central limit theorem). For example, the usual two-proportion test treats each simulation as an independent Bernoulli trial with a fixed p and leans on the normal approximation.


> *Do not* expect a sound statistical analysis to tell you
> the best move, unless it is very obvious, n is immense, or
> your confidence level is extremely low. But, if you are
> lucky, it will tell you which moves are clearly bad and can
> be safely (= with a given confidence) pruned out.


I agree with that.
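
One way to act on that pruning idea, sketched under my own assumptions (it reuses the wilson_interval helper from above): drop any move whose upper confidence bound falls below the best lower bound among all moves.

    def prune_clearly_bad(moves, z=1.96):
        # moves: dict of move -> (wins, n) simulation counts.
        # A move is "clearly bad" (at the given confidence) when the upper bound
        # of its interval lies below the best lower bound over all moves.
        bounds = {m: wilson_interval(w, n, z) for m, (w, n) in moves.items()}
        best_lower = max(lo for lo, hi in bounds.values())
        return {m: moves[m] for m, (lo, hi) in bounds.items() if hi >= best_lower}
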
_______________________________________________
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/
