If you google for "computer go" and "beta distribution" you'll find several
relevant links, like this one:
https://webdisk.lclark.edu/drake/publications/BetaDistribution.pdf
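
For anyone who wants to see the arithmetic, here is a minimal Python sketch of the beta-posterior update described in the quoted message below: a uniform Beta(1, 1) prior updated with W wins and L losses. The function names and the prior_alpha/prior_beta parameters are made up for illustration and are not taken from any particular Go program.

```python
def posterior_params(wins, losses, prior_alpha=1.0, prior_beta=1.0):
    """Beta posterior parameters after observing playout results.

    The default prior Beta(1, 1) is the uniform distribution on [0, 1].
    Other prior values are hypothetical pseudo-counts chosen by the caller.
    """
    return prior_alpha + wins, prior_beta + losses


def posterior_mean(wins, losses, prior_alpha=1.0, prior_beta=1.0):
    """Expected win rate under the Beta posterior: alpha / (alpha + beta)."""
    alpha, beta = posterior_params(wins, losses, prior_alpha, prior_beta)
    return alpha / (alpha + beta)


def raw_win_rate(wins, losses):
    """Plain frequency estimate wins / total runs (undefined with no playouts)."""
    return wins / (wins + losses)


if __name__ == "__main__":
    w, l = 3, 1
    print("raw estimate:       ", raw_win_rate(w, l))
    print("uniform-prior mean: ", posterior_mean(w, l))
    print("Beta(2, 2) prior:   ", posterior_mean(w, l, 2.0, 2.0))
```

With the default uniform prior, the posterior mean is (W+1)/(W+L+2), so it is defined even before any playout has been run (it is 1/2 at W = L = 0). A different beta prior only changes the starting pseudo-counts.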



On Thu, Sep 25, 2014 at 7:06 PM, Álvaro Begué <[email protected]>
wrote:

> I believe this has been discussed in the mailing list before: If your
> prior distribution of the win rate of a move is uniform, after L losses and
> W wins the posterior distribution will be a beta distribution with
> alpha=W+1 and beta=L+1. The expected value of this distribution is
> alpha/(alpha+beta) = (W+1)/(W+L+2), which is equivalent to the common trick
> of starting the counters W and L at 1 instead of at 0.
>
> Of course one could start with a different prior, but I think staying
> within the family of beta distributions makes sense because it's very
> tractable.
>
> Is that the kind of thing you were looking for?
>
>
> Álvaro.
>
>
>
> On Thu, Sep 25, 2014 at 6:28 PM, Alexander Terenin <[email protected]>
> wrote:
>
>> Hello everybody,
>>
>> I’m a PhD student in statistics at the University of California, Santa
>> Cruz, and I previously worked on the Go program Orego. I am currently
>> applying for the NSF fellowship and am working on a research proposal
>> related to Bayesian statistics that I would like to use in my application.
>> I wanted to know whether anyone is aware of existing research related to
>> my topic.
>>
>> Currently, it seems most MCTS-based Go programs, in the playouts, treat
>> the strength (win rate) of each move as a fixed, unknown value, which is
>> then estimated using frequentist techniques (specifically, by playing a
>> random game, and taking the estimate to be wins / total runs). Has anyone
>> attempted to instead statistically estimate the strength of each move using
>> Bayesian techniques, by defining a set of prior beliefs about the strength
>> of a certain move, playing a random game, and then integrating the
>> information gained from the random game together with the prior beliefs
>> using Bayes' Rule? Equivalently, has anyone defined the strength of each
>> move to be a random variable rather than a fixed and unknown value? Without
>> making this email too long, there are some theoretical advantages that might
>> allow more information to be extracted from each playout if this setup
>> is used.
>>
>> If you are aware of any work in this direction that has been done, I
>> would love to hear from you! I’ve been looking through a variety of papers,
>> and have yet to find anything - it seems that any work remotely related to
>> Bayes’ Rule has concerned the tree, not the playouts.
>> Thank you in advance,
>>
>> Alex Terenin
>> [email protected]
>> _______________________________________________
>> Computer-go mailing list
>> [email protected]
>> http://dvandva.org/cgi-bin/mailman/listinfo/computer-go
>
>
>
