If you google for "computer go" and "beta distribution" you'll find several relevant links, like this one: https://webdisk.lclark.edu/drake/publications/BetaDistribution.pdf
On Thu, Sep 25, 2014 at 7:06 PM, Álvaro Begué <[email protected]> wrote:

> I believe this has been discussed in the mailing list before: If your
> prior distribution of the win rate of a move is uniform, after L losses and
> W wins the posterior distribution will be a beta distribution with
> alpha=W+1 and beta=L+1. The expected value of this distribution is
> alpha/(alpha+beta) = (W+1)/(W+L+2), which is equivalent to the common trick
> of starting the counters W and L at 1 instead of at 0.
>
> Of course one could start with a different prior, but I think staying
> within the family of beta distributions makes sense because it's very
> tractable.
>
> Is that the kind of thing you were looking for?
>
> Álvaro.
>
> On Thu, Sep 25, 2014 at 6:28 PM, Alexander Terenin <[email protected]>
> wrote:
>
>> Hello everybody,
>>
>> I’m a PhD student in statistics at the University of California, Santa
>> Cruz who previously worked on the Go program Orego, currently in the
>> process of applying for the NSF fellowship. I am working on a Bayesian
>> statistics - related research proposal that I would like to use in my
>> application, and wanted to know if someone was aware of any research
>> related to my topic that has been done.
>>
>> Currently, it seems most MCTS-based Go programs, in the playouts, treat
>> the strength (win rate) of each move as a fixed, unknown value, which is
>> then estimated using frequentist techniques (specifically, by playing a
>> random game, and taking the estimate to be wins / total runs). Has anyone
>> attempted to instead statistically estimate the strength of each move using
>> Bayesian techniques, by defining a set of prior beliefs about the strength
>> of a certain move, playing a random game, and then integrating the
>> information gained from the random game together with the prior beliefs
>> using Bayes' Rule? Equivalently, has anyone defined the strength of each
>> move to be a random variable rather than a fixed and unknown value? Without
>> making this email too long, there’s some theoretical advantages that might
>> allow for more information to be extracted from each playout if this setup
>> is used.
>>
>> If you are aware of any work in this direction that has been done, I
>> would love to hear from you! I’ve been looking through a variety of papers,
>> and have yet to find anything - it seems that any work remotely related to
>> Bayes’ Rule has concerned the tree, not the playouts.
>>
>> Thank you in advance,
>>
>> Alex Terenin
>> [email protected]
>> _______________________________________________
>> Computer-go mailing list
>> [email protected]
>> http://dvandva.org/cgi-bin/mailman/listinfo/computer-go
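
For anyone who wants to see the update from Álvaro's quoted message in concrete terms, here is a minimal Python sketch. The function names and the `prior_alpha` / `prior_beta` parameters are my own illustrative choices, not taken from Orego or any other Go program; the defaults of 1 and 1 correspond to the uniform prior mentioned above.

```python
def beta_posterior(wins, losses, prior_alpha=1.0, prior_beta=1.0):
    """Return (alpha, beta) of the Beta posterior over a move's win rate.

    Assumes a Beta(prior_alpha, prior_beta) prior; Beta(1, 1) is uniform.
    Each win adds 1 to alpha and each loss adds 1 to beta.
    """
    return prior_alpha + wins, prior_beta + losses


def posterior_mean(wins, losses, prior_alpha=1.0, prior_beta=1.0):
    """Expected win rate under the posterior: alpha / (alpha + beta)."""
    alpha, beta = beta_posterior(wins, losses, prior_alpha, prior_beta)
    return alpha / (alpha + beta)


if __name__ == "__main__":
    wins, losses = 3, 1
    # Plain frequency estimate: wins / total runs.
    print("wins / total:  ", wins / (wins + losses))
    # Uniform prior: (W + 1) / (W + L + 2).
    print("posterior mean:", posterior_mean(wins, losses))
    # With no playouts at all the estimate is still defined.
    print("no playouts:   ", posterior_mean(0, 0))
```

With the uniform prior the posterior mean is exactly (W+1)/(W+L+2), so it matches starting both counters at 1. Passing different `prior_alpha` and `prior_beta` values stays within the beta family, which is the "different prior" case mentioned in the quoted message.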
