On Jul 24, 2008, at 1:45 PM, John Stogin <[EMAIL PROTECTED]> wrote:
It seems that the UCB1-Tuned algorithm uses the variance of a normal distribution; however, we believe it would be better to use the variance of a beta distribution. Has any work been done in this area? Are people still using UCB1-Tuned to guide their exploration of moves?
I removed it from my code a few weeks ago. It was partly to reduce template-based complexity and partly because it didn't really fit with multiple win-rate estimators (e.g. RAVE, heuristics).
I recently derived a simple way to combine multiple estimators, and I altered my code to use it.
A long time ago, I posted about using a priori knowledge for the distribution of winning rates and deriving a beta distribution.
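For anyone who wants to compare the two variance terms side by side, here is a rough Python sketch. It is not the code from my engine or from that earlier post. The UCB1-Tuned index follows the usual published formula for 0/1 rewards; the beta part assumes a uniform Beta(1, 1) prior, which is only a placeholder for whatever a priori knowledge you have. The function and parameter names are made up for illustration.

```python
import math


def ucb1_tuned(wins, visits, total_visits):
    """UCB1-Tuned index for a move with 0/1 (loss/win) rewards."""
    mean = wins / visits
    # For 0/1 rewards the sample variance reduces to mean * (1 - mean).
    sample_var = mean * (1.0 - mean)
    # Variance upper bound used by UCB1-Tuned.
    v = sample_var + math.sqrt(2.0 * math.log(total_visits) / visits)
    # 1/4 is the largest possible variance of a 0/1 reward.
    return mean + math.sqrt(math.log(total_visits) / visits * min(0.25, v))


def beta_posterior(wins, visits, prior_a=1.0, prior_b=1.0):
    """Mean and variance of the win rate under a Beta(prior_a, prior_b) prior.

    The default Beta(1, 1) prior is an assumption made for this sketch.
    """
    a = prior_a + wins
    b = prior_b + (visits - wins)
    mean = a / (a + b)
    var = a * b / ((a + b) ** 2 * (a + b + 1.0))
    return mean, var


if __name__ == "__main__":
    print(ucb1_tuned(wins=5, visits=10, total_visits=100))
    print(beta_posterior(wins=5, visits=10))
```

Note that the two variances measure different things: the UCB1-Tuned term estimates the spread of individual playout results, while the beta variance is the uncertainty in the win rate itself and shrinks as visits accumulate.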
Thanks,
John Stogin
_______________________________________________
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/