Chaslot G (MICC) wrote:
p_hat = (w_i + n_h*H_B)/(n_i+n_h)
Interesting... But then how do you compute n_h in practice?
The mathematical derivation is based on estimating an a-priori
probability distribution. In theory, one simply needs to run MC
simulations for a wide variety of heuristically identical situations,
and then fit the best beta distribution to the measured data. A beta
distribution has two parameters - alpha and beta. n_h = alpha+beta
and n_h*H_B = alpha.
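
As a rough illustration of the fitting step, here is a minimal Python sketch.
It assumes a method-of-moments fit (my substitution for "best fit"; a
maximum-likelihood fit would also do) and uses made-up win rates. The names
fit_beta_prior, p_hat, and rates are hypothetical, not from any existing bot.

```python
from statistics import fmean, pvariance


def fit_beta_prior(win_rates):
    """Method-of-moments beta fit to measured win rates.

    Returns (n_h, H_B), where n_h = alpha + beta and H_B = alpha / n_h.
    """
    m = fmean(win_rates)
    v = pvariance(win_rates, mu=m)
    # A beta distribution requires 0 < variance < mean * (1 - mean).
    if not 0.0 < v < m * (1.0 - m):
        raise ValueError("sample variance is incompatible with a beta fit")
    n_h = m * (1.0 - m) / v - 1.0  # alpha + beta
    return n_h, m                  # H_B is the mean of the fitted beta


def p_hat(w_i, n_i, n_h, h_b):
    """Blend w_i wins out of n_i simulations with the heuristic prior."""
    return (w_i + n_h * h_b) / (n_i + n_h)


# Hypothetical win rates measured over heuristically identical situations.
rates = [0.55, 0.62, 0.48, 0.70, 0.58, 0.66, 0.52, 0.61]
n_h, h_b = fit_beta_prior(rates)
alpha, beta = n_h * h_b, n_h * (1.0 - h_b)
print(f"n_h={n_h:.1f} H_B={h_b:.3f} alpha={alpha:.1f} beta={beta:.1f}")
print(f"p_hat after 3 wins in 10 simulations: {p_hat(3, 10, n_h, h_b):.3f}")
```

With zero simulations p_hat reduces to H_B, and as n_i grows it approaches
the raw win rate w_i/n_i. Note that measured win rates carry their own
sampling noise, which this sketch does not correct for.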
In practice... I don't have an MC bot yet. I'm slowly redoing my bot
in D (an up-and-coming programming language, http://www.tiobe.com/tpci.htm).
For the full version of my paper I will compare different ways to modify the probability distribution according to knowledge.
I believe there is no optimal way to do that :(
Well, if beta distributions are a good fit, then the above would be the
optimal probability distribution... Of course, my analysis doesn't
take tree searches into account. Maybe I'll get lucky and it'll work well
like a multi-armed bandit. Actually, even if it doesn't, being optimal
before it's time to build a subtree may be enough. I think I've seen
programs wait for 100 simulations before doing that. If n_h is relatively
small, the effect of the prior is probably sufficiently washed out by then
and optimality probably doesn't matter.
I guess if empirical evidence shows beta distributions are a good fit
and a high n_h is appropriate, then I'll have to revisit the shortcomings...
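
To put a number on the washing-out argument, here is a small Python sketch
of how far the prior still pulls the estimate away from the raw win rate
after 100 simulations. The figures (40 wins, H_B = 0.6, the n_h values) are
invented for illustration, and prior_shift is a hypothetical helper name.

```python
def p_hat(w_i, n_i, n_h, h_b):
    """Blend w_i wins out of n_i simulations with the heuristic prior."""
    return (w_i + n_h * h_b) / (n_i + n_h)


def prior_shift(w_i, n_i, n_h, h_b):
    """Absolute gap between the blended estimate and the raw rate w_i/n_i."""
    return abs(p_hat(w_i, n_i, n_h, h_b) - w_i / n_i)


# Hypothetical node: 40 wins in 100 simulations, heuristic estimate 0.6.
for n_h in (5, 20, 100):
    print(n_h, round(prior_shift(40, 100, n_h, 0.6), 3))
```

Under these assumed numbers the shift is about 0.01 for n_h = 5, about 0.03
for n_h = 20, and about 0.10 for n_h = 100, so only a large n_h still
matters at that point.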
_______________________________________________
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/