On 6/7/07, Peter Drake <[EMAIL PROTECTED]> wrote:

Chaslot G.M.J.B., Winands M.H.M., Uiterwijk J.W.H.M., van den Herik H.J.,
and Bouzy B. Progressive strategies for Monte-Carlo tree search.
1.3.1: Add heuristic value (divided by # of playouts) to UCT value.


While I haven't formally published anything, I have suggested a similar
approach in which the mean used for UCT is replaced by a weighted average of
the traditional mean and a heuristic value.  Essentially, x_bar_h =
(x_bar_uct * n + heuristic * n_h)/(n + n_h).  This is similar to section
1.3.1 in the above paper.  The last time I posted about it, I didn't know
how to modify the full UCB calculation.  While I don't have my notes in
front of me, I think I replaced sqrt(log(n_parent)/n) with
sqrt(x_bar_h*(1-x_bar_h)*log(n_parent)/(n+n_h+1)).
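For anyone who wants to experiment, here is a minimal Python sketch of the
two formulas above.  It is not code from a working bot.  The function names
are made up for illustration, and a few things are assumptions rather than
anything stated above: n_h is treated as the number of "virtual playouts"
the heuristic is worth, heuristic and x_bar_uct are win rates in [0, 1],
n + n_h > 0, and the final value is simply the blended mean plus the
modified exploration term with no tuning constant.

```python
import math


def blended_mean(x_bar_uct, n, heuristic, n_h):
    """Weighted average of the playout mean and the heuristic value.

    x_bar_h = (x_bar_uct * n + heuristic * n_h) / (n + n_h)
    Assumes n + n_h > 0 (n_h is the heuristic's weight in playouts).
    """
    return (x_bar_uct * n + heuristic * n_h) / (n + n_h)


def exploration_term(x_bar_h, n_parent, n, n_h):
    """Modified exploration term, replacing sqrt(log(n_parent) / n).

    sqrt(x_bar_h * (1 - x_bar_h) * log(n_parent) / (n + n_h + 1))
    Assumes 0 <= x_bar_h <= 1 and n_parent >= 1.
    """
    variance_like = x_bar_h * (1.0 - x_bar_h)
    return math.sqrt(variance_like * math.log(n_parent) / (n + n_h + 1))


def selection_value(x_bar_uct, n, heuristic, n_h, n_parent):
    """Assumed combination: blended mean plus the exploration term."""
    x_bar_h = blended_mean(x_bar_uct, n, heuristic, n_h)
    return x_bar_h + exploration_term(x_bar_h, n_parent, n, n_h)


if __name__ == "__main__":
    # Hypothetical numbers: 10 playouts at 50% wins, heuristic says 0.8
    # and is trusted as much as 10 playouts; parent has 100 visits.
    print(selection_value(0.5, 10, 0.8, 10, 100))
```

With n = 0 the blended mean is just the heuristic, so an unvisited move
still gets a finite value, and as n grows the playout mean dominates.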

I started typing up the derivation into something paper-ish, but it's not
done and I haven't tried it out in a bot yet to prove how effective it is
(or isn't).
