On 6/7/07, Peter Drake <[EMAIL PROTECTED]> wrote:
Chaslot G.M.J.B., Winands M.H.M., Uiterwijk J.W.H.M., van den Herik H.J., and Bouzy B. Progressive strategies for Monte-Carlo tree search. 1.3.1: Add heuristic value (divided by # of playouts) to UCT value.
While I haven't done any formal publishing, I have suggested a similar approach where the mean used for UCT is replaced by a weighted average of the traditional mean and a heuristic value. Essentially:

    x_bar_h = (x_bar_uct * n + heuristic * n_h) / (n + n_h)

This is similar to section 1.3.1 in the above paper. The last time I posted about it, I didn't know how to modify the full UCB calculation. While I don't have my notes in front of me, I think I replaced

    sqrt(log(n_parent) / n)

with

    sqrt(x_bar_h * (1 - x_bar_h) * log(n_parent) / (n + n_h + 1))

I started typing up the derivation into something paper-ish, but it's not done, and I haven't tried it out in a bot yet to see how effective it is (or isn't).
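To make the two formulas concrete, here is a minimal Python sketch of them as written above. The function names are made up for illustration, and a few things are assumptions rather than part of the original suggestion: values are taken to lie in [0, 1] (e.g. win rates), n + n_h is assumed to be positive, and the final score simply adds the modified exploration term to x_bar_h in the usual UCB style, with no tuning constant. It is untested in any bot.

```python
import math


def blended_mean(x_bar_uct, n, heuristic, n_h):
    """Weighted average of the playout mean and the heuristic value.

    n is the number of playouts through the node; n_h is the weight
    (in "virtual playouts") given to the heuristic.
    Assumes n + n_h > 0.
    """
    return (x_bar_uct * n + heuristic * n_h) / (n + n_h)


def exploration_term(x_bar_h, n, n_h, n_parent):
    """Modified exploration term, replacing sqrt(log(n_parent) / n).

    Assumes 0 <= x_bar_h <= 1 and n_parent >= 1.
    """
    variance_like = x_bar_h * (1.0 - x_bar_h)
    return math.sqrt(variance_like * math.log(n_parent) / (n + n_h + 1))


def ucb_score(x_bar_uct, n, heuristic, n_h, n_parent):
    """Blended mean plus exploration term.

    Summing the two is an assumption here (standard UCB form);
    any scaling constant is omitted.
    """
    x_bar_h = blended_mean(x_bar_uct, n, heuristic, n_h)
    return x_bar_h + exploration_term(x_bar_h, n, n_h, n_parent)
```

Note that with n = 0 the blended mean reduces to the heuristic alone, and as n grows relative to n_h it approaches the ordinary playout mean.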