> Theorem: In a finite game tree with no cycles, with binary rewards, the UCT > algorithm with c==0 > converges (in the absence of computational limitations) to the game > theoretic optimal policy. >
This is also tree with RAVE instead of UCT, if you ensure that RAVE values are never below some positive constant (this was posted in discussions between Sylvain and me a long time ago in this mailing list, but never published I guess). By the way: - I guess there are several codes in which c=0, or c>0 but with no significant improvement over c=0. If you want to publish scientific papers, claim that c>0; UCT is so widely accepted that your paper is more likely to be accepted with c>0 :-) - even withou parallelization, mogo is better with constant 0. - on the other hand, and in particular for stochastic games with one player (which is industrially quite relevant), I find UCT quite good. Also, progressive widening as published by Rémi is efficient in this one-player setting (but with different constants). I'm definitely convinced that UCT provides a very great improvement, e.g. in front of Bellman's dynamic programming, in many cases. - Another point is that with the pattern-based exploration term (not UCT-like), we have an exploration term; but not at all UCT-like (something like Heuristic(move)/log(2+numberOfSimulations) ). A somewhat philosophical point related to the fact that UCT is not so efficient for Go, is that clear sound algorithms are probably not the main elements in Go; plenty of parameters (the coefficients of patterns, the choice of Monte-Carlo player) are necessary and do not follow short and sound rules. If you want to publish scientific papers, do not say that - you should preferably hide the details, and show only a small and well chosen part of your results, so that people can believe that a small sound idea is the explanation for the fact that the program works. But, in the particular case of Go, and somewhat disappointingly, I'm not sure that reality is like that. I prefer nonlinear optimization or applications of UCT in one-player games or applications for this point of view - there, sound ideas usually work and outperform dirty tricks :-)
_______________________________________________ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/