> I think someone pointed out a long time ago on this mailing list that initializing the prior in terms of Rave simulations was far less efficient than initializing the prior in terms of "real" simulations.

You might be recalling an exchange that I had with Sylvain. I asked how initial bias was implemented in Mogo, and Sylvain replied that either approach (biasing the RAVE values or biasing the UCT values) will work. That is true, but biasing the UCT values is much more forceful.

There are three differences. First, assigning a win (or loss) to a UCT term is more significant than assigning it to a RAVE term, because RAVE observations are vastly more plentiful. Second, assigning to UCT causes the upper confidence bounds to start at a less optimistic level, which wastes fewer trials on pointless exploration. Third, engines (should) have a policy for flowing UCT scores up the tree along transpositions. Assigning to RAVE does not exploit that capability.

That being said, there is a caveat: assignments to UCT should be unbiased estimates of winning percentage. RAVE terms can express other priorities. For example, Pebbles' bias in favor of exploring atari can be as large as 24 wins in 24 trials. The bias varies with the situation, but it is never smaller than 9 wins in 9 trials. Clearly the 24/24 bias (which is given whether the move is winning or not) is not a sensible estimate of winning chances. Nevertheless, the bias works because it changes search behavior favorably. Obviously, you must search atari moves if you don't want to lose. Pebbles' automated parameter tuning system has pushed that parameter up because higher values helped it to win games.

I highly recommend reading the Fuego implementation. I believe that it is largely because of well-judged UCT priors that Fuego plays so efficiently with few trials. For example, Fuego-400nodes is rated 1518 on 9x9 CGOS. That's only 5 trials per empty point!

BTW, Pebbles does not do a good job on this issue. Pebbles uses only RAVE biases, though I have known since my exchange with Sylvain that it was the worse choice. Unfortunately, I started with the other implementation, and now all of the priors are unrelated to winning chances.
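
To make the first two differences concrete, here is a minimal Python sketch. It is not code from Pebbles, Mogo, or Fuego. The `Stats` class, the `exploration_bonus` helper, the constant `c`, and all of the counts are assumptions chosen for illustration, and the bonus is a plain UCB1-style term rather than any engine's actual formula. It adds the same 9-wins-in-9-trials prior once to a node's UCT statistics and once to its RAVE statistics, then compares how far each mean moves.

```python
import math
from dataclasses import dataclass


@dataclass
class Stats:
    """Win/trial counter; a prior is simply added as virtual observations."""
    wins: float = 0.0
    trials: float = 0.0

    def with_prior(self, wins: float, trials: float) -> "Stats":
        # Return a new counter that includes the virtual wins and trials.
        return Stats(self.wins + wins, self.trials + trials)

    def mean(self) -> float:
        # Fall back to 0.5 when nothing has been observed yet.
        return self.wins / self.trials if self.trials else 0.5


def exploration_bonus(parent_trials: float, child_trials: float, c: float = 1.0) -> float:
    """UCB1-style bonus (an assumption; engines use their own variants)."""
    return c * math.sqrt(math.log(parent_trials) / child_trials)


# Hypothetical node: few real playouts, many RAVE observations.
uct = Stats(wins=5, trials=10)
rave = Stats(wins=100, trials=200)

# The same 9/9 prior, applied to each kind of statistic.
uct_biased = uct.with_prior(9, 9)
rave_biased = rave.with_prior(9, 9)

uct_shift = uct_biased.mean() - uct.mean()     # 14/19 - 1/2
rave_shift = rave_biased.mean() - rave.mean()  # 109/209 - 1/2

# Virtual UCT trials also shrink the confidence interval from the start.
bonus_plain = exploration_bonus(parent_trials=1000, child_trials=uct.trials)
bonus_biased = exploration_bonus(parent_trials=1000, child_trials=uct_biased.trials)

print(f"UCT mean shift:  {uct_shift:+.3f}")
print(f"RAVE mean shift: {rave_shift:+.3f}")
print(f"bonus without prior: {bonus_plain:.3f}, with prior: {bonus_biased:.3f}")
```

With these made-up counts the UCT mean moves by roughly a quarter while the RAVE mean barely moves, and the exploration bonus is smaller once the virtual trials are counted. The third difference (flowing scores along transpositions) depends on the engine's tree structure and is not modeled here.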
I need to create a new system that evaluates move quality, and I haven't gotten around to it yet.

_______________________________________________
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/