On Fri, 2007-04-06 at 12:43 -0400, [EMAIL PROTECTED] wrote:
> Alpha/Beta cutoffs only make sense when calling the evaluation
> function twice on the exact same position can be guaranteed to
> provide the exact same value. This is obviously not the case for MC
> evaluation, hence the success of UCT.

I don't know if any of this is true. You can apply alpha beta cutoffs
whether the evaluation function is deterministic or not. UCT calls the
"random play-out" only once on any given position, and there is no reason
in principle that it couldn't be a deterministic evaluation function
instead. I have a theory that the search is more robust with some
randomness, since deterministic evaluation functions have misconceptions.

It's all in how you choose to view things - but I see the two search
techniques as being more similar than different. Of course you can choose
to emphasize the differences and view them as more different if you want
to. Here is how they are similar:

  1. Both use global search.
  2. Both use full width search.
  3. Both use an evaluation function.

UCT uses a "different" search - but they are both essentially mini-max
search.

Someone might argue that UCT is NOT full width, but it is. Chrilly used
the terminology "soft-pruning", which means a pruning decision is not
taken "forever" (to use his definition.) To me, a truly selective program
cuts off a line forever. Otherwise it is brute force (or full width.) Of
course this is my definition, and possibly not the accepted definition.

When Chrilly says alpha beta he doesn't mean just classical alpha beta -
modern alpha beta is full of speculative cutoffs, but like UCT's they
"are not taken forever."

It's not relevant whether Chrilly's program happens to be weaker or
stronger than Crazy Stone at the moment - because there is a lot of black
art in making everything work, whether it's alpha beta or UCT. Note that
the UCT programs are not all the same - they vary widely in how strong
they are.

I have this idea that perhaps a good evaluation function could replace
the play-out portion of the UCT programs. The evaluation function would
return a value between 0 and 1 and would be an estimate of the
probability of winning. It all comes down to whether the inherent
randomness is critical to the success of UCT or not.
If it isn't, then a play-out is nothing more than just an evaluation function.

- Don

_______________________________________________
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/
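
As a rough illustration of the idea above that a play-out is "just an
evaluation function", here is a minimal sketch of a UCT loop in which the
leaf evaluator is an ordinary function argument. Everything in it is an
assumption made for illustration and is not taken from any real program:
the toy take-away game (take 1-3 stones, taking the last stone wins), the
names uct_search, random_playout and static_eval, and the exploration
constant. Both evaluators return a value between 0 and 1 from the point of
view of the side to move.

```python
import math
import random

MOVES = (1, 2, 3)  # toy game: take 1-3 stones; taking the last stone wins


def legal_moves(stones):
    return [m for m in MOVES if m <= stones]


def random_playout(stones, rng):
    """Random play to the end; 1.0 if the side to move at `stones` wins."""
    side_to_move_wins = False  # with no stones left, the side to move has lost
    original_side = True
    while stones > 0:
        stones -= rng.choice(legal_moves(stones))
        if stones == 0:
            side_to_move_wins = original_side
        original_side = not original_side
    return 1.0 if side_to_move_wins else 0.0


def static_eval(stones, rng=None):
    """Hypothetical deterministic evaluator: estimated win probability."""
    if stones == 0:
        return 0.0
    return 0.1 if stones % 4 == 0 else 0.9


class Node:
    def __init__(self, stones):
        self.stones = stones
        self.children = {}    # move -> Node
        self.visits = 0
        self.value_sum = 0.0  # results seen from the side to move at this node


def ucb(parent, child, c):
    # The child's values are from the opponent's viewpoint, so flip them.
    win_rate = 1.0 - child.value_sum / child.visits
    return win_rate + c * math.sqrt(math.log(parent.visits) / child.visits)


def uct_search(root_stones, evaluate, iterations=2000, c=1.4, seed=0):
    """Return the most-visited root move; `evaluate` is any leaf evaluator."""
    rng = random.Random(seed)
    root = Node(root_stones)
    for _ in range(iterations):
        node, path = root, [root]
        # Selection: descend while the node is fully expanded and not terminal.
        while node.stones > 0 and len(node.children) == len(legal_moves(node.stones)):
            node = max(node.children.values(), key=lambda ch: ucb(node, ch, c))
            path.append(node)
        # Expansion: add one untried move, if any.
        if node.stones > 0:
            move = next(m for m in legal_moves(node.stones) if m not in node.children)
            node.children[move] = Node(node.stones - move)
            node = node.children[move]
            path.append(node)
        # Evaluation: one call per new leaf (terminal nodes may be revisited).
        value = evaluate(node.stones, rng)
        # Backup: flip the viewpoint at every ply, mini-max style.
        for n in reversed(path):
            n.visits += 1
            n.value_sum += value
            value = 1.0 - value
    return max(root.children, key=lambda m: root.children[m].visits)
```

Calling uct_search(10, random_playout) and uct_search(10, static_eval)
runs the same search code either way; only the evaluator differs. Whether
the deterministic version is as robust in a real game is exactly the open
question, and this toy does not settle it.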