In a dream last night (not kidding) I realized one of the complicating factors in trying to generalize RAVE.
The UCT tree is highly selective. It allocates trials to paths that delay clear results for as long as possible. This is a side effect of the selection policy, which shifts attention away from a move as soon as it shows inferior results.
This has two effects on any attempt to generalize RAVE. First, you are less likely to obtain statistically significant results, because UCT shifts its attention before such results accumulate. Second, UCT will delve into unfamiliar positions rather than retry familiar but bad ones, so we will often see situations that have never been seen before.
The standard RAVE policy, which focuses attention only on successors, cannot be fooled by faulty generalizations, so it has a robustness edge.
In my dream, I went on to overcome these difficulties and solve the problem. But when I woke up I couldn't remember how I did it...
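To make the first effect concrete, here is a minimal sketch of the selection step, assuming plain UCB1 with an exploration constant of sqrt(2) as the selection rule. Nothing above fixes the exact formula, so treat that as an assumption. The two moves, their fixed win probabilities, and the names ucb1_select and run_trials are hypothetical, made up for illustration; a real tree search would get its results from playouts, not from a fixed coin flip.

```python
import math
import random


def ucb1_select(wins, visits, c=math.sqrt(2)):
    """Return the index of the move with the highest UCB1 score."""
    total = sum(visits)
    best, best_score = 0, float("-inf")
    for i, n in enumerate(visits):
        if n == 0:
            return i  # try every move once before comparing scores
        # mean win rate plus an exploration bonus that shrinks with visits
        score = wins[i] / n + c * math.sqrt(math.log(total) / n)
        if score > best_score:
            best, best_score = i, score
    return best


def run_trials(win_probs, trials, seed=0):
    """Simulate UCB1 over moves with fixed (hypothetical) win probabilities."""
    rng = random.Random(seed)
    wins = [0] * len(win_probs)
    visits = [0] * len(win_probs)
    for _ in range(trials):
        i = ucb1_select(wins, visits)
        visits[i] += 1
        if rng.random() < win_probs[i]:
            wins[i] += 1
    return wins, visits


# Two hypothetical moves: one clearly better than the other.
wins, visits = run_trials([0.7, 0.3], trials=5000)
print("visits per move:", visits)
print("win rates:", [w / n for w, n in zip(wins, visits)])
```

The weaker move should end up with only a small share of the trials, so whatever statistics you hoped to generalize from it rest on a thin sample. That is the first effect described above, in miniature.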