In a dream last night (not kidding) I realized one of the complicating
factors in trying to generalize RAVE.

The UCT tree is highly selective. It tends to allocate trials to paths along
which a clear result is delayed for as long as possible. This is a side
effect of the selection policy, which shifts attention away from a move as
soon as it shows inferior results.
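
To make the selectivity concrete, here is a minimal sketch of a UCB1-style
selection rule applied to a single node with two candidate moves. The
function names, the exploration constant c=1.4, and the fixed win rates are
assumptions chosen for illustration; they are not taken from any particular
Go program.

```python
import math
import random

def ucb1_select(wins, visits, c=1.4):
    """Return the index of the child with the highest UCB1 score."""
    total = sum(visits)
    best, best_score = 0, float("-inf")
    for i, (w, n) in enumerate(zip(wins, visits)):
        if n == 0:
            return i  # try every child once before comparing scores
        score = w / n + c * math.sqrt(math.log(total) / n)
        if score > best_score:
            best, best_score = i, score
    return best

def simulate(win_rates, trials, seed=0):
    """Run `trials` selections against fixed (hypothetical) win rates."""
    rng = random.Random(seed)
    wins = [0] * len(win_rates)
    visits = [0] * len(win_rates)
    for _ in range(trials):
        i = ucb1_select(wins, visits)
        visits[i] += 1
        if rng.random() < win_rates[i]:
            wins[i] += 1
    return visits

# Move 0 wins 70% of the time, move 1 only 30% (made-up numbers).
visits = simulate([0.7, 0.3], trials=2000)
print(visits)
```

The weaker move ends up with only a small share of the trials, which is the
point: its statistics stay thin because attention moves elsewhere early.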

This has two effects on attempts to generalize RAVE. First, you are less
likely to obtain statistically significant results, because UCT shifts its
attention before such results can accumulate.

Second, UCT will delve into unfamiliar positions rather than retrying
familiar but bad situations. So we will often see situations that have never
been seen before.

The standard RAVE policy, which focuses attention only on successors, cannot
be fooled by faulty generalizations, so it has a robustness edge.
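
For reference, here is a rough sketch of how a standard RAVE value is often
computed for one move at one node: the ordinary Monte Carlo mean is blended
with the all-moves-as-first (AMAF) mean gathered from playouts through that
same node. The weighting schedule beta = sqrt(k / (3n + k)) is one commonly
cited choice, and the function name and the constant k=1000 are assumptions
for illustration only.

```python
import math

def rave_value(mc_wins, mc_visits, amaf_wins, amaf_visits, k=1000):
    """Blend the Monte Carlo mean with the AMAF mean for a single move.

    k is an assumed tuning constant: the number of real visits at which
    the two estimates get equal weight under this schedule.
    """
    if mc_visits == 0 and amaf_visits == 0:
        return 0.5  # no information yet; neutral prior (an assumption)
    if amaf_visits == 0:
        return mc_wins / mc_visits
    if mc_visits == 0:
        return amaf_wins / amaf_visits
    beta = math.sqrt(k / (3 * mc_visits + k))
    mc_mean = mc_wins / mc_visits
    amaf_mean = amaf_wins / amaf_visits
    return (1 - beta) * mc_mean + beta * amaf_mean

# With 1000 real visits and k=1000, beta is 0.5, so the two means
# (0.6 and 0.3 here, both made up) are averaged equally.
print(rave_value(600, 1000, 30, 100))
```

Because the AMAF counts come only from playouts that passed through the node
itself, a bad guess about which other positions are "similar" never enters
the estimate.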

In my dream, I went on to overcome these difficulties and solve the problem.
But when I woke up I couldn't remember how I did it...

_______________________________________________
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/
