Hello,
Is there any known (by theory or tests) function of how much an increase in the strength of the simulation policy increases the strength of the MC/UCT program as a whole?
I think that is a very interesting question. In our work on MoGo we found that using a stronger simulation policy can decrease the strength of the MC/UCT program. That is why MoGo relies more on the "sequence idea" than on the "strength idea". Our best simulation policy is quite weak compared to others we tested.

We also have further experiments, from work with David Silver of the University of Alberta. We found that the relation "strong simulation policy" <=> "strong MC program" is wrong at a much larger scale, so the "intransitivity" holds even with much, much stronger simulation policies. Of course there is the simple counterexample of a deterministic player. But our results hold even if we randomise the much stronger policy (in many different ways, tuning the parameters as best we can).

I have some theory about this phenomenon in general, but it is not "polished" enough for the moment. I really think that understanding this experimental evidence deeply, beyond mere intuition, would help us go further. But maybe someone already has.

Sylvain
_______________________________________________
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/