Re: [computer-go] UCT RefBot

Magnus Persson Thu, 20 Nov 2008 09:20:36 -0800

Quoting Mark Boon <[EMAIL PROTECTED]>:

What is not exactly clear to me is what you mean by 'postponing
expansion'. Let me write it in my own words to see if that's what you
mean. When you have selected a best node based on the UCT + wins/visits
value which has no children yet, you first simply do a simulation and
collect the playout result in the current node, including the AMAF
value that you call 'virtual win-visit ratio', and only when that is
done a certain number of times (in your case 10) do you suddenly create
all the children and weight them based on the virtual win-visit ratio
and possibly weight them based on other move-priorities that resulted
from 'heavy' playout selection?

Yes, I "suddenly" create all children, but at creation I have no
simulations and thus no virtual win-visit ratios for the children
(although one might copy such values from higher up in the tree, which
I think the Mogo team tried but with little or no success). The
virtual win-visit ratios are initialized to some default value. But
one can initialize the ratios differently depending on a static
evaluation of the position, using patterns for example, or the
proximity heuristic. It is not clear to me how best to do this, and I
still need to test the parameters for biasing. The nice thing is that
one can initialize the virtual win-visit ratios and keep the real
win-visit ratios unbiased. You can afford to make mistakes here
because, if the position is searched a lot, the virtual values get
data much quicker than the real ones.
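
To make the idea concrete, here is a minimal Python sketch of postponed
expansion as described above. It is only an illustration under stated
assumptions, not the actual program: the names Node, record_playout,
expand and prior are hypothetical, the default virtual ratio of 1/2 is
an assumed value, and the way the prior adds virtual wins is just one
possible biasing scheme. Only the threshold of 10 comes from the
thread. Updating the virtual (AMAF) counts from playouts is left out.

```python
from dataclasses import dataclass, field

EXPAND_THRESHOLD = 10         # visits before children are created (from the thread)
DEFAULT_VIRTUAL_WINS = 1.0    # assumed default: virtual ratio starts at 1/2
DEFAULT_VIRTUAL_VISITS = 2.0  # assumed default


@dataclass
class Node:
    move: object = None
    wins: float = 0.0          # real playout results, never biased
    visits: int = 0
    virtual_wins: float = DEFAULT_VIRTUAL_WINS      # AMAF-style counts,
    virtual_visits: float = DEFAULT_VIRTUAL_VISITS  # these may carry a prior
    children: list = field(default_factory=list)

    def virtual_ratio(self):
        return self.virtual_wins / self.virtual_visits


def expand(node, legal_moves, prior=None):
    """Create all children at once; bias only the virtual counts."""
    for move in legal_moves:
        child = Node(move=move)
        if prior is not None:
            # Hypothetical prior: returns extra virtual wins for a move,
            # e.g. from patterns or proximity to the last move.
            bonus = prior(move)
            child.virtual_wins += bonus
            child.virtual_visits += bonus
        node.children.append(child)


def record_playout(node, won, legal_moves, prior=None):
    """Store one playout result in a leaf; expand once the threshold is hit."""
    node.visits += 1
    node.wins += 1.0 if won else 0.0
    if not node.children and node.visits >= EXPAND_THRESHOLD:
        expand(node, legal_moves, prior)
```

With this layout a bad prior only distorts the virtual counts, which
are the ones that collect data fastest, while wins/visits stay an
unbiased record of the real playouts.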


Magnus
_______________________________________________
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/
