Re: [computer-go] UCT RefBot

Mark Boon Thu, 20 Nov 2008 09:50:52 -0800


On 20-nov-08, at 15:20, Magnus Persson wrote:

Quoting Mark Boon <[EMAIL PROTECTED]>:
What is not exactly clear to me is what you mean by 'postponing
expansion'. Let me write it in my own words to see if that's what you
mean. When you have selected a best node based on the UCT + wins/visits
value which has no children yet, you first simply do a simulation and
collect the playout result in the current node, including the AMAF
value that you call 'virtual win-visit ratio', and only when that is
done a certain number of times (in your case 10) do you suddenly create
all the children and weight them based on the virtual win-visit ratio
and possibly weight them based on other move-priorities that resulted
from 'heavy' playout selection?
Yes, I "suddenly" create all children, but at creation I have no
simulation and thus no virtual win-visit ratios for the children
(although one might copy such values from higher up in the tree,
which I think the Mogo team tried but with little or no success).
The virtual win-visit ratios are initialized to some default value.
But one can initialize the ratios differently depending on a static
evaluation of the position, using patterns for example. Or the
proximity heuristic. It is not clear to me how to best do this. And
I need to test the parameters for biasing. The nice thing is that
one can initialize the virtual win-visit ratios and keep the real
win-visit ratios unbiased. You can afford to make mistakes here
because if the position is searched a lot the virtual values get
data much quicker than the real ones.
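
To make the separation of real and virtual statistics concrete, here is a
minimal sketch, not anyone's actual implementation. The names ChildStats,
new_child, prior_ratio and prior_weight are hypothetical, and the default
of 0.5 with weight 10 is only a placeholder for "some default value". The
point it illustrates is that a prior is written into the virtual counters
only, so the real win-visit ratio stays unbiased.

```python
from dataclasses import dataclass


@dataclass
class ChildStats:
    # Real statistics: updated only when this move is selected in the tree.
    wins: float = 0.0
    visits: int = 0
    # Virtual (AMAF) statistics: may be seeded with a prior at creation.
    virtual_wins: float = 0.0
    virtual_visits: float = 0.0

    def real_ratio(self):
        """Real win-visit ratio, or None while the move is unvisited."""
        return self.wins / self.visits if self.visits else None

    def virtual_ratio(self):
        """Virtual win-visit ratio, or None if there is no data and no prior."""
        if not self.virtual_visits:
            return None
        return self.virtual_wins / self.virtual_visits

    def update_real(self, won):
        self.visits += 1
        self.wins += 1.0 if won else 0.0

    def update_virtual(self, won):
        self.virtual_visits += 1.0
        self.virtual_wins += 1.0 if won else 0.0


def new_child(prior_ratio=0.5, prior_weight=10.0):
    """Create a child whose virtual counters carry the prior.

    prior_ratio could come from patterns or a proximity heuristic
    (assumption: both are scaled to [0, 1]); prior_weight says how many
    virtual playouts the prior is worth. The real counters start at zero.
    """
    return ChildStats(virtual_wins=prior_ratio * prior_weight,
                      virtual_visits=prior_weight)
```

A badly chosen prior only delays things under this scheme: every playout
that contains the move adds a virtual sample, so the seeded value is
diluted quickly, while the real ratio never saw the prior at all.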

OK, things start to fall into place for me. I was wondering all this
time what would happen with the information of the simulations that
happen before expansion. So the answer is: nothing. But at least the
result of the playout gets percolated up the tree, so I suppose you
get the most important bit of information of the playout in your tree
anyway. I have also been wondering before what was meant when I read
that CrazyStone uses ownership information in the patterns (if I
remember that correctly). I was always wondering how you got
ownership information before doing playouts. So it turns out these
come from simulations done before expansion. Somehow I had
disregarded that idea because I was under the assumption it would
discard too much information. But I suppose keeping the playout
results in the tree is valuable enough by itself. And like you say,
you win back a lot by not having to do at least a playout for every
single move at every level in the tree.
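
For reference, postponed expansion as I understand it from this exchange
could be sketched roughly as follows. This is only an illustration under
my own assumptions: Node, uct_select, simulate, legal_moves and playout
are hypothetical names, the exploration constant is arbitrary, and the
playout is assumed to return 1.0 when the player to move at the root wins
and 0.0 otherwise. AMAF values and move priorities are left out to keep
the expansion logic visible.

```python
import math

EXPAND_THRESHOLD = 10  # visits a leaf collects before its children are created


class Node:
    def __init__(self, move=None):
        self.move = move
        self.wins = 0.0       # from the view of the player who played `move`
        self.visits = 0
        self.children = None  # None means "not expanded yet"


def uct_select(node, c=1.0):
    """Pick the child with the best wins/visits + exploration term."""
    def score(child):
        if child.visits == 0:
            return float("inf")  # try every child once first
        explore = math.sqrt(math.log(node.visits) / child.visits)
        return child.wins / child.visits + c * explore
    return max(node.children, key=score)


def simulate(root, legal_moves, playout):
    """Run one simulation: descend, maybe expand, play out, back up."""
    path = [root]
    node = root
    while True:
        if node.children is None:
            if node.visits < EXPAND_THRESHOLD:
                break  # postpone expansion, just play out from here
            moves = legal_moves([n.move for n in path[1:]])
            node.children = [Node(m) for m in moves]  # create all at once
        if not node.children:
            break  # no legal moves: terminal position
        node = uct_select(node)
        path.append(node)
    result = playout([n.move for n in path[1:]])
    # Percolate the result up the tree; nodes at odd depth were reached
    # by a move of the root player, so they score `result` directly.
    for depth, n in enumerate(path):
        n.visits += 1
        n.wins += result if depth % 2 == 1 else 1.0 - result
```

With this layout the first ten simulations through a leaf add nothing but
their result to the tree, and the eleventh one creates all children and
descends into one of them.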

From the beginning I have been experimenting with ownership
information and I think I have a few interesting ideas. But they
didn't fit in yet. Maybe with my new understanding I can go back and
try out a few of those ideas again. It means rewriting my search-code
a bit. Hopefully it's not too much work.


Mark

_______________________________________________
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/
