Nice to hear that :)

2009/6/5 Michael Williams <michaelwilliam...@gmail.com>:
> Łukasz Lew wrote:
>>
>> On Wed, Jun 3, 2009 at 00:56, Michael Williams
>> <michaelwilliam...@gmail.com> wrote:
>>>
>>> Two things:  Firstly, I'm storing (only in RAM) the precalculated Winrate
>>> and InvSqrtVisits and keeping them updated.
>>> So my UCT formula went from
>>>
>>>       Wins / Visits + sqrt(lnParentVisits / Visits)
>>>
>>> to
>>>
>>>       WinRate + sqrtLnParentVisits * InvSqrtVisits;
>>>
>>> This has a memory cost, but I don't care so much about RAM since I can
>>> send
>>> the nodes to disk.
>>>
>>> And the second thing is to store in the parent node a reference to what
>>> is
>>> likely the UCT-best child node.  If the parent has been visited
>>> 100*boardspaces times, I will go directly to the likely-best child with
>>> probability 2047/2048.  Anytime a proper UCT loop occurs, the likely-best
>>> reference is updated (about 90% of the time there is no change, so I
>>> think
>>> it's safe).
>>
>> This is quite similar to epsilon trick described here:
>> http://www.mimuw.edu.pl/~lew/files/epsilon_trick.pdf
>>
>> in short when you calculate best UCT child you visit it
>> max(best.visit_count * epsilon, 1) times
>> with epsilon = 0.05 for instance
>> It works well both for new and old nodes, but you have to keep the
>> counter of visits.
>> The soft way would be to recalculate best child with probability
>> min(1, 1/(best.visit_count*epsilon)).
>>
>> Both variants of ET can give you some guarantees about the way the
>> tree is explored.
>>
>> Łukasz
>
>
> I switched to this method.  Thanks, Lukasz.
>
> _______________________________________________
> computer-go mailing list
> computer-go@computer-go.org
> http://www.computer-go.org/mailman/listinfo/computer-go/
>
_______________________________________________
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

Reply via email to