Why are m and n different? Isn't every playout used both to update the UCT
win rate and the RAVE values for the same nodes? Won't the number of UCT
simulations and the number of RAVE simulations be the same?
David
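A rough sketch of one way the two counts can come apart, assuming the usual all-moves-as-first bookkeeping: the UCT count of a move goes up only when that move is the one chosen at the node, while its RAVE count goes up whenever the move appears anywhere later in the same playout. The names update_counts, uct_count and rave_count are made up for illustration, and each playout is simplified to the list of moves made by the player to move at the node.

```python
from collections import defaultdict


def update_counts(playouts):
    """Tally UCT and RAVE visit counts for the moves available at one node.

    Each playout is a list of moves made by the player to move at this node,
    in order (a simplifying assumption for this sketch).
    """
    uct_count = defaultdict(int)
    rave_count = defaultdict(int)
    for moves in playouts:
        if not moves:
            continue
        # UCT statistic: only the move actually chosen at the node is updated.
        uct_count[moves[0]] += 1
        # RAVE statistic: every distinct move seen in the playout is updated.
        for move in set(moves):
            rave_count[move] += 1
    return dict(uct_count), dict(rave_count)


playouts = [["A", "B", "C"], ["B", "A", "D"], ["C", "A", "B"]]
uct, rave = update_counts(playouts)
```

Here move A is chosen first in one playout but appears in all three, so its UCT count is 1 and its RAVE count is 3, even though every playout updated both tables.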
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED]] On Behalf Of David Silver
David Silver wrote:
>BTW if anyone just wants the formula, and doesn't care about the
>derivation - then just use equations 11-14.
Yes, I just want to use the formula.
But I don't know what the "bias" is...
How can I get the value of br?
By the way, I currently use this formula:
beta = 1 - log(
On Fri, 2008-02-08 at 16:39 -0700, David Silver wrote:
> 2. No, the assumption itself is not correct. The true value of a node
> in the tree is 0 or 1, given perfect play. So the UCT value (which
> just averages the outcomes of simulations) is significantly biased.
Who can predict perfect play?
On Feb 8, 2008 12:09 PM, David Silver <[EMAIL PROTECTED]> wrote:
> I think it is time to share this idea with the world :-)
> The idea is to estimate bias and variance to calculate the best
> combination of UCT and RAVE values.
> I have attached a pdf explaining the new formula.
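For readers without the pdf, here is a hedged sketch of a bias/variance-based weighting. It uses the form of the schedule that is commonly quoted for RAVE, beta = n_rave / (n + n_rave + 4 * n * n_rave * bias^2). Whether this matches equations 11-14 of the attachment is an assumption, since the pdf is not reproduced here. The function names, and treating the bias as a single hand-tuned constant, are also assumptions made for illustration.

```python
def rave_weight(n, n_rave, bias):
    """Weight given to the RAVE value (0 = pure UCT, 1 = pure RAVE).

    n      : number of UCT simulations for the move
    n_rave : number of RAVE simulations for the move
    bias   : assumed RAVE bias, treated here as a tuned constant
    """
    if n_rave == 0:
        # No RAVE data at all: rely on the UCT value alone.
        return 0.0
    return n_rave / (n + n_rave + 4.0 * n * n_rave * bias * bias)


def combined_value(uct_value, rave_value, n, n_rave, bias):
    """Blend the UCT and RAVE estimates using the weight above."""
    beta = rave_weight(n, n_rave, bias)
    return (1.0 - beta) * uct_value + beta * rave_value


# With few UCT simulations the RAVE value dominates; with many it fades out.
early = rave_weight(n=1, n_rave=50, bias=0.1)
late = rave_weight(n=10000, n_rave=50000, bias=0.1)
```

With a nonzero bias the weight tends to zero as n grows, so the combined value approaches the plain UCT average in the long run.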
Thanks!
The ori
Probably true, but I am already running into RAM
limits with big_Mogo18 - had to halve the number of
instances of the autotest program, and am installing
RAM in the next few days to alleviate this problem.
There is also the time-per-game, which will
approximately double.
I'd vote for moving on to