Re: [computer-go] New UCT-RAVE formula (was Re: computer-go Digest, Vol 43, Issue 8)

2008-02-18 Thread Erik van der Werf
Hi David, On Sat, Feb 16, 2008 at 7:07 PM, David Silver <[EMAIL PROTECTED]> wrote: > Yes, but why add upper confidence bounds to the rave values at all? If > they really go down that fast, does it make much of a difference? > > According to the recent experiments in MoGo, you are right :-) Howeve

Re: [computer-go] New UCT-RAVE formula (was Re: computer-go Digest, Vol 43, Issue 8)

2008-02-17 Thread Jason House
Good catch Yamato. I think the idea is that they're trying to calculate the true variances rather than the sample variances. It's true that q_ur would probably give a better estimate than q_u or q_r alone. Of course, q_ur depends on beta, and as they calculate it, beta depends on q_ur. It may b

Re: [computer-go] New UCT-RAVE formula (was Re: computer-go Digest, Vol 43, Issue 8)

2008-02-17 Thread Yamato
David Silver wrote: >There are two differences between your suggestion and the original >formula, so I'll try and address both: > >1. Your formula gives the variance of a single simulation, with >probability value_u. But the more simulations you see, the more you >reduce the uncertainty, so y
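
To illustrate point 1 above, here is a minimal Python sketch of the difference between the variance of a single simulation and the variance of an average over n simulations. The function names are hypothetical and are not taken from the paper or from MoGo; the only assumption is that each simulation is an independent win/loss outcome with win probability p.

```python
def single_simulation_variance(p: float) -> float:
    """Variance of one win/loss (Bernoulli) outcome with win probability p."""
    return p * (1.0 - p)


def mean_value_variance(p: float, n: int) -> float:
    """Variance of the average of n independent win/loss outcomes.

    Averaging n independent outcomes divides the single-outcome
    variance by n, so the uncertainty shrinks as n grows.
    """
    if n <= 0:
        raise ValueError("n must be a positive number of simulations")
    return single_simulation_variance(p) / n


if __name__ == "__main__":
    # With p = 0.5, one simulation has variance 0.25; the mean of 100
    # simulations has variance 0.25 / 100.
    print(single_simulation_variance(0.5))
    print(mean_value_variance(0.5, 100))
```

Which value estimate to plug in for p (value_u or value_ur) is the separate question raised in the thread; the sketch only shows where the division by n comes from.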

[computer-go] New UCT-RAVE formula (was Re: computer-go Digest, Vol 43, Issue 8)

2008-02-16 Thread David Silver
> I am very confused about the new UCT-RAVE formula. Equation 9 seems to mean: variance_u = value_ur * (1 - value_ur) / n. Is it wrong? If correct, why is it the variance? I think that the variance of the UCT should be: variance_u = value_u * (1 - value_u). Hi Yamato, There are two differe

[computer-go] New UCT-RAVE formula (was Re: computer-go Digest, Vol 43, Issue 8)

2008-02-16 Thread David Silver
David Silver wrote: >BTW if anyone just wants the formula, and doesn't care about the >derivation - then just use equations 11-14. Yes, I just want to use the formula. But I don't know what the "bias" is... How can I get the value of br? Sorry for the slow reply... The simplest answer is that t

[computer-go] New UCT-RAVE formula (was Re: computer-go Digest, Vol 43, Issue 8)

2008-02-16 Thread David Silver
Hi Erik, Thanks for the thought-provoking response! > Yes, but why add upper confidence bounds to the rave values at all? If they really go down that fast, does it make much of a difference? According to the recent experiments in MoGo, you are right :-) However, I've seen slightly different resul

Re: [computer-go] New UCT-RAVE formula (was Re: computer-go Digest, Vol 43, Issue 8)

2008-02-15 Thread Yamato
I am very confused about the new UCT-RAVE formula. Equation 9 seems to mean: variance_u = value_ur * (1 - value_ur) / n. Is it wrong? If correct, why is it the variance? I think that the variance of the UCT should be: variance_u = value_u * (1 - value_u). Why can't we use that? Anyway, c

Re: [computer-go] New UCT-RAVE formula (was Re: computer-go Digest, Vol 43, Issue 8)

2008-02-09 Thread Erik van der Werf
Hi David, On Fri, Feb 8, 2008 at 6:09 PM, David Silver <[EMAIL PROTECTED]> wrote: > > Note as well that the current implementation of MoGo (not the one at > > the time of the ICML paper) use a different tradeoff between UCT and > > Rave value, thanks to an idea of David Silver, which brought >

RE: [computer-go] New UCT-RAVE formula (was Re: computer-go Digest, Vol 43, Issue 8)

2008-02-09 Thread Olivier Teytaud
Why are m and n different? Isn't every playout used both to update the UCT win rate and the RAVE values for the same nodes? Won't the number of UCT simulations and the number of RAVE simulations be the same? Each playout is used both to update the UCT win rate and the RAVE values for the same
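
One way to see why the two counts can differ is the sketch below. It assumes the usual all-moves-as-first bookkeeping, in which a playout adds one UCT sample for the move actually chosen at a node but adds a RAVE sample for every move that appears later in the same playout. The class and function names are hypothetical, and the `later_moves` argument (moves by the same player later in the playout) is an assumption of this sketch, not a description of MoGo's code.

```python
from collections import defaultdict


class NodeStats:
    """Per-node statistics: n counts UCT samples, m counts RAVE samples."""

    def __init__(self) -> None:
        self.n = defaultdict(int)            # UCT simulation count per move
        self.uct_wins = defaultdict(float)   # UCT win total per move
        self.m = defaultdict(int)            # RAVE simulation count per move
        self.rave_wins = defaultdict(float)  # RAVE win total per move


def update(stats: NodeStats, move_played, later_moves, result: float) -> None:
    """Record one playout that passed through this node.

    move_played: the move chosen at this node in this playout.
    later_moves: moves by the same player later in the playout (assumed input).
    result: 1.0 for a win, 0.0 for a loss.
    """
    # UCT: only the move actually chosen at this node gets a sample.
    stats.n[move_played] += 1
    stats.uct_wins[move_played] += result
    # RAVE: every move seen in the playout gets a sample, counted once each.
    for move in {move_played, *later_moves}:
        stats.m[move] += 1
        stats.rave_wins[move] += result


if __name__ == "__main__":
    stats = NodeStats()
    update(stats, "a", ["b", "c"], 1.0)
    update(stats, "b", ["a"], 0.0)
    print(stats.n["a"], stats.m["a"])
```

After these two playouts, move "a" has been chosen once at the node but has appeared in both playouts, so its RAVE count is larger than its UCT count.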

RE: [computer-go] New UCT-RAVE formula (was Re: computer-go Digest, Vol 43, Issue 8)

2008-02-08 Thread David Fotland
[...] On Behalf Of David Silver Sent: Friday, February 08, 2008 3:40 PM To: computer-go@computer-go.org Subject: [computer-go] New UCT-RAVE formula (was Re: computer-go Digest, Vol 43, Issue 8) Hi Jason, The original paper's formula for beta always felt wrong to me. I like this new one a lot better. Good!

Re: [computer-go] New UCT-RAVE formula (was Re: computer-go Digest, Vol 43, Issue 8)

2008-02-08 Thread Yamato
David Silver wrote: >BTW if anyone just wants the formula, and doesn't care about the >derivation - then just use equations 11-14. Yes, I just want to use the formula. But I don't know what the "bias" is... How can I get the value of br? By the way I currently use this formula. beta = 1 - log(

Re: [computer-go] New UCT-RAVE formula (was Re: computer-go Digest, Vol 43, Issue 8)

2008-02-08 Thread Jason House
On Fri, 2008-02-08 at 16:39 -0700, David Silver wrote: > 2. No, the assumption itself is not correct. The true value of a node > in the tree is 0 or 1, given perfect play. So the UCT value (which > just averages the outcomes of simulations) is significantly biased. Who can predict perfect play?