On Fri, 2008-02-08 at 16:39 -0700, David Silver wrote:

> 2. No, the assumption itself is not correct. The true value of a node
> in the tree is 0 or 1, given perfect play. So the UCT value (which
> just averages the outcomes of simulations) is significantly biased.

Who can predict perfect play?  I'm willing to treat the value as a
probability that varies slowly as the search continues.  Of course, we
then assume away the slow variation...


> > Calculation of the MSE seems to assume this going into the last step
> > but doesn't simplify life by doing it in the first reduction...

> I've numbered the equations to make it easier to discuss (attached). 
> The zero uct bias is used to get from equations 3 to 4 - is this what
> you mean?
> Maybe there is a simpler way to derive this (or a better) result - I
> am definitely not a statistician! Suggestions welcome :-)

Yeah, that's what I was talking about.  If bu=0 is assumed (and stated),
then bur^2=(B*br)^2 is trivial and equation 3 can be skipped altogether.
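
To spell out why that step is trivial, here is a rough sketch.  It assumes
the combined estimate is the linear combination Qur = B*Qr + (1-B)*Qu, so
that the biases combine the same way; the subscripted symbols are my own
notation and may not match the attached equations exactly.

```latex
\begin{align*}
b_{ur} &= B\,b_r + (1-B)\,b_u \\
b_u = 0 \;\Rightarrow\; b_{ur} &= B\,b_r \\
b_{ur}^2 &= (B\,b_r)^2
\end{align*}
```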


> BTW if anyone just wants the formula, and doesn't care about the
> derivation - then just use equations 11-14.

> > Maybe it's just academic, but when I plug in bias = 0, I don't get the UCT
> > formula for sims = n+m.  Q comes out correct, but Q+ does not.  I guess I'd
> > sort of expect to see something along the lines of Q+ur = Qur +
> > c*sqrt(log(???)/x) where x = B^2/m + (1-B)^2/n.  When br = 0, x reduces to
> > m+n.  Maybe I'm just crazy and there's no good way to compute "???" inside
> > my log.

> I think this is a really good point. Making a linear combination of
> the upper confidence bounds (14) is not mathematically justified
> (although it works in practice).

I think I got my equation for x wrong.  It should be the reciprocal of
what I wrote: x = 1/(B^2/m + (1-B)^2/n).  Maybe mathematically justified
bounds would work better in practice, but the "???" term, along with a
few floating constants, would make it tough to tune them well enough to
find out.
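
As a quick sanity check on the corrected x, here is a small Python sketch.
It assumes that with br = 0 the weight is B = m/(m+n), which is my guess
at what the formula gives in that case and is not stated above; the
function names are made up for illustration.

```python
from fractions import Fraction


def effective_sims(B, m, n):
    """Corrected x: the reciprocal of B^2/m + (1-B)^2/n."""
    return 1 / (B * B / m + (1 - B) * (1 - B) / n)


def zero_bias_weight(m, n):
    """Assumed weight on the m-sample estimate when br = 0 (hypothetical)."""
    return Fraction(m, m + n)


if __name__ == "__main__":
    m, n = 30, 10
    B = zero_bias_weight(m, n)
    # With the assumed weight, x should come out equal to m + n.
    print(B, effective_sims(B, m, n), m + n)
```

Using exact fractions avoids any floating-point fuzz in the comparison.
With any other B (say 1/2 for the same m and n), x comes out smaller than
m+n, which is what I'd expect from a sub-optimal weighting.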


_______________________________________________
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/
