On Fri, 2008-02-08 at 16:39 -0700, David Silver wrote:
> 2. No, the assumption itself is not correct. The true value of a node
> in the tree is 0 or 1, given perfect play. So the UCT value (which
> just averages the outcomes of simulations) is significantly biased.

Who can predict perfect play? I'm willing to go with a slow-varying
probability as search continues. Of course, we assume away that slow
part...

> > Calculation of the
> > MSE seems to assume this going into the last step but doesn't simplify life
> > by doing it in the first reduction...
> I've numbered the equations to make it easier to discuss (attached).
> The zero uct bias is used to get from equations 3 to 4 - is this what
> you mean?
> Maybe there is a simpler way to derive this (or a better) result - I
> am definitely not a statistician! Suggestions welcome :-)

Yeah, that's what I was talking about. If bu=0 is assumed (and stated),
then bur^2=(B*br)^2 is trivial and equation 3 can be skipped altogether.

> BTW if anyone just wants the formula, and doesn't care about the
> derivation - then just use equations 11-14.

> > Maybe it's just academic, but when I plug in bias = 0, I don't get the UCT
> > formula for sims = n+m. Q comes out correct, but Q+ does not. I guess I'd
> > sort of expect to see something along the lines of Q+ur = Qur +
> > c*sqrt(log(???)/x) where x = B^2/m + (1-B)^2/n. When br = 0, x reduces to
> > m+n. Maybe I'm just crazy and there's no good way to compute "???" inside
> > my log.
> I think this is a really good point. Making a linear combination of
> the upper confidence bounds (14) is not mathematically justified
> (although it works in practice).

I think I got my equation for x wrong. I think (true x) = 1/(old x
definition). Maybe mathematically justified bounds would work better in
practice, but the "???" term along with a few floating constants would
make optimizing it to find out the truth kind of tough.
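
For anyone who wants to check the x correction numerically, here is a rough sketch. It assumes br = 0 and that the weight then takes the usual inverse-variance form B = m/(m+n); that form is my assumption for illustration and is not stated in this thread, and the function names are made up. Under that assumption the old definition comes out as 1/(m+n), so its reciprocal is the m+n one would expect inside the square root.

```python
from fractions import Fraction


def old_x(B, m, n):
    """Old definition from the thread: x = B^2/m + (1-B)^2/n."""
    return B * B / m + (1 - B) ** 2 / n


def zero_bias_weight(m, n):
    """ASSUMPTION: with br = 0 the weight is B = m/(m+n).

    This is the inverse-variance weighting of two unbiased estimates
    backed by m and n simulations; it is not taken from the thread.
    """
    return Fraction(m, m + n)


# Exact rational arithmetic, so the equalities below are not
# affected by floating-point rounding.
for m, n in [(1, 1), (10, 90), (250, 40)]:
    B = zero_bias_weight(m, n)
    x_old = old_x(B, m, n)
    assert x_old == Fraction(1, m + n)  # old x is 1/(m+n)
    assert 1 / x_old == m + n           # corrected x is m+n
```

This only covers the zero-bias case; it says nothing about what belongs in the "???" inside the log.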