If one sets c=0, one must do something else to balance the width and depth. If
not, one will get very inconsistent results. By the way, I'm not sure c is the
best way to balance the width and depth. This could be an excellent subject
for research.
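
To make the role of c concrete, here is a minimal sketch of child selection with the standard UCB1 formula, which is what I assume "c" refers to in this thread. The function names and the (wins, visits) representation are made up for illustration and are not taken from any particular program. With c=0 the rule is purely greedy on win rate (all depth); a larger c favours rarely visited children (more width).

```python
import math

def uct_score(wins, visits, parent_visits, c):
    """Mean win rate plus c times the UCB1 exploration term."""
    if visits == 0:
        return math.inf  # assumption: every child is tried at least once
    return wins / visits + c * math.sqrt(math.log(parent_visits) / visits)

def select_child(children, c):
    """children: list of (wins, visits) pairs. Returns the index of the best child."""
    parent_visits = sum(visits for _, visits in children)
    return max(
        range(len(children)),
        key=lambda i: uct_score(children[i][0], children[i][1], parent_visits, c),
    )

# Child 0 has the better win rate (6/10); child 1 has fewer visits (1/2).
children = [(6, 10), (1, 2)]
greedy_choice = select_child(children, 0.0)     # c=0: win rate alone decides
exploring_choice = select_child(children, 1.0)  # c=1: the visit-count bonus matters
```

In this toy case c=0 keeps choosing child 0, while c=1 switches to the less-visited child 1.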
DL
-----Original Message-----
From: Brian Sh
Hi Yamato,
If M and N are the same, is there any reason to run M simulations and
N simulations separately? What happens if you combine them and calculate
V and g in a single loop?
I think it gives the wrong answer to do it in a single loop. Note that
the simulation outcomes z are used
> Theorem: In a finite game tree with no cycles, with binary rewards, the UCT
> algorithm with c==0 converges (in the absence of computational limitations)
> to the game-theoretic optimal policy.
>
This is also true with RAVE instead of UCT, if you ensure that RAVE values
are never below some po
In reading Sylvain Gelly's thesis, it seemed that incorporating a prior
estimate of winning percentage is
very important to the practical strength of Mogo.
E.g., with 1 trials, Mogo achieved 2110 rating on CGOS, whereas my
program attempts to
reproduce existing research and is (maybe) 1900 ra