> I was surprised many MC programs are not UCT anymore.
> UCB = (wins / games) + C*sqrt( log(all_games) / games )
> But in MFG, CS, Pachi and Fuego, C = 0. So they use something like this.
> UCB_RAVE = (1-beta)*(wins / games) + beta*(rave_wins / rave_games) +
> somebias.
>

I think that in many UCT implementations C was already so small that the
behaviour was close to the C=0 case.

In fact, with C=0 the plain ratio wins/games is not asymptotically
consistent: a move whose first simulation is a loss sits at 0/1 and is never
tried again as long as another move has a score >0.
Using "(wins+K)/(games+2K)" instead, for any K>0, makes MCTS consistent.
We've worked on this in http://hal.inria.fr/inria-00437146/ .
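
A minimal sketch of the 0/1 problem, under assumptions that are not from the paper: a single node with two hypothetical moves "A" and "B", purely greedy selection (C=0, no RAVE term), and fixed, repeating playout outcomes so the run is deterministic. The names raw_mean, smoothed_mean and run_greedy are made up for this illustration.

```python
from itertools import cycle


def raw_mean(wins, games):
    """Plain empirical win rate, wins/games."""
    return wins / games


def smoothed_mean(wins, games, k=1.0):
    """Regularised win rate (wins+K)/(games+2K), with K > 0."""
    return (wins + k) / (games + 2 * k)


def run_greedy(score, outcomes, stats, steps):
    """Pick the best-scoring move `steps` times (C = 0) and update its counts.

    `stats` maps move -> (wins, games); `outcomes` maps move -> a repeating
    list of playout results (1 = win, 0 = loss). Both are illustrative inputs.
    """
    stats = {move: list(wg) for move, wg in stats.items()}
    streams = {move: cycle(seq) for move, seq in outcomes.items()}
    for _ in range(steps):
        move = max(stats, key=lambda m: score(*stats[m]))
        stats[move][0] += next(streams[move])
        stats[move][1] += 1
    return stats


if __name__ == "__main__":
    # "A" is the better move (wins 4 playouts out of 5) but lost its first one.
    # "B" is the worse move (wins 1 playout out of 5) but won its first one.
    start = {"A": (0, 1), "B": (1, 1)}
    playouts = {"A": [1, 1, 1, 1, 0], "B": [0, 0, 0, 0, 1]}

    print("wins/games:         ", run_greedy(raw_mean, playouts, start, 100))
    print("(wins+K)/(games+2K):", run_greedy(smoothed_mean, playouts, start, 100))
```

With wins/games, "A" stays at 0/1 for the whole run because "B" always keeps a score above 0. With the regularised ratio, the score of "B" decays toward its true rate, "A" gets simulated again, and it ends up with most of the simulations.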
Olivier
_______________________________________________
Computer-go mailing list
[email protected]
http://dvandva.org/cgi-bin/mailman/listinfo/computer-go
