I talked to Chenjun just now, so this is what we both remember.
The PUCB formula as published in Chris Rosin’s paper actually has an additive 
knowledge term, and it looks nothing like the two different PUCT variants tried 
in AlphaGo and our paper.

Chenjun tried an additive term as in Chris’ paper first, and it did not work 
well for him. Then he tried the “UCT-like” PUCT as in our paper, with the decay 
term under the square root. This worked well for him. He never tried the 
AlphaGo formula with the faster decay.
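
To make the difference in decay concrete, here is a minimal sketch of the two exploration terms as they are quoted further down in this thread. The function names and the default c_puct=1.0 are made up for illustration, and reading "lg" as log base 2 is an assumption; neither paper is being quoted for code here.

```python
import math


def agz_bonus(prior, parent_visits, child_visits, c_puct=1.0):
    """Term as quoted from the AGZ paper:
    Cpuct * P(s,a) * sqrt(Sum(N(s,b))) / (1 + N(s,a)).
    The division by (1 + N) sits outside the square root."""
    return c_puct * prior * math.sqrt(parent_visits) / (1 + child_visits)


def uct_like_bonus(prior, parent_visits, child_visits, c_puct=1.0):
    """UCT-like term: Cpuct * P(s,a) * sqrt(lg(N(s)) / (1 + N(s,a))).
    Assumption: "lg" is log base 2. Requires parent_visits >= 1."""
    return c_puct * prior * math.sqrt(
        math.log2(parent_visits) / (1 + child_visits)
    )


if __name__ == "__main__":
    # Hypothetical numbers: one move with prior 0.5 under a parent
    # that has been visited 1024 times.
    for n in (0, 1, 3, 15, 63):
        print(n, agz_bonus(0.5, 1024, n), uct_like_bonus(0.5, 1024, n))
```

With the parent count held fixed, the first term shrinks like 1/(1 + N) and the second like 1/sqrt(1 + N), which is the "faster decay" versus "decay under the square root" distinction above. It says nothing about which one plays better.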

Beyond the published papers, my memory is that many people tried many different 
combinations of knowledge terms and of decay formulas over time. I have never 
read any systematic comparison, or any principled argument for which type of 
decay is “best” for which type of knowledge, and why. It must depend on the 
quality of knowledge, amongst many other things.

There are also earlier MoGo papers that combined many different evaluation 
terms into one formula and tested them empirically.

        Martin

> However, the formula in the AGZ paper doesn't look like any "UCT variant". 
> Formula from paper: Cpuct * P(s,a) * sqrt(Sum(N(s,b))) / (1 + N(s,a)). Note
> that there is no logarithmic term, and the division by N+1 falls outside the
> sqrt. For comparison, a normal UCT term looks like
> sqrt(ln(sum(N(s,b))) / (1 + N)).
> 
> Since I asked my question, I found that other people have also noticed a 
> discrepancy. I saw a post on a DeepChem board about this subject. I also 
> found a paper 
> (https://webdocs.cs.ualberta.ca/~mmueller/ps/2016/2016-Integrating-Factorization-Ranked.pdf)
>  by our old friends Chenjun Xiao and Martin Muller:
> 
>     "We apply a variant of PUCT [11] formula which is used in AlphaGo [12] to 
> integrate FBT knowledge in MCTS. ...." But then the formula that they give 
> differs: argmax(Q(s,a) + Cpuct * P(s,a) * sqrt(lg(N(s)) / (1 + N(s,a))))
> 
> I am guessing that Chenjun and Martin decided (or knew) that the AGZ paper 
> was incorrect and modified the equation accordingly.
> 
> Anyone remember anything about this?
> 
_______________________________________________
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go
