In the article mentioned above it is said that:

"A weakness remains, due to the initial time: at the very beginning, neither AMAF
values nor standard statistics are available."
I have developed a model that aims at fixing that weakness. It yields
virtual simulations out of an arbitrary ancestor, at a rate of 1/2 for each
node you go through. I call this process "GENERALIZED_AMAF". I'm not familiar
with mathematical terms, but I think the idea is dead simple.
AMAF goes like this:
Evaluate the expected result KNOWING that move A has been played.
GENERALIZED_AMAF goes like this:
Evaluate the expected result KNOWING that move A AND move B AND move C
have all been played.
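The two estimators can be sketched as the same filter over recorded
simulations, differing only in how many moves they condition on (a minimal
sketch; the names and the simulation format are mine, not from the original
post):

```python
# Sketch: each simulation is recorded as (set of moves played, result),
# where result is 1 for a Black win and 0 for a Black loss.
def expected_result(simulations, required_moves):
    """Mean result over simulations in which every required move was played."""
    matching = [result for moves, result in simulations
                if required_moves <= moves]  # <= is subset test on sets
    return sum(matching) / len(matching) if matching else None

sims = [
    ({"A", "B"}, 1),
    ({"A", "C"}, 0),
    ({"A", "B", "C"}, 1),
]

amaf_a = expected_result(sims, {"A"})             # classical AMAF: condition on A alone
gen_abc = expected_result(sims, {"A", "B", "C"})  # generalized: condition on A AND B AND C
```

Here `amaf_a` averages all three simulations (2/3), while `gen_abc` keeps only
the one simulation in which A, B and C were all played (1.0).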
--------------
For example, you can easily build an AMAF map which records, for each
simulation, both the moves that were played by Black and the moves that were
played by White, associated with the result (Black win/loss).
Given a node, classical AMAF then checks every one of those maps: for each
simulation starting from that node, it looks at whether the move of the child
you are trying to score was played.
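Such an AMAF map entry and the classical per-child lookup might look like
this (a sketch under my own naming; the record layout is only one way to
store the Black/White split described above):

```python
from dataclasses import dataclass

# One AMAF map entry: which moves each side played during a simulation,
# plus the outcome from Black's point of view.
@dataclass
class SimulationRecord:
    black_moves: set
    white_moves: set
    black_won: bool

def amaf_score(records, move, for_black=True):
    """Classical AMAF: win rate, for the side to move, over simulations
    in which that side played `move` at some point."""
    hits = [r for r in records
            if move in (r.black_moves if for_black else r.white_moves)]
    if not hits:
        return None
    black_winrate = sum(r.black_won for r in hits) / len(hits)
    return black_winrate if for_black else 1.0 - black_winrate

records = [
    SimulationRecord({"D4", "C3"}, {"Q16"}, True),
    SimulationRecord({"D4"}, {"Q16", "R4"}, False),
]
```

For instance, `amaf_score(records, "D4")` averages both simulations (0.5),
while `amaf_score(records, "C3")` matches only the first one (1.0).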
Generalized AMAF can build statistics out of simulations from arbitrary
ancestors of the considered node. You get about (1/2)^n of the ancestor's
simulations as GENERALIZED_AMAF simulations (n being the number of choices
made on the path from the ancestor to the node you assess).
Suppose you have 1000 simulations for a root node R.
Then AMAF gives you about 500 simulations for every child.
Let's call those children Cn, where n is the id of the child;
Cn also denotes the move made from R to reach that child.
Then, for each child of Cn, you can get about 250 generalized-AMAF
simulations: you consider, from the set of simulations from the root, only
those in which Cn was played, and then aggregate the results in the AMAF
fashion for each son of Cn.
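The counting in the example above follows directly from the rate of 1/2 per
node: the expected number of usable simulations halves at each choice along
the path (a sketch; the function name is mine):

```python
def generalized_amaf_sample_size(ancestor_sims, n_choices):
    """Expected number of an ancestor's simulations usable for generalized
    AMAF at a node n_choices below it, at a rate of 1/2 per node, i.e.
    ancestor_sims * (1/2)^n."""
    return ancestor_sims * 0.5 ** n_choices

# 1000 simulations at the root R:
#   depth 1 (each child Cn)        -> about 500
#   depth 2 (each child of a Cn)   -> about 250
```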
My prototype scored a 30% win rate against GNU Go level 0, without any
tuning, using plain standard light simulations. It used generalized AMAF
as a way to get RAVE values, which would then guarantee that the most
promising nodes are explored exponentially more often. (It used a raw
non-deterministic algorithm.) I did not, however, try to compare that win
ratio with the win ratio I would have got out of standard AMAF.
Best regards,
Denis FIDAALI.
> Date: Mon, 1 Dec 2008 21:55:03 +0100
> From: [EMAIL PROTECTED]
> To: computer-go@computer-go.org
> Subject: Re: [computer-go] Mogo MCTS is not UCT ?
>
> > I think it's now well known that Mogo doesn't use UCT.
> > I realize that i have no idea at all what Mogo do use for
> > it's MCTS.
>
> A complicated formula mixing
> (i) patterns (ii) rules (iii) rave values (iv) online statistics
>
> Also we have a little learning (i.e. late parts of simulations
> are evolved based on online statistics and not only the early parts).
>
> > I really wonder if there was an article describing
> > the new MCTS of mogo somewhere that i missed.
> > How is it better than UCT ?
>
> http://www.lri.fr/~teytaud/eg.pdf contains most of the information
> (many other things
> have been tried and kept as they provided small improvements, but
> essentially the
> ideas are in this version)
>
> Best regards,
> Olivier
> _______________________________________________
> computer-go mailing list
> computer-go@computer-go.org
> http://www.computer-go.org/mailman/listinfo/computer-go/