Hi such question that do you typically traverse all child objects or is there faster way to select explored node child object. I have concluded that it is not at least easy as multiple nodes uct values change each simulation so trying to keep biggest uct value in first in list is maybe too expensive best practical way is go trough all child moves.

Or if _all_ UCT leaf nodes could be uctvalue "heap or similar",
   then biggest leaf is selected,
   traversed for root,
   game from root->leaf re-played- then random simulation
   leaf is stored its right position.

So atleast in this scenario selecting uct node is fast but this heap updating is it worth it?

t. Harri

----- Original Message ----- From: "Gunnar Farnebäck" <[EMAIL PROTECTED]>
To: "computer-go" <computer-go@computer-go.org>
Sent: Saturday, February 02, 2008 7:47 PM
Subject: Re: [computer-go] Hydra theory (was Hybrid theory)


David Doshay wrote:
> I looked up borda voting on Wikipedia. I did not know this was called
> Borda voting, and it might be called a zeroth-order version of what I
> am thinking. Rather than just take rank order from each, I intended to
> try to include other metrics, for example, some measure of distance
> from top. One engine may evaluate that there is one really great move
> with all others considered very bad. That is different than many nearly
> equal good moves.

[Commenting a somewhat arbitrary message in the thread.]

MonteGNU (and "gnugo --monte-carlo" in the next development release)
is doing a simple form of voting. Internally there are two engines;
one UCT-MC engine and the ordinary GNU Go move generation. The UCT
engine is allowed to nominate a number of moves and the ordinary move
generation must choose one of those.

More specifically the UCT engine nominates the following moves:

1. The highest scoring move in terms of win rate. Let N be the number
   of simulations for this move.
2. All moves with more than N/2 simulations
3. All moves with win rate win rate greater than 0.9 and more than
   N/10 simulations.
4. All moves if the highest scoring move has win rate less than 0.1.

The move values from the ordinary move generation are then used to
choose one of the nominated moves. If all nominated moves are thought
useless the highest scoring UCT move is chosen. Pass is generated if
and only if the ordinary move generation wants to pass.

Usually this works fairly well. Most of the strength (on 9x9) comes
from the UCT part and if it finds a clear winner the ordinary move
generation only has a single move to choose from. When safely ahead
many moves are nominated and the ordinary move generation gets to play
for points, providing sensibly looking moves. Similarly when clearly
losing and resignation is disabled, ordinary move generation gets to
finish the game off in a tidy manner.

A nice point is that with too few simulations the UCT engine will
nominate lots of moves and the total result is that it more or less
reverts to the ordinary move generation.

Occasionally, however, it gives the worst of two worlds. The UCT
engine may have a good move A first but a bad move B close behind and
also nominated . The ordinary move generation maybe likes another good
move C best and hates B but has completely missed A. Then B will be
played, although both engines prefer better moves.

/Gunnar
_______________________________________________
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

_______________________________________________
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

Reply via email to