
You're probably right and I'm misunderstanding how it's supposed to work.

Let me quote the original description:

  6.  Scoring for game play uses AMAF - all moves as first.  In the
      play-outs, statistics are taken on moves played during the
      play-outs.  Statistics are taken only on moves that are played by
      the side to move, and only if the move in question is being
      played for the first time in the play-out (by either side.)  A
      win/loss record is kept for these moves.
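One possible reading of item 6, as a minimal Python sketch. It assumes "the side to move" means the side to move at the position where the play-out starts, and that passes are not recorded; neither is confirmed by the text above. The name `amaf_update` and the `(side, move)` tuple format are hypothetical.

```python
from collections import defaultdict


def amaf_update(stats, playout_moves, root_side, winner):
    """Update AMAF win/loss records from one finished play-out.

    stats         -- dict mapping move -> [wins, losses]
    playout_moves -- list of (side, move) in the order played; a pass is None
    root_side     -- side to move where the play-out started (assumed reading)
    winner        -- side that won the play-out
    """
    seen = set()  # moves already played in this play-out, by either side
    for side, move in playout_moves:
        if move is None:  # assumption: passes get no statistics
            continue
        first_time = move not in seen
        seen.add(move)
        # Count only moves by the root side that nobody played earlier.
        if side == root_side and first_time:
            if winner == root_side:
                stats[move][0] += 1
            else:
                stats[move][1] += 1
    return stats


stats = defaultdict(lambda: [0, 0])
playout = [("B", "C3"), ("W", "D4"), ("B", "D4"), ("W", "C3"), ("B", "E5")]
amaf_update(stats, playout, root_side="B", winner="B")
```

In this example Black's D4 is skipped because White played D4 first, so only C3 and E5 receive a win. No tree is involved under this reading: every play-out updates the same flat table.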

Somehow I had construed this to mean you're building a tree, where you
only gather statistics when you enter a unique position. (Which is basically what would
happen with UCT search expansion.) If this is not what it means I'm
totally in the dark as to what exactly this section does mean.

When you say you count all the playouts starting from an empty board, then
I have no idea how our outcome can be different by 3-4 moves, which is
coincidentally the average depth of a uniform tree of 1,000,000 moves on a 9x9 board.

When I run the Orego playout benchmark (without mercy rule) for 1,000,000 playouts, I get an average game length of 114.5. That matches the number my own implementation gives.

With your explanation I think I'm getting a bit closer to understanding what your reference implementation does. But I'd like to see some more clarity before I attempt to implement it. In particular, criterion #6 (quoted above) and #3 need some
disambiguation (if that's a word).

For example, in #3 I understand that the ending criterion is 2 consecutive passes or the game length exceeding N*N*3, where N is the board size. The bit about 'at least 1 move gets tried...' is not clear to me.
I think it should read:

Play-outs stop after 2 consecutive pass moves, OR when N*N*3 moves have been completed (where N is the size of the board). Except that at least 1 move gets tried.

But that still leaves me with the question of how you can try at least one move when no legal moves are left.
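To make the stopping rule concrete, here is a minimal Python sketch of one interpretation. It assumes that a pass counts as a move toward the N*N*3 limit and that a side with no legal move passes, so the loop body always runs at least once; whether that is what 'at least 1 move gets tried' means is exactly the open question. The callbacks `legal_moves` and `play` are hypothetical stand-ins for a real board implementation.

```python
import random


def run_playout(legal_moves, play, board_size, rng=random):
    """Play uniformly random moves until a stopping condition is met.

    legal_moves -- hypothetical callback: list of legal non-pass moves
                   for the side to move
    play        -- hypothetical callback: applies a move (None means pass)
    Returns the number of moves played, passes included (an assumption).
    """
    max_moves = board_size * board_size * 3
    consecutive_passes = 0
    moves_played = 0
    while consecutive_passes < 2 and moves_played < max_moves:
        candidates = legal_moves()
        # With no legal move left, the side to move passes.
        move = rng.choice(candidates) if candidates else None
        play(move)
        moves_played += 1
        consecutive_passes = consecutive_passes + 1 if move is None else 0
    return moves_played
```

Under these assumptions a play-out from a position with no legal moves consists of just two passes.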

You may also need to include an exact definition of AMAF.


On 23-Oct-08, at 11:19, Don Dailey wrote:

On Thu, 2008-10-23 at 09:38 -0200, Mark Boon wrote:

I have figured out the discrepancy in the average game length. As
playout length I count from the start of the game, which gives me
114-115. I believe you count from the starting position where the
playout starts. Because when I modify my code to do that I also get
111 moves per game on average.

But I believe counting from the start is more meaningful if you're
going to use the game-length as a reference number. Because otherwise
you might get a different number depending on how explorative your
search is. If your search tends to explore deeper lines in favour of
wider lines you get a different number. Or at least you'll have to
include the exploration-factor K in your UCT tree as a criterion.

I think you are misunderstanding how this works. I am defining a very simple non-UCT bot and you are using terminology that implies something
different from what I am specifying.

My timing is based solely on the opening position so how would it come
out any different? We are both counting from the start of the game in
this specific case.   (I'll make sure my spec makes this unambiguous.)

You also talk about how "explorative" your "search" is.   The spec
doesn't talk about a search or exploration,  you just do 1 million
uniformly random playouts from some specified position (the starting

The score should be the total black wins divided by the total number of games played, and this specific test should use 0.5 komi. I will
build a more sophisticated test soon that you simply run your bot
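As a small illustration of the score described here, a Python sketch under the assumption that each play-out has already been counted into black and white points. The function names and the point-counting inputs are hypothetical; only the ratio and the 0.5 komi come from the text.

```python
def black_wins_game(black_points, white_points, komi=0.5):
    """True if Black wins one finished play-out once komi is added to White."""
    return black_points > white_points + komi


def black_win_rate(black_wins, total_games):
    """Score as described: total black wins divided by total games played."""
    if total_games <= 0:
        raise ValueError("total_games must be positive")
    return black_wins / total_games
```

With a komi of 0.5 an equal point count goes to White, so no play-out ends in a draw.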

The whole idea of this (for me) was to create the simplest reasonable go
benchmark that played a "reasonable" game of go (depending on your
definition of reasonable) so that hopefully many people might be willing
to go to the trouble to implement it.   I'm not going for a
sophisticated UCT tree based algorithm, because that would require a
considerably larger amount of effort for people to write, and it would
also be significantly more difficult to define precisely.

Having said that, you could easily take a UCT program and with very
little effort have a "simple" mode that does what we are asking.

The simple bot also serves as a stepping stone for beginners to start
implementing their own more capable bot because they can be confident
that they got the details correct and bug-free (more or less.) So you
could grow a UCT program from this.

- Don


On 22-Oct-08, at 20:57, Don Dailey wrote:

On Wed, 2008-10-22 at 20:29 -0200, Mark Boon wrote:
On Wed, Oct 22, 2008 at 6:07 PM, Don Dailey <[EMAIL PROTECTED]> wrote:

For one thing,  komi is different.   I used 0.5 for running this

I would have used 0.0  but some implementations don't like even

But the komi should have no effect on the playout length. I started
out with 103 moves, but that was because of a mercy-rule. Without it
it's 114. You have 111 rather consistently. And I assume you don't use
a mercy-rule. Nor super-ko checking, is that correct?

The score is what I was looking at, but you are right about the
length because the play-outs don't care what komi is.

Do you use the 3X rule?  Have you checked the other specs?

My guess is that your playouts are not uniformly random - that is very easy to get wrong. Of course that is just a wild guess, it may very
well be something completely different.
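For illustration, one way to pick a legal move uniformly at random, as a Python sketch. A common pitfall is to choose a random starting point and scan forward to the first legal one, which favours points that follow a run of illegal points. The version below instead draws from the whole candidate pool and discards illegal draws. The name `pick_uniform_legal` and the `is_legal` callback are hypothetical, and nothing here is claimed to be what either implementation in this thread actually does.

```python
import random


def pick_uniform_legal(candidates, is_legal, rng=random):
    """Return a legal move chosen uniformly at random, or None if none exists.

    candidates -- iterable of candidate moves (e.g. all empty points)
    is_legal   -- hypothetical callback deciding whether a move may be played
    """
    pool = list(candidates)
    while pool:
        i = rng.randrange(len(pool))
        move = pool[i]
        if is_legal(move):
            return move
        # Discard the illegal candidate in O(1): overwrite it with the
        # last element, then drop the last element.
        pool[i] = pool[-1]
        pool.pop()
    return None  # nothing legal: the caller should pass
```

Every legal candidate has the same chance on each draw, so the result is uniform over the legal moves regardless of where the illegal points sit.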

- Don

Another thing I have never looked at is AMAF. But that also shouldn't
affect playout length I assume.

By the way, thanks for all the pointers to 'git from everyone.' It's new to me and at first glance the specs look good, so I'll definitely
give it a go.


computer-go mailing list