How much would you lose for 19x19 board?  A board representation is not very
interesting unless it scales to 19 line boards.

 

David

 

From: computer-go-boun...@computer-go.org
[mailto:computer-go-boun...@computer-go.org] On Behalf Of René van de
Veerdonk
Sent: Sunday, August 23, 2009 1:11 PM
To: computer-go@computer-go.org
Subject: [computer-go] Bitmap Go revisited and mockup implementation

 

Hi all,

 

After years of reading this very interesting list, I decided to make my
first contribution this week after reading once again about bitmap go. There
is no freely available implementation of a bitmap based engine that I know
of, so here is a start. Note that I am not a professional programmer,
self-taught, and this is a part-time hobby, so do not expect perfect form.

 

The attachment is an attempt to create an efficient implementation of a
bitmap based light playout engine, following mostly the recommendations as
spelled out by Antoine de Maricourt in his November 2006 message to this
list. The application is single threaded and achieves around 75-80 kpps on
my Centrino Duo 2.4 GHz Intel labtop. The mockup was developed in C/C++
using the Microsoft Visual Studio 2008 IDE for Windows XP (32-bit) and
undoubtedly suffers from portability issues as a result. The 128-bit SSE
instructions are called through intrinsic functions, hopefully at least
these interfaces are the same for other compilers. The single goal of this
mockup was to do some benchmarking, so there is no gtp-interface, but that
would be rather easy to add for those skilled in the art. The standard rules
are used with one exception, the Brown eye-rule is used. Multi-stone suicide
is explicitly forbidden. No superko checks are performed during playouts,
and there is no mercy-rule either.

 

I would be particularly interested to know if a 64-bit OS will improve the
performance significantly, as it does have a lot less register pressure.
Moving to a 64-bit OS will also make some instructions available that allow
for further simplifications/efficiency improvements in the code. As will
enabling SSE3 or SSE4.1 instructions.

 

One note, the random number generator used is from Agner Fog's public
library and was not included in the attachment. You will have to provide a
random number generator yourself or download the appropriate version from
his well-known website (www.agner.org/optimize)

 

I hope this is the start of a discussion about further improvements that can
be made, but you may use this code as you wish with no restrictions or
guarantees. See the readme.txt file included in the attachment for more
details (also partially included below).

 

Comments and feedback welcome,

 

René van de Veerdonk

 

Excerpts from the readme.txt file:

 

Introduction:

=============

 

Mockup of a high performance implementation of the game of Go using bitmaps

as the principle datastructures. This implementation is limited to measuring

the performance of the principle components by playing random playouts from

an empty 9x9 board, using Chinese scoring, not allowing pass moves until

no other moves are available. Suicide is not allowed, neither for single

or multi-stone groups. No checks are implemented for superko, but simple

ko's are disallowed explicitly.

 

Some implementation details:

============================

 

Benchmarking is done in test_board.cpp, where the number of playouts (n) and

repeats (j) for each test is set. Benchmarking is done starting from an
empty

9x9 board until two (three with ko) passes are played in sequence. The

benchmark playout follow the same scheme as method
board_t::play_random_game,

but with timing instrumentation added.

 

Clock counts (cc) are measured using the cc_time_of macro. Total run-time

measurement relies on the windows library.

 

For each move during a playout, the program goes through a sequence of

(1) identify all legal moves (~50cc)

    this follows Brown's standard definition; this definition differs from

    most Monte-Carlo engines and has a different set of pros and cons;

    bitmaps allow all moves to be processed in parallel

(2) pick a uniformly random move from the available choices (~75cc)

    relies on a modulus operation; is not 100% uniform to save speed

(3) playing it on the board (~170cc)

    lengthiest method; but rather straightforward

 

Playouts end after 256 moves or whenever two passes are played in succession

(three in case of a ko). The routines for each of these basic steps are

contained in the file board.cpp. The underlying datastructures are 128-bit

bitmaps, for which the sse2 instructions used to manipulate them are
contained

in the file bitmap.cpp.

 

Further improvements can be expected from using any 64-bit OS that supports

several more assembly instructions, and from using sse4.1 instructions.

 

Example output (on a Intel 2.40 GHz Centrino Duo labtop):

=========================================================

 

[game] = 30469.7, [moves] = 111.071

[game] = 30245.4, [moves] = 111.068

[game] = 30221.7, [moves] = 111.089

[game] = 30264.8, [moves] = 111.122

[game] = 30205.8, [moves] = 111.101

 

[game] = 77.1084 kpps, [moves] = 111.023

[game] = 78.0488 kpps, [moves] = 111.045

[game] = 77.1084 kpps, [moves] = 111.046

[game] = 78.0488 kpps, [moves] = 111.131

[game] = 78.0488 kpps, [moves] = 111.082

 

[legal] 110/51, [pick] 110/74, [play] 106/169, [score] 40, [win] 0.4187

[legal] 111/51, [pick] 111/74, [play] 106/168, [score] 42, [win] 0.4201

[legal] 111/52, [pick] 111/74, [play] 106/169, [score] 43, [win] 0.4276

[legal] 111/52, [pick] 111/75, [play] 106/170, [score] 41, [win] 0.4092

[legal] 111/51, [pick] 111/74, [play] 106/171, [score] 45, [win] 0.4221

 

finished, press any key to exit

 

Explanation of output:

======================

 

There are three sets of repeated (5x) measurements, performed and timed

independently to avoid cross-contamination of inserted serialization

instructions and timing overhead (which still does not quite prevent

reordering of instructions across timing boundaries).

 

[game]  cc's per full playout

[moves] average number of moves per playout

 

[game]  thousands of playouts per second (kpps)

[moves] average number of moves per playout

 

"calls per playout" / "cc's per call"

[legal] identifies all legal moves on the board

[pick]  picks a random move

[play]  processes a move

[score] scores a final position

[win]   winrate for black

 

Acknowledgements:

=================

 

Antoine de Maricourt, for his very helpful description of his bitmap go

  implementation in detail on the computer-go mailing list:

  http://computer-go.org/pipermail/computer-go/2006-November/007136.html

Gunnar Farneback, for providing the well documented Brown engine,

  whoms legal move definition is used in this mockup

Lukasz Lew, for the high performance Libego implementation, which has
inspired

  various parts of this mockup as well

Agner Fog, for the Mersenne Twister and clock-cycle timing library

Computer-go mailing list, where many ideas are posted and discussed

 

 

_______________________________________________
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

Reply via email to