David,
Confession: I have not tested 19x19. As you have noted, and others
before you over the years, a 19x19 board does not fit in one but three
128-bit registers and there would be a rather big penalty as a result,
perhaps (likely?) wiping out all of the benefits of bitmaps. Antoine
voiced his disappointment about the speed advantage even on 9x9 in the
e-mail to the list that served as my basis. My hope, as Hideko Kato
points out, is in longer registers or perhaps being able to port this to
a GPU. A 64-bit OS with twice the number registers would also surely
help out. Nevertheless, I was surprised that I was able to get to almost
the same level of speed as Libego provides.
The goals for this mockup were very humble: (1) to provide a basic
implementation to see what it can do (and if I could do it), and (2)
learn how to work with assembly and/or intrinsic functions (I had never
done that before). It may not be too hard to try 19x19 and just see what
happens. I may attempt this as a follow-up.
René van de Veerdonk
2009/8/23 David Fotland <fotl...@smart-games.com
<mailto:fotl...@smart-games.com>>
How much would you lose for 19x19 board? A board representation is
not very interesting unless it scales to 19 line boards.
David
*From:* computer-go-boun...@computer-go.org
<mailto:computer-go-boun...@computer-go.org>
[mailto:computer-go-boun...@computer-go.org
<mailto:computer-go-boun...@computer-go.org>] *On Behalf Of *René
van de Veerdonk
*Sent:* Sunday, August 23, 2009 1:11 PM
*To:* computer-go@computer-go.org <mailto:computer-go@computer-go.org>
*Subject:* [computer-go] Bitmap Go revisited and mockup implementation
Hi all,
After years of reading this very interesting list, I decided to make
my first contribution this week after reading once again about
bitmap go. There is no freely available implementation of a bitmap
based engine that I know of, so here is a start. Note that I am not
a professional programmer, self-taught, and this is a part-time
hobby, so do not expect perfect form.
The attachment is an attempt to create an efficient implementation
of a bitmap based light playout engine, following mostly the
recommendations as spelled out by Antoine de Maricourt in his
November 2006 message to this list. The application is single
threaded and achieves around 75-80 kpps on my Centrino Duo 2.4 GHz
Intel labtop. The mockup was developed in C/C++ using the Microsoft
Visual Studio 2008 IDE for Windows XP (32-bit) and undoubtedly
suffers from portability issues as a result. The 128-bit SSE
instructions are called through intrinsic functions, hopefully at
least these interfaces are the same for other compilers. The single
goal of this mockup was to do some benchmarking, so there is no
gtp-interface, but that would be rather easy to add for those
skilled in the art. The standard rules are used with one exception,
the Brown eye-rule is used. Multi-stone suicide is explicitly
forbidden. No superko checks are performed during playouts, and
there is no mercy-rule either.
I would be particularly interested to know if a 64-bit OS will
improve the performance significantly, as it does have a lot less
register pressure. Moving to a 64-bit OS will also make some
instructions available that allow for further
simplifications/efficiency improvements in the code. As will
enabling SSE3 or SSE4.1 instructions.
One note, the random number generator used is from Agner Fog's
public library and was not included in the attachment. You will have
to provide a random number generator yourself or download the
appropriate version from his well-known website
(www.agner.org/optimize <http://www.agner.org/optimize>)
I hope this is the start of a discussion about further improvements
that can be made, but you may use this code as you wish with no
restrictions or guarantees. See the readme.txt file included in the
attachment for more details (also partially included below).
Comments and feedback welcome,
René van de Veerdonk
Excerpts from the readme.txt file:
Introduction:
=============
Mockup of a high performance implementation of the game of Go using
bitmaps
as the principle datastructures. This implementation is limited to
measuring
the performance of the principle components by playing random
playouts from
an empty 9x9 board, using Chinese scoring, not allowing pass moves until
no other moves are available. Suicide is not allowed, neither for single
or multi-stone groups. No checks are implemented for superko, but simple
ko's are disallowed explicitly.
Some implementation details:
============================
Benchmarking is done in test_board.cpp, where the number of playouts
(n) and
repeats (j) for each test is set. Benchmarking is done starting from
an empty
9x9 board until two (three with ko) passes are played in sequence. The
benchmark playout follow the same scheme as method
board_t::play_random_game,
but with timing instrumentation added.
Clock counts (cc) are measured using the cc_time_of macro. Total
run-time
measurement relies on the windows library.
For each move during a playout, the program goes through a sequence of
(1) identify all legal moves (~50cc)
this follows Brown's standard definition; this definition
differs from
most Monte-Carlo engines and has a different set of pros and cons;
bitmaps allow all moves to be processed in parallel
(2) pick a uniformly random move from the available choices (~75cc)
relies on a modulus operation; is not 100% uniform to save speed
(3) playing it on the board (~170cc)
lengthiest method; but rather straightforward
Playouts end after 256 moves or whenever two passes are played in
succession
(three in case of a ko). The routines for each of these basic steps are
contained in the file board.cpp. The underlying datastructures are
128-bit
bitmaps, for which the sse2 instructions used to manipulate them are
contained
in the file bitmap.cpp.
Further improvements can be expected from using any 64-bit OS that
supports
several more assembly instructions, and from using sse4.1 instructions.
Example output (on a Intel 2.40 GHz Centrino Duo labtop):
=========================================================
[game] = 30469.7, [moves] = 111.071
[game] = 30245.4, [moves] = 111.068
[game] = 30221.7, [moves] = 111.089
[game] = 30264.8, [moves] = 111.122
[game] = 30205.8, [moves] = 111.101
[game] = 77.1084 kpps, [moves] = 111.023
[game] = 78.0488 kpps, [moves] = 111.045
[game] = 77.1084 kpps, [moves] = 111.046
[game] = 78.0488 kpps, [moves] = 111.131
[game] = 78.0488 kpps, [moves] = 111.082
[legal] 110/51, [pick] 110/74, [play] 106/169, [score] 40, [win] 0.4187
[legal] 111/51, [pick] 111/74, [play] 106/168, [score] 42, [win] 0.4201
[legal] 111/52, [pick] 111/74, [play] 106/169, [score] 43, [win] 0.4276
[legal] 111/52, [pick] 111/75, [play] 106/170, [score] 41, [win] 0.4092
[legal] 111/51, [pick] 111/74, [play] 106/171, [score] 45, [win] 0.4221
finished, press any key to exit
Explanation of output:
======================
There are three sets of repeated (5x) measurements, performed and timed
independently to avoid cross-contamination of inserted serialization
instructions and timing overhead (which still does not quite prevent
reordering of instructions across timing boundaries).
[game] cc's per full playout
[moves] average number of moves per playout
[game] thousands of playouts per second (kpps)
[moves] average number of moves per playout
"calls per playout" / "cc's per call"
[legal] identifies all legal moves on the board
[pick] picks a random move
[play] processes a move
[score] scores a final position
[win] winrate for black
Acknowledgements:
=================
Antoine de Maricourt, for his very helpful description of his bitmap go
implementation in detail on the computer-go mailing list:
http://computer-go.org/pipermail/computer-go/2006-November/007136.html
Gunnar Farneback, for providing the well documented Brown engine,
whoms legal move definition is used in this mockup
Lukasz Lew, for the high performance Libego implementation, which
has inspired
various parts of this mockup as well
Agner Fog, for the Mersenne Twister and clock-cycle timing library
Computer-go mailing list, where many ideas are posted and discussed
_______________________________________________
computer-go mailing list
computer-go@computer-go.org <mailto:computer-go@computer-go.org>
http://www.computer-go.org/mailman/listinfo/computer-go/
------------------------------------------------------------------------
_______________________________________________
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/