It would be interesting to see how your implementation performs on a
Core2 architecture. So how many 9x9 games does it currently play per
second? What about 19x19 games?
I don't know how many games it plays. As I said, I'm not really
interested in MC GO now. I played a few days with this code and did not
try to improve it. I had 450 cc/per move on average, for 110 random
moves, including random move generation. The eye code was not
integrated. I think this can be done in less than 100 additional cc per
move. Let's say that 50 cc can be gained by being more clever for the
random generation and optimizing a little bit the whole. Add some time
for initializing the board + scoring (let's say 1000cc as a whole). This
yields to 56000 cc (= 500 * 110 + 1000) per game. You are around 27k
games/s for a 1.5 GHz processor, or 43k games/s for a 2.4 GHZ processor.
On the core2, let's say you can take off 100 cc per move (possibly more,
but I'd like to have one to see how it behaves). This would lead you
around 55k games/s for 2.4 GHz processor.
Take these numbers with extreme care.
I'm not surprised by Ćukasz's numbers (about 30k games/s) on a Pentium M. First, he knows about Agner Fog's optimization guide: the only one you want to know if you try to optimize code, or you simply want to understand how modern processors work ;-) . Then because my general purpose board (not targeted for MC Go) is faster than the pure bitmap implementation.
For the random generator you use MT19937. Why did you choose this
generator and is it faster than C's rand() function?
I guess I'm like Don: I trust it, and it's fast enough. And I don't
think this is the bottleneck for this application. Better spend time on
random move (not number...) generation or efficient board representation.
Also I wonder what is the purpose for keeping bitmap of empty points,
since that is just ~(black bitboard | white bitboard). Same goes for
bitmap of points with a stone, which is just (black bitboard | white
bitboard) = ~bitmap of empty points.
The primary purpose was because of the random generator scheme I chose
(for step 1): get a random byte, perform a bit test, and continue until
you find an empty bit. This is a very simple scheme that could probably
be enhanced. It has the advantage that it is correct wrt uniform
distribution (among empty points) and that you can add/remove eyes or
captured stones very easily. Of course you can always compute the empty
bitmap from b/w bitmaps, but sometimes it is more suited (from the
processor point of view) to have redundant information coded into
different forms (even if it requires extra computation to keep this
information up to date).
Antoine
_______________________________________________
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/