---------- Forwarded message ----------
From: Alinabi <[EMAIL PROTECTED]>
Date: Jul 2, 3:34 pm
Subject: sub-optimal code for packed boolean arrays in Ada -- bug or
inherent limitation
To: comp.lang.ada
Hello, everyone.
I am writing a chess program in Ada, using bitboards for position
representation. Bitboards are just 64 bit integers in which each bit
represents a predicate about a square of the board. For example, you
would have a bitboard called WhitePawns, which has a one for every
square that contains a white pawn, and zero for every other square. In
Ada, there are two possible implementations of a bitboard: using
Unsigned_64, which would closely mirror the way things are done in C,
and using packed arrays of booleans. The C-like version, using
Unsigned_64, leads to some rather obfuscated code, so I decided to
investigate the efficiency of the code that gnat generates for some
basic operations on packed arrays, which I implemented in the
following package:
package Test is
type Index_T is new Integer range 0..63;
type Bitboard_T is private;
procedure Set (B : in out Bitboard_T; I : in Index_T);
procedure Clear (B : in out Bitboard_T; I : in Index_T);
procedure Clear (B : in out Bitboard_T);
procedure Flip (B : in out Bitboard_T; I : in Index_T);
function Is_Set(B : in Bitboard_T; I : in Index_T)
return Boolean ;
private
type Bitboard_T is array(Index_T) of Boolean;
pragma Pack(Bitboard_T);
pragma Inline_Always(Set, Clear, Flip, Is_Set);
end Test;
package body Test is
pragma Optimize(Time);
procedure Set(B : in out Bitboard_T; I : in Index_T) is
begin
B(I) := True;
end;
procedure Clear(B : in out Bitboard_T; I : in Index_T) is
begin
B(I) := False;
end;
procedure Clear(B : in out Bitboard_T) is
begin
B := B xor B;
end;
procedure Flip(B : in out Bitboard_T; I : in Index_T) is
begin
B(i) := not B(i);
end;
function Is_Set(B : in Bitboard_T; I : in Index_T)
return Boolean is
begin
return B(I);
end;
end Test;
When compiled with
gnatmake -g -gnatp -gnatn -march=opteron -O3 -fdump-tree-optimized
test.adb
using the ada compiler in gcc 4.1.2, the code generated is generally
optimal, or close to optimal. For example, the code generated for Set
is
procedure Set(B : in out Bitboard_T; I : in Index_T) is
0: 89 f1 mov %esi,%ecx
begin
B(I) := True;
2: b8 01 00 00 00 mov $0x1,%eax
7: 48 d3 e0 shl %cl,%rax
a: 48 09 f8 or %rdi,%rax
end;
d: c3 retq
The only notable exception is procedure Flip, which becomes
procedure Flip(B : in out Bitboard_T; I : in Index_T) is
30: 89 f1 mov %esi,%ecx
begin
B(i) := not B(i);
32: 48 c7 c0 fe ff ff ff mov $0xfffffffffffffffe,%rax
39: 48 d3 c0 rol %cl,%rax
3c: 48 21 f8 and %rdi,%rax
3f: 48 d3 ef shr %cl,%rdi
42: 83 f7 01 xor $0x1,%edi
45: 83 e7 01 and $0x1,%edi
48: 48 d3 e7 shl %cl,%rdi
4b: 48 09 f8 or %rdi,%rax
end;
4e: c3 retq
instead of the shorter
mov %esi, %ecx
mov 0x1, %rax
shl %cl, %rax
xor % rax, %rdx
retq
I don't know much (if anything) about compiler internals, so I was
wondering if I should file a bug report. Is there some good
theoretical justification for all those extraneous shifts and
rotations? I would think that if the compiler can figure out the best
way to set or clear a bit in a register using shift and logical ops,
than it should also be able to negate it efficiently.