dmd simple loop disassembly - redundant instruction?

Ivan Kazmenko Wed, 25 Dec 2013 04:06:12 -0800

Hello,

I am studying the differences between the x86 code generated by DMD and by C/C++ compilers on Windows (simply put: why exactly, and by what margin, DMD-compiled D code is often slower than the GCC-compiled C/C++ equivalent).


Now, I have this simple D program:

-----
immutable int MAX_N = 1_000_000;
void main () {
    int [MAX_N] a;
    foreach (i; 0..MAX_N)
        a[i] = i;
}
-----

(I know there's iota in std.range, and it turns out to be even slower, but that's a high-level function, and I'm trying to understand the lower-level details now.)

The assembly (dmd -O -release -inline -noboundscheck, then obj2asm) has the following piece corresponding to the loop:


-----
L2C:            mov     -03D0900h[EDX*4][EBP],EDX
                mov     ECX,EDX
                inc     EDX
                cmp     EDX,0F4240h
                jb      L2C
-----

Now, I am not exactly fluent in assembler, but the "mov ECX, EDX" seems unnecessary. The ECX register is explicitly used three times in the whole program, and it looks like this instruction can at least be moved out of the loop, if not removed completely. Is it indeed a bug, or is there some reason for it? And if the former, where do I report it: at http://d.puremagic.com/issues/, as with the front-end?

I didn't try GDC or LDC since I didn't find clear instructions for using them under Win32. If there are any, please kindly point me to them. I found a few explanations for GDC, but had a hard time figuring out which one is the most current.

Note that the C++ version does the same with four instructions instead of five, as the D version would be expected to if we removed the instruction in question. Indeed, it goes like this (code inside the loop):


-----
L3:
        movl    %eax, _a(,%eax,4)
        addl    $1, %eax
        cmpl    $1000000, %eax
        jne     L3
-----
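
For readers who do not follow the link, here is a minimal sketch of what the C++ counterpart might look like. This is an assumption, not the actual file from the listing directory: the global array is inferred from the `_a` symbol in the GCC output above, and the function name `fill_array` is hypothetical.

```cpp
// Assumed reconstruction of the C++ counterpart; the real source is only
// available at the URL given in the post.
const int MAX_N = 1000000;

// A global array, inferred from the `_a` symbol in the GCC listing.
int a[MAX_N];

// Store each index into its own slot, mirroring the D foreach loop.
void fill_array() {
    for (int i = 0; i < MAX_N; i++)
        a[i] = i;
}
```

Calling `fill_array()` from `main` would make this a complete program. Note that a global array lives in static storage, whereas the D program declares its array inside `main`, so the addressing modes in the two listings are not directly comparable.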

The full assembly listings and the source code (D and C++) are here:

http://acm.math.spbu.ru/~gassa/dlang/simple_loop/

I've tried a few other versions as well. Changing the loop to an explicit "for (int i = 0; i < MAX_N; i++)" (a2.d) does not affect the generated assembly. Making the array dynamic (a3.d) leads to five instructions, all seemingly important. A __gshared static array (a4.d) gives the same seemingly unneeded instruction but with EAX instead of ECX:


-----
L2:             mov     _D2a41aG1000000i[EDX*4],EDX
                mov     EAX,EDX
                inc     EDX
                cmp     EDX,0F4240h
                jb      L2
-----

Ivan Kazmenko.
