On Mon, Feb 16, 2009 at 11:19 AM, Narasimha Datta <datt...@yahoo.com> wrote:
> Hello,
>
> Here's a simple memory copy macro:
>
> #define MYMEMCOPY(dp, sp, len) \
> do { \
>        long __len = len; \
>        while (--__len >= 0) \
>                (dp)[__len] = (sp)[__len]; \
> } while (0)
>
> void foo(unsigned char *dp, const unsigned char *sp, unsigned long size) {
>        MYMEMCOPY(dp, sp, size);
> }
>
> void bar(unsigned char *dp, const unsigned char *sp) {
>        MYMEMCOPY(dp, sp, 128);
> }
>
> The code fragments generated for the foo and bar functions with the -O and -O2
> optimizations, respectively, are as follows:
>
> /* ===== With -O switch ===== */
> /* function foo */
> .L4:
>        movzbl  -1(%rcx), %eax
>        movb    %al, -1(%rdx)
>        subq    $1, %rcx
>        subq    $1, %rdx
>        subq    $1, %r8
>        jns     .L4
>
> /* function bar */
>        movl    $126, %edx
> .L8:
> .LBB3:
>        .loc 1 13 0
>        movzbl  1(%rdx,%rsi), %eax
>        movb    %al, 1(%rdx,%rdi)
>        subq    $1, %rdx
>        cmpq    $-2, %rdx
>        jne     .L8
>
> /* ===== With -O2 switch =====*/
> /* function foo */
> .L4:
>        movzbl  -1(%rsi), %eax
>        addq    $1, %rdi
>        subq    $1, %rsi
>        movb    %al, -1(%rcx)
>        subq    $1, %rcx
>        cmpq    %rdx, %rdi
>        jne     .L4
>
> /* function bar */
>        movl    $126, %edx
> .L9:
> .LBB3:
>        .loc 1 13 0
>        movzbl  1(%rdx,%rsi), %eax
>        movb    %al, 1(%rdx,%rdi)
>        subq    $1, %rdx
>        cmpq    $-2, %rdx
>        jne     .L9
>
> Now my questions are:
> (i) Why does the compiler generate an addq, cmpq and jne for the foo function 
> with -O2? Isn't subq/jns more efficient, as seen from the output from -O?
> (ii) For function bar, why is the "cmpq $-2, %rdx" instruction generated? 
> Won't it be better to count down from 128 to 0 instead of 126 to -2?
>
> Here's my OS and compiler version (I'm running a 64-bit FreeBSD):
> $ uname -a
> FreeBSD xxx 7.0-RELEASE FreeBSD 7.0-RELEASE #0: Wed Nov 12 18:54:21 PST 2008  
>    r...@wc7:/usr/obj/usr/src/sys/SMKERNEL  amd64
> $ cc --version
> cc (GCC) 4.2.1 20070719  [FreeBSD]
> Copyright (C) 2007 Free Software Foundation, Inc.
> This is free software; see the source for copying conditions.  There is NO
> warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
>
> And these are the commands I used to compile the program:
> cc -S -O -g test.c
> cc -S -O2 -g test.c
>
> Any pointers would be appreciated. Thanks!

1) Try a more recent GCC
2) Use memcpy.  It is properly inlined/optimized.
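
A minimal sketch of suggestion 2, assuming the source and destination
buffers never overlap (memcpy requires that; memmove is the standard
alternative when they might). The function names foo and bar mirror the
ones in the question. This is only an illustration, not a claim about what
any particular GCC version emits for it:

```c
#include <string.h>

/* Variable-length copy: same signature as foo in the question,
 * with the hand-written loop replaced by the library call. */
void foo(unsigned char *dp, const unsigned char *sp, unsigned long size)
{
    /* Assumption: dp and sp do not overlap. */
    memcpy(dp, sp, size);
}

/* Fixed-length copy: the length is a compile-time constant (128),
 * as in bar in the question. */
void bar(unsigned char *dp, const unsigned char *sp)
{
    /* Assumption: both buffers hold at least 128 bytes. */
    memcpy(dp, sp, 128);
}
```

With a constant length, as in bar, the compiler is free to expand the copy
inline rather than call the library routine. Comparing the output of
cc -S -O2 for both versions is an easy way to see what your compiler does.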

Richard.
