Andrey Chernov wrote:
On Tue, Oct 30, 2007 at 10:03:31AM -1000, Juli Mallett wrote:
* "Andrey A. Chernov" <[EMAIL PROTECTED]> [ 2007-10-27 ]
        [ cvs commit: src/include _ctype.h ]
ache        2007-10-27 22:32:28 UTC

  FreeBSD src repository

  Modified files:
    include              _ctype.h
  Log:
  Micro-optimization of prev. commit, change
  (_c < 0 || _c >= 128) to (_c & ~0x7F)
Isn't that a non-optimization in code and a minor pessimization of readability?
Maybe I'm getting rusty, but those seem to result in nearly identical code on
i386 with a relatively modern GCC.  Did you look at the compiler output for this
optimization?  Is there a specific expensive instruction you're trying to avoid?
For such thoroughly bit-aligned range checks, you shouldn't even get a branch
for the former case.  Is there a platform other than i386 I should look at where
the previous expression is more clearly pessimized?  Or a different compiler
than GCC?

For those who doubt, here are two tests compiled with -O2. As you can see, the results are almost identical (andl vs. cmpl):
-------------------- a.c --------------------
main () {

        int c;

        return (c & ~0x7f) ? 0 : c * 2;
}
-------------------- a.s --------------------
        .file   "a.c"
        .text
        .p2align 4,,15
.globl main
        .type   main, @function
main:
        leal    4(%esp), %ecx
        andl    $-16, %esp
        pushl   -4(%ecx)
        movl    %eax, %edx
        andl    $-128, %edx
        addl    %eax, %eax
        cmpl    $1, %edx
        sbbl    %edx, %edx
        pushl   %ebp
        andl    %edx, %eax
        movl    %esp, %ebp
        pushl   %ecx
        popl    %ecx
        popl    %ebp
        leal    -4(%ecx), %esp
        ret
        .size   main, .-main
        .ident  "GCC: (GNU) 4.2.1 20070719  [FreeBSD]"
-------------------- a1.c --------------------
main () {

        int c;

        return (c < 0 || c >= 128) ? 0 : c * 2;


}
-------------------- a1.s --------------------
        .file   "a1.c"
        .text
        .p2align 4,,15
.globl main
        .type   main, @function
main:
        leal    4(%esp), %ecx
        andl    $-16, %esp
        pushl   -4(%ecx)
        addl    %eax, %eax
        cmpl    $128, %eax
        sbbl    %edx, %edx
        andl    %edx, %eax
        pushl   %ebp
        movl    %esp, %ebp
        pushl   %ecx
        popl    %ecx
        popl    %ebp
        leal    -4(%ecx), %esp
        ret
        .size   main, .-main
        .ident  "GCC: (GNU) 4.2.1 20070719  [FreeBSD]"

Your example is invalid. c is never initialized in this function, so its value is undefined and what you see is random garbage (for example, in a1.s the c * 2 (addl %eax, %eax) comes before the cmpl, which then compares the already doubled %eax). In fact it would be perfectly legal for the compiler to always return 0, call abort(), or let demons fly out of your nose.

Also, the example is still unrealistic: you usually don't multiply chars by two. Let's try something more realistic, an ASCII filter:

int filter_ascii0(int c)
{
        return c < 0 || c >= 128 ? '?' : c;
}

int filter_ascii1(int c)
{
        return c & ~0x7F ? '?' : c;
}

Note especially that c is not dead after the condition here. Even if your example had not used an undefined value, the value of c would be dead after the test, which is unlikely in typical string handling code.
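
Before comparing the generated code, it may be worth confirming that the two filters really return the same values. The following is only a sketch: check_filters_agree is a hypothetical helper that is not part of the original message, the tested range is an arbitrary choice, and the bit-mask form relies on int being two's complement (true on i386).

```c
#include <assert.h>
#include <limits.h>

int filter_ascii0(int c)
{
        return c < 0 || c >= 128 ? '?' : c;
}

int filter_ascii1(int c)
{
        return c & ~0x7F ? '?' : c;
}

/*
 * Hypothetical helper: assert that both filters agree on a range
 * around the ASCII boundaries and on the extreme int values.
 * Assumes two's complement int, so negative values have bits set
 * outside the low seven.
 */
void check_filters_agree(void)
{
        int c;

        for (c = -1024; c <= 1024; c++)
                assert(filter_ascii0(c) == filter_ascii1(c));

        assert(filter_ascii0(INT_MIN) == filter_ascii1(INT_MIN));
        assert(filter_ascii0(INT_MAX) == filter_ascii1(INT_MAX));
}
```

Calling check_filters_agree() from a small test program aborts on the first input where the two forms differ; on a two's complement machine it should return silently.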

And now the compiled code (GCC 3.4.6 with -O2 -march=athlon-xp -fomit-frame-pointer; I used these switches to get more compact code, and they have no influence on the condition test):

00000000 <filter_ascii0>:
   0:   8b 54 24 04             mov    0x4(%esp),%edx
   4:   b8 3f 00 00 00          mov    $0x3f,%eax
   9:   83 fa 7f                cmp    $0x7f,%edx
   c:   0f 46 c2                cmovbe %edx,%eax
   f:   c3                      ret

00000010 <filter_ascii1>:
  10:   8b 54 24 04             mov    0x4(%esp),%edx
  14:   b8 3f 00 00 00          mov    $0x3f,%eax
  19:   f7 c2 80 ff ff ff       test   $0xffffff80,%edx
  1f:   0f 44 c2                cmove  %edx,%eax
  22:   c3                      ret

As you can see, filter_ascii1 uses a test instruction, because the value in %edx does not die at the test but is used again in the cmove.
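
The cmp/cmovbe pair in filter_ascii0 is an unsigned comparison (cmovbe means "below or equal"), which suggests that the compiler folded the two signed tests into a single unsigned one. A minimal sketch of that folding written out in C follows; filter_ascii_unsigned and check_unsigned_form are hypothetical names used only for illustration, and this is not claimed to be what GCC does internally.

```c
#include <assert.h>

/*
 * Hypothetical rewrite of filter_ascii0 with one comparison.
 * Converting a negative int to unsigned is well defined in C and
 * yields a value above INT_MAX, so the single "<= 0x7f" test
 * rejects both c < 0 and c >= 128.
 */
int filter_ascii_unsigned(int c)
{
        return (unsigned)c <= 0x7fu ? c : '?';
}

/* Hypothetical helper: compare against the two-test form. */
void check_unsigned_form(void)
{
        int c;

        for (c = -1024; c <= 1024; c++) {
                int expected = (c < 0 || c >= 128) ? '?' : c;

                assert(filter_ascii_unsigned(c) == expected);
        }
}
```

Written this way, the range check needs no branch and no second comparison, which matches the point made earlier in the thread that such bit-aligned range checks should not produce a branch.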

        Christoph
_______________________________________________
cvs-all@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/cvs-all
To unsubscribe, send any mail to "[EMAIL PROTECTED]"
