On Thu, 1 Nov 2007, Christoph Mallon wrote:

Andrey Chernov wrote:
On Tue, Oct 30, 2007 at 10:03:31AM -1000, Juli Mallett wrote:
* "Andrey A. Chernov" <[EMAIL PROTECTED]> [ 2007-10-27 ]
        [ cvs commit: src/include _ctype.h ]
ache        2007-10-27 22:32:28 UTC

  FreeBSD src repository

  Modified files:
    include              _ctype.h

  Log:
  Micro-optimization of prev. commit, change
  (_c < 0 || _c >= 128) to (_c & ~0x7F)
Isn't that a non-optimization in code and a minor pessimization of readability?
...
For those who doubt, here are two tests compiled with -O2. As you can see, the results are almost identical (andl vs cmpl):

We never doubted that it was a small negative or non-optimization :-).

Look closer and you will see that the andl version takes 2 extra
instructions.  Both versions are smart enough to avoid a branch, and for
this they need the condition code generated by the andl or cmpl.  The
cmpl generates the desired condition code directly, while 2 more
instructions are needed after the andl.

-------------------- a.c --------------------
main () {

        int c;

        return (c & ~0x7f) ? 0 : c * 2;
}

This example has many flaws, as pointed out by Christoph:
- c is uninitialized
- the result depends on c in a way that is quite different than the table
  lookup for ctype.  The above expression happens to be more optimizable.

-------------------- a.s --------------------
        .file   "a.c"
        .text
        .p2align 4,,15
.globl main
        .type   main, @function
main:
        leal    4(%esp), %ecx
        andl    $-16, %esp
        pushl   -4(%ecx)
        movl    %eax, %edx             <--- extra instruction since andl
                                            clobbers a register.  Normally,
                                            testl should be used to avoid
                                            this clobber.
        andl    $-128, %edx            <--- this sets %edx to something
                                            and also sets the condition
                                            codes, but not like we want
        addl    %eax, %eax             <--- c * 2
        cmpl    $1, %edx               <--- this sets the condition codes
                                            like we want
        sbbl    %edx, %edx             <--- turn condition codes into a
                                            mask in %edx: mask = 0xffffffff
                                            if the result should be c * 2
                                            and mask = 0 if the result should
                                            be 0
        pushl   %ebp
        andl    %edx, %eax             <--- result = (c * 2) & mask
        movl    %esp, %ebp             <--- why is it bothering to set up
                                            a frame this late?
        pushl   %ecx
        popl    %ecx
        popl    %ebp
        leal    -4(%ecx), %esp
        ret
        .size   main, .-main
        .ident  "GCC: (GNU) 4.2.1 20070719  [FreeBSD]"
-------------------- a1.c --------------------
main () {

        int c;

        return (c < 0 || c >= 128) ? 0 : c * 2;


}
-------------------- a1.s --------------------
        .file   "a1.c"
        .text
        .p2align 4,,15
.globl main
        .type   main, @function
main:
        leal    4(%esp), %ecx
        andl    $-16, %esp
        pushl   -4(%ecx)
        addl    %eax, %eax
        cmpl    $128, %eax             <--- cmpl puts result in condition
                                            codes directly where we want it
        sbbl    %edx, %edx             <--- same masking stuff ...
        andl    %edx, %eax
        pushl   %ebp
        movl    %esp, %ebp
        pushl   %ecx
        popl    %ecx
        popl    %ebp
        leal    -4(%ecx), %esp
        ret
        .size   main, .-main
        .ident  "GCC: (GNU) 4.2.1 20070719  [FreeBSD]"

Your example is invalid. The value of c is undefined in this function and you see random garbage as a result (for example, in the code snippet you see the c * 2 (addl %eax, %eax), and after that comes the cmpl, which uses %eax too). In fact, it would be perfectly legal for the compiler to always return 0, call abort(), or let demons fly out of your nose.

However, the uninitialized c = %eax seems to be transformed correctly in
both cases.  The first case even preserves %eax from the andl.


Also, the example is still unrealistic: You usually don't multiply chars by two. Let's try something more realistic: an ASCII filter

Indeed.

Bruce
_______________________________________________
cvs-all@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/cvs-all
To unsubscribe, send any mail to "[EMAIL PROTECTED]"