Andrey Chernov wrote:
On Mon, Oct 29, 2007 at 09:48:16PM +0100, Christoph Mallon wrote:
Andrey A. Chernov wrote:
ache        2007-10-27 22:32:28 UTC
  FreeBSD src repository
  Modified files:
    include              _ctype.h
  Log:
  Micro-optimization of prev. commit, change
  (_c < 0 || _c >= 128) to (_c & ~0x7F)
  Revision  Changes    Path
  1.33      +1 -1      src/include/_ctype.h
Actually this is rather a micro-pessimisation. Every compiler worth its money transforms the range check into a single unsigned comparison. The latter test, on the other hand, probably gets transformed into a test instruction on x86. This instruction has no form with a sign-extended 8-bit immediate, only one with a 32-bit immediate. This results in a significantly longer opcode (three bytes more) than a single (unsigned)_c > 127, which a sane compiler produces. I suspect some RISC machines need one more instruction for the "micro-optimised" code, too. In theory GCC could transform the _c & ~0x7F back into (unsigned)_c > 127, but it does not do this (the only compiler I found that does this transformation is LLVM).
Further, IMO it is hard to decipher what _c & ~0x7F is supposed to do.
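
For readers following along, the three forms under discussion can be written side by side. This is only a sketch for illustration: the function names are hypothetical and do not appear in _ctype.h, and it assumes a two's complement int. It shows that the forms give the same answer, not which one compiles to shorter code.

```c
/* Hypothetical helper names, not taken from src/include/_ctype.h. */

/* Range check as it stood before the commit. */
static int non_ascii_range(int c)
{
    return c < 0 || c >= 128;
}

/* Mask form introduced by the commit: any bit set above the low 7. */
static int non_ascii_mask(int c)
{
    return (c & ~0x7F) != 0;
}

/* Single unsigned comparison: negative values wrap to large ones. */
static int non_ascii_unsigned(int c)
{
    return (unsigned)c > 127;
}

/* Returns 1 if all three forms agree for every value in [lo, hi]. */
static int forms_agree(int lo, int hi)
{
    /* long long counter avoids overflow if hi is INT_MAX. */
    for (long long v = lo; v <= hi; v++) {
        int c = (int)v;
        int r = non_ascii_range(c);
        if (r != non_ascii_mask(c) || r != non_ascii_unsigned(c))
            return 0;
    }
    return 1;
}
```

Calling forms_agree(-70000, 70000) is one way to check the equivalence over a range around the ASCII boundary; comparing the generated code would still require looking at the compiler output itself.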

1. My variant is independent of the compiler optimization level. E.g., with optimization turned off completely, the range check transform you talk about does not happen at all, and very long asm code is generated. I also mean the cases where optimization was removed to avoid a gcc optimization bug (as when compiling a large part of the Xorg server recently), the cases of non-gcc compilers, etc.

Compiling without any optimisations makes the code slow for a zillion other reasons (no load/store optimisations, constant folding, common subexpression elimination, if-conversion, partial redundant expression elimination, strength reduction, reassociation, code placement, and many more), so an untransformed range check is really not of any concern.

2. _c & ~0x7F comes right from is{w}ascii(), so it is not such an enormously
big problem to decipher. I just want to keep all of ctype in one style.

Repeating cryptic code does not make it better, IMO.

3. I do not see the "longer opcode (three bytes more)" you talk about in my tests (they showed andl vs. cmpl, no testl).

See the reply to the mail with your code example.

        Christoph
_______________________________________________
cvs-all@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/cvs-all