>
> GCC bootstrap :
>
> Total number of zero-extension instructions before : 1456
> Total number of zero-extension instructions after  : 5814
> No impact on boot-strap time.

Are you sure you have these numbers the right way around? Shouldn't the
number of zero-extension instructions after the patch be lower than the
number before, or is this a regression?

Thanks,
Ramana

>
>
> I have attached the latest patch :
>
>
> On Sun, Aug 9, 2009 at 2:15 PM, Richard Guenther
> <richard.guent...@gmail.com> wrote:
>> On Sat, Aug 8, 2009 at 11:59 PM, Sriraman Tallam<tmsri...@google.com> wrote:
>>> Hi,
>>>
>>> Here is a patch to eliminate redundant zero-extension instructions
>>> on x86_64.
>>>
>>> Tested: Ran the gcc regression testsuite on x86_64-linux and verified
>>> that the results are the same with/without this patch.
>>
>> The patch misses testcases.  Why does zee run after register allocation?
>> Your examples suggest that it will free hard registers so doing it before
>> regalloc looks odd.
>>
>> What is the compile-time impact of your patch on say, gcc bootstrap?
>> How many percent of instructions are removed as useless zero-extensions
>> during gcc bootstrap?  How much do CSiBE numbers improve?
>>
>> Thanks,
>> Richard.
>>
>>>
>>> Problem Description :
>>> ---------------------------------
>>>
>>> This pass is intended to be applicable only to targets that implicitly
>>> zero-extend 64-bit registers after writing to their lower 32-bit half.
>>> For instance, x86_64 zero-extends the upper bits of a register
>>> implicitly whenever an instruction writes to its lower 32-bit half.
>>> For example, the instruction *add edi,eax* also zero-extends the upper
>>> 32-bits of rax after doing the addition. These zero extensions come
>>> for free and GCC does not always exploit this well. That is, it has
>>> been observed that there are plenty of cases where GCC explicitly
>>> zero-extends registers for x86_64 that are actually useless because
>>> these registers were already implicitly zero-extended in a prior
>>> instruction. This pass tries to eliminate such useless zero extension
>>> instructions.
>>>
>>> Motivating Example I :
>>> ----------------------------------
>>> For this program :
>>> **********************************************
>>> bad_code.c
>>>
>>> int mask[1000];
>>>
>>> int foo(unsigned x)
>>> {
>>>   if (x < 10)
>>>     x = x * 45;
>>>   else
>>>     x = x * 78;
>>>   return mask[x];
>>> }
>>> **********************************************
>>>
>>> $ gcc -O2 bad_code.c
>>>   ........
>>>   400315:       b8 4e 00 00 00          mov    $0x4e,%eax
>>>   40031a:       0f af f8                imul   %eax,%edi
>>>   40031d:       89 ff                   mov    %edi,%edi   ---> Useless zero extend.
>>>   40031f:       8b 04 bd 60 19 40 00    mov    0x401960(,%rdi,4),%eax
>>>   400326:       c3                      retq
>>>   ......
>>>   400330:       ba 2d 00 00 00          mov    $0x2d,%edx
>>>   400335:       0f af fa                imul   %edx,%edi
>>>   400338:       89 ff                   mov    %edi,%edi   ---> Useless zero extend.
>>>   40033a:       8b 04 bd 60 19 40 00    mov    0x401960(,%rdi,4),%eax
>>>   400341:       c3                      retq
>>>
>>> $ gcc -O2 -fzee bad_code.c
>>>   ......
>>>   400315:       6b ff 4e                imul   $0x4e,%edi,%edi
>>>   400318:       8b 04 bd 40 19 40 00    mov    0x401940(,%rdi,4),%eax
>>>   40031f:       c3                      retq
>>>   400320:       6b ff 2d                imul   $0x2d,%edi,%edi
>>>   400323:       8b 04 bd 40 19 40 00    mov    0x401940(,%rdi,4),%eax
>>>   40032a:       c3                      retq
>>>
>>>
>>>
>>> Thanks,
>>>
>>> Sriraman M Tallam.
>>> Google, Inc.
>>> tmsri...@google.com
>>>
>>
>
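
For anyone following the thread, the property the pass relies on can be
modelled in a few lines of C. This is only a sketch of the register
semantics described in the quoted Problem Description, not code from the
patch: the names reg64, write_low32, zero_extend32 and imul32_then_zext
are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a 64-bit register; not taken from the patch. */
typedef uint64_t reg64;

/* Model of an x86_64 instruction that writes the lower 32-bit half of a
   register: the upper 32 bits of the destination become zero. */
static reg64 write_low32(reg64 old_value, uint32_t result)
{
    (void)old_value;            /* the previous upper bits are discarded */
    return (reg64)result;
}

/* Model of an explicit zero extension such as "mov %edi,%edi". */
static reg64 zero_extend32(reg64 r)
{
    return (reg64)(uint32_t)r;
}

/* Mirrors the imul/mov pair from the motivating example: performs the
   32-bit multiply, then the explicit zero extension, and asserts that
   the second step did not change the register. */
static reg64 imul32_then_zext(reg64 rdi, uint32_t factor)
{
    reg64 after_imul = write_low32(rdi, (uint32_t)rdi * factor);
    reg64 after_zext = zero_extend32(after_imul);
    assert(after_zext == after_imul);   /* the extension was a no-op */
    return after_zext;
}
```

Under this model, calling imul32_then_zext with garbage in the upper half
of rdi still yields a value whose upper 32 bits are zero before the
explicit extension runs, which is why the "mov %edi,%edi" in the -O2
output can be dropped.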