Sorry, it is the other way around.

Total number of zero-extension instructions before : 5814
Total number of zero-extension instructions after  : 1456

Thanks for pointing it out.

On Wed, Sep 23, 2009 at 4:10 PM, Ramana Radhakrishnan <raman...@gmail.com> wrote:
>>
>> GCC bootstrap :
>>
>> Total number of zero-extension instructions before : 1456
>> Total number of zero-extension instructions after : 5814
>> No impact on boot-strap time.
>
> You sure you have these numbers the right way around ? Shouldn't the
> number of zero-extension instructions after the patch be less than the
> number of zero-extension instructions before or is this a regression ?
>
> Thanks,
> Ramana
>
>>
>> I have attached the latest patch :
>>
>> On Sun, Aug 9, 2009 at 2:15 PM, Richard Guenther
>> <richard.guent...@gmail.com> wrote:
>>> On Sat, Aug 8, 2009 at 11:59 PM, Sriraman Tallam<tmsri...@google.com> wrote:
>>>> Hi,
>>>>
>>>> Here is a patch to eliminate redundant zero-extension instructions
>>>> on x86_64.
>>>>
>>>> Tested: Ran the gcc regression testsuite on x86_64-linux and verified
>>>> that the results are the same with/without this patch.
>>>
>>> The patch misses testcases. Why does zee run after register allocation?
>>> Your examples suggest that it will free hard registers so doing it before
>>> regalloc looks odd.
>>>
>>> What is the compile-time impact of your patch on say, gcc bootstrap?
>>> How many percent of instructions are removed as useless zero-extensions
>>> during gcc bootstrap? How much do CSiBE numbers improve?
>>>
>>> Thanks,
>>> Richard.
>>>
>>>>
>>>> Problem Description :
>>>> ---------------------------------
>>>>
>>>> This pass is intended to be applicable only to targets that implicitly
>>>> zero-extend 64-bit registers after writing to their lower 32-bit half.
>>>> For instance, x86_64 zero-extends the upper bits of a register
>>>> implicitly whenever an instruction writes to its lower 32-bit half.
>>>> For example, the instruction *add edi,eax* also zero-extends the upper
>>>> 32-bits of rax after doing the addition. These zero extensions come
>>>> for free and GCC does not always exploit this well. That is, it has
>>>> been observed that there are plenty of cases where GCC explicitly
>>>> zero-extends registers for x86_64 that are actually useless because
>>>> these registers were already implicitly zero-extended in a prior
>>>> instruction. This pass tries to eliminate such useless zero extension
>>>> instructions.
>>>>
>>>> Motivating Example I :
>>>> ----------------------------------
>>>> For this program :
>>>> **********************************************
>>>> bad_code.c
>>>>
>>>> int mask[1000];
>>>>
>>>> int foo(unsigned x)
>>>> {
>>>>   if (x < 10)
>>>>     x = x * 45;
>>>>   else
>>>>     x = x * 78;
>>>>   return mask[x];
>>>> }
>>>> **********************************************
>>>>
>>>> $ gcc -O2 bad_code.c
>>>>   ........
>>>>   400315:  b8 4e 00 00 00          mov    $0x4e,%eax
>>>>   40031a:  0f af f8                imul   %eax,%edi
>>>>   40031d:  89 ff                   mov    %edi,%edi  ---> Useless zero extend.
>>>>   40031f:  8b 04 bd 60 19 40 00    mov    0x401960(,%rdi,4),%eax
>>>>   400326:  c3                      retq
>>>>   ......
>>>>   400330:  ba 2d 00 00 00          mov    $0x2d,%edx
>>>>   400335:  0f af fa                imul   %edx,%edi
>>>>   400338:  89 ff                   mov    %edi,%edi  ---> Useless zero extend.
>>>>   40033a:  8b 04 bd 60 19 40 00    mov    0x401960(,%rdi,4),%eax
>>>>   400341:  c3                      retq
>>>>
>>>> $ gcc -O2 -fzee bad_code.c
>>>>   ......
>>>>   400315:  6b ff 4e                imul   $0x4e,%edi,%edi
>>>>   400318:  8b 04 bd 40 19 40 00    mov    0x401940(,%rdi,4),%eax
>>>>   40031f:  c3                      retq
>>>>   400320:  6b ff 2d                imul   $0x2d,%edi,%edi
>>>>   400323:  8b 04 bd 40 19 40 00    mov    0x401940(,%rdi,4),%eax
>>>>   40032a:  c3                      retq
>>>>
>>>> Thanks,
>>>>
>>>> Sriraman M Tallam.
>>>> Google, Inc.
>>>> tmsri...@google.com
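
The idea in the problem description can be illustrated with a toy model. This is only a sketch, not the code in the patch: the `insn` struct, the `op_kind` values, and `remove_redundant_zext` are hypothetical names, and the model only handles one straight-line block. A real pass working on GCC's RTL would also have to follow definitions across branches, as in the motivating example, where each zero-extension sits in a different arm of the `if`.

```c
#include <assert.h>
#include <stddef.h>

/* Toy instruction model (hypothetical, not GCC's RTL).
   OP_WRITE32: writes the low 32 bits of a register; on x86_64 this
               implicitly clears the upper 32 bits.
   OP_WRITE64: writes all 64 bits; upper half is no longer known zero.
   OP_ZEXT32:  explicit zero-extension of a register onto itself,
               like "mov %edi,%edi". */
enum op_kind { OP_WRITE32, OP_WRITE64, OP_ZEXT32 };

struct insn {
    enum op_kind kind;
    int reg; /* destination register number, 0..NUM_REGS-1 */
};

#define NUM_REGS 16

/* Drop every OP_ZEXT32 whose register is already known to have a zero
   upper half. Compacts the array in place and returns the new length.
   Assumes a single basic block with no incoming state. */
size_t remove_redundant_zext(struct insn *code, size_t n)
{
    int upper_zero[NUM_REGS] = {0}; /* 1 if upper 32 bits known zero */
    size_t out = 0;

    for (size_t i = 0; i < n; i++) {
        struct insn in = code[i];
        assert(in.reg >= 0 && in.reg < NUM_REGS);

        switch (in.kind) {
        case OP_WRITE32:
            upper_zero[in.reg] = 1;
            break;
        case OP_WRITE64:
            upper_zero[in.reg] = 0;
            break;
        case OP_ZEXT32:
            if (upper_zero[in.reg])
                continue;           /* redundant: skip it */
            upper_zero[in.reg] = 1; /* needed: keep it, now known zero */
            break;
        }
        code[out++] = in;
    }
    return out;
}
```

In this model, a sequence like "32-bit write to reg 5, then zero-extend reg 5" loses the zero-extend, which mirrors the `imul %eax,%edi` / `mov %edi,%edi` pair above. A zero-extend that follows a 64-bit write is kept.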