On 3 October 2017 at 14:10, Jeff Law <l...@redhat.com> wrote: > On 10/03/2017 03:00 PM, Marc Glisse wrote: >> On Tue, 3 Oct 2017, Jakub Jelinek wrote: >> >>> On Tue, Oct 03, 2017 at 12:54:39PM -0700, Prathamesh Kulkarni wrote: >>>> Hi, >>>> This follow-up patch implements the patterns mentioned in $subject. >>>> Bootstrap+test in progress on x86_64-unknown-linux-gnu and >>>> aarch64-linux-gnu. >>>> OK to commit if passes ? >>>> >>>> Thanks, >>>> Prathamesh >>> >>>> 2017-10-03 Prathamesh Kulkarni <prathamesh.kulka...@linaro.org> >>>> >>>> * match.pd ((X >> CST) == 0 -> X < (1 << CST)): New pattern. >>>> ((X >> CST) != 0 -> X >= (1 << CST)): Likewise. >> >> build_int_cstu doesn't work for vectors, you want build_one_cst. I never >> know if we should check single_use or not :-( Thanks for the pointers, I wasn't aware about build_one_cst. >> >>>> testsuite/ >>>> * gcc.dg/tree-ssa/cmpdiv.c: Add test-cases f3 and f4. >>> >>> Why this way and not the other way around? >> >> For high level gimple optimizations, X < CST is more convenient (and >> smaller, just one insn) than (X >> CST) == 0. > Right. One could easily argue that Prathamesh's form should be the > preferred form because it is simpler at the gimple level -- and that > x86-isms should be dealt with later in the pipeline. > > >> >>> E.g. i?86/x86_64 and various other targets have shift instructions which >>> set condition codes, so (X >> 51) == 0 is certainly smaller >>> smaller and I believe cheaper than the latter. >>> Try: >>> void foo (void); >>> >>> void >>> bar (unsigned long long x) >>> { >>> if ((x >> 51) == 0) >>> foo (); >>> } >>> >>> void >>> baz (unsigned long long x) >>> { >>> if (x < (1LL << 51)) >>> foo (); >>> } >>> with -Os on x86_64, the first function is 4 insns, 12 bytes, >>> the second one 5 insns, 21 bytes. Indeed, I can reproduce this behavior. Thanks for pointing it out. >>> >>> I wonder if this kind of instruction selection stuff shouldn't be >>> done in target.pd instead, with input from the target. > Right, but I think that argues that Prathamesh's patch is the right > direction and that to move forward what needs to happen is something > needs to be fixed at the gimple/rtl border to ensure we get good x86 code. > >> >> At a late stage, maybe during an RTL pass or expansion (or just before >> expansion) it would indeed be good to generate a shift for such >> comparisons, on targets where that sets a cc. The lack of this >> transformation could be considered a blocker for the other one, to avoid >> regressing on bar. > Right. In fact, I think Jakub's test ought to be added to this work as > part of its basic testing. Um, how to write the above-test case for bar() in DejaGNU format ? Is it possible to test for number of insns or to use nm to test for size of a function with dejagnu directive ? Or should I scan for "cmpq" since the pattern transforms a right shift and cmp against 0 to cmp between operands ?
Thanks, Prathamesh > > jeff >