On 6/5/19 3:04 PM, Richard Biener wrote:
> On Wed, Jun 5, 2019 at 2:09 PM Martin Liška <mli...@suse.cz> wrote:
>>
>> On 6/5/19 1:13 PM, Richard Biener wrote:
>>> On Wed, Jun 5, 2019 at 12:56 PM Martin Liška <mli...@suse.cz> wrote:
>>>>
>>>> Hi.
>>>>
>>>> I'm suggesting one multiplication simplification pattern.
>>>>
>>>> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
>>>>
>>>> Ready to be installed?
>>>
>>> + (if (INTEGRAL_TYPE_P (type)
>>> + && wi::eq_p (get_nonzero_bits (@1), wi::one (TYPE_PRECISION (type)))
>>> + && wi::eq_p (get_nonzero_bits (@2), wi::one (TYPE_PRECISION
>>> (type))))
>>>
>>> && wi::eq_p (wi::bit_or (get_nonzero_bits (@1), get_nonzero_bits (@2)),
>>> 1))
>>>
>>> (I think literal 1 still works)?
>>
>> Yep, I can confirm that.
>>
>>> How does it behave for singed/unsigned 1-bit
>>> bitfields? A gimple testcase maybe necessary to see.
>>
>> Can we really have a mult that will have a bitfield type?
>
> As said you probably need a GIMPLE testcase to avoid
> promoting to int. Oh, and that doesn't work yet because
> we cannot "parse" bit-precision types for temporaries.
>
> struct X { int a : 1; int b : 1; };
>
> int foo (struct X *p)
> {
>   return p->a;
> }
>
> produces
>
> int __GIMPLE (ssa)
> foo (struct X * p)
> {
>   int D_1913;
>   <unnamed-signed:1> _1;
> ...
>
> we have similar issues with dumping of vector types but
> there at least one can use a typedef and manual editing.
> For bit-precision types we need to invent a "C" extension
> (thus also for vectors).
>
> Anyway...

I see. I'm sending an updated version of the patch, which I've just been
testing. It addresses Richard Sandiford's note.

May I install it after testing?

>
>> $ cat gcc/testsuite/gcc.dg/pr87954-2.c
>> #define __GFP_DMA 1u
>> #define __GFP_RECLAIM 0x10u
>>
>> struct bt
>> {
>>   unsigned int v:1;
>> };
>>
>> unsigned int
>> imul(unsigned int flags)
>> {
>>   struct bt is_dma, is_rec;
>>
>>   is_dma.v = !!(flags & __GFP_DMA);
>>   is_rec.v = !!(flags & __GFP_RECLAIM);
>>
>>   return is_rec.v * !is_dma.v;
>> }
>>
>> $ ./xgcc -B. /home/marxin/Programming/gcc/gcc/testsuite/gcc.dg/pr87954-2.c
>> -fdump-tree-optimized=/dev/stdout -O2
>>
>> ;; Function imul (imul, funcdef_no=0, decl_uid=1909, cgraph_uid=1,
>> symbol_order=0)
>>
>> imul (unsigned int flags)
>> {
>>   struct bt is_dma;
>>   _Bool _1;
>>   unsigned int _2;
>>   _Bool _3;
>>   unsigned char _4;
>>   _Bool _6;
>>   unsigned int _9;
>>   <unnamed-unsigned:1> _11;
>>   unsigned char _14;
>>
>>   <bb 2> [local count: 1073741824]:
>>   _1 = (_Bool) flags_7(D);
>>   _2 = flags_7(D) & 16;
>>   _3 = _2 != 0;
>>   is_dma.v = _1;
>>   _4 = BIT_FIELD_REF <is_dma, 8, 0>;
>>   _14 = ~_4;
>>   _6 = (_Bool) _14;
>>   _11 = _3 & _6;
>>   _9 = (unsigned int) _11;
>>   is_dma ={v} {CLOBBER};
>>   return _9;
>> }
>>
>>>
>>> Does this mean we want to turn plus into bit_ior when
>>> get_nonzero_bits() & get_nonzero_bits() is zero?
>>
>> That's quite interesting transformation, I'll add it as a follow up patch.
>
> I was just curious - maybe we should do the reverse instead?
> For mult vs. bit-and I think the latter will be "faster" (well, probably not
> even that...). But for plus vs or?

Hmm, the expected speedup will probably be very small.

Martin

>
>
>>>
>>> X * [0, 1] -> X & sign-extend-from-bit-1 also works I guess, but
>>> multiplication
>>> looks more canonical.
>> Ok here.
>>
>> Martin
>>
>>>
>>> Thanks,
>>> Richard.
>>>
>>>> Thanks,
>>>> Martin
>>>>
>>>> gcc/ChangeLog:
>>>>
>>>> 2019-06-05 Martin Liska <mli...@suse.cz>
>>>>
>>>> PR tree-optimization/87954
>>>> * match.pd: Simplify mult where both arguments are 0 or 1.
>>>>
>>>> gcc/testsuite/ChangeLog:
>>>>
>>>> 2019-06-05 Martin Liska <mli...@suse.cz>
>>>>
>>>> PR tree-optimization/87954
>>>> * gcc.dg/pr87954.c: New test.
>>>> ---
>>>> gcc/match.pd | 8 ++++++++
>>>> gcc/testsuite/gcc.dg/pr87954.c | 21 +++++++++++++++++++++
>>>> 2 files changed, 29 insertions(+)
>>>> create mode 100644 gcc/testsuite/gcc.dg/pr87954.c
>>>>
>>>>
>>

From ef2699c2e41df4bb7667ce44248795dd511e0f7f Mon Sep 17 00:00:00 2001
From: Martin Liska <mli...@suse.cz>
Date: Wed, 5 Jun 2019 11:58:57 +0200
Subject: [PATCH] Simplify mult where both arguments are 0 or 1 (PR tree-optimization/87954).

gcc/ChangeLog:

2019-06-05  Martin Liska  <mli...@suse.cz>

        PR tree-optimization/87954
        * match.pd: Simplify mult where both arguments are 0 or 1.

gcc/testsuite/ChangeLog:

2019-06-05  Martin Liska  <mli...@suse.cz>

        PR tree-optimization/87954
        * gcc.dg/pr87954.c: New test.
---
 gcc/match.pd                   |  8 ++++++++
 gcc/testsuite/gcc.dg/pr87954.c | 21 +++++++++++++++++++++
 2 files changed, 29 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/pr87954.c

diff --git a/gcc/match.pd b/gcc/match.pd
index 02e0471dd4e..88dae4231d8 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -217,6 +217,14 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
            || !COMPLEX_FLOAT_TYPE_P (type)))
    (negate @0)))
 
+/* Transform { 0 or 1 } * { 0 or 1 } into { 0 or 1 } & { 0 or 1 } */
+(simplify
+ (mult SSA_NAME@1 SSA_NAME@2)
+  (if (INTEGRAL_TYPE_P (type)
+       && get_nonzero_bits (@1) == 1
+       && get_nonzero_bits (@2) == 1)
+   (bit_and @1 @2)))
+
 /* Transform x * { 0 or 1, 0 or 1, ... } into x & { 0 or -1, 0 or -1, ...},
    unless the target has native support for the former but not the latter. */
 (simplify
diff --git a/gcc/testsuite/gcc.dg/pr87954.c b/gcc/testsuite/gcc.dg/pr87954.c
new file mode 100644
index 00000000000..620657cb1f5
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr87954.c
@@ -0,0 +1,21 @@
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+#define __GFP_DMA 1u
+#define __GFP_RECLAIM 0x10u
+
+#define KMALLOC_DMA 2
+#define KMALLOC_RECLAIM 1
+
+unsigned int
+imul(unsigned int flags)
+{
+  int is_dma, type_dma, is_rec;
+
+  is_dma = !!(flags & __GFP_DMA);
+  type_dma = is_dma * KMALLOC_DMA;
+  is_rec = !!(flags & __GFP_RECLAIM);
+
+  return type_dma + (is_rec * !is_dma) * KMALLOC_RECLAIM;
+}
+
+/* { dg-final { scan-tree-dump-times { \* } 1 "optimized" } } */
-- 
2.21.0
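
For anyone who wants to sanity-check the arithmetic behind the pattern outside of GCC, here is a minimal standalone sketch in plain C. It is not part of the patch and uses no GCC internals. The function names are hypothetical, and it assumes that "nonzero bits == 1" means the operand can only be 0 or 1, as the comment in the match.pd hunk says. It also checks the plus vs. bit_ior identity raised in the thread, for operands with no set bits in common.

```c
#include <assert.h>

/* Hypothetical helper names for illustration; none of this is GCC code.  */

/* Returns 1 if a * b == (a & b) for every a, b in {0, 1},
   i.e. for operands whose only possibly-set bit is bit 0.  */
static int
mult_equals_and_for_bits (void)
{
  for (unsigned a = 0; a <= 1; a++)
    for (unsigned b = 0; b <= 1; b++)
      if (a * b != (a & b))
        return 0;
  return 1;
}

/* Returns 1 if a + b == (a | b) for every pair a, b < limit that
   shares no set bits, i.e. (a & b) == 0.  With no common bits there
   are no carries, so addition and inclusive-or agree.  */
static int
plus_equals_ior_when_disjoint (unsigned limit)
{
  for (unsigned a = 0; a < limit; a++)
    for (unsigned b = 0; b < limit; b++)
      if ((a & b) == 0 && a + b != (a | b))
        return 0;
  return 1;
}

/* Returns 1 if mult and bit-and disagree for operands outside {0, 1};
   this is why the pattern needs the nonzero-bits guard.  */
static int
mult_differs_outside_range (void)
{
  return 2u * 3u != (2u & 3u);
}
```

Calling the three functions from a small main and asserting that each returns 1 is enough to exercise them. The check says nothing about signed 1-bit bitfields, which would still need the GIMPLE testcase discussed above.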