Re: [PATCH] Simplify mult where both arguments are 0 or 1 (PR tree-optimization/87954).

Martin Liška Wed, 05 Jun 2019 06:35:46 -0700

On 6/5/19 3:04 PM, Richard Biener wrote:
> On Wed, Jun 5, 2019 at 2:09 PM Martin Liška <mli...@suse.cz> wrote:
>>
>> On 6/5/19 1:13 PM, Richard Biener wrote:
>>> On Wed, Jun 5, 2019 at 12:56 PM Martin Liška <mli...@suse.cz> wrote:
>>>>
>>>> Hi.
>>>>
>>>> I'm suggesting one multiplication simplification pattern.
>>>>
>>>> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
>>>>
>>>> Ready to be installed?
>>>
>>> +  (if (INTEGRAL_TYPE_P (type)
>>> +       && wi::eq_p (get_nonzero_bits (@1), wi::one (TYPE_PRECISION (type)))
>>> +       && wi::eq_p (get_nonzero_bits (@2), wi::one (TYPE_PRECISION (type))))
>>>
>>>   && wi::eq_p (wi::bit_or (get_nonzero_bits (@1), get_nonzero_bits (@2)), 1))
>>>
>>> (I think literal 1 still works)?
>>
>> Yep, I can confirm that.
>>
>>> How does it behave for signed/unsigned 1-bit
>>> bitfields?  A GIMPLE testcase may be necessary to see.
>>
>> Can we really have a mult that will have a bitfield type?
> 
> As said you probably need a GIMPLE testcase to avoid
> promoting to int.  Oh, and that doesn't work yet because
> we cannot "parse" bit-precision types for temporaries.
> 
> struct X { int a : 1; int b : 1; };
> 
> int foo (struct X *p)
> {
>   return p->a;
> }
> 
> produces
> 
> int __GIMPLE (ssa)
> foo (struct X * p)
> {
>   int D_1913;
>   <unnamed-signed:1> _1;
> ...
> 
> we have similar issues with dumping of vector types but
> there at least one can use a typedef and manual editing.
> For bit-precision types we need to invent a "C" extension
> (thus also for vectors).
> 
> Anyway...


I see. I'm sending an updated version of the patch that I've just been testing.
It addresses Richard Sandiford's note.

May I install it after testing?
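
For anyone following along, the precondition used by the updated pattern can be
illustrated outside of GCC. This is only a sketch in plain C, not compiler code:
the unsigned masks stand in for what get_nonzero_bits () would report for each
operand, and both_zero_or_one and mult_equals_and are made-up helper names.

```c
#include <assert.h>

/* Hypothetical stand-in for the match.pd condition.  Each mask models the
   result of get_nonzero_bits () for one operand, i.e. the set of bits that
   may be nonzero.  A mask of exactly 1 means the value is 0 or 1.  */
static int
both_zero_or_one (unsigned int nonzero1, unsigned int nonzero2)
{
  return nonzero1 == 1 && nonzero2 == 1;
}

/* Return 1 if a * b == (a & b) holds for every a and b whose set bits lie
   within the given masks, 0 otherwise.  Only small masks are scanned.  */
static int
mult_equals_and (unsigned int nonzero1, unsigned int nonzero2)
{
  assert (nonzero1 <= 0xff && nonzero2 <= 0xff);

  for (unsigned int a = 0; a <= nonzero1; a++)
    for (unsigned int b = 0; b <= nonzero2; b++)
      {
        /* Skip values that set a bit outside the allowed mask.  */
        if ((a & ~nonzero1) != 0 || (b & ~nonzero2) != 0)
          continue;
        if (a * b != (a & b))
          return 0;
      }
  return 1;
}
```

With masks (1, 1) both helpers return 1. With masks (3, 3) the value 2 becomes
possible and 2 * 2 != (2 & 2), so both return 0.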

> 
>> $ cat gcc/testsuite/gcc.dg/pr87954-2.c
>> #define __GFP_DMA 1u
>> #define __GFP_RECLAIM 0x10u
>>
>> struct bt
>> {
>>   unsigned int v:1;
>> };
>>
>> unsigned int
>> imul(unsigned int flags)
>> {
>>   struct bt is_dma, is_rec;
>>
>>   is_dma.v = !!(flags & __GFP_DMA);
>>   is_rec.v = !!(flags & __GFP_RECLAIM);
>>
>>   return is_rec.v * !is_dma.v;
>> }
>>
>> $ ./xgcc -B. /home/marxin/Programming/gcc/gcc/testsuite/gcc.dg/pr87954-2.c -fdump-tree-optimized=/dev/stdout -O2
>>
>> ;; Function imul (imul, funcdef_no=0, decl_uid=1909, cgraph_uid=1, symbol_order=0)
>>
>> imul (unsigned int flags)
>> {
>>   struct bt is_dma;
>>   _Bool _1;
>>   unsigned int _2;
>>   _Bool _3;
>>   unsigned char _4;
>>   _Bool _6;
>>   unsigned int _9;
>>   <unnamed-unsigned:1> _11;
>>   unsigned char _14;
>>
>>   <bb 2> [local count: 1073741824]:
>>   _1 = (_Bool) flags_7(D);
>>   _2 = flags_7(D) & 16;
>>   _3 = _2 != 0;
>>   is_dma.v = _1;
>>   _4 = BIT_FIELD_REF <is_dma, 8, 0>;
>>   _14 = ~_4;
>>   _6 = (_Bool) _14;
>>   _11 = _3 & _6;
>>   _9 = (unsigned int) _11;
>>   is_dma ={v} {CLOBBER};
>>   return _9;
>> }
>>
>>>
>>> Does this mean we want to turn plus into bit_ior when
>>> get_nonzero_bits() & get_nonzero_bits() is zero?
>>
>> That's quite an interesting transformation; I'll add it as a follow-up patch.
> 
> I was just curious - maybe we should do the reverse instead?
> For mult vs. bit-and I think the latter will be "faster" (well, probably not
> even that...).  But for plus vs or?

Hmm, the expected speedup will probably be very small.
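
For reference, the plus vs. bit_ior equivalence itself is easy to check in plain
C. Again this is only a sketch, not GCC code: the masks model get_nonzero_bits ()
and the helper names masks_disjoint and plus_equals_ior are made up. In the
pr87954.c testcase the two addends would correspond to masks 2 (type_dma) and 1
(the is_rec * !is_dma term).

```c
#include <assert.h>

/* Hypothetical model of the proposed condition: the two operands can never
   have the same bit set.  */
static int
masks_disjoint (unsigned int nonzero1, unsigned int nonzero2)
{
  return (nonzero1 & nonzero2) == 0;
}

/* Return 1 if a + b == (a | b) holds for every a and b whose set bits lie
   within the given masks, 0 otherwise.  Only small masks are scanned.  */
static int
plus_equals_ior (unsigned int nonzero1, unsigned int nonzero2)
{
  assert (nonzero1 <= 0xff && nonzero2 <= 0xff);

  for (unsigned int a = 0; a <= nonzero1; a++)
    for (unsigned int b = 0; b <= nonzero2; b++)
      {
        /* Skip values that set a bit outside the allowed mask.  */
        if ((a & ~nonzero1) != 0 || (b & ~nonzero2) != 0)
          continue;
        if (a + b != (a | b))
          return 0;
      }
  return 1;
}
```

For disjoint masks such as (2, 1) no carry can occur, so the sum and the
inclusive OR agree. For overlapping masks such as (1, 1) they differ, because
1 + 1 is 2 while 1 | 1 is 1.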

Martin

> 
> 
>>>
>>> X * [0, 1] -> X & sign-extend-from-bit-1 also works I guess, but multiplication
>>> looks more canonical.
>> Ok here.
>>
>> Martin
>>
>>>
>>> Thanks,
>>> Richard.
>>>
>>>> Thanks,
>>>> Martin
>>>>
>>>> gcc/ChangeLog:
>>>>
>>>> 2019-06-05  Martin Liska  <mli...@suse.cz>
>>>>
>>>>         PR tree-optimization/87954
>>>>         * match.pd: Simplify mult where both arguments are 0 or 1.
>>>>
>>>> gcc/testsuite/ChangeLog:
>>>>
>>>> 2019-06-05  Martin Liska  <mli...@suse.cz>
>>>>
>>>>         PR tree-optimization/87954
>>>>         * gcc.dg/pr87954.c: New test.
>>>> ---
>>>>  gcc/match.pd                   |  8 ++++++++
>>>>  gcc/testsuite/gcc.dg/pr87954.c | 21 +++++++++++++++++++++
>>>>  2 files changed, 29 insertions(+)
>>>>  create mode 100644 gcc/testsuite/gcc.dg/pr87954.c
>>>>
>>>>
>>

From ef2699c2e41df4bb7667ce44248795dd511e0f7f Mon Sep 17 00:00:00 2001
From: Martin Liska <mli...@suse.cz>
Date: Wed, 5 Jun 2019 11:58:57 +0200
Subject: [PATCH] Simplify mult where both arguments are 0 or 1 (PR
 tree-optimization/87954).

gcc/ChangeLog:

2019-06-05  Martin Liska  <mli...@suse.cz>

	PR tree-optimization/87954
	* match.pd: Simplify mult where both arguments are 0 or 1.

gcc/testsuite/ChangeLog:

2019-06-05  Martin Liska  <mli...@suse.cz>

	PR tree-optimization/87954
	* gcc.dg/pr87954.c: New test.
---
 gcc/match.pd                   |  8 ++++++++
 gcc/testsuite/gcc.dg/pr87954.c | 21 +++++++++++++++++++++
 2 files changed, 29 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/pr87954.c

diff --git a/gcc/match.pd b/gcc/match.pd
index 02e0471dd4e..88dae4231d8 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -217,6 +217,14 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
            || !COMPLEX_FLOAT_TYPE_P (type)))
    (negate @0)))
 
+/* Transform { 0 or 1 } * { 0 or 1 } into { 0 or 1 } & { 0 or 1 } */
+(simplify
+ (mult SSA_NAME@1 SSA_NAME@2)
+  (if (INTEGRAL_TYPE_P (type)
+       && get_nonzero_bits (@1) == 1
+       && get_nonzero_bits (@2) == 1)
+   (bit_and @1 @2)))
+
 /* Transform x * { 0 or 1, 0 or 1, ... } into x & { 0 or -1, 0 or -1, ...},
    unless the target has native support for the former but not the latter.  */
 (simplify
diff --git a/gcc/testsuite/gcc.dg/pr87954.c b/gcc/testsuite/gcc.dg/pr87954.c
new file mode 100644
index 00000000000..620657cb1f5
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr87954.c
@@ -0,0 +1,21 @@
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+#define __GFP_DMA 1u
+#define __GFP_RECLAIM 0x10u
+
+#define KMALLOC_DMA 2
+#define KMALLOC_RECLAIM 1
+
+unsigned int
+imul(unsigned int flags)
+{
+  int is_dma, type_dma, is_rec;
+
+  is_dma = !!(flags & __GFP_DMA);
+  type_dma = is_dma * KMALLOC_DMA;
+  is_rec = !!(flags & __GFP_RECLAIM);
+
+  return type_dma + (is_rec * !is_dma) * KMALLOC_RECLAIM;
+}
+
+/* { dg-final { scan-tree-dump-times { \* } 1 "optimized" } } */
-- 
2.21.0
