Re: [match.pd] Fix for PR35691

Prathamesh Kulkarni Tue, 08 Nov 2016 03:13:14 -0800

On 8 November 2016 at 13:23, Richard Biener <rguent...@suse.de> wrote:
> On Mon, 7 Nov 2016, Prathamesh Kulkarni wrote:
>
>> On 7 November 2016 at 23:06, Prathamesh Kulkarni
>> <prathamesh.kulka...@linaro.org> wrote:
>> > On 7 November 2016 at 15:43, Richard Biener <rguent...@suse.de> wrote:
>> >> On Fri, 4 Nov 2016, Prathamesh Kulkarni wrote:
>> >>
>> >>> On 4 November 2016 at 13:41, Richard Biener <rguent...@suse.de> wrote:
>> >>> > On Thu, 3 Nov 2016, Marc Glisse wrote:
>> >>> >
>> >>> >> On Thu, 3 Nov 2016, Richard Biener wrote:
>> >>> >>
>> >>> >> > > > > The transform would also work for vectors (element_precision 
>> >>> >> > > > > for
>> >>> >> > > > > the test but also a value-matching zero which should ensure 
>> >>> >> > > > > the
>> >>> >> > > > > same number of elements).
>> >>> >> > > > Um sorry, I didn't get how to check vectors to be of equal 
>> >>> >> > > > length by a
>> >>> >> > > > matching zero.
>> >>> >> > > > Could you please elaborate on that ?
>> >>> >> > >
>> >>> >> > > He may have meant something like:
>> >>> >> > >
>> >>> >> > >   (op (cmp @0 integer_zerop@2) (cmp @1 @2))
>> >>> >> >
>> >>> >> > I meant with one being @@2 to allow signed vs. Unsigned @0/@1 which 
>> >>> >> > was the
>> >>> >> > point of the pattern.
>> >>> >>
>> >>> >> Oups, that's what I had written first, and then I somehow managed to 
>> >>> >> confuse
>> >>> >> myself enough to remove it so as to remove the call to types_match :-(
>> >>> >>
>> >>> >> > > So the last operand is checked with operand_equal_p instead of
>> >>> >> > > integer_zerop. But the fact that we could compute bit_ior on the
>> >>> >> > > comparison results should already imply that the number of 
>> >>> >> > > elements is the
>> >>> >> > > same.
>> >>> >> >
>> >>> >> > Though for equality compares we also allow scalar results IIRC.
>> >>> >>
>> >>> >> Oh, right, I keep forgetting that :-( And I have no idea how to 
>> >>> >> generate one
>> >>> >> for a testcase, at least until the GIMPLE FE lands...
>> >>> >>
>> >>> >> > > On platforms that have IOR on floats (at least x86 with SSE, 
>> >>> >> > > maybe some
>> >>> >> > > vector mode on s390?), it would be cool to do the same for floats 
>> >>> >> > > (most
>> >>> >> > > likely at the RTL level).
>> >>> >> >
>> >>> >> > On GIMPLE view-converts could come to the rescue here as well.  Or 
>> >>> >> > we cab
>> >>> >> > just allow bit-and/or on floats as much as we allow them on 
>> >>> >> > pointers.
>> >>> >>
>> >>> >> Would that generate sensible code on targets that do not have logic 
>> >>> >> insns for
>> >>> >> floats? Actually, even on x86_64 that generates inefficient code, so 
>> >>> >> there
>> >>> >> would be some work (for instance grep finds no gen_iordf3, only 
>> >>> >> gen_iorv2df3).
>> >>> >>
>> >>> >> I am also a bit wary of doing those obfuscating optimizations too 
>> >>> >> early...
>> >>> >> a==0 is something that other optimizations might use. long
>> >>> >> c=(long&)a|(long&)b; (double&)c==0; less so...
>> >>> >>
>> >>> >> (and I am assuming that signaling NaNs don't make the whole 
>> >>> >> transformation
>> >>> >> impossible, which might be wrong)
>> >>> >
>> >>> > Yeah.  I also think it's not so much important - I just wanted to 
>> >>> > mention
>> >>> > vectors...
>> >>> >
>> >>> > Btw, I still think we need a more sensible infrastructure for passes
>> >>> > to gather, analyze and modify complex conditions.  (I'm always pointing
>> >>> > to tree-affine.c as an, albeit not very good, example for handling
>> >>> > a similar problem)
>> >>> Thanks for mentioning the value-matching capture @@, I wasn't aware of
>> >>> this match.pd feature.
>> >>> The current patch keeps it restricted to only bitwise operators on 
>> >>> integers.
>> >>> Bootstrap+test running on x86_64-unknown-linux-gnu.
>> >>> OK to commit if passes ?
>> >>
>> >> +/* PR35691: Transform
>> >> +   (x == 0 & y == 0) -> (x | typeof(x)(y)) == 0.
>> >> +   (x != 0 | y != 0) -> (x | typeof(x)(y)) != 0.  */
>> >> +
>> >>
>> >> Please omit the vertical space
>> >>
>> >> +(for bitop (bit_and bit_ior)
>> >> +     cmp (eq ne)
>> >> + (simplify
>> >> +  (bitop (cmp @0 integer_zerop) (cmp @1 integer_zerop))
>> >>
>> >> if you capture the first integer_zerop as @2 then you can re-use it...
>> >>
>> >> +   (if (INTEGRAL_TYPE_P (TREE_TYPE (@0))
>> >> +       && INTEGRAL_TYPE_P (TREE_TYPE (@1))
>> >> +       && TYPE_PRECISION (TREE_TYPE (@0)) == TYPE_PRECISION (TREE_TYPE
>> >> (@1)))
>> >> +    (cmp (bit_ior @0 (convert @1)) { build_zero_cst (TREE_TYPE (@0));
>> >>
>> >> ... here inplace of the { build_zero_cst ... }.
>> >>
>> >> Ok with that changes.
>> > Thanks, committed the attached version as r241915.
>> ugh, the svn commit message has:
>>
>> testsuite/
>> * gcc.dg/pr35691-1.c: New test-case.
>> * gcc.dg/pr35691-4.c: Likewise.
>>
>> pr35691-4.c was a typo, should be pr35691-2.c :/
>> However testsuite/ChangeLog correctly has entry for pr35691-2.c
>> Is it possible to edit the commit message for r241915 ?
>> Sorry about this.
>
> No, just leave it as-is.
Hi,
Chritstophe reported to me that the commit caused test-cases
pr35691-1.c and pr35691-2.c (which were added by the commit)
to FAIL for cortex-a5:
http://people.linaro.org/~christophe.lyon/cross-validation/gcc/trunk/241915/arm-none-linux-gnueabihf/diff-gcc-rh60-arm-none-linux-gnueabihf-arm-cortex-a5-vfpv3-d16-fp16.txt


It seems truth_andif_expr is not simplified to bit_and_expr on
cortex-a5 as for x86_64 (and other arm variants).
The differences in dumps start from 004t.gimple for pr35691-1.c:

x86_64 gimple dump:
foo (int z0, unsigned int z1)
{
  int D.1800;
  int t0;
  int t1;
  int t2;

  _1 = z0 == 0;
  t0 = (int) _1;
  _2 = z1 == 0;
  t1 = (int) _2;
  _3 = t0 != 0;
  _4 = t1 != 0;
  _5 = _3 & _4;
  t2 = (int) _5;
  D.1800 = t2;
  return D.1800;
}

cortex-a5 gimple dump:
foo (int z0, unsigned int z1)
{
  int iftmp.0;
  int D.4176;
  int t0;
  int t1;
  int t2;

  _1 = z0 == 0;
  t0 = (int) _1;
  _2 = z1 == 0;
  t1 = (int) _2;
  if (t0 != 0) goto <D.4174>; else goto <D.4172>;
  <D.4174>:
  if (t1 != 0) goto <D.4175>; else goto <D.4172>;
  <D.4175>:
  iftmp.0 = 1;
  goto <D.4173>;
  <D.4172>:
  iftmp.0 = 0;
  <D.4173>:
  t2 = iftmp.0;
  D.4176 = t2;
  return D.4176;
}

Since the pattern expects truth_andif_expr to be converted to bit_and_expr,
it fails to match for cortex-a5.
This seems to happen only for cortex-a5 (the other variants a9, a15,
a57 are OK).

Is my assumption that truth_andif_expr would be always converted to bit_and_expr
for above case incorrect ?

Thanks,
Prathamesh
>
> Richard.
>
>> Regards,
>> Prathamesh
>> >
>> >>
>> >> Richard.
>>
>>
>
> --
> Richard Biener <rguent...@suse.de>
> SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
> 21284 (AG Nuernberg)

Re: [match.pd] Fix for PR35691

Reply via email to