On Sat, 15 Oct 2016, Prathamesh Kulkarni wrote:

> On 13 October 2016 at 13:22, Marc Glisse <marc.gli...@inria.fr> wrote:
> > On Thu, 13 Oct 2016, Prathamesh Kulkarni wrote:
> >
> >> On 12 October 2016 at 14:43, Richard Biener <rguent...@suse.de> wrote:
> >>>
> >>> On Wed, 12 Oct 2016, Marc Glisse wrote:
> >>>
> >>>> On Wed, 12 Oct 2016, Prathamesh Kulkarni wrote:
> >>>>
> >>>>> I was having a look at PR71636 and added the following pattern to
> >>>>> match.pd:
> >>>>> x & ((1U << b) - 1) -> x & ~(~0U << b)
> >>>>> However the transform is useful only if the target supports "andnot"
> >>>>> instruction.
> >>>>
> >>>>
> >>>> rth was selling the transformation as a canonicalization, which is
> >>>> beneficial
> >>>> when there is an andnot instruction, and neutral otherwise, so it could
> >>>> be
> >>>> done always.
> >>>
> >>>
> >>> Well, its three instructions to three instructions and a more expensive
> >>> constant(?).  ~0U might not be available as immediate for the shift
> >>> instruction and 1U << b might be available as a bit-set instruction ...
> >>> (vs. the andnot).
> >
> >
> > True, I hadn't thought of bit-set.
> >
> >>> So yes, we might decide to canonicalize to andnot (and decide that
> >>> three binary to two binary and one unary op is "better").
> >>>
> >>> So no excuse to explore the target specific .pd fragment idea ... :/
> >>
> >> Hi,
> >> I have attached patch that adds the transform.
> >> Does that look OK ?
> >
> >
> > Why bit_not of build_zero_cst instead of build_all_ones_cst, as suggested in
> > the PR? If we only do the transformation when (1<<b)-1 is fed to a bit_and,
> > then we probably want to require that it has a single use (maybe even the
> > shift).
> >
> >> I am not sure how to write test-cases for it though.
> >> For the test-case:
> >> unsigned f(unsigned x, unsigned b)
> >> {
> >>  unsigned t1 = 1U << b;
> >>  unsigned t2 = t1 - 1;
> >>  unsigned t3 = x & t2;
> >>  return t3;
> >> }
> >>
> >> forwprop dump shows:
> >> Applying pattern match.pd:523, gimple-match.c:47419
> >> gimple_simplified to _6 = 4294967295 << b_1(D);
> >> _8 = ~_6;
> >> t3_5 = x_4(D) & _8;
> >>
> >> I could scan for "_6 = 4294967295 << b_1(D);"  however I suppose
> >> ~0 would depend on width of int and not always be 4294967295 ?
> >> Or should I scan for "_6 = 4294967295 << b_1(D);"
> >> and add /* { dg-require-effective int32 } */  to the test-case ?
> >
> >
> > You could check that you have ~, or that you don't have " 1 << ".
> Thanks for the suggestions.
> Does the attached patch look OK ?
> 
> For test-cases, scan-tree-dump-not "1 <<" works well for pr71636-1.c
> which tests GENERIC folding,
> however for GIMPLE folding, "1 << " still remains in the forwprop dump
> because dce isn't
> run to remove unused values.
> 
> For the test-case:
> unsigned f(unsigned x, unsigned b)
> {
>   unsigned t1 = 1U << b;
>   unsigned t2 = t1 - 1;
>   unsigned t3 = x & t2;
>   return t3;
> }
> 
> forwprop dump shows:
> Applying pattern match.pd:523, gimple-match.c:47418
> gimple_simplified to _6 = 4294967295 << b_1(D);
> _8 = ~_6;
> t3_5 = x_4(D) & _8;
> f (unsigned int x, unsigned int b)
> {
>   unsigned int t3;
>   unsigned int t2;
>   unsigned int t1;
>   unsigned int _6;
>   unsigned int _8;
> 
>   <bb 2>:
>   t1_2 = 1 << b_1(D);
>   t2_3 = t1_2 + 4294967295;
>   _6 = 4294967295 << b_1(D);
>   _8 = ~_6;
>   t3_5 = x_4(D) & _8;
>   return t3_5;
> 
> }
> 
> Instead I scanned for _8 = ~_6 with:
> /* { dg-final { scan-tree-dump "_\[0-9\] = ~_\[0-9\]" "forwprop1" } } */
> because rhs has bit_not and lhs doesn't.
> Is that OK ?

That's ok -- note I usually scan cddce1 instead of forwprop to have
DCE run on the IL.

Ok (with or without changing to scan cddce1 instead for not-1<<).

Thanks,
Richard.

> Bootstrap+tested on x86_64-unknown-linux-gnu.
> Cross-tested on arm*-*-*, aarch64*-*-*
> 
> Thanks,
> Prathamesh
> >
> > --
> > Marc Glisse
> 

-- 
Richard Biener <rguent...@suse.de>
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)

Reply via email to