On Wed, Jul 28, 2021 at 2:45 PM Roger Sayle <ro...@nextmovesoftware.com> wrote:
> > Hi Marc, > > Thanks for the feedback. After some quality time in gdb, I now appreciate > that > match.pd behaves (subtly) differently between generic and gimple, and the > trees actually being passed to tree_nonzero_bits were not quite what I had > expected. Sorry for my confusion, the revised patch below is now much > shorter > (and my follow-up patch that was originally to tree_nonzero_bits looks like > it > now needs to be to get_nonzero_bits!). > > This revised patch has been retested on 864_64-pc-linux-gnu with a > "make bootstrap" and "make -k check" with no new failures. > > Ok for mainline? > > 2021-07-28 Roger Sayle <ro...@nextmovesoftware.com> > Marc Glisse <marc.gli...@inria.fr> > > gcc/ChangeLog > * match.pd (bit_ior, bit_xor): Canonicalize (X*C1)|(X*C2) and > (X*C1)^(X*C2) as X*(C1+C2), and related variants, using > tree_nonzero_bits to ensure that operands are bit-wise disjoint. > > gcc/testsuite/ChangeLog > * gcc.dg/fold-ior-4.c: New test. > > Hi, This patch introduces a regression on aarch64: FAIL: gcc.target/aarch64/pr100056.c scan-assembler-not \\t[us]bfiz\\tw[0-9]+, w[0-9]+, 11 Can you check? Thanks, Christophe > Roger > -- > > -----Original Message----- > From: Marc Glisse <marc.gli...@inria.fr> > Sent: 26 July 2021 16:45 > To: Roger Sayle <ro...@nextmovesoftware.com> > Cc: 'GCC Patches' <gcc-patches@gcc.gnu.org> > Subject: Re: [PATCH] Fold (X<<C1)^(X<<C2) to a multiplication when > possible. > > On Mon, 26 Jul 2021, Roger Sayle wrote: > > > The one aspect that's a little odd is that each transform is paired > > with a convert@1 variant, using the efficient match machinery to > > expose any zero extension to fold-const.c's tree_nonzero_bits > functionality. > > Copying the first transform for context > > +(for op (bit_ior bit_xor) > + (simplify > + (op (mult:s@0 @1 INTEGER_CST@2) > + (mult:s@3 @1 INTEGER_CST@4)) > + (if (INTEGRAL_TYPE_P (type) && TYPE_OVERFLOW_WRAPS (type) > + && (tree_nonzero_bits (@0) & tree_nonzero_bits (@3)) == 0) > + (mult @1 > + { wide_int_to_tree (type, wi::to_wide (@2) + wi::to_wide (@4)); > }))) > +(simplify > + (op (mult:s@0 (convert@1 @2) INTEGER_CST@3) > + (mult:s@4 (convert@1 @2) INTEGER_CST@5)) > + (if (INTEGRAL_TYPE_P (type) && TYPE_OVERFLOW_WRAPS (type) > + && (tree_nonzero_bits (@0) & tree_nonzero_bits (@4)) == 0) > + (mult @1 > + { wide_int_to_tree (type, wi::to_wide (@3) + wi::to_wide (@5)); > }))) > > Could you explain how the convert helps exactly? > > -- > Marc Glisse >