On 23 November 2016 at 17:30, Michael Collison <michael.colli...@arm.com> wrote:
> Hi Christophe,
>
> This is not a regression per se; the patch causes the test case to generate
> one less instruction overall, but one additional 'and'. Trunk before the
> patch (-O2):
>
> foo:
>         and     w0, w0, 255
>         lsl     w1, w0, 20
>         orr     w0, w1, w0, lsl 8
>         mov     w1, 65024
>         movk    w1, 0xfe0, lsl 16
>         and     w0, w0, w1
>         ret
>
> The new trunk aarch64 compiler with the patch generates fewer instructions,
> but one additional 'and':
>
> foo:
>         and     w0, w0, 255
>         lsl     w1, w0, 20
>         orr     w0, w1, w0, lsl 8
>         and     x0, x0, 268434944
>         and     x0, x0, -2031617
>         ret
>
OK, by "regression" I meant that a test used to pass and now fails.
It looks like you have to update it to take into account the new, better code.

Christophe

> -----Original Message-----
> From: Christophe Lyon [mailto:christophe.l...@linaro.org]
> Sent: Wednesday, November 23, 2016 6:42 AM
> To: James Greenhalgh
> Cc: Michael Collison; gcc-patches@gcc.gnu.org; nd
> Subject: Re: [Aarch64][PATCH] Improve Logical And Immediate Expressions
>
> Hi Michael,
>
> On 21 November 2016 at 10:52, James Greenhalgh <james.greenha...@arm.com> wrote:
>> On Fri, Nov 18, 2016 at 07:41:58AM +0000, Michael Collison wrote:
>>> James,
>>>
>>> I incorporated all your suggestions, and successfully bootstrapped
>>> and re-ran the testsuite.
>>>
>>> Okay for trunk?
>>>
>>> 2016-11-18  Michael Collison  <michael.colli...@arm.com>
>>>
>>>     * config/aarch64/aarch64-protos.h
>>>     (aarch64_and_split_imm1, aarch64_and_split_imm2)
>>>     (aarch64_and_bitmask_imm): New prototypes.
>>>     * config/aarch64/aarch64.c (aarch64_and_split_imm1): New
>>>     function to create a bit mask covering the lowest to highest
>>>     bits set.
>>>     (aarch64_and_split_imm2): New function to create a bit mask of
>>>     zeros between the first and last bits set.
>>>     (aarch64_and_bitmask_imm): New function to determine whether an
>>>     integer is a valid two-instruction "and" operation.
>>>     * config/aarch64/aarch64.md (and<mode>3): New define_insn_and_split
>>>     allowing a wider range of constants with "and" operations.
>>>     (ior<mode>3, xor<mode>3): Use new LOGICAL2 iterator to prevent
>>>     the "and" operator from matching the restricted constant range
>>>     used for the ior and xor operators.
>>>     * config/aarch64/constraints.md (UsO): New SImode constraint
>>>     for constants in "and" operations.
>>>     (UsP): New DImode constraint for constants in "and" operations.
>>>     * config/aarch64/iterators.md (lconst2): New mode iterator.
>>>     (LOGICAL2): New code iterator.
>>>     * config/aarch64/predicates.md (aarch64_logical_and_immediate):
>>>     New predicate.
>>>     (aarch64_logical_and_operand): New predicate allowing extended
>>>     constants for "and" operations.
>>>     * testsuite/gcc.target/aarch64/and_const.c: New test to verify
>>>     additional constants are recognized and fewer instructions generated.
>>>     * testsuite/gcc.target/aarch64/and_const2.c: New test to verify
>>>     additional constants are recognized and fewer instructions generated.
>>>
>>
>>> diff --git a/gcc/config/aarch64/aarch64-protos.h
>>> b/gcc/config/aarch64/aarch64-protos.h
>>> index 3cdd69b..7ef8cdf 100644
>>> --- a/gcc/config/aarch64/aarch64-protos.h
>>> +++ b/gcc/config/aarch64/aarch64-protos.h
>>> @@ -300,6 +300,9 @@ extern struct tune_params aarch64_tune_params;
>>>  HOST_WIDE_INT aarch64_initial_elimination_offset (unsigned, unsigned);
>>>  int aarch64_get_condition_code (rtx);
>>>  bool aarch64_bitmask_imm (HOST_WIDE_INT val, machine_mode);
>>> +unsigned HOST_WIDE_INT aarch64_and_split_imm1 (HOST_WIDE_INT val_in);
>>> +unsigned HOST_WIDE_INT aarch64_and_split_imm2 (HOST_WIDE_INT val_in);
>>> +bool aarch64_and_bitmask_imm (unsigned HOST_WIDE_INT val_in, machine_mode mode);
>>>  int aarch64_branch_cost (bool, bool);
>>>  enum aarch64_symbol_type aarch64_classify_symbolic_expression (rtx);
>>>  bool aarch64_const_vec_all_same_int_p (rtx, HOST_WIDE_INT);
>>> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
>>> index 3e663eb..8e33c40 100644
>>> --- a/gcc/config/aarch64/aarch64.c
>>> +++ b/gcc/config/aarch64/aarch64.c
>>> @@ -3600,6 +3600,43 @@ aarch64_bitmask_imm (HOST_WIDE_INT val_in, machine_mode mode)
>>>    return val == mask * bitmask_imm_mul[__builtin_clz (bits) - 26];
>>>  }
>>>
>>> +/* Create mask of ones, covering the lowest to highest bits set in
>>> +   VAL_IN.  */
>>> +
>>> +unsigned HOST_WIDE_INT
>>> +aarch64_and_split_imm1 (HOST_WIDE_INT val_in)
>>> +{
>>> +  int lowest_bit_set = ctz_hwi (val_in);
>>> +  int highest_bit_set = floor_log2 (val_in);
>>> +  gcc_assert (val_in != 0);
>>
>> The comment above the function should make this precondition clear.
>> Or you could pick a well-defined behaviour for when val_in == 0
>> (return 0?), if that fits the rest of the algorithm.
>>
>> Otherwise, this patch looks OK to me.
>>
>> Thanks,
>> James
>>
>
> This patch (r242739) causes a regression on aarch64:
>
> gcc.dg/zero_bits_compound-2.c scan-assembler-times \\(and: 2 now fails.
>
> Christophe.