Hi Christophe, This is not a regression per se; the patch causes the test case to generate one less instruction overall, but one additional 'and'. Trunk before the patch (-O2):
foo: and w0, w0, 255 lsl w1, w0, 20 orr w0, w1, w0, lsl 8 mov w1, 65024 movk w1, 0xfe0, lsl 16 and w0, w0, w1 ret The new trunk aarch64 compiler with the patch generates fewer instructions but one additional 'and' foo: and w0, w0, 255 lsl w1, w0, 20 orr w0, w1, w0, lsl 8 and x0, x0, 268434944 and x0, x0, -2031617 ret -----Original Message----- From: Christophe Lyon [mailto:christophe.l...@linaro.org] Sent: Wednesday, November 23, 2016 6:42 AM To: James Greenhalgh Cc: Michael Collison; gcc-patches@gcc.gnu.org; nd Subject: Re: [Aarch64][PATCH] Improve Logical And Immediate Expressions Hi Michael, On 21 November 2016 at 10:52, James Greenhalgh <james.greenha...@arm.com> wrote: > On Fri, Nov 18, 2016 at 07:41:58AM +0000, Michael Collison wrote: >> James, >> >> I incorporated all your suggestions, and successfully bootstrapped >> and re-ran the testsuite. >> >> Okay for trunk? >> >> 2016-11-18 Michael Collison <michael.colli...@arm.com> >> >> * config/aarch64/aarch64-protos.h >> (aarch64_and_split_imm1, aarch64_and_split_imm2) >> (aarch64_and_bitmask_imm): New prototypes >> * config/aarch64/aarch64.c (aarch64_and_split_imm1): >> New overloaded function to create bit mask covering the >> lowest to highest bits set. >> (aarch64_and_split_imm2): New overloaded functions to create bit >> mask of zeros between first and last bit set. >> (aarch64_and_bitmask_imm): New function to determine if a integer >> is a valid two instruction "and" operation. >> * config/aarch64/aarch64.md:(and<mode>3): New define_insn and _split >> allowing wider range of constants with "and" operations. >> * (ior<mode>3, xor<mode>3): Use new LOGICAL2 iterator to prevent >> "and" operator from matching restricted constant range used for >> ior and xor operators. >> * config/aarch64/constraints.md (UsO constraint): New SImode constraint >> for constants in "and" operantions. >> (UsP constraint): New DImode constraint for constants in "and" >> operations. >> * config/aarch64/iterators.md (lconst2): New mode iterator. >> (LOGICAL2): New code iterator. >> * config/aarch64/predicates.md (aarch64_logical_and_immediate): New >> predicate >> (aarch64_logical_and_operand): New predicate allowing extended >> constants >> for "and" operations. >> * testsuite/gcc.target/aarch64/and_const.c: New test to verify >> additional constants are recognized and fewer instructions generated. >> * testsuite/gcc.target/aarch64/and_const2.c: New test to verify >> additional constants are recognized and fewer instructions generated. >> > >> diff --git a/gcc/config/aarch64/aarch64-protos.h >> b/gcc/config/aarch64/aarch64-protos.h >> index 3cdd69b..7ef8cdf 100644 >> --- a/gcc/config/aarch64/aarch64-protos.h >> +++ b/gcc/config/aarch64/aarch64-protos.h >> @@ -300,6 +300,9 @@ extern struct tune_params aarch64_tune_params; >> HOST_WIDE_INT aarch64_initial_elimination_offset (unsigned, >> unsigned); int aarch64_get_condition_code (rtx); bool >> aarch64_bitmask_imm (HOST_WIDE_INT val, machine_mode); >> +unsigned HOST_WIDE_INT aarch64_and_split_imm1 (HOST_WIDE_INT >> +val_in); unsigned HOST_WIDE_INT aarch64_and_split_imm2 >> +(HOST_WIDE_INT val_in); bool aarch64_and_bitmask_imm (unsigned >> +HOST_WIDE_INT val_in, machine_mode mode); >> int aarch64_branch_cost (bool, bool); enum aarch64_symbol_type >> aarch64_classify_symbolic_expression (rtx); bool >> aarch64_const_vec_all_same_int_p (rtx, HOST_WIDE_INT); diff --git >> a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c index >> 3e663eb..8e33c40 100644 >> --- a/gcc/config/aarch64/aarch64.c >> +++ b/gcc/config/aarch64/aarch64.c >> @@ -3600,6 +3600,43 @@ aarch64_bitmask_imm (HOST_WIDE_INT val_in, >> machine_mode mode) >> return val == mask * bitmask_imm_mul[__builtin_clz (bits) - 26]; >> } >> >> +/* Create mask of ones, covering the lowest to highest bits set in >> +VAL_IN. */ >> + >> +unsigned HOST_WIDE_INT >> +aarch64_and_split_imm1 (HOST_WIDE_INT val_in) { >> + int lowest_bit_set = ctz_hwi (val_in); >> + int highest_bit_set = floor_log2 (val_in); >> + gcc_assert (val_in != 0); > > The comment above the function should make this precondition clear. Or > you could pick a well defined behaviour for when val_in == 0 (return > 0?), if that fits the rest of the algorithm. > > Otherwise, this patch looks OK to me. > > Thanks, > James > This patch (r242739) causes a regression on aarch64: gcc.dg/zero_bits_compound-2.c scan-assembler-times \\(and: 2 now fails. Christophe.