Ping. https://gcc.gnu.org/ml/gcc-patches/2016-07/msg00737.html
Thanks,
Kyrill

On 13/07/16 17:14, Kyrill Tkachov wrote:
Hi all,

The most common way to load and store a TImode value on aarch64 is to perform an LDP/STP of two X-registers. This is the *movti_aarch64 pattern in aarch64.md. There is a bug in the logic in aarch64_classify_address where it validates the offset in the address used to load a TImode value. It passes down TImode to the aarch64_offset_7bit_signed_scaled_p check, which rejects offsets that are not a multiple of the mode size of TImode (16). However, this is too conservative, as X-reg LDP/STP instructions accept immediate offsets that are a multiple of 8.

Also, considering that the definition of aarch64_offset_7bit_signed_scaled_p is:

  return (offset >= -64 * GET_MODE_SIZE (mode)
          && offset < 64 * GET_MODE_SIZE (mode)
          && offset % GET_MODE_SIZE (mode) == 0);

I think the range check may even be wrong for TImode, as this will accept offsets in the range [-1024, 1024) (as long as they are a multiple of 16), whereas X-reg LDP/STP instructions only accept offsets in the range [-512, 512). So, since the check is for an X-reg LDP/STP address, we should be passing down DImode; with DImode the same check accepts exactly the offsets in [-512, 512) that are a multiple of 8. This patch does that and enables more aggressive generation of REG+IMM addressing modes for 64-bit aligned TImode values, eliminating many address calculation instructions.

For the testcase in the patch we currently generate:

bar:
        add     x1, x1, 8
        add     x0, x0, 8
        ldp     x2, x3, [x1]
        stp     x2, x3, [x0]
        ret

whereas with this patch we generate:

bar:
        ldp     x2, x3, [x1, 8]
        stp     x2, x3, [x0, 8]
        ret

Bootstrapped and tested on aarch64-none-linux-gnu.
Ok for trunk?

Thanks,
Kyrill

2016-07-13  Kyrylo Tkachov  <kyrylo.tkac...@arm.com>

    * config/aarch64/aarch64.c (aarch64_classify_address):
    Use DImode when performing aarch64_offset_7bit_signed_scaled_p
    check for TImode LDP/STP addresses.

2016-07-13  Kyrylo Tkachov  <kyrylo.tkac...@arm.com>

    * gcc.target/aarch64/ldp_stp_unaligned_1.c: New test.
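
For a concrete picture of the kind of source this affects, here is a minimal C sketch. It is illustrative only and is not the actual ldp_stp_unaligned_1.c testcase from the patch: a 64-bit-aligned __int128 field placed at an offset that is a multiple of 8 but not of 16, whose copy becomes an X-reg LDP/STP.

  /* Illustrative sketch only -- not the testcase added by the patch.
     The __int128 member is 8-byte aligned and sits at offset 8, so
     copying it is a TImode move whose address is REG + 8: a multiple
     of 8 but not of 16.  With the patch the copy can use
     ldp x2, x3, [x1, 8] / stp x2, x3, [x0, 8] directly instead of
     first adjusting the base registers.  */
  typedef struct
  {
    long long pad;                              /* bytes 0..7               */
    __int128 t __attribute__ ((aligned (8)));   /* offset 8, 8-byte aligned */
  } __attribute__ ((packed)) foo;

  void
  bar (foo *dst, foo *src)
  {
    dst->t = src->t;   /* TImode load/store at base + 8 */
  }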