Ping. https://gcc.gnu.org/ml/gcc-patches/2016-07/msg00737.html
Thanks,
Kyrill

On 13/07/16 17:14, Kyrill Tkachov wrote:
Hi all,

The most common way to load and store a TImode value on aarch64 is to perform an LDP/STP of two X-registers. This is the *movti_aarch64 pattern in aarch64.md. There is a bug in the logic in aarch64_classify_address where it validates the offset in the address used to load a TImode value. It passes down TImode to the aarch64_offset_7bit_signed_scaled_p check, which rejects offsets that are not a multiple of the mode size of TImode (16). However, this is too conservative, as X-reg LDP/STP instructions accept immediate offsets that are a multiple of 8.

Also, considering that the definition of aarch64_offset_7bit_signed_scaled_p is:

  return (offset >= -64 * GET_MODE_SIZE (mode)
          && offset < 64 * GET_MODE_SIZE (mode)
          && offset % GET_MODE_SIZE (mode) == 0);

I think the range check may even be wrong for TImode, as this will accept offsets in the range [-1024, 1024) (as long as they are a multiple of 16), whereas X-reg LDP/STP instructions only accept offsets in the range [-512, 512). So, since the check is for an X-reg LDP/STP address, we should be passing down DImode; with DImode the same check accepts exactly the offsets in [-512, 512) that are a multiple of 8. This patch does that and enables more aggressive generation of REG+IMM addressing modes for 64-bit aligned TImode values, eliminating many address calculation instructions.

For the testcase in the patch we currently generate:

bar:
        add     x1, x1, 8
        add     x0, x0, 8
        ldp     x2, x3, [x1]
        stp     x2, x3, [x0]
        ret

whereas with this patch we generate:

bar:
        ldp     x2, x3, [x1, 8]
        stp     x2, x3, [x0, 8]
        ret

Bootstrapped and tested on aarch64-none-linux-gnu.
Ok for trunk?

Thanks,
Kyrill

2016-07-13  Kyrylo Tkachov  <kyrylo.tkac...@arm.com>

    * config/aarch64/aarch64.c (aarch64_classify_address):
    Use DImode when performing aarch64_offset_7bit_signed_scaled_p
    check for TImode LDP/STP addresses.

2016-07-13  Kyrylo Tkachov  <kyrylo.tkac...@arm.com>

    * gcc.target/aarch64/ldp_stp_unaligned_1.c: New test.
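
For a concrete picture of the kind of source this affects, here is a minimal C sketch. It is illustrative only and is not the actual ldp_stp_unaligned_1.c testcase from the patch: a 64-bit-aligned __int128 field placed at an offset that is a multiple of 8 but not of 16, whose copy becomes an X-reg LDP/STP.

  /* Illustrative sketch only -- not the testcase added by the patch.
     The __int128 member is 8-byte aligned and sits at offset 8, so
     copying it is a TImode move whose address is REG + 8: a multiple
     of 8 but not of 16.  With the patch the copy can use
     ldp x2, x3, [x1, 8] / stp x2, x3, [x0, 8] directly instead of
     first adjusting the base registers.  */
  typedef struct
  {
    long long pad;                              /* bytes 0..7               */
    __int128 t __attribute__ ((aligned (8)));   /* offset 8, 8-byte aligned */
  } __attribute__ ((packed)) foo;

  void
  bar (foo *dst, foo *src)
  {
    dst->t = src->t;   /* TImode load/store at base + 8 */
  }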