Hi all, This ICE in this PR occurs when we're trying to split unaligned_loaddi into two SImode unaligned loads. The problem is in the addressing mode. When reload was picking the addressing mode we accepted an offset of -256 because the mode in the pattern is advertised as DImode and that was accepted by the legitimate address hooks because they thought it was a NEON load (DImode is in VALID_NEON_DREG_MODE). However, the splitter wants to generate two normal SImode unaligned loads using that address, for which -256 is not valid, so we ICE in gen_lowpart.
The only way unaligned_loaddi could be generated was through the gen_movmem_ldrd_strd expansion that implements a memmove using LDRD and STRD sequences. If the memmove source is not aligned we can't use LDRDs so the code generates unaligned_loaddi patterns and expects them to be split into two normal loads after reload. Similarly for unaligned store destinations. This patch just explicitly generates the two unaligned SImode loads or stores when appropriate inside gen_movmem_ldrd_strd. This makes the unaligned_loaddi and unaligned_storedi patterns unused, so we can remove them. This patch fixes the ICe in gcc.target/aarch64/advsimd-intrinsics/vldX.c seen with -mthumb -mcpu=cortex-a15 -mfpu=neon-vfpv4 -mfloat-abi=hard -mfp16-format=ieee so no new testcase is added. Bootstrapped and tested on arm-none-linux-gnueabihf. Ok for trunk? Thanks, Kyrill 2015-11-10 Kyrylo Tkachov <kyrylo.tkac...@arm.com> PR target/68149 * config/arm/arm.md (unaligned_loaddi): Delete. (unaligned_storedi): Likewise. * config/arm/arm.c (gen_movmem_ldrd_strd): Don't generate unaligned DImode memory ops. Instead perform two back-to-back unalgined SImode ops.
commit 51849126dbef9ebdd95e0ee4dbcd84361e22c992 Author: Kyrylo Tkachov <kyrylo.tkac...@arm.com> Date: Tue Nov 3 17:36:38 2015 +0000 [ARM] Fix ICE in unaligned_loaddi split diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c index 71e704c..eafcb9c 100644 --- a/gcc/config/arm/arm.c +++ b/gcc/config/arm/arm.c @@ -14911,21 +14911,41 @@ gen_movmem_ldrd_strd (rtx *operands) if (!(dst_aligned || src_aligned)) return arm_gen_movmemqi (operands); - src = adjust_address (src, DImode, 0); - dst = adjust_address (dst, DImode, 0); + /* If the either src or dst is unaligned we'll be accessing it as pairs + of unaligned SImode accesses. Otherwise we can generate DImode + ldrd/strd instructions. */ + src = adjust_address (src, src_aligned ? DImode : SImode, 0); + dst = adjust_address (dst, dst_aligned ? DImode : SImode, 0); + while (len >= 8) { len -= 8; reg0 = gen_reg_rtx (DImode); + rtx low_reg = NULL_RTX; + rtx hi_reg = NULL_RTX; + + if (!src_aligned || !dst_aligned) + { + low_reg = gen_lowpart (SImode, reg0); + hi_reg = gen_highpart_mode (SImode, DImode, reg0); + } if (src_aligned) emit_move_insn (reg0, src); else - emit_insn (gen_unaligned_loaddi (reg0, src)); + { + emit_insn (gen_unaligned_loadsi (low_reg, src)); + src = next_consecutive_mem (src); + emit_insn (gen_unaligned_loadsi (hi_reg, src)); + } if (dst_aligned) emit_move_insn (dst, reg0); else - emit_insn (gen_unaligned_storedi (dst, reg0)); + { + emit_insn (gen_unaligned_storesi (dst, low_reg)); + dst = next_consecutive_mem (dst); + emit_insn (gen_unaligned_storesi (dst, hi_reg)); + } src = next_consecutive_mem (src); dst = next_consecutive_mem (dst); diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md index 1e40b17..42f961f 100644 --- a/gcc/config/arm/arm.md +++ b/gcc/config/arm/arm.md @@ -4362,59 +4362,6 @@ (define_insn "unaligned_storehi" (set_attr "predicable_short_it" "yes,no") (set_attr "type" "store1")]) -;; Unaligned double-word load and store. -;; Split after reload into two unaligned single-word accesses. -;; It prevents lower_subreg from splitting some other aligned -;; double-word accesses too early. Used for internal memcpy. - -(define_insn_and_split "unaligned_loaddi" - [(set (match_operand:DI 0 "s_register_operand" "=l,r") - (unspec:DI [(match_operand:DI 1 "memory_operand" "o,o")] - UNSPEC_UNALIGNED_LOAD))] - "unaligned_access && TARGET_32BIT" - "#" - "&& reload_completed" - [(set (match_dup 0) (unspec:SI [(match_dup 1)] UNSPEC_UNALIGNED_LOAD)) - (set (match_dup 2) (unspec:SI [(match_dup 3)] UNSPEC_UNALIGNED_LOAD))] - { - operands[2] = gen_highpart (SImode, operands[0]); - operands[0] = gen_lowpart (SImode, operands[0]); - operands[3] = gen_highpart (SImode, operands[1]); - operands[1] = gen_lowpart (SImode, operands[1]); - - /* If the first destination register overlaps with the base address, - swap the order in which the loads are emitted. */ - if (reg_overlap_mentioned_p (operands[0], operands[1])) - { - std::swap (operands[1], operands[3]); - std::swap (operands[0], operands[2]); - } - } - [(set_attr "arch" "t2,any") - (set_attr "length" "4,8") - (set_attr "predicable" "yes") - (set_attr "type" "load2")]) - -(define_insn_and_split "unaligned_storedi" - [(set (match_operand:DI 0 "memory_operand" "=o,o") - (unspec:DI [(match_operand:DI 1 "s_register_operand" "l,r")] - UNSPEC_UNALIGNED_STORE))] - "unaligned_access && TARGET_32BIT" - "#" - "&& reload_completed" - [(set (match_dup 0) (unspec:SI [(match_dup 1)] UNSPEC_UNALIGNED_STORE)) - (set (match_dup 2) (unspec:SI [(match_dup 3)] UNSPEC_UNALIGNED_STORE))] - { - operands[2] = gen_highpart (SImode, operands[0]); - operands[0] = gen_lowpart (SImode, operands[0]); - operands[3] = gen_highpart (SImode, operands[1]); - operands[1] = gen_lowpart (SImode, operands[1]); - } - [(set_attr "arch" "t2,any") - (set_attr "length" "4,8") - (set_attr "predicable" "yes") - (set_attr "type" "store2")]) - (define_insn "*extv_reg" [(set (match_operand:SI 0 "s_register_operand" "=r")