[Bug target/66433] New: Arm NEON postincrement optimization missed

y.usishchev at samsung dot com Fri, 05 Jun 2015 08:02:19 -0700

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66433


            Bug ID: 66433
           Summary: Arm NEON postincrement optimization missed
           Product: gcc
           Version: 6.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: y.usishchev at samsung dot com
  Target Milestone: ---

Created attachment 35701
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=35701&action=edit
test with vld and vst

GCC trunk, configured with --target=armv7l-tizen-linux-gnueabi and run with
"-O2 -mfpu=neon" on the attached testcase, does not generate autoincrement
addressing for vld/vst instructions.

The auto-inc-dec pass misses the opportunity to optimize vld/vst instructions.
For code such as

for () { //some loop
  s0_32x4 = vld1q_u32(s);
  s1_32x4 = vld1q_u32(s+4);
  s+=8;
  ...
}

gcc generates

vld1.32 {d6-d7}, [r1]
add.w   r4, r1, #16
adds    r1, #32
vld1.32 {d28-d29}, [r4]

instead of

vld1.32 {d6-d7}, [r1]!
vld1.32 {d28-d29}, [r1]!

This appears to be caused by a wrong cost estimate: a vld1.32 instruction
without increment costs 4, but with increment it costs 16
(gcc/config/arm/arm.c:9415):

    case MEM:
      if (REG_P (XEXP (x, 0)))
        *cost = COSTS_N_INSNS (1);
      ...
      else
        *cost = COSTS_N_INSNS (ARM_NUM_REGS (mode));
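
As a rough illustration of where the 4 and 16 come from, the two branches
quoted above can be modelled in isolation. This is only a sketch: the macros
below are local stand-ins written from memory of GCC's definitions
(COSTS_N_INSNS (N) as N * 4, ARM_NUM_REGS as the mode size rounded up to
4-byte words), and mem_cost is a hypothetical helper, not a GCC function. The
branches elided by "..." in the excerpt are not modelled.

```c
#include <assert.h>

/* Local stand-ins for GCC macros, reproduced from memory (assumption). */
#define UNITS_PER_WORD 4                 /* bytes per core register on 32-bit ARM */
#define COSTS_N_INSNS(n) ((n) * 4)       /* cost units for n instructions */
#define ARM_NUM_INTS(bytes) \
  (((bytes) + UNITS_PER_WORD - 1) / UNITS_PER_WORD)

/* Hypothetical helper modelling only the two MEM branches quoted above.
   mode_size_bytes      - size of the accessed mode in bytes
   address_is_plain_reg - nonzero if the address is a bare register,
                          zero for any other form such as post-increment.  */
int mem_cost (int mode_size_bytes, int address_is_plain_reg)
{
  assert (mode_size_bytes > 0);

  if (address_is_plain_reg)
    return COSTS_N_INSNS (1);

  /* Fallback branch: one instruction per 4-byte word of the mode.  */
  return COSTS_N_INSNS (ARM_NUM_INTS (mode_size_bytes));
}
```

Under these assumptions, a 16-byte vector (the size loaded by vld1q_u32)
through a plain register address gives mem_cost (16, 1) == 4, while the
fallback branch gives mem_cost (16, 0) == 16, matching the costs reported
above.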
