On Fri, Jun 12, 2026 at 3:31 PM Ciprian Arbone via Gcc <[email protected]> wrote:
>
> Hello,
>
> We recently enabled LDMIA/STMIA instructions for Thumb-1 (Cortex-M0+) by
> modifying ARM_AUTOINC_VALID_FOR_MODE_P to allow auto-increment addressing
> for THUMB1 targets. However, we've discovered that IVOPTs generates
> suboptimal code for simple loops due to incorrect addressing mode selection.
>
> Consider this test case:
>
> void test(int *a, int *b, int size)
> {
>   for (int i = 0; i < size; i++)
>   {
>     a[i] = b[i] * a[i];
>   }
> }
>
> GCC currently generates:
>
>   ldmia r0!, {r4}
>   ldmia r1!, {r6}
>   subs r5, r0, #4
>   ...
>   str r4, [r5, #0]
>
> The issue occurs because IVOPTs selects a candidate with the lowest cost that
> has the following structure:
>
> Candidate xxx:
>   Incr POS: after use 0
>   IV struct:
>     Type: unsigned int
>     Base: (unsigned int) a_13(D)
>     Step: 4
>
> This results in the following loop structure:
>
> loop-preheader:
>   r0 = a
>   jump loop-exiting
>
> loop-header:
>   load-from [r0]
>   increment r0
>   store-to [r0, #-4]
>
> loop-exiting:
>   jump loop-header
>
> **Issue 1:** IVOPTs recognizes both patterns as valid post-increment with
> offset zero:
> - "load-from [r0]; increment r0" → recognized as post-inc from offset 0
> - "increment r0; store-to [r0, #-4]" → also recognized as post-inc from
>   offset 0
>
> The code in tree-ssa-loop-ivopts.cc:get_address_cost() applies the adjustment:
>
>   if (stmt_after_increment (data->current_loop, cand, use->stmt))
>     ainc_offset += ainc_step;
>   cost = get_address_cost_ainc (ainc_step, ainc_offset,
>                                 addr_mode, mem_mode, as, speed);
>
> However, Thumb-1 does not support negative immediate offsets in addressing
> modes. The pattern "increment r0; store-to [r0, #-4]" can never be realized
> as a post-increment store on Thumb-1, yet IVOPTs assigns it a low cost.
>
> **Question 1:** Should get_address_cost() verify that an addressing mode is
> actually valid on the target before assigning auto-increment cost? Currently,
> it appears to assume validity without checking target constraints.

It should end up calling the legitimate_address_p target hook to verify validity.

> **Issue 2:** IVOPTs also assigns low cost to another candidate:
>
> Candidate yyy:
>   Incr POS: before exit test
>   IV struct:
>     Type: unsigned int
>     Base: (unsigned int) a_13(D)
>     Step: 4
>
> This produces:
>
> loop-preheader:
>   r0 = &a[0]
>   jump loop-exiting
>
> loop-header:
>   load-from [r0, #-4]
>   store-to [r0]
>
> loop-exiting:
>   increment r0
>   jump loop-header
>
> IVOPTs considers that the increment in the loop-exiting block can be paired
> with "load-from [r0, #-4]" in the loop-header block, despite them being in
> different basic blocks.
>
> **Question 2:** Should get_address_cost() verify that the candidate increment
> and use->stmt are in the same basic block when cand->pos == IP_NORMAL?
> Cross-block pairing seems problematic for post-increment addressing mode
> costing.

If the pairing is wrong, it should be fixed. Where is that pairing done?

> Both issues suggest that IVOPTs may need additional validation to ensure:
> 1. The selected addressing mode is actually supported by the target
> 2. The increment and memory operation are properly co-located for IP_NORMAL
>    candidates
>
> Best regards,
> Ciprian Arbone
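
To make the two checks under discussion concrete, here is a minimal standalone
sketch in C. It is a toy model, not GCC code: every name in it
(toy_thumb1_word_offset_ok, toy_address_cost, the TOY_COST_* values) is
hypothetical, and the offset rule is an assumption modelled on Thumb-1
word-sized LDR/STR with an immediate (unsigned, multiple of 4, at most 124).
The idea is only that the cheap auto-increment cost is handed out when the use
sees the pre-increment value and sits in the same block as the increment, and
that any other form is first checked against what the target can encode.

```c
#include <stdbool.h>

/* Toy model only.  All names and cost values are hypothetical and are
   not taken from tree-ssa-loop-ivopts.cc.  */

enum
{
  TOY_COST_AINC = 0,        /* access folds into a post-increment */
  TOY_COST_REG_OFFSET = 1,  /* access uses [reg, #offset] directly */
  TOY_COST_FALLBACK = 4     /* address needs an extra instruction */
};

/* Stand-in for a target legitimacy check on [reg, #offset] for a
   word-sized access.  Assumption: Thumb-1 style rules, i.e. an unsigned
   immediate that is a multiple of 4 in the range 0..124.  */
static bool
toy_thumb1_word_offset_ok (long offset)
{
  return offset >= 0 && offset <= 124 && (offset % 4) == 0;
}

/* Cost one memory use of an induction variable with step STEP.
   USE_AFTER_INCREMENT says whether the use sees the already incremented
   register; SAME_BLOCK says whether the use and the increment share a
   basic block.  */
static int
toy_address_cost (long step, bool use_after_increment, bool same_block)
{
  /* Offset of the accessed address relative to the register value
     that is live at the use.  */
  long offset = use_after_increment ? -step : 0;

  /* Post-increment is only plausible when the access reads the
     pre-increment value and can be paired with the increment.  */
  if (!use_after_increment && same_block)
    return TOY_COST_AINC;

  /* Otherwise ask the (toy) target whether the offset form exists.  */
  if (toy_thumb1_word_offset_ok (offset))
    return TOY_COST_REG_OFFSET;

  return TOY_COST_FALLBACK;
}
```

In this model the "increment r0; store-to [r0, #-4]" use from the first
candidate is costed as a fallback rather than as a post-increment, which
matches the extra "subs r5, r0, #4" seen in the generated code.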
