On Thu, Feb 20, 2025 at 3:17 AM H.J. Lu <hjl.to...@gmail.com> wrote: > > On Thu, Feb 20, 2025 at 5:37 AM H.J. Lu <hjl.to...@gmail.com> wrote: > > > > On Wed, Feb 19, 2025 at 10:09 PM Uros Bizjak <ubiz...@gmail.com> wrote: > > > > > ... > > > > My algorithm keeps a list of registers which can access the stack > > > > starting with SP and FP. If any registers are derived from the list, > > > > add them to the list until all used registers are exhausted. If > > > > any stores use the register on the list, update the stack alignment. > > > > The missing part is that it doesn't check if the register is actually > > > > used for memory access. > > > > > > > > Here is the v2 patch to also check if the register is used for memory > > > > access. > > > > > > @@ -8473,16 +8482,15 @@ static void > > > ix86_update_stack_alignment (rtx, const_rtx pat, void *data) > > > { > > > /* This insn may reference stack slot. Update the maximum stack slot > > > - alignment. */ > > > + alignment if the memory is reference by the specified register. */ > > > > > > referenced > > > > Fixed. > > > > > + stack_access_data *p = (stack_access_data *) data; > > > subrtx_iterator::array_type array; > > > FOR_EACH_SUBRTX (iter, array, pat, ALL) > > > - if (MEM_P (*iter)) > > > + if (MEM_P (*iter) && reg_mentioned_p (p->reg, *iter)) > > > > > > @@ -8494,7 +8502,7 @@ ix86_find_all_reg_use_1 (rtx set, HARD_REG_SET > > > &stack_slot_access, > > > auto_bitmap &worklist) > > > { > > > rtx dest = SET_DEST (set); > > > - if (!REG_P (dest)) > > > + if (!GENERAL_REG_P (dest)) > > > return; > > > > > > The above change is not needed now that the register in the memory > > > reference is checked. The compiler can generate GPR->XMM->GPR > > > sequence. > > > > Fixed. > > > > > @@ -8630,8 +8638,8 @@ ix86_find_max_used_stack_alignment (unsigned int > > > &stack_alignment, > > > if (!NONJUMP_INSN_P (insn)) > > > continue; > > > > > > - note_stores (insn, ix86_update_stack_alignment, > > > - &stack_alignment); > > > + stack_access_data data = { DF_REF_REG (ref), &stack_alignment }; > > > + note_stores (insn, ix86_update_stack_alignment, &data); > > > > > > Do we need FOR_EACH_SUBRTX here instead of note_stores to also process > > > reads from stack? Reads should also be aligned. > > > > > > > Yes, we do since ix86_update_stack_alignment gets the RTL pattern, > > not operand: > > > > Breakpoint 1, ix86_update_stack_alignment (pat=0x7fffe966a108, > > data=0x7fffffffce00) > > at /export/gnu/import/git/gitlab/x86-gcc/gcc/config/i386/i386.cc:8486 > > 8486 stack_access_data *p = (stack_access_data *) data; > > (set (reg/f:DI 20 xmm0 [orig:126 _53 ] [126]) > > (reg/f:DI 0 ax [orig:126 _53 ] [126])) > > (gdb) > > > > FOR_EACH_SUBRTX is needed to check for the memory > > operand referenced by the stack access register. > > > > Here is the v3 patch. OK for master? > > > > -- > > H.J. > > Here is the v4 patch to move > > stack_access_data data = { nullptr, &stack_alignment }; > > out of the loop.
Let's revert the original patch that caused PR 118936 for gcc-15. It is too invasive for stage 4. We will retry for gcc-16. Uros.