Vladimir Makarov <vmaka...@redhat.com> writes: > On 1/11/25 1:15 PM, Denis Chertykov wrote: >> The fix for PR117868. >> >> In brief: >> this is an LRA bug derived from reuse spilling slots after frame >> pointer spilling. >> The slot was created for QImode (1 byte) and it was reused after >> spilling of the >> frame pointer for TImode register (16 bytes long) and it overlaps >> other slots. >> > Denis, thank you for working on the PR, finding the reason for wrong > code generation, and proposing the patch. >> >> Ok for trunk ? >> >> > I'd prefer something like the patch in the attachment. > > It is simpler and even removing more LRA code than adding one. > > But most important, it generates smaller reserved stack space as QI > and TI pseudos will share the same slot (of TImode) vs 2 slots QI and > TI in your fix.
Your patch seems wrong to me. The original code: --------------------------part of lra-spill.cc ----------------------- slots_num = 0; 1 assign_stack_slot_num_and_sort_pseudos (pseudo_regnos, n); 1 for (i = 0; i < n; i++) 1 if (pseudo_slots[pseudo_regnos[i]].mem == NULL_RTX) 1 assign_mem_slot (pseudo_regnos[i]); - if ((n2 = lra_update_fp2sp_elimination (pseudo_regnos)) > 0) { /* Assign stack slots to spilled pseudos assigned to fp. */ 2 assign_stack_slot_num_and_sort_pseudos (pseudo_regnos, n2); 2 for (i = 0; i < n2; i++) 2 if (pseudo_slots[pseudo_regnos[i]].mem == NULL_RTX) 2 assign_mem_slot (pseudo_regnos[i]); } ---------------------------------------------------------------------- The original code consists of two passes of the stack allocation process separated by call `lra_update_fp2sp_elimination()'. It's right because real allocation is made by `assign_mem_slot()' and the AVR port used frame size as a trigger for frame pointer elimination. (If we have a frame then we can't eliminate frame pointer) The first pass allocate stack for spilled registers after that LRA asks target about possibility of fp2sp elimination (call `lra_update_fp2sp_elimination()') and after that LRA allocates a memory slots for registers spilled in the process of spilling hard frame pointer. In your patch call of `lra_update_fp2sp_elimination()' happened before real stack allocation that will be done by `assign_mem_slot()' and information about fp2sp elimination will be wrong. Denis. > > I'll test my patch and submit it on the next week. > >> >> PR rtl-optimization/117868 >> gcc/ >> * lra-spills.cc (assign_stack_slot_num_and_sort_pseudos): Reuse slots >> only without allocated memory or only with equal or smaller registers >> with equal or smaller alignment. >> (lra_spill): Print slot size as width. >> >> >> diff --git a/gcc/lra-spills.cc b/gcc/lra-spills.cc >> index db78dcd28a3..93a0c92db9f 100644 >> --- a/gcc/lra-spills.cc >> +++ b/gcc/lra-spills.cc >> @@ -386,7 +386,18 @@ assign_stack_slot_num_and_sort_pseudos (int >> *pseudo_regnos, int n) >> && ! (lra_intersected_live_ranges_p >> (slots[j].live_ranges, >> lra_reg_info[regno].live_ranges))) >> - break; >> + { >> + /* A slot without allocated memory can be shared. */ >> + if (slots[j].mem == NULL_RTX) >> + break; >> + >> + /* A slot with allocated memory can be shared only with equal >> + or smaller register with equal or smaller alignment. */ >> + if (slots[j].align >= spill_slot_alignment (mode) >> + && compare_sizes_for_sort (slots[j].size, >> + GET_MODE_SIZE (mode)) != -1) >> + break; >> + } >> } >> if (j >= slots_num) >> { >> @@ -656,8 +667,7 @@ lra_spill (void) >> for (i = 0; i < slots_num; i++) >> { >> fprintf (lra_dump_file, " Slot %d regnos (width = ", i); >> - print_dec (GET_MODE_SIZE (GET_MODE (slots[i].mem)), >> - lra_dump_file, SIGNED); >> + print_dec (slots[i].size, lra_dump_file, SIGNED); >> fprintf (lra_dump_file, "):"); >> for (curr_regno = slots[i].regno;; >> curr_regno = pseudo_slots[curr_regno].next - pseudo_slots) >> > > diff --git a/gcc/lra-spills.cc b/gcc/lra-spills.cc > index db78dcd28a3..e0d88e86c18 100644 > --- a/gcc/lra-spills.cc > +++ b/gcc/lra-spills.cc > @@ -635,17 +635,12 @@ lra_spill (void) > n = assign_spill_hard_regs (pseudo_regnos, n); > slots_num = 0; > assign_stack_slot_num_and_sort_pseudos (pseudo_regnos, n); > - for (i = 0; i < n; i++) > + if ((n2 = lra_update_fp2sp_elimination (pseudo_regnos + n)) > 0) > + /* Assign stack slots to spilled pseudos assigned to fp. */ > + assign_stack_slot_num_and_sort_pseudos (pseudo_regnos + n, n2); > + for (i = 0; i < n + n2; i++) > if (pseudo_slots[pseudo_regnos[i]].mem == NULL_RTX) > assign_mem_slot (pseudo_regnos[i]); > - if ((n2 = lra_update_fp2sp_elimination (pseudo_regnos)) > 0) > - { > - /* Assign stack slots to spilled pseudos assigned to fp. */ > - assign_stack_slot_num_and_sort_pseudos (pseudo_regnos, n2); > - for (i = 0; i < n2; i++) > - if (pseudo_slots[pseudo_regnos[i]].mem == NULL_RTX) > - assign_mem_slot (pseudo_regnos[i]); > - } > if (n + n2 > 0 && crtl->stack_alignment_needed) > /* If we have a stack frame, we must align it now. The stack size > may be a part of the offset computation for register