On 1/11/25 1:15 PM, Denis Chertykov wrote:
The fix for PR117868.
In brief:
this is an LRA bug derived from reuse spilling slots after frame
pointer spilling.
The slot was created for QImode (1 byte) and it was reused after
spilling of the
frame pointer for TImode register (16 bytes long) and it overlaps
other slots.
Denis, thank you for working on the PR, finding the reason for wrong
code generation, and proposing the patch.
Ok for trunk ?
I'd prefer something like the patch in the attachment.
It is simpler and even removing more LRA code than adding one.
But most important, it generates smaller reserved stack space as QI and
TI pseudos will share the same slot (of TImode) vs 2 slots QI and TI in
your fix.
I'll test my patch and submit it on the next week.
PR rtl-optimization/117868
gcc/
* lra-spills.cc (assign_stack_slot_num_and_sort_pseudos): Reuse slots
only without allocated memory or only with equal or smaller registers
with equal or smaller alignment.
(lra_spill): Print slot size as width.
diff --git a/gcc/lra-spills.cc b/gcc/lra-spills.cc
index db78dcd28a3..93a0c92db9f 100644
--- a/gcc/lra-spills.cc
+++ b/gcc/lra-spills.cc
@@ -386,7 +386,18 @@ assign_stack_slot_num_and_sort_pseudos (int
*pseudo_regnos, int n)
&& ! (lra_intersected_live_ranges_p
(slots[j].live_ranges,
lra_reg_info[regno].live_ranges)))
- break;
+ {
+ /* A slot without allocated memory can be shared. */
+ if (slots[j].mem == NULL_RTX)
+ break;
+
+ /* A slot with allocated memory can be shared only with equal
+ or smaller register with equal or smaller alignment. */
+ if (slots[j].align >= spill_slot_alignment (mode)
+ && compare_sizes_for_sort (slots[j].size,
+ GET_MODE_SIZE (mode)) != -1)
+ break;
+ }
}
if (j >= slots_num)
{
@@ -656,8 +667,7 @@ lra_spill (void)
for (i = 0; i < slots_num; i++)
{
fprintf (lra_dump_file, " Slot %d regnos (width = ", i);
- print_dec (GET_MODE_SIZE (GET_MODE (slots[i].mem)),
- lra_dump_file, SIGNED);
+ print_dec (slots[i].size, lra_dump_file, SIGNED);
fprintf (lra_dump_file, "):");
for (curr_regno = slots[i].regno;;
curr_regno = pseudo_slots[curr_regno].next - pseudo_slots)
diff --git a/gcc/lra-spills.cc b/gcc/lra-spills.cc
index db78dcd28a3..e0d88e86c18 100644
--- a/gcc/lra-spills.cc
+++ b/gcc/lra-spills.cc
@@ -635,17 +635,12 @@ lra_spill (void)
n = assign_spill_hard_regs (pseudo_regnos, n);
slots_num = 0;
assign_stack_slot_num_and_sort_pseudos (pseudo_regnos, n);
- for (i = 0; i < n; i++)
+ if ((n2 = lra_update_fp2sp_elimination (pseudo_regnos + n)) > 0)
+ /* Assign stack slots to spilled pseudos assigned to fp. */
+ assign_stack_slot_num_and_sort_pseudos (pseudo_regnos + n, n2);
+ for (i = 0; i < n + n2; i++)
if (pseudo_slots[pseudo_regnos[i]].mem == NULL_RTX)
assign_mem_slot (pseudo_regnos[i]);
- if ((n2 = lra_update_fp2sp_elimination (pseudo_regnos)) > 0)
- {
- /* Assign stack slots to spilled pseudos assigned to fp. */
- assign_stack_slot_num_and_sort_pseudos (pseudo_regnos, n2);
- for (i = 0; i < n2; i++)
- if (pseudo_slots[pseudo_regnos[i]].mem == NULL_RTX)
- assign_mem_slot (pseudo_regnos[i]);
- }
if (n + n2 > 0 && crtl->stack_alignment_needed)
/* If we have a stack frame, we must align it now. The stack size
may be a part of the offset computation for register