On 1/18/25 1:19 AM, Denis Chertykov wrote:
Vladimir Makarov <vmaka...@redhat.com> writes:
I'd prefer something like the patch in the attachment.
It is simpler and even removing more LRA code than adding one.
But most important, it generates smaller reserved stack space as QI
and TI pseudos will share the same slot (of TImode) vs 2 slots QI and
TI in your fix.
Your patch seems wrong to me.
Yes, you are right. My patch is wrong. I did not managed to create a
simpler or better patch solving the problem.
So please go ahead and commit your original patch. Thank you.
The original code:
--------------------------part of lra-spill.cc -----------------------
slots_num = 0;
1 assign_stack_slot_num_and_sort_pseudos (pseudo_regnos, n);
1 for (i = 0; i < n; i++)
1 if (pseudo_slots[pseudo_regnos[i]].mem == NULL_RTX)
1 assign_mem_slot (pseudo_regnos[i]);
- if ((n2 = lra_update_fp2sp_elimination (pseudo_regnos)) > 0)
{
/* Assign stack slots to spilled pseudos assigned to fp. */
2 assign_stack_slot_num_and_sort_pseudos (pseudo_regnos, n2);
2 for (i = 0; i < n2; i++)
2 if (pseudo_slots[pseudo_regnos[i]].mem == NULL_RTX)
2 assign_mem_slot (pseudo_regnos[i]);
}
----------------------------------------------------------------------
The original code consists of two passes of the stack allocation process
separated
by call `lra_update_fp2sp_elimination()'.
It's right because real allocation is made by `assign_mem_slot()' and
the AVR port used frame size as a trigger for frame pointer elimination.
(If we have a frame then we can't eliminate frame pointer)
The first pass allocate stack for spilled registers
after that LRA asks target about possibility of fp2sp elimination
(call `lra_update_fp2sp_elimination()')
and after that LRA allocates a memory slots for registers spilled in the
process of
spilling hard frame pointer.
In your patch call of `lra_update_fp2sp_elimination()' happened before
real stack allocation that will be done by `assign_mem_slot()' and information
about fp2sp elimination will be wrong.
I'll test my patch and submit it on the next week.
PR rtl-optimization/117868
gcc/
* lra-spills.cc (assign_stack_slot_num_and_sort_pseudos): Reuse slots
only without allocated memory or only with equal or smaller registers
with equal or smaller alignment.
(lra_spill): Print slot size as width.
diff --git a/gcc/lra-spills.cc b/gcc/lra-spills.cc
index db78dcd28a3..93a0c92db9f 100644
--- a/gcc/lra-spills.cc
+++ b/gcc/lra-spills.cc
@@ -386,7 +386,18 @@ assign_stack_slot_num_and_sort_pseudos (int
*pseudo_regnos, int n)
&& ! (lra_intersected_live_ranges_p
(slots[j].live_ranges,
lra_reg_info[regno].live_ranges)))
- break;
+ {
+ /* A slot without allocated memory can be shared. */
+ if (slots[j].mem == NULL_RTX)
+ break;
+
+ /* A slot with allocated memory can be shared only with equal
+ or smaller register with equal or smaller alignment. */
+ if (slots[j].align >= spill_slot_alignment (mode)
+ && compare_sizes_for_sort (slots[j].size,
+ GET_MODE_SIZE (mode)) != -1)
+ break;
+ }
}
if (j >= slots_num)
{
@@ -656,8 +667,7 @@ lra_spill (void)
for (i = 0; i < slots_num; i++)
{
fprintf (lra_dump_file, " Slot %d regnos (width = ", i);
- print_dec (GET_MODE_SIZE (GET_MODE (slots[i].mem)),
- lra_dump_file, SIGNED);
+ print_dec (slots[i].size, lra_dump_file, SIGNED);
fprintf (lra_dump_file, "):");
for (curr_regno = slots[i].regno;;
curr_regno = pseudo_slots[curr_regno].next - pseudo_slots)
diff --git a/gcc/lra-spills.cc b/gcc/lra-spills.cc
index db78dcd28a3..e0d88e86c18 100644
--- a/gcc/lra-spills.cc
+++ b/gcc/lra-spills.cc
@@ -635,17 +635,12 @@ lra_spill (void)
n = assign_spill_hard_regs (pseudo_regnos, n);
slots_num = 0;
assign_stack_slot_num_and_sort_pseudos (pseudo_regnos, n);
- for (i = 0; i < n; i++)
+ if ((n2 = lra_update_fp2sp_elimination (pseudo_regnos + n)) > 0)
+ /* Assign stack slots to spilled pseudos assigned to fp. */
+ assign_stack_slot_num_and_sort_pseudos (pseudo_regnos + n, n2);
+ for (i = 0; i < n + n2; i++)
if (pseudo_slots[pseudo_regnos[i]].mem == NULL_RTX)
assign_mem_slot (pseudo_regnos[i]);
- if ((n2 = lra_update_fp2sp_elimination (pseudo_regnos)) > 0)
- {
- /* Assign stack slots to spilled pseudos assigned to fp. */
- assign_stack_slot_num_and_sort_pseudos (pseudo_regnos, n2);
- for (i = 0; i < n2; i++)
- if (pseudo_slots[pseudo_regnos[i]].mem == NULL_RTX)
- assign_mem_slot (pseudo_regnos[i]);
- }
if (n + n2 > 0 && crtl->stack_alignment_needed)
/* If we have a stack frame, we must align it now. The stack size
may be a part of the offset computation for register