Vladimir Makarov <vmaka...@redhat.com> writes:

> On 1/11/25 1:15 PM, Denis Chertykov wrote:
>> The fix for PR117868.
>>
>> In brief:
>> this is an LRA bug derived from reuse spilling slots after frame
>> pointer spilling.
>> The slot was created for QImode (1 byte) and it was reused after
>> spilling of the
>> frame pointer for TImode register (16 bytes long) and it overlaps
>> other slots.
>>
> Denis, thank you for working on the PR, finding the reason for wrong
> code generation, and proposing the patch.
>>
>> Ok for trunk ?
>>
>>
> I'd prefer something like the patch in the attachment.
>
> It is simpler and even removing more LRA code than adding one.
>
> But most important, it generates smaller reserved stack space as QI
> and TI pseudos will share the same slot (of TImode) vs 2 slots QI and
> TI in your fix.

Your patch seems wrong to me.

The original code:
--------------------------part of lra-spill.cc -----------------------
  slots_num = 0;
1  assign_stack_slot_num_and_sort_pseudos (pseudo_regnos, n);
1  for (i = 0; i < n; i++)
1    if (pseudo_slots[pseudo_regnos[i]].mem == NULL_RTX)
1      assign_mem_slot (pseudo_regnos[i]);
-  if ((n2 = lra_update_fp2sp_elimination (pseudo_regnos)) > 0)
    {
      /* Assign stack slots to spilled pseudos assigned to fp.  */
2      assign_stack_slot_num_and_sort_pseudos (pseudo_regnos, n2);
2      for (i = 0; i < n2; i++)
2       if (pseudo_slots[pseudo_regnos[i]].mem == NULL_RTX)
2         assign_mem_slot (pseudo_regnos[i]);
    }
----------------------------------------------------------------------

The original code consists of two passes of the stack allocation process 
separated
by call `lra_update_fp2sp_elimination()'.
It's right because real allocation is made by `assign_mem_slot()' and
the AVR port used frame size as a trigger for frame pointer elimination.
(If we have a frame then we can't eliminate frame pointer)

The first pass allocate stack for spilled registers
after that LRA asks target about possibility of fp2sp elimination
(call `lra_update_fp2sp_elimination()')
and after that LRA allocates a memory slots for registers spilled in the 
process of
spilling hard frame pointer.

In your patch call of `lra_update_fp2sp_elimination()' happened before
real stack allocation that will be done by `assign_mem_slot()' and information
about fp2sp elimination will be wrong.

Denis.


>
> I'll test my patch and submit it on the next week.
>
>>
>>     PR rtl-optimization/117868
>> gcc/
>>     * lra-spills.cc (assign_stack_slot_num_and_sort_pseudos): Reuse slots
>>     only without allocated memory or only with equal or smaller registers
>>     with equal or smaller alignment.
>>     (lra_spill): Print slot size as width.
>>
>>
>> diff --git a/gcc/lra-spills.cc b/gcc/lra-spills.cc
>> index db78dcd28a3..93a0c92db9f 100644
>> --- a/gcc/lra-spills.cc
>> +++ b/gcc/lra-spills.cc
>> @@ -386,7 +386,18 @@ assign_stack_slot_num_and_sort_pseudos (int
>> *pseudo_regnos, int n)
>>          && ! (lra_intersected_live_ranges_p
>>                (slots[j].live_ranges,
>>                 lra_reg_info[regno].live_ranges)))
>> -          break;
>> +          {
>> +        /* A slot without allocated memory can be shared.  */
>> +        if (slots[j].mem == NULL_RTX)
>> +          break;
>> +
>> +        /* A slot with allocated memory can be shared only with equal
>> +           or smaller register with equal or smaller alignment. */
>> +        if (slots[j].align >= spill_slot_alignment (mode)
>> +            && compare_sizes_for_sort (slots[j].size,
>> +                           GET_MODE_SIZE (mode)) != -1)
>> +          break;
>> +          }
>>      }
>>        if (j >= slots_num)
>>      {
>> @@ -656,8 +667,7 @@ lra_spill (void)
>>        for (i = 0; i < slots_num; i++)
>>      {
>>        fprintf (lra_dump_file, "  Slot %d regnos (width = ", i);
>> -      print_dec (GET_MODE_SIZE (GET_MODE (slots[i].mem)),
>> -             lra_dump_file, SIGNED);
>> +      print_dec (slots[i].size, lra_dump_file, SIGNED);
>>        fprintf (lra_dump_file, "):");
>>        for (curr_regno = slots[i].regno;;
>>             curr_regno = pseudo_slots[curr_regno].next - pseudo_slots)
>>
>
> diff --git a/gcc/lra-spills.cc b/gcc/lra-spills.cc
> index db78dcd28a3..e0d88e86c18 100644
> --- a/gcc/lra-spills.cc
> +++ b/gcc/lra-spills.cc
> @@ -635,17 +635,12 @@ lra_spill (void)
>    n = assign_spill_hard_regs (pseudo_regnos, n);
>    slots_num = 0;
>    assign_stack_slot_num_and_sort_pseudos (pseudo_regnos, n);
> -  for (i = 0; i < n; i++)
> +  if ((n2 = lra_update_fp2sp_elimination (pseudo_regnos + n)) > 0)
> +    /* Assign stack slots to spilled pseudos assigned to fp.  */
> +    assign_stack_slot_num_and_sort_pseudos (pseudo_regnos + n, n2);
> +  for (i = 0; i < n + n2; i++)
>      if (pseudo_slots[pseudo_regnos[i]].mem == NULL_RTX)
>        assign_mem_slot (pseudo_regnos[i]);
> -  if ((n2 = lra_update_fp2sp_elimination (pseudo_regnos)) > 0)
> -    {
> -      /* Assign stack slots to spilled pseudos assigned to fp.  */
> -      assign_stack_slot_num_and_sort_pseudos (pseudo_regnos, n2);
> -      for (i = 0; i < n2; i++)
> -     if (pseudo_slots[pseudo_regnos[i]].mem == NULL_RTX)
> -       assign_mem_slot (pseudo_regnos[i]);
> -    }
>    if (n + n2 > 0 && crtl->stack_alignment_needed)
>      /* If we have a stack frame, we must align it now.  The stack size
>         may be a part of the offset computation for register

Reply via email to