[cc-ing Eric as RTL maintainer]

Matthew Fortune <matthew.fort...@imgtec.com> writes:
> Bernd Edlinger <bernd.edlin...@hotmail.de> writes:
>> Matthew Fortune <matthew.fort...@imgtec.com> writes:
>> > Has the patch been tested beyond just building GCC? I can do a
>> > test run for you if you don't have things set up to do one yourself.
>> 
>> I built a cross-gcc with all languages and a cross-glibc, but I have
>> not set up an emulation environment, so if you could give it a test
>> that would be highly welcome.
>
> mipsel-linux-gnu test results are the same before and after this patch.
>
> Please go ahead and commit.

I still object to this.  And it feels like the patch was posted
as though it was a new one in order to avoid answering the objections
that were raised when it was last posted:

  https://gcc.gnu.org/ml/gcc-patches/2015-12/msg02218.html

IMO the problem is that rtx_addr_can_trap_p_1 duplicates a large
bit of LRA/reload logic:

------------------------------------------------------------------------
/* Compute an approximation for the offset between the register
   FROM and TO for the current function, as it was at the start
   of the routine.  */

static HOST_WIDE_INT
get_initial_register_offset (int from, int to)
{
#ifdef ELIMINABLE_REGS
  static const struct elim_table_t
  {
    const int from;
    const int to;
  } table[] = ELIMINABLE_REGS;
  HOST_WIDE_INT offset1, offset2;
  unsigned int i, j;

  if (to == from)
    return 0;

  /* It is not safe to call INITIAL_ELIMINATION_OFFSET
     before the reload pass.  We need to give at least
     an estimation for the resulting frame size.  */
  if (! reload_completed)
    {
      offset1 = crtl->outgoing_args_size + get_frame_size ();
#if !STACK_GROWS_DOWNWARD
      offset1 = - offset1;
#endif
      if (to == STACK_POINTER_REGNUM)
        return offset1;
      else if (from == STACK_POINTER_REGNUM)
        return - offset1;
      else
        return 0;
     }

  for (i = 0; i < ARRAY_SIZE (table); i++)
      if (table[i].from == from)
        {
          if (table[i].to == to)
            {
              INITIAL_ELIMINATION_OFFSET (table[i].from, table[i].to,
                                          offset1);
              return offset1;
            }
          for (j = 0; j < ARRAY_SIZE (table); j++)
            {
              if (table[j].to == to
                  && table[j].from == table[i].to)
                {
                  INITIAL_ELIMINATION_OFFSET (table[i].from, table[i].to,
                                              offset1);
                  INITIAL_ELIMINATION_OFFSET (table[j].from, table[j].to,
                                              offset2);
                  return offset1 + offset2;
                }
              if (table[j].from == to
                  && table[j].to == table[i].to)
                {
                  INITIAL_ELIMINATION_OFFSET (table[i].from, table[i].to,
                                              offset1);
                  INITIAL_ELIMINATION_OFFSET (table[j].from, table[j].to,
                                              offset2);
                  return offset1 - offset2;
                }
            }
        }
      else if (table[i].to == from)
        {
          if (table[i].from == to)
            {
              INITIAL_ELIMINATION_OFFSET (table[i].from, table[i].to,
                                          offset1);
              return - offset1;
            }
          for (j = 0; j < ARRAY_SIZE (table); j++)
            {
              if (table[j].to == to
                  && table[j].from == table[i].from)
                {
                  INITIAL_ELIMINATION_OFFSET (table[i].from, table[i].to,
                                              offset1);
                  INITIAL_ELIMINATION_OFFSET (table[j].from, table[j].to,
                                              offset2);
                  return - offset1 + offset2;
                }
              if (table[j].from == to
                  && table[j].to == table[i].from)
                {
                  INITIAL_ELIMINATION_OFFSET (table[i].from, table[i].to,
                                              offset1);
                  INITIAL_ELIMINATION_OFFSET (table[j].from, table[j].to,
                                              offset2);
                  return - offset1 - offset2;
                }
            }
        }

  /* If the requested register combination was not found,
     try a different more simple combination.  */
  if (from == ARG_POINTER_REGNUM)
    return get_initial_register_offset (HARD_FRAME_POINTER_REGNUM, to);
  else if (to == ARG_POINTER_REGNUM)
    return get_initial_register_offset (from, HARD_FRAME_POINTER_REGNUM);
  else if (from == HARD_FRAME_POINTER_REGNUM)
    return get_initial_register_offset (FRAME_POINTER_REGNUM, to);
  else if (to == HARD_FRAME_POINTER_REGNUM)
    return get_initial_register_offset (from, FRAME_POINTER_REGNUM);
  else
    return 0;

#else
  HOST_WIDE_INT offset;

  if (to == from)
    return 0;

  if (reload_completed)
    {
      INITIAL_FRAME_POINTER_OFFSET (offset);
    }
  else
    {
      offset = crtl->outgoing_args_size + get_frame_size ();
#if !STACK_GROWS_DOWNWARD
      offset = - offset;
#endif
    }

  if (to == STACK_POINTER_REGNUM)
    return offset;
  else if (from == STACK_POINTER_REGNUM)
    return - offset;
  else
    return 0;

#endif
}
------------------------------------------------------------------------

Under the current interface macros like INITIAL_ELIMINATION_OFFSET
are expected to trigger the calculation of the target's frame layout
(since you need that information to answer the question).
To me it seems wrong that we're attempting to call that sort of
macro in a query routine like rtx_addr_can_trap_p_1.

The frame layout is fixed for each LRA/reload cycle.  The layout
for the last of those cycles carries forward through the rest
of compilation.

IMO we should cache the information we need at the start of each
LRA/reload cycle.  This will be more robust, because there will
be no accidental changes to global state either during or after
LRA/reload.  It will also be more efficient because
rtx_addr_can_trap_p_1 can read the cached variables rather
than calling back into the target.

Thanks,
Richard

Reply via email to