On Fri, Dec 20, 2013 at 8:06 AM, Jakub Jelinek <ja...@redhat.com> wrote:
> Hi!
>
> Honza recently changed the i?86 backend, so that it often doesn't
> do -maccumulate-outgoing-args by default on x86_64.
> Unfortunately, on some of the here included testcases this regressed
> quite a bit the generated code.  As AVX vectors are used, the dynamic
> realignment code needs to assume e.g. that some of them will need to be
> spilled, and for -mno-accumulate-outgoing-args the code needs to set
> need_drap early as well.  But in when emitting the prologue/epilogue,
> if need_drap is set, we don't perform the optimization for leaf functions
> which have zero size stack frame, thus we end up with uselessly doing
> dynamic stack realignment, setting up DRAP that nothing uses and later on
> restore everything back.
>
> This patch improves it, if the DRAP register isn't live at the start of
> entry bb successor and we aren't going to realign the stack, we don't
> need DRAP at all, and even if we need DRAP register, that can't be the sole
> reason for doing stack realignment, the prologue code is able to set up DRAP
> even without dynamic stack realignment.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
> 2013-12-20  Jakub Jelinek  <ja...@redhat.com>
>
>         PR target/59501
>         * config/i386/i386.c (ix86_save_reg): Don't return true for drap_reg
>         if !crtl->stack_realign_needed.
>         (ix86_finalize_stack_realign_flags): If drap_reg isn't live on entry
>         and stack_realign_needed will be false, clear drap_reg and need_drap.
>         Optimize leaf functions that don't need stack frame even if
>         crtl->need_drap.
>
>         * gcc.target/i386/pr59501-1.c: New test.
>         * gcc.target/i386/pr59501-1a.c: New test.
>         * gcc.target/i386/pr59501-2.c: New test.
>         * gcc.target/i386/pr59501-2a.c: New test.
>         * gcc.target/i386/pr59501-3.c: New test.
>         * gcc.target/i386/pr59501-3a.c: New test.
>         * gcc.target/i386/pr59501-4.c: New test.
>         * gcc.target/i386/pr59501-4a.c: New test.
>         * gcc.target/i386/pr59501-5.c: New test.
>         * gcc.target/i386/pr59501-6.c: New test.
>
> --- gcc/config/i386/i386.c.jj   2013-12-19 13:35:23.000000000 +0100
> +++ gcc/config/i386/i386.c      2013-12-20 11:44:14.389310804 +0100
> @@ -9235,7 +9235,9 @@ ix86_save_reg (unsigned int regno, bool
>         }
>      }
>
> -  if (crtl->drap_reg && regno == REGNO (crtl->drap_reg))
> +  if (crtl->drap_reg
> +      && regno == REGNO (crtl->drap_reg)
> +      && crtl->stack_realign_needed)
>      return true;
>
>    return (df_regs_ever_live_p (regno)
> @@ -10473,12 +10475,23 @@ ix86_finalize_stack_realign_flags (void)
>        return;
>      }
>
> +  /* If drap has been set, but it actually isn't live at the start
> +     of the function and !stack_realign, there is no reason to set it up.  */
> +  if (crtl->drap_reg && !stack_realign)
> +    {
> +      basic_block bb = ENTRY_BLOCK_PTR_FOR_FN (cfun)->next_bb;
> +      if (! REGNO_REG_SET_P (DF_LR_IN (bb), REGNO (crtl->drap_reg)))
> +       {
> +         crtl->drap_reg = NULL_RTX;
> +         crtl->need_drap = false;
> +       }
> +    }
> +
>    /* If the only reason for frame_pointer_needed is that we conservatively
>       assumed stack realignment might be needed, but in the end nothing that
>       needed the stack alignment had been spilled, clear frame_pointer_needed
>       and say we don't need stack realignment.  */
>    if (stack_realign
> -      && !crtl->need_drap
>        && frame_pointer_needed
>        && crtl->is_leaf
>        && flag_omit_frame_pointer
> @@ -10516,6 +10529,18 @@ ix86_finalize_stack_realign_flags (void)
>               }
>         }
>
> +      /* If drap has been set, but it actually isn't live at the start
> +        of the function, there is no reason to set it up.  */
> +      if (crtl->drap_reg)
> +       {
> +         basic_block bb = ENTRY_BLOCK_PTR_FOR_FN (cfun)->next_bb;
> +         if (! REGNO_REG_SET_P (DF_LR_IN (bb), REGNO (crtl->drap_reg)))
> +           {
> +             crtl->drap_reg = NULL_RTX;
> +             crtl->need_drap = false;
> +           }
> +       }
> +
>        frame_pointer_needed = false;
>        stack_realign = false;
>        crtl->max_used_stack_slot_alignment = incoming_stack_boundary;

Looks good to me.  But I can't approve it.

Thanks.

-- 
H.J.

Reply via email to