Sorry for the slow review.

Matthieu Longo <matthieu.lo...@arm.com> writes:
> This patch is only a refactoring of the existing implementation
> of PAuth and returned-address signing. The existing behavior is
> preserved.
>
> _Unwind_FrameState already contains several CIE and FDE information
> (see the attributes below the comment "The information we care
> about from the CIE/FDE" in libgcc/unwind-dw2.h).
> The patch aims at moving the information from DWARF CIE (signing
> key stored in the augmentation string) and FDE (the used signing
> method) into _Unwind_FrameState along the already-stored CIE and
> FDE information.
> Note: those information have to be saved in frame_state_reg_info
> instead of _Unwind_FrameState as they need to be savable by
> DW_CFA_remember_state and restorable by DW_CFA_restore_state, that
> both rely on the attribute "prev".
>
> Those new information in _Unwind_FrameState simplifies the look-up
> of the signing key when the return address is demangled. It also
> allows future signing methods to be easily added.
>
> _Unwind_FrameState is not a part of the public API of libunwind,
> so the change is backward compatible.
>
> A new architecture-specific handler MD_ARCH_EXTENSION_FRAME_INIT
> allows to reset values (if needed) in the frame state and unwind
> context before changing the frame state to the caller context.
>
> A new architecture-specific handler MD_ARCH_EXTENSION_CIE_AUG_HANDLER
> isolates the architecture-specific augmentation strings in AArch64
> backend, and allows others architectures to reuse augmentation
> strings that would have clashed with AArch64 DWARF extensions.
>
> aarch64_demangle_return_addr, DW_CFA_AARCH64_negate_ra_state and
> DW_CFA_val_expression cases in libgcc/unwind-dw2-execute_cfa.h
> were documented to clarify where the value of the RA state register
> is stored (FS and CONTEXT respectively).

The abstraction generally looks good.  My main comment is that,
if we are putting the new behaviour behind target macros (which I
agree is a good idea), could we also have a target macro to abstract:

> +#if defined(__aarch64__) && !defined (__ILP32__)
> +    unsigned char signing_key;
> +#endif

?  E.g. something like:

#ifdef MD_CFI_STATE
    MD_CFI_STATE md;
#endif

with aarch64 defining:

#define MD_CFI_STATE struct { unsigned char signing_key; }

(Names and organisation are just suggestions.)

It might be good to try running the patch through

  contrig/check_GNU_style.py

since the GCC coding standards are quite picky about formatting and
stylistic issues.  The main ones I spotted were:

- In files that already use block comments, all comments should be
  block comments rather than // comments.  The comments should end
  with ".  " (full stop and two spaces).

- Block comments are formatted as:

    /* Line 1
       Line 2.  */

  rather than as:

    /* Line 1
     * Line 2.  */

- Function names are generally all lowrecase (e.g. "ra" rather than "RA").

- In function calls, there should be a space between the function and
  the opening "("

- For pointer types, there should be a space before a "*" (or string of
  "*"s), but no space afterwards.

- "const" qualifiers generally go before the type that they qualify,
  rather than afterwards.

- The line width should be 80 characters or fewer.  (The patch was pretty
  good about this, but there were a couple of long lines).

- In multi-line conditions, the ||s and &&s go at the beginning of lines,
  rather than at the end.  (Same for infix operators in general.)

More detailed comments below.

>
> libgcc/ChangeLog:
>
>   * config/aarch64/aarch64-unwind.h
>   (AARCH64_DWARF_RA_STATE_MASK): The mask for RA state register.
>   (aarch64_RA_signing_method_t): The diversifiers used to sign a
>   function's return address.
>   (aarch64_pointer_auth_key): The key used to sign a function's
>   return address.
>   (aarch64_cie_signed_with_b_key): Deleted as the signing key is
>   available now in _Unwind_FrameState.
>   (MD_ARCH_EXTENSION_CIE_AUG_HANDLER): New CIE augmentation string
>   handler for architecture extensions.
>   (MD_ARCH_EXTENSION_FRAME_INIT): New architecture-extension
>   initialization routine for DWARF frame state and context before
>   execution of DWARF instructions.
>   (aarch64_context_RA_state_get): Read RA state register from CONTEXT.
>   (aarch64_RA_state_get): Read RA state register from FS.
>   (aarch64_RA_state_set): Write RA state register into FS.
>   (aarch64_RA_state_toggle): Toggle RA state register in FS.
>   (aarch64_cie_aug_handler): Handler AArch64 augmentation strings.
>   (aarch64_arch_extension_frame_init): Initialize defaults for the
>   signing key (PAUTH_KEY_A), and RA state register (RA_no_signing).
>   (aarch64_demangle_return_addr): Rely on the frame registers and
>   the signing_key attribute in _Unwind_FrameState.
>   * unwind-dw2-execute_cfa.h:
>   Use the right alias DW_CFA_AARCH64_negate_ra_state for __aarch64__
>   instead of DW_CFA_GNU_window_save.
>   (DW_CFA_AARCH64_negate_ra_state): Save the signing method in RA
>   state register. Toggle RA state register without resetting 'how'
>   to REG_UNSAVED.
>   * unwind-dw2.c:
>   (extract_cie_info): Save the signing key in the current
>   _Unwind_FrameState while parsing the augmentation data.
>   (uw_frame_state_for): Reset some attributes related to architecture
>   extensions in _Unwind_FrameState.
>   (uw_update_context): Move authentication code to AArch64 unwinding.
>   * unwind-dw2.h (enum register_rule): Give a name to the existing
>   enum for the register rules, and replace 'unsigned char' by 'enum
>   register_rule' to facilitate debugging in GDB.
>   (_Unwind_FrameState): Add a new architecture-extension attribute
>   to store the signing key.
> ---
>  libgcc/config/aarch64/aarch64-unwind.h | 154 ++++++++++++++++++++-----
>  libgcc/unwind-dw2-execute_cfa.h        |  34 ++++--
>  libgcc/unwind-dw2.c                    |  19 ++-
>  libgcc/unwind-dw2.h                    |  17 ++-
>  4 files changed, 175 insertions(+), 49 deletions(-)
>
> diff --git a/libgcc/config/aarch64/aarch64-unwind.h 
> b/libgcc/config/aarch64/aarch64-unwind.h
> index daf96624b5e..cc225a7e207 100644
> --- a/libgcc/config/aarch64/aarch64-unwind.h
> +++ b/libgcc/config/aarch64/aarch64-unwind.h
> @@ -25,55 +25,155 @@ see the files COPYING3 and COPYING.RUNTIME respectively. 
>  If not, see
>  #if !defined (AARCH64_UNWIND_H) && !defined (__ILP32__)
>  #define AARCH64_UNWIND_H
>  
> -#define DWARF_REGNUM_AARCH64_RA_STATE 34
> +#include "ansidecl.h"
> +#include <stdbool.h>
> +
> +#define AARCH64_DWARF_REGNUM_RA_STATE 34
> +#define AARCH64_DWARF_RA_STATE_MASK   0x1
> +
> +/* The diversifiers used to sign a function's return address. */
> +typedef enum
> +{
> +  AARCH64_RA_no_signing = 0x0,
> +  AARCH64_RA_signing_SP = 0x1,
> +} __attribute__((packed)) aarch64_RA_signing_method_t;
> +
> +/* The key used to sign a function's return address. */
> +typedef enum {
> +  AARCH64_PAUTH_KEY_A,
> +  AARCH64_PAUTH_KEY_B,
> +} __attribute__((packed)) aarch64_pointer_auth_key;
> +
> +#define MD_ARCH_EXTENSION_CIE_AUG_HANDLER(fs, aug) \
> +  aarch64_cie_aug_handler(fs, aug)
> +
> +#define MD_ARCH_EXTENSION_FRAME_INIT(context, fs) \
> +  aarch64_arch_extension_frame_init(context, fs)
>  
>  #define MD_DEMANGLE_RETURN_ADDR(context, fs, addr) \
>    aarch64_demangle_return_addr (context, fs, addr)
>  
> -static inline int
> -aarch64_cie_signed_with_b_key (struct _Unwind_Context *context)
> +static inline aarch64_RA_signing_method_t
> +aarch64_context_RA_state_get(struct _Unwind_Context *context)
> +{
> +  const int index = AARCH64_DWARF_REGNUM_RA_STATE;
> +  return _Unwind_GetGR (context, index) & AARCH64_DWARF_RA_STATE_MASK;
> +}
> +
> +static inline aarch64_RA_signing_method_t
> +aarch64_fs_RA_state_get(_Unwind_FrameState const* fs)
>  {
> -  const struct dwarf_fde *fde = _Unwind_Find_FDE (context->bases.func,
> -                                               &context->bases);
> -  if (fde != NULL)
> +  const int index = AARCH64_DWARF_REGNUM_RA_STATE;
> +  return fs->regs.reg[index].loc.offset & AARCH64_DWARF_RA_STATE_MASK;
> +}
> +
> +static inline void
> +aarch64_fs_RA_state_set(_Unwind_FrameState *fs,
> +                     aarch64_RA_signing_method_t signing_method)
> +{
> +  fs->regs.reg[AARCH64_DWARF_REGNUM_RA_STATE].loc.offset = signing_method;
> +}
> +
> +static inline void
> +aarch64_fs_RA_state_toggle(_Unwind_FrameState *fs)
> +{
> +  aarch64_RA_signing_method_t signing_method = aarch64_fs_RA_state_get(fs);
> +  gcc_assert (signing_method == AARCH64_RA_no_signing ||
> +           signing_method == AARCH64_RA_signing_SP);
> +  aarch64_fs_RA_state_set(fs, (signing_method == AARCH64_RA_no_signing)
> +                       ? AARCH64_RA_signing_SP
> +                       : AARCH64_RA_no_signing);
> +}
> +
> +/* CIE handler for custom augmentation string.  */
> +static inline bool
> +aarch64_cie_aug_handler(_Unwind_FrameState *fs, unsigned char aug)
> +{
> +  // AArch64 B-key pointer authentication.
> +  if (aug == 'B')
>      {
> -      const struct dwarf_cie *cie = get_cie (fde);
> -      if (cie != NULL)
> -     {
> -       const unsigned char *aug_str = cie->augmentation;
> -       return __builtin_strchr ((const char *) aug_str,
> -                                'B') == NULL ? 0 : 1;
> -     }
> +      fs->regs.signing_key = AARCH64_PAUTH_KEY_B;
> +      return true;
>      }
> -  return 0;
> +  return false;
>  }
>  
> -/* Do AArch64 private extraction on ADDR_WORD based on context info CONTEXT 
> and
> -   unwind frame info FS.  If ADDR_WORD is signed, we do address 
> authentication
> -   on it using CFA of current frame.  */
> +/* At the entrance of a new frame, some cached information from the CIE/FDE,
> + * and registers values related to architectural extensions require a default
> + * initialization.
> + * If any of those values related to architecture extensions had to be saved
> + * for the next frame, it should be done via the architecture extensions 
> handler
> + * MD_FROB_UPDATE_CONTEXT in uw_update_context_1 (libgcc/unwind-dw2.c).  */
> +static inline void
> +aarch64_arch_extension_frame_init (struct _Unwind_Context *context 
> ATTRIBUTE_UNUSED,
> +                                _Unwind_FrameState *fs)
> +{
> +  // Reset previously cached CIE/FDE information.
> +  fs->regs.signing_key = AARCH64_PAUTH_KEY_A;

How about making the comment:

  /* By default, DW_CFA_AARCH64_negate_ra_state assumes key A is being used
     for signing.  This can be overridden by adding 'B' to the augmentation
     string.  */

(if you agree that's a reasonable summary).

> +
> +  // Reset some registers.
> +  // Note: the associated fs->how table is automatically reset to 
> REG_UNSAVED in
> +  // uw_frame_state_for (libgcc/unwind-dw2.c). REG_UNSAVED means that 
> whatever
> +  // was in the previous context is the current register value. In other 
> words,
> +  // the next context inherits by default the previous value for a register.
> +  // To keep this logic coherent with some architecture extensions, we need 
> to
> +  // reinitialize fs->how for some registers to REG_ARCHEXT.

How about being a bit more specific about the intent:

  /* All registers are initially in state REG_UNSAVED, which indicates that
     they inherit register values from the previous frame.  However, the
     return address starts every frame in the "unsigned" state.  It also
     starts every frame in a state that supports the original toggle-based
     DW_CFA_AARCH64_negate_ra_state method of controlling RA signing.  */

> +  const int reg = AARCH64_DWARF_REGNUM_RA_STATE;
> +  fs->regs.how[reg] = REG_ARCHEXT;

Very minor, but I think this would be easier to read without the temporary.

> +  aarch64_fs_RA_state_set(fs, AARCH64_RA_no_signing);
> +}
>  
> +/* Do AArch64 private extraction on ADDR_WORD based on context info CONTEXT 
> and
> +   unwind frame info FS. If ADDR_WORD is signed, we do address authentication
> +   on it using CFA of current frame.
> +   Note: this function is called after uw_update_context_1 in 
> uw_update_context.
> +   uw_update_context_1 is the function in which val expression are computed. 
> Thus

expressions

> +   if DW_CFA_val_expression is used, the value of the RA state register is 
> stored
> +   in CONTEXT, not FS. (more details about when DW_CFA_val_expression is 
> used in
> +   the next comment below)  */
>  static inline void *
>  aarch64_demangle_return_addr (struct _Unwind_Context *context,
>                             _Unwind_FrameState *fs,
>                             _Unwind_Word addr_word)
>  {
>    void *addr = (void *)addr_word;
> -  const int reg = DWARF_REGNUM_AARCH64_RA_STATE;
> -
> -  if (fs->regs.how[reg] == REG_UNSAVED)
> -    return addr;
> +  const int reg = AARCH64_DWARF_REGNUM_RA_STATE;
> +
> +  // In libgcc, REG_ARCHEXT means that the RA state register was set by an 
> AArch64
> +  // DWARF instruction and contains a valid value.

It's also used to describe the initial state.

> +  // Return-address signing state is normally toggled by 
> DW_CFA_AARCH64_negate_ra_state
> +  // (also knwon by its alias as DW_CFA_GNU_window_save).
> +  // However, RA state register can be set directly via DW_CFA_val_expression
> +  // too. GCC does not generate such CFI but some other compilers reportedly 
> do.

nit: drop the trailing full stop, since the following line continues
the sentence.

> +  // (see PR104689 for more details).
> +  // Any other value than REG_ARCHEXT should be interpreted as if the RA 
> state register
> +  // is set by another DWARF instruction, and the value is fetchable via 
> _Unwind_GetGR.
> +  aarch64_RA_signing_method_t signing_method = AARCH64_RA_no_signing;
> +  if (fs->regs.how[reg] == REG_ARCHEXT)
> +    {
> +      signing_method = aarch64_fs_RA_state_get(fs);
> +    }
> +  else if (fs->regs.how[reg] != REG_UNSAVED)
> +    {
> +      signing_method = aarch64_context_RA_state_get(context);
> +
> +      // If the value was set in context, CONTEXT->by_value was set to 1.
> +      // uw_install_context_1 contains an assert on by_value, to check that 
> all
> +      // registers values have been resolved before installing the context.
> +      // This will make the unwinding crash if we don't reset 
> CONTEXT->by_value,
> +      // and CONTEXT->reg[reg].
> +      context->reg[reg] = _Unwind_Get_Unwind_Context_Reg_Val(0x0);
> +      context->by_value[reg] = 0;

Is this change to the context necessary for the refactor, or is it
instead needed for the follow-on patches?  If it's part of the refactor,
could you explain why it's needed now but wasn't before?

This is the only part that didn't look like a no-op change.

> +    }
>  
> -  /* Return-address signing state is toggled by DW_CFA_GNU_window_save (where
> -     REG_UNSAVED/REG_UNSAVED_ARCHEXT means RA signing is disabled/enabled),
> -     or set by a DW_CFA_expression.  */
> -  if (fs->regs.how[reg] == REG_UNSAVED_ARCHEXT
> -      || (_Unwind_GetGR (context, reg) & 0x1) != 0)
> +  if (signing_method == AARCH64_RA_signing_SP)
>      {
>        _Unwind_Word salt = (_Unwind_Word) context->cfa;
> -      if (aarch64_cie_signed_with_b_key (context) != 0)
> +      if (fs->regs.signing_key == AARCH64_PAUTH_KEY_B)
>       return __builtin_aarch64_autib1716 (addr, salt);
>        return __builtin_aarch64_autia1716 (addr, salt);
>      }
> +  // else {} // signing_method == AARCH64_RA_no_signing, nothing to do.

IMO it'd be clearer without this line.  Alternatively, we could make it:

  else if (signing_method == AARCH64_RA_no_signing)
    return addr;

  gcc_unreachable ();

>  }
> diff --git a/libgcc/unwind-dw2-execute_cfa.h b/libgcc/unwind-dw2-execute_cfa.h
> index a6b249f9ad6..2ea78c4ef8d 100644
> --- a/libgcc/unwind-dw2-execute_cfa.h
> +++ b/libgcc/unwind-dw2-execute_cfa.h
> @@ -271,23 +271,33 @@
>             fs->regs.how[reg] = REG_SAVED_VAL_EXP;
>             fs->regs.reg[reg].loc.exp = insn_ptr;
>           }
> +       // Don't execute the expression, but jump over it by adding
> +       // DW_FORM_block's size to insn_ptr.
>         insn_ptr = read_uleb128 (insn_ptr, &utmp);
>         insn_ptr += utmp;
>         break;
>  
> -     case DW_CFA_GNU_window_save:
>  #if defined (__aarch64__) && !defined (__ILP32__)
> -       /* This CFA is multiplexed with Sparc.  On AArch64 it's used to toggle
> -          return address signing status.  REG_UNSAVED/REG_UNSAVED_ARCHEXT
> -          mean RA signing is disabled/enabled.  */
> -       reg = DWARF_REGNUM_AARCH64_RA_STATE;
> -       gcc_assert (fs->regs.how[reg] == REG_UNSAVED
> -                   || fs->regs.how[reg] == REG_UNSAVED_ARCHEXT);
> -       if (fs->regs.how[reg] == REG_UNSAVED)
> -         fs->regs.how[reg] = REG_UNSAVED_ARCHEXT;
> -       else
> -         fs->regs.how[reg] = REG_UNSAVED;
> +     case DW_CFA_AARCH64_negate_ra_state:
> +       // This CFA is multiplexed with Sparc.
> +       // On AArch64 it's used to toggle the status of return address signing
> +       // with SP as a diversifier.
> +       // - REG_ARCHEXT means that the RA state register in FS contains a
> +       // valid value, and that no other DWARF directive has changed the 
> value
> +       // of this register.
> +       // - any other value is not compatible with negating the RA state.
> +       reg = AARCH64_DWARF_REGNUM_RA_STATE;
> +
> +       // /!\ Mixing DW_CFA_val_expression with 
> DW_CFA_AARCH64_negate_ra_state
> +       // will result in an undefined behavior (likely an unwinding failure),
> +       // as the chronology of the DWARF directives will be broken.
> +       gcc_assert (fs->regs.how[reg] == REG_ARCHEXT);

Could we move the assert into aarch64_fs_RA_state_toggle, or do later
patches not want that?

> +
> +       // Toggle current state
> +       aarch64_fs_RA_state_toggle(fs);
> +       break;
>  #else
> +     case DW_CFA_GNU_window_save:
>         /* ??? Hardcoded for SPARC register window configuration.  */
>         if (__LIBGCC_DWARF_FRAME_REGISTERS__ >= 32)
>           for (reg = 16; reg < 32; ++reg)
> @@ -295,8 +305,8 @@
>               fs->regs.how[reg] = REG_SAVED_OFFSET;
>               fs->regs.reg[reg].loc.offset = (reg - 16) * sizeof (void *);
>             }
> -#endif
>         break;
> +#endif
>  
>       case DW_CFA_GNU_args_size:
>         insn_ptr = read_uleb128 (insn_ptr, &utmp);
> diff --git a/libgcc/unwind-dw2.c b/libgcc/unwind-dw2.c
> index 0849e89cd34..fb60853825d 100644
> --- a/libgcc/unwind-dw2.c
> +++ b/libgcc/unwind-dw2.c
> @@ -501,11 +501,11 @@ extract_cie_info (const struct dwarf_cie *cie, struct 
> _Unwind_Context *context,
>         fs->signal_frame = 1;
>         aug += 1;
>       }
> -      /* aarch64 B-key pointer authentication.  */
> -      else if (aug[0] == 'B')
> -     {
> -       aug += 1;
> -      }
> +
> +#if defined(MD_ARCH_EXTENSION_CIE_AUG_HANDLER)
> +      else if (MD_ARCH_EXTENSION_CIE_AUG_HANDLER(fs, aug[0]))
> +     aug += 1;
> +#endif
>  
>        /* Otherwise we have an unknown augmentation string.
>        Bail unless we saw a 'z' prefix.  */
> @@ -996,6 +996,9 @@ uw_frame_state_for (struct _Unwind_Context *context, 
> _Unwind_FrameState *fs)
>  
>    memset (&fs->regs.how[0], 0,
>         sizeof (*fs) - offsetof (_Unwind_FrameState, regs.how[0]));
> +#if defined(MD_ARCH_EXTENSION_FRAME_INIT)
> +  MD_ARCH_EXTENSION_FRAME_INIT(context, fs);
> +#endif
>    context->args_size = 0;
>    context->lsda = 0;
>  
> @@ -1197,7 +1200,11 @@ uw_update_context_1 (struct _Unwind_Context *context, 
> _Unwind_FrameState *fs)
>        {
>        case REG_UNSAVED:
>        case REG_UNDEFINED:
> -      case REG_UNSAVED_ARCHEXT:
> +      case REG_ARCHEXT: // If the value depends on an augmenter, then there 
> is
> +                     // no processing to do here, and the value computation
> +                     // should be delayed until the architecture handler
> +                     // computes the value correctly based on the augmenter
> +                     // information.

This would more usually be written as a comment above the case, rather than
to the right.

>       break;
>  
>        case REG_SAVED_OFFSET:
> diff --git a/libgcc/unwind-dw2.h b/libgcc/unwind-dw2.h
> index 0dd8611f697..b75f4c65f98 100644
> --- a/libgcc/unwind-dw2.h
> +++ b/libgcc/unwind-dw2.h
> @@ -22,16 +22,17 @@
>     see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
>     <http://www.gnu.org/licenses/>.  */
>  
> -enum {
> +enum register_rule
> +{
>    REG_UNSAVED,
>    REG_SAVED_OFFSET,
>    REG_SAVED_REG,
>    REG_SAVED_EXP,
>    REG_SAVED_VAL_OFFSET,
>    REG_SAVED_VAL_EXP,
> -  REG_UNSAVED_ARCHEXT,               /* Target specific extension.  */
> +  REG_ARCHEXT,               /* Target specific extension.  */
>    REG_UNDEFINED
> -};
> +} __attribute__((packed));
>  
>  /* The result of interpreting the frame unwind info for a frame.
>     This is all symbolic at this point, as none of the values can
> @@ -49,7 +50,7 @@ typedef struct
>       const unsigned char *exp;
>        } loc;
>      } reg[__LIBGCC_DWARF_FRAME_REGISTERS__+1];
> -    unsigned char how[__LIBGCC_DWARF_FRAME_REGISTERS__+1];
> +    enum register_rule how[__LIBGCC_DWARF_FRAME_REGISTERS__+1];
>  
>      enum {
>        CFA_UNSET,
> @@ -65,6 +66,14 @@ typedef struct
>      _Unwind_Sword cfa_offset;
>      _Unwind_Word cfa_reg;
>      const unsigned char *cfa_exp;
> +
> +    /* Architecture extensions information from CIE/FDE.
> +     * Note: those information have to be saved in struct 
> frame_state_reg_info

this information (singular)

Thanks for doing this.  It's definitely a cleaner structure than what
we had before.

Richard

> +     * instead of _Unwind_FrameState as DW_CFA_restore_state has to be able 
> to
> +     * restore them.  */
> +#if defined(__aarch64__) && !defined (__ILP32__)
> +    unsigned char signing_key;
> +#endif
>    } regs;
>  
>    /* The PC described by the current frame state.  */

Reply via email to