Yury Khrustalev <yury.khrusta...@arm.com> writes:
> From: Szabolcs Nagy <szabolcs.n...@arm.com>
>
> Builtin for chkfeat: the input argument is used to initialize x16 then
> execute chkfeat and return the updated x16.
>
> Note: ACLE __chkfeat(x) plans to flip the bits to be more intuitive
> (xor the input to output), but for the builtin that seems unnecessary
> complication.

Sounds good.
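
For anyone comparing the two conventions, here is a rough software model
of the difference. It is only a sketch: it assumes that chkfeat clears
each queried bit in x16 whose feature is enabled and leaves the other
bits set, and the function names and the enabled_mask parameter are
made up for illustration (they are not part of the patch or of ACLE).

```c
#include <stdint.h>

/* Hypothetical model of the builtin: returns the updated x16 as-is.
   Assumption: chkfeat clears the queried bits whose features are
   enabled.  enabled_mask stands in for the hardware state and is not
   a real interface.  */
static uint64_t
model_builtin_chkfeat (uint64_t x16, uint64_t enabled_mask)
{
  return x16 & ~enabled_mask;
}

/* Hypothetical model of the planned ACLE form: xor the input with the
   output, so that a set bit means "queried and enabled".  */
static uint64_t
model_acle_chkfeat (uint64_t x, uint64_t enabled_mask)
{
  return x ^ model_builtin_chkfeat (x, enabled_mask);
}
```

Under that assumption, querying bits 0 and 1 (0x3) when only bit 0's
feature is enabled gives 0x2 from the builtin model and 0x1 from the
ACLE-style model.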

> gcc/ChangeLog:
>
>       * config/aarch64/aarch64-builtins.cc (enum aarch64_builtins):
>       Define AARCH64_BUILTIN_CHKFEAT.
>       (aarch64_general_init_builtins): Handle chkfeat.
>       (aarch64_general_expand_builtin): Handle chkfeat.
> ---
>  gcc/config/aarch64/aarch64-builtins.cc | 18 ++++++++++++++++++
>  1 file changed, 18 insertions(+)
>
> diff --git a/gcc/config/aarch64/aarch64-builtins.cc b/gcc/config/aarch64/aarch64-builtins.cc
> index 7d737877e0b..765f2091504 100644
> --- a/gcc/config/aarch64/aarch64-builtins.cc
> +++ b/gcc/config/aarch64/aarch64-builtins.cc
> @@ -875,6 +875,8 @@ enum aarch64_builtins
>    AARCH64_PLDX,
>    AARCH64_PLI,
>    AARCH64_PLIX,
> +  /* Armv8.9-A / Armv9.4-A builtins.  */
> +  AARCH64_BUILTIN_CHKFEAT,
>    AARCH64_BUILTIN_MAX
>  };
>  
> @@ -2280,6 +2282,12 @@ aarch64_general_init_builtins (void)
>    if (!TARGET_ILP32)
>      aarch64_init_pauth_hint_builtins ();
>  
> +  tree ftype_chkfeat
> +    = build_function_type_list (uint64_type_node, uint64_type_node, NULL);
> +  aarch64_builtin_decls[AARCH64_BUILTIN_CHKFEAT]
> +    = aarch64_general_add_builtin ("__builtin_aarch64_chkfeat", ftype_chkfeat,
> +                                AARCH64_BUILTIN_CHKFEAT);
> +
>    if (in_lto_p)
>      handle_arm_acle_h ();
>  }
> @@ -3484,6 +3492,16 @@ aarch64_general_expand_builtin (unsigned int fcode, tree exp, rtx target,
>      case AARCH64_PLIX:
>        aarch64_expand_prefetch_builtin (exp, fcode);
>        return target;
> +
> +    case AARCH64_BUILTIN_CHKFEAT:
> +      {
> +     rtx x16_reg = gen_rtx_REG (DImode, R16_REGNUM);
> +     op0 = expand_normal (CALL_EXPR_ARG (exp, 0));
> +     emit_move_insn (x16_reg, op0);
> +     expand_insn (CODE_FOR_aarch64_chkfeat, 0, 0);
> +     emit_move_insn (target, x16_reg);
> +     return target;

target isn't required to be nonnull, so this would be safer as:

  return copy_to_reg (x16_reg);

(I don't think it's worth complicating things by trying to reuse target,
since this code isn't going to be performance/memory critical.)

Looks good otherwise.

Thanks,
Richard

> +      }
>      }
>  
>    if (fcode >= AARCH64_SIMD_BUILTIN_BASE && fcode <= AARCH64_SIMD_BUILTIN_MAX)
