Hi Christophe,

> -----Original Message-----
> From: Christophe Lyon <christophe.l...@linaro.org>
> Sent: Thursday, July 6, 2023 4:21 PM
> To: Kyrylo Tkachov <kyrylo.tkac...@arm.com>
> Cc: gcc-patches@gcc.gnu.org; Richard Sandiford
> <richard.sandif...@arm.com>
> Subject: Re: [PATCH] arm: Fix MVE intrinsics support with LTO (PR
> target/110268)
> 
> 
> 
> On Wed, 5 Jul 2023 at 19:07, Kyrylo Tkachov <kyrylo.tkac...@arm.com
> <mailto:kyrylo.tkac...@arm.com> > wrote:
> 
> 
>       Hi Christophe,
> 
>       > -----Original Message-----
>       > From: Christophe Lyon <christophe.l...@linaro.org
> <mailto:christophe.l...@linaro.org> >
>       > Sent: Monday, June 26, 2023 4:03 PM
>       > To: gcc-patches@gcc.gnu.org <mailto:gcc-patches@gcc.gnu.org> ;
> Kyrylo Tkachov <kyrylo.tkac...@arm.com
> <mailto:kyrylo.tkac...@arm.com> >;
>       > Richard Sandiford <richard.sandif...@arm.com
> <mailto:richard.sandif...@arm.com> >
>       > Cc: Christophe Lyon <christophe.l...@linaro.org
> <mailto:christophe.l...@linaro.org> >
>       > Subject: [PATCH] arm: Fix MVE intrinsics support with LTO (PR
> target/110268)
>       >
>       > After the recent MVE intrinsics re-implementation, LTO stopped
> working
>       > because the intrinsics would no longer be defined.
>       >
>       > The main part of the patch is simple and similar to what we do for
>       > AArch64:
>       > - call handle_arm_mve_h() from arm_init_mve_builtins to declare
> the
>       >   intrinsics when the compiler is in LTO mode
>       > - actually implement arm_builtin_decl for MVE.
>       >
>       > It was just a bit tricky to handle
> __ARM_MVE_PRESERVE_USER_NAMESPACE:
>       > its value in the user code cannot be guessed at LTO time, so we
> always
>       > have to assume that it was not defined.  This led to a few fixes in the
>       > way we register MVE builtins as placeholders or not.  Without this
>       > patch, we would just omit some versions of the intrinsics when
>       > __ARM_MVE_PRESERVE_USER_NAMESPACE is true. In fact, like for
> the C/C++
>       > placeholders, we need to always keep entries for all of them to
> ensure
>       > that we have a consistent numbering scheme.
>       >
>       >       2023-06-26  Christophe Lyon   <christophe.l...@linaro.org
> <mailto:christophe.l...@linaro.org> >
>       >
>       >       PR target/110268
>       >       gcc/
>       >       * config/arm/arm-builtins.cc (arm_init_mve_builtins): Handle
> LTO.
>       >       (arm_builtin_decl): Handle MVE builtins.
>       >       * config/arm/arm-mve-builtins.cc (builtin_decl): New function.
>       >       (add_unique_function): Fix handling of
>       >       __ARM_MVE_PRESERVE_USER_NAMESPACE.
>       >       (add_overloaded_function): Likewise.
>       >       * config/arm/arm-protos.h (builtin_decl): New declaration.
>       >
>       >       gcc/testsuite/
>       >       * gcc.target/arm/pr110268-1.c: New test.
>       >       * gcc.target/arm/pr110268-2.c: New test.
>       > ---
>       >  gcc/config/arm/arm-builtins.cc            | 11 +++-
>       >  gcc/config/arm/arm-mve-builtins.cc        | 61 ++++++++++++----------
> -
>       >  gcc/config/arm/arm-protos.h               |  1 +
>       >  gcc/testsuite/gcc.target/arm/pr110268-1.c | 11 ++++
>       >  gcc/testsuite/gcc.target/arm/pr110268-2.c | 22 ++++++++
>       >  5 files changed, 76 insertions(+), 30 deletions(-)
>       >  create mode 100644 gcc/testsuite/gcc.target/arm/pr110268-1.c
>       >  create mode 100644 gcc/testsuite/gcc.target/arm/pr110268-2.c
>       >
>       > diff --git a/gcc/config/arm/arm-builtins.cc b/gcc/config/arm/arm-
> builtins.cc
>       > index 36365e40a5b..fca7dcaf565 100644
>       > --- a/gcc/config/arm/arm-builtins.cc
>       > +++ b/gcc/config/arm/arm-builtins.cc
>       > @@ -1918,6 +1918,15 @@ arm_init_mve_builtins (void)
>       >        arm_builtin_datum *d = &mve_builtin_data[i];
>       >        arm_init_builtin (fcode, d, "__builtin_mve");
>       >      }
>       > +
>       > +  if (in_lto_p)
>       > +    {
>       > +      arm_mve::handle_arm_mve_types_h ();
>       > +      /* Under LTO, we cannot know whether
>       > +      __ARM_MVE_PRESERVE_USER_NAMESPACE was defined, so
> assume
>       > it
>       > +      was not.  */
>       > +      arm_mve::handle_arm_mve_h (false);
>       > +    }
>       >  }
>       >
>       >  /* Set up all the NEON builtins, even builtins for instructions that
> are not
>       > @@ -2723,7 +2732,7 @@ arm_builtin_decl (unsigned code, bool
> initialize_p
>       > ATTRIBUTE_UNUSED)
>       >      case ARM_BUILTIN_GENERAL:
>       >        return arm_general_builtin_decl (subcode);
>       >      case ARM_BUILTIN_MVE:
>       > -      return error_mark_node;
>       > +      return arm_mve::builtin_decl (subcode);
>       >      default:
>       >        gcc_unreachable ();
>       >      }
>       > diff --git a/gcc/config/arm/arm-mve-builtins.cc
> b/gcc/config/arm/arm-mve-
>       > builtins.cc
>       > index 7033e41a571..e9a12f27411 100644
>       > --- a/gcc/config/arm/arm-mve-builtins.cc
>       > +++ b/gcc/config/arm/arm-mve-builtins.cc
>       > @@ -493,6 +493,16 @@ handle_arm_mve_h (bool
>       > preserve_user_namespace)
>       >                                    preserve_user_namespace);
>       >  }
>       >
>       > +/* Return the function decl with MVE function subcode CODE, or
>       > error_mark_node
>       > +   if no such function exists.  */
>       > +tree
>       > +builtin_decl (unsigned int code)
>       > +{
>       > +  if (code >= vec_safe_length (registered_functions))
>       > +    return error_mark_node;
>       > +  return (*registered_functions)[code]->decl;
>       > +}
>       > +
>       >  /* Return true if CANDIDATE is equivalent to MODEL_TYPE for
> overloading
>       >     purposes.  */
>       >  static bool
>       > @@ -849,7 +859,6 @@ function_builder::add_function (const
>       > function_instance &instance,
>       >      ? integer_zero_node
>       >      : simulate_builtin_function_decl (input_location, name, fntype,
>       >                                     code, NULL, attrs);
>       > -
>       >    registered_function &rfn = *ggc_alloc <registered_function> ();
>       >    rfn.instance = instance;
>       >    rfn.decl = decl;
>       > @@ -889,15 +898,12 @@ function_builder::add_unique_function
> (const
>       > function_instance &instance,
>       >    gcc_assert (!*rfn_slot);
>       >    *rfn_slot = &rfn;
>       >
>       > -  /* Also add the non-prefixed non-overloaded function, if the user
>       > namespace
>       > -     does not need to be preserved.  */
>       > -  if (!preserve_user_namespace)
>       > -    {
>       > -      char *noprefix_name = get_name (instance, false, false);
>       > -      tree attrs = get_attributes (instance);
>       > -      add_function (instance, noprefix_name, fntype, attrs,
> requires_float,
>       > -                 false, false);
>       > -    }
>       > +  /* Also add the non-prefixed non-overloaded function, as
> placeholder
>       > +     if the user namespace needs to be preserved.  */
>       > +  char *noprefix_name = get_name (instance, false, false);
>       > +  attrs = get_attributes (instance);
>       > +  add_function (instance, noprefix_name, fntype, attrs,
> requires_float,
>       > +             false, preserve_user_namespace);
>       >
>       >    /* Also add the function under its overloaded alias, if we want
>       >       a separate decl for each instance of an overloaded function.  */
>       > @@ -905,20 +911,17 @@ function_builder::add_unique_function
> (const
>       > function_instance &instance,
>       >    if (strcmp (name, overload_name) != 0)
>       >      {
>       >        /* Attribute lists shouldn't be shared.  */
>       > -      tree attrs = get_attributes (instance);
>       > +      attrs = get_attributes (instance);
>       >        bool placeholder_p = !(m_direct_overloads ||
> force_direct_overloads);
>       >        add_function (instance, overload_name, fntype, attrs,
>       >                   requires_float, false, placeholder_p);
>       >
>       > -      /* Also add the non-prefixed overloaded function, if the user
> namespace
>       > -      does not need to be preserved.  */
>       > -      if (!preserve_user_namespace)
>       > -     {
>       > -       char *noprefix_overload_name = get_name (instance, false,
> true);
>       > -       tree attrs = get_attributes (instance);
>       > -       add_function (instance, noprefix_overload_name, fntype, attrs,
>       > -                     requires_float, false, placeholder_p);
>       > -     }
>       > +      /* Also add the non-prefixed overloaded function, as
> placeholder
>       > +      if the user namespace needs to be preserved.  */
>       > +      char *noprefix_overload_name = get_name (instance, false,
> true);
>       > +      attrs = get_attributes (instance);
>       > +      add_function (instance, noprefix_overload_name, fntype, attrs,
>       > +                 requires_float, false, preserve_user_namespace ||
>       > placeholder_p);
>       >      }
>       >
>       >    obstack_free (&m_string_obstack, name);
>       > @@ -948,15 +951,15 @@
> function_builder::add_overloaded_function (const
>       > function_instance &instance,
>       >       = add_function (instance, name, m_overload_type, NULL_TREE,
>       >                       requires_float, true, m_direct_overloads);
>       >        m_overload_names.put (name, &rfn);
>       > -      if (!preserve_user_namespace)
>       > -     {
>       > -       char *noprefix_name = get_name (instance, false, true);
>       > -       registered_function &noprefix_rfn
>       > -         = add_function (instance, noprefix_name, m_overload_type,
>       > -                         NULL_TREE, requires_float, true,
>       > -                         m_direct_overloads);
>       > -       m_overload_names.put (noprefix_name, &noprefix_rfn);
>       > -     }
>       > +
>       > +      /* Also add the non-prefixed function, as placeholder if the
>       > +      user namespace needs to be preserved.  */
>       > +      char *noprefix_name = get_name (instance, false, true);
>       > +      registered_function &noprefix_rfn
>       > +     = add_function (instance, noprefix_name, m_overload_type,
>       > +                     NULL_TREE, requires_float, true,
>       > +                     preserve_user_namespace || m_direct_overloads);
>       > +      m_overload_names.put (noprefix_name, &noprefix_rfn);
>       >      }
>       >  }
>       >
>       > diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-
> protos.h
>       > index 7d73c66a15d..6186921011e 100644
>       > --- a/gcc/config/arm/arm-protos.h
>       > +++ b/gcc/config/arm/arm-protos.h
>       > @@ -232,6 +232,7 @@ const unsigned int ARM_BUILTIN_CLASS = (1
> <<
>       > ARM_BUILTIN_SHIFT) - 1;
>       >  namespace arm_mve {
>       >    void handle_arm_mve_types_h ();
>       >    void handle_arm_mve_h (bool);
>       > +  tree builtin_decl (unsigned);
>       >    tree resolve_overloaded_builtin (location_t, unsigned int,
>       >                                  vec<tree, va_gc> *);
>       >    bool check_builtin_call (location_t, vec<location_t>, unsigned int,
>       > diff --git a/gcc/testsuite/gcc.target/arm/pr110268-1.c
>       > b/gcc/testsuite/gcc.target/arm/pr110268-1.c
>       > new file mode 100644
>       > index 00000000000..d133011343c
>       > --- /dev/null
>       > +++ b/gcc/testsuite/gcc.target/arm/pr110268-1.c
>       > @@ -0,0 +1,11 @@
>       > +/* { dg-do link }  */
>       > +/* { dg-require-effective-target arm_v8_1m_mve_ok } */
>       > +/* { dg-add-options arm_v8_1m_mve } */
>       > +/* { dg-additional-options "-O2 -flto -specs=rdimon.specs" } */
> 
>       IIRC rdimon.specs is a thing coming out of libgloss/newlib so may not
> be valid for some other libcs?
> 
> 
> 
> Ha right. In my mind MVE implies arm-eabi target, which implies newlib so I
> overlooked this.
> 
> 
>       Would it make sense to also require a newlib effective target?
> 
> 
> I didn't know about this one ;-)
> 
> Revisiting this, I realized that -specs=rdimon.specs does not belong here; in
> fact I updated my dejagnu setup to add this (required) option automagically
> behind the scenes (via set_board_info ldflags), thus removing it from
> dg-additional-options.
> 
> In fact there's no reason to require newlib, we just need to be able to link.
> In principle we could have linux+glibc on an MVE target.
> Unfortunately, it seems there's no such effective-target yet, the closest
> being arm_arch_v8_1m_main_multilib.
> However this one also implies being able to execute the resulting binary,
> which may not be possible (actually it isn't in my current setup due to a
> bug/feature in qemu).
> 
> So I hacked additions to target-supports.exp, to add:
> check_effective_target_arm_arch_FUNC_link
> 
> check_effective_target_FUNC_link
> 
> to our existing list, and they call "check_no_compiler_messages ...
> executable" instead of check_runtime.
> 
> And this means the testcases now have:
> /* { dg-require-effective-target arm_arch_v8_1m_main_link } */ /* Make sure
> we have suitable multilibs to link successfully.  */
> 
> /* { dg-require-effective-target arm_v8_1m_mve_ok } */
> 
> 
> Ideally we should perhaps also increase the list of effective targets that
> cover MVE, such that we also have "_multilib" etc but that seems a bit overkill?
> OTOH, the discrepancy above might be confusing (arm_arch_v8_1m_main vs
> arm_v8_1m_mve).
> 
> So... are you OK with a separate patch to add
> check_effective_target_arm_arch_FUNC_link and
> check_effective_target_FUNC_link, and this one updated with the
> dg-require-effective-target as above?

Yeah, I think that's a good solution.

> 
> Too bad things are so complicated with the many flavours we have ;-)

Yeah, it trips up almost every new addition to the testsuite these days ☹
We'll need to sit down and think about simplifying all of this at some point.
Thanks,
Kyrill

> 
> Thanks,
> 
> Christophe
> 
> 
>       Ok otherwise.
>       Thanks,
>       Kyrill
> 
>       > +
>       > +#include <arm_mve.h>
>       > +
>       > +int main(int argc, char* argv[])
>       > +{
>       > +  return vaddvq(__arm_vdupq_n_s8 (argc));
>       > +}
>       > diff --git a/gcc/testsuite/gcc.target/arm/pr110268-2.c
>       > b/gcc/testsuite/gcc.target/arm/pr110268-2.c
>       > new file mode 100644
>       > index 00000000000..ad03fb37793
>       > --- /dev/null
>       > +++ b/gcc/testsuite/gcc.target/arm/pr110268-2.c
>       > @@ -0,0 +1,22 @@
>       > +/* { dg-do link }  */
>       > +/* { dg-require-effective-target arm_v8_1m_mve_ok } */
>       > +/* { dg-add-options arm_v8_1m_mve } */
>       > +/* { dg-additional-options "-O2 -flto -specs=rdimon.specs" } */
>       > +
>       > +/* Check MVE intrinsics with LTO with
>       > __ARM_MVE_PRESERVE_USER_NAMESPACE and a
>       > +   user-overridden intrinsic.  */
>       > +
>       > +#define __ARM_MVE_PRESERVE_USER_NAMESPACE
>       > +#include <arm_mve.h>
>       > +
>       > +int global_int;
>       > +int32_t vaddvq(int8x16_t x)
>       > +{
>       > +  return global_int + __arm_vgetq_lane_s8 (x, 0);
>       > +}
>       > +
>       > +int main(int argc, char* argv[])
>       > +{
>       > +  global_int = argc;
>       > +  return vaddvq(__arm_vdupq_n_s8 (argc));
>       > +}
>       > --
>       > 2.34.1
> 
> 
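
For readers following the thread, the "consistent numbering scheme" point in
the patch description can be illustrated with a small standalone sketch. This
is not GCC code: Entry, Registry, add_intrinsic, lookup and find_code are
hypothetical names made up for the illustration, and the "__arm_" prefixing
is simplified. The idea it tries to show is that the unprefixed entry is
always registered, and only its placeholder flag depends on whether the user
namespace is preserved, so function codes stay identical between the
compile-time view and the LTO-time view. The bounds-checked lookup loosely
mirrors what builtin_decl does in the patch.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a registered builtin (not GCC's
// registered_function).
struct Entry {
  std::string name;
  bool placeholder;  // true: slot is reserved, but no declaration is exposed
};

class Registry {
public:
  // Append an entry and return its code, i.e. its index in the table.
  std::size_t add(const std::string &name, bool placeholder) {
    entries_.push_back({name, placeholder});
    return entries_.size() - 1;
  }

  // Register one intrinsic under its prefixed and unprefixed names.
  // The unprefixed entry is always added; it is only a placeholder when
  // the user namespace has to be preserved.
  void add_intrinsic(const std::string &base, bool preserve_user_namespace) {
    add("__arm_" + base, false);
    add(base, preserve_user_namespace);
  }

  // Bounds-checked lookup: returns nullptr for an unknown code.
  const Entry *lookup(std::size_t code) const {
    return code < entries_.size() ? &entries_[code] : nullptr;
  }

  std::size_t size() const { return entries_.size(); }

private:
  std::vector<Entry> entries_;
};

// Return the code assigned to NAME in REG, or -1 if it is not registered.
int find_code(const Registry &reg, const std::string &name) {
  for (std::size_t i = 0; i < reg.size(); ++i)
    if (reg.lookup(i)->name == name)
      return static_cast<int>(i);
  return -1;
}
```

Under these assumptions, registering the same intrinsics once with
preserve_user_namespace set to true (the user's translation unit) and once
with false (the LTO side) gives both registries the same size and the same
code for every name; only the placeholder flags of the unprefixed entries
differ. Skipping the unprefixed entries in one of the two cases would shift
every later code.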
