On Thu, May 25, 2017 at 09:56:20PM +0200, Florian Weimer wrote:
> On Thu, May 25, 2017 at 8:25 PM, Michael Meissner
> <meiss...@linux.vnet.ibm.com> wrote:
> > This patch adds the initial attribute((target_clone(...))) support to the
> 
> Patch seems to be missing.
> 
> Florian
> 

Sorry about that.

This patch adds initial __attribute__((target_clones(...))) support to the
PowerPC.  The generated resolver looks at the HWCAP bits for ISA 2.05 (power6),
ISA 2.06 (power7), ISA 2.07 (power8) and ISA 3.0 (power9) to determine which
clone function to run.  The implementation uses the existing i386/x86_64
support for target_clones as a template.
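
As a rough illustration of the selection order (a standalone sketch, not code
from the patch), the resolver behaves much like the loop below.  The names
clone_entry, clone_table, pick_clone, fake_power8 and fake_old_cpu are
hypothetical and exist only for this example.  The supports callback stands in
for __builtin_cpu_supports, and the real resolver only tests the clones that
were actually requested in the attribute.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one function clone: the HWCAP name the resolver
   would test, and a label for the clone.  A NULL hwcap_name marks the
   default clone, which needs no check.  */
struct clone_entry {
  const char *hwcap_name;
  const char *label;
};

/* Ordered from the highest ISA to the lowest, with the default last.  */
static const struct clone_entry clone_table[] = {
  { "arch_3_00", "power9"  },
  { "arch_2_07", "power8"  },
  { "arch_2_06", "power7"  },
  { "arch_2_05", "power6"  },
  { NULL,        "default" },
};

/* Return the label of the first clone whose feature the CPU reports.
   SUPPORTS is an assumed callback standing in for __builtin_cpu_supports.  */
static const char *
pick_clone (int (*supports) (const char *))
{
  size_t n = sizeof (clone_table) / sizeof (clone_table[0]);
  size_t i;

  for (i = 0; i < n; i++)
    if (clone_table[i].hwcap_name == NULL
        || supports (clone_table[i].hwcap_name))
      return clone_table[i].label;

  return "default";
}

/* Example probe: pretend to be a power8 (ISA 2.07) machine, which reports
   every level except ISA 3.00.  */
static int
fake_power8 (const char *name)
{
  return strcmp (name, "arch_3_00") != 0;
}

/* Example probe: a machine that reports none of the ISA levels.  */
static int
fake_old_cpu (const char *name)
{
  (void) name;
  return 0;
}
```

Calling pick_clone (fake_power8) selects the power8 entry because the power9
check fails first, and pick_clone (fake_old_cpu) falls through to the default.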

At the moment, it has the same basic flaw as the i386/x86_64 implementation:
outside of the current module, the default version of the function is the one
that is exported.  Only the module in which the function is defined can call
the different target clones.  I hope to add support in the future to make the
exported function the ifunc handler rather than the default version.  However,
I wanted to get the basic framework into the compiler before tackling that
issue.

I have tested this patch on a little endian power8 system and there were no
regressions.  Can I install it into the trunk?

[gcc]
2017-05-24  Michael Meissner  <meiss...@linux.vnet.ibm.com>

        * config/rs6000/rs6000.c (toplevel): Include attribs.h.
        (enum clone_list): New enumeration to give the target clones
        processors we generate code for.
        (rs6000_clone_map): New array to identify which clone processors
        the current program is running on.
        (TARGET_COMPARE_VERSION_PRIORITY): Define to enable the
        target_clones attribute.
        (TARGET_GENERATE_VERSION_DISPATCHER_BODY): Likewise.
        (TARGET_GET_FUNCTION_VERSIONS_DISPATCHER): Likewise.
        (TARGET_OPTION_FUNCTION_VERSIONS): Likewise.
        (cpu_expand_builtin): Add support for target_clones attribute.
        (rs6000_valid_attribute_p): Allow "default" attribute.
        (get_decl_name): New debug function to simplify printing the
        current function name in debugging statements.
        (rs6000_clone_priority): New functions to support the target_clones
        attribute, and be able to generate code to switch between ISA 2.05
        through ISA 3.0 (power6 through power9).
        (rs6000_compare_version_priority): Likewise.
        (rs6000_get_function_versions_dispatcher): Likewise.
        (make_resolver_func): Likewise.
        (add_condition_to_bb): Likewise.
        (dispatch_function_versions): Likewise.
        (rs6000_generate_version_dispatcher_body): Likewise.
        (rs6000_can_inline_p): Call get_decl_name for debugging usage.
        * doc/extend.texi (Common Function Attributes): Document that the
        PowerPC supports the target_clones attribute.

[gcc/testsuite]
2017-05-24  Michael Meissner  <meiss...@linux.vnet.ibm.com>

        * gcc.target/powerpc/clone1.c: New test.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797
Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c  (.../svn+ssh://meiss...@gcc.gnu.org/svn/gcc/trunk/gcc/config/rs6000)    (revision 248378)
+++ gcc/config/rs6000/rs6000.c  (.../gcc/config/rs6000) (working copy)
@@ -42,6 +42,7 @@
 #include "flags.h"
 #include "alias.h"
 #include "fold-const.h"
+#include "attribs.h"
 #include "stor-layout.h"
 #include "calls.h"
 #include "print-tree.h"
@@ -384,6 +385,34 @@ static const struct
   { "ieee128",         PPC_FEATURE2_HAS_IEEE128,       1 }
 };
 
+/* On PowerPC, we have a limited number of target clones that we care about
+   which means we can use an array to hold the options, rather than having more
+   elaborate data structures to identify each possible variation.  Order the
+   clones from the highest ISA to the least.  */
+enum clone_list {
+  CLONE_ISA_3_00,              /* ISA 3.00 (power9).  */
+  CLONE_ISA_2_07,              /* ISA 2.07 (power8).  */
+  CLONE_ISA_2_06,              /* ISA 2.06 (power7).  */
+  CLONE_ISA_2_05,              /* ISA 2.05 (power6).  */
+  CLONE_DEFAULT,               /* default clone.  */
+  CLONE_MAX
+};
+
+/* Map compiler ISA bits into HWCAP names.  */
+struct clone_map {
+  HOST_WIDE_INT isa_mask;      /* rs6000_isa mask */
+  const char *name;            /* name to use in __builtin_cpu_supports.  */
+};
+
+static const struct clone_map rs6000_clone_map[ (int)CLONE_MAX ] = {
+  { OPTION_MASK_P9_VECTOR,     "arch_3_00" },  /* ISA 3.00 (power9).  */
+  { OPTION_MASK_P8_VECTOR,     "arch_2_07" },  /* ISA 2.07 (power8).  */
+  { OPTION_MASK_POPCNTD,       "arch_2_06" },  /* ISA 2.06 (power7).  */
+  { OPTION_MASK_CMPB,          "arch_2_05" },  /* ISA 2.05 (power6).  */
+  { 0,                         "" },           /* Default options.  */
+};
+
+
 /* Newer LIBCs explicitly export this symbol to declare that they provide
    the AT_PLATFORM and AT_HWCAP/AT_HWCAP2 values in the TCB.  We emit a
    reference to this symbol whenever we expand a CPU builtin, so that
@@ -1969,6 +1998,21 @@ static const struct attribute_spec rs600
 
 #undef TARGET_CUSTOM_FUNCTION_DESCRIPTORS
 #define TARGET_CUSTOM_FUNCTION_DESCRIPTORS 1
+
+#undef TARGET_COMPARE_VERSION_PRIORITY
+#define TARGET_COMPARE_VERSION_PRIORITY rs6000_compare_version_priority
+
+#undef TARGET_GENERATE_VERSION_DISPATCHER_BODY
+#define TARGET_GENERATE_VERSION_DISPATCHER_BODY \
+  rs6000_generate_version_dispatcher_body
+
+#undef TARGET_GET_FUNCTION_VERSIONS_DISPATCHER
+#define TARGET_GET_FUNCTION_VERSIONS_DISPATCHER \
+  rs6000_get_function_versions_dispatcher
+
+#undef TARGET_OPTION_FUNCTION_VERSIONS
+#define TARGET_OPTION_FUNCTION_VERSIONS common_function_versions
+
 
 
 /* Processor table.  */
@@ -15616,6 +15660,14 @@ cpu_expand_builtin (enum rs6000_builtins
 
 #ifdef TARGET_LIBC_PROVIDES_HWCAP_IN_TCB
   tree arg = TREE_OPERAND (CALL_EXPR_ARG (exp, 0), 0);
+  /* Target clones create an ARRAY_REF instead of a STRING_CST; convert it back
+     to a STRING_CST.  */
+  if (TREE_CODE (arg) == ARRAY_REF
+      && TREE_CODE (TREE_OPERAND (arg, 0)) == STRING_CST
+      && TREE_CODE (TREE_OPERAND (arg, 1)) == INTEGER_CST
+      && compare_tree_int (TREE_OPERAND (arg, 1), 0) == 0)
+    arg = TREE_OPERAND (arg, 0);
+
   if (TREE_CODE (arg) != STRING_CST)
     {
       error ("builtin %s only accepts a string argument",
@@ -39743,6 +39795,14 @@ rs6000_valid_attribute_p (tree fndecl,
       fprintf (stderr, "--------------------\n");
     }
 
+  /* attribute((target("default"))) does nothing, beyond
+     affecting multi-versioning.  */
+  if (TREE_VALUE (args)
+      && TREE_CODE (TREE_VALUE (args)) == STRING_CST
+      && TREE_CHAIN (args) == NULL_TREE
+      && strcmp (TREE_STRING_POINTER (TREE_VALUE (args)), "default") == 0)
+    return true;
+
   old_optimize = build_optimization_node (&global_options);
   func_optimize = DECL_FUNCTION_SPECIFIC_OPTIMIZATION (fndecl);
 
@@ -40175,6 +40235,486 @@ rs6000_disable_incompatible_switches (vo
 }
 
 
+/* Helper function for printing the function name when debugging.  */
+
+static inline const char *
+get_decl_name (tree fn)
+{
+  tree name;
+
+  if (!fn)
+    return "<null>";
+
+  name = DECL_NAME (fn);
+  if (!name)
+    return "<no-name>";
+
+  return IDENTIFIER_POINTER (name);
+}
+
+/* Return the clone id of the target we are compiling code for in a target
+   clone.  The clone id is ordered from 0 to CLONE_MAX-1 and gives the priority
+   list for the target clones (ordered from highest to lowest).  */
+
+static int
+rs6000_clone_priority (tree fndecl)
+{
+  tree fn_opts = DECL_FUNCTION_SPECIFIC_TARGET (fndecl);
+  HOST_WIDE_INT isa_masks;
+  int ret = (int) CLONE_DEFAULT;
+  tree attrs = lookup_attribute ("target", DECL_ATTRIBUTES (fndecl));
+  const char *attrs_str = NULL;
+
+  gcc_assert (attrs != NULL);
+  attrs = TREE_VALUE (TREE_VALUE (attrs));
+
+  gcc_assert (TREE_CODE (attrs) == STRING_CST);
+  attrs_str = TREE_STRING_POINTER (attrs);
+
+  /* Return the lowest priority (CLONE_DEFAULT) for the default function.
+     Return the ISA needed for the function if it is not the default.  */
+  if (strcmp (attrs_str, "default") != 0)
+    {
+      if (fn_opts == NULL_TREE)
+       fn_opts = target_option_default_node;
+
+      if (!fn_opts || !TREE_TARGET_OPTION (fn_opts))
+       isa_masks = rs6000_isa_flags;
+      else
+       isa_masks = TREE_TARGET_OPTION (fn_opts)->x_rs6000_isa_flags;
+
+      for (ret = 0; ret < (int) CLONE_DEFAULT; ret++)
+       if ((rs6000_clone_map[ret].isa_mask & isa_masks) != 0)
+         break;
+    }
+
+  if (TARGET_DEBUG_TARGET)
+    fprintf (stderr, "rs6000_clone_priority (%s) => %d\n",
+            get_decl_name (fndecl), (int) ret);
+
+  return ret;
+}
+
+/* This compares the priority of target features in function DECL1 and DECL2.
+   It returns positive value if DECL1 is higher priority, negative value if
+   DECL2 is higher priority and 0 if they are the same.  Note, priorities are
+   ordered from highest (0, CLONE_ISA_3_00) to lowest (CLONE_DEFAULT).  */
+
+static int
+rs6000_compare_version_priority (tree decl1, tree decl2)
+{
+  int priority1 = rs6000_clone_priority (decl1);
+  int priority2 = rs6000_clone_priority (decl2);
+  int ret = priority2 - priority1;
+
+  if (TARGET_DEBUG_TARGET)
+    fprintf (stderr, "rs6000_compare_version_priority (%s, %s) => %d\n",
+            get_decl_name (decl1), get_decl_name (decl2), ret);
+
+  return ret;
+}
+
+/* Make a dispatcher declaration for the multi-versioned function DECL.
+   Calls to DECL function will be replaced with calls to the dispatcher
+   by the front-end.  Returns the decl of the dispatcher function.  */
+
+static tree
+rs6000_get_function_versions_dispatcher (void *decl)
+{
+  tree fn = (tree) decl;
+  struct cgraph_node *node = NULL;
+  struct cgraph_node *default_node = NULL;
+  struct cgraph_function_version_info *node_v = NULL;
+  struct cgraph_function_version_info *first_v = NULL;
+
+  tree dispatch_decl = NULL;
+
+  struct cgraph_function_version_info *default_version_info = NULL;
+ 
+  gcc_assert (fn != NULL && DECL_FUNCTION_VERSIONED (fn));
+
+  if (TARGET_DEBUG_TARGET)
+    fprintf (stderr, "rs6000_get_function_versions_dispatcher (%s)\n",
+            get_decl_name (fn));
+
+  node = cgraph_node::get (fn);
+  gcc_assert (node != NULL);
+
+  node_v = node->function_version ();
+  gcc_assert (node_v != NULL);
+ 
+  if (node_v->dispatcher_resolver != NULL)
+    return node_v->dispatcher_resolver;
+
+  /* Find the default version and make it the first node.  */
+  first_v = node_v;
+  /* Go to the beginning of the chain.  */
+  while (first_v->prev != NULL)
+    first_v = first_v->prev;
+
+  default_version_info = first_v;
+  while (default_version_info != NULL)
+    {
+      const tree decl2 = default_version_info->this_node->decl;
+      if (is_function_default_version (decl2))
+        break;
+      default_version_info = default_version_info->next;
+    }
+
+  /* If there is no default node, just return NULL.  */
+  if (default_version_info == NULL)
+    return NULL;
+
+  /* Make default info the first node.  */
+  if (first_v != default_version_info)
+    {
+      default_version_info->prev->next = default_version_info->next;
+      if (default_version_info->next)
+        default_version_info->next->prev = default_version_info->prev;
+      first_v->prev = default_version_info;
+      default_version_info->next = first_v;
+      default_version_info->prev = NULL;
+    }
+
+  default_node = default_version_info->this_node;
+
+#if defined (ASM_OUTPUT_TYPE_DIRECTIVE)
+  if (targetm.has_ifunc_p ())
+    {
+      struct cgraph_function_version_info *it_v = NULL;
+      struct cgraph_node *dispatcher_node = NULL;
+      struct cgraph_function_version_info *dispatcher_version_info = NULL;
+
+      /* Right now, the dispatching is done via ifunc.  */
+      dispatch_decl = make_dispatcher_decl (default_node->decl);
+
+      dispatcher_node = cgraph_node::get_create (dispatch_decl);
+      gcc_assert (dispatcher_node != NULL);
+      dispatcher_node->dispatcher_function = 1;
+      dispatcher_version_info
+       = dispatcher_node->insert_new_function_version ();
+      dispatcher_version_info->next = default_version_info;
+      dispatcher_node->definition = 1;
+
+      /* Set the dispatcher for all the versions.  */
+      it_v = default_version_info;
+      while (it_v != NULL)
+       {
+         it_v->dispatcher_resolver = dispatch_decl;
+         it_v = it_v->next;
+       }
+    }
+  else
+#endif
+    {
+      error_at (DECL_SOURCE_LOCATION (default_node->decl),
+               "multiversioning needs ifunc which is not supported "
+               "on this target");
+    }
+
+  return dispatch_decl;
+}
+
+/* Make the resolver function decl to dispatch the versions of
+   a multi-versioned function,  DEFAULT_DECL.  Create an
+   empty basic block in the resolver and store the pointer in
+   EMPTY_BB.  Return the decl of the resolver function.  */
+
+static tree
+make_resolver_func (const tree default_decl,
+                   const tree dispatch_decl,
+                   basic_block *empty_bb)
+{
+  char *resolver_name;
+  tree decl, type, decl_name, t;
+  bool is_uniq = false;
+
+  /* IFUNC's have to be globally visible.  So, if the default_decl is
+     not, then the name of the IFUNC should be made unique.  */
+  if (TREE_PUBLIC (default_decl) == 0)
+    is_uniq = true;
+
+  /* Append the filename to the resolver function if the versions are
+     not externally visible.  This is because the resolver function has
+     to be externally visible for the loader to find it.  So, appending
+     the filename will prevent conflicts with a resolver function from
+     another module which is based on the same version name.  */
+  resolver_name = make_unique_name (default_decl, "resolver", is_uniq);
+
+  /* The resolver function should return a (void *). */
+  type = build_function_type_list (ptr_type_node, NULL_TREE);
+
+  decl = build_fn_decl (resolver_name, type);
+  decl_name = get_identifier (resolver_name);
+  SET_DECL_ASSEMBLER_NAME (decl, decl_name);
+
+  DECL_NAME (decl) = decl_name;
+  TREE_USED (decl) = 1;
+  DECL_ARTIFICIAL (decl) = 1;
+  DECL_IGNORED_P (decl) = 0;
+  /* IFUNC resolvers have to be externally visible.  */
+  TREE_PUBLIC (decl) = 1;
+  DECL_UNINLINABLE (decl) = 1;
+
+  /* Resolver is not external, body is generated.  */
+  DECL_EXTERNAL (decl) = 0;
+  DECL_EXTERNAL (dispatch_decl) = 0;
+
+  DECL_CONTEXT (decl) = NULL_TREE;
+  DECL_INITIAL (decl) = make_node (BLOCK);
+  DECL_STATIC_CONSTRUCTOR (decl) = 0;
+
+  if (DECL_COMDAT_GROUP (default_decl)
+      || TREE_PUBLIC (default_decl))
+    {
+      /* In this case, each translation unit with a call to this
+        versioned function will put out a resolver.  Ensure it
+        is comdat to keep just one copy.  */
+      DECL_COMDAT (decl) = 1;
+      make_decl_one_only (decl, DECL_ASSEMBLER_NAME (decl));
+    }
+  /* Build result decl and add to function_decl. */
+  t = build_decl (UNKNOWN_LOCATION, RESULT_DECL, NULL_TREE, ptr_type_node);
+  DECL_ARTIFICIAL (t) = 1;
+  DECL_IGNORED_P (t) = 1;
+  DECL_RESULT (decl) = t;
+
+  gimplify_function_tree (decl);
+  push_cfun (DECL_STRUCT_FUNCTION (decl));
+  *empty_bb = init_lowered_empty_function (decl, false, 0);
+
+  cgraph_node::add_new_function (decl, true);
+  symtab->call_cgraph_insertion_hooks (cgraph_node::get_create (decl));
+
+  pop_cfun ();
+
+  gcc_assert (dispatch_decl != NULL);
+  /* Mark dispatch_decl as "ifunc" with resolver as resolver_name.  */
+  DECL_ATTRIBUTES (dispatch_decl)
+    = make_attribute ("ifunc", resolver_name, DECL_ATTRIBUTES (dispatch_decl));
+
+  /* Create the alias for dispatch to resolver here.  */
+  /*cgraph_create_function_alias (dispatch_decl, decl);*/
+  cgraph_node::create_same_body_alias (dispatch_decl, decl);
+  XDELETEVEC (resolver_name);
+  return decl;
+}
+
+/* This adds a condition to the basic_block NEW_BB in function FUNCTION_DECL to
+   return a pointer to VERSION_DECL if we are running on a machine that
+   supports the index CLONE_ISA hardware architecture bits.  This function will
+   be called during version dispatch to decide which function version to
+   execute.  It returns the basic block at the end, to which more conditions
+   can be added.  */
+
+static basic_block
+add_condition_to_bb (tree function_decl, tree version_decl,
+                    int clone_isa, basic_block new_bb)
+{
+  gimple *return_stmt;
+  tree convert_expr, result_var;
+  gimple *convert_stmt;
+  gimple_seq gseq;
+  gimple *call_cond_stmt;
+  gimple *if_else_stmt;
+
+  basic_block bb1, bb2, bb3;
+  edge e12, e23;
+  tree cond_var,  predicate_decl, predicate_arg, bool_zero;
+  const char *arg_str;
+
+  push_cfun (DECL_STRUCT_FUNCTION (function_decl));
+
+  gcc_assert (new_bb != NULL);
+  gseq = bb_seq (new_bb);
+
+
+  convert_expr = build1 (CONVERT_EXPR, ptr_type_node,
+                        build_fold_addr_expr (version_decl));
+  result_var = create_tmp_var (ptr_type_node);
+  convert_stmt = gimple_build_assign (result_var, convert_expr); 
+  return_stmt = gimple_build_return (result_var);
+
+  if (clone_isa == (int)CLONE_DEFAULT)
+    {
+      gimple_seq_add_stmt (&gseq, convert_stmt);
+      gimple_seq_add_stmt (&gseq, return_stmt);
+      set_bb_seq (new_bb, gseq);
+      gimple_set_bb (convert_stmt, new_bb);
+      gimple_set_bb (return_stmt, new_bb);
+      pop_cfun ();
+      return new_bb;
+    }
+
+  bool_zero = build_int_cst (bool_int_type_node, 0);
+  cond_var = create_tmp_var (bool_int_type_node);
+  predicate_decl = rs6000_builtin_decls [(int) RS6000_BUILTIN_CPU_SUPPORTS];
+  arg_str = rs6000_clone_map[clone_isa].name;
+  predicate_arg = build_string_literal (strlen (arg_str) + 1, arg_str);
+  call_cond_stmt = gimple_build_call (predicate_decl, 1, predicate_arg);
+  gimple_call_set_lhs (call_cond_stmt, cond_var);
+
+  gimple_set_block (call_cond_stmt, DECL_INITIAL (function_decl));
+  gimple_set_bb (call_cond_stmt, new_bb);
+  gimple_seq_add_stmt (&gseq, call_cond_stmt);
+
+  if_else_stmt = gimple_build_cond (NE_EXPR, cond_var, bool_zero, NULL_TREE,
+                                   NULL_TREE);
+  gimple_set_block (if_else_stmt, DECL_INITIAL (function_decl));
+  gimple_set_bb (if_else_stmt, new_bb);
+  gimple_seq_add_stmt (&gseq, if_else_stmt);
+
+  gimple_seq_add_stmt (&gseq, convert_stmt);
+  gimple_seq_add_stmt (&gseq, return_stmt);
+  set_bb_seq (new_bb, gseq);
+
+  bb1 = new_bb;
+  e12 = split_block (bb1, if_else_stmt);
+  bb2 = e12->dest;
+  e12->flags &= ~EDGE_FALLTHRU;
+  e12->flags |= EDGE_TRUE_VALUE;
+
+  e23 = split_block (bb2, return_stmt);
+
+  gimple_set_bb (convert_stmt, bb2);
+  gimple_set_bb (return_stmt, bb2);
+
+  bb3 = e23->dest;
+  make_edge (bb1, bb3, EDGE_FALSE_VALUE); 
+
+  remove_edge (e23);
+  make_edge (bb2, EXIT_BLOCK_PTR_FOR_FN (cfun), 0);
+
+  pop_cfun ();
+
+  return bb3;
+}
+
+/* This function generates the dispatch function for multi-versioned functions.
+   DISPATCH_DECL is the function which will contain the dispatch logic.
+   FNDECLS are the function choices for dispatch, and is a tree chain.
+   EMPTY_BB is the basic block pointer in DISPATCH_DECL in which the dispatch
+   code is generated.  */
+
+static int
+dispatch_function_versions (tree dispatch_decl,
+                           void *fndecls_p,
+                           basic_block *empty_bb)
+{
+  int ix;
+  tree ele;
+  vec<tree> *fndecls;
+  tree clones[ (int)CLONE_MAX ];
+
+  if (TARGET_DEBUG_TARGET)
+    fputs ("dispatch_function_versions, top\n", stderr);
+
+  gcc_assert (dispatch_decl != NULL
+             && fndecls_p != NULL
+             && empty_bb != NULL);
+
+  /* fndecls_p is actually a vector.  */
+  fndecls = static_cast<vec<tree> *> (fndecls_p);
+
+  /* At least one more version other than the default.  */
+  gcc_assert (fndecls->length () >= 2);
+
+  /* The first version in the vector is the default decl.  */
+  memset ((void *) clones, '\0', sizeof (clones));
+  clones[ (int)CLONE_DEFAULT ] = (*fndecls)[0];
+
+  /* On the PowerPC, we do not need to call __builtin_cpu_init, if we are using
+     a new enough glibc.  If we ever need to call it, we would need to insert
+     the code here to do the call.  */
+
+  for (ix = 1; fndecls->iterate (ix, &ele); ++ix)
+    {
+      int priority = rs6000_clone_priority (ele);
+      if (!clones[priority])
+       clones[priority] = ele;
+    }
+
+  for (ix = 0; ix < (int)CLONE_MAX; ix++)
+    if (clones[ix])
+      {
+       if (TARGET_DEBUG_TARGET)
+         fprintf (stderr, "dispatch_function_versions, clone %d, %s\n",
+                  ix, get_decl_name (clones[ix]));
+
+       *empty_bb = add_condition_to_bb (dispatch_decl, clones[ix], ix,
+                                        *empty_bb);
+      }
+
+  return 0;
+}
+
+/* Generate the dispatching code body to dispatch multi-versioned function
+   DECL.  The target hook is called to process the "target" attributes and
+   provide the code to dispatch the right function at run-time.  NODE_P points
+   to the dispatcher node whose body will be created.  */
+
+static tree 
+rs6000_generate_version_dispatcher_body (void *node_p)
+{
+  tree resolver_decl;
+  basic_block empty_bb;
+  tree default_ver_decl;
+  struct cgraph_node *versn;
+  struct cgraph_node *node;
+
+  struct cgraph_function_version_info *node_version_info = NULL;
+  struct cgraph_function_version_info *versn_info = NULL;
+
+  node = (cgraph_node *)node_p;
+
+  node_version_info = node->function_version ();
+  gcc_assert (node->dispatcher_function
+             && node_version_info != NULL);
+
+  if (node_version_info->dispatcher_resolver)
+    return node_version_info->dispatcher_resolver;
+
+  /* The first version in the chain corresponds to the default version.  */
+  default_ver_decl = node_version_info->next->this_node->decl;
+
+  /* node is going to be an alias, so remove the finalized bit.  */
+  node->definition = false;
+
+  resolver_decl = make_resolver_func (default_ver_decl,
+                                     node->decl, &empty_bb);
+
+  node_version_info->dispatcher_resolver = resolver_decl;
+
+  if (TARGET_DEBUG_TARGET)
+    fprintf (stderr, "rs6000_generate_version_dispatcher_body, %s\n",
+            get_decl_name (resolver_decl));
+
+  push_cfun (DECL_STRUCT_FUNCTION (resolver_decl));
+
+  auto_vec<tree, 2> fn_ver_vec;
+
+  for (versn_info = node_version_info->next; versn_info;
+       versn_info = versn_info->next)
+    {
+      versn = versn_info->this_node;
+      /* Check for virtual functions here again, as by this time it should
+        have been determined if this function needs a vtable index or
+        not.  This happens for methods in derived classes that override
+        virtual methods in base classes but are not explicitly marked as
+        virtual.  */
+      if (DECL_VINDEX (versn->decl))
+       sorry ("Virtual function multiversioning not supported");
+
+      fn_ver_vec.safe_push (versn->decl);
+    }
+
+  dispatch_function_versions (resolver_decl, &fn_ver_vec, &empty_bb);
+  cgraph_edge::rebuild_edges ();
+  pop_cfun ();
+  return resolver_decl;
+}
+
+
 /* Hook to determine if one function can safely inline another.  */
 
 static bool
@@ -40208,12 +40748,7 @@ rs6000_can_inline_p (tree caller, tree c
 
   if (TARGET_DEBUG_TARGET)
     fprintf (stderr, "rs6000_can_inline_p:, caller %s, callee %s, %s inline\n",
-            (DECL_NAME (caller)
-             ? IDENTIFIER_POINTER (DECL_NAME (caller))
-             : "<unknown>"),
-            (DECL_NAME (callee)
-             ? IDENTIFIER_POINTER (DECL_NAME (callee))
-             : "<unknown>"),
+            get_decl_name (caller), get_decl_name (callee),
             (ret ? "can" : "cannot"));
 
   return ret;
Index: gcc/doc/extend.texi
===================================================================
--- gcc/doc/extend.texi (.../svn+ssh://meiss...@gcc.gnu.org/svn/gcc/trunk/gcc/doc)      (revision 248378)
+++ gcc/doc/extend.texi (.../gcc/doc)   (working copy)
@@ -3257,7 +3257,15 @@ For instance, on an x86, you could compi
 @code{target_clones("sse4.1,avx")}.  GCC creates two function clones,
 one compiled with @option{-msse4.1} and another with @option{-mavx}.
 It also creates a resolver function (see the @code{ifunc} attribute
-above) that dynamically selects a clone suitable for current architecture.
+above) that dynamically selects a clone suitable for current
+architecture.
+
+On a PowerPC, you could compile a function with
+@code{target_clones("cpu=power9,default")}.  GCC creates two function
+clones, one compiled with @option{-mcpu=power9} and another with the
+default options.  It also creates a resolver function (see the
+@code{ifunc} attribute above) that dynamically selects a clone
+suitable for current architecture.
 
 @item unused
 @cindex @code{unused} function attribute
Index: gcc/testsuite/gcc.target/powerpc/clone1.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/clone1.c   (.../svn+ssh://meiss...@gcc.gnu.org/svn/gcc/trunk/gcc/testsuite/gcc.target/powerpc)     (revision 0)
+++ gcc/testsuite/gcc.target/powerpc/clone1.c   (.../gcc/testsuite/gcc.target/powerpc)  (revision 248446)
@@ -0,0 +1,19 @@
+/* { dg-do compile { target { powerpc64*-*-* && lp64 } } } */
+/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power8" } } */
+/* { dg-options "-mcpu=power8 -O2" } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+
+__attribute__((target_clones("cpu=power9,default")))
+long mod_func (long a, long b)
+{
+  return a % b;
+}
+
+long mod_func_or (long a, long b, long c)
+{
+  return mod_func (a, b) | c;
+}
+
+/* { dg-final { scan-assembler-times {\mdivd\M}  1 } } */
+/* { dg-final { scan-assembler-times {\mmulld\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mmodsd\M} 1 } } */
