On Thu, Sep 26, 2013 at 02:31:33PM -0500, Aldy Hernandez wrote: > --- a/gcc/config/i386/i386.c > +++ b/gcc/config/i386/i386.c > @@ -42806,6 +42806,43 @@ ix86_memmodel_check (unsigned HOST_WIDE_INT val) > return val; > } > > +/* Return the default vector mangling ISA code when none is specified > + in a `processor' clause. */ > + > +static char > +ix86_cilkplus_default_vector_mangling_isa_code (struct cgraph_node *clone > + ATTRIBUTE_UNUSED) > +{ > + return 'x'; > +}
I think rth was suggesting using vecsize_mangle, vecsize_modifier or something else, instead of ISA, because it won't represent the ISA on all targets. It is just some magic letter used in mangling of the simd functions. > + > + /* To distinguish from an OpenMP simd clone, Cilk Plus functions to > + be cloned have a distinctive artificial label in addition to "omp > + declare simd". */ > + bool cilk_clone = flag_enable_cilkplus > + && lookup_attribute ("cilk plus elemental", > + DECL_ATTRIBUTES (new_node->symbol.decl)); Formatting. I'd say it should be bool cilk_clone = (flag_enable_cilkplus && lookup_attribute ("cilk plus elemental", DECL_ATTRIBUTES (new_node->symbol.decl))); > + if (cilk_clone) > + remove_attribute ("cilk plus elemental", > + DECL_ATTRIBUTES (new_node->symbol.decl)); I think it doesn't make sense to remove the attribute. > + pretty_printer vars_pp; Do you really need two different pretty printers? Can't you just print "_ZGV%c%c%d into pp (is pp_printf that cheap, wouldn't it be better to pp_string (&pp, "_ZGV"), 2 pp_character + one pp_decimal_int?), and then do the loop over the args, which right now writes into vars_pp and finally pp_underscore and pp_string the normally mangled name? > +/* Create a simd clone of OLD_NODE and return it. */ > + > +static struct cgraph_node * > +simd_clone_create (struct cgraph_node *old_node) > +{ > + struct cgraph_node *new_node; > + new_node = cgraph_function_versioning (old_node, vNULL, NULL, NULL, false, > + NULL, NULL, "simdclone"); > + My understanding of how IPA cloning etc. works is that you first set up various data structures describing how you change the arguments and only then actually do cgraph_function_versioning which already during the copying will do some of the transformations of the IL. But perhaps those transformations are too complicated to describe for tree-inline.c to make them for you. > + tree attr = lookup_attribute ("omp declare simd", > + DECL_ATTRIBUTES (node->symbol.decl)); > + if (!attr) > + return; > + do > + { > + struct cgraph_node *new_node = simd_clone_create (node); > + > + bool inbranch_clause; > + simd_clone_clauses_extract (new_node, TREE_VALUE (attr), > + &inbranch_clause); > + simd_clone_compute_isa_and_simdlen (new_node); > + simd_clone_mangle (node, new_node); As discussed on IRC, I was hoping that for OpenMP simd and selected targets (e.g. i?86-linux and x86_64-linux) we could do better than that, creating not just one or two clones as we do for Cilk+ where one can select which CPU (and thus ISA) he wants to build the clones for, but creating clones for all ISAs, and just based on command line options either emit just one of them as the really optimized one and the others just as thunks that would just call other simd clone functions or the normal function possibly several times. Jakub