> -----Original Message-----
> From: Jakub Jelinek <ja...@redhat.com>
> Sent: Tuesday, October 3, 2023 12:02 PM
> To: Tamar Christina <tamar.christ...@arm.com>
> Cc: gcc-patches@gcc.gnu.org; nd <n...@arm.com>; jwak...@redhat.com
> Subject: Re: [PATCH]middle-end: Recursively check
> is_trivially_copyable_or_pair in vec.h
> 
> On Tue, Oct 03, 2023 at 10:27:16AM +0000, Tamar Christina wrote:
> > +/* Structure used to track meta-data on PHI arguments used to generate
> > +   most efficient comparison sequence to slatten a PHI node.  */
> 
>                                             ^^^ typo (at least, never heard 
> of this word, and
> wiktionary doesn't know it either (except for Dannish/Swedish))
> 
> > @@ -2045,6 +2065,25 @@ gen_phi_nest_statement (gphi *phi,
> gimple_stmt_iterator *gsi,
> >    return lhs;
> >  }
> >
> 
> Perhaps add a short function comment here?
> 
> > +static int
> > +cmp_arg_entry (const void *p1, const void *p2) {
> > +  const ifcvt_arg_entry sval1 = *(const ifcvt_arg_entry *)p1;
> > +  const ifcvt_arg_entry sval2 = *(const ifcvt_arg_entry *)p2;
> > +
> > +  if (sval1.num_compares < sval2.num_compares)
> > +    return -1;
> > +  else if (sval1.num_compares > sval2.num_compares)
> > +    return 1;
> > +
> > +  if (sval1.occurs < sval2.occurs)
> > +    return -1;
> > +  else if (sval1.occurs > sval2.occurs)
> > +    return 1;
> > +
> > +  return 0;
> > +}
> > +
> 
> > @@ -2167,61 +2206,53 @@ predicate_scalar_phi (gphi *phi,
> gimple_stmt_iterator *gsi)
> >    /* Create hashmap for PHI node which contain vector of argument indexes
> >       having the same value.  */
> >    bool swap = false;
> > -  hash_map<tree_operand_hash, auto_vec<int> > phi_arg_map;
> > +  hash_map<tree_operand_hash, vec<int> > phi_arg_map;
> >    unsigned int num_args = gimple_phi_num_args (phi);
> >    /* Vector of different PHI argument values.  */
> > -  auto_vec<tree> args (num_args);
> > +  auto_vec<ifcvt_arg_entry_t> args;
> >
> > -  /* Compute phi_arg_map.  */
> > +  /* Compute phi_arg_map, determine the list of unique PHI args and the
> indices
> > +     where they are in the PHI node.  The indices will be used to determine
> > +     the conditions to apply and their complexity.  */
> >    for (i = 0; i < num_args; i++)
> >      {
> >        tree arg;
> >
> >        arg = gimple_phi_arg_def (phi, i);
> > -      if (!phi_arg_map.get (arg))
> > -   args.quick_push (arg);
> >        phi_arg_map.get_or_insert (arg).safe_push (i);
> >      }
> >
> > -  /* Determine element with max number of occurrences and complexity.
> Looking at only
> > -     number of occurrences as a measure for complexity isn't enough as all
> usages can
> > -     be unique but the comparisons to reach the PHI node differ per branch.
> */
> > -  typedef std::pair <tree, std::pair <unsigned, unsigned>> ArgEntry;
> > -  auto_vec<ArgEntry> argsKV;
> > -  for (i = 0; i < args.length (); i++)
> > +  /* Determine element with max number of occurrences and complexity.
> Looking
> > +     at only number of occurrences as a measure for complexity isn't enough
> as
> > +     all usages can be unique but the comparisons to reach the PHI node 
> > differ
> > +     per branch.  */
> > +  for (auto entry : phi_arg_map)
> >      {
> >        unsigned int len = 0;
> > -      for (int index : phi_arg_map.get (args[i]))
> > +      for (int index : entry.second)
> >     {
> >       edge e = gimple_phi_arg_edge (phi, index);
> >       len += get_bb_num_predicate_stmts (e->src);
> >     }
> >
> > -      unsigned occur = phi_arg_map.get (args[i])->length ();
> > +      unsigned occur = entry.second.length ();
> >        if (dump_file && (dump_flags & TDF_DETAILS))
> >     fprintf (dump_file, "Ranking %d as len=%d, idx=%d\n", i, len, occur);
> > -      argsKV.safe_push ({ args[i], { len, occur }});
> > +      args.safe_push ({ entry.first, len, occur, entry.second });
> >      }
> >
> >    /* Sort elements based on rankings ARGS.  */
> > -  std::sort(argsKV.begin(), argsKV.end(), [](const ArgEntry &left,
> > -                                        const ArgEntry &right) {
> > -    return left.second < right.second;
> > -  });
> > -
> > -  for (i = 0; i < args.length (); i++)
> > -    args[i] = argsKV[i].first;
> > +  args.qsort (cmp_arg_entry);
> 
> I admit I don't know what you're using the args vector later on for and
> whether its ordering affects code generation, but because you qsort it I
> assume it does. My worry is that a hash_map traversal might not be the same
> order on all hosts and similarly qsort doesn't achieve stable sorting in case
> num_compares and occurrs members are equal for two or more different
> arguments.  Can that ever happen?

The order does matter but only for args. The hashmap is only used to collect
the unique values and their locations.  While you can have num_compares and
occurs being the same it wouldn't matter for the optimization as they are both
"equally" as expensive.  It would matter in the case where they are the last 2
entries in the list as we never test the last entry.  So your codegen would
select a different element then to test.  So could potentially affect 
reproducibility.

> We have stablesort method instead of
> qsort but that would require consistent ordering in the vector (std::sort
> doesn't ensure stable sorting either).
> 
> If it is a non-issue, the patch is ok with the above nits fixed.  Otherwise
> perhaps we'd need to push in the first loop into the vector (but that
>       if (!phi_arg_map.get (arg))
>       args.quick_push (arg);
>       phi_arg_map.get_or_insert (arg).safe_push (i); in there was quite
> inefficient, better would be
>       bool existed;
>       phi_arg_map.get_or_insert (arg, &existed).safe_push (i);
>       if (!existed)
>       args.safe_push (ifcvt_arg_entry { arg, 0, 0, vNULL }); or something
> similar), plus use stablesort.  Or add another compared member which would
> be the first position.

Hmm the problem here is that it would make the second loop that fills in the len
quadratic as it has to search for arg in the list.  I suppose I could push a 
pointer
to the struct instead of `i` in the hashmap and the element into args and update
the pointer as we go along?  Would that work?

Regards,
Tamar

> 
>       Jakub

Reply via email to