On 12/21/2017 02:25 PM, Qing Zhao wrote:
> Hi, 
> 
> I updated my patch based on all your comments. 
> 
> the major changes are the following:
> 
>       1. replace the candidate calls with __builtin_str(n)cmp_eq instead of 
> __builtin_memcmp_eq;
>             in builtins.c,  when expanding the new __builtin_str(n)cmp_eq 
> call, expand them first as
>             __builtin_memcmp_eq, if Not succeed,  change the call back to 
> __builtin_str(n)cmp.
>       2. change the call to “get_range_strlen” with “compute_objsize”.
Please read the big comment before compute_objsize.  If you are going to
use it to influence code generation or optimization, then you're most
likely doing something wrong.

compute_objsize can return an estimate in some circumstances.


>       3. add the missing case for equality checking with zero;
>       4. adjust the new testing case for PR83026; add a new testing case for 
> the missing case added in 3.
>       5. update “uhwi” to “shwi” for where it needs;
>       6. some other minor format changes.
> 
> the changes are retested on x86 and aarch64, bootstrapped and regression 
> tested. no issue.
> 
> Okay for trunk?
> 
> thanks.
> 
> Qing
> 
> Please see the updated patch:
> 
> gcc/ChangeLog:
> 
> +2017-12-21  Qing Zhao  <qing.z...@oracle.com>
> +
> +     PR middle-end/78809
> +     PR middle-end/83026
> +     * builtins.c (expand_builtin): Add the handling of BUILT_IN_STRCMP_EQ
> +     and BUILT_IN_STRNCMP_EQ.
> +     * builtins.def: Add new builtins BUILT_IN_STRCMP_EQ and
> +     BUILT_IN_STRNCMP_EQ.
> +     * tree-ssa-strlen.c (compute_string_length): New function.
> +     (handle_builtin_string_cmp): New function to handle calls to
> +     string compare functions.
> +     (strlen_optimize_stmt): Add handling to builtin string compare
> +     calls. 
> +     * tree.c (build_common_builtin_nodes): Add new defines of
> +     BUILT_IN_STRNCMP_EQ and BUILT_IN_STRCMP_EQ.
> +
> gcc/testsuite/ChangeLog
> 
> +2017-12-21  Qing Zhao  <qing.z...@oracle.com> 
> +
> +     PR middle-end/78809
> +     * gcc.dg/strcmpopt_2.c: New testcase.
> +     * gcc.dg/strcmpopt_3.c: New testcase.
> +
> +     PR middle-end/83026
> +     * gcc.dg/strcmpopt_3.c: New testcase.
What I don't like here is the introduction of STRCMP_EQ and STRNCMP_EQ.
ISTM that if you're going to introduce those new builtins, then you have
to audit all the code that runs between their introduction into the IL
and when you expand them to ensure they're handled properly.

All you're really doing is carrying along a status bit about what
tree-ssa-strlen did.  So you could possibly store that status bit somewhere.

The concern with both is that something later invalidates the analysis
you've done.  I'm having a hard time coming up with a case where this
could happen, but I'm definitely concerned about this possibility.
Though I feel it's more likely to happen if we store a status bit vs
using STRCMP_EQ STRNCMP_EQ.

[ For example, we have two calls with the same arguments, but one feeds
an equality test, the other does not.  If we store a status bit that one
could be transformed, but then we end up CSE-ing the two calls, the
status bit would be incorrect because one of the calls did not feed an
equality test.  Hmmm. ]




> diff --git a/gcc/tree-ssa-strlen.c b/gcc/tree-ssa-strlen.c
> index 94f20ef..57563ef 100644
> --- a/gcc/tree-ssa-strlen.c
> +++ b/gcc/tree-ssa-strlen.c
> @@ -2540,6 +2540,216 @@ handle_builtin_memcmp (gimple_stmt_iterator *gsi)
>    return false;
>  }
>  
> +/* Given an index to the strinfo vector, compute the string length for the
> +   corresponding string. Return -1 when unknown.  */
> + 
> +static HOST_WIDE_INT 
> +compute_string_length (int idx)
> +{
> +  HOST_WIDE_INT string_leni = -1; 
> +  gcc_assert (idx != 0);
> +
> +  if (idx < 0)
> +    string_leni = ~idx;
So it seems to me you should just
  return ~idx;

Then appropriately simplify the rest of the code.

> +
> +/* Handle a call to strcmp or strncmp. When the result is ONLY used to do 
> +   equality test against zero:
> +
> +   A. When both arguments are constant strings and it's a strcmp:
> +      * if the length of the strings are NOT equal, we can safely fold the 
> call
> +        to a non-zero value.
> +      * otherwise, do nothing now.
I'm guessing your comment needs a bit of work.  If both arguments are
constant strings, then we can just use the host str[n]cmp to fold the
str[n]cmp to a constant.  Presumably this is handled earlier :-)

So what I'm guessing is you're really referring to the case where the
lengths are known constants, even if the contents of the strings
themselves are not.  In that case if its an equality comparison, then
you can fold to a constant.  Otherwise we do nothing.  So I think the
comment needs updating here.






> +  
> +   B. When one of the arguments is constant string, try to replace the call 
> with
> +   a __builtin_str(n)cmp_eq call where possible, i.e:
> +
> +   strncmp (s, STR, C) (!)= 0 in which, s is a pointer to a string, STR is a 
> +   constant string, C is a constant.
> +     if (C <= strlen(STR) && sizeof_array(s) > C)
> +       {
> +         replace this call with
> +         strncmp_eq (s, STR, C) (!)= 0
> +       }
> +     if (C > strlen(STR)
> +       {
> +         it can be safely treated as a call to strcmp (s, STR) (!)= 0
> +         can handled by the following strcmp.
> +       }
> +
> +   strcmp (s, STR) (!)= 0 in which, s is a pointer to a string, STR is a 
> +   constant string.
> +     if  (sizeof_array(s) > strlen(STR))
> +       {
> +         replace this call with
> +         strcmp_eq (s, STR, strlen(STR)+1) (!)= 0
So I'd hazard a guess that this comment has the same problem.  It's
mixing up the concept of a constant string and the concept of a
nonconstant string, but which has a known constant length.

First, if one of the arguments is a string constant, then it should be
easy to get the constant at expansion time with minimal effort.  It's
also the case that knowing if we're comparing the result against zero is
trivial to figure out at expansion time.  This would probably be a
reasonble thing to add to the expanders.



Your function comment for handle_builtin_string_cmp does not indicate
what the return value means.  From reading the code is appears to return
TRUE when it does nothing and FALSE when it optimizes the call in some
way.  Is that correct?  That seems inverted from the way we'd normally
write this stuff.



> +
> +  /* When one of args is constant string.  */
> +  tree var_string = NULL_TREE;
> +  HOST_WIDE_INT const_string_leni = -1;
> +  
> +  if (idx1)
> +    {
> +      const_string_leni = compute_string_length (idx1);
> +      var_string = arg2;
> +    } 
> +  else 
> +    {
> +      gcc_checking_assert (idx2);
> +      const_string_leni = compute_string_length (idx2);
> +      var_string = arg1;
> +    } 
> +
> +  if (const_string_leni < 0) 
> +    return true;
> + 
> +  unsigned HOST_WIDE_INT var_sizei = 0;
> +  /* try to determine the minimum size of the object pointed by var_string.  
> */
> +  tree size = compute_objsize (var_string, 2);
So this worries me as well.  compute_objsize is not supposed to be used
to influence code generation or optimization.  It is not guaranteed to
return an exact size, but instead may return an estimate!




I'd really like to hear Jakub or Richi chime in with their thoughts on
how this transforms one builtin into another and whether or not they
think that is wise.

jeff

Reply via email to