On 12/21/2017 02:25 PM, Qing Zhao wrote: > Hi, > > I updated my patch based on all your comments. > > the major changes are the following: > > 1. replace the candidate calls with __builtin_str(n)cmp_eq instead of > __builtin_memcmp_eq; > in builtins.c, when expanding the new __builtin_str(n)cmp_eq > call, expand them first as > __builtin_memcmp_eq, if Not succeed, change the call back to > __builtin_str(n)cmp. > 2. change the call to “get_range_strlen” with “compute_objsize”. Please read the big comment before compute_objsize. If you are going to use it to influence code generation or optimization, then you're most likely doing something wrong.
compute_objsize can return an estimate in some circumstances. > 3. add the missing case for equality checking with zero; > 4. adjust the new testing case for PR83026; add a new testing case for > the missing case added in 3. > 5. update “uhwi” to “shwi” for where it needs; > 6. some other minor format changes. > > the changes are retested on x86 and aarch64, bootstrapped and regression > tested. no issue. > > Okay for trunk? > > thanks. > > Qing > > Please see the updated patch: > > gcc/ChangeLog: > > +2017-12-21 Qing Zhao <qing.z...@oracle.com> > + > + PR middle-end/78809 > + PR middle-end/83026 > + * builtins.c (expand_builtin): Add the handling of BUILT_IN_STRCMP_EQ > + and BUILT_IN_STRNCMP_EQ. > + * builtins.def: Add new builtins BUILT_IN_STRCMP_EQ and > + BUILT_IN_STRNCMP_EQ. > + * tree-ssa-strlen.c (compute_string_length): New function. > + (handle_builtin_string_cmp): New function to handle calls to > + string compare functions. > + (strlen_optimize_stmt): Add handling to builtin string compare > + calls. > + * tree.c (build_common_builtin_nodes): Add new defines of > + BUILT_IN_STRNCMP_EQ and BUILT_IN_STRCMP_EQ. > + > gcc/testsuite/ChangeLog > > +2017-12-21 Qing Zhao <qing.z...@oracle.com> > + > + PR middle-end/78809 > + * gcc.dg/strcmpopt_2.c: New testcase. > + * gcc.dg/strcmpopt_3.c: New testcase. > + > + PR middle-end/83026 > + * gcc.dg/strcmpopt_3.c: New testcase. What I don't like here is the introduction of STRCMP_EQ and STRNCMP_EQ. ISTM that if you're going to introduce those new builtins, then you have to audit all the code that runs between their introduction into the IL and when you expand them to ensure they're handled properly. All you're really doing is carrying along a status bit about what tree-ssa-strlen did. So you could possibly store that status bit somewhere. The concern with both is that something later invalidates the analysis you've done. I'm having a hard time coming up with a case where this could happen, but I'm definitely concerned about this possibility. Though I feel it's more likely to happen if we store a status bit vs using STRCMP_EQ STRNCMP_EQ. [ For example, we have two calls with the same arguments, but one feeds an equality test, the other does not. If we store a status bit that one could be transformed, but then we end up CSE-ing the two calls, the status bit would be incorrect because one of the calls did not feed an equality test. Hmmm. ] > diff --git a/gcc/tree-ssa-strlen.c b/gcc/tree-ssa-strlen.c > index 94f20ef..57563ef 100644 > --- a/gcc/tree-ssa-strlen.c > +++ b/gcc/tree-ssa-strlen.c > @@ -2540,6 +2540,216 @@ handle_builtin_memcmp (gimple_stmt_iterator *gsi) > return false; > } > > +/* Given an index to the strinfo vector, compute the string length for the > + corresponding string. Return -1 when unknown. */ > + > +static HOST_WIDE_INT > +compute_string_length (int idx) > +{ > + HOST_WIDE_INT string_leni = -1; > + gcc_assert (idx != 0); > + > + if (idx < 0) > + string_leni = ~idx; So it seems to me you should just return ~idx; Then appropriately simplify the rest of the code. > + > +/* Handle a call to strcmp or strncmp. When the result is ONLY used to do > + equality test against zero: > + > + A. When both arguments are constant strings and it's a strcmp: > + * if the length of the strings are NOT equal, we can safely fold the > call > + to a non-zero value. > + * otherwise, do nothing now. I'm guessing your comment needs a bit of work. If both arguments are constant strings, then we can just use the host str[n]cmp to fold the str[n]cmp to a constant. Presumably this is handled earlier :-) So what I'm guessing is you're really referring to the case where the lengths are known constants, even if the contents of the strings themselves are not. In that case if its an equality comparison, then you can fold to a constant. Otherwise we do nothing. So I think the comment needs updating here. > + > + B. When one of the arguments is constant string, try to replace the call > with > + a __builtin_str(n)cmp_eq call where possible, i.e: > + > + strncmp (s, STR, C) (!)= 0 in which, s is a pointer to a string, STR is a > + constant string, C is a constant. > + if (C <= strlen(STR) && sizeof_array(s) > C) > + { > + replace this call with > + strncmp_eq (s, STR, C) (!)= 0 > + } > + if (C > strlen(STR) > + { > + it can be safely treated as a call to strcmp (s, STR) (!)= 0 > + can handled by the following strcmp. > + } > + > + strcmp (s, STR) (!)= 0 in which, s is a pointer to a string, STR is a > + constant string. > + if (sizeof_array(s) > strlen(STR)) > + { > + replace this call with > + strcmp_eq (s, STR, strlen(STR)+1) (!)= 0 So I'd hazard a guess that this comment has the same problem. It's mixing up the concept of a constant string and the concept of a nonconstant string, but which has a known constant length. First, if one of the arguments is a string constant, then it should be easy to get the constant at expansion time with minimal effort. It's also the case that knowing if we're comparing the result against zero is trivial to figure out at expansion time. This would probably be a reasonble thing to add to the expanders. Your function comment for handle_builtin_string_cmp does not indicate what the return value means. From reading the code is appears to return TRUE when it does nothing and FALSE when it optimizes the call in some way. Is that correct? That seems inverted from the way we'd normally write this stuff. > + > + /* When one of args is constant string. */ > + tree var_string = NULL_TREE; > + HOST_WIDE_INT const_string_leni = -1; > + > + if (idx1) > + { > + const_string_leni = compute_string_length (idx1); > + var_string = arg2; > + } > + else > + { > + gcc_checking_assert (idx2); > + const_string_leni = compute_string_length (idx2); > + var_string = arg1; > + } > + > + if (const_string_leni < 0) > + return true; > + > + unsigned HOST_WIDE_INT var_sizei = 0; > + /* try to determine the minimum size of the object pointed by var_string. > */ > + tree size = compute_objsize (var_string, 2); So this worries me as well. compute_objsize is not supposed to be used to influence code generation or optimization. It is not guaranteed to return an exact size, but instead may return an estimate! I'd really like to hear Jakub or Richi chime in with their thoughts on how this transforms one builtin into another and whether or not they think that is wise. jeff