Hi, Jeff, Really sorry for my delay.
Your email on 1/12/2018 and Richard’s email on 1/15/2018, were completely lost in my mailboxes until yesterday my colleague, Paolo Carlini, forwarded it to me. I really apologize for the late reply. Please see my reply below: > On Jan 12, 2018, at 4:47 PM, Jeff Law <l...@redhat.com > <mailto:l...@redhat.com>> wrote: >> >> >> 1. replace the candidate calls with __builtin_str(n)cmp_eq instead of >> __builtin_memcmp_eq; >> in builtins.c, when expanding the new __builtin_str(n)cmp_eq >> call, expand them first as >> __builtin_memcmp_eq, if Not succeed, change the call back to >> __builtin_str(n)cmp. >> 2. change the call to “get_range_strlen” with “compute_objsize”. > Please read the big comment before compute_objsize. If you are going to > use it to influence code generation or optimization, then you're most > likely doing something wrong. > > compute_objsize can return an estimate in some circumstances. > On Jan 15, 2018, at 2:48 AM, Richard Biener <rguent...@suse.de > <mailto:rguent...@suse.de>> wrote: > > Yes, compute_objsize cannot be used for optimizations. Yes, I just checked the compute_objsize again, you are right, it might return an estimated code size. > On Jan 12, 2018, at 4:47 PM, Jeff Law <l...@redhat.com > <mailto:l...@redhat.com>> wrote: > What I don't like here is the introduction of STRCMP_EQ and STRNCMP_EQ. > ISTM that if you're going to introduce those new builtins, then you have > to audit all the code that runs between their introduction into the IL > and when you expand them to ensure they're handled properly. > > All you're really doing is carrying along a status bit about what > tree-ssa-strlen did. So you could possibly store that status bit somewhere. > > The concern with both is that something later invalidates the analysis > you've done. I'm having a hard time coming up with a case where this > could happen, but I'm definitely concerned about this possibility. > Though I feel it's more likely to happen if we store a status bit vs > using STRCMP_EQ STRNCMP_EQ. > > [ For example, we have two calls with the same arguments, but one feeds > an equality test, the other does not. If we store a status bit that one > could be transformed, but then we end up CSE-ing the two calls, the > status bit would be incorrect because one of the calls did not feed an > equality test. Hmmm. ] See below. >> +static HOST_WIDE_INT >> +compute_string_length (int idx) >> +{ >> + HOST_WIDE_INT string_leni = -1; >> + gcc_assert (idx != 0); >> + >> + if (idx < 0) >> + string_leni = ~idx; > So it seems to me you should just > return ~idx; > > Then appropriately simplify the rest of the code. Okay, I will update my code per your suggestion. > >> + >> +/* Handle a call to strcmp or strncmp. When the result is ONLY used to do >> + equality test against zero: >> + >> + A. When both arguments are constant strings and it's a strcmp: >> + * if the length of the strings are NOT equal, we can safely fold the >> call >> + to a non-zero value. >> + * otherwise, do nothing now. > I'm guessing your comment needs a bit of work. If both arguments are > constant strings, then we can just use the host str[n]cmp to fold the > str[n]cmp to a constant. Presumably this is handled earlier :-) > > So what I'm guessing is you're really referring to the case where the > lengths are known constants, even if the contents of the strings > themselves are not. In that case if its an equality comparison, then > you can fold to a constant. Otherwise we do nothing. So I think the > comment needs updating here. your understanding is correct, I will update the comments to make it more accurate. >> + >> + B. When one of the arguments is constant string, try to replace the call >> with >> + a __builtin_str(n)cmp_eq call where possible, i.e: >> + >> + strncmp (s, STR, C) (!)= 0 in which, s is a pointer to a string, STR is >> a >> + constant string, C is a constant. >> + if (C <= strlen(STR) && sizeof_array(s) > C) >> + { >> + replace this call with >> + strncmp_eq (s, STR, C) (!)= 0 >> + } >> + if (C > strlen(STR) >> + { >> + it can be safely treated as a call to strcmp (s, STR) (!)= 0 >> + can handled by the following strcmp. >> + } >> + >> + strcmp (s, STR) (!)= 0 in which, s is a pointer to a string, STR is a >> + constant string. >> + if (sizeof_array(s) > strlen(STR)) >> + { >> + replace this call with >> + strcmp_eq (s, STR, strlen(STR)+1) (!)= 0 > So I'd hazard a guess that this comment has the same problem. It's > mixing up the concept of a constant string and the concept of a > nonconstant string, but which has a known constant length. > > First, if one of the arguments is a string constant, then it should be > easy to get the constant at expansion time with minimal effort. It's > also the case that knowing if we're comparing the result against zero is > trivial to figure out at expansion time. This would probably be a > reasonble thing to add to the expanders. Yes, should be a string with constant length. I will update my comments to reflect this. > Your function comment for handle_builtin_string_cmp does not indicate > what the return value means. From reading the code is appears to return > TRUE when it does nothing and FALSE when it optimizes the call in some > way. Is that correct? Yes. > That seems inverted from the way we'd normally > write this stuff. This is just following the handling of memcmp (please see the routine handle_builtin_memcmp), try to be consistent with them. >> + >> + /* When one of args is constant string. */ >> + tree var_string = NULL_TREE; >> + HOST_WIDE_INT const_string_leni = -1; >> + >> + if (idx1) >> + { >> + const_string_leni = compute_string_length (idx1); >> + var_string = arg2; >> + } >> + else >> + { >> + gcc_checking_assert (idx2); >> + const_string_leni = compute_string_length (idx2); >> + var_string = arg1; >> + } >> + >> + if (const_string_leni < 0) >> + return true; >> + >> + unsigned HOST_WIDE_INT var_sizei = 0; >> + /* try to determine the minimum size of the object pointed by var_string. >> */ >> + tree size = compute_objsize (var_string, 2); > So this worries me as well. compute_objsize is not supposed to be used > to influence code generation or optimization. It is not guaranteed to > return an exact size, but instead may return an estimate! As I mentioned in above: Yes, I just checked the compute_objsize again, you are right, it might return an estimated code size. there are 3 parts in compute_objsize: A. call “compute_builtin_object_size” to get the code size; B. If A failed, use value_RANGE info when the referenced object involves a non-constant offset in some range. C. if both A and B failed, use the TYPE info to determine the code size. So, From my understanding: both A and C should provide accurate code size info, but B will NOT. for my purpose, if I do NOT use B, then it will be return accurate code size, right? > > > I'd really like to hear Jakub or Richi chime in with their thoughts on > how this transforms one builtin into another and whether or not they think > that is wise On Jan 15, 2018, at 2:48 AM, Richard Biener <rguent...@suse.de <mailto:rguent...@suse.de>> wrote: > Given it follows how we handle memcmp () != 0 it's ok to trigger inline > expansion. > Didn't look into the patch whether it does this. > And yes, places that look at STR[N]CMP now also have to look atSTR[N]CMP_EQ. Based on both you and Richard’s above comments, my understanding is: introducing STRCMP_EQ/STRNCMP_EQ follows the current handling of memcmp ()!=0 in GCC, should be fine. but I should check all the places that look at STRCMP/STRNCMP to include STRCMP_EQ/STRNCMP_EQ now. Thanks. Qing