Hi, Jeff, 
thanks a lot for your comments. please see my reply in below:

> On Nov 16, 2017, at 12:47 PM, Jeff Law <l...@redhat.com> wrote:
> 
>> 
>>   B. for strncmp (s1, s2, n) (!)= 0 or strcmp (s1, s2) (!)= 0
>>      if the result is ONLY used to do a simple equality test against zero, 
>> one of "s1" or "s2" is a small constant string, n is a constant, and the 
>> other non-constant string is guaranteed to not read beyond the end of the 
>> string:
>>      change strncmp (s1, s2, n) or strcmp (s1, s2) to corresponding memcmp 
>> (s1, s2, n); 
> So how to you determine the non-constant string is long enough to avoid
> reading past its end?  I guess you might get that from range
> information.  But when it applies it seems reasoanble.  Again, this
> could be considered a canonicalization step.

In my current local implementation, I used the following routine to get the 
range info:  (and use the MINMAXLEN[1]+1 for the length of the non-constant 
string)

/* Determine the minimum and maximum value or string length that ARG
   refers to and store each in the first two elements of MINMAXLEN.
   For expressions that point to strings of unknown lengths that are
   character arrays, use the upper bound of the array as the maximum
   length.  For example, given an expression like 'x ? array : "xyz"'
   and array declared as 'char array[8]', MINMAXLEN[0] will be set
   to 3 and MINMAXLEN[1] to 7, the longest string that could be
   stored in array.
   Return true if the range of the string lengths has been obtained
   from the upper bound of an array at the end of a struct.  Such
   an array may hold a string that's longer than its upper bound
   due to it being used as a poor-man's flexible array member.  */

bool
get_range_strlen (tree arg, tree minmaxlen[2])
{
}

However, this routine currently miss a very obvious case as the following:

char s[100] = {'a','b','c','d’};

__builtin_strcmp(s, "abc") != 0

So, I have to change this routine to include such common case.  

do you think using this routine is good? or do you have other suggestions 
(since I am still not very familiar with the internals of GCC, might not find 
the best available one now…)

> 
> 
>> 
>>   C. for strcmp (s1, s2), strncmp (s1, s2, n), and memcmp (s1, s2, n)
>>      if the result is NOT used to do simple equality test against zero, one 
>> of "s1" or "s2" is a small constant string, n is a constant, and the Min 
>> value of the length of the constant string and "n" is smaller than a 
>> predefined threshold T, 
>>      inline the call by a byte-to-byte comparision sequence to avoid calling 
>> overhead. 
> Also seems reasonable.
>> 
>>   A would like to be put in gimple fold phase (in routine 
>> "gimple_fold_builtin_string_compare" of gimple-fold.c

>> OK.  Note that various optimizations can expose N or one of the strings
> to be a constant.  So having it as part of the folders makes a lot of
> sense .

I have finished this part of change and sent the patch to gcc-patch alias 
already. 
https://patchwork.ozlabs.org/patch/838200/ 
<https://patchwork.ozlabs.org/patch/838200/>

Do you think it’s necessary to add the same functionality at other places, such 
as tree-ssa-strlen.c, in order to catch more cases?

> 
>>   B would like to be put in strlen phase (add new "handle_builtin_str(n)cmp" 
>> routines in tree-ssa-strlen.c)
> Which is where you're most likely to have some kind of range information
> which ISTM you need to prove the non-constant string is large enough
> that you're not reading past its end.
Yes,that’s the main reason. another reason is: memcmp != 0 optimization is 
implemented at -O2,  if we want to use this available work,
implement B at strlen phase is better.

> 
>>   C would like to be put in expand phase (tree-ssa-strlen.c or builtins.c): 
>> run-time performance testing is needed to decide the predefined threshold T. 
> If you wanted to put it into expansion, I wouldn't object -- we could
> always do experiments to see if there's any value in moving it early
> (expansion to byte comparisons could possible expose other optimizations).
earlier to where? do you have any suggestion?  I can definitely do some 
experiments. 
> 
> In general I like what you're suggesting.  And on a higher level I like
> that we're looking to rationalize where these kinds of things happen
> (compiler vs library).  It's something I've wanted to see happen for a
> long time.
Part of A and C has been implemented in glibc previously,  Wilco removed them 
from Glibc  at
  https://sourceware.org/git/?p=glibc.git;a=commit;h=f7db120f67d853e0cfa2 
<https://sourceware.org/git/?p=glibc.git;a=commit;h=f7db120f67d853e0cfa2>

as I wrote in the writeup:

The reasons to delete it from glibc and move to GCC are:
     ** when doing it in glibc, the user cannot disable it by using 
-fno-builtin-strcmp/strncmp;
     ** __builtin_strncmp cannot be replaced similarly;
     ** when compile C++ or not using glibc, the transformation is not applied.

Thanks a lot.

Qing
> 
> 
> Jeff

Reply via email to