Re: Please review writeup for fixing PR 78809 (inline strcmp for small constant strings)

Jeff Law Thu, 16 Nov 2017 17:16:17 -0800

On 11/16/2017 03:39 PM, Qing Zhao wrote:
> Hi, Jeff, 
> thanks a lot for your comments. please see my reply in below:
> 
> 
>> On Nov 16, 2017, at 12:47 PM, Jeff Law <l...@redhat.com
>> <mailto:l...@redhat.com>> wrote:
>>
>>>
>>>   B. for strncmp (s1, s2, n) (!)= 0 or strcmp (s1, s2) (!)= 0
>>>      if the result is ONLY used to do a simple equality test against
>>> zero, one of "s1" or "s2" is a small constant string, n is a
>>> constant, and the other non-constant string is guaranteed to not read
>>> beyond the end of the string:
>>>      change strncmp (s1, s2, n) or strcmp (s1, s2) to corresponding
>>> memcmp (s1, s2, n); 
>> So how to you determine the non-constant string is long enough to avoid
>> reading past its end?  I guess you might get that from range
>> information.  But when it applies it seems reasoanble.  Again, this
>> could be considered a canonicalization step.
> 
> In my current local implementation, I used the following routine to get
> the range info:  (and use the MINMAXLEN[1]+1 for the length of the
> non-constant string)
> 
> /* Determine the minimum and maximum value or string length that ARG
>    refers to and store each in the first two elements of MINMAXLEN.
>    For expressions that point to strings of unknown lengths that are
>    character arrays, use the upper bound of the array as the maximum
>    length.  For example, given an expression like 'x ? array : "xyz"'
>    and array declared as 'char array[8]', MINMAXLEN[0] will be set
>    to 3 and MINMAXLEN[1] to 7, the longest string that could be
>    stored in array.
>    Return true if the range of the string lengths has been obtained
>    from the upper bound of an array at the end of a struct.  Such
>    an array may hold a string that's longer than its upper bound
>    due to it being used as a poor-man's flexible array member.  */
> 
> bool
> get_range_strlen (tree arg, tree minmaxlen[2])
> {
> }
> 
> However, this routine currently miss a very obvious case as the following:
> 
> char s[100] = {'a','b','c','d’};
> 
> __builtin_strcmp(s, "abc") != 0
> 
> So, I have to change this routine to include such common case.  
> 
> do you think using this routine is good? or do you have other
> suggestions (since I am still not very familiar with the internals of
> GCC, might not find the best available one now…)
The range information attached to an SSA_NAME is global data.  ie, it
must hold at all locations where the object in question might be
referenced.  This implies that it will sometimes (often?) be less
precise than you might like.


I am currently working towards an embeddable context sensitive range
analyzer that in theory could be used within tree-ssa-strlen pass to
give more precise range information.  I'm hoping to wrap that work up in
the next day or so so that folks can use it in gcc-8.




> 
>>
>>
>>>
>>>   C. for strcmp (s1, s2), strncmp (s1, s2, n), and memcmp (s1, s2, n)
>>>      if the result is NOT used to do simple equality test against
>>> zero, one of "s1" or "s2" is a small constant string, n is a
>>> constant, and the Min value of the length of the constant string and
>>> "n" is smaller than a predefined threshold T, 
>>>      inline the call by a byte-to-byte comparision sequence to avoid
>>> calling overhead. 
>> Also seems reasonable.
>>>
>>>   A would like to be put in gimple fold phase (in routine
>>> "gimple_fold_builtin_string_compare" of gimple-fold.c
> 
>>> OK.  Note that various optimizations can expose N or one of the strings
>> to be a constant.  So having it as part of the folders makes a lot of
>> sense .
> 
> I have finished this part of change and sent the patch to gcc-patch
> alias already. 
> https://patchwork.ozlabs.org/patch/838200/
> 
> Do you think it’s necessary to add the same functionality at other
> places, such as tree-ssa-strlen.c, in order to catch more cases?
For "A"?  No, I think having it in the gimple folder is fine.  Most
passes call into the folder when they make noteable changes to a
statement.  So if a later pass exposes one argument as a string constant
your code for "A" should get a chance to fold the call.

>>
>>>   B would like to be put in strlen phase (add new
>>> "handle_builtin_str(n)cmp" routines in tree-ssa-strlen.c)
>> Which is where you're most likely to have some kind of range information
>> which ISTM you need to prove the non-constant string is large enough
>> that you're not reading past its end.
> Yes,that’s the main reason. another reason is: memcmp != 0 optimization
> is implemented at -O2,  if we want to use this available work,
> implement B at strlen phase is better.
Noted.

> 
>>
>>>   C would like to be put in expand phase (tree-ssa-strlen.c or
>>> builtins.c): run-time performance testing is needed to decide the
>>> predefined threshold T. 
>> If you wanted to put it into expansion, I wouldn't object -- we could
>> always do experiments to see if there's any value in moving it early
>> (expansion to byte comparisons could possible expose other optimizations).
> earlier to where? do you have any suggestion?  I can definitely do some
> experiments. 
The biggest question in my mind is what secondary opportunities arise
when we expose the byte comparisons.  So things like if-conversion
if-combination, propagation of equality, particularly in the single byte
case, etc.

The difficulty is tracking when exposure leads to these secondary
opportunities.  I often end up looking for this kind of stuff by first
identifying many source files where the transformation applies.  Then I
generate dumps & assembly code for the transformation in each candidate
location.  I first analyze the assembly files for differences.  If an
assembly file shows a difference, then I look more closely to try and
characterize the difference and if it looks interesting, then I work
backwards to the dump files for more details.
>>
>> In general I like what you're suggesting.  And on a higher level I like
>> that we're looking to rationalize where these kinds of things happen
>> (compiler vs library).  It's something I've wanted to see happen for a
>> long time.
> Part of A and C has been implemented in glibc previously,  Wilco removed
> them from Glibc  at
>   https://sourceware.org/git/?p=glibc.git;a=commit;h=f7db120f67d853e0cfa2
I know :-)  I loosely watch the glibc lists too and have expressed
support for the glibc's team's effort to move transformations to the
points where they make the most sense.

jeff

Re: Please review writeup for fixing PR 78809 (inline strcmp for small constant strings)

Reply via email to