On Mon, Sep 04, 2023 at 11:14:41PM -0600, Jeff Law wrote:
> 
> 
> On 9/4/23 14:58, Hamza Mahfooz wrote:
> > Currently, we give up in fold_strstr_to_strncmp() if the length of the
> > the second argument to strstr() isn't known to us by the time we hit
> > that function. However, we can instead insert a strlen() in ourselves
> > and continue trying to fold strstr() into strlen()+strncmp().
> > 
> >     PR tree-optimization/96601
> > 
> > gcc/ChangeLog:
> > 
> >     * tree-ssa-strlen.cc (fold_strstr_to_strncmp): If arg1_len == NULL,
> >     insert a strlen() for strstr()'s arg1 and use it as arg1_len.
> > 
> > gcc/testsuite/ChangeLog:
> > 
> >     * gcc.dg/strlenopt-30.c: Modify test.
> I'm not sure it's necessarily a win to convert to strncmp as aggressively as
> this patch would do.  Essentially when you have large needles that are
> partially matched repeatedly, performance can significantly suffer.

For -Os/-Oz I think this is never a desirable optimization, turning one call
into 2 with all the argument setup etc. (unless we have the length constant
or already computed).

Otherwise, consider say 2GB long needle, it will take quite long to even
compute strlen of that.
Now, looking at current glibc strstr implementation, it starts with
  /* Handle short needle special cases first.  */
  if (ne[0] == '\0')
    return (char *)hs;
  hs = (const unsigned char *)strchr ((const char*)hs, ne[0]);
  if (hs == NULL || ne[1] == '\0')
    return (char*)hs;
  if (ne[2] == '\0')
    return strstr2 (hs, ne);
  if (ne[3] == '\0')
    return strstr3 (hs, ne);

  /* Ensure haystack length is at least as long as needle length.
     Since a match may occur early on in a huge haystack, use strnlen
     and read ahead a few cachelines for improved performance.  */
  size_t ne_len = strlen ((const char*)ne);
  size_t hs_len = __strnlen ((const char*)hs, ne_len | 512);
So, if needle is very long but first character of the needle doesn't
appear in haystack at all and haystack is shorter than needle, this will also
not be desirable optimization, because strstr would just return NULL after
walking haystack, while with the optimization you need to compute the
length.  Otherwise I think the optimization is desirable, because typically
haystack is longer than needle and walking it completely using strchr will
be already more expensive than strlen on needle and otherwise strstr
computes the strlen anyway later.  But perhaps if strlen isn't known it
might be better to guard the strlen + strncmp on inline comparison of the
first character, so that one rules out the above mentioned special case,
so that we won't compute the strlen unnecessarily at least in that case.
Still strstr (32b_string, 2147483647b_string) == 32b_string will be
serious slowdown if 32b_string[0] == 2147483647b_string[0], but perhaps the
common case is more important.

Or do we want to add to the C library some kind of asymetric strcmp,
which will be equivalent to strncmp (p1, p2, strlen (p2)) but will actually
not compute the strlen but only compare the characters and just handle the
case where p2 has as the first different character '\0' as return 0 rather
than difference?

        Jakub

Reply via email to