Re: new module c-strstr

Paul Eggert Fri, 18 Aug 2006 10:44:53 -0700

Thanks, that looks nice, but some quibbles about the comments:

> /* The functions defined in this file assume the "C" locale and a character
>    set without diacritics (ASCII-US or EBCDIC-US or something like that).
>    Even if the "C" locale on a particular system is an extension of the ASCII
>    character set (like on BeOS, where it is UTF-8, or on AmigaOS, where it
>    is ISO-8859-1), the functions in this file recognize only the ASCII
>    characters.  More precisely, one of the string arguments must be an ASCII
>    string with additional restrictions.  */


The intent here is to act like the "C" locale, where all single bytes count
as characters, even when some other locale is in effect, right?  So
the comment is misleading, since the code doesn't assume the "C"
locale.  How about the following comment instead?

/* c_strstr behaves like strstr would behave in the "C" locale, where
   every single byte counts as a distinct character.  */
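
To make the byte-at-a-time behavior concrete, here is a minimal sketch of
such a search.  It is not the code from the proposed module; the names
sketch_c_strstr and sketch_c_strstr_demo are hypothetical and exist only
for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical sketch, not the gnulib code: find NEEDLE in HAYSTACK,
   comparing one byte at a time and never consulting the locale.  */
static char *
sketch_c_strstr (const char *haystack, const char *needle)
{
  size_t needle_len = strlen (needle);

  /* An empty needle matches at the start, as with strstr.  */
  if (needle_len == 0)
    return (char *) haystack;

  for (; *haystack != '\0'; haystack++)
    if (*haystack == *needle
        && strncmp (haystack, needle, needle_len) == 0)
      return (char *) haystack;

  return NULL;
}

/* Small self-check; call it from a test or from main.  */
static void
sketch_c_strstr_demo (void)
{
  const char *text = "foobar";

  assert (sketch_c_strstr (text, "bar") == text + 3);
  assert (sketch_c_strstr (text, "") == text);
  assert (sketch_c_strstr (text, "baz") == NULL);
}
```

Because every comparison is on single bytes, the result does not change
when setlocale has selected a multibyte locale.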

>    This function is safe to be called, even in a multibyte locale, if NEEDLE
>    ...

I think this claim isn't true for some weird non-ASCII encoding
schemes like DBCS-Host.  Also, it wouldn't be true if someone
introduced a new encoding that varies from ASCII in some other way.
How about changing the wording to be:

   In all practical encodings that we know of that are extensions or
   near-extensions of ASCII, this function is safe to be called, even
   in a multibyte locale, if NEEDLE ...

Another possibility would be to remove the claim entirely, since it's not
that relevant to the intended use of c_strstr.
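
As an illustration of why a bytewise search needs some restriction on
NEEDLE in a multibyte encoding, here is a small sketch.  It uses
Shift_JIS rather than DBCS-Host, which is my choice of example and an
assumption, not something taken from the patch: the two-byte sequence
0x83 0x5C encodes a single katakana character, and its second byte has
the same value as the ASCII backslash.  The names sjis_text,
backslash_offset and false_match_demo are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumption: these bytes are the Shift_JIS encoding of the single
   katakana character U+30BD, namely 0x83 0x5C.  */
static const char sjis_text[] = { (char) 0x83, 0x5C, '\0' };

/* Return the byte offset of the first backslash byte that strstr
   finds in TEXT, or -1 if there is none.  */
static long
backslash_offset (const char *text)
{
  const char *hit = strstr (text, "\\");
  return hit ? (long) (hit - text) : -1;
}

static void
false_match_demo (void)
{
  /* The match lands on the trail byte of the two-byte character,
     even though the text contains no backslash character.  */
  assert (backslash_offset (sjis_text) == 1);
}
```

A needle made only of bytes that never occur as trail bytes would not
produce this kind of false match, but which bytes those are depends on
the encoding.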

> foundneedle:
>   return (char*) haystack;

The usual GNU style puts a space before the "*", i.e., "(char *) haystack".
