Thanks, that looks nice, but some quibbles about the comments:

> /* The functions defined in this file assume the "C" locale and a character
>    set without diacritics (ASCII-US or EBCDIC-US or something like that).
>    Even if the "C" locale on a particular system is an extension of the ASCII
>    character set (like on BeOS, where it is UTF-8, or on AmigaOS, where it
>    is ISO-8859-1), the functions in this file recognize only the ASCII
>    characters.  More precisely, one of the string arguments must be an ASCII
>    string with additional restrictions.  */

The intent here is to act like the "C" locale, where all single bytes count as characters, even when some other locale is in effect, right?  So the comment is misleading, since the code doesn't assume the "C" locale.  How about the following comment instead?

   /* c_strstr behaves like strstr would behave in the "C" locale,
      where every single byte counts as a distinct character.  */

> This function is safe to be called, even in a multibyte locale, if NEEDLE
> ...

I think this claim isn't true for some weird non-ASCII encoding schemes like DBCS-Host.  Also, it wouldn't be true if someone introduced a new encoding that varies from ASCII in some other way.  How about changing the wording to be:

   In all practical encodings that we know of that are extensions
   or near-extensions of ASCII, this function is safe to be called,
   even in a multibyte locale, if NEEDLE ...

Another possibility would be to remove the claim entirely, since it's not that relevant to the intended use of c_strstr.

> foundneedle:
>   return (char*) haystack;

The usual GNU style puts a space before the "*", i.e. "(char *) haystack".
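
To make the multibyte concern above concrete, here is a minimal sketch of a byte-by-byte search.  It is not the code under review: bytewise_strstr and sjis_so are names made up for illustration, and the loop is a naive one written only to show the behavior.  It uses Shift_JIS, where the katakana "so" is encoded as the bytes 0x83 0x5C, so the trail byte has the same value as an ASCII backslash.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a "C"-locale strstr: it compares raw
   bytes and never consults the current locale.  */
char *
bytewise_strstr (const char *haystack, const char *needle)
{
  size_t needle_len = strlen (needle);

  if (needle_len == 0)
    return (char *) haystack;
  for (; *haystack != '\0'; haystack++)
    /* strncmp stops at the haystack's terminating NUL, so this
       never reads past the end of the string.  */
    if (strncmp (haystack, needle, needle_len) == 0)
      return (char *) haystack;
  return NULL;
}

/* One Shift_JIS character (katakana "so"): lead byte 0x83,
   trail byte 0x5C.  The trail byte equals ASCII '\\'.  */
const char sjis_so[] = { (char) 0x83, 0x5C, '\0' };
```

Calling bytewise_strstr (sjis_so, "\\") returns sjis_so + 1, i.e. a match in the middle of a two-byte character, although the text contains no backslash character.  That is the kind of false match that any "safe in a multibyte locale" claim has to rule out through its restrictions on NEEDLE.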