I wrote:
> I think it's time for me to report a glibc bug on strstr and strcasestr,
> then...

Paul Eggert wrote:
> But now that you mention it, why is there a c-strstr module, or a
> fancy strstr replacement that looks at multibyte characters?

The situation is indeed a bit messy. <ctype.h>, strtod, and strtold are
locale dependent, but sometimes one needs the locale independent
functionality, so we added c-ctype, c-strtod, c-strtold. I thought this
could be extended to more str* functions easily, but it turns out not to
be so easy.

The problematic modules are:

  - strstr: This function's behaviour is not clearly defined. POSIX says
    that it compares a "string" with a "sequence of bytes", which a
    priori is nonsense, since the elements of strings are characters.

  - strcase (strcasecmp, strncasecmp): Here POSIX talks about two
    strings, but doesn't mention LC_CTYPE explicitly. Rather it says the
    results are "unspecified" in real locales, i.e. in any locale other
    than the POSIX locale. Also strncasecmp does not make sense for
    multibyte locales, since its byte limit can fall in the middle of a
    character.

  - strcasestr: This function is not specified by POSIX. None of the
    known legacy implementations care about multibyte locales.

It was tempting to make a clear API nomenclature: c-str* for the C locale
emulation, str* for the internationalized functions. But if you're right
about strstr, then we should find new names for the internationalized
versions of these functions.

Bruno