Question for the experts. Let's take the following example:

----->8------------->8--------------------
#include <stdio.h>
#include <string.h>
#include <wchar.h>

#define period   0x2e
#define question 0x3f
#define exclam   0x21
#define ellipsis L'\u2026'

const wchar_t p[] = { period, question, exclam, ellipsis };

int main()
{
    const wchar_t s[] = L". Hello.";

    printf("%ls\n", s);
    printf("%lu\n", wcsspn(s, p));

    return 0;
}
-------------8<-----------8<----------------

Now run:

$ cc -Wall example.c -o example && ./example
. Hello.
8

$ egcc -Wall example.c -o example && ./example
. Hello.
1

As you can see, the program does what is expected when compiled with GCC.
To get the desired result with Clang, you have to write the set as a
string literal. Change the declaration of p[] above to:

const wchar_t p[] = L".?!…";
                         ^
                         This is a UTF-8 ellipsis.

And now:

$ cc -Wall example.c -o example && ./example
. Hello.
1

Using only ASCII or only UTF-8 in the array also works.

Is this a bug in clang's wcsspn(), or am I wrong in assuming that the
array can be declared the way I did?

-- 
Walter
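
A possible explanation, offered as a sketch rather than a definitive diagnosis: wcsspn() takes two null-terminated wide strings, but a brace-initialized array holds exactly the elements listed, so p[] as declared in the example has no terminating L'\0'. The function then reads past the end of the array, which is undefined behavior, and the result depends on whatever each compiler happens to place after p[] in memory. A wide string literal gets its terminator automatically, which would explain why the string-literal declaration behaves. The names punct and leading_punct below are hypothetical, for illustration only.

```c
#include <stddef.h>
#include <wchar.h>

/* Same set as in the post, plus an explicit terminator. */
static const wchar_t punct[] = {
    0x2e,       /* period */
    0x3f,       /* question mark */
    0x21,       /* exclamation mark */
    L'\u2026',  /* horizontal ellipsis */
    L'\0'       /* terminator: tells wcsspn() where the set ends */
};

/* Hypothetical helper: length of the leading run of s that consists
 * only of characters from punct. */
size_t
leading_punct(const wchar_t *s)
{
    return wcsspn(s, punct);
}
```

With the terminator in place, leading_punct(L". Hello.") returns 1, since only the first character of the string is in the set. Printing the result with "%zu" rather than "%lu" matches the size_t return type.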