https://bugs.llvm.org/show_bug.cgi?id=39586
Bug ID: 39586
Summary: Unicode no-break space is treated in an inconsistent way
Product: clang
Version: unspecified
Hardware: PC
OS: Linux
Status: NEW
Severity: normal
Priority: P
Component: Frontend
Assignee: unassignedclangb...@nondot.org
Reporter: vincent-l...@vinc17.net
CC: llvm-bugs@lists.llvm.org, richard-l...@metafoo.co.uk
As a follow-up to bug 39585 (which is actually a Debian packaging bug), consider
the following program:
int a;
#if FOO
#endif
int main (void)
{
return 0;
}
where the space before "int a;" and the space between "#if" and "FOO" are
no-break spaces (U+00A0).
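
Because the no-break spaces are invisible in the listing above, the file is easy to get wrong when copied by hand. The following is only a sketch of one way to recreate it, assuming the file is UTF-8 encoded (so U+00A0 is the byte pair 0xC2 0xA0, written here as the octal escapes \302\240) and that it contains exactly the lines shown above; line and column positions in the diagnostics may therefore differ from those quoted in this report.

```shell
# Recreate tst.c with U+00A0 encoded as UTF-8 (bytes 0xC2 0xA0 = octal \302\240).
# Assumption: UTF-8 encoding and the exact line layout of the listing above.
printf '\302\240int a;\n#if\302\240FOO\n#endif\nint main (void)\n{\nreturn 0;\n}\n' > tst.c

# Dump the first bytes to confirm that the no-break spaces are really there.
od -c tst.c | head -n 2
```

In the od output, each no-break space should show up as the two bytes 302 240, once before "int" and once between "#if" and "FOO".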
Under Debian/unstable:
$ clang-8 tst.c
tst.c:1:1: warning: treating Unicode character as whitespace [-Wunicode-whitespace]
int a;
^
tst.c:3:4: warning: treating Unicode character as whitespace [-Wunicode-whitespace]
#if FOO
^
2 warnings generated.
But with the -E option:
$ clang-8 -E tst.c
tst.c:3:4: error: invalid token at start of a preprocessor expression
#if FOO
^
# 1 "tst.c"
# 1 "<built-in>" 1
# 1 "<built-in>" 3
# 349 "<built-in>" 3
# 1 "<command line>" 1
# 1 "<built-in>" 2
# 1 "tst.c" 2
int a;
int main (void)
{
return 0;
}
1 error generated.
With -E, the first no-break space is probably still treated as whitespace, as
it is without the option, but the second one is rejected as an invalid token.
This is inconsistent.
Previous clang versions behave in the same way.