| Issue | 97741 |
| Summary | -Winvalid-token-paste fails to catch UCNs which are invalid preprocessor tokens |
| Labels | new issue |
| Assignees | |
| Reporter | jeffgarrett |
Consider ([godbolt link](https://godbolt.org/z/c1sGTM6WK)):
```cpp
// X is a lone backslash: the second `\` is spliced away with the blank line below.
#define X \\

#define U(x) x ## u0000
#define Y(x) U(x)
#define Z(x) #x
#define W(x) Z(x)
const char str[] = W(Y(X));
```
clang preprocesses it to `const char str[] = "\u0000";`, but the UCN `\u0000` is invalid as a preprocessing token per [\[lex.pptoken/2\]](https://eel.is/c++draft/lex.pptoken#2). Clang's behavior is still conforming, because a paste that produces an invalid preprocessing token is undefined behavior per [\[cpp.concat/3\]](https://eel.is/c++draft/cpp.concat#3).
(At least that's how I interpret it. The former allows a UCN in the grammar production and declares the program ill-formed if that production is matched. Does a character matching this production count as a "valid" preprocessing token in the sense used by the latter? It could also be read the other way: the UCN is a valid preprocessing token, so it can be produced fleetingly by pasting, but it would be ill-formed if written directly.)
Implementations diverge: the godbolt link shows that gcc gives the same error here as it does when the UCN is written directly in the source.
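For readers less familiar with why the reproducer goes through `Y`/`U` and `W`/`Z`, here is a minimal sketch of the same two-level expansion idiom using a paste that is unambiguously valid. The names `CAT_IMPL`, `CAT`, `STR_IMPL`, `STR`, `PREFIX` and `check_expansions` are hypothetical and chosen only for this illustration; they do not appear in the original report.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper macros, named for this sketch only.
#define CAT_IMPL(a, b) a ## b        // pastes its arguments exactly as written
#define CAT(a, b) CAT_IMPL(a, b)     // expands a and b first, then pastes
#define STR_IMPL(x) #x               // stringifies its argument exactly as written
#define STR(x) STR_IMPL(x)           // expands x first, then stringifies
#define PREFIX foo

// Two levels: PREFIX becomes foo, foo ## 123 forms the identifier foo123,
// and only then is the result stringified.
const char expanded[] = STR(CAT(PREFIX, 123));

// One level: the operand of # is not macro-expanded, so the call is
// stringified as spelled.
const char unexpanded[] = STR_IMPL(CAT(PREFIX, 123));

void check_expansions() {
    assert(std::strcmp(expanded, "foo123") == 0);
    assert(std::strcmp(unexpanded, "CAT(PREFIX, 123)") == 0);
}
```

In the reproducer the same mechanism applies, except that the pasted result is `\u0000` rather than an ordinary identifier, which is where the question about token validity arises.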
_______________________________________________
llvm-bugs mailing list
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-bugs