[clang] [libcxx] [llvm] [Clang] Add warnings when mixing different charN_t types (PR #138708)

Tom Honermann via cfe-commits Wed, 14 May 2025 07:47:56 -0700

================
@@ -0,0 +1,155 @@
+// RUN: %clang_cc1 -verify -fsyntax-only -std=c++20 -Wconversion %s
+
+void c8(char8_t);
+void c16(char16_t);
+void c32(char32_t);
+
+void test(char8_t u8, char16_t u16, char32_t u32) {
+    c8(u8);
+    c8(u16); // expected-warning {{implicit conversion from 'char16_t' to 'char8_t' may lose precision and change the meaning of the represented code unit}}
+    c8(u32); // expected-warning {{implicit conversion from 'char32_t' to 'char8_t' may lose precision and change the meaning of the represented code unit}}
+
+    c16(u8);  // expected-warning {{implicit conversion from 'char8_t' to 'char16_t' may change the meaning of the represented code unit}}
+    c16(u16);
+    c16(u32); // expected-warning {{implicit conversion from 'char32_t' to 'char16_t' may lose precision and change the meaning of the represented code unit}}
+
+    c32(u8);  // expected-warning {{implicit conversion from 'char8_t' to 'char32_t' may change the meaning of the represented code unit}}
+    c32(u16); // expected-warning {{implicit conversion from 'char16_t' to 'char32_t' may change the meaning of the represented code unit}}
+    c32(u32);
+
+
+    c8(char32_t(0x7f));
+    c8(char32_t(0x80));   // expected-warning {{implicit conversion from 'char32_t' to 'char8_t' changes the meaning of the codepoint '<U+0080>'}}
+
+    c8(char16_t(0x7f));
+    c8(char16_t(0x80));   // expected-warning {{implicit conversion from 'char16_t' to 'char8_t' changes the meaning of the codepoint '<U+0080>'}}
+    c8(char16_t(0xD800)); // expected-warning {{implicit conversion from 'char16_t' to 'char8_t' changes the meaning of the code unit '<0xD800>'}}
+    c8(char16_t(0xE000)); // expected-warning {{implicit conversion from 'char16_t' to 'char8_t' changes the meaning of the codepoint '<U+E000>'}}
+
+
+    c16(char32_t(0x7f));
+    c16(char32_t(0x80));
+    c16(char32_t(0xD7FF));
+    c16(char32_t(0xD800)); // expected-warning {{implicit conversion from 'char32_t' to 'char16_t' changes the meaning of the code unit '<0xD800>'}}
+    c16(char32_t(0xE000));
+    c16(char32_t(U'🐉')); // expected-warning {{implicit conversion from 'char32_t' to 'char16_t' changes the meaning of the codepoint '🐉'}}
----------------
tahonermann wrote:


Thinking about this some more: the presentation displays an invalid code unit
or code point value as `'<0xXXXX>'`, an unprintable character as `'<U+XXXX>'`,
and a printable character as itself, e.g., `'🐉'`. Perhaps those distinctions
suffice to drop "the codepoint" from the message? For example:
```
c16(char32_t(U'🐉')); // expected-warning {{implicit conversion from 'char32_t' to 'char16_t' changes the meaning of '🐉'}}
```
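A rough sketch of that three-way presentation, for illustration only. The
names `present`, `roughlyPrintable`, and `encodeUtf8` are hypothetical, and the
printability check is a crude stand-in (it only rejects C0/C1 controls and the
BMP private-use area), not what Clang actually uses:

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical, deliberately crude printability check: rejects only C0/C1
// controls and the BMP private-use area. A real check needs Unicode tables.
inline bool roughlyPrintable(std::uint32_t cp) {
    if (cp < 0x20 || (cp >= 0x7F && cp <= 0x9F)) return false;
    return !(cp >= 0xE000 && cp <= 0xF8FF);
}

// Encode a valid Unicode scalar value as UTF-8.
inline std::string encodeUtf8(std::uint32_t cp) {
    std::string out;
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}

// Pick one of the three forms: <0xXXXX>, <U+XXXX>, or the character itself.
inline std::string present(std::uint32_t value) {
    char buf[16];
    const bool surrogate = value >= 0xD800 && value <= 0xDFFF;
    if (surrogate || value > 0x10FFFF) {  // not a valid code point
        std::snprintf(buf, sizeof buf, "<0x%04lX>",
                      static_cast<unsigned long>(value));
        return buf;
    }
    if (!roughlyPrintable(value)) {       // valid but not printable
        std::snprintf(buf, sizeof buf, "<U+%04lX>",
                      static_cast<unsigned long>(value));
        return buf;
    }
    return encodeUtf8(value);             // printable: show the character
}

// Values taken from the test cases quoted above.
inline void checkPresentation() {
    assert(present(0xD800) == "<0xD800>");
    assert(present(0x80) == "<U+0080>");
    assert(present(0xE000) == "<U+E000>");
    assert(present(0x1F409) == "\xF0\x9F\x90\x89");  // UTF-8 for U+1F409
}
```

Since the three forms are already visually distinct, the surrounding "code
unit" / "codepoint" wording carries little extra information.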
We could also present both the source value and the post-conversion result to
make it more explicit how the meaning changes. For example:
```
c16(char32_t(U'🐉')); // expected-warning {{implicit conversion from 'char32_t' to 'char16_t' changes the meaning of '🐉' to '<U+F409>'}}
```
Or, if `char16_t` happens to be larger than 16 bits (I'm not sure we should 
care about this case though):
```
c16(char32_t(U'🐉')); // expected-warning {{implicit conversion from 'char32_t' to 'char16_t' changes the meaning of '🐉' to '<0x1F409>'}}
```
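For reference, the `'<U+F409>'` value in the second example is just the
dragon's code point truncated to 16 bits. A minimal check, assuming `char16_t`
is exactly 16 bits wide (the names `dragon` and `narrowed` are only for
illustration):

```cpp
// Assumes char16_t is exactly 16 bits wide, as on mainstream targets;
// on a wider char16_t the first assertion would not hold.
constexpr char32_t dragon = 0x1F409;  // U+1F409, the dragon emoji
constexpr char16_t narrowed = static_cast<char16_t>(dragon);

// The conversion keeps only the low 16 bits.
static_assert(narrowed == 0xF409, "low 16 bits of 0x1F409");

// The result lands in the BMP private-use area (U+E000..U+F8FF), which is
// why it would be shown in the unprintable form rather than as a character.
static_assert(narrowed >= 0xE000 && narrowed <= 0xF8FF, "private-use area");
```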

https://github.com/llvm/llvm-project/pull/138708
_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
