cor3ntin marked an inline comment as done.
cor3ntin added a comment.

In D106215#2943653 <https://reviews.llvm.org/D106215#2943653>, @aaron.ballman 
wrote:

> In D106215#2943631 <https://reviews.llvm.org/D106215#2943631>, @cor3ntin 
> wrote:
>
>> In D106215#2943611 <https://reviews.llvm.org/D106215#2943611>, 
>> @aaron.ballman wrote:
>>
>>> I think that C and C++ should behave the same here; at least, I don't see 
>>> any reason why they should have different capabilities.
>>
>> I agree, but as WG14 hasn't weighed in, I didn't want to make that call.
>> What do you think?
>
> My reading of C2x is that this is implementation-defined there as well.
>
> 6.4.4.4p13:
>
> A wide character constant prefixed by the letter L has type wchar_t, an
> integer type defined in the <stddef.h> header; a wide character constant
> prefixed by the letter u or U has type char16_t or char32_t, respectively,
> unsigned integer types defined in the <uchar.h> header. The value of a wide
> character constant containing a single multibyte character that maps to a
> single member of the extended execution character set is the wide character
> corresponding to that multibyte character, as defined by the mbtowc,
> mbrtoc16, or mbrtoc32 function as appropriate for its type, with an
> implementation-defined current locale. The value of a wide character
> constant containing more than one multibyte character or a single multibyte
> character that maps to multiple members of the extended execution character
> set, or containing a multibyte character or escape sequence not represented
> in the extended execution character set, is implementation-defined.
>
> Do you agree?

Yes, I agree.
I think clang could make it ill-formed if it wanted to!
If we want to do that we could probably remove some more code :)
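
For readers following along, here is a minimal sketch of the part of the quoted wording that is not in dispute: the type of each prefixed character constant. It is written as C++ and is not taken from the patch. The multi-character case under discussion is left commented out on purpose, since its value is implementation-defined and a compiler may reject it outright.

```cpp
#include <type_traits>

// Single-character constants: the prefix alone determines the type.
static_assert(std::is_same<decltype(L'a'), wchar_t>::value,
              "L prefix gives wchar_t");
static_assert(std::is_same<decltype(u'a'), char16_t>::value,
              "u prefix gives char16_t");
static_assert(std::is_same<decltype(U'a'), char32_t>::value,
              "U prefix gives char32_t");

// The case this review is about: more than one character inside a
// wide character constant. Kept as a comment because its value is
// implementation-defined and it may not compile at all.
// wchar_t questionable = L'ab';
```

The assertions are all evaluated at compile time, so a translation unit containing them either builds or fails with the message attached to the offending line.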

>>> The paper said that there is no expected code breakage from this change, 
>>> but have you tried building a diverse corpus of code (like a distro's worth 
>>> of packages) under this patch to see if anything actually breaks in 
>>> practice? (I don't expect breakage that isn't identifying an actual issue 
>>> in the code, but having some verification would be appreciated.) This would 
>>> also help to identify whether the change is appropriate for C as well.
>>
>> We have run regexes over various repositories (every vcpkg package) with no
>> match. We have not run a complete compiler over them.
>
> Regexes are a good start but they miss the goofy (and sometimes awful) stuff 
> that people do with token pasting, line continuations, and other random 
> tricks. Would you be willing to try this as an experiment, or am I asking too 
> much? :-) My thinking is that if we don't see any breakage from compiling a 
> diverse corpus of code, we've done enough due diligence to suggest this is 
> safe for both C and C++, but if we see some breakage, we can either identify 
> that there's some valid use for this that we've not considered (less likely) 
> and would be informative for both WG21 and WG14, or we can identify that we 
> helped find bugs in real world code (more likely) which is also good feedback 
> for the committees.

Unless there is a script to do that easily, I'm not sure I'll be able to get to 
it any time soon.
But really, there is zero use for these things! And there isn't much room for
goofiness: `L ## 'ab'` is possible, certainly, but that wouldn't be very
useful either.
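
To make the token-pasting case concrete, here is a small sketch. The macro name `WIDEN` is hypothetical and does not come from the patch or from any real code base. It shows how a wide character constant can be formed without the `L'...'` spelling ever appearing in the source text, which is why a regex search could miss it.

```cpp
#include <type_traits>

// Hypothetical helper macro (an assumption for illustration only):
// pastes the L prefix onto a character literal token.
#define WIDEN(x) L ## x

// With a single character, the pasted token is an ordinary wide
// character constant of type wchar_t.
static_assert(std::is_same<decltype(WIDEN('a')), wchar_t>::value,
              "pasting L onto 'a' yields a wchar_t constant");
static_assert(WIDEN('a') == L'a',
              "the pasted token is the same constant as L'a'");

// WIDEN('ab') would expand to L'ab', the multi-character wide
// constant discussed in this review. It is left commented out
// because a compiler may reject it.
// wchar_t pasted = WIDEN('ab');
```

A search for the literal text `L'` followed by two or more characters would not match the `WIDEN('ab')` call site, only the expansion seen by the compiler.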


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D106215/new/

https://reviews.llvm.org/D106215

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
