aaron.ballman added a comment.

In D106215#2943675 <https://reviews.llvm.org/D106215#2943675>, @cor3ntin wrote:

> In D106215#2943653 <https://reviews.llvm.org/D106215#2943653>, @aaron.ballman 
> wrote:
>
>> In D106215#2943631 <https://reviews.llvm.org/D106215#2943631>, @cor3ntin 
>> wrote:
>>
>>> In D106215#2943611 <https://reviews.llvm.org/D106215#2943611>, 
>>> @aaron.ballman wrote:
>>>
>>>> I think that C and C++ should behave the same here; at least, I don't see 
>>>> any reason why they should have different capabilities.
>>>
>>> I agree, but as WG14 hasn't weighed in, I didn't want to make that call.
>>> What do you think?
>>
>> My reading of C2x is that this is implementation-defined there as well.
>>
>> 6.4.4.4p13:
>>
>> A wide character constant prefixed by the letter L has type wchar_t, an
>> integer type defined in the <stddef.h> header; a wide character constant
>> prefixed by the letter u or U has type char16_t or char32_t, respectively,
>> unsigned integer types defined in the <uchar.h> header. The value of a wide
>> character constant containing a single multibyte character that maps to a
>> single member of the extended execution character set is the wide character
>> corresponding to that multibyte character, as defined by the mbtowc,
>> mbrtoc16, or mbrtoc32 function as appropriate for its type, with an
>> implementation-defined current locale. The value of a wide character
>> constant containing more than one multibyte character or a single multibyte
>> character that maps to multiple members of the extended execution character
>> set, or containing a multibyte character or escape sequence not represented
>> in the extended execution character set, is implementation-defined.
>>
>> Do you agree?
>
> Yes, I agree.
> I think clang could make it ill-formed if it wanted to!
> If we want to do that we could probably remove some more code :)

I don't see a reason why we'd want C and C++ to diverge here for our 
implementation, so I'd say let's do C as well. @rsmith -- do you have any 
concerns with that?

>>>> The paper said that there is no expected code breakage from this change, 
>>>> but have you tried building a diverse corpus of code (like a distro's 
>>>> worth of packages) under this patch to see if anything actually breaks in 
>>>> practice? (I don't expect breakage that isn't identifying an actual issue 
>>>> in the code, but having some verification would be appreciated.) This 
>>>> would also help to identify whether the change is appropriate for C as 
>>>> well.
>>>
>>> We have run regexes over various repositories (every vcpkg package) with 
>>> no match, but we have not run a complete compiler over them.
>>
>> Regexes are a good start but they miss the goofy (and sometimes awful) stuff 
>> that people do with token pasting, line continuations, and other random 
>> tricks. Would you be willing to try this as an experiment, or am I asking 
>> too much? :-) My thinking is that if we don't see any breakage from 
>> compiling a diverse corpus of code, we've done enough due diligence to 
>> suggest this is safe for both C and C++, but if we see some breakage, we can 
>> either identify that there's some valid use for this that we've not 
>> considered (less likely) and would be informative for both WG21 and WG14, or 
>> we can identify that we helped find bugs in real world code (more likely) 
>> which is also good feedback for the committees.
>
> Unless there is a script to do that easily, I'm not sure I'll be able to get 
> to it any time soon.
> But really, there is 0 use for these things! And you can't do much goofiness:
> `L ## 'ab'` certainly, but that wouldn't be very useful either.

Okay, I'm probably making too big of an ask here. I can't think of reasonable 
code that would be broken by this change. If that turns out to be incorrect, we 
can address it when we get a real world use case.
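
For anyone following along, here is a rough sketch of the token-pasting case 
mentioned above, i.e. the sort of thing a source-level regex could miss. The 
`WIDEN` macro and the function name are hypothetical, made up purely for 
illustration; they are not taken from the patch or from any real code base.

```c
#include <assert.h>  /* static_assert (C11) */
#include <stddef.h>  /* wchar_t */

/* Hypothetical helper: pastes the L prefix onto a character constant. */
#define WIDEN(c) L ## c

/* Pasting onto a single-character constant yields the ordinary wide
   character constant L'a'. */
static_assert(WIDEN('a') == L'a', "WIDEN('a') should be L'a'");

static wchar_t widened_single(void) { return WIDEN('a'); }

/* The case under discussion: a regex looking for L'..' in the source text
   would not match the line below, yet after preprocessing it becomes the
   multi-character wide constant L'ab', whose value is
   implementation-defined per the C2x wording quoted above.

   static wchar_t widened_multi(void) { return WIDEN('ab'); }
*/
```

Only the single-character form is left enabled here, since whether the 
commented-out form compiles at all is exactly what is being decided.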


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D106215/new/

https://reviews.llvm.org/D106215

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
