abhina.sreeskantharajan marked 2 inline comments as done. abhina.sreeskantharajan added inline comments.
================ Comment at: clang/lib/Lex/LiteralSupport.cpp:235 + Converter->convert(std::string(1, ByteChar), ResultCharConv); + memcpy((void *)&ResultChar, ResultCharConv.data(), sizeof(unsigned)); + } ---------------- tahonermann wrote: > As Richard previously noted, this `memcpy()` needs to be addressed. The > intended behavior here is not clear. Are there valid scenarios in which the > conversion will produce a sequence of more than one code units? I believe > the input is limited to ASCII characters and invalid code units (e.g., a lead > byte of a UTF-8 sequence) and in the latter case, an error and/or > substitution of a `?` (in the execution encoding) seem like acceptable > behaviors to me. I replaced memcpy with an assignment. Please let me know if there is a better solution. ================ Comment at: clang/lib/Lex/LiteralSupport.cpp:238 + void *Pointer = &ResultChar; + memcpy(Pointer, ResultCharConv.data(), sizeof(unsigned)); + } ---------------- abhina.sreeskantharajan wrote: > rsmith wrote: > > What should happen if the result doesn't fit into an `unsigned`? This also > > appears to be making problematic assumptions about the endianness of the > > host. If we really want to pack multiple bytes of encoded output into a > > single `unsigned` result value (which itself seems dubious), we should do > > so with an endianness that doesn't depend on the host. > This may be a problem we need to revisit since ResultChar is expecting a char. I added an assertion for this case where the size of the character increases after translation. I've also removed the memcpy to avoid endianness issues. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D93031/new/ https://reviews.llvm.org/D93031 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits