On Tue, Oct 11, 2022 at 07:42:43 -0400, Ben Boeckel wrote:
> On Mon, Oct 10, 2022 at 17:04:09 -0400, Jason Merrill wrote:
> > Can we share utf8 parsing code with decode_utf8_char in pretty-print.cc?
> 
> I can look at factoring that out. I'll have to decode its logic to see
> how much overlap there is.

There is some mismatch. First, `decode_utf8_char` lives in `gcc`, while
this code is in `libcpp`. Second, `pretty-print.cc`'s implementation:

- fails on an empty string;
- accepts extended-length (5+-byte) encodings which are invalid Unicode;
  and
- decodes codepoint-by-codepoint instead of just validating the entire
  string.
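
For illustration, a whole-string validator along these lines could look
like the sketch below. This is not the code from the patch nor from
`pretty-print.cc`; the name `utf8_is_valid` and its signature are made up
for the example. It accepts the empty string, rejects 5+-byte lead bytes,
and only validates without handing back codepoints.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper for illustration only; not libcpp's or
   pretty-print.cc's implementation.  Returns true if the LEN bytes
   at S are well-formed UTF-8.  An empty buffer is valid.  */
static bool
utf8_is_valid (const unsigned char *s, size_t len)
{
  size_t i = 0;

  while (i < len)
    {
      unsigned char c = s[i];
      size_t n;               /* Number of continuation bytes.  */
      unsigned long cp, min;  /* Decoded value and smallest legal value.  */

      if (c < 0x80)
        {
          i++;                /* Plain ASCII.  */
          continue;
        }
      else if ((c & 0xE0) == 0xC0)
        { n = 1; cp = c & 0x1F; min = 0x80; }
      else if ((c & 0xF0) == 0xE0)
        { n = 2; cp = c & 0x0F; min = 0x800; }
      else if ((c & 0xF8) == 0xF0)
        { n = 3; cp = c & 0x07; min = 0x10000; }
      else
        return false;         /* Stray continuation or 5+-byte lead.  */

      if (len - i <= n)
        return false;         /* Sequence is truncated.  */

      for (size_t k = 1; k <= n; k++)
        {
          if ((s[i + k] & 0xC0) != 0x80)
            return false;     /* Not a continuation byte.  */
          cp = (cp << 6) | (s[i + k] & 0x3F);
        }

      /* Reject overlong forms, surrogates, and values past U+10FFFF.  */
      if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return false;

      i += n + 1;
    }

  return true;
}
```

Since the caller only gets a yes/no answer, there is no per-codepoint
state to hand back, which is the main difference from a decoder-style
interface.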

--Ben
