On Tue, Oct 11, 2022 at 07:42:43 -0400, Ben Boeckel wrote:
> On Mon, Oct 10, 2022 at 17:04:09 -0400, Jason Merrill wrote:
> > Can we share utf8 parsing code with decode_utf8_char in pretty-print.cc?
>
> I can look at factoring that out. I'll have to decode its logic to see
> how much overlap there is.

There is some mismatch. First, that is in `gcc` and this is in `libcpp`.
Second, `pretty-print.cc`'s implementation:

- fails on an empty string;
- accepts extended-length (5+-byte) encodings, which are invalid Unicode;
  and
- decodes codepoint-by-codepoint instead of just validating the entire
  string.

--Ben
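
For illustration only, here is a minimal sketch of the kind of whole-string
validation described above. It is not the code from either `libcpp` or
`pretty-print.cc`; the name `utf8_valid` and its signature are hypothetical.
It assumes an empty string should count as valid, and it rejects any lead
byte that would start a 5+-byte sequence, along with overlong forms,
surrogates, and values above U+10FFFF.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not the libcpp or pretty-print.cc code).
// Returns true if the LEN bytes at S are well-formed UTF-8.
// An empty buffer (LEN == 0) is treated as valid by assumption.
bool utf8_valid (const unsigned char *s, std::size_t len)
{
  std::size_t i = 0;
  while (i < len)
    {
      unsigned char c = s[i];
      std::size_t extra;   // continuation bytes expected after the lead byte
      std::uint32_t cp;    // code point accumulated so far
      std::uint32_t min;   // smallest code point allowed at this length

      if (c < 0x80)
        { extra = 0; cp = c; min = 0; }
      else if ((c & 0xE0) == 0xC0)
        { extra = 1; cp = c & 0x1F; min = 0x80; }
      else if ((c & 0xF0) == 0xE0)
        { extra = 2; cp = c & 0x0F; min = 0x800; }
      else if ((c & 0xF8) == 0xF0)
        { extra = 3; cp = c & 0x07; min = 0x10000; }
      else
        return false;      // stray continuation byte or 5+-byte lead byte

      if (len - i - 1 < extra)
        return false;      // sequence is truncated

      for (std::size_t k = 1; k <= extra; ++k)
        {
          unsigned char cc = s[i + k];
          if ((cc & 0xC0) != 0x80)
            return false;  // not a continuation byte
          cp = (cp << 6) | (cc & 0x3F);
        }

      if (cp < min)
        return false;      // overlong encoding
      if (cp > 0x10FFFF)
        return false;      // beyond the Unicode range
      if (cp >= 0xD800 && cp <= 0xDFFF)
        return false;      // UTF-16 surrogate, not a scalar value

      i += extra + 1;
    }
  return true;
}
```

The function answers a single yes/no question for the whole buffer in one
call, so a caller never sees individual code points.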