Peter Smith <smithpb2...@gmail.com> writes:
> I had in mind something like a SHIFT-JIS encoding where a single
> "character" may include some trail bytes that happen to be in the
> ASCII printable range. AFAIK because the new logic is processing
> bytes, not characters, I thought the end result could be a mix of
> escaped and unescaped bytes for the single SJIS character.

It will not, because ...

> But now looking at PostgreSQL-supported character sets [1] I saw SJIS
> is not supported anyhow. Unfortunately, I am not familiar enough with
> other encodings to know if there is still a chance of similar
> printable ASCII trail bytes so I am fine with whatever wording is
> chosen.

... trailing bytes that could be mistaken for ASCII are precisely
the property that causes us to reject an encoding as not backend-safe.
So this code doesn't need to consider that hazard, and processing the
string byte-by-byte is perfectly OK.

I'd be inclined to keep the text as simple as possible and not focus
on the distinction between bytes and characters.

			regards, tom lane
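
For illustration only, here is a minimal Python sketch of the hazard discussed above. It is not the backend code: escape_non_printable() is a hypothetical stand-in for a byte-wise escaper that hex-escapes anything outside printable ASCII, and has_ascii_trail_byte() is a hypothetical helper. The sample character is KATAKANA LETTER SO (U+30BD), whose Shift-JIS encoding ends in 0x5C, the ASCII backslash.

```python
def escape_non_printable(data: bytes) -> str:
    """Hypothetical byte-wise escaper: printable ASCII passes through,
    every other byte becomes a \\xNN hex escape."""
    out = []
    for b in data:
        if 0x20 <= b <= 0x7E:
            out.append(chr(b))
        else:
            out.append(f"\\x{b:02x}")
    return "".join(out)


def has_ascii_trail_byte(ch: str, encoding: str) -> bool:
    """True if a multibyte encoding of ch has a non-leading byte below 0x80."""
    encoded = ch.encode(encoding)
    return len(encoded) > 1 and any(b < 0x80 for b in encoded[1:])


so = "\u30bd"  # KATAKANA LETTER SO

# Encodings usable in the backend (UTF-8 and EUC_JP here): every byte of a
# multibyte character has the high bit set, so the character is escaped whole.
utf8_result = escape_non_printable(so.encode("utf-8"))
eucjp_result = escape_non_printable(so.encode("euc_jp"))

# Shift-JIS, a client-only encoding: the trail byte 0x5C is left unescaped,
# producing the mix of escaped and unescaped bytes Peter described.
sjis_result = escape_non_printable(so.encode("shift_jis"))

print(utf8_result)
print(eucjp_result)
print(sjis_result)
```

With UTF-8 or EUC_JP input, all bytes of the character are escaped together. The Shift-JIS case is the only one that splits a character, and it cannot arise for strings stored in a backend-safe encoding.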