On Fri, Apr 12, 2019 at 11:25:20AM +0200, felixs wrote:
> On Thu, Apr 11, 2019 at 07:12:57PM -0500, Derek Martin wrote:
> > On Sun, Apr 07, 2019 at 11:13:53PM +0200, felixs wrote:
> > > On Fri, Apr 05, 2019 at 11:24:26AM -0700, Ian Zimmerman wrote:
> > Thanks. I had already posted a follow-up on my first message.
> > But clearly it won't help at all in this case. The problematic string
> > isn't a binary representation of a unicode character. It's an HTML
> > entity, and HTML entities in recipient headers are not supported by any
> > of the RFCs
> Even though, call them HTML entities, call them something else, they
> are ASCII characters and as such they are a subset of utf-8.

Stop saying it's unicode or ASCII... That's incorrect--or at best
misleading--on 2 counts:

1. &#236; (or &#263;) are indeed HTML entities, inserted by some
   software which clearly intended them to be interpreted as HTML
   entities. Referring to them as Unicode is wrong, or at best
   misleading.

2. The character which was meant to be represented, 'ć' (which is
   actually &#263;, not &#236;--I assume the last two digits got
   transposed somehow...) is not ASCII at all, it's extended latin-1.

Certainly the characters in the string, ['&', '#', '2', '3', '6'], are
all ASCII characters, but it is plainly obvious that the string is meant
to represent some other individual character, and if you're familiar
with HTML entities, the notation makes it plain enough that it is an
HTML entity.

So claiming that it's UTF-8 and consequently suggesting setting the
charset to UTF-8 is totally a red herring.

-- 
Derek D. Martin    http://www.pizzashack.org/   GPG Key ID: 0xDFBEAD02
-=-=-=-=-
This message is posted from an invalid address. Replying to it will result in
undeliverable mail due to spam prevention. Sorry for the inconvenience.
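
The distinction argued above, between the ASCII characters that spell out an entity and the single character the entity stands for, can be checked with a minimal Python sketch. It is only an illustration: it assumes Python 3.7 or later and uses the standard `html` module, and the variable names are made up for this example, not taken from the thread or from mutt.

```python
import html

# The literal text that ended up in the header: ASCII characters only.
raw = "&#263;"

# What an HTML-aware reader would turn that text into.
decoded = html.unescape(raw)

print(raw.isascii())            # the entity itself is pure ASCII
print(decoded.isascii())        # the character it stands for is not

# Encoding as UTF-8 does not touch the entity: the bytes are the
# same ASCII characters, so changing the charset cannot decode it.
print(raw.encode("utf-8"))

# Only after HTML decoding is there a non-ASCII character to encode.
print(decoded.encode("utf-8"))
```

The point of the sketch is that the charset setting operates on the second kind of data, not the first: nothing short of an HTML decoding step turns the entity into the intended character.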