On 11/24/18 4:12 PM, Benny Pedersen wrote:
> André Rodier skrev den 2018-11-24 15:41:
>> -------------------------------------------------------
>> From     =?utf-8?q?Andr=C3=A9?= Rodier <an...@rodier.me>
>> To     =?utf-8?q?Andr=C3=A9?= Rodier <an...@rodier.me>
>> -------------------------------------------------------
>
> is unicode done first ?, if so quoted-printable encoding only see 7bit
> content
>
> or is quoted-printable encoding done first so unicode only encode 7bit
>
> both loose then, and many clients fail to do the decode parts of it
>
> same problem exists in html emails, if it just was html we did not
> need unicode or quoted-printable encoding at all
>
> i find unicode a joke :)
>
UTF-8 is not an 'encoding' in the sense the mail RFCs use the word, but
a character set. The e with accent at the end of André therefore becomes
two 8-bit bytes, which QP then writes as the =C3=A9 in the name. An
encoding in the mail RFCs is a method for converting 8-bit content into
something 7-bit clean.
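
To make the two steps concrete, here is a small Python sketch using only
the standard library. The helper names qp_of and decode_word are made up
for this illustration; the sample header value is the one quoted above.

```python
import quopri
from email.header import decode_header


def qp_of(text, charset):
    """Apply the two layers in order: character set first, then QP."""
    raw = text.encode(charset)            # character set: text -> 8-bit bytes
    return quopri.encodestring(raw).decode("ascii")  # QP: bytes -> 7-bit text


def decode_word(word):
    """Reverse both layers for a single RFC 2047 encoded-word."""
    raw, charset = decode_header(word)[0]  # undo QP, learn the charset
    return raw.decode(charset)             # undo the character set


print("André".encode("utf-8"))                    # the accented e is two bytes
print(qp_of("André", "utf-8"))                    # those bytes as =C3=A9
print(decode_word("=?utf-8?q?Andr=C3=A9?="))      # back to the original name
```

The order matters only in one direction: the character set always runs
first when sending, and QP only ever sees bytes.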

Yes, if every character in the message can be expressed in a single
8-bit character set, then you could make the message a bit smaller by
using that instead of UTF-8, since each upper character then needs only
one encoded byte instead of two. If you might be using characters beyond
an 8-bit character set, then UTF-8 is the best way to go.
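
A rough Python sketch of that size trade-off, assuming ISO-8859-1 as the
8-bit character set (any other 8-bit set would do). The helpers qp_length
and fits are hypothetical names used only here.

```python
import quopri


def qp_length(text, charset):
    """Length of the text after charset conversion and QP encoding."""
    return len(quopri.encodestring(text.encode(charset)))


def fits(text, charset):
    """True if every character of text exists in the given charset."""
    try:
        text.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


# One =XX group in ISO-8859-1 versus two in UTF-8 for the accented e.
print(qp_length("André", "iso-8859-1"), qp_length("André", "utf-8"))

# The euro sign is not in ISO-8859-1, so only UTF-8 can carry it.
print(fits("5 €", "iso-8859-1"), fits("5 €", "utf-8"))
```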

Message bodies are a bit simpler, as you use a single
Content-Transfer-Encoding header for the whole body, and a lot of
systems actually are 8-bit clean, so you can just use the 8bit encoding
and let an MTA make the translation if it hits a hop that isn't 8-bit
clean.
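
As an illustration of the body case, this sketch builds a message with
Python's standard email package and asks for the 8bit transfer encoding.
The function name build_8bit_message and the addresses are placeholders
invented for the example; whether a given MTA downgrades the body later
is outside what the code shows.

```python
from email.message import EmailMessage


def build_8bit_message(body):
    """Build a text message whose body is sent as raw UTF-8 bytes."""
    msg = EmailMessage()
    msg["From"] = "sender@example.org"      # placeholder address
    msg["To"] = "recipient@example.org"     # placeholder address
    msg["Subject"] = "8bit body test"
    # One transfer encoding applies to the whole body.
    msg.set_content(body, charset="utf-8", cte="8bit")
    return msg


msg = build_8bit_message("Bonjour André")
print(msg["Content-Transfer-Encoding"])
print(msg.as_bytes())   # the body bytes appear unencoded, no =C3=A9
```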

Unicode isn't perfect, but it is the best we have for a single global
character set. If you mostly just need the basic Latin characters, it is
much more than you need, but once you need things beyond that it shows
its abilities.

-- 
Richard Damon
