On 11/24/18 4:12 PM, Benny Pedersen wrote:
> André Rodier wrote on 2018-11-24 15:41:
>> -------------------------------------------------------
>> From =?utf-8?q?Andr=C3=A9?= Rodier <an...@rodier.me>
>> To =?utf-8?q?Andr=C3=A9?= Rodier <an...@rodier.me>
>> -------------------------------------------------------
>
> is unicode done first ?, if so quoted-printable encoding only see 7bit
> content
>
> or is quoted-printable encoding done first so unicode only encode 7bit
>
> both loose then, and many clients fail to do the decode parts of it
>
> same problem exists in html emails, if it just was html we did not
> need unicode or quoted-printable encoding at all
>
> i find unicode a joke :)
>

In the terms of the mail RFCs, UTF-8 is a character set (charset), not an
'encoding'. The character set is applied first: the e with accent at the
end of André becomes two 8-bit bytes in UTF-8 (0xC3 0xA9). Those bytes are
then escaped in the quoted-printable style, which gives the =C3=A9 in the
name. An 'encoding' in the mail RFCs is a method of converting 8-bit data
into something that is 7-bit clean.
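To make the two steps concrete, here is a rough Python sketch of what
happens to the name. The q_encode helper is a toy written only for this
illustration (it is not a library function and ignores details such as
spaces and line length); decode_header is the standard library decoder.

```python
from email.header import decode_header

name = "André"

# Step 1: the character set turns characters into bytes.
# In UTF-8 the é (U+00E9) becomes the two bytes C3 A9.
utf8_bytes = name.encode("utf-8")


def q_encode(data: bytes) -> str:
    """Toy escape for illustration: keep ASCII letters and digits,
    write every other byte as =XX (uppercase hex)."""
    return "".join(
        chr(b) if b < 128 and chr(b).isalnum() else f"={b:02X}"
        for b in data
    )


# Step 2: the encoding makes those bytes 7-bit clean and wraps them
# in the =?charset?q?...?= form used in headers.
encoded_word = f"=?utf-8?q?{q_encode(utf8_bytes)}?="

# A receiving client reverses the steps: undo the escaping first,
# then interpret the resulting bytes in the declared character set.
decoded_bytes, charset = decode_header(encoded_word)[0]
decoded = decoded_bytes.decode(charset)

# The same name in an 8-bit character set needs only one escaped byte.
latin1_escaped = q_encode(name.encode("latin-1"))

print(encoded_word)
print(decoded)
print(latin1_escaped)
```

The point of the sketch is only the ordering: characters to bytes via the
charset, then bytes to 7-bit text via the encoding, and the reverse on
receipt.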
Yes, if all the characters in the message can be expressed in a single
8-bit character set, then you could make the message a bit smaller by
using that instead of UTF-8, since each character above the ASCII range
will need only one escaped byte instead of two or more. If you might be
using characters beyond an 8-bit character set, then UTF-8 is the best
way to go.

Message bodies are a bit simpler, as you use a single
Content-Transfer-Encoding header for the whole body. A lot of systems
actually are 8-bit clean, so you can just use the 8bit encoding and let
an MTA make the translation if the message hits a hop that isn't 8-bit
clean.

Unicode isn't perfect, but it is the best we have for a single global
character set. If you mostly just need the basic Latin characters, it is
much more than you need, but once you need things beyond that it shows
its abilities.

-- 
Richard Damon