Wietse wrote:
Mark Martinec:
Btw, amavisd since 2.10.0 converts ACE domain names to UTF-8
for presentation purposes (logging, JSON structured report,
DSN and admin notifications), and encodes non-ASCII UTF-8 domains
in sender and recipient addresses into ACE if the next hop MTA
(e.g. back-end postfix) does *not* announce support for SMTPUTF8.

What about non-ASCII text in Subject: and other headers, including
the user name in ``"User Name" <u...@example.com>'', are you
converting these according to RFC 2047, or is the above specifically
for email addresses?

Yes, these are converted to UTF-8 as well.

More specifically:

- logging (syslog or file) is nominally UTF-8, meaning that
a best effort is made to decode mail addresses and mail header
content into UTF-8; there is no guarantee, however, that other
non-ASCII junk (other than C0 controls) won't make it into the log,
i.e. the garbage-in / garbage-out principle applies to the log
in the interest of preserving as much of the original data as
is sensible and safe.
  It is expected that syslogd will not clobber UTF-8 (e.g. a
command line option -8 is needed for FreeBSD's syslogd, otherwise
C1 controls end up as ugly "protected" controls in a syslog file).

- structured log in JSON is strictly UTF-8, as demanded by
JSON specs; non-ASCII non-UTF-8 junk from mail is interpreted
as Latin-1;

- data from mail header or envelope that ends up in DSN
is strictly UTF-8; potential non-ASCII non-UTF-8 garbage is
interpreted as Latin-1;

- RFC 2047 MIME encoding in Subject and From header fields is
converted to UTF-8 for presentation purposes (top-level logging,
JSON structured log);

- ACE domain names in Message-ID and From header fields are
decoded into UTF-8 for logging and JSON (other "display names"
from similar fields are currently not logged);

- ACE domain names in sender and recipient envelope addresses
are decoded into UTF-8 for logging and JSON;

- UTF-8 domain names in sender and recipient envelope addresses
are converted to ACE when passing mail to an MTA which does not
announce SMTPUTF8. This is the only change made to the mail
being passed on.
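
To illustrate the rules listed above, here is a minimal sketch in
Python. It is not amavisd code (amavisd is written in Perl), and all
function names are made up for the example. It assumes Python's
built-in "idna" codec (IDNA 2003) is close enough for illustration,
and it only touches the domain part of an address; what to do with a
non-ASCII local part is not covered here.

```python
from email.header import decode_header, make_header


def bytes_to_text(raw: bytes) -> str:
    """Decode as UTF-8 when valid, otherwise interpret the bytes as Latin-1."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return raw.decode("latin-1")


def header_for_display(value: str) -> str:
    """Decode RFC 2047 encoded-words in a header value for presentation."""
    return str(make_header(decode_header(value)))


def domain_to_utf8(domain: str) -> str:
    """Decode an ACE (xn--...) domain for logging; return it unchanged on failure."""
    try:
        return domain.encode("ascii").decode("idna")
    except UnicodeError:
        return domain


def address_for_next_hop(addr: str, smtputf8: bool) -> str:
    """Convert a non-ASCII domain to ACE if the next hop lacks SMTPUTF8."""
    local, sep, domain = addr.rpartition("@")
    if smtputf8 or not sep or domain.isascii():
        return addr  # nothing to convert, or the next hop accepts UTF-8
    return local + sep + domain.encode("idna").decode("ascii")
```

For example, address_for_next_hop("user@bücher.example", smtputf8=False)
yields the ACE form of the domain, while the same call with
smtputf8=True returns the address untouched.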

  Mark
