Viktor Dukhovni:
>
> > On Jan 25, 2017, at 3:19 PM, Andrew Sullivan <a...@anvilwalrusden.com> wrote:
> >
> > I am aware that the Postgres driver is currently hard-coded to LATIN1.
> > This, of course, causes problems with SMTPUTF8, since the email
> > addresses and so on could be in UTF8.
> >
> > I have a reason to need the combination, and I'm wondering whether
> > there is anything standing in the way of just changing the code to set
> > the encoding to UTF8 as opposed to LATIN1. Is there anything I could
> > do to help? It looks to me like a trivial change in the driver code.
>
> The reason for LATIN1 is that all raw octet strings are valid LATIN1,
> so whatever non-ASCII garbage comes down the wire, database lookups
> won't tempfail with query encoding errors. Absent mechanisms like
> SMTPUTF8 non-ASCII data in SMTP commands is undefined, and so no
> particular encoding of non-ASCII characters can be assumed.
>
> If you promise UTF-8 encoding of pgsql queries, then something needs
> to make sure that only valid UTF-8 is passed into queries. I don't
> recall any code in place to restrict lookups in a given table to valid
> UTF-8 inputs.

If the client requests SMTPUTF8, then Postfix will accept only valid
UTF8 in SMTP commands. Otherwise, it remains backwards-compatible
and passes on 8-bit data without any validation.

> Even fancier would be dynamically adjusting the database encoding to
> UTF-8 when the client includes the "SMTPUTF8" ESMTP parameter in its
> "MAIL" command. Since, presumably, in that case all non-ASCII data
> in the SMTP dialogue are then UTF-8 encoded (and can be validated
> as such before query construction).

That should work, at least for information in SMTP commands. Not
sure what happens with (canonical) header rewriting, header_checks,
etc.

        Wietse
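
For illustration only, the "validate, then pick the encoding" idea discussed above could be sketched roughly as follows. This is not Postfix code: the function names `is_valid_utf8` and `pick_client_encoding` are hypothetical, and falling back to LATIN1 whenever the key is not valid UTF-8 is an assumption made for the sketch. The only outside fact relied on is that `UTF8` and `LATIN1` are encoding names PostgreSQL accepts.

```python
def is_valid_utf8(raw: bytes) -> bool:
    """Return True if the octet string decodes as strict UTF-8."""
    try:
        raw.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return False
    return True


def pick_client_encoding(lookup_key: bytes, smtputf8_requested: bool) -> str:
    """Choose the encoding name to use for one lookup (hypothetical helper).

    Assumption: UTF8 is promised to the database only when the client
    asked for SMTPUTF8 *and* the key really is valid UTF-8. In every
    other case LATIN1 is used, because any octet string is valid LATIN1
    and so the query cannot fail with an encoding error.
    """
    if smtputf8_requested and is_valid_utf8(lookup_key):
        return "UTF8"
    return "LATIN1"


if __name__ == "__main__":
    utf8_key = "j\u00f6rg@example.com".encode("utf-8")
    legacy_key = b"j\xf6rg@example.com"  # raw 8-bit data, not valid UTF-8

    print(pick_client_encoding(utf8_key, smtputf8_requested=True))
    print(pick_client_encoding(legacy_key, smtputf8_requested=True))
    print(pick_client_encoding(utf8_key, smtputf8_requested=False))
```

The sketch covers only lookup keys taken from SMTP commands; it says nothing about header rewriting or header_checks, which remain an open question in the thread.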