Viktor Dukhovni:
> 
> > On Jan 25, 2017, at 3:19 PM, Andrew Sullivan <a...@anvilwalrusden.com> 
> > wrote:
> > 
> > I am aware that the Postgres driver is currently hard-coded to LATIN1.
> > This, of course, causes problems with SMTPUTF8, since the email
> > addresses and so on could be in UTF8.
> > 
> > I have a reason to need the combination, and I'm wondering whether
> > there is anything standing in the way of just changing the code to set
> > the encoding to UTF8 as opposed to LATIN1.  Is there anything I could
> > do to help?  It looks to me like a trivial change in the driver code.
> 
> The reason for LATIN1 is that all raw octet strings are valid LATIN1,
> so whatever non-ASCII garbage comes down the wire, database lookups
> won't tempfail with query encoding errors.  Absent mechanisms like
> SMTPUTF8, the encoding of non-ASCII data in SMTP commands is undefined,
> so no particular encoding of non-ASCII characters can be assumed.
> 
> If you promise UTF-8 encoding of pgsql queries, then something needs
> to make sure that only valid UTF-8 is passed into queries.  I don't
> recall any code in place to restrict lookups in a given table to valid
> UTF-8 inputs.

If the client requests SMTPUTF8, then Postfix will accept only valid
UTF-8 in SMTP commands.  Otherwise, it remains backwards-compatible
and passes on 8-bit data without any validation.
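
To illustrate the difference between the two encodings, here is a
minimal Python sketch.  It is not Postfix code; the helper name
is_valid_utf8 and the sample addresses are made up for illustration.
It shows that any octet string decodes as LATIN1, while UTF-8 input
has to be validated before it could safely go into a query.

```python
def is_valid_utf8(raw: bytes) -> bool:
    """Return True if raw is well-formed UTF-8 (hypothetical helper)."""
    try:
        raw.decode("utf-8")  # strict decoding: raises on malformed input
        return True
    except UnicodeDecodeError:
        return False


# Every possible octet maps to a LATIN1 character, so this never raises.
all_octets = bytes(range(256))
latin1_text = all_octets.decode("latin-1")

# Made-up sample inputs: one well-formed UTF-8 address, one with stray octets.
valid_key = "us\u00e9r@example.com".encode("utf-8")
garbage_key = b"user\xff\xfe@example.com"

print(is_valid_utf8(valid_key))
print(is_valid_utf8(garbage_key))
```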

> Even fancier would be dynamically adjusting the database encoding to
> UTF-8 when the client includes the "SMTPUTF8" ESMTP parameter in its
> "MAIL" command, since, presumably, in that case all non-ASCII data
> in the SMTP dialogue is UTF-8 encoded (and can be validated
> as such before query construction).

That should work, at least for information in SMTP commands.  Not
sure what happens with (canonical) header rewriting, header_checks,
etc.
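
A rough Python sketch of the per-request selection discussed above,
under stated assumptions: the function name pick_client_encoding is
hypothetical, and falling back to LATIN1 when the key is not valid
UTF-8 is my assumption, not existing Postfix behavior.  The returned
names are the PostgreSQL encoding names UTF8 and LATIN1.

```python
def pick_client_encoding(smtputf8_requested: bool, lookup_key: bytes) -> str:
    """Choose an encoding name for one lookup (hypothetical helper).

    Returns "UTF8" only when the client asked for SMTPUTF8 and the key
    is well-formed UTF-8; otherwise returns "LATIN1", which accepts any
    octet string and so cannot produce a query encoding error.
    """
    if smtputf8_requested:
        try:
            lookup_key.decode("utf-8")  # strict validation
            return "UTF8"
        except UnicodeDecodeError:
            pass  # assumed fallback: treat the key as raw octets
    return "LATIN1"


# Made-up sample key for illustration.
key = "us\u00e9r@example.com".encode("utf-8")
print(pick_client_encoding(True, key))
print(pick_client_encoding(False, key))
```

This covers only lookup keys taken from SMTP commands; it does not
address the header rewriting and header_checks cases.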

        Wietse
