> On Jan 25, 2017, at 3:19 PM, Andrew Sullivan <[email protected]> wrote:
>
> I am aware that the Postgres driver is currently hard-coded to LATIN1.
> This, of course, causes problems with SMTPUTF8, since the email
> addresses and so on could be in UTF8.
>
> I have a reason to need the combination, and I'm wondering whether
> there is anything standing in the way of just changing the code to set
> the encoding to UTF8 as opposed to LATIN1. Is there anything I could
> do to help? It looks to me like a trivial change in the driver code.

The reason for LATIN1 is that every raw octet string is valid LATIN1,
so whatever non-ASCII garbage comes down the wire, database lookups
won't tempfail with query encoding errors. Absent mechanisms like
SMTPUTF8, the meaning of non-ASCII data in SMTP commands is undefined,
and so no particular encoding of non-ASCII characters can be assumed.
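
To make the difference concrete, here is a small Python illustration
(not Postfix code; the sample address and the helper name
is_valid_utf8 are made up for the example). It shows that arbitrary
octets always decode as LATIN1, while the same octets can be rejected
as UTF-8:

```python
# Every possible octet value, 0x00 through 0xFF.
all_octets = bytes(range(256))

# ISO-8859-1 (LATIN1) maps each octet to exactly one code point,
# so decoding arbitrary bytes never raises an error.
as_latin1 = all_octets.decode("latin-1")


def is_valid_utf8(raw: bytes) -> bool:
    """Return True if raw is well-formed UTF-8 (hypothetical helper)."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# An 8-bit address as a legacy client might send it:
# 0xE9 is "e with acute accent" in LATIN1, but a lone 0xE9 is not valid UTF-8.
legacy = b"andr\xe9@example.com"
legacy_is_utf8 = is_valid_utf8(legacy)
```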

If you promise UTF-8 encoding of pgsql queries, then something needs
to make sure that only valid UTF-8 is passed into queries. I don't
recall any code in place to restrict lookups in a given table to valid
UTF-8 inputs.
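
As a rough sketch of what such a guard could look like (Python for
brevity, not the driver's actual code): guarded_lookup and run_query
are hypothetical names, and treating an invalid key as "not found"
rather than as an error is my assumption about a reasonable policy.

```python
from typing import Callable, Optional


def guarded_lookup(key: bytes,
                   run_query: Callable[[str], Optional[str]]) -> Optional[str]:
    """Call run_query only when key is well-formed UTF-8.

    run_query stands in for whatever actually talks to the database.
    """
    try:
        text = key.decode("utf-8")
    except UnicodeDecodeError:
        # Assumed policy: an invalid key cannot match anything, so
        # report "not found" without ever sending it to the database.
        return None
    return run_query(text)
```

With this shape the database only ever sees valid UTF-8, so a query
encoding error cannot be triggered by whatever the client sent.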

Even fancier would be dynamically switching the database connection's
encoding to UTF-8 when the SMTP client includes the "SMTPUTF8" ESMTP
parameter in its "MAIL" command, since, presumably, all non-ASCII data
in the SMTP dialogue are then UTF-8 encoded (and can be validated as
such before query construction).
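
The per-message decision could be sketched roughly as follows (again
Python, purely illustrative). client_encoding_for is a hypothetical
name, and the parsing is deliberately simplistic: it assumes the
reverse-path contains no ">" of its own.

```python
def client_encoding_for(mail_command: bytes) -> str:
    """Pick a PostgreSQL encoding name from a raw MAIL command line."""
    # Simplification: everything after the first '>' is taken to be
    # the list of ESMTP parameters.
    _, _, params = mail_command.partition(b">")
    # Parameters look like KEYWORD or KEYWORD=value; keywords are
    # compared case-insensitively.
    keywords = {p.split(b"=", 1)[0].upper() for p in params.split()}
    return "UTF8" if b"SMTPUTF8" in keywords else "LATIN1"
```

The returned name would then be applied to the connection before any
lookup for that message, with the lookup keys still validated as
UTF-8 whenever "UTF8" is chosen.
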
--
Viktor.