> On Aug 23, 2018, at 6:39 PM, Geir Pedersen <geir.peder...@gmail.com> wrote:
> 
> The dictionary interface to PostgreSQL found in src/global/dict_pgsql.c does 
> not support UTF8. It explicitly tells the database that Postfix will send 
> LATIN1.

Absent any indication of character set from the client, there was
no way to know what encoding any particular non-ASCII octet
string may be using, so the code was optimized to avoid spurious
database string conversion errors, by using an encoding that
would accept any octet-string, garbage-in -> garbage-out.
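
To illustrate why LATIN1 was the safe default: every octet value is a
valid LATIN1 character, while only some octet sequences are well-formed
UTF-8.  The checker below is a rough, self-contained sketch of that
difference.  The name is_valid_utf8() is made up for this example and
is not an existing Postfix or PostgreSQL function.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Return 1 if the len octets at s are well-formed UTF-8, else 0.
 * Rejects truncated sequences, stray continuation octets, overlong
 * forms, surrogates, and code points above U+10FFFF.
 */
int     is_valid_utf8(const unsigned char *s, size_t len)
{
    size_t  i = 0;
    size_t  k;
    size_t  need;
    unsigned long cp;

    assert(s != NULL || len == 0);

    while (i < len) {
        unsigned char c = s[i];

        if (c < 0x80) {                     /* plain ASCII octet */
            i++;
            continue;
        } else if ((c & 0xE0) == 0xC0) {    /* lead octet, 2-octet form */
            need = 1;
            cp = c & 0x1F;
        } else if ((c & 0xF0) == 0xE0) {    /* lead octet, 3-octet form */
            need = 2;
            cp = c & 0x0F;
        } else if ((c & 0xF8) == 0xF0) {    /* lead octet, 4-octet form */
            need = 3;
            cp = c & 0x07;
        } else {
            return 0;                       /* stray continuation or bad lead */
        }
        if (len - i - 1 < need)
            return 0;                       /* sequence is truncated */
        for (k = 1; k <= need; k++) {
            unsigned char cc = s[i + k];

            if ((cc & 0xC0) != 0x80)
                return 0;                   /* not a continuation octet */
            cp = (cp << 6) | (cc & 0x3F);
        }
        if ((need == 1 && cp < 0x80) || (need == 2 && cp < 0x800)
            || (need == 3 && cp < 0x10000))
            return 0;                       /* overlong encoding */
        if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            return 0;                       /* out of range or surrogate */
        i += need + 1;
    }
    return 1;
}
```

A LATIN1-encoded "andr\xe9" fails this check, so with a UTF8 client
encoding the server would be entitled to reject it, whereas with
LATIN1 any such string goes through unexamined.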

> With SMTPUTF8 support now in place, Postfix may try to look up addresses with 
> UTF8 in the local part in PostgreSQL virtual mailbox maps. 

While Postfix now supports UTF8, it is not always enabled, and
even when enabled not always used by the client.  So using
UTF-8 on the database connection may not always be appropriate.

Ideally we'd only use UTF-8 when the client indicates that it is
using SMTPUTF8:

  MAIL FROM:<envelope-sen...@example.com> BODY=8BITMIME SMTPUTF8
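
As a rough sketch of what "the client indicates" could mean in code,
the function below scans the ESMTP parameters that follow the address
for the bare SMTPUTF8 keyword, compared case-insensitively.  The name
has_smtputf8_param() and the assumption that the parameters arrive as
one space-separated string are mine; this is not how the Postfix SMTP
server actually parses the command.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/*
 * Return 1 if the space-separated parameter list contains the bare
 * keyword SMTPUTF8 (any letter case), else 0.
 */
int     has_smtputf8_param(const char *params)
{
    static const char keyword[] = "SMTPUTF8";
    const size_t klen = sizeof(keyword) - 1;
    const char *p = params;

    assert(params != NULL);

    while (*p != '\0') {
        size_t  n = 0;
        size_t  i;

        while (*p == ' ')                   /* skip separators */
            p++;
        while (p[n] != '\0' && p[n] != ' ') /* measure the next token */
            n++;
        if (n == klen) {
            for (i = 0; i < klen; i++)
                if (toupper((unsigned char) p[i]) != keyword[i])
                    break;
            if (i == klen)
                return 1;                   /* whole token matched */
        }
        p += n;
    }
    return 0;
}
```

For the command above one would pass "BODY=8BITMIME SMTPUTF8", and the
result would then have to travel with each lookup request.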

> Such lookups now fail as the UTF8 sent by Postfix is taken as LATIN1
> by PostgreSQL. Error message from Postfix:
> "Recipient address rejected: User unknown in virtual mailbox table"

This means that we'd need a way to dynamically update the client
encoding of the database connection to UTF8 when appropriate
and revert it to LATIN1 when the client encoding is unspecified.

And this needs to work across the proxymap protocol.  So while
the change you're proposing is well motivated, I am not sure
that the solution is as simple as you propose.  We'd need to
add a query-time UTF8 flag to the low-level dictionary lookup
methods, implement the higher-level lookups on top of a default
octet-string (LATIN1 if you like) encoding, and add new functions
that perform similar lookups on UTF8 data.

The PostgreSQL driver would then export a function to switch
the client connection to UTF8 (assuming the encoding can be
changed on the fly between queries).
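
A minimal sketch of that switching logic, assuming a per-query flag is
available.  All names here (LOOKUP_FLAG_UTF8, DB_CONN,
db_conn_prepare, fake_set_encoding) are hypothetical and exist only
for this example.  In a real driver the set_encoding callback would
presumably wrap libpq's PQsetClientEncoding(), which returns 0 on
success; the stub below merely counts calls so the example runs
without a database.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define LOOKUP_FLAG_UTF8  (1 << 0)      /* hypothetical per-query flag */

/* Hypothetical connection wrapper around an opaque driver handle. */
typedef struct {
    const char *encoding;               /* encoding last set, or NULL */
    int     (*set_encoding) (void *handle, const char *encoding);
    void   *handle;                     /* driver handle, e.g. a PGconn */
} DB_CONN;

/*
 * Make sure the connection uses the encoding this query needs.
 * Talks to the driver only when the encoding actually changes.
 * Returns 0 on success, -1 if the driver refuses the switch.
 */
int     db_conn_prepare(DB_CONN *conn, int flags)
{
    const char *want = (flags & LOOKUP_FLAG_UTF8) ? "UTF8" : "LATIN1";

    assert(conn != NULL && conn->set_encoding != NULL);

    if (conn->encoding != NULL && strcmp(conn->encoding, want) == 0)
        return 0;                       /* already in the right state */
    if (conn->set_encoding(conn->handle, want) != 0)
        return -1;                      /* leave recorded state untouched */
    conn->encoding = want;
    return 0;
}

/* Stand-in for the driver call: counts invocations via the handle. */
static int fake_set_encoding(void *handle, const char *encoding)
{
    (void) encoding;
    ++*(int *) handle;
    return 0;
}
```

Remembering the last encoding on the connection avoids an extra round
trip when consecutive lookups use the same encoding, which would
likely be the common case.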

-- 
        Viktor.
