Hello. This version makes subscriber automatically set its database encoding to clinet_encoding (if not explicitly set). And changed the behavior when pg_server_to_client returns NULL to ERROR from sending original string.
At Wed, 1 Feb 2017 08:39:41 +0000, "Shinoda, Noriyoshi" <noriyoshi.shin...@hpe.com> wrote in <at5pr84mb0084a18d3bf1d93b862e95e4ee...@at5pr84mb0084.namprd84.prod.outlook.com> > Thank you for creating patches. > I strongly hope that your patch will be merged into the new > version. Since all databases are not yet based on UTF - 8, I > think conversion of character codes is still necessary. Thanks. > -----Original Message----- > From: Kyotaro HORIGUCHI [mailto:horiguchi.kyot...@lab.ntt.co.jp] > Sent: Wednesday, February 01, 2017 3:31 PM > To: Shinoda, Noriyoshi <noriyoshi.shin...@hpe.com> > Cc: pgsql-hackers@postgresql.org > Subject: Re: [HACKERS] Logical Replication and Character encoding > > At Wed, 01 Feb 2017 12:13:04 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI > <horiguchi.kyot...@lab.ntt.co.jp> wrote in > <20170201.121304.267734380.horiguchi.kyot...@lab.ntt.co.jp> > > > > I tried a committed Logical Replication environment. I found > > > > that replication between databases of different encodings did not > > > > convert encodings in character type columns. Is this behavior > > > > correct? > > > > > > The output plugin for subscription is pgoutput and it currently > > > doesn't consider encoding but would easiliy be added if desired > > > encoding is informed. > > > > > > The easiest (but somewhat seems fragile) way I can guess is, > > > > > > - Subscriber connects with client_encoding specification and the > > > output plugin pgoutput decide whether it accepts the encoding > > > or not. If the subscriber doesn't, pgoutput send data without > > > conversion. > > > > > > The attached small patch does this and works with the following > > > CREATE SUBSCRIPTION. > > > > Oops. It forgets to care conversion failure. It is amended in the > > attached patch. > > > > > CREATE SUBSCRIPTION sub1 CONNECTION 'host=/tmp port=5432 > > > dbname=postgres client_encoding=EUC_JP' PUBLICATION pub1; > > > > > > > > > Also we may have explicit negotiation on, for example, > > > CREATE_REPLICATION_SLOT. > > > > > > 'CREATE_REPLICATION_SLOT sub1 LOGICAL pgoutput ENCODING EUC_JP' > > > > > > Or output plugin may take options. > > > > > > 'CREATE_REPLICATION_SLOT sub1 LOGICAL pgoutput OPTIONS(encoding EUC_JP)' > > > > > > > > > Any opinions? > > This patch chokes replication when the publisher finds an inconvertible > character in a tuple to be sent. For the case, dropping-then-recreating > subscription is necessary to go forward. I think this behavior is right and inevitable in order to protect subscribers from broken strings. regards, -- Kyotaro Horiguchi NTT Open Source Software Center
>From e3af51822d1318743e554e24163390b74b254751 Mon Sep 17 00:00:00 2001 From: Kyotaro Horiguchi <horiguchi.kyot...@lab.ntt.co.jp> Date: Thu, 2 Feb 2017 09:05:20 +0900 Subject: [PATCH] Enable logical-replication between databases with different encodings Different from physical replication, logical replication may run between databases with different encodings. This patch makes subscriber set client_encoding to database encoding and publisher follow it. --- .../libpqwalreceiver/libpqwalreceiver.c | 35 +++++++++++++++++++++- src/backend/replication/logical/proto.c | 17 +++++++++++ 2 files changed, 51 insertions(+), 1 deletion(-) diff --git a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c index 44a89c7..ef38af7 100644 --- a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c +++ b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c @@ -24,6 +24,7 @@ #include "access/xlog.h" #include "miscadmin.h" #include "pgstat.h" +#include "mb/pg_wchar.h" #include "replication/logicalproto.h" #include "replication/walreceiver.h" #include "storage/proc.h" @@ -112,11 +113,43 @@ libpqrcv_connect(const char *conninfo, bool logical, const char *appname, char **err) { WalReceiverConn *conn; + const char *myconninfo = conninfo; const char *keys[5]; const char *vals[5]; int i = 0; /* + * For logical replication, set database encoding as client_encoding if + * not specified in conninfo + */ + if (logical) + { + PQconninfoOption *opts = NULL; + + opts = PQconninfoParse(conninfo, NULL); + + if (opts != NULL) + { + while (opts->keyword != NULL && + strcmp(opts->keyword, "client_encoding") != 0) + opts++; + + /* add client_encoding to conninfo if not set */ + if (opts->keyword == NULL || opts->val == NULL) + { + StringInfoData s; + + /* Assuming that the memory context here is properly set */ + initStringInfo(&s); + appendStringInfoString(&s, conninfo); + appendStringInfo(&s, " client_encoding=\"%s\"", + GetDatabaseEncodingName()); + myconninfo = s.data; + } + } + } + + /* * We use the expand_dbname parameter to process the connection string (or * URI), and pass some extra options. The deliberately undocumented * parameter "replication=true" makes it a replication connection. The @@ -124,7 +157,7 @@ libpqrcv_connect(const char *conninfo, bool logical, const char *appname, * "replication" for .pgpass lookup. */ keys[i] = "dbname"; - vals[i] = conninfo; + vals[i] = myconninfo; keys[++i] = "replication"; vals[i] = logical ? "database" : "true"; if (!logical) diff --git a/src/backend/replication/logical/proto.c b/src/backend/replication/logical/proto.c index 1f30de6..97f928e 100644 --- a/src/backend/replication/logical/proto.c +++ b/src/backend/replication/logical/proto.c @@ -16,6 +16,7 @@ #include "catalog/pg_namespace.h" #include "catalog/pg_type.h" #include "libpq/pqformat.h" +#include "mb/pg_wchar.h" #include "replication/logicalproto.h" #include "utils/builtins.h" #include "utils/lsyscache.h" @@ -442,6 +443,22 @@ logicalrep_write_tuple(StringInfo out, Relation rel, HeapTuple tuple) pq_sendbyte(out, 't'); /* 'text' data follows */ outputstr = OidOutputFunctionCall(typclass->typoutput, values[i]); + if (pg_get_client_encoding() != GetDatabaseEncoding()) + { + char *p; + + p = pg_server_to_client(outputstr, strlen(outputstr)); + + /* Error out if failed */ + if (!p) + ereport(ERROR, + (errcode(ERRCODE_UNTRANSLATABLE_CHARACTER), + errmsg("character conversion failed"))); + + pfree(outputstr); + outputstr = p; + } + len = strlen(outputstr) + 1; /* null terminated */ pq_sendint(out, len, 4); /* length */ appendBinaryStringInfo(out, outputstr, len); /* data */ -- 2.9.2
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers