On 2020-09-08 14:48, Fujii Masao wrote:
On 2020/09/08 19:36, Alexey Kondratov wrote:
On 2020-09-08 05:49, Fujii Masao wrote:
On 2020/09/05 3:31, Alexey Kondratov wrote:
Attached is a patch, which implements a plain 2PC in the
postgres_fdw and adds a GUC 'postgres_fdw.use_twophase'. Also it
solves these errors handling issues above and tries to add proper
comments everywhere. I think, that 0003 should be rebased on the top
of it, or it could be a first patch in the set, since it may be used
independently. What do you think?
Thanks for the patch!
Sawada-san was proposing another 2PC patch at [1]. Do you have any
thoughts
about pros and cons between your patch and Sawada-san's?
[1]
https://www.postgresql.org/message-id/ca+fd4k4z6_b1etevqamwqhu4rx7xsrn5orl7ohj4b5b6sw-...@mail.gmail.com
Thank you for the link!
After a quick look on the Sawada-san's patch set I think that there
are two major differences:
Thanks for sharing your thought! As far as I read your patch quickly,
I basically agree with your this view.
1. There is a built-in foreign xacts resolver in the [1], which should
be much more convenient from the end-user perspective. It involves
huge in-core changes and additional complexity that is of course worth
of.
However, it's still not clear for me that it is possible to resolve
all foreign prepared xacts on the Postgres' own side with a 100%
guarantee. Imagine a situation when the coordinator node is actually a
HA cluster group (primary + sync + async replica) and it failed just
after PREPARE stage of after local COMMIT. In that case all foreign
xacts will be left in the prepared state. After failover process
complete synchronous replica will become a new primary. Would it have
all required info to properly resolve orphan prepared xacts?
IIUC, yes, the information required for automatic resolution is
WAL-logged and the standby tries to resolve those orphan transactions
from WAL after the failover. But Sawada-san's patch provides
the special function for manual resolution, so there may be some cases
where manual resolution is necessary.
I've found a note about manual resolution in the v25 0002:
+After that we prepare all foreign transactions by calling
+PrepareForeignTransaction() API. If we failed on any of them we change
to
+rollback, therefore at this time some participants might be prepared
whereas
+some are not prepared. The former foreign transactions need to be
resolved
+using pg_resolve_foreign_xact() manually and the latter ends
transaction
+in one-phase by calling RollbackForeignTransaction() API.
but it's not yet clear for me.
Implementing 2PC feature only inside postgres_fdw seems to cause
another issue; COMMIT PREPARED is issued to the remote servers
after marking the local transaction as committed
(i.e., ProcArrayEndTransaction()).
According to the Sawada-san's v25 0002 the logic is pretty much the same
there:
+2. Pre-Commit phase (1st phase of two-phase commit)
+3. Commit locally
+Once we've prepared all of them, commit the transaction locally.
+4. Post-Commit Phase (2nd phase of two-phase commit)
Brief look at the code confirms this scheme. IIUC, AtEOXact_FdwXact /
FdwXactParticipantEndTransaction happens after ProcArrayEndTransaction()
in the CommitTransaction(). Thus, I don't see many difference between
these approach and CallXactCallbacks() usage regarding this point.
Is this safe? This issue happens
because COMMIT PREPARED is issued via
CallXactCallbacks(XACT_EVENT_COMMIT) and that CallXactCallbacks()
is called after ProcArrayEndTransaction().
Once the transaction is committed locally any ERROR (or higher level
message) will be escalated to PANIC. And I do see possible ERROR level
messages in the postgresCommitForeignTransaction() for example:
+ if (PQresultStatus(res) != PGRES_COMMAND_OK)
+ ereport(ERROR, (errmsg("could not commit transaction on server
%s",
+
frstate->server->servername)));
I don't think that it's very convenient to get a PANIC every time we
failed to commit one of the prepared foreign xacts, since it could be
not so rare in the distributed system. That's why I tried to get rid of
possible ERRORs as far as possible in my proposed patch.
Regards
--
Alexey Kondratov
Postgres Professional https://www.postgrespro.com
Russian Postgres Company