At Fri, 9 Oct 2020 02:33:37 +0000, "tsunakawa.ta...@fujitsu.com" 
<tsunakawa.ta...@fujitsu.com> wrote in 
> From: Masahiko Sawada <masahiko.saw...@2ndquadrant.com>
> > What about temporary network failures? I think there are users who
> > don't want to give up resolving foreign transactions failed due to a
> > temporary network failure. Or even they might want to wait for
> > transaction completion until they send a cancel request. If we want to
> > call the commit routine only once and therefore want FDW to retry
> > connecting the foreign server within the call, it means we require all
> > FDW implementors to write a retry loop code that is interruptible and
> > ensures not to raise an error, which increases difficulty.
> >
> > Yes, but if we don’t retry to resolve foreign transactions at all on
> > an unreliable network environment, the user might end up requiring
> > every transaction to check the status of foreign transactions of the
> > previous distributed transaction before starts. If we allow to do
> > retry, I guess we ease that somewhat.
> 
> OK.  As I said, I'm not against trying to cope with temporary network 
> failure.  I just don't think it's mandatory.  If the network failure is 
> really temporary and thus recovers soon, then the resolver will be able to 
> commit the transaction soon, too.

I must be missing something, though...

I don't understand why we hate ERRORs from the fdw-2pc-commit routine
so much. I think remote commits should be performed before the local
commit passes the point of no return, and v26-0002 actually places
AtEOXact_FdwXact() before the critical section.

(FWIW, I think remote commits should be performed by backends, not by
another process, because backends have to wait for all remote commits
to finish anyway, and it is simpler. If we want to run multiple remote
commits in parallel, we could do that by adding some async-waiting
interface.)

> Then, we can have a commit retry timeout or retry count like the following 
> WebLogic manual says.  (I couldn't quickly find the English manual, so below 
> is in Japanese.  I quoted some text that got through machine translation, 
> which appears a bit strange.)
> 
> https://docs.oracle.com/cd/E92951_01/wls/WLJTA/trxcon.htm
> --------------------------------------------------
> Abandon timeout
> Specifies the maximum time (in seconds) that the transaction manager attempts 
> to complete the second phase of a two-phase commit transaction.
> 
> In the second phase of a two-phase commit transaction, the transaction 
> manager attempts to complete the transaction until all resource managers 
> indicate that the transaction is complete. After the abort transaction timer 
> expires, no attempt is made to resolve the transaction. If the transaction 
> enters a ready state before it is destroyed, the transaction manager rolls 
> back the transaction and releases the held lock on behalf of the destroyed 
> transaction.
> --------------------------------------------------

That's not a retry timeout but a timeout for the total time of all
second-phase commits.  But I think it would be sufficient.  Even if an
FDW could retry the 2PC commit, that is a matter for that FDW, and the
core has nothing to do with it.
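
To illustrate the distinction, here is a minimal sketch in Python (not
PostgreSQL code) of one total deadline covering all second-phase
commits, as opposed to a per-attempt retry count. FlakyParticipant,
commit_prepared() and resolve_second_phase() are hypothetical names
made up for this example, and the retry policy shown is only an
assumption.

```python
import time


class FlakyParticipant:
    """Hypothetical foreign server whose first `failures` commits fail."""

    def __init__(self, failures):
        self.failures = failures
        self.committed = False

    def commit_prepared(self):
        if self.failures > 0:
            self.failures -= 1
            raise ConnectionError("temporary network failure")
        self.committed = True


def resolve_second_phase(participants, abandon_timeout, retry_interval=0.1):
    """Retry phase-2 commits until all succeed or one total deadline passes.

    Returns the list of participants that are still unresolved.
    """
    # A single deadline shared by every participant (assumed policy).
    deadline = time.monotonic() + abandon_timeout
    pending = list(participants)
    while pending:
        still_pending = []
        for p in pending:
            try:
                p.commit_prepared()
            except ConnectionError:
                # Keep it for the next round; nothing is raised to the caller.
                still_pending.append(p)
        pending = still_pending
        if pending:
            if time.monotonic() >= deadline:
                break  # give up: the total time budget is exhausted
            time.sleep(retry_interval)
    return pending
```

With a generous abandon_timeout, a participant that fails twice is
committed on the third pass and the function returns an empty list.
With abandon_timeout=0 it gives up after a single pass and returns the
unresolved participants.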

> > Also, what if the user sets the statement timeout to 60 sec and they
> > want to cancel the waits after 5 sec by pressing ctl-C? You mentioned
> > that client libraries of other DBMSs don't have asynchronous execution
> > functionality. If the SQL execution function is not interruptible, the
> > user will end up waiting for 60 sec, which seems not good.

I think fdw-2pc-commit can safely be interruptible as long as we run
the remote commits before entering the critical section of the local
commit.

> FDW functions can be uninterruptible in general, aren't they?  We experienced 
> that odbc_fdw didn't allow cancellation of SQL execution.

At least postgres_fdw is interruptible while waiting for the remote.

create view lt as select 1 as slp from (select pg_sleep(10)) t;
create foreign table ft(slp int) server sv1 options (table_name 'lt');
select * from ft;
^CCancel request sent
ERROR:  canceling statement due to user request
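
The behaviour in the session above can be modelled, very roughly, as a
wait loop that checks for a cancel request between short waits. This
is a Python sketch of the idea only, not postgres_fdw's actual code.
wait_interruptibly() and its parameters are hypothetical names.

```python
import threading


def wait_interruptibly(done, cancel, poll=0.01):
    """Wait for `done`, but give up as soon as `cancel` is set.

    `done` and `cancel` are threading.Event objects; `poll` is the
    number of seconds to block between checks of the cancel flag.
    """
    # Event.wait() returns True once the event is set, False on timeout.
    while not done.wait(poll):
        if cancel.is_set():
            raise InterruptedError("canceling statement due to user request")
    return True
```

An uninterruptible client library corresponds to calling done.wait()
with no timeout: the cancel flag is never looked at until the remote
side answers.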

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
