On Fri, 2 Oct 2020 at 18:20, tsunakawa.ta...@fujitsu.com <tsunakawa.ta...@fujitsu.com> wrote:
>
> From: Masahiko Sawada <masahiko.saw...@2ndquadrant.com>
> > You proposed the first idea
> > to avoid such a situation that FDW implementor can write the code
> > while trying to reduce the possibility of errors happening as much as
> > possible, for example by using palloc_extended(MCXT_ALLOC_NO_OOM) and
> > hash_search(HASH_ENTER_NULL) but I think it's not a comprehensive
> > solution. They might miss, not know it, or use other functions
> > provided by the core that could lead an error.
>
> We can give the guideline in the manual, can't we? It should not be
> especially difficult for the FDW implementor compared to other Postgres's
> extensibility features that have their own rules -- table/index AM,
> user-defined C function, trigger function in C, user-defined data types,
> hooks, etc. And, the Postgres functions that the FDW implementor would use
> to implement their commit will be very limited, won't they? Because most of
> the commit processing is performed in the resource manager's library (e.g.
> Oracle and MySQL client library.)

Yeah, if we think FDW implementors properly implement these APIs while
following the guideline, giving the guideline is a good idea. But I’m not
sure all FDW implementors are able to do that, and if a user uses an FDW
whose transaction APIs don’t follow the guideline, the user won’t realize
it. IMO it’s better to design the feature so that it doesn’t depend on
external programs for its reliability (correctness?), although I might be
worrying too much.

> >
> > Another idea is to use
> > PG_TRY() and PG_CATCH(). IIUC with this idea, FDW implementor catches
> > an error but ignores it rather than rethrowing by PG_RE_THROW() in
> > order to return the control to the core after an error. I’m really not
> > sure it’s a correct usage of those macros. In addition, after
> > returning to the core, it will retry to resolve the same or other
> > foreign transactions. That is, after ignoring an error, the core needs
> > to continue working and possibly call transaction callbacks of other
> > FDW implementations.
>
> No, not ignore the error. The FDW can emit a WARNING, LOG, or NOTICE
> message, and return an error code to TM. TM can also emit a message like:
>
> WARNING: failed to commit part of a transaction on the foreign server 'XXX'
> HINT: The server continues to try committing the remote transaction.
>
> Then TM asks the resolver to take care of committing the remote transaction,
> and acknowledge the commit success to the client.

It seems that if the resolution fails, the backend would still return an
acknowledgment of COMMIT to the client, and the resolver process would
resolve the foreign prepared transactions in the background. So we can
ensure that the distributed transaction is completed by the time the
client gets the acknowledgment of COMMIT only if the 2nd phase of 2PC
completes successfully on the first attempt. OTOH, if it fails for
whatever reason, there is no such guarantee.

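To make sure I understand the proposed flow, here is a minimal standalone sketch of it in plain C. It is not PostgreSQL code: setjmp()/longjmp() only stand in for PG_TRY()/PG_CATCH() and ereport(ERROR), and every name in it (FdwCommitStatus, ForeignParticipant, ResolverQueue, fdw_commit_callback, tm_commit_second_phase) is hypothetical and made up for illustration. The callback catches the error and returns a status code instead of rethrowing, and the TM hands the failed participants over to a resolver queue.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdbool.h>
#include <stdio.h>

#define MAX_PARTICIPANTS 8

/* Hypothetical status that a commit callback returns to the TM. */
typedef enum { FDW_COMMIT_OK, FDW_COMMIT_FAILED } FdwCommitStatus;

typedef struct
{
    const char *server_name;
    bool        reachable;  /* stand-in for "remote COMMIT PREPARED succeeds" */
} ForeignParticipant;

/* Hypothetical list of transactions left for the background resolver. */
typedef struct
{
    const ForeignParticipant *pending[MAX_PARTICIPANTS];
    int                       npending;
} ResolverQueue;

/* Models the error context that PG_TRY() would set up. */
static jmp_buf error_context;

/* Models client-library code that raises an error on failure. */
static void
remote_commit_prepared(const ForeignParticipant *p)
{
    if (!p->reachable)
        longjmp(error_context, 1);
}

/* The callback catches the error and reports it as a status code. */
static FdwCommitStatus
fdw_commit_callback(const ForeignParticipant *p)
{
    if (setjmp(error_context) == 0)
    {
        remote_commit_prepared(p);
        return FDW_COMMIT_OK;
    }

    /* Reached via longjmp(): warn, but do not rethrow. */
    fprintf(stderr,
            "WARNING: failed to commit part of a transaction on the "
            "foreign server '%s'\n"
            "HINT: The server continues to try committing the remote "
            "transaction.\n",
            p->server_name);
    return FDW_COMMIT_FAILED;
}

/*
 * Second phase of 2PC as seen by the TM. COMMIT is acknowledged to the
 * client in either case; the return value only says whether every
 * participant was resolved on this first attempt.
 */
static bool
tm_commit_second_phase(const ForeignParticipant *parts, int n,
                       ResolverQueue *queue)
{
    assert(n <= MAX_PARTICIPANTS);

    queue->npending = 0;
    for (int i = 0; i < n; i++)
    {
        /* The TM keeps going and calls the next callback after a failure. */
        if (fdw_commit_callback(&parts[i]) == FDW_COMMIT_FAILED)
            queue->pending[queue->npending++] = &parts[i];
    }
    return queue->npending == 0;
}
```

With three participants of which the second is unreachable, tm_commit_second_phase() returns false and leaves that one participant in the queue, yet in this model the client has already been told that COMMIT succeeded. That false return value is exactly the case where the guarantee above is lost.
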
From an optimistic perspective, i.e., assuming failures are unlikely to
happen, it will work well. But IMO it’s not uncommon to fail to resolve
foreign transactions due to network issues, especially in an unreliable
network environment such as a geo-distributed database. So I think it
will end up requiring the client to check whether preceding distributed
transactions have completed in order to see the results of those
transactions.

We could retry the foreign transaction resolution before leaving it to
the resolver process, but the problem remains that the core continues
trying to resolve foreign transactions even after an error, with neither
aborting the transaction nor rethrowing the error.

Regards,

--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services