On Tue, Oct 6, 2020 at 10:52 PM Masahiko Sawada <masahiko.saw...@2ndquadrant.com> wrote:
>
> On Fri, 2 Oct 2020 at 18:20, tsunakawa.ta...@fujitsu.com
> <tsunakawa.ta...@fujitsu.com> wrote:
> >
> > From: Masahiko Sawada <masahiko.saw...@2ndquadrant.com>
> > > You proposed the first idea
> > > to avoid such a situation that FDW implementor can write the code
> > > while trying to reduce the possibility of errors happening as much as
> > > possible, for example by using palloc_extended(MCXT_ALLOC_NO_OOM) and
> > > hash_search(HASH_ENTER_NULL) but I think it's not a comprehensive
> > > solution. They might miss, not know it, or use other functions
> > > provided by the core that could lead an error.
> >
> > We can give the guideline in the manual, can't we? It should not be
> > especially difficult for the FDW implementor compared to other Postgres's
> > extensibility features that have their own rules -- table/index AM,
> > user-defined C function, trigger function in C, user-defined data types,
> > hooks, etc. And, the Postgres functions that the FDW implementor would use
> > to implement their commit will be very limited, won't they? Because most
> > of the commit processing is performed in the resource manager's library
> > (e.g. Oracle and MySQL client library.)
>
> Yeah, if we think FDW implementors properly implement these APIs while
> following the guideline, giving the guideline is a good idea. But I’m
> not sure all FDW implementors are able to do that and even if the user
> uses an FDW whose transaction APIs don’t follow the guideline, the
> user won’t realize it. IMO it’s better to design the feature while not
> depending on external programs for reliability (correctness?) of this
> feature, although I might be too worried.
>
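Just to make the quoted point concrete before continuing: the
"error-avoiding" coding style under discussion boils down to using the
variants that report failure via their return value instead of
ereport(ERROR), roughly like the snippet below. This is only an
illustration I wrote for this mail, not code from any patch;
hash_search() with HASH_ENTER_NULL mentioned above plays the same role
for hash table insertion.

#include "postgres.h"

/*
 * Illustration only (not from the patch): allocate working state for a
 * commit callback without risking an ERROR.  Plain palloc() would
 * ereport(ERROR) on out-of-memory; palloc_extended() with
 * MCXT_ALLOC_NO_OOM returns NULL instead, so the callback can report
 * the failure back to the core rather than throwing.
 */
static bool
alloc_resolution_state(Size size, void **state_p)
{
    *state_p = palloc_extended(size, MCXT_ALLOC_NO_OOM);
    if (*state_p == NULL)
        return false;       /* caller reports failure / asks for a retry */
    return true;
}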
After more thought on Tsunakawa-san's idea, it seems to need the
following two conditions:

* At least postgres_fdw can implement these APIs while guaranteeing
  that no error is raised.
* A certain number of FDWs (or the majority of FDWs) can do the same in
  a similar way, using the guideline and probably postgres_fdw as a
  reference.

Both are necessary for FDW implementors to be able to implement the
APIs following the guideline, and for the core to be able to trust
them.

As far as postgres_fdw goes, what it needs to do when resolving
(committing) a foreign transaction is: get a connection from the
connection cache, or create and connect one if not found; construct the
SQL query (COMMIT/ROLLBACK PREPARED with the transaction identifier) in
a fixed-size buffer; send the query; and get the result. So the places
where an error could be raised are limited. In case of a failure such
as a connection error, the FDW can return false to the core along with
a flag asking the core to retry; the core will then retry resolving the
foreign transaction after some sleep. OTOH, if the FDW judges that
there is no hope of resolving the foreign transaction, it can also
return false, with another flag asking the core to remove the entry and
not retry. Also, the transaction resolution by the FDW needs to be
cancellable (interruptible) but cannot use CHECK_FOR_INTERRUPTS(). (A
rough sketch of what such a callback could look like is at the end of
this mail.)

Probably, as Tsunakawa-san also suggested, it's not impossible to
implement these APIs in postgres_fdw while guaranteeing that no error
is raised, although I'm not sure how complex the code would become. So
I think the first condition may hold, but I'm not sure about the second
one, particularly regarding the interruptible part.

I also thought we could support both ideas to get the pros of each:
implement Tsunakawa-san's idea first and mine on top of it if
necessary, and let each FDW choose whether to ask the resolver process
to perform the second phase of 2PC or not. But that doesn't seem like a
good idea in terms of complexity.

Regards,

--
Masahiko Sawada
EnterpriseDB:  https://www.enterprisedb.com/
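Here is the rough sketch mentioned above of what an error-free COMMIT
PREPARED resolution could look like in postgres_fdw. The function name
and the ask_retry output flag are placeholders I made up for this mail,
not the API in the patch, and a real version would also need to escape
the identifier and use the non-blocking libpq calls instead of PQexec()
to stay interruptible.

#include <stdbool.h>
#include <stdio.h>

#include "libpq-fe.h"

/*
 * Rough sketch only: the function name and the ask_retry flag are
 * placeholders, not the API from the patch.  Commit one prepared
 * foreign transaction without ever raising an ERROR; failures are
 * reported to the caller (the core) instead.
 */
static bool
pgfdw_commit_prepared_sketch(PGconn *conn, const char *fdwxact_id,
                             bool *ask_retry)
{
    char        sql[256];   /* fixed-size buffer, no palloc needed */
    PGresult   *res;

    *ask_retry = false;

    /* No usable connection: ask the core to retry after some sleep. */
    if (conn == NULL || PQstatus(conn) != CONNECTION_OK)
    {
        *ask_retry = true;
        return false;
    }

    /*
     * Build the query with snprintf(), which cannot throw.  A real
     * implementation would have to escape fdwxact_id, and would use
     * PQsendQuery() plus waiting on the socket instead of PQexec() so
     * that the resolution stays interruptible.
     */
    snprintf(sql, sizeof(sql), "COMMIT PREPARED '%s'", fdwxact_id);

    res = PQexec(conn, sql);
    if (PQresultStatus(res) != PGRES_COMMAND_OK)
    {
        /*
         * Retry only if the failure looks like a connection problem.  A
         * server-side error such as "prepared transaction does not
         * exist" means there is no hope of resolving it here, so the
         * core should remove the entry and not retry.
         */
        *ask_retry = (PQstatus(conn) != CONNECTION_OK);
        PQclear(res);
        return false;
    }

    PQclear(res);
    return true;
}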