On Wed, Feb 21, 2018 at 6:07 AM, Robert Haas <robertmh...@gmail.com> wrote: > On Tue, Feb 13, 2018 at 5:42 AM, Masahiko Sawada <sawada.m...@gmail.com> > wrote: >>> The fdw-transactions section of the documentation seems to imply that >>> henceforth every FDW must call FdwXactRegisterForeignServer, which I >>> think is an unacceptable API break. >>> >>> It doesn't seem advisable to make this behavior depend on >>> max_prepared_foreign_transactions. I think that it should be an >>> server-level option: use 2PC for this server, or not? FDWs that don't >>> have 2PC default to "no"; but others can be set to "yes" if the user >>> wishes. But we shouldn't force that decision to be on a cluster-wide >>> basis. >> >> Since I've added a new option two_phase_commit to postgres_fdw we need >> to ask FDW whether the foreign server is 2PC-capable server or not in >> order to register the foreign server information. That's why the patch >> asked FDW to call FdwXactRegisterForeignServer. However we can >> register a foreign server information automatically by executor (e.g. >> at BeginDirectModify and at BeginForeignModify) if a foreign server >> itself has that information. We can add two_phase_commit_enabled >> column to pg_foreign_server system catalog and that column is set to >> true if the foriegn server is 2PC-capable (i.g. has enough functions) >> and user want to use it. > > I don't see why this would need a new catalog column.
I might be missing your point. As for API breaking, this patch doesn't break any existing FDWs. All new APIs I proposed are dedicated to 2PC. In other words, FDWs that work today can continue working after this patch gets committed, but if FDWs want to support atomic commit then they should be updated to use new APIs. The reason why the calling of FdwXactRegisterForeignServer is necessary is that the core code controls the foreign servers that involved with the transaction but the whether each foreign server uses 2PC option (two_phase_commit) is known only on FDW code. We can eliminate the necessity of calling FdwXactRegisterForeignServer by moving 2PC option from fdw-level to server-level in order not to enforce calling the registering function on FDWs. If we did that, the user can use the 2PC option as a server-level option. > >>> But that brings up another issue: why is MyFdwConnections named that >>> way and why does it have those semantics? That is, why do we really >>> need a list of every FDW connection? I think we really only need the >>> ones that are 2PC-capable writes. If a non-2PC-capable foreign server >>> is involved in the transaction, then we don't really to keep it in a >>> list. We just need to flag the transaction as not locally prepareable >>> i.e. clear TwoPhaseReady. I think that this could all be figured out >>> in a much less onerous way: if we ever perform a modification of a >>> foreign table, have nodeModifyTable.c either mark the transaction >>> non-preparable by setting XACT_FLAGS_FDWNOPREPARE if the foreign >>> server is not 2PC capable, or otherwise add the appropriate >>> information to MyFdwConnections, which can then be renamed to indicate >>> that it contains only information about preparable stuff. Then you >>> don't need each individual FDW to be responsible about calling some >>> new function; the core code just does the right thing automatically. >> >> I could not get this comment. Did you mean that the foreign >> transaction on not 2PC-capable foreign server should be end in the >> same way as before (i.g. by XactCallback)? >> >> Currently, because there is not FDW API to end foreign transaction, >> almost FDWs use XactCallbacks to end the transaction. But after >> introduced new FDW APIs, I think it's better to use FDW APIs to end >> transactions rather than using XactCallbacks. Otherwise we end up with >> having FDW APIs for 2PC (prepare and resolve) and XactCallback for >> ending the transaction, which would be hard to understand. So I've >> changed the foreign transaction management so that core code >> explicitly asks FDW to end/prepare a foreign transaction instead of >> ending it by individual FDWs. After introduced new FDW APIs, core code >> can have the information of all foreign servers involved with the >> transaction and call each APIs at appropriate timing. > > Well, it's one thing to introduce a new API. It's another thing to > require existing FDWs to be updated to use it. There are a lot of > existing FDWs out there, and I think that it is needlessly unfriendly > to force them all to be updated for v11 (or whenever this gets > committed) even if we think the new API is clearly better. FDWs that > work today should continue working after this patch is committed. Agreed. > Separately, I think there's a question of whether the new API is in > fact better -- I'm not sure I have a completely well-formed opinion > about that yet. I think one API should do one job. In terms of keeping simple API the current four APIs would be not bad. AFAICS other DB server that support 2PC such as MySQL, Oracle etc can be satisfied with this API. I'm thinking to remove a user mapping id from argument of three APIs (preparing, resolution and end). Because user mapping id can be found by {serverid, userid}. Also we can make get prepare-id API an optional API. That is, if FDW doesn't define this API the core code always passes px_<randam>_<serverid>_<userid> by default. For foreign server that has the short limit of prepare-id length, it need to provide the API. >> In FdwXactResolveForeignTranasction(), resolver concludes the fate of >> transaction by seeing the status of fdwxact entry and the state of >> local transaction in clog. what I need to do is making that function >> log a complaint in commit case if couldn't find the prepared >> transaction, and not do that in abort case. > > +1. > >> Also, postgres_fdw don't >> raise an error even if we could not find prepared transaction on >> foreign server because it might have been resolved by other process. > > +1. > >> But this is now responsible by FDW. I should change it to resolver >> side. That is, FDW can raise error in ordinarily way but core code >> should catch and process it. > > I don't understand exactly what you mean here. Hmm I think I got confused. My understanding is that logging a complaint in commit case and not doing that in abort case if prepared transaction doesn't exist is a core code's job. An API contract here is that FDW raise an error with ERRCODE_UNDEFINED_OBJECT error code if there is no such transaction. Since it's an expected case in abort case for the fdwxact manager, the core code can regard the error as not actual problem. So for FDWs basically they can raise an error if resolution is failed for whatever reason. But postgres_fdw doesn't raise an error in case where prepared transaction doesn't exist because in PostgreSQL prepared transaction can be ended by other process. The pseudo-code of resolution part is following. --- // Core code of foreign transaction resoliution PG_TRY(); { call API to resolve foreign transaction } PG_CATCH(); { if (sqlstate is ERRCODE_UNDEFINED_OBJECT) { if (we are committing) raise ERROR else // we are aborting raise WARNING // this is an expected result } else raise ERROR // re-throw the error } PG_END_TRY(); // postgres_fdw code of prepared transaction resolution do "COMMIT/ROLLBACK PREPARED" if (failed to resolve) { if (sqlstate is ERRCODE_UNDEFINED_OBJECT) { raise WARNING return true; // regards as succeeded } else raise ERROR // failed to resolve } return true; --- Or do you mean that FDWs should not raise an error if there is the prepared transaction, and then core code doesn't need to check sqlstate in case of error? > >> You're right. Perhaps we can deal with it by PrescanFdwXacts until >> reach consistent point, and then have vac_update_datfrozenxid check >> local xids of un-resolved fdwxact to determine the new datfrozenxid. >> Since the local xids of un-resolved fdwxacts would not be relevant >> with vacuuming, we don't need to include it to snapshot and >> GetOldestXmin etc. Also we hint to resolve fdwxact when near >> wraparound. > > I agree with you about snapshots, but I'm not sure about GetOldestXmin. Hmm, although I've thought concern in case where we don't consider local xids of un-resolved fdwxact in GetOldestXmin, I could not find problem. Could you share your concern if you have? I'll try to find a possibility based on it. Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center