On Tue, Feb 27, 2018 at 2:21 AM, Masahiko Sawada <sawada.m...@gmail.com> wrote: > I might be missing your point. As for API breaking, this patch doesn't > break any existing FDWs. All new APIs I proposed are dedicated to 2PC. > In other words, FDWs that work today can continue working after this > patch gets committed, but if FDWs want to support atomic commit then > they should be updated to use new APIs. The reason why the calling of > FdwXactRegisterForeignServer is necessary is that the core code > controls the foreign servers that involved with the transaction but > the whether each foreign server uses 2PC option (two_phase_commit) is > known only on FDW code. We can eliminate the necessity of calling > FdwXactRegisterForeignServer by moving 2PC option from fdw-level to > server-level in order not to enforce calling the registering function > on FDWs. If we did that, the user can use the 2PC option as a > server-level option.
Well, FdwXactRegisterForeignServer has a "bool two_phase_commit" argument. If it only needs to be called by FDWs that support 2PC, then that argument is unnecessary. If it needs to be called by all FDWs, then it breaks existing FDWs that don't call it. >>> But this is now responsible by FDW. I should change it to resolver >>> side. That is, FDW can raise error in ordinarily way but core code >>> should catch and process it. >> >> I don't understand exactly what you mean here. > > Hmm I think I got confused. My understanding is that logging a > complaint in commit case and not doing that in abort case if prepared > transaction doesn't exist is a core code's job. An API contract here > is that FDW raise an error with ERRCODE_UNDEFINED_OBJECT error code if > there is no such transaction. Since it's an expected case in abort > case for the fdwxact manager, the core code can regard the error as > not actual problem. In general, it's not safe to catch an error and continue unless you protect the code that throws the error by a sub-transaction. That means we shouldn't expect the FDW to throw an error when the prepared transaction isn't found and then just have the core code ignore the error. Instead the FDW should return normally and, if the core code needs to know whether the prepared transaction was found, then the FDW should indicate this through a return value, not an ERROR. > Or do you mean that FDWs should not raise an error if there is the > prepared transaction, and then core code doesn't need to check > sqlstate in case of error? Right. As noted above, that's unsafe, so we shouldn't do it. >>> You're right. Perhaps we can deal with it by PrescanFdwXacts until >>> reach consistent point, and then have vac_update_datfrozenxid check >>> local xids of un-resolved fdwxact to determine the new datfrozenxid. >>> Since the local xids of un-resolved fdwxacts would not be relevant >>> with vacuuming, we don't need to include it to snapshot and >>> GetOldestXmin etc. Also we hint to resolve fdwxact when near >>> wraparound. >> >> I agree with you about snapshots, but I'm not sure about GetOldestXmin. > > Hmm, although I've thought concern in case where we don't consider > local xids of un-resolved fdwxact in GetOldestXmin, I could not find > problem. Could you share your concern if you have? I'll try to find a > possibility based on it. I don't remember exactly what I was thinking when I wrote that, but I think the point is that GetOldestXmin() does a bunch of things other than control the threshold for VACUUM, and we'd need to study them all and look for problems. For example, it won't do for an XID to get reused while it's still associated with an unresolved fdwxact. We therefore certainly need such XIDs to hold back the cluster-wide threshold for clog truncation in some manner -- and maybe that involves GetOldestXmin(). Or maybe not. But anyway the point, broadly considered, is that GetOldestXmin() is used in various ways, and I don't know if we've thought through all of the consequences in regard to this new feature. Can I ask what your time frame is for updating this patch? Considering how much work appears to remain, if you want to get this committed to v12, it would be best to get started as early as possible. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company