Thank you for the comment. On Fri, May 11, 2018 at 3:57 AM, Robert Haas <robertmh...@gmail.com> wrote: > On Tue, Feb 27, 2018 at 2:21 AM, Masahiko Sawada <sawada.m...@gmail.com> > wrote: >> I might be missing your point. As for API breaking, this patch doesn't >> break any existing FDWs. All new APIs I proposed are dedicated to 2PC. >> In other words, FDWs that work today can continue working after this >> patch gets committed, but if FDWs want to support atomic commit then >> they should be updated to use new APIs. The reason why the calling of >> FdwXactRegisterForeignServer is necessary is that the core code >> controls the foreign servers that involved with the transaction but >> the whether each foreign server uses 2PC option (two_phase_commit) is >> known only on FDW code. We can eliminate the necessity of calling >> FdwXactRegisterForeignServer by moving 2PC option from fdw-level to >> server-level in order not to enforce calling the registering function >> on FDWs. If we did that, the user can use the 2PC option as a >> server-level option. > > Well, FdwXactRegisterForeignServer has a "bool two_phase_commit" > argument. If it only needs to be called by FDWs that support 2PC, > then that argument is unnecessary. If it needs to be called by all > FDWs, then it breaks existing FDWs that don't call it. >
I understood now. For now since FdwXactRegisterForeignServer needs to be called by only FDWs that support 2PC, I will remove the argument. >>>> But this is now responsible by FDW. I should change it to resolver >>>> side. That is, FDW can raise error in ordinarily way but core code >>>> should catch and process it. >>> >>> I don't understand exactly what you mean here. >> >> Hmm I think I got confused. My understanding is that logging a >> complaint in commit case and not doing that in abort case if prepared >> transaction doesn't exist is a core code's job. An API contract here >> is that FDW raise an error with ERRCODE_UNDEFINED_OBJECT error code if >> there is no such transaction. Since it's an expected case in abort >> case for the fdwxact manager, the core code can regard the error as >> not actual problem. > > In general, it's not safe to catch an error and continue unless you > protect the code that throws the error by a sub-transaction. That > means we shouldn't expect the FDW to throw an error when the prepared > transaction isn't found and then just have the core code ignore the > error. Instead the FDW should return normally and, if the core code > needs to know whether the prepared transaction was found, then the FDW > should indicate this through a return value, not an ERROR. > >> Or do you mean that FDWs should not raise an error if there is the >> prepared transaction, and then core code doesn't need to check >> sqlstate in case of error? > > Right. As noted above, that's unsafe, so we shouldn't do it. Thank you. I will think the API contract again based on your suggestion. > >>>> You're right. Perhaps we can deal with it by PrescanFdwXacts until >>>> reach consistent point, and then have vac_update_datfrozenxid check >>>> local xids of un-resolved fdwxact to determine the new datfrozenxid. >>>> Since the local xids of un-resolved fdwxacts would not be relevant >>>> with vacuuming, we don't need to include it to snapshot and >>>> GetOldestXmin etc. Also we hint to resolve fdwxact when near >>>> wraparound. >>> >>> I agree with you about snapshots, but I'm not sure about GetOldestXmin. >> >> Hmm, although I've thought concern in case where we don't consider >> local xids of un-resolved fdwxact in GetOldestXmin, I could not find >> problem. Could you share your concern if you have? I'll try to find a >> possibility based on it. > > I don't remember exactly what I was thinking when I wrote that, but I > think the point is that GetOldestXmin() does a bunch of things other > than control the threshold for VACUUM, and we'd need to study them all > and look for problems. For example, it won't do for an XID to get > reused while it's still associated with an unresolved fdwxact. We > therefore certainly need such XIDs to hold back the cluster-wide > threshold for clog truncation in some manner -- and maybe that > involves GetOldestXmin(). Or maybe not. But anyway the point, > broadly considered, is that GetOldestXmin() is used in various ways, > and I don't know if we've thought through all of the consequences in > regard to this new feature. Okay, I'll have GetOldestXmin() consider the oldest local xid of un-resolved fdwxact as well in the next version patch for more safety, while considering more efficient ways. > > Can I ask what your time frame is for updating this patch? > Considering how much work appears to remain, if you want to get this > committed to v12, it would be best to get started as early as > possible. > I'll post an updated patch by PGCon at the latest, hopefully in the next week. Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center