On Wed, Aug 22, 2018 at 9:09 AM Masahiko Sawada <sawada.m...@gmail.com> wrote:
> On Wed, Aug 22, 2018 at 1:20 PM Chris Travers <chris.trav...@adjust.com> > wrote: > > The two things I would suggest is that rather than auto-detecting (if I > understand your patch correctly) whether prepared transactions are possible > on the other system, making it an option to the foreign server or foreign > table. Otherwise one might enable prepared transactions for one set of > operations on one database and have it automatically cause headaches in > another context. > > Yeah, currently it's an option for foreign servers. The patch adds a > new option "two_phase_commit" to postgres_fdw. > Seems sane. > > > > > The other thing I wasn't quite sure about on your patch was what happens > if, say, someone trips over a power cord while the background worker is > trying to commit things, whether the information is available on the > initiating server when it comes back. whether a DBA has to go out and try > to figure out what needs to be committed remotely, and how this would be > done. If you could explain that process, that would be helpful to me. > > > > (In my approach these would be recorded on the master and an SQL > function could re-initiate the background worker.) > > My approach is almost the same as yours. For details, in the > pre-commit we do WAL-logging for each participants server before > preparing transactions on the remote sides. The WAL has information of > participants foreign servers(foreign server oid, database oid etc) and > its global transaction identifier. Even if plug-pulled during trying > to commit we can recover the global transactions that are not > completed yet and its participants information from WAL. After the > recovery users needs to execute the SQL function to fix the > in-completed global transactions. Since the function can find out > whether the remote transaction should be committed or rollback-ed by > checking CLOG. Does my answer make sense? > Yeah. That is probably more elegant than my solution. I do wonder though if the next phase would not be to add some sort of hook to automatically start the background worker in this case. > > >> > >> > >> > > >> > Moreover since COMMIT PREPARED occurs during the commit hook, not the > precommit hook, it is too late to roll back the local transaction. We > cannot raise errors since this causes a conflict in the commit status of > the local transaction. So when we commit the local transaction we commit > to committing all prepared transactions as soon as possible. Note some > changes need to be made to make this usable in the FDW context, so what I > am hoping is that the dialog helps impact the discussion and options going > forward. > >> > > >> >> > >> >> Also > >> >> since we don't want to wait for COMMIT PREPARED to complete we need > to > >> >> consider that users could cancel the query anytime. To not break the > >> >> current semantics we cannot raise error during 2nd phase of two-phase > >> >> commit but it's not realistic because even the palloc() can raise an > >> >> error. > >> > > >> > > >> > We don't palloc. All memory used here is on the stack. I do allow > for dramatic precondition checks to cause errors but those should never > happen absent some other programmer doing something dramatically unsafe > anyway. For example if you try to double-commit a transaction set..... > >> > >> Sorry, palloc() is just an example. I'm not sure all FDWs can > >> implement all callbacks for two-phase commit without codes that could > >> emit errors. > > > > > > Yeah, but if you are in the commit hook and someone emits an error, > that's wrong because that then tries to rollback an already committed > transaction and the backend rightfully panics. In fact I should probably > strip out the precondition check errors there and issue a warning. It > might sometimes happen when something goes seriously wrong on a system > level, but.... > > In my patch since the commit hook is performed by the background > worker not by the backends it's no problem if someone emits an error > in the commit hook. After the backend prepared transactions on the all > remote side, it enqueue itself to the wait queue. The background > worker gets the global transaction waiting to be completed and commit > prepared transaction on all remote side. After completed the global > transaction the background worker dequeue it. > Seems sane. I was firing off one background worker per global transaction that needed cleanup. This might be an area to think about in terms of questions of larger parallelism in remote writes. > > >> > >> > >> > > >> > There is a possible of system errors if one can no longer write to > the file log but at this point as long as we have logged the phase change > to commit we are able to recover later. > >> > > >> > So in the event where one sees an error here one continues on to the > next transaction in the global transaction set and tries to commit it, etc. > until it runs through the entire set of prepared transactions. Then if > there were any errors it fires off a background worker which re-reads the > log file and goes out to the various foreign servers, checks to see if > there is a prepared transaction, and if so commits it. If the transaction > set state was in rollback, it tries to roll it back instead. If this > errors,, it sleeps for a second and then loops through those which errored > and retries until all are complete. > >> > >> Yeah, the patch has the similar functionality. > >> > >> > The other thing is we record whether we are committing or rolling > back the transaction when we hit the commit or rollback hook. This is > critical because we can imagine a world where the Oracle FDW supports > similar semantics. In that case everything works and is not ordering > dependent. I.e. we can prepare our transactions. Oracle can try and fail, > and rollback, and we rollback all the transactions everywhere. And all we > have to know was we got to the precommit hook and then we rolled back. > >> > >> In my patch the global transaction manager manages each status of > >> foreign servers participating in global transactions with WAL logging. > >> The fate of transaction on foreign server will be determined according > >> to the state of local transaction and their status. WAL logging is > >> important because not only in term of speedup but also supporting > >> streaming replication. > > > > > > So you are optimizing for large numbers of prepared transactions or at a > high rate? > > I don't do optimizations much for the patch as this is the first > implementation. Once the basic feature committed I will do that. > > > > > Also does the background worker get fired again on recovery as needed? > > No. I added new SQL function to fix global transactions. We need to > execute that function manually after recovery. > And that is solely on a case where the db does a full restart. Is it possible the background worker could survive a backend restart? > > >> > >> > >> >> > >> >> The design the patch chose is making backends do only PREPARE and > wait > >> >> for the background worker to complete COMMIT PREPARED. In this design > >> >> the clients get a "committed" only either when successful in commit > on > >> >> all participants or when they cancel the query explicitly. In other > >> >> words, the client will wait for completion of 2nd phase of two-phase > >> >> commit forever unless it cancels. > >> > > >> > > >> > In this approach we make a best effort to commit or rollback (as > appropriate in the state of the global transaction) *all* remote > transactions during global commit or global rollback. It is not guaranteed > but it avoids breaking semantics as much as we can. Also the background > worker here does not need to attach to shared memory since the log has > everything required. COMMIT PREPARED ought to be a fast operation unless > there are network problems but those can affect prepare as well. > >> > > >> > Also imagine a case where you are writing to three dbs. One is on > Oracle, one on DB2, and one on PostgreSQL You successfully prepare your > transaction. DB2 successfully prepares, and then the Oracle db errors for > some reason (maybe a deferred constraint). Does the background worker have > enough information to know to roll back your transaction on the remote side? > >> > >> I think that what the background worker needs to know to rollback > >> remote transactions are how to rollback and what to rollback. How to > >> rollback is defined in each FDWs. > > > > > > Agreed. And naturally same with commits. > > > > My assumption is that each foreign data wrapper would have to set its > own precommit/commit hook callbacks. I think your patch extends the fdw > structure to try to ensure these are done automatically? > > Yes. The patch adds new FDW APIs for the atomic commit such as > prepare, commit, rollback, resolve(2nd phase of 2PC). The FDW > developers who want make their FDW support the atomic commit need to > define these API and call the registration function when transaction > starts. If the FDW of the registered foreign server doesn't support > FDW's atomic commit API the transaction emit an error. > > Regards, > > -- > Masahiko Sawada > NIPPON TELEGRAPH AND TELEPHONE CORPORATION > NTT Open Source Software Center > -- Best Regards, Chris Travers Head of Database Tel: +49 162 9037 210 | Skype: einhverfr | www.adjust.com Saarbrücker Straße 37a, 10405 Berlin