Hi, Kristian!

Now, WL#132 - Transaction coordinator plugin

> ============= High-Level Specification
...
> In current MariaDB, we have two different TC implementations (as well
> as a "dummy" empty implementation that I do not know if is used).

The code in mysqld.cc is

  tc_log= (total_ha_2pc > 1 ? (opt_bin_log ?
                               (TC_LOG *) &mysql_bin_log :
                               (TC_LOG *) &tc_log_mmap) :
           (TC_LOG *) &tc_log_dummy);

so tc_log_dummy is selected when there is at most one XA-capable engine.
But MySQL does not use 2PC for a transaction unless it has at least two
XA-capable participants. In other words, tc_log_dummy is never actually
used.

> Binary log
> ----------
>
> The binary log implements also a "fake" storage engine, mainly to hook
> into the commit (and prepare) phase of transaction processing. This is
> mainly used for statements in non-transactional engines, which are
> "committed" and written to the binary log outside of the TC and
> log_xid() framework.

No, this is used to make the number of XA-capable transaction
participants more than one, which forces MySQL to use 2PC.

> TC interface subclasses
> -----------------------
>
> The MWL#116 has two different algorithms for handling commit order and
> invoking prepare_ordered() and commit_ordered() handler methods:
>
> - One used with TC_MMAP, which needs no correspondence between
>   engines and TC. This uses the existing log_xid() interface.
>
> - One used with the binary log TC, which ensures same commit order in
>   engines and binary log, and which uses a new single-threaded
>   group_log_xid() TC interface to efficiently do group commit.
>
> In the prototype patch for MWL#116, these two methods are mixed with
> each other in the function ha_commit_trans(), and the logic is quite
> complex. Using the log_and_order() TC generalisation provides a nice
> cleanup of this.
>
> We implement two subclasses of the TC interface:
>
> - One class TC_LOG_unordered for the method used with TC_MMAP. This
>   implements the old log_xid() interface.
>
> - One class TC_LOG_group_commit for the method used for the binary
>   log.
>   This implements the new group_log_xid() interface.
>
> Each subclass implements the corresponding algorithm for invoking
> prepare_ordered() and commit_ordered(), using the same mechanisms as
> in MWL#116, but implemented in a cleaner way. The ha_commit_trans()
> function then has no details about prepare_ordered() or
> commit_ordered(), it just calls into tc_log->log_and_order(), which
> handles the necessary details.
>
> Thus a simple TC plugin similar to the binary log or TC_MMAP can
> implement one of the simple interfaces log_xid() or group_log_xid(),
> without having to worry about prepare_ordered() and commit_ordered().
> But a plugin like Galera that needs to do more can implement the more
> general interface.

I still see no real value in keeping or supporting the log_xid()
interface. I think we can implement just one interface, group_log_xid(),
and that's enough.

> ============= Low-Level Design
...
> log_and_order()
>     Requests a decision to commit (non-zero return) or rollback (zero
>     return) of the transaction. At this point, the transaction has
>     been successfully prepared in all engines.
>
>     The method must call run_prepare_ordered(), in a way so that calls
>     in different threads happen in the order that the transactions are
>     committed. This call must be protected by the global
>     LOCK_prepare_ordered mutex.
>
>     The method must then call run_commit_ordered(), protected by
>     LOCK_commit_ordered, again so that different threads are called in
>     the order that transactions are committed.
>
>     The idea with prepare_ordered() is to call it as early as possible
>     after commit order has been decided, for example to release locks
>     early. In particular, a transaction can still be rolled back after
>     prepare_ordered() (for example in case of a crash). In contrast,
>     commit_ordered() may only be called after the transaction is
>     durably committed in the TC.
>
>     If need_prepare_ordered or need_commit_ordered is passed as FALSE,
>     then the corresponding call need not be done. It is safe to do it
>     anyway, however omitting it avoids the need to take a global
>     mutex.

Why would this ever be needed? (I mean need_prepare_ordered or
need_commit_ordered being FALSE.)

...
> A TC based on this interface overrides group_log_xid() and
> xid_log_after() instead of log_and_order(), and again does not need to
> deal with any {prepare,commit}_ordered().

Why do you need xid_log_after() here?

General comment:

Wouldn't it be simpler to provide only the group_log_xid() interface,
with no log_and_order() or log_xid()? The TC plugin gets the list in
group_log_xid(); it can reorder the list any way it wants, call
prepare_ordered() and commit_ordered() as needed, and so on.

In this interpretation, group_log_xid() can meet all the use cases. And
there's no need to create a multitude of methods that one needs to get
familiar with before implementing a TC plugin.

Regards,
Sergei

P.S. Minor detail: there could be helper functions, like
iterate_the_list_and_call_prepare_ordered(), that the plugin can use.
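
To make the general comment and the P.S. more concrete, here is a
minimal sketch of how a single-interface plugin might look. It is an
illustration under assumptions, not the server code: Trx, Trace,
TC_sketch, the two helper functions and the bool return convention are
all hypothetical stand-ins. Only the names group_log_xid(),
prepare_ordered(), commit_ordered() and the two mutexes come from the
worklog text, and the engine calls are simulated by appending to a
trace.

```cpp
// Sketch only: every type and function here is a hypothetical stand-in,
// not the MariaDB server API.
#include <cassert>
#include <mutex>
#include <string>
#include <vector>

// Stand-in for one transaction that has been prepared in all engines.
struct Trx {
    unsigned long xid;
};

// Records the simulated engine callbacks so their order can be inspected.
using Trace = std::vector<std::string>;

// Stand-ins for the global mutexes named in the worklog.
static std::mutex LOCK_prepare_ordered;
static std::mutex LOCK_commit_ordered;

// Helper a plugin could reuse: walk the list in commit order and
// "call" prepare_ordered() for each transaction under the mutex.
void call_prepare_ordered(const std::vector<Trx>& group, Trace& trace) {
    std::lock_guard<std::mutex> guard(LOCK_prepare_ordered);
    for (const Trx& t : group)
        trace.push_back("prepare_ordered:" + std::to_string(t.xid));
}

// Same idea for commit_ordered(); only valid after the durable write.
void call_commit_ordered(const std::vector<Trx>& group, Trace& trace) {
    std::lock_guard<std::mutex> guard(LOCK_commit_ordered);
    for (const Trx& t : group)
        trace.push_back("commit_ordered:" + std::to_string(t.xid));
}

// A TC with a single entry point. The plugin owns the ordering and
// decides when the helpers run.
class TC_sketch {
public:
    // Assumed convention: true means commit the group, false means roll back.
    bool group_log_xid(const std::vector<Trx>& group, Trace& trace) {
        assert(!group.empty());
        // Commit order is the list order; call prepare_ordered() early.
        call_prepare_ordered(group, trace);
        // Pretend to write the whole group durably in one step.
        for (const Trx& t : group)
            log_.push_back(t.xid);
        // The group is now "durable", so commit_ordered() is allowed.
        call_commit_ordered(group, trace);
        return true;
    }

    const std::vector<unsigned long>& logged() const { return log_; }

private:
    std::vector<unsigned long> log_;  // simulated durable TC log
};
```

A caller would build a std::vector<Trx> in the desired commit order,
pass it to group_log_xid() together with an empty Trace, and then
inspect the trace: all prepare_ordered entries appear in list order
before the simulated log write, and all commit_ordered entries appear in
the same order after it. A plugin that needs something different, such
as reordering the list first, would do so inside group_log_xid() before
calling the helpers.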