Hello. While working with cluster stuff (DTM, tsDTM) we noted that postgres 2pc transactions is approximately two times slower than an ordinary commit on workload with fast transactions — few single-row updates and COMMIT or PREPARE/COMMIT. Perf top showed that a lot of time is spent in kernel on fopen/fclose, so it worth a try to reduce file operations with 2pc tx.
Now 2PC in postgres does following: * on prepare 2pc data (subxacts, commitrels, abortrels, invalmsgs) saved to xlog and to file, but file not is not fsynced * on commit backend reads data from file * if checkpoint occurs before commit, then files are fsynced during checkpoint * if case of crash replay will move data from xlog to files In this patch I’ve changed this procedures to following: * on prepare backend writes data only to xlog and store pointer to the start of the xlog record * if commit occurs before checkpoint then backend reads data from xlog by this pointer * on checkpoint 2pc data copied to files and fsynced * if commit happens after checkpoint then backend reads files * in case of crash replay will move data from xlog to files (as it was before patch) Most of that ideas was already mentioned in 2009 thread by Michael Paquier http://www.postgresql.org/message-id/c64c5f8b0908062031k3ff48428j824a9a46f2818...@mail.gmail.com where he suggested to store 2pc data in shared memory. At that time patch was declined because no significant speedup were observed. Now I see performance improvements by my patch at about 60%. Probably old benchmark overall tps was lower and it was harder to hit filesystem fopen/fclose limits. Now results of benchmark are following (dual 6-core xeon server): Current master without 2PC: ~42 ktps Current master with 2PC: ~22 ktps Current master with 2PC: ~36 ktps Benchmark done with following script: \set naccounts 100000 * :scale \setrandom from_aid 1 :naccounts \setrandom to_aid 1 :naccounts \setrandom delta 1 100 \set scale :scale+1 BEGIN; UPDATE pgbench_accounts SET abalance = abalance - :delta WHERE aid = :from_aid; UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :to_aid; PREPARE TRANSACTION ':client_id.:scale'; COMMIT PREPARED ':client_id.:scale';
2pc_xlog.diff
Description: Binary data
--- Stas Kelvich Postgres Professional: http://www.postgrespro.com Russian Postgres Company
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers