On 12/6/23 09:56, Amit Kapila wrote: > On Tue, Dec 5, 2023 at 10:23 PM Tomas Vondra > <tomas.von...@enterprisedb.com> wrote: >> >> On 12/5/23 13:17, Amit Kapila wrote: >> >>> (b) for transactional >>> cases, we see overhead due to traversing all the top-level txns and >>> check the hash table for each one to find whether change is >>> transactional. >>> >> >> Not really, no. As I explained in my preceding e-mail, this check makes >> almost no difference - I did expect it to matter, but it doesn't. And I >> was a bit disappointed the global hash table didn't move the needle. >> >> Most of the time is spent in >> >> 78.81% 0.00% postgres postgres [.] DecodeCommit (inlined) >> | >> ---DecodeCommit (inlined) >> | >> |--72.65%--SnapBuildCommitTxn >> | | >> | --72.61%--SnapBuildBuildSnapshot >> | | >> | --72.09%--pg_qsort >> | | >> | |--66.24%--pg_qsort >> | | | >> >> And there's almost no difference between master and build with sequence >> decoding - see the attached diff-alter-sequence.perf, comparing the two >> branches (perf diff -c delta-abs). >> > > I think in this the commit time predominates which hides the overhead. > We didn't investigate in detail if that can be improved but if we see > a similar case of abort [1], it shows the overhead of > ReorderBufferSequenceIsTransactional(). I understand that aborts won't > be frequent and it is sort of unrealistic test but still helps to show > that there is overhead in ReorderBufferSequenceIsTransactional(). Now, > I am not sure if we can ignore that case because theoretically, the > overhead can increase based on the number of top-level transactions. > > [1]: > https://www.postgresql.org/message-id/TY3PR01MB9889D457278B254CA87D1325F581A%40TY3PR01MB9889.jpnprd01.prod.outlook.com >
But those profiles were with the "old" patch, with one hash table per top-level transaction. I see nothing like that with the patch [1] that replaces that with a single global hash table. With that patch, the ReorderBufferSequenceIsTransactional() took ~0.5% in any tests I did. What did have bigger impact is this: 46.12% 1.47% postgres [.] pg_logical_slot_get_changes_guts | |--45.12%--pg_logical_slot_get_changes_guts | | | |--42.34%--LogicalDecodingProcessRecord | | | | | |--12.82%--xact_decode | | | | | | | |--9.46%--DecodeAbort (inlined) | | | | | | | | | |--8.44%--ReorderBufferCleanupTXN | | | | | | | | | | | |--3.25%--ReorderBufferSequenceCleanup (in) | | | | | | | | | | | | | |--1.59%--hash_seq_search | | | | | | | | | | | | | |--0.80%--hash_search_with_hash_value | | | | | | | | | | | | | --0.59%--hash_search | | | | | | hash_bytes I guess that could be optimized, but it's also a direct consequence of the huge number of aborts for transactions that create relfilenode. For any other workload this will be negligible. regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company