Re: [HACKERS] PageRepairFragmentation performance

José Luis Tallón Tue, 18 Nov 2014 10:39:04 -0800

On 11/18/2014 07:03 PM, Heikki Linnakangas wrote:

When profiling replay of the WAL generated by pgbench, I noticed that PageRepairFragmentation consumes a large fraction of the CPU time:
[snip]
1. Replace the qsort with something cheaper. The itemid arrays being sorted are small, a few hundred items at most, usually even smaller. In this pgbench test case I used, the typical size is about 60. With a small array a plain insertion sort is cheaper than the generic qsort(), because it can avoid the function-call overhead etc. involved with generic qsort. Or we could use something smarter, like a radix sort, knowing that we're sorting small integers. Or we could implement an inlineable version of qsort and use that.
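
For illustration, a plain insertion sort over such a small array could look roughly like the sketch below. The SortEntry struct and the function name are hypothetical stand-ins, not PostgreSQL's actual definitions; the sketch assumes entries are ordered by descending page offset.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an item-id sort entry (not PostgreSQL's struct). */
typedef struct
{
    uint16_t index;   /* line pointer number */
    uint16_t offset;  /* byte offset of the tuple within the page */
    uint16_t len;     /* tuple length in bytes */
} SortEntry;

/* Sort entries by descending offset; no comparator callback is involved. */
static void
insertion_sort_desc(SortEntry *a, size_t n)
{
    assert(a != NULL || n == 0);

    for (size_t i = 1; i < n; i++)
    {
        SortEntry key = a[i];
        size_t    j = i;

        /* Shift entries with smaller offsets one slot to the right. */
        while (j > 0 && a[j - 1].offset < key.offset)
        {
            a[j] = a[j - 1];
            j--;
        }
        a[j] = key;
    }
}
```

Because the comparison is written inline, the compiler does not have to go through a function pointer for every comparison, which is the overhead the generic qsort() pays.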

IIRC, the theoretical complexity of quicksort and radix sort should be approximately the same for 256-1024 items: O(n*log(n)) vs O(d*n), where d is ~log2(n) or just 16 in this case. However, lexicographical ("bitstring-wise") ordering might not be what we are aiming for here.
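
To make the O(d*n) side concrete, here is a minimal sketch of an LSD radix sort over 16-bit keys, processing one byte per pass (so two passes cover the 16 bits). The function name and the MAX_ITEMS bound are assumptions made for this example. For fixed-width unsigned keys, this byte-wise ordering coincides with numeric ordering.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ITEMS 1024  /* assumed upper bound on array size for this sketch */

/* Sort 16-bit keys ascending with two stable 8-bit counting passes. */
static void
radix_sort_u16(uint16_t *keys, size_t n)
{
    uint16_t  tmp[MAX_ITEMS];
    uint16_t *src = keys;
    uint16_t *dst = tmp;

    assert(n <= MAX_ITEMS);

    for (int shift = 0; shift < 16; shift += 8)
    {
        size_t count[257] = {0};

        /* Histogram of the current byte, stored one slot to the right. */
        for (size_t i = 0; i < n; i++)
            count[((src[i] >> shift) & 0xFF) + 1]++;

        /* Prefix sums turn counts into starting positions. */
        for (int b = 0; b < 256; b++)
            count[b + 1] += count[b];

        /* Stable scatter into the other buffer. */
        for (size_t i = 0; i < n; i++)
            dst[count[(src[i] >> shift) & 0xFF]++] = src[i];

        uint16_t *swap = src;
        src = dst;
        dst = swap;
    }
    /* After an even number of passes the sorted data is back in keys. */
}
```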

AFAIK, an inlined quicksort should be about the best performing sort available (most of the enhancement coming from staying within the I-cache)

2. Instead of sorting the array and using memmove in-place, we could copy all the tuples to a temporary buffer in arbitrary order, and finally copy the temporary copy back to the buffer. That requires two memory copies per tuple, instead of one memmove, but memcpy() is pretty darn fast. It would be a loss when there are only a few large tuples on the page, so that avoiding the sort doesn't help, or when the tuples are mostly already in the correct places, so that most of the memmove()s are no-ops. But with a lot of small tuples, it would be a win, and it would be simple.
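
A rough sketch of that copy-out, copy-back idea follows. It is a simplification under stated assumptions: the page is treated as a flat 8192-byte array, the Item struct and function name are hypothetical, and alignment padding and any special space at the end of the page are ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE 8192  /* assumed page size for this sketch */

/* Hypothetical item descriptor: where a tuple lives and how long it is. */
typedef struct
{
    uint16_t offset;
    uint16_t len;
} Item;

/*
 * Pack all tuples against the end of the page, in whatever order the items
 * array happens to be in. Returns the new start of the tuple data area.
 */
static uint16_t
compact_via_scratch(unsigned char *page, Item *items, size_t nitems)
{
    unsigned char scratch[PAGE_SIZE];
    size_t        upper = PAGE_SIZE;

    for (size_t i = 0; i < nitems; i++)
    {
        assert((size_t) items[i].offset + items[i].len <= PAGE_SIZE);
        assert(items[i].len <= upper);

        /* First copy: tuple goes into its final position in the scratch page. */
        upper -= items[i].len;
        memcpy(scratch + upper, page + items[i].offset, items[i].len);
        items[i].offset = (uint16_t) upper;
    }

    /* Second copy: the packed tuple area goes back in one block. */
    memcpy(page + upper, scratch + upper, PAGE_SIZE - upper);
    return (uint16_t) upper;
}
```

No sort is needed here, which is the point of the approach; the price is that tuples end up in item order rather than in their previous physical order.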

Memmove *should* be no slower than memcpy... if both are actually translated by the compiler to use intrinsics as opposed to calling the functions, as seems to be done here (cf. __memmove_ssse3_back). A simple "if" to eliminate the no-op memmoves might do the trick as well.
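
The "if" could be as simple as the sketch below. It assumes the items are already sorted by descending offset and do not overlap; the Item struct, the function name and the 8192-byte page are assumptions for this example only. Calling memmove() with identical source and destination is well defined, so the check is purely an optimization.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE 8192  /* assumed page size for this sketch */

/* Hypothetical item descriptor: where a tuple lives and how long it is. */
typedef struct
{
    uint16_t offset;
    uint16_t len;
} Item;

/*
 * Compact tuples toward the end of the page in place. "sorted" must be
 * ordered by descending offset. Returns the new start of the tuple data.
 */
static uint16_t
compact_in_place(unsigned char *page, Item *sorted, size_t nitems)
{
    size_t upper = PAGE_SIZE;

    for (size_t i = 0; i < nitems; i++)
    {
        assert((size_t) sorted[i].offset + sorted[i].len <= upper);

        upper -= sorted[i].len;

        /* Skip the call entirely when the tuple is already in place. */
        if (upper != sorted[i].offset)
        {
            memmove(page + upper, page + sorted[i].offset, sorted[i].len);
            sorted[i].offset = (uint16_t) upper;
        }
    }
    return (uint16_t) upper;
}
```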


Just my two (euro) cents, though

The second option would change behaviour slightly, as the tuples would be placed on the page in different physical order than before. It wouldn't be visible to users, though.
I spent some time hacking approach 1, and replaced the qsort() call with a bucket sort. I'm not sure if a bucket sort is optimal, or better than a specialized quicksort implementation, but it seemed simple.
With the testcase I've been using - replaying about 2GB of WAL generated by pgbench - this reduces the replay time from about 55 s to 45 s.
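
For readers without the patch at hand, the general shape of a bucket sort on page offsets might look like the sketch below. This is not the attached patch: the SortEntry struct, the bucket count, the MAX_ITEMS bound and the final insertion-sort pass are all assumptions chosen to keep the example short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE   8192                    /* assumed page size */
#define NBUCKETS    32                      /* assumed bucket count */
#define BUCKET_SPAN (PAGE_SIZE / NBUCKETS)  /* offsets covered per bucket */
#define MAX_ITEMS   1024                    /* assumed bound on item count */

/* Hypothetical stand-in for an item-id sort entry (not PostgreSQL's struct). */
typedef struct
{
    uint16_t index;
    uint16_t offset;
    uint16_t len;
} SortEntry;

/* Bucket 0 holds the highest offsets, so the output comes out descending. */
static size_t
bucket_of(uint16_t offset)
{
    return NBUCKETS - 1 - offset / BUCKET_SPAN;
}

/* Sort entries by descending offset. */
static void
bucket_sort_desc(SortEntry *a, size_t n)
{
    SortEntry tmp[MAX_ITEMS];
    size_t    start[NBUCKETS + 1] = {0};

    assert(n <= MAX_ITEMS);

    /* Count entries per bucket, then turn counts into start positions. */
    for (size_t i = 0; i < n; i++)
    {
        assert(a[i].offset < PAGE_SIZE);
        start[bucket_of(a[i].offset) + 1]++;
    }
    for (size_t b = 0; b < NBUCKETS; b++)
        start[b + 1] += start[b];

    /* Scatter entries into their buckets. */
    for (size_t i = 0; i < n; i++)
        tmp[start[bucket_of(a[i].offset)]++] = a[i];

    /* Entries are now only out of order within a bucket; fix that cheaply. */
    for (size_t i = 1; i < n; i++)
    {
        SortEntry key = tmp[i];
        size_t    j = i;

        while (j > 0 && tmp[j - 1].offset < key.offset)
        {
            tmp[j] = tmp[j - 1];
            j--;
        }
        tmp[j] = key;
    }

    memcpy(a, tmp, n * sizeof(SortEntry));
}
```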

Not bad at all... though I suspect most of it might come from staying within the I-cache as opposed to regular qsort.

The smaller itemIdSortData structure surely helps a bit, too :)

Thoughts? Attached is the patch I put together. It's actually two patches: the first is just refactoring, moving the code common to PageRepairFragmentation, PageIndexMultiDelete, and PageIndexDeleteNoCompact into a function. The second replaces the qsort().

Definitely worthwhile, even if just for the refactor. The speed-up sounds very good, too.




Thanks,

    / J.L.


