On Fri, Aug 02, 2019 at 04:45:43PM +0300, Konstantin Knizhnik wrote:


On 27.06.2019 21:33, Andrey Borodin wrote:

On 13 May 2019, at 12:14, Michael Paquier <mich...@paquier.xyz> wrote:

Decompression can matter a lot for mostly-read workloads and
compression can become a bottleneck for heavy-insert loads, so
improving compression or decompression should be two separate
problems, not two problems linked.  Any improvement in one or the
other, or even both, is nice to have.
Here's a patch hacked by Vladimir for compression.

Key differences (as far as I can see; maybe Vladimir will post a more complete
list of optimizations):
1. Use functions instead of function-like macros: not surprisingly, they are
easier to optimize and impose fewer constraints on the compiler.
2. More compact hash table: use indexes instead of pointers.
3. More robust segment comparison: like memcmp, but returns the index of the
first differing byte.
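
For illustration only, here is a minimal sketch of items 2 and 3. It is not the code from the patch: the names match_length, HistEntry, History, hist_init and hist_add, as well as the table sizes, are hypothetical, and unlike pglz this sketch never recycles history entries.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Item 3: like memcmp, but instead of an ordering it returns the index of
 * the first differing byte, i.e. the length of the common prefix of a and b,
 * capped at limit.
 */
static size_t
match_length(const unsigned char *a, const unsigned char *b, size_t limit)
{
    size_t      n = 0;

    assert(a != NULL && b != NULL);
    while (n < limit && a[n] == b[n])
        n++;
    return n;
}

/*
 * Item 2: chain history entries by 16-bit index instead of by pointer.
 * Sizes are illustrative assumptions, not the values used by pglz.
 */
#define HIST_BUCKETS 256        /* must be a power of two */
#define HIST_ENTRIES 1024
#define HIST_NONE    0          /* index 0 is reserved to mean "no entry" */

typedef struct HistEntry
{
    uint16_t    next;           /* index of the next entry in the same bucket */
    uint32_t    pos;            /* input offset where this sequence starts */
} HistEntry;

typedef struct History
{
    uint16_t    buckets[HIST_BUCKETS];  /* head entry index per bucket */
    HistEntry   entries[HIST_ENTRIES];  /* entries[0] is never used */
    uint16_t    used;                   /* highest entry index handed out */
} History;

static void
hist_init(History *h)
{
    /* All-zero state means every bucket and every link is HIST_NONE. */
    memset(h, 0, sizeof(*h));
}

/* Push a new entry at the head of its bucket chain and return its index. */
static uint16_t
hist_add(History *h, uint32_t hash, uint32_t pos)
{
    uint32_t    bucket = hash & (HIST_BUCKETS - 1);
    uint16_t    idx;

    assert(h->used < HIST_ENTRIES - 1);
    idx = ++h->used;            /* valid entries start at index 1 */
    h->entries[idx].pos = pos;
    h->entries[idx].next = h->buckets[bucket];
    h->buckets[bucket] = idx;
    return idx;
}
```

With 16-bit indexes each chain link takes 2 bytes instead of the 8 a pointer needs on a 64-bit machine, so more of the table fits in cache. Returning the mismatch index lets the caller use the result directly as the match length.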

In a weighted mix of different data (same as for compression), the overall
speedup is 1.43x on my machine.

The current implementation is integrated into the test_pglz suite for
benchmarking purposes[0].

Best regards, Andrey Borodin.

[0] https://github.com/x4m/test_pglz

It took me some time to understand that your memcpy optimization is correct ;) I have tested different ways of optimizing this fragment of code, but failed to outperform your implementation!
The results on my computer are similar to yours:

Decompressor score (summ of all times):
NOTICE:  Decompressor pglz_decompress_hacked result 6.627355
NOTICE:  Decompressor pglz_decompress_hacked_unrolled result 7.497114
NOTICE:  Decompressor pglz_decompress_hacked8 result 7.412944
NOTICE:  Decompressor pglz_decompress_hacked16 result 7.792978
NOTICE:  Decompressor pglz_decompress_vanilla result 10.652603

Compressor score (summ of all times):
NOTICE:  Compressor pglz_compress_vanilla result 116.970005
NOTICE:  Compressor pglz_compress_hacked result 89.706105


But ...  below are results for lz4:

Decompressor score (summ of all times):
NOTICE:  Decompressor lz4_decompress result 3.660066
Compressor score (summ of all times):
NOTICE:  Compressor lz4_compress result 10.288594

That is roughly a 2x advantage in decompression speed and a 10x advantage in compression speed. So maybe instead of "hacking" the pglz algorithm we would be better off switching to lz4?
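
To put exact figures on that comparison, the ratios can be computed directly from the scores reported above. This is just a sketch of the arithmetic; the helper name speedup and the constant names are hypothetical.

```c
#include <assert.h>

/* Scores copied from the NOTICE lines above (sum of all times, lower is better). */
static const double pglz_vanilla_decompress = 10.652603;
static const double pglz_hacked_decompress = 6.627355;
static const double lz4_decompress_time = 3.660066;
static const double pglz_vanilla_compress = 116.970005;
static const double pglz_hacked_compress = 89.706105;
static const double lz4_compress_time = 10.288594;

/* How many times faster the candidate is than the baseline. */
static double
speedup(double baseline_time, double candidate_time)
{
    assert(candidate_time > 0.0);
    return baseline_time / candidate_time;
}
```

For example, speedup(pglz_hacked_decompress, lz4_decompress_time) comes out at about 1.8 and speedup(pglz_hacked_compress, lz4_compress_time) at about 8.7. Against vanilla pglz the same ratios are about 2.9 and 11.4.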


I think we should just bite the bullet and add an initdb option to pick
the compression algorithm. That's been discussed repeatedly, but we never
ended up actually doing that. See for example [1].

If there's anyone willing to put some effort into getting this feature
over the line, I'm willing to do reviews & commit. It's a seemingly
small change with rather insane potential impact.

But even if we end up doing that, it still makes sense to optimize the
hell out of pglz, because existing systems will still use that
(pg_upgrade can't switch from one compression algorithm to another).

regards

[1] 
https://www.postgresql.org/message-id/flat/55341569.1090107%402ndquadrant.com

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
