On 20/08/2019 05:04, Justin Pryzby wrote:
    it looks like zedstore with lz4 gets ~4.6x for our largest customer's
    largest table.  zfs using compress=gzip-1 gives 6x compression across
    all their partitioned tables, and I'm surprised it beats zedstore.

I did a quick test with 10 million random IP addresses, stored in text format. I loaded them into a zedstore table ("create table ips (ip text) using zedstore") and poked around a little to see how the space is used.
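Something like this generates the test data; the exact method doesn't matter much, as long as it's 10 million random-looking IPv4 addresses in text form:

create table ips (ip text) using zedstore;
insert into ips
  select (random() * 255)::int || '.' || (random() * 255)::int || '.' ||
         (random() * 255)::int || '.' || (random() * 255)::int
  from generate_series(1, 10000000);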

postgres=# select lokey, nitems, ncompressed, totalsz, uncompressedsz, freespace from pg_zs_btree_pages('ips') where attno=1 and level=0 limit 10;
 lokey | nitems | ncompressed | totalsz | uncompressedsz | freespace
-------+--------+-------------+---------+----------------+-----------
     1 |      4 |           4 |    6785 |           7885 |      1320
   537 |      5 |           5 |    7608 |           8818 |       492
  1136 |      4 |           4 |    6762 |           7888 |      1344
  1673 |      5 |           5 |    7548 |           8776 |       540
  2269 |      4 |           4 |    6841 |           7895 |      1256
  2807 |      5 |           5 |    7555 |           8784 |       540
  3405 |      5 |           5 |    7567 |           8772 |       524
  4001 |      4 |           4 |    6791 |           7899 |      1320
  4538 |      5 |           5 |    7596 |           8776 |       500
  5136 |      4 |           4 |    6750 |           7875 |      1360
(10 rows)

There's on average about 10% of free space on the pages. We're losing quite a bit of ground to ZFS compression right there: I'm sure there's some free space on the heap pages as well, but ZFS compression will squeeze it out.
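For reference, that ~10% figure can be estimated from the same pg_zs_btree_pages() output, assuming the default 8k block size:

select round(avg(freespace) / 8192 * 100, 1) as avg_free_pct
from pg_zs_btree_pages('ips')
where attno = 1 and level = 0;   -- leaf pages of attribute 1, as above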

The compression ratio is indeed not very good. I think one reason is that zedstore compresses relatively small chunks with LZ4, while ZFS surely compresses large blocks in one go. Looking at the above, there are on average about 125 datums packed into each "item" (avg(hikey - lokey) / nitems). I did a quick test with the "lz4" command-line utility, compressing flat files containing random IP addresses.
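Flat files like these can be produced straight from the test table, for example:

copy (select ip from ips limit 125) to '/tmp/125-ips.txt';  -- sample files of
copy (select ip from ips limit 550) to '/tmp/550-ips.txt';  -- ~125/550/750 datums;
copy (select ip from ips limit 750) to '/tmp/750-ips.txt';  -- sizes vary a bit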

$ lz4 /tmp/125-ips.txt
Compressed filename will be : /tmp/125-ips.txt.lz4
Compressed 1808 bytes into 1519 bytes ==> 84.02%
$ lz4 /tmp/550-ips.txt
Compressed filename will be : /tmp/550-ips.txt.lz4
Compressed 7863 bytes into 6020 bytes ==> 76.56%
$ lz4 /tmp/750-ips.txt
Compressed filename will be : /tmp/750-ips.txt.lz4
Compressed 10646 bytes into 8035 bytes ==> 75.47%

The first case is roughly what we do in zedstore currently: we compress about 125 datums as one chunk. The second case is roughly what we would get if we collected 8k worth of datums (a random IP address is about 14 bytes in text form, so roughly 550 of them) and compressed them all as one chunk. The third case simulates allowing the input to be larger than 8k, so that the compressed chunk just fits on an 8k page. There's not much difference between the second and third case, but it's pretty clear that we're being hurt by splitting the input into such small chunks.

The downside of using a larger compression chunk size is that random access becomes more expensive. The on-disk format needs some more thought. That said, I don't actually feel too bad about the current compression ratio; perfect can be the enemy of good.

- Heikki

