Re: LSM tree for Postgres

Konstantin Knizhnik Sun, 09 Aug 2020 00:26:53 -0700



On 09.08.2020 04:53, Alexander Korotkov wrote:


I realize that it is not true LSM.
But still I want to note that it is able to provide a ~10 times increase
in insert speed when the size of the index is comparable with RAM size.
And "true LSM" from RocksDB shows similar results.

It's very far from being shown.  All you've shown is a
naive benchmark.  I don't object that your design can work out in some
cases.  And it's great that we have the lsm3 extension now.  But I
think for PostgreSQL core we should think about a better design.

Sorry, I mean that on this particular benchmark and hardware Lsm3 and RocksDB show similar performance.

It definitely doesn't mean that it will be true in all other cases.

This is one of the reasons why I have published the Lsm3 and RocksDB FDW extensions:

anybody can test them on their own workload.

It would be very interesting for me to know the results, because I certainly understand that measuring random insert performance in a dummy table is not enough to draw any conclusions.

And I certainly do not want to say that we do not need a "right" LSM implementation inside Postgres core.

It just requires an order of magnitude more effort.

And there are many questions and challenges. For example, the Postgres buffer size (8kB) seems to be too small for LSM. Should an LSM implementation bypass the Postgres buffer cache? There are pros and cons...

Another issue is logging. Should we just log all operations with LSM in WAL in the usual way (as it is done for nbtree and Lsm3)? It seems to me that for LSM, alternative and more efficient solutions may be proposed. For example, we may not log inserts in the top index at all and just replay them during recovery, assuming that this operation with a small index is fast enough. And the merge of the top index with the base index can be done in an atomic way and so also doesn't require WAL.
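
A toy sketch in Python may make the recovery idea above more concrete. It is not how Lsm3 or Postgres works internally. All names (TwoLevelIndex, heap_log, merged_upto, recover) are hypothetical, and it assumes that the inserted tuples themselves remain recoverable from some durable log (modelled here as a plain list), so that only the small top index has to be rebuilt after a crash.

```python
import bisect


class TwoLevelIndex:
    """Toy model: a small 'top' index absorbs inserts, a sorted 'base' holds the rest."""

    def __init__(self, top_limit=4):
        self.top = {}          # small index, never logged; lost on a crash
        self.base = ()         # immutable sorted tuple of (key, value) pairs
        self.inserted = 0      # number of inserts seen so far
        self.merged_upto = 0   # inserts already contained in base (assumed durable)
        self.top_limit = top_limit

    def insert(self, key, value):
        self.top[key] = value
        self.inserted += 1
        if len(self.top) >= self.top_limit:
            self.merge()

    def merge(self):
        # Build the new base on the side, then switch to it in one step.
        # The reference swap stands in for an atomic switch to a new base index.
        merged = dict(self.base)
        merged.update(self.top)
        self.base = tuple(sorted(merged.items()))
        self.merged_upto = self.inserted
        self.top = {}

    def lookup(self, key):
        if key in self.top:
            return self.top[key]
        i = bisect.bisect_left(self.base, (key,))
        if i < len(self.base) and self.base[i][0] == key:
            return self.base[i][1]
        return None


def recover(heap_log, base, merged_upto, top_limit=4):
    """Rebuild the top index by replaying the inserts made since the last merge."""
    idx = TwoLevelIndex(top_limit)
    idx.base = base
    idx.inserted = len(heap_log)
    idx.merged_upto = merged_upto
    for key, value in heap_log[merged_upto:]:
        idx.top[key] = value
    return idx


# Hypothetical durable log of inserted tuples (an assumption of this sketch).
heap_log = [(k, f"v{k}") for k in (5, 3, 9, 1, 7, 2)]

idx = TwoLevelIndex(top_limit=4)
for k, v in heap_log:
    idx.insert(k, v)

# Simulate a crash: only base and merged_upto survive, the top index is gone.
recovered = recover(heap_log, idx.base, idx.merged_upto)
```

In this model the cost of recovery is bounded by top_limit, which is the "small index is fast enough" assumption from the paragraph above. Whether that holds for a real top index is exactly the open question.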

As far as I know, Anastasia Lubennikova implemented LSM for Postgres several years ago.

There were some performance issues (with concurrent access?).

This is why the first thing I want to clarify for myself is what the bottlenecks of the LSM architecture are, and whether they are caused by LSM itself or by its integration into the Postgres infrastructure.

In any case, before thinking about the details of an in-core LSM implementation for Postgres, I think that it is necessary to demonstrate workloads on which RocksDB (or any other existing DBMS with LSM) shows significant performance advantages compared with Postgres with nbtree/Lsm3.

Maybe if the size of the index is 100 times larger than the
size of RAM, RocksDB will be significantly faster than Lsm3. But modern
servers have 0.5-1Tb of RAM.
Can't believe that there are databases with 100Tb indexes.

Comparison of whole RAM size to single index size looks plain wrong
to me.  I think we can roughly compare whole RAM size to whole
database size.  But also not the whole RAM size is always available
for caching data.  Let's assume half of RAM is used for caching data.
So, a modern server with 0.5-1Tb of RAM, which suffers from random
B-tree insertions and badly needs LSM-like data-structure, runs a
database of 25-50Tb.  Frankly speaking, there is nothing
counterintuitive for me.


There is actually nothing counterintuitive.
I just mean that there are not that many 25-50Tb OLTP databases.
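
For reference, the 25-50Tb figure follows from the numbers quoted above: 0.5-1Tb of RAM, half of it used for caching, and data 100 times larger than the cache. A small Python check, where the function and parameter names are only illustrative:

```python
def database_size_tb(ram_tb, cache_fraction=0.5, data_to_cache_ratio=100):
    """Database size (Tb) at which data is data_to_cache_ratio times the cache."""
    return ram_tb * cache_fraction * data_to_cache_ratio


for ram_tb in (0.5, 1.0):
    print(f"{ram_tb} Tb RAM -> {database_size_tb(ram_tb)} Tb database")
```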
