On Tue, Aug 4, 2020 at 8:24 AM Alexander Korotkov <aekorot...@gmail.com> wrote:
> So, in my point of view LSM as an index AM is far not a full power LSM
> for PostgreSQL, but it's still useful. Large insert-only tables can
> benefit from LSM. Large tables with many indexes could also benefit,
> because non-HOT updates will become cheaper.
Right -- this is why you usually have to choose one or the other. An LSM design typically subsumes not just indexing and table storage, but also checkpointing -- you can't really compare an LSM to a B-Tree on its own, because you have to talk about the other components to make a sensible comparison (at which point you're actually comparing two totally different *storage engines*). Roughly speaking, the compaction process is the equivalent of checkpointing. So you either use (say) InnoDB or RocksDB everywhere -- you usually can't have it both ways. Maybe you can get some of the benefits of both, but in practice LSMs are highly optimized for the things that they're good at, at the expense of other things, so you more or less have to make an up-front choice. An LSM is definitely not a natural fit for the index access method interface in Postgres.

One thing that I don't think anyone else has mentioned on this thread (which is surprising) is that the performance of an LSM is usually not measured using the conventional metrics that we care about. For example, consider the Facebook MyRocks paper, "Optimizing Space Amplification in RocksDB" [1]. The reported RocksDB throughput for an LSM-sympathetic workload is not really any faster than InnoDB's, and is sometimes slower. That's not the point, though; the main advantages of using an LSM are reductions in space amplification and write amplification, particularly the latter. This isn't so much about performance as it is about efficiency -- it enabled Facebook to get a lot more out of the inexpensive flash storage that they use. It lowered their total cost of ownership by *a lot*.

I personally think that we should care about efficiency in this sense a lot more than we do now, but the fact remains that it hasn't really been treated as an independent problem -- one that could be addressed by accepting an explicit tradeoff like the one LSMs make (apparently you can tune an LSM to get less write amplification at the cost of more read amplification, or vice versa). In general, we almost always talk about throughput and latency without considering efficiency specifically. I'm not suggesting that we need an LSM, but an appreciation of LSMs could be helpful -- it could lead to better designs elsewhere.

Mark Callaghan's blog is a pretty good resource for learning about LSMs [2] (perhaps you've heard of him?). He wrote a bunch of stuff about Postgres recently, which I enjoyed.

[1] http://cidrdb.org/cidr2017/papers/p82-dong-cidr17.pdf
[2] https://smalldatum.blogspot.com/

--
Peter Geoghegan
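
PS: To put rough numbers on the write amplification vs. read amplification tradeoff mentioned above, here is a back-of-the-envelope sketch in Python. It uses the usual textbook approximations for leveled compaction (each key is rewritten roughly size_ratio times per level it passes through; a read may have to check one sorted run per level), and the parameters -- ~1TB of data, a 256MB memtable, 8KB pages, 100 byte rows -- are numbers I made up purely for illustration, not figures taken from the paper:

import math

def lsm_estimates(total_gb, memtable_mb, size_ratio):
    # Levels needed to hold the data when each level is size_ratio
    # times larger than the one above it.
    levels = math.ceil(math.log(total_gb * 1024.0 / memtable_mb, size_ratio))
    write_amp = size_ratio * levels  # rewritten ~size_ratio times per level
    read_amp = levels + 1            # one sorted run per level, plus memtable
    return levels, write_amp, read_amp

def btree_write_amp(page_bytes, row_bytes):
    # A random update dirties a whole page in order to change a single row.
    return page_bytes / row_bytes

for ratio in (4, 10, 32):
    lv, wa, ra = lsm_estimates(total_gb=1000, memtable_mb=256, size_ratio=ratio)
    print(f"size_ratio={ratio:2d}: levels={lv}, write_amp~{wa}, read_amp~{ra}")

print(f"btree write_amp~{btree_write_amp(8192, 100):.0f}")

A smaller size ratio means fewer bytes rewritten per flushed key, but more levels (and so more sorted runs) to check on reads, and vice versa -- that's the knob I was referring to.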