Hi Manish, > I’ve been exploring the idea of integrating an LSM tree–based storage engine > into PostgreSQL — similar in spirit to MyRocks for MySQL — by replacing the > underlying storage while preserving PostgreSQL’s upper layers (planner, > executor, MVCC, etc.). > [...]
Personally I wouldn't expect too much from RocksDB considering that historically LSM-trees were designed to improve performance on HDDs. The idea was to replace random writes with sequential ones. These days we seem to be shifting to SSDs so this approach should be beneficial in relatively niche cases when the amount of data is so large that it's economically justified to buy HDDs... and you still want Postgres for these tens or hundreds of terabytes of data, as a single instance (no FDWs etc). In theory at least. This being said, I don't want to discourage you from implementing a TAM that would use RocksDB. This would be an excellent project for several reasons: * We could check if the above statement is true or false * This could be considered as a reference 3rd party TAM implementation * This also would be an implementation of index-organized tables * Perhaps you will discover certain shortcomings of TAM API in the process so we could improve it * This promises to be a very-very fun project > I’d love to hear your thoughts: > > Does this direction make sense for experimentation within the Postgres > ecosystem? Yes. > Are there known architectural blockers or prior discussions/attempts in this > space worth revisiting? >From the performance point of view, maybe. Note that RocksDB basically has its own shared buffers, transactions and write ahead logging. Makes me wonder if it would be simpler to implement LSM-tree from scratch. On the flip side you can start by using RocksDB as a key-value storage basically, see the results, and then replace it with your own implementation with the same interface. Supporting backups could be a challenge, including incremental ones. I'm not sure if/how pg_basebackup and our other tools interact with TAMs. > Would such a project be best developed entirely as a fork, or is there > openness to evolving TAM to better support pluggable storage with LSM‑like > semantics? Please don't create forks. As a person whose responsibility besides other things was to help maintain several PG forks and upgrade them to the next upstream releases I assure you that you don't want doing this ever. Just use the TAM interface and create a seperate PostgreSQL extension on GitHub. This mailing list could be used for general TAM discussions. When and if your extension will become mature enough it can be discussed whether it's worth adding to /contrib/ or not. -- Best regards, Aleksander Alekseev