Hi hackers,

I’ve been exploring the idea of integrating an LSM tree–based storage engine into PostgreSQL, similar in spirit to MyRocks for MySQL, by replacing the underlying storage while preserving PostgreSQL’s upper layers (planner, executor, MVCC, etc.).

The motivation stems from the well-known write-amplification issues with B-trees under high write throughput. An LSM-based engine could offer:

- Significant improvements in write performance and space efficiency, especially under heavy ingestion workloads.
- Better scalability with larger datasets, particularly when compression is applied.
- Comparable read performance (with trade-offs depending on workload), and opportunities to optimize through Bloom filters, tiered compaction strategies, etc.
- Reduced reliance on manual VACUUM: obsolete versions would be purged naturally during LSM compactions, potentially eliminating routine heap vacuuming (transaction-ID wraparound handling and stats collection would still need careful design).

The hoped-for outcome is a faster, more scalable PostgreSQL for >1 TB workloads, while maintaining the rich feature set and ecosystem compatibility users expect from Postgres.

Unlike Neon, this approach is not targeting cloud-native object storage or remote WAL streaming, but instead optimizing for maximum performance on local disks or high-performance block volumes, where write throughput and compaction efficiency matter most.

This would likely involve implementing a new Table Access Method (TAM), possibly backed by a forked engine such as BadgerDB or RocksDB, adapted to support PostgreSQL’s MVCC and WAL semantics.

I’d love to hear your thoughts:

1. Does this direction make sense for experimentation within the Postgres ecosystem?
2. Are there known architectural blockers or prior discussions/attempts in this space worth revisiting?
3. Would such a project be best developed entirely as a fork, or is there openness to evolving TAM to better support pluggable storage with LSM-like semantics?

Looking forward to your feedback.

- Manish
https://github.com/manishrjain