Just a heads-up: I have updated the report with the latest results from the persistence work, as well as the tarball containing the raw results.

--
Pierre Laporte
@pingtimeout <https://twitter.com/pingtimeout>
pie...@pingtimeout.fr

On Wed, Mar 19, 2025 at 3:20 PM Pierre Laporte <pie...@pingtimeout.fr> wrote:

> Hi,
>
> I have been working on a set of benchmarks for Polaris [1]. I have run
> them against the current main branch (EclipseLink + PostgreSQL)
> implementation as well as the NoSQL persistence layer implementation [2].
> The complete report for these performance tests is available at this
> address:
> https://docs.google.com/document/d/1RLYaAtNUkgNW3Ef7-BWfF_8RkSK7B7oR/edit.
> Feel free to review it at your convenience.
>
> The benchmarks demonstrate that the new Persistence implementation offers:
>
> - Comparable or better performance for sequential operations
> - Significantly better reliability under concurrent load
> - Consistent read performance even under high-concurrency scenarios
> - Some challenges with write operations under highly concurrent write
>   conditions (under investigation)
>
> These results suggest that the NoSQL persistence layer implementation
> provides a robust foundation for scaling Polaris, particularly for
> workloads dominated by high concurrency.
>
> I will soon open a separate PR to contribute these benchmarks to the main
> codebase.
>
> Let me know if you have any questions.
>
> Pierre
>
> [1] https://github.com/pingtimeout/polaris/tree/persistence-benchmarks/benchmarks
> [2] https://github.com/apache/polaris/pull/1189
>
> --
>
> Pierre Laporte
> @pingtimeout <https://twitter.com/pingtimeout>
> pie...@pingtimeout.fr
> http://www.pingtimeout.fr/
>
>
> On Mon, Mar 17, 2025 at 3:46 PM Jean-Baptiste Onofré <j...@nanthrax.net>
> wrote:
>
>> Hi Robert,
>>
>> Thanks for the update and the draft PR!
>>
>> I would like to use this thread to thank Dennis. Big kudos to Dennis
>> for the changes he made: without these changes, it would have been
>> impossible to add new backends like MongoDB.
>>
>> I propose we review and comment on Robert's PR.
>>
>> I would also like to propose a community meeting to discuss the
>> Persistence Improvement and drive consensus.
>> What about Tuesday, March 25th at 9:30am PST?
>>
>> Thanks all!
>>
>> Regards
>> JB
>>
>> On Mon, Mar 17, 2025 at 2:43 PM Robert Stupp <sn...@snazy.de> wrote:
>> >
>> > Hi,
>> >
>> > I’ve made good progress on building the integration for NoSQL
>> > databases. The initial code supports MongoDB [A], but is not limited to
>> > that database. A working implementation has been pushed as a draft PR
>> > [1] to illustrate what it can look like once it is fully integrated.
>> > A couple of smaller PRs will follow.
>> >
>> > Background: The only common denominator for “synchronization purposes”
>> > that all NoSQL databases support is a single-row compare-and-swap (CAS)
>> > operation - think of this as (pseudo-SQL) “UPDATE table SET x =
>> > :new_value WHERE primary_key = :primary_key AND x = :expected_old_value”.
>> >
>> > The most important objective for the implementation is correctness,
>> > especially in scenarios with high concurrent load. Explicit tests to
>> > verify the correctness are included, for the CI “use case” and for
>> > manual/special runs against a clustered database setup (which are just
>> > “too much” for the GitHub-hosted runners).
>> >
>> > The current integration point is
>> > ‘MetaStoreManagerFactory’/‘PolarisMetaStoreManager’ implemented in the
>> > “bridge” Gradle project.
>> >
>> > The ‘components/persistence/README.md’ in the draft PR contains more
>> > technical information.
>> >
>> > A benchmarking tool to measure performance and correctness of Polaris
>> > will be proposed soon as a separate/independent effort. We have used
>> > this benchmarking tool to measure performance and implicitly the
>> > correctness of the implementation.
>> >
>> > Implementations for particular (No)SQL databases are isolated in one
>> > (Gradle) project per database.
>> > This is effectively/conceptually the same approach that already works
>> > for Nessie, which supports quite a few databases [2].
>> >
>> > Robert
>> >
>> > [1] https://github.com/apache/polaris/pull/1189
>> > [2] https://projectnessie.org/nessie-latest/configuration/#support-for-the-database-specific-implementations
>> > [A] Technically there is also an “in memory” implementation for testing
>> > purposes (not intended to replace the existing one).
>> >
>> > --
>> > Robert Stupp
>> > @snazy
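
To illustrate the single-row compare-and-swap (CAS) primitive described in Robert's message above, here is a minimal Python sketch of the usual optimistic read/compute/CAS/retry loop. It is not taken from the Polaris PR: `InMemoryStore`, `compare_and_swap`, `increment` and `run_demo` are hypothetical names, and a lock-protected dict stands in for the database's per-row atomicity.

```python
import threading


class InMemoryStore:
    """Hypothetical stand-in for a NoSQL table; not part of Polaris."""

    def __init__(self):
        self._rows = {}
        # The lock emulates the per-row atomicity a real database provides.
        self._lock = threading.Lock()

    def get(self, key):
        with self._lock:
            return self._rows.get(key)

    def compare_and_swap(self, key, expected, new):
        """Write `new` only if the row still holds `expected`.

        Mirrors the pseudo-SQL: UPDATE ... SET x = :new_value
        WHERE primary_key = :primary_key AND x = :expected_old_value.
        Returns True if the row was updated, False otherwise.
        """
        with self._lock:
            if self._rows.get(key) != expected:
                return False
            self._rows[key] = new
            return True


def increment(store, key, max_attempts=1000):
    """Optimistic update: read, compute, CAS, and retry on conflict."""
    for _ in range(max_attempts):
        current = store.get(key)
        new = (current or 0) + 1
        if store.compare_and_swap(key, current, new):
            return new
        # Another writer changed the row in between; re-read and try again.
    raise RuntimeError("too many concurrent conflicts")


def run_demo(num_threads=8, increments_per_thread=100):
    """Run concurrent increments and return the final counter value."""
    store = InMemoryStore()

    def worker():
        for _ in range(increments_per_thread):
            increment(store, "counter")

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return store.get("counter")
```

Because every write is conditional on the previously read value, no increment is lost under concurrency; a writer that loses the race simply re-reads and retries. Heavy write contention means more retries, which is one plausible (but here unverified) way such a design could show the write-side challenges mentioned in the benchmark summary.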