I think what you are referring to is the "number of tables per namespace"
property.  See the binary tree example
<https://github.com/pingtimeout/polaris/tree/persistence-benchmarks/benchmarks#binary-tree-example>
in the docs where, after a binary tree of namespaces, 5 tables are created
in each namespace.  So yes, a scenario with 100 tables per namespace is
definitely possible.

As for whether we could run those as part of this effort, I would say that
it depends.  What specific insights are we trying to get out of that new
scenario?

I believe the benchmark would need a reasonable number of namespaces as
well.  So an additional question is: what would the namespaces tree look
like in terms of width and height?  That could easily multiply to a high
number of entities, which in turn would mean that *only* the new
persistence implementation + MongoDB can be used.  I am fine with that, as
it is clear to me that the new persistence layer is the way to go.  Does
everybody agree with that statement?
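
To make that multiplication concrete, here is a rough Python sketch of how
the entity count grows.  It assumes a complete tree in which every non-leaf
namespace has `width` children and the tree has `height` levels, with the
same number of tables in every namespace.  The function names are made up
for illustration and are not part of the benchmark code, and the benchmarks
may define width and height differently.

```python
def namespace_count(width: int, height: int) -> int:
    """Namespaces in a complete tree (assumed shape, see above).

    Level 0 is the root namespace; each level has `width` times as many
    namespaces as the one above it.
    """
    return sum(width ** level for level in range(height))


def entity_count(width: int, height: int, tables_per_namespace: int) -> int:
    """Namespaces plus tables; views and other entities are not counted."""
    namespaces = namespace_count(width, height)
    return namespaces + namespaces * tables_per_namespace


if __name__ == "__main__":
    # Hypothetical layouts, not taken from the report.
    for width, height, tables in [(2, 3, 5), (2, 10, 100), (4, 8, 100)]:
        total = entity_count(width, height, tables)
        print(f"width={width} height={height} tables/ns={tables}: {total} entities")
```

Under those assumptions, a binary tree with 10 levels and 100 tables per
namespace already reaches 103,323 entities.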

--

Pierre

On Wed, Mar 19, 2025 at 9:33 PM Russell Spitzer <russell.spit...@gmail.com>
wrote:

> I think I saw in the other document you had some benchmarks with a less 1N
> to 1T ratio? Could we run some of those as well? It would be great to have
> something closer to a 1 Namespace to 100 tables sort of layout.
>
> On Wed, Mar 19, 2025 at 3:06 PM Pierre Laporte <pie...@pingtimeout.fr>
> wrote:
>
> > Just a heads up, I updated the report with the latest results from the
> > persistence work, as well as the tarball with raw results.
> >
> > --
> >
> > Pierre Laporte
> > @pingtimeout <https://twitter.com/pingtimeout>
> > pie...@pingtimeout.fr
> >
> >
> > On Wed, Mar 19, 2025 at 3:20 PM Pierre Laporte <pie...@pingtimeout.fr>
> > wrote:
> >
> > > Hi,
> > >
> > > I have been working on a set of benchmarks for Polaris [1].  I have run
> > > them against the current main branch (Eclipselink+Postgresql)
> > > implementation as well as the NoSQL persistence layer implementation [2].
> > > The complete report for these performance tests is available at this
> > > address:
> > > https://docs.google.com/document/d/1RLYaAtNUkgNW3Ef7-BWfF_8RkSK7B7oR/edit.
> > > Feel free to review it at your convenience.
> > >
> > > The benchmarks demonstrate that the new Persistence implementation offers:
> > >
> > >    - Comparable or better performance for sequential operations
> > >    - Significantly better reliability under concurrent load
> > >    - Consistent read performance even under high-concurrency scenarios
> > >    - Some challenges with write operations under highly concurrent write
> > >    conditions (under investigation)
> > >
> > > These results suggest that the NoSQL persistence layer implementation
> > > provides a robust foundation for scaling Polaris, particularly for
> > > workloads dominated by high concurrency.
> > >
> > > I will soon open a separate PR to contribute these benchmarks to the main
> > > codebase.
> > >
> > > Let me know if you have any questions.
> > >
> > > Pierre
> > >
> > > [1] https://github.com/pingtimeout/polaris/tree/persistence-benchmarks/benchmarks
> > > [2] https://github.com/apache/polaris/pull/1189
> > >
> > > --
> > >
> > > Pierre Laporte
> > > @pingtimeout <https://twitter.com/pingtimeout>
> > > pie...@pingtimeout.fr
> > > http://www.pingtimeout.fr/
> > >
> > >
> > > On Mon, Mar 17, 2025 at 3:46 PM Jean-Baptiste Onofré <j...@nanthrax.net>
> > > wrote:
> > >
> > >> Hi Robert,
> > >>
> > >> Thanks for the update and the draft PR !
> > >>
> > >> I would like to use this thread to thank Dennis. Big kudos to Dennis
> > >> for the changes he made: without these changes, it would have been
> > >> impossible to add new backends like MongoDB.
> > >>
> > >> I propose we review and comment on Robert's PR.
> > >>
> > >> I would also like to propose a community meeting to discuss the
> > >> Persistence Improvement and drive consensus.
> > >> What about Tuesday, March 25th at 9:30am PST ?
> > >>
> > >> Thanks all !
> > >>
> > >> Regards
> > >> JB
> > >>
> > >> On Mon, Mar 17, 2025 at 2:43 PM Robert Stupp <sn...@snazy.de> wrote:
> > >> >
> > >> > Hi,
> > >> >
> > >> > I’ve made quite some progress on building the integration for NoSQL
> > >> > databases. The initial code supports MongoDB [A], but is not limited to
> > >> > that database. A working implementation has been pushed as a draft-PR
> > >> > [1] to illustrate what it can look like when it is fully
> > >> > integrated. A couple of smaller PRs will follow.
> > >> >
> > >> > Background: The only common denominator for “synchronization purposes”
> > >> > that all NoSQL databases support is a single-row compare-and-swap (CAS)
> > >> > operation - think of this as (pseudo-SQL) “UPDATE table SET x =
> > >> > :new_value WHERE primary_key = :primary_key AND x = :expected_old_value”.
> > >> >
> > >> > The most important objective for the implementation is correctness,
> > >> > especially in scenarios with high concurrent load. Explicit tests to
> > >> > verify the correctness are included, for the CI “use case” and for
> > >> > manual/special runs against a clustered database setup (which are just
> > >> > “too much” for the GitHub hosted runners).
> > >> >
> > >> > The current integration point is
> > >> > ‘MetaStoreManagerFactory’/‘PolarisMetaStoreManager’ implemented in the
> > >> > “bridge” Gradle project.
> > >> >
> > >> > The ‘components/persistence/README.md’ in the draft-PR contains more
> > >> > technical information.
> > >> >
> > >> > A benchmarking tool to measure performance and correctness of Polaris
> > >> > will be proposed soon as a separate/independent effort. We have used
> > >> > this benchmarking tool to measure performance and implicitly the
> > >> > correctness of the implementation.
> > >> >
> > >> > Implementations for particular (No)SQL databases are isolated in one
> > >> > (Gradle) project per database. This is effectively/conceptually the same
> > >> > approach that already works for Nessie, which supports quite some
> > >> > databases [2].
> > >> >
> > >> > Robert
> > >> >
> > >> > [1] https://github.com/apache/polaris/pull/1189
> > >> > [2] https://projectnessie.org/nessie-latest/configuration/#support-for-the-database-specific-implementations
> > >> > [A] Technically there is also an “in memory” implementation for testing
> > >> > purposes (not intended to replace the existing one).
> > >> >
> > >> >
> > >> > --
> > >> > Robert Stupp
> > >> > @snazy
> > >> >
> > >>
> > >
> >
>
