Hi Robert,

Thanks for the update and the draft PR!

I would like to use this thread to thank Dennis. Big kudos to Dennis
for the changes he made: without these changes, it would have been
impossible to add new backends like MongoDB.

I propose we review and comment on Robert's PR.

I would also like to propose a community meeting to discuss the
Persistence Improvement and drive consensus.
What about Tuesday, March 25th, at 9:30am PST?

Thanks all!

Regards
JB

On Mon, Mar 17, 2025 at 2:43 PM Robert Stupp <sn...@snazy.de> wrote:
>
> Hi,
>
> I’ve made good progress on building the integration for NoSQL
> databases. The initial code supports MongoDB [A], but is not limited to
> that database. A working implementation has been pushed as a draft PR
> [1] to illustrate what it can look like once it is fully
> integrated. A couple of smaller PRs will follow.
>
> Background: The only common denominator for “synchronization purposes”
> that all NoSQL databases support is a single-row compare-and-swap (CAS)
> operation - think of this as (pseudo-SQL) “UPDATE table SET x =
> :new_value WHERE primary_key = :primary_key AND x = :expected_old_value”.
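
To make the single-row CAS idea concrete, here is a minimal in-memory sketch in
Java. It only illustrates the semantics of the pseudo-SQL above and is not the
code from the draft PR: the class name CasSketch and all of its methods are
hypothetical, and a ConcurrentHashMap stands in for the database table.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.UnaryOperator;

// Hypothetical illustration only; not the Polaris persistence API.
final class CasSketch {
    // Stand-in for a table: primary key -> value of column "x".
    private final ConcurrentHashMap<String, String> rows = new ConcurrentHashMap<>();

    /** Creates the row only if the key is absent; returns true on success. */
    boolean insertIfAbsent(String primaryKey, String value) {
        return rows.putIfAbsent(primaryKey, value) == null;
    }

    /**
     * Single-row compare-and-swap: writes newValue only if the stored value
     * still equals expectedOldValue. Arguments must be non-null.
     */
    boolean compareAndSwap(String primaryKey, String expectedOldValue, String newValue) {
        return rows.replace(primaryKey, expectedOldValue, newValue);
    }

    /** Returns the current value, or null if the row does not exist. */
    String read(String primaryKey) {
        return rows.get(primaryKey);
    }

    /** Optimistic update: read, compute, CAS, and retry if another writer won. */
    String update(String primaryKey, UnaryOperator<String> change) {
        while (true) {
            String current = rows.get(primaryKey);
            if (current == null) {
                throw new IllegalStateException("no such row: " + primaryKey);
            }
            String next = change.apply(current);
            if (compareAndSwap(primaryKey, current, next)) {
                return next;
            }
            // The row changed between read and write; loop and try again.
        }
    }
}
```

A caller that loses the race simply gets false from compareAndSwap and has to
re-read the row before retrying, which is what the update loop does.
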
>
> The most important objective for the implementation is correctness,
> especially in scenarios with high concurrent load. Explicit tests to
> verify the correctness are included, for the CI “use case” and for
> manual/special runs against a clustered database setup (which are just
> “too much” for the GitHub-hosted runners).
>
> The current integration point is
> ‘MetaStoreManagerFactory’/‘PolarisMetaStoreManager’ implemented in the
> “bridge” Gradle project.
>
> The ‘components/persistence/README.md’ in the draft-PR contains more
> technical information.
>
> A benchmarking tool to measure performance and correctness of Polaris
> will be proposed soon as a separate/independent effort. We have used
> this benchmarking tool to measure performance and implicitly the
> correctness of the implementation.
>
> Implementations for particular (No)SQL databases are isolated in one
> (Gradle) project per database. This is effectively/conceptually the same
> approach that already works for Nessie, which supports quite a few
> databases [2].
>
> Robert
>
> [1] https://github.com/apache/polaris/pull/1189
> [2]
> https://projectnessie.org/nessie-latest/configuration/#support-for-the-database-specific-implementations
> [A] Technically there is also an “in memory” implementation for testing
> purposes (not intended to replace the existing one).
>
>
> --
> Robert Stupp
> @snazy
>
