Hi,

I’ve made good progress on building the integration for NoSQL databases. The initial code supports MongoDB [A], but is not limited to that database. A working implementation has been pushed as a draft PR [1] to illustrate what it could look like once it is fully integrated. A couple of smaller PRs will follow.

Background: The only common denominator for “synchronization purposes” that all NoSQL databases support is a single-row compare-and-swap (CAS) operation. Think of this as (pseudo-SQL) “UPDATE table SET x = :new_value WHERE primary_key = :primary_key AND x = :expected_old_value”: the write only takes effect if the row still holds the value the writer last read.
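
To make the CAS idea concrete, here is a minimal sketch in Python. It is not taken from the draft PR: the `InMemoryStore` class, its method names, and the `increment` helper are all hypothetical, and a lock stands in for the atomicity that a real database would provide for a single row. It only shows the usual pattern of read, compute, conditional write, and retry on conflict.

```python
import threading


class InMemoryStore:
    """Hypothetical stand-in for a table that offers only single-row CAS."""

    def __init__(self):
        self._rows = {}
        # The lock models the per-row atomicity a real database guarantees.
        self._lock = threading.Lock()

    def get(self, key):
        with self._lock:
            return self._rows.get(key)

    def compare_and_swap(self, key, expected, new_value):
        """Write new_value only if the row still holds `expected`.

        Equivalent in spirit to:
          UPDATE table SET x = :new_value
          WHERE primary_key = :primary_key AND x = :expected_old_value
        Returns True if the row was updated, False if another writer won.
        """
        with self._lock:
            if self._rows.get(key) != expected:
                return False
            self._rows[key] = new_value
            return True


def increment(store, key):
    """Optimistic update: re-read and retry until the CAS succeeds."""
    while True:
        current = store.get(key)  # None if the row does not exist yet
        new_value = (current or 0) + 1
        if store.compare_and_swap(key, current, new_value):
            return new_value
```

With several threads calling `increment` on the same key, no update is lost, because a writer whose expected value is stale gets `False` back and retries with a fresh read.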

The most important objective for the implementation is correctness, especially in scenarios with high concurrent load. Explicit tests to verify correctness are included, both for the CI “use case” and for manual/special runs against a clustered database setup (which are just “too much” for the GitHub-hosted runners).

The current integration point is ‘MetaStoreManagerFactory’/‘PolarisMetaStoreManager’, implemented in the “bridge” Gradle project.

The ‘components/persistence/README.md’ in the draft-PR contains more technical information.

A benchmarking tool to measure performance and correctness of Polaris will be proposed soon as a separate/independent effort. We have used this benchmarking tool to measure performance and implicitly the correctness of the implementation.

Implementations for particular (No)SQL databases are isolated in one (Gradle) project per database. This is conceptually the same approach that already works for Nessie, which supports quite a few databases [2].

Robert

[1] https://github.com/apache/polaris/pull/1189
[2] https://projectnessie.org/nessie-latest/configuration/#support-for-the-database-specific-implementations
[A] Technically there is also an “in memory” implementation for testing purposes (not intended to replace the existing one).


--
Robert Stupp
@snazy
