Re: "non-transactional", I guess it's more descriptive to refer to this new
implementation as "AtomicOperationJdbcPersistence" if it's confusing to say
"non-transactional"; the key is simply that we don't expose any
"runInTransaction" to the AtomicOperationMetaStoreManager layer.

Regarding Robert's question about "same schema", the important distinction
is that all the prior discussions about "ease of migration" are more
focused on in-place *unidirectional schema compatibility*, not necessarily
*exact same schema*.

This distinction is important because it's true that the current
EclipseLink implementation's three tightly-coupled tables ModelEntity,
ModelEntityActive, and ModelEntityChangeTracking indeed are not all carried
over to the "AtomicOperationJdbcPersistence" implementation, and that
keeping those three tables means keeping inefficient transaction behaviors.

However, keeping "ModelEntity" but adding a secondary index on it allows
unidirectional compatibility (migrating from old EclipseLink -> new
AtomicOperationJdbcPersistence), without still keeping those three
tightly-coupled tables.

I suppose it's also true though that backwards-compatibility right now is
kind of "best effort" at this stage in the project; if we do discover
unforeseen serious shortcomings in trying to keep the "ModelEntity" table
the same, it's reasonable to consider giving existing users a different
migration path and defining a totally different schema if needed. But by
default it's "nice" if it happens to be easy to preserve that compatibility

Regarding JSONB, on the one hand, self-contained optimizations within a
given persistence implementation (e.g. silently re-serialize any retrieved
JSONB-backed JsonObjects into a plain JSON String to keep the current
PolarisBaseEntity interface unchanged) seem safe enough but we wouldn't
really reap much benefit (other than maybe a slight generalized performance
improvement) without either:

1. Retooling the db-agnostic layer so that we efficiently interact with a
JSONB-backed JsonObject directly instead of how the code currently expects
a standard JSON string and redundantly deserializes it a bunch.
2. Actually using query patterns that index on fields within the JSON

However, it's unclear how we expect to do (2) if we also have to support
the other persistence backend implementations. Would we just optimize the
index on PolicyType just for Postgres but not for the other implementations?

If we wanted to make indexing of other subfields first-class, we probably
need a better way of conveying that intent.

Overall my take is it's fine enough if we think JSONB and/or "separate
tables per type" have enough benefits to break the
EclipseLink-compatibility behavior, and then using JSONB only as an
internal detail is harmless, same as if a persistence implementation choose
to transparently store text compressed for example.

If we're going to use fancy indexing features though we should make sure we
have an idea of how we intend to do that for other persistence backends as
well (or ideally, if it fits into a general model we only need to implement
once for each persistence backend and then can reuse for all sorts of new
entity types).


On Fri, Mar 21, 2025 at 2:41 PM Yufei Gu <flyrain...@gmail.com> wrote:

> Thanks, Prashant, for working on this!
>
> Using JSONB in the WIP JDBC implementation makes sense. For the EclipseLink
> implementation, I'd recommend keeping the current schema as-is (without
> JSONB) so existing users don’t need to go through a migration.
>
> If users do want to migrate, I’d suggest moving directly to the JDBC
> implementation rather than adapting EclipseLink to the new JSONB field
>
> Yufei
>
>
> On Fri, Mar 21, 2025 at 9:08 AM Prashant Singh
> <prashant.si...@snowflake.com.invalid> wrote:
>
> > Hi Robert,
> >
> > Thank you for your response!
> >
> > To clarify my point about "non-transactional" operations, I meant that
> most
> > operations don't require full transactions, with the exception of methods
> > such as writeEntities. For this specific case, Dennis created a new
> > implementation of Polaris MetaStore Manager, AtomicMetaStoreManager [1],
> > which:
> >
> >    - Implements PolarisMetaStoreManager using only one-shot atomic
> >    operations within a BasePersistence implementation.
> >    - Avoids requirements of open-ended multi-statement transactions.
> >
> > This approach doesn't necessitate model changes. My understanding is that
> > we've agreed on this implementation in its current state (as Dimitri
> > approved it), and the JDBCImpl will simply implement BasePersistence [2]
> > and utilize AtomicMetaStoreManager to eliminate the need for
> EclipseLink's
> > transactional operations.
> >
> > Regarding your comment:
> >
> > because IIRC the requirements that some people voiced is to keep the same
> > schema as the current Eclipselink one
> >
> > Yes, I'm aiming to maintain the same schema as EclipseLink in the new
> JDBC
> > implementation to facilitate easy migration. The entity table, grants
> > table, and all other schemas remain consistent.
> >
> > Regarding the JSONB suggestion, I wanted to gather the community's
> > thoughts. If we agree, we could incorporate it into both the JDBC and
> > EclipseLink implementations, or solely into the JDBC implementation. As
> > mentioned, JSONB would be beneficial for supporting new query patterns.
> >
> > Please share your thoughts on this.
> >
> > Best regards,
> >
> > Prashant Singh
> >
> > [1]
> >
> >
> https://github.com/apache/polaris/blob/main/polaris-core/src/main/java/org/apache/polaris/core/persistence/AtomicOperationMetaStoreManager.java
> > [2]
> >
> >
> https://github.com/apache/polaris/blob/main/polaris-core/src/main/java/org/apache/polaris/core/persistence/BasePersistence.java
> >
> > On Fri, Mar 21, 2025 at 5:23 AM Robert Stupp <sn...@snazy.de> wrote:
> >
> > > Hey Prashant,
> > >
> > > thanks for working on this!
> > >
> > > I'm a bit confused about "non transactional Postgres" - how's PG not
> > > transactional? If so (like one DML per tx), how is consistency
> > > guaranteed, because IIRC the requirements that some people voiced is to
> > > keep the same schema as the current Eclipselink one. I've been a bit
> > > confused whether people use that one or not, because different people
> > > tell different things. Can you shed some light on that? I mean, if EL
> > > isn't used and there are better ways, why should we keep EL at all?
> > >
> > >  From my experience (via #1189), it's not possible to just issue one
> DML
> > > per PG tx and guarantee consistency with the data model defined by EL.
> > > So I suspect you're working on a different database schema. If so, how
> > > does it work?
> > >
> > > Can you share your design and work with everybody here?
> > >
> > > Robert
> > >
> > > [1] https://github.com/apache/polaris/pull/1189
> > >
> > > On 20.03.25 19:21, Prashant Singh wrote:
> > > > Hey folks,
> > > > I am presently working on implementing non transactional postgres
> > > > implementation using jdbc. in the course of that i noticed in the
> > entity
> > > > table that the column type of the properties / internal properties
> > column
> > > > is TEXT and not JSONB, I dug a bit into the history of this, seems
> like
> > > we
> > > > did wanted to have JSONB but ended up implementing it as TEXT (may to
> > > > support more relational DB's which doesn't support JSONB) .
> > > >
> > > > JSONB has a couple of benefits, some of the noticeable ones are that
> it
> > > > lets you define indexes on them [1], which would help us in
> supporting
> > > some
> > > > different query patterns as we start supporting new entities. Until
> the
> > > > separate table per entity discussion is finalized, this could be
> > helpful
> > > as
> > > > with supporting the new query pattern. For ex policy need to
> presently
> > do
> > > > client side filtering as policyType which is a present inside
> > properties,
> > > > it needs to be parsed in client side and then do filtering at client
> > end.
> > > >
> > > > Please let me know your thoughts considering the above, happy to put
> > out
> > > a
> > > > PR for the same.
> > > > I am happy to handle this in my JDBC impl as this is anyway a user
> > would
> > > be
> > > > consciously choosing it over eclipse impl. I would love to know your
> > > > thoughts if we need to handle it in eclipse link impl as well.
> > > >
> > > >
> > > > Best,
> > > > Prashant
> > > >
> > > > [1]
> > > >
> > >
> >
> https://scalegrid.io/blog/using-jsonb-in-postgresql-how-to-effectively-store-index-json-data-in-postgresql/
> > > >
> > > --
> > > Robert Stupp
> > > @snazy
> > >
> > >
> >
>

Reply via email to