Re: [Discuss] Change EntityTable properties and internal properties from TEXT to JSONB
Hey Prashant,

thanks for working on this!

I'm a bit confused about "non transactional Postgres" - how's PG not transactional? If so (like one DML per tx), how is consistency guaranteed, because IIRC the requirement that some people voiced is to keep the same schema as the current EclipseLink one. I've been a bit confused whether people use that one or not, because different people say different things. Can you shed some light on that? I mean, if EL isn't used and there are better ways, why should we keep EL at all?

From my experience (via #1189), it's not possible to just issue one DML per PG tx and guarantee consistency with the data model defined by EL. So I suspect you're working on a different database schema. If so, how does it work?

Can you share your design and work with everybody here?

Robert

[1] https://github.com/apache/polaris/pull/1189

On 20.03.25 19:21, Prashant Singh wrote:

> Hey folks,
>
> I am presently working on a non-transactional Postgres implementation using JDBC. In the course of that, I noticed that in the entity table the column type of the properties / internal properties columns is TEXT and not JSONB. I dug a bit into the history of this; it seems we did want JSONB but ended up implementing it as TEXT (maybe to support more relational databases that don't support JSONB).
>
> JSONB has a couple of benefits. One of the more noticeable is that it lets you define indexes on it [1], which would help us support different query patterns as we start supporting new entities. Until the separate-table-per-entity discussion is finalized, this could help with supporting the new query patterns. For example, policy presently needs client-side filtering: policyType is stored inside properties, so the properties have to be parsed and filtered on the client side.
>
> Please let me know your thoughts considering the above; happy to put out a PR for the same.
> I am happy to handle this in my JDBC impl, as a user would be consciously choosing it over the EclipseLink impl anyway. I would love to know your thoughts on whether we need to handle it in the EclipseLink impl as well.
>
> Best,
> Prashant
>
> [1] https://scalegrid.io/blog/using-jsonb-in-postgresql-how-to-effectively-store-index-json-data-in-postgresql/

--
Robert Stupp
@snazy
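For readers following the JSONB point, here is a minimal sketch of the client-side filtering Prashant describes, where properties are stored as a JSON string and policyType can only be checked after parsing. The row layout, the ids, and the policy type names below are made up for illustration and are not taken from the Polaris schema.

```python
import json

# Hypothetical rows as they might come back from the entity table:
# (entity id, properties stored as a JSON string in a TEXT column).
rows = [
    (1, '{"policyType": "example.compaction", "owner": "a"}'),
    (2, '{"policyType": "example.expiry"}'),
    (3, '{"owner": "b"}'),  # no policyType at all
]


def filter_by_policy_type(rows, policy_type):
    """Parse every properties string and keep the ids whose policyType matches."""
    matches = []
    for entity_id, properties_text in rows:
        properties = json.loads(properties_text)  # one parse per fetched row
        if properties.get("policyType") == policy_type:
            matches.append(entity_id)
    return matches


print(filter_by_policy_type(rows, "example.compaction"))
```

Every row has to be fetched and parsed before it can be discarded. With a JSONB column in PostgreSQL, the same predicate could in principle be pushed into the query (for example `properties ->> 'policyType' = ?`) and backed by an expression index, which is the benefit being discussed; how that would map to other backends is left open in the thread.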
[Polaris Community Sync] Mar. 21, 2025 record
Hi folks, Thanks everyone for joining the Apache Polaris Community Sync yesterday. As usual, amazing discussion, and a very very packed agenda. Sorry for the topics we postponed to the next Community Meeting. Here's the record: https://drive.google.com/file/d/1O2EO7ekFUnfpk2OY7yzP3WybNQ-hG5Zv/view?usp=sharing The website will be updated soon (PR on the way). Thanks! Regards JB
Re: [DISCUSS] Preparing 0.10.0 release including binary distributions
0.10.0-beta works for me

On Fri, Mar 21, 2025 at 9:47 AM Jean-Baptiste Onofré wrote:

> Hi Yufei,
>
> Thanks for your reply!
>
> I propose to use 0.10.0-beta (as consensus).
>
> I'm moving forward on the PR about LICENSE/NOTICE and checking the artifacts. I will create this milestone on GitHub.
>
> Thanks,
> Regards
> JB
>
> On Thu, Mar 20, 2025 at 8:04 PM Yufei Gu wrote:
> >
> > I'm OK with 0.10.0-preview or Robert's idea of a "-beta" or "-experimental" suffix, or 1.0.0-preview. We just need to give users a clear message that it's not a stable release you can rely on.
> >
> > Yufei
> >
> > On Wed, Mar 19, 2025 at 7:25 AM Jean-Baptiste Onofré wrote:
> > >
> > > Thanks guys for the feedback.
> > >
> > > @Yufei, what do you think about 0.10.0 release then ?
> > >
> > > If we don't find a consensus on versioning, then, let's focus on the 1.0.0 release directly.
> > >
> > > I would like to remind everyone of the purpose of the 0.10.0/1.0.0-preview release:
> > > - include binary distributions
> > > - include legal aspects for all artifacts
> > > - submit the release to the mentors and, if passed, to the IPMC
> > > The idea is to validate a release with binary distributions/artifacts before the 1.0.0 release.
> > >
> > > Regards
> > > JB
> > >
> > > On Wed, Mar 19, 2025 at 2:54 PM Dmitri Bourlatchkov wrote:
> > > >
> > > > I'd be fine with 1.0.0-preview1 if we had a solid Persistence codebase.
> > > >
> > > > I do not think that 1.0.0 necessarily conveys the idea of stability in runtime, but I do believe that it has strong connotations for semver. Given that NoSQL persistence is not settled yet, I'd like to avoid making a 1.0 release because then we'd have to introduce major persistence changes in a minor release, which IMHO does not align with semver concepts very well.
> > > >
> > > > ... and I guess we're not ready for 2.0 yet :)
> > > >
> > > > Cheers,
> > > > Dmitri.
> > > >
> > > > On Wed, Mar 19, 2025 at 8:38 AM Jean-Baptiste Onofré <j...@nanthrax.net> wrote:
> > > > >
> > > > > Hi everyone,
> > > > >
> > > > > @Dmitri and @Robert, are you fine with 1.0.0-preview1 ?
> > > > >
> > > > > I would like to move forward on this front ;)
> > > > >
> > > > > Thanks,
> > > > > Regards
> > > > > JB
> > > > >
> > > > > On Tue, Mar 18, 2025 at 12:08 AM Yufei Gu wrote:
> > > > > >
> > > > > > Agreed with JB that 1.0-pre makes sense. My concern with 0.10.0 is that it could mislead users into thinking an arbitrary cut from main qualifies as a stable release.
> > > > > >
> > > > > > Yufei
> > > > > >
> > > > > > On Mon, Mar 17, 2025 at 10:30 AM Jean-Baptiste Onofré <j...@nanthrax.net> wrote:
> > > > > > >
> > > > > > > It depends. For instance, Spark 4.0-preview started more than a year ago, and the scope changed.
> > > > > > > If we communicate clearly it's just a preview and the scope can still change, it's acceptable.
> > > > > > >
> > > > > > > I did the same on multiple projects: Camel 4.0.0.M1, M2, RC1, RC2, were different in content.
> > > > > > >
> > > > > > > I would separate the discussions in two parts:
> > > > > > > 1. Should we have a discussion about what will be included in 1.0 ? Maybe yes, based on what we have in the GitHub Milestone. I would propose to start a separate thread about that.
> > > > > > > 2. For preview release, as the idea is:
> > > > > > > 2.1. Do a preparation release from main, different from 0.9.0, including binary artifacts.
> > > > > > > 2.2. Verify all legal aspects and the release is OK for the IPMC
> > > > > > > So, 0.10.0 or 1.0.0-preview both work. Personally, I consider 1.0.0-preview more meaningful because, without considering the scope/content, it's really what it is: a preview release to test our process and legal aspects.
> > > > > > >
> > > > > > > Regards
> > > > > > > JB
> > > > > > >
> > > > > > > On Mon, Mar 17, 2025 at 4:39 PM Dmitri Bourlatchkov <di...@apache.org> wrote:
> > > > > > > >
> > > > > > > > Using 1.0.0-preview1 implies the scope of 1.0 is well-defined... but my impression is that it is not so.
> > > > > > > >
> > > > > > > > I think the 0.10.0 version is clear enough that it comes before 1.0 and does not have any implied scope.
> > > > > > > >
> > > > > > > > While 0.10.0 is in progress, I believe we need to review the scope of 1.0 as a community on the dev list. I might have missed previous discussions, but I do not recall a consensus on what goes into 1.0 :)
> > > > > > > >
> > > > > > > > WDYT?
> > > > > > > >
> > > > > > > > Thanks,
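Since the thread turns on where names such as 0.10.0-beta and 1.0.0-preview1 sit relative to 0.10.0 and 1.0.0, the following is a small sketch of SemVer 2.0.0 precedence. The helper name `semver_key` is made up for this note, and build metadata (`+...`) is deliberately not handled.

```python
def semver_key(version):
    """Return a sort key following SemVer 2.0.0 precedence (no build metadata)."""
    core, _, prerelease = version.partition("-")
    major, minor, patch = (int(part) for part in core.split("."))
    if not prerelease:
        # A normal version ranks above all of its own pre-releases.
        return (major, minor, patch, 1, ())
    identifiers = []
    for ident in prerelease.split("."):
        if ident.isdigit():
            # Numeric identifiers compare numerically and rank below alphanumeric ones.
            identifiers.append((0, int(ident), ""))
        else:
            # Alphanumeric identifiers compare as ASCII strings.
            identifiers.append((1, 0, ident))
    return (major, minor, patch, 0, tuple(identifiers))


versions = ["1.0.0", "0.10.0-beta", "1.0.0-preview1", "0.9.0", "0.10.0"]
ordered = sorted(versions, key=semver_key)
print(ordered)
# -> ['0.9.0', '0.10.0-beta', '0.10.0', '1.0.0-preview1', '1.0.0']
```

Under these rules both candidate names sort before the final release they precede. Four-part forms such as 1.0.0.M1 are not valid SemVer and would be rejected by this sketch.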
Re: [Discuss] Change EntityTable properties and internal properties from TEXT to JSONB
Re: "non-transactional", I guess it's more descriptive to refer to this new implementation as "AtomicOperationJdbcPersistence" if it's confusing to say "non-transactional"; the key is simply that we don't expose any "runInTransaction" to the AtomicOperationMetaStoreManager layer.

Regarding Robert's question about "same schema", the important distinction is that all the prior discussions about "ease of migration" are more focused on in-place *unidirectional schema compatibility*, not necessarily *exact same schema*.

This distinction is important because it's true that the current EclipseLink implementation's three tightly-coupled tables ModelEntity, ModelEntityActive, and ModelEntityChangeTracking are indeed not all carried over to the "AtomicOperationJdbcPersistence" implementation, and that keeping those three tables means keeping inefficient transaction behaviors. However, keeping "ModelEntity" but adding a secondary index on it allows unidirectional compatibility (migrating from old EclipseLink -> new AtomicOperationJdbcPersistence) without having to keep all three tightly-coupled tables.

I suppose it's also true, though, that backwards-compatibility is kind of "best effort" at this stage in the project; if we do discover unforeseen serious shortcomings in trying to keep the "ModelEntity" table the same, it's reasonable to consider giving existing users a different migration path and defining a totally different schema if needed. But by default it's "nice" if it happens to be easy to preserve that compatibility.

Regarding JSONB, on the one hand, self-contained optimizations within a given persistence implementation (e.g. silently re-serializing any retrieved JSONB-backed JsonObjects into a plain JSON String to keep the current PolarisBaseEntity interface unchanged) seem safe enough, but we wouldn't really reap much benefit (other than maybe a slight generalized performance improvement) without either:

1. Retooling the db-agnostic layer so that we efficiently interact with a JSONB-backed JsonObject directly, instead of how the code currently expects a standard JSON string and redundantly deserializes it a bunch.
2. Actually using query patterns that index on fields within the JSON.

However, it's unclear how we expect to do (2) if we also have to support the other persistence backend implementations. Would we optimize the index on PolicyType just for Postgres but not for the other implementations? If we wanted to make indexing of other subfields first-class, we probably need a better way of conveying that intent.

Overall my take is it's fine enough if we think JSONB and/or "separate tables per type" have enough benefits to break the EclipseLink-compatibility behavior, and then using JSONB only as an internal detail is harmless, same as if a persistence implementation chose to transparently store text compressed, for example. If we're going to use fancy indexing features, though, we should make sure we have an idea of how we intend to do that for other persistence backends as well (or ideally, if it fits into a general model, we only need to implement it once for each persistence backend and can then reuse it for all sorts of new entity types).

On Fri, Mar 21, 2025 at 2:41 PM Yufei Gu wrote:

> Thanks, Prashant, for working on this!
>
> Using JSONB in the WIP JDBC implementation makes sense. For the EclipseLink implementation, I'd recommend keeping the current schema as-is (without JSONB) so existing users don't need to go through a migration.
>
> If users do want to migrate, I'd suggest moving directly to the JDBC implementation rather than adapting EclipseLink to the new JSONB field.
>
> Yufei
>
> On Fri, Mar 21, 2025 at 9:08 AM Prashant Singh wrote:
> >
> > Hi Robert,
> >
> > Thank you for your response!
> >
> > To clarify my point about "non-transactional" operations, I meant that most operations don't require full transactions, with the exception of methods such as writeEntities. For this specific case, Dennis created a new implementation of the Polaris MetaStore Manager, AtomicOperationMetaStoreManager [1], which:
> >
> > - Implements PolarisMetaStoreManager using only one-shot atomic operations within a BasePersistence implementation.
> > - Avoids requirements of open-ended multi-statement transactions.
> >
> > This approach doesn't necessitate model changes. My understanding is that we've agreed on this implementation in its current state (as Dmitri approved it), and the JDBCImpl will simply implement BasePersistence [2] and utilize AtomicOperationMetaStoreManager to eliminate the need for EclipseLink's transactional operations.
> >
> > Regarding your comment:
> >
> > > because IIRC the requirement that some people voiced is to keep the same schema as the current EclipseLink one
> >
> > Yes, I'm aiming to maintain the same schema as EclipseLink in the new JDBC implementation to facilitate easy migration. The entity table, grants table, a
Re: Clear separation of REST APIs
> I do not see a reason to change public facing Polaris APIs just because Iceberg's APIs changed/evolved. Really? If our goal is compatibility with the Iceberg REST spec, wouldn't Polaris APIs necessarily need to change if the Iceberg ones do? On Fri, Mar 21, 2025 at 4:14 AM Robert Stupp wrote: > There are still open concerns about the incompatibility w/ Servlet Spec > 6.0, which can be a problem. That will be changed/addressed in Iceberg's > v2 REST spec. But I do not see a good reason to _introduce_ this in > Polaris APIs, knowing that it is an issue. > > The other topic is "OAuth" - there is consensus (now) in Iceberg to > conform with the OAuth specification(s), and in turn _remove_ the > existing endpoint. Same concern here: I do not see a good reason to > _introduce_ this for Polaris APIs, already knowing that there is an > issue. We, Polaris, should really not re-define any OAuth specs. > > Regarding "shared types" - that's exactly the problem. I do not see a > reason to change public facing Polaris APIs just because Iceberg's APIs > changed/evolved. > > Regarding "shared paths" - that's also a problem. We (Polaris) do _not_ > "control" Iceberg's paths/endpoints - there is no guarantee that there > will be no conflict ever. On top, there's the path-representation issue > mentioned above, which does change the representation of your example > about "namespaces". > > When Iceberg introduces the v2 spec, Polaris would have to adopt its own > spec and implementation, causing unnecessary work for the project and > for users of the Polaris APIs. > > Sure, it's easier and quicker to just reuse what Iceberg _currently_ > defines. But it will cause more issues in the near-ish future, which can > be avoided with a little more effort. Polaris should really have APIs > that do not copy/implement already known issues and knowingly rely on > things that will go away. > > NB: Iceberg's API is not "our" (Polaris) API - it's independent, Polaris > "just" implements it. 
> > > On 21.03.25 03:43, Honah J. wrote: > > Thanks for the discussion so far! > > > > 'iceberg-rest-catalog-open-api.yaml' is declared to be a 1:1 copy of > >> Iceberg's v.1.7.1 'open-api/rest-catalog-open-api.yaml', but in fact it > >> has already diverged beyond just code formatting. > > The iceberg-rest-catalog-open-api.yaml file should indeed be a 1:1 copy > of > > the upstream Iceberg 1.7.1 spec. I performed a text comparison and > > confirmed that while there are formatting differences, the actual content > > remains identical. Please let me know if I miss any diff here so we can > fix > > Sorry - that was my fault when comparing the yaml files. I confirm that > the files are identical (except for formatting, which is not a problem > at all). > > > it. However, it is true that Polaris’s Iceberg Catalog spec need some > > customization from the Iceberg REST Catalog (IRC) spec: We have the > > notification API and we do not support scan planning APIs yet. For the > > /v1/oauth/tokens, It is already marked as deprecated and we have a > separate > > issue to track the removal https://github.com/apache/polaris/issues/12. > The > > reason that we made a copy of the token endpoint definition is let > Polaris > > community decide when to move it. > > > > For the two main issues Yun highlighted: > > > > For 1): The original intention behind consolidating all APIs in the root > > file (polaris-catalog-service.yaml) was to minimize duplication, > especially > > regarding shared definitions like "servers" and "securitySchemes". > However, > > I understand that relying solely on Open API tags such as "Catalog API", > > "Policy API", and "Generic API" to separate endpoints might cause > > confusion. Hence I am ok to follow Yun's suggestion to split these APIs > > clearly into two separate root files. This refactoring is straightforward > > and can be done at any point without impacting existing code. 
We'll just > > need an additional link on the website to render and preview Polaris APIs > > separately after this change. > > For 2) IMHO, we should maintain consistency across our APIs to avoid user > > confusion. Introducing separate definitions for basic definitions such as > > namespace, pageToken, pageSize, and error could complicate the user > > experience unnecessarily. These core definitions are unlikely to > experience > > breaking changes frequently. Even in the event they do evolve, for > example > > from v1 to v2, we will need to do the update in Polaris anyway to > > remain *Iceberg > > compatible *and we will keep supporting v1 apis for a long time to allow > > enough time for users to upgrade. Hence I am +1 on share basic > definitions > > with Iceberg spec file. > > > > Would love to hear what others think about this! > > Best, > > Honah (Jonas) > > > > On Thu, Mar 20, 2025 at 2:00 PM yun zou > wrote: > > > >> Thanks Robert for bringing this up! > >> > >> I see we mentioned two problems in this thread: > >> 1) polaris-catalog-service.yaml c
Re: [Discuss] Change EntityTable properties and internal properties from TEXT to JSONB
Hi Robert,

Thank you for your response!

To clarify my point about "non-transactional" operations, I meant that most operations don't require full transactions, with the exception of methods such as writeEntities. For this specific case, Dennis created a new implementation of the Polaris MetaStore Manager, AtomicOperationMetaStoreManager [1], which:

- Implements PolarisMetaStoreManager using only one-shot atomic operations within a BasePersistence implementation.
- Avoids requirements of open-ended multi-statement transactions.

This approach doesn't necessitate model changes. My understanding is that we've agreed on this implementation in its current state (as Dmitri approved it), and the JDBCImpl will simply implement BasePersistence [2] and utilize AtomicOperationMetaStoreManager to eliminate the need for EclipseLink's transactional operations.

Regarding your comment:

> because IIRC the requirement that some people voiced is to keep the same schema as the current EclipseLink one

Yes, I'm aiming to maintain the same schema as EclipseLink in the new JDBC implementation to facilitate easy migration. The entity table, grants table, and all other schemas remain consistent.

Regarding the JSONB suggestion, I wanted to gather the community's thoughts. If we agree, we could incorporate it into both the JDBC and EclipseLink implementations, or solely into the JDBC implementation. As mentioned, JSONB would be beneficial for supporting new query patterns.

Please share your thoughts on this.

Best regards,

Prashant Singh

[1] https://github.com/apache/polaris/blob/main/polaris-core/src/main/java/org/apache/polaris/core/persistence/AtomicOperationMetaStoreManager.java
[2] https://github.com/apache/polaris/blob/main/polaris-core/src/main/java/org/apache/polaris/core/persistence/BasePersistence.java

On Fri, Mar 21, 2025 at 5:23 AM Robert Stupp wrote:

> Hey Prashant,
>
> thanks for working on this!
>
> I'm a bit confused about "non transactional Postgres" - how's PG not transactional? If so (like one DML per tx), how is consistency guaranteed, because IIRC the requirement that some people voiced is to keep the same schema as the current EclipseLink one. I've been a bit confused whether people use that one or not, because different people say different things. Can you shed some light on that? I mean, if EL isn't used and there are better ways, why should we keep EL at all?
>
> From my experience (via #1189), it's not possible to just issue one DML per PG tx and guarantee consistency with the data model defined by EL. So I suspect you're working on a different database schema. If so, how does it work?
>
> Can you share your design and work with everybody here?
>
> Robert
>
> [1] https://github.com/apache/polaris/pull/1189
>
> On 20.03.25 19:21, Prashant Singh wrote:
> > Hey folks,
> >
> > I am presently working on a non-transactional Postgres implementation using JDBC. In the course of that, I noticed that in the entity table the column type of the properties / internal properties columns is TEXT and not JSONB. I dug a bit into the history of this; it seems we did want JSONB but ended up implementing it as TEXT (maybe to support more relational databases that don't support JSONB).
> >
> > JSONB has a couple of benefits. One of the more noticeable is that it lets you define indexes on it [1], which would help us support different query patterns as we start supporting new entities. Until the separate-table-per-entity discussion is finalized, this could help with supporting the new query patterns. For example, policy presently needs client-side filtering: policyType is stored inside properties, so the properties have to be parsed and filtered on the client side.
> >
> > Please let me know your thoughts considering the above; happy to put out a PR for the same.
> > I am happy to handle this in my JDBC impl, as a user would be consciously choosing it over the EclipseLink impl anyway. I would love to know your thoughts on whether we need to handle it in the EclipseLink impl as well.
> >
> > Best,
> > Prashant
> >
> > [1] https://scalegrid.io/blog/using-jsonb-in-postgresql-how-to-effectively-store-index-json-data-in-postgresql/
>
> --
> Robert Stupp
> @snazy
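As a rough illustration of what a "one-shot atomic operation" can look like at the SQL level, here is a sketch of a compare-and-swap update issued as a single statement. It uses SQLite from the Python standard library purely as a stand-in for Postgres, and the table and column names (`entities`, `entity_version`, `properties`) are assumptions for the example, not the Polaris schema or the actual AtomicOperationMetaStoreManager logic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE entities ("
    "id INTEGER PRIMARY KEY, entity_version INTEGER, properties TEXT)"
)
conn.execute("INSERT INTO entities VALUES (1, 1, '{}')")
conn.commit()


def update_if_unchanged(conn, entity_id, expected_version, new_properties):
    """Apply the write only if the row still has the version the caller read."""
    cursor = conn.execute(
        "UPDATE entities "
        "SET properties = ?, entity_version = entity_version + 1 "
        "WHERE id = ? AND entity_version = ?",
        (new_properties, entity_id, expected_version),
    )
    conn.commit()
    # Exactly one affected row means nobody changed the entity in between.
    return cursor.rowcount == 1


first = update_if_unchanged(conn, 1, 1, '{"owner": "a"}')   # version matches
second = update_if_unchanged(conn, 1, 1, '{"owner": "b"}')  # stale version
print(first, second)
```

The second call is rejected because the version it read is no longer current, so the caller would re-read and retry. This only covers a single row; whether multi-entity writes such as writeEntities can be made consistent the same way is exactly the question raised in the thread.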
Re: [DISCUSS] Preparing 0.10.0 release including binary distributions
For me as well

On 21.03.25 15:07, Dmitri Bourlatchkov wrote:

0.10.0-beta works for me

On Fri, Mar 21, 2025 at 9:47 AM Jean-Baptiste Onofré wrote:

Hi Yufei,

Thanks for your reply!

I propose to use 0.10.0-beta (as consensus).

I'm moving forward on the PR about LICENSE/NOTICE and checking the artifacts. I will create this milestone on GitHub.

Thanks,
Regards
JB

On Thu, Mar 20, 2025 at 8:04 PM Yufei Gu wrote:

I'm OK with 0.10.0-preview or Robert's idea of a "-beta" or "-experimental" suffix, or 1.0.0-preview. We just need to give users a clear message that it's not a stable release you can rely on.

Yufei

On Wed, Mar 19, 2025 at 7:25 AM Jean-Baptiste Onofré wrote:

Thanks guys for the feedback.

@Yufei, what do you think about 0.10.0 release then ?

If we don't find a consensus on versioning, then, let's focus on the 1.0.0 release directly.

I would like to remind everyone of the purpose of the 0.10.0/1.0.0-preview release:
- include binary distributions
- include legal aspects for all artifacts
- submit the release to the mentors and, if passed, to the IPMC
The idea is to validate a release with binary distributions/artifacts before the 1.0.0 release.

Regards
JB

On Wed, Mar 19, 2025 at 2:54 PM Dmitri Bourlatchkov wrote:

I'd be fine with 1.0.0-preview1 if we had a solid Persistence codebase.

I do not think that 1.0.0 necessarily conveys the idea of stability in runtime, but I do believe that it has strong connotations for semver. Given that NoSQL persistence is not settled yet, I'd like to avoid making a 1.0 release because then we'd have to introduce major persistence changes in a minor release, which IMHO does not align with semver concepts very well.

... and I guess we're not ready for 2.0 yet :)

Cheers,
Dmitri.

On Wed, Mar 19, 2025 at 8:38 AM Jean-Baptiste Onofré <j...@nanthrax.net> wrote:

Hi everyone,

@Dmitri and @Robert, are you fine with 1.0.0-preview1 ?

I would like to move forward on this front ;)

Thanks,
Regards
JB

On Tue, Mar 18, 2025 at 12:08 AM Yufei Gu wrote:

Agreed with JB that 1.0-pre makes sense. My concern with 0.10.0 is that it could mislead users into thinking an arbitrary cut from main qualifies as a stable release.

Yufei

On Mon, Mar 17, 2025 at 10:30 AM Jean-Baptiste Onofré <j...@nanthrax.net> wrote:

It depends. For instance, Spark 4.0-preview started more than a year ago, and the scope changed.
If we communicate clearly it's just a preview and the scope can still change, it's acceptable.

I did the same on multiple projects: Camel 4.0.0.M1, M2, RC1, RC2, were different in content.

I would separate the discussions in two parts:
1. Should we have a discussion about what will be included in 1.0 ? Maybe yes, based on what we have in the GitHub Milestone. I would propose to start a separate thread about that.
2. For preview release, as the idea is:
2.1. Do a preparation release from main, different from 0.9.0, including binary artifacts.
2.2. Verify all legal aspects and the release is OK for the IPMC
So, 0.10.0 or 1.0.0-preview both work. Personally, I consider 1.0.0-preview more meaningful because, without considering the scope/content, it's really what it is: a preview release to test our process and legal aspects.

Regards
JB

On Mon, Mar 17, 2025 at 4:39 PM Dmitri Bourlatchkov <di...@apache.org> wrote:

Using 1.0.0-preview1 implies the scope of 1.0 is well-defined... but my impression is that it is not so.

I think the 0.10.0 version is clear enough that it comes before 1.0 and does not have any implied scope.

While 0.10.0 is in progress, I believe we need to review the scope of 1.0 as a community on the dev list. I might have missed previous discussions, but I do not recall a consensus on what goes into 1.0 :)

WDYT?

Thanks,
Dmitri.

On Mon, Mar 17, 2025 at 9:49 AM Jean-Baptiste Onofré <j...@nanthrax.net> wrote:

Usually, at Apache, we have two kinds of versioning for "pre-release":
- 1.0.0.M1 and 1.0.0.RC1 (Apache Superset, Apache Camel, Apache Karaf, Apache Cassandra, ... used this versioning)
- 1.0.0-preview1 (Apache Spark, Apache Flink, ... used this versioning)

For "clarity" for our community and users, I propose to use Apache Polaris (incubating) 1.0.0-preview1.

Any objections?

Regards
JB

On Mon, Mar 17, 2025 at 2:37 PM Kamesh Sampath wrote:

Shall we name it like 1.0-pre? That aligns with a common pattern across many open source projects. Another thought is to make it more semver friendly.

From: Yufei Gu
Sent: Sunday, March 16, 2025 11:59:27 PM
To: dev@polaris.apache.org
Subject: Re: [DISCUSS] Preparing 0.10.0 release including binary distributions

Thanks for the explanation, JB! In that case, we may focus on 0.10.0 only. How about a name like pre-1.0, which clarifies that it's a release mainly to test out something for 1.0.0?

Yufei

On Fri, Mar 14, 2025 at 11:33 PM Jean-Baptiste Onofré <j...@nanthrax.net> wrote:

Hi Y
Re: [Discuss] Change EntityTable properties and internal properties from TEXT to JSONB
Thanks, Prashant, for working on this!

Using JSONB in the WIP JDBC implementation makes sense. For the EclipseLink implementation, I'd recommend keeping the current schema as-is (without JSONB) so existing users don't need to go through a migration.

If users do want to migrate, I'd suggest moving directly to the JDBC implementation rather than adapting EclipseLink to the new JSONB field.

Yufei

On Fri, Mar 21, 2025 at 9:08 AM Prashant Singh wrote:

> Hi Robert,
>
> Thank you for your response!
>
> To clarify my point about "non-transactional" operations, I meant that most operations don't require full transactions, with the exception of methods such as writeEntities. For this specific case, Dennis created a new implementation of the Polaris MetaStore Manager, AtomicOperationMetaStoreManager [1], which:
>
> - Implements PolarisMetaStoreManager using only one-shot atomic operations within a BasePersistence implementation.
> - Avoids requirements of open-ended multi-statement transactions.
>
> This approach doesn't necessitate model changes. My understanding is that we've agreed on this implementation in its current state (as Dmitri approved it), and the JDBCImpl will simply implement BasePersistence [2] and utilize AtomicOperationMetaStoreManager to eliminate the need for EclipseLink's transactional operations.
>
> Regarding your comment:
>
> > because IIRC the requirement that some people voiced is to keep the same schema as the current EclipseLink one
>
> Yes, I'm aiming to maintain the same schema as EclipseLink in the new JDBC implementation to facilitate easy migration. The entity table, grants table, and all other schemas remain consistent.
>
> Regarding the JSONB suggestion, I wanted to gather the community's thoughts. If we agree, we could incorporate it into both the JDBC and EclipseLink implementations, or solely into the JDBC implementation. As mentioned, JSONB would be beneficial for supporting new query patterns.
>
> Please share your thoughts on this.
>
> Best regards,
>
> Prashant Singh
>
> [1] https://github.com/apache/polaris/blob/main/polaris-core/src/main/java/org/apache/polaris/core/persistence/AtomicOperationMetaStoreManager.java
> [2] https://github.com/apache/polaris/blob/main/polaris-core/src/main/java/org/apache/polaris/core/persistence/BasePersistence.java
>
> On Fri, Mar 21, 2025 at 5:23 AM Robert Stupp wrote:
> >
> > Hey Prashant,
> >
> > thanks for working on this!
> >
> > I'm a bit confused about "non transactional Postgres" - how's PG not transactional? If so (like one DML per tx), how is consistency guaranteed, because IIRC the requirement that some people voiced is to keep the same schema as the current EclipseLink one. I've been a bit confused whether people use that one or not, because different people say different things. Can you shed some light on that? I mean, if EL isn't used and there are better ways, why should we keep EL at all?
> >
> > From my experience (via #1189), it's not possible to just issue one DML per PG tx and guarantee consistency with the data model defined by EL. So I suspect you're working on a different database schema. If so, how does it work?
> >
> > Can you share your design and work with everybody here?
> >
> > Robert
> >
> > [1] https://github.com/apache/polaris/pull/1189
> >
> > On 20.03.25 19:21, Prashant Singh wrote:
> > > Hey folks,
> > >
> > > I am presently working on a non-transactional Postgres implementation using JDBC. In the course of that, I noticed that in the entity table the column type of the properties / internal properties columns is TEXT and not JSONB. I dug a bit into the history of this; it seems we did want JSONB but ended up implementing it as TEXT (maybe to support more relational databases that don't support JSONB).
> > >
> > > JSONB has a couple of benefits. One of the more noticeable is that it lets you define indexes on it [1], which would help us support different query patterns as we start supporting new entities. Until the separate-table-per-entity discussion is finalized, this could help with supporting the new query patterns. For example, policy presently needs client-side filtering: policyType is stored inside properties, so the properties have to be parsed and filtered on the client side.
> > >
> > > Please let me know your thoughts considering the above; happy to put out a PR for the same.
> > > I am happy to handle this in my JDBC impl, as a user would be consciously choosing it over the EclipseLink impl anyway. I would love to know your thoughts on whether we need to handle it in the EclipseLink impl as well.
> > >
> > > Best,
> > > Prashant
> > >
> > > [1] https://scalegrid.io/blog/using-jsonb-in-postgresql-how-to-effectively-store-index-json-data-in-postgresql/