Re: Discussion: Re-evaluating Realm Modeling in Polaris

Pierre Laporte Tue, 15 Apr 2025 05:35:26 -0700

Hi Prashant

I guess the answer will depend on how easy it should be for Polaris to
support multi-tenancy.

A separate database per realm would allow administrators to limit the
amount of resources that a realm can consume (e.g. the maximum number of
database connections).  Indeed, it would be one of the strongest isolation
modes.  However, the code would need to support a complete database
configuration per realm (think username and password and possibly IP
address) if the goal is to match Postgres capabilities.  In terms of
backup/restore, it is the most flexible option.
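
To make the "complete database configuration per realm" point concrete,
here is a minimal sketch in Python.  It is not Polaris code: the names
RealmDataSource and RealmRegistry, the hosts, the credentials and the
max_connections field are all made up for illustration.  The only point is
that each realm carries its own connection settings and its own cap.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RealmDataSource:
    """Hypothetical per-realm connection settings (not a Polaris type)."""
    host: str
    database: str
    username: str
    password: str
    max_connections: int  # per-realm cap on database connections


class RealmRegistry:
    """Maps a realm name to its own, fully separate datasource settings."""

    def __init__(self) -> None:
        self._sources: dict[str, RealmDataSource] = {}

    def register(self, realm: str, source: RealmDataSource) -> None:
        if realm in self._sources:
            raise ValueError(f"realm already registered: {realm}")
        self._sources[realm] = source

    def lookup(self, realm: str) -> RealmDataSource:
        # Fail loudly instead of falling back to another realm's database.
        try:
            return self._sources[realm]
        except KeyError:
            raise KeyError(f"unknown realm: {realm}") from None


# Illustrative values only.
registry = RealmRegistry()
registry.register(
    "realm_a", RealmDataSource("10.0.0.1", "polaris_a", "user_a", "secret_a", 10)
)
registry.register(
    "realm_b", RealmDataSource("10.0.0.2", "polaris_b", "user_b", "secret_b", 50)
)
```

Every new realm means one more entry like these to provision and keep in
sync, which is the configuration cost I had in mind.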

A "one schema per realm" approach would be simpler in terms of datasource
configuration.  However, there would be less isolation between realms, and
a resource utilization spike on one realm could impact the performance of
another realm.  It is as flexible as option #1 regarding backup and
restore.
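
As a rough illustration of the two-part identifiers, here is a sketch in
Python.  It uses SQLite attached databases as a stand-in for Postgres
schemas (an assumption made only so the snippet runs without a server); the
helper names and the "entities" table are hypothetical.

```python
import re
import sqlite3

_IDENTIFIER = re.compile(r"[a-z_][a-z0-9_]*")


def qualified(realm: str, table: str) -> str:
    """Build <schema_name>.<table_name>.

    Identifiers cannot be bound as query parameters, so they are
    validated before being interpolated into SQL text.
    """
    for name in (realm, table):
        if not _IDENTIFIER.fullmatch(name):
            raise ValueError(f"invalid identifier: {name!r}")
    return f"{realm}.{table}"


conn = sqlite3.connect(":memory:")  # one connection shared by every realm
for realm in ("realm_a", "realm_b"):
    # SQLite stand-in for CREATE SCHEMA: one attached database per realm.
    conn.execute(f"ATTACH DATABASE ':memory:' AS {realm}")
    conn.execute(
        f"CREATE TABLE {qualified(realm, 'entities')} "
        "(id INTEGER PRIMARY KEY, name TEXT)"
    )

conn.execute(
    f"INSERT INTO {qualified('realm_a', 'entities')} (name) VALUES (?)",
    ("table_1",),
)


def count_entities(realm: str) -> int:
    # Same table name in every realm; only the schema prefix changes.
    row = conn.execute(
        f"SELECT COUNT(*) FROM {qualified(realm, 'entities')}"
    ).fetchone()
    return row[0]
```

Note that all realms go through the same connection here, which is exactly
why a spike on one realm can affect the others.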

A "realm as part of the primary key" approach is the most efficient way, in
that the cost of adding tenants is close to zero.  Like in option #2, there
is no real resource isolation between tenants and a noisy-neighbor
situation is a possible issue.  The biggest difference is regarding backup
and restore.  Consider the case where data is accidentally
wiped/corrupted/modified/... in a given tenant and administrators want to
restore it to a previous state.  With this approach, that is much more
complex, as Postgres does not (AFAIK) allow tables to be restored
partially.
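
To show what I mean, here is a small Python sketch using SQLite as a
stand-in for Postgres.  The "entities" table, the "entities_backup" scratch
table and the function names are hypothetical.  It assumes the backup has
already been loaded into a scratch table, and that restoring one realm then
has to be done row by row with the realm filter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# realm_id is the leading column of the composite primary key.
conn.execute(
    "CREATE TABLE entities ("
    " realm_id TEXT NOT NULL, id INTEGER NOT NULL, name TEXT,"
    " PRIMARY KEY (realm_id, id))"
)
conn.executemany(
    "INSERT INTO entities VALUES (?, ?, ?)",
    [("realm_a", 1, "table_1"), ("realm_a", 2, "table_2"), ("realm_b", 1, "view_1")],
)
# Stand-in for a backup loaded into a scratch table: a full copy of the data.
conn.execute("CREATE TABLE entities_backup AS SELECT * FROM entities")


def list_names(realm: str) -> list[str]:
    # Every query must carry the realm filter, or it sees other tenants' rows.
    rows = conn.execute(
        "SELECT name FROM entities WHERE realm_id = ? ORDER BY id", (realm,)
    )
    return [name for (name,) in rows]


def restore_realm(realm: str) -> None:
    # Row-level restore of one tenant; other realms' rows are left untouched.
    with conn:  # commit on success, roll back on error
        conn.execute("DELETE FROM entities WHERE realm_id = ?", (realm,))
        conn.execute(
            "INSERT INTO entities SELECT * FROM entities_backup WHERE realm_id = ?",
            (realm,),
        )


# realm_a is wiped by accident while realm_b keeps changing after the backup.
conn.execute("DELETE FROM entities WHERE realm_id = 'realm_a'")
conn.execute("UPDATE entities SET name = 'view_1_renamed' WHERE realm_id = 'realm_b'")
restore_realm("realm_a")
```

Restoring the whole table from the backup would have reverted the change in
realm_b as well, so the per-realm copy has to be written and run for every
table that holds realm data.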

Just my 2 cents

--

Pierre

On Tue, Apr 15, 2025 at 12:42 AM Prashant Singh
<prashant.si...@snowflake.com.invalid> wrote:

> Dear Polaris Community,
>
> This email initiates a discussion regarding the modeling of Realms within
> the Polaris project, following its recent mention in my JDBC implementation
> pull request:
> https://github.com/apache/polaris/pull/1287/files#r2040383971.
>
> My current understanding, based on available information, is that Realms
> were primarily intended for isolation. Consequently, the EclipseLink
> implementation treats each Realm as a separate database.
>
> As we are re-implementing this functionality, it was suggested that we
> gather community feedback on the optimal approach to modeling Realms.
>
> Based on my current understanding, here are potential modeling options:
>
> *1. Separate Databases per Realm:*
>
>    - Each Realm would correspond to a distinct database.
>    - This could be implemented using Quarkus custom data sources, with one
>    data source per Realm.
>
> *2. Separate Schemas per Realm:*
>
>    - Each Realm would correspond to a distinct database schema within a
>    single database.
>    - Most database systems support two-part identifiers (
>    <schema_name>.<table_name>), allowing for data isolation.
>
> *3. Realm as a Primary Key:*
>
>    - A realm identifier would be added as a primary key (or part of a
>    composite primary key) to each Polaris table.
>    - Data isolation would be enforced through filtering based on this key
>    during data access.
>
> The optimal approach will likely depend on ease of use and maintainability
> for database administrators.
>
> Please share your thoughts and preferences regarding these options.
>
> Best regards,
>
> Prashant Singh
>
