Enhanced JdbcCatalog with Namespace Management

Qinhua Yan Fri, 27 Aug 2021 11:48:05 -0700

Hi there,


We'd like to share our JdbcCatalog impl with the community and welcome any 
discussion.

We are aware of the existing JdbcCatalog 
impl <https://github.com/apache/iceberg/blob/master/core/src/main/java/org/apache/iceberg/jdbc/JdbcCatalog.java>.
However, it has some feature gaps and doesn't work for our use case. 
Therefore, we implemented a SQL-database-backed Catalog with the following 
enhancements.

1.       Namespace management and configuration

*       Each namespace can be backed by a different S3 bucket. This allows 
fine-grained access control at the namespace level.

*       At namespace creation time, users can choose to either 1) use a 
pre-existing bucket, or 2) let the Catalog create a new bucket.

*       Isolate logical TableIdentifiers from physical S3 locations.

*       Support renaming a table within the same namespace without touching S3.

2.       Support various kinds of databases

*       Use jOOQ <https://www.jooq.org/> to connect to the database and to 
ensure SQL semantics.

*       Make it easy to support different SQL dialects without touching the 
core Catalog code.

*       Provide database initialization scripts for Postgres.
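
To make the namespace points in item 1 concrete, here is a minimal sketch of 
the idea, not our actual code. In-memory maps stand in for the SQL tables, and 
every name in it (NamespaceCatalogSketch, createNamespace, the "catalog-" 
bucket prefix, the UUID-based path) is a hypothetical choice made for 
illustration. It only generates a bucket name and does not call S3.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// Hypothetical sketch: two maps stand in for the catalog's SQL tables.
final class NamespaceCatalogSketch {
    // namespace -> S3 bucket backing that namespace
    private final Map<String, String> namespaceBuckets = new HashMap<>();
    // "namespace.table" -> physical S3 location of the table
    private final Map<String, String> tableLocations = new HashMap<>();

    // Pass an existing bucket name, or null to have the catalog pick a new one.
    // A real catalog would also create the bucket in S3 at this point.
    void createNamespace(String namespace, String existingBucket) {
        if (namespaceBuckets.containsKey(namespace)) {
            throw new IllegalStateException("Namespace exists: " + namespace);
        }
        String bucket = existingBucket != null
                ? existingBucket
                : "catalog-" + UUID.randomUUID();
        namespaceBuckets.put(namespace, bucket);
    }

    String createTable(String namespace, String table) {
        String bucket = namespaceBuckets.get(namespace);
        if (bucket == null) {
            throw new IllegalStateException("No such namespace: " + namespace);
        }
        String key = namespace + "." + table;
        if (tableLocations.containsKey(key)) {
            throw new IllegalStateException("Table exists: " + key);
        }
        // The path uses an opaque ID rather than the table name, so the
        // logical identifier and the physical location stay independent.
        String location = "s3://" + bucket + "/" + UUID.randomUUID();
        tableLocations.put(key, location);
        return location;
    }

    // Rename within one namespace: only the catalog mapping changes.
    void renameTable(String namespace, String from, String to) {
        String fromKey = namespace + "." + from;
        String toKey = namespace + "." + to;
        if (!tableLocations.containsKey(fromKey)) {
            throw new IllegalStateException("No such table: " + fromKey);
        }
        if (tableLocations.containsKey(toKey)) {
            throw new IllegalStateException("Table exists: " + toKey);
        }
        tableLocations.put(toKey, tableLocations.remove(fromKey));
    }

    String locationOf(String namespace, String table) {
        return tableLocations.get(namespace + "." + table);
    }

    String bucketOf(String namespace) {
        return namespaceBuckets.get(namespace);
    }
}
```

In this sketch, renaming "orders" to "orders_v2" leaves the stored S3 location 
unchanged, which is why no data has to move. In a SQL-backed catalog the same 
rename would be a single row update.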



This Catalog implementation can be easily extended to support some advanced 
features, such as undeleting tables and namespaces backed by multiple backends.



Any comments and discussion are welcome!


Thank you!
Qinhua Yan
