Thanks Jack and Yufei,

I’ll take a look at the PR. I hope it can be merged soon.

I agree, it looks like I’ll have to take care of access control myself.

For a catalog-level table undelete operation, I can enqueue deletions in a
separate table where they can live for a few days so that they can be recovered
if needed. I’m not sure if a feature like this would be useful for others.
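
A minimal sketch of such a deletion queue, under stated assumptions: sqlite3
stands in for PostgreSQL so the example runs on its own, and the table names
(iceberg_tables, dropped_tables), the column layout, and the three-day
retention window are all hypothetical, not part of any Iceberg catalog.

```python
import sqlite3
import time

RETENTION_SECONDS = 3 * 24 * 3600  # hypothetical retention window (3 days)


def init(conn):
    """Create the live pointer table and the queue of dropped tables."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS iceberg_tables ("
        "namespace TEXT, name TEXT, metadata_location TEXT, "
        "PRIMARY KEY (namespace, name))"
    )
    conn.execute(
        "CREATE TABLE IF NOT EXISTS dropped_tables ("
        "namespace TEXT, name TEXT, metadata_location TEXT, dropped_at REAL)"
    )


def drop_table(conn, namespace, name, now=None):
    """Move the table pointer into the queue instead of discarding it."""
    now = time.time() if now is None else now
    with conn:  # commits on success, rolls back on error
        row = conn.execute(
            "SELECT metadata_location FROM iceberg_tables "
            "WHERE namespace = ? AND name = ?",
            (namespace, name),
        ).fetchone()
        if row is None:
            return False
        conn.execute(
            "INSERT INTO dropped_tables VALUES (?, ?, ?, ?)",
            (namespace, name, row[0], now),
        )
        conn.execute(
            "DELETE FROM iceberg_tables WHERE namespace = ? AND name = ?",
            (namespace, name),
        )
    return True


def undelete_table(conn, namespace, name, now=None):
    """Restore the most recently dropped pointer if it is still retained.

    Raises sqlite3.IntegrityError if a table with the same name was
    re-created in the meantime.
    """
    now = time.time() if now is None else now
    with conn:
        # rowid is SQLite-specific; a serial id column would play this
        # role in PostgreSQL.
        row = conn.execute(
            "SELECT rowid, metadata_location FROM dropped_tables "
            "WHERE namespace = ? AND name = ? AND dropped_at >= ? "
            "ORDER BY dropped_at DESC LIMIT 1",
            (namespace, name, now - RETENTION_SECONDS),
        ).fetchone()
        if row is None:
            return False
        conn.execute(
            "INSERT INTO iceberg_tables VALUES (?, ?, ?)",
            (namespace, name, row[1]),
        )
        conn.execute("DELETE FROM dropped_tables WHERE rowid = ?", (row[0],))
    return True


def expire_dropped(conn, now=None):
    """Forget queue entries older than the retention window."""
    now = time.time() if now is None else now
    with conn:
        cur = conn.execute(
            "DELETE FROM dropped_tables WHERE dropped_at < ?",
            (now - RETENTION_SECONDS,),
        )
    return cur.rowcount
```

Recovery in this sketch only works if the underlying files were not purged,
so any file cleanup would have to wait until the queue entry expires.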

I’m not considering Hive metastore at the moment, but will take a look at it to 
check how it manages the catalog.

Thanks,
Mayur

From: Jack Ye <yezhao...@gmail.com>
Sent: Tuesday, May 11, 2021 3:14 PM
To: dev@iceberg.apache.org
Subject: Re: Iceberg catalog questions

For your subsequent questions:

2. Mapping a namespace name to a file path is only a convention, and it can be
overridden at both the namespace and the table level. The table root path can
be customized to be at any location, and we actually recommend that for cloud
storage use cases to reduce throttling.
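
A minimal sketch of that precedence, assuming a hypothetical "location"
namespace property and hypothetical function names (this is not an Iceberg
API): an explicit table location wins, then a namespace-level location, then
the warehouse convention.

```python
def default_namespace_location(warehouse, namespace):
    """Convention only: dotted namespace levels become path segments."""
    return warehouse.rstrip("/") + "/" + "/".join(namespace.split("."))


def resolve_table_location(warehouse, namespace, table,
                           namespace_props=None, table_location=None):
    """Pick the table root path, most specific override first."""
    # 1. A table-level override can point anywhere, e.g. a different bucket.
    if table_location:
        return table_location.rstrip("/")
    # 2. A namespace-level override (hypothetical "location" property).
    props = namespace_props or {}
    ns_location = props.get("location") or default_namespace_location(
        warehouse, namespace
    )
    # 3. Otherwise fall back to <namespace location>/<table name>.
    return ns_location.rstrip("/") + "/" + table
```

For example, resolve_table_location("s3://wh", "a.b", "t") follows the
convention, while passing table_location places the table wherever the caller
chooses.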

3. Access control has to be done across systems. For example, in the AWS Glue +
S3 use case, the caller has to have permission to access both Glue and S3 with 
the correct IAM resource permissions. The permission control capability really 
depends on the platform you are operating on. It is a bit tricky for a 
relational database where you have to basically manage row-level access 
control, but it is technically achievable.

4. The behavior varies among catalog implementations. Technically, 
CatalogUtil.dropTableData is called to clean up files when purge is enabled in 
most implementations. In that case, it cleans up the metadata file, the 
manifest lists, the manifests and data files. That means if purge is not 
enabled, those files are still there and you can recover the table if you can 
rebuild the table pointer in the catalog. But this is a manual action; there is
no Iceberg API support for it. In addition, if your storage has a file
retention feature and you can recover the files, you can recover the dropped
table version, but that is a storage-level feature, not an Iceberg feature.
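
The difference between the two drop modes can be illustrated with a toy model.
This is a sketch only: ToyCatalog, rebuild_pointer, and the file names are
hypothetical, and a plain set stands in for the object store. It does not
reproduce CatalogUtil.dropTableData.

```python
class ToyCatalog:
    """Toy model: the catalog maps a table name to its current metadata file."""

    def __init__(self, storage):
        self.pointers = {}      # table identifier -> metadata file path
        self.storage = storage  # set of paths standing in for an object store

    def drop_table(self, ident, files, purge):
        """Remove the pointer; delete the table's files only when purging."""
        self.pointers.pop(ident, None)
        if purge:
            self.storage.difference_update(files)

    def rebuild_pointer(self, ident, metadata_path):
        """Manual recovery step: re-attach a surviving metadata file."""
        if metadata_path not in self.storage:
            raise FileNotFoundError(metadata_path)
        self.pointers[ident] = metadata_path


files = {"v1.metadata.json", "snap-1.avro", "manifest-1.avro", "data-1.parquet"}
catalog = ToyCatalog(storage=set(files))
catalog.pointers["db.t"] = "v1.metadata.json"

# Without purge the files survive, so the pointer can be rebuilt by hand.
catalog.drop_table("db.t", files, purge=False)
catalog.rebuild_pointer("db.t", "v1.metadata.json")

# With purge the files are gone and only storage-level retention could help.
catalog.drop_table("db.t", files, purge=True)
```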

-Jack


On Tue, May 11, 2021 at 11:55 AM Jack Ye <yezhao...@gmail.com> wrote:
Yes, there is one, but unfortunately we lost track of it after some time:
https://github.com/apache/iceberg/pull/1870

I think the PR is close to being merged after quite a few rounds of review
already; we should add it to the 0.12 milestone.

-Jack

On Tue, May 11, 2021 at 11:46 AM Mayur Srivastava
<mayur.srivast...@twosigma.com> wrote:

Hi,

I’m looking to use/implement a PostgreSQL based Iceberg catalog. I’m wondering 
if one already exists and also have a few questions. I would really appreciate 
any help I can get with the questions.

1.      Does Iceberg have a catalog that is compatible with PostgreSQL (or any 
storage backend that is compatible with PostgreSQL)?

a.      If there are similar implementations, could someone share their 
experience with the database schema used for the catalog? E.g. does a namespace 
map to a database in the catalog backend?

b.      Is there an existing abstract base class that I can use to implement 
the catalog that talks to PostgreSQL?

2.      Mapping a catalog namespace to an S3 bucket: does anyone have a
recommendation for managing catalog namespaces alongside AWS S3 (or GCS)
buckets? For example, when a top-level namespace is created in the catalog, do
users map it to a bucket or to a sub-directory structure on S3? (This may be
useful for setting similar access control on both the catalog namespace and
the S3 bucket.)

3.      Table access permission management: since metadata is stored in two 
separate systems (table metadata in S3 and namespace/table location in 
catalog), how are table access permissions kept in sync in these storage 
systems? E.g. if a catalog is used with GCS, how are the namespace/bucket/table 
access permissions kept in sync?

4.      Undeleting or recovering a dropped table: does the catalog support an
undelete operation? If the underlying S3 data is not purged, can the catalog be
used to recover the dropped table?



Thanks,

Mayur
