iceberg and s3a compatibility

2023-07-11 Thread Perfect Stranger
Hello. I am currently reading this: https://hadoop.apache.org/docs/stable/hadoop-aws/tools/hadoop-aws/committers.html and learning about the s3a committers. It's a bit confusing, and it seems you need to be an expert to use these committers properly. Because you don't just write to an

Code review: [spark] skip empty file during table migration, table snapshotting or adding files

2023-07-11 Thread Pucheng Yang
Hi community, in a previous email I asked how to get rid of partitions that contain only empty files. Here I am proposing a PR, https://github.com/apache/iceberg/pull/8040 (issue: https://github.com/apache/iceberg/issues/7949), to skip adding empty files during the migration, snapshotting or

Re: iceberg and s3a compatibility

2023-07-11 Thread russell . spitzer
Long story short, Iceberg itself is a commit protocol, so you don’t have to configure any Hadoop commit protocols. Iceberg doesn’t use those methods because its metadata structure doesn’t rely on the location of data files as information about the state of those files. It can just wri