Re: Storing catalog directly on object store

2024-12-03 Thread Vladimir Ozerov
I second Ryan’s opinion that a production-grade catalog is a much broader concept than just CAS-ing the pointer. What we observe in practice at our company is that users want to work with large schemas (sometimes with literally thousands of schemas and millions of tables), have support for common DDL o

Re: Storing catalog directly on object store

2024-12-03 Thread Xuanwo
Hi Nikhil, Thank you very much for bringing the S3 Tables discussion here. However, I would like to point out that S3 Tables is not the same concept we are discussing here. It is not an object storage-based catalog; instead, it is a stateful service that provides dedicated APIs. It’s better to

Re: [Proposal] Automating the PyIceberg Release Process

2024-12-03 Thread Kevin Liu
Hi Fokko, Thank you for pointing me to the ASF Release Policy, it was very informative! Based on the policies, we cannot automate signing in GitHub Actions because Python wheels are not considered reproducible [1]. The wheels include information about the current date and time, which prevents val

Re: [VOTE] Release Apache Iceberg 1.7.1 RC1

2024-12-03 Thread Honah J.
+1 (binding) Bryan, thanks for running the release! Verified signature/checksum/license. Built against JDK17. Kevin, thanks for the update and explanation! Best regards, Honah On Tue, Dec 3, 2024 at 7:35 PM Kevin Liu wrote: > Hey everyone, > > I wanted to follow up on the issue I encountered

Re: [VOTE] Release Apache Iceberg 1.7.1 RC1

2024-12-03 Thread Kevin Liu
Hey everyone, I wanted to follow up on the issue I encountered with the failed tests for `:iceberg-kafka-connect:iceberg-kafka-connect-runtime:integrationTest` using Java 21. The issue was caused by a port conflict with the test container for Kafka Connect [1]. After shutting down the conflicting

Re: [VOTE] Release Apache PyIceberg 0.8.1rc1

2024-12-03 Thread Sung Yun
+1 (non-binding) Checked signatures, checksums and validated license headers. Ran the coverage tests using python3.12. Sung On 2024/12/03 21:37:48 Fokko Driesprong wrote: > +1 (binding) > > Checked checksums, signatures, and licenses. > > Honah, there are some open PRs to bump to the latest d

Re: Retry ValidationException with concurrent writes to the same partition

2024-12-03 Thread Kevin Liu
Hi Ha, Thanks for the question! Typically, when concurrent writes happen to the same partition, the writer must retry the operation. This is because the state of the table (and the partition) has changed, and the current write cannot be safely applied to the updated state. For example, if the oper
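
The retry behavior described above can be sketched as a refresh-and-retry loop. This is only a minimal illustration under stated assumptions, not Iceberg's or PyIceberg's API: `ValidationError`, `FakeTable`, `commit_with_retry`, and the `on_attempt` hook are all hypothetical names, and the "table" is an in-memory counter. Whether a given conflict is safe to retry at all depends on the operation, so treat the loop as a pattern rather than a recommendation.

```python
import random
import time


class ValidationError(Exception):
    """Hypothetical stand-in for the conflict error raised at commit time."""


class FakeTable:
    """Hypothetical in-memory table: a commit succeeds only against the latest snapshot."""

    def __init__(self):
        self.snapshot_id = 0

    def refresh(self):
        # Return the current table state the next write should be based on.
        return self.snapshot_id

    def commit(self, base_snapshot_id):
        # Reject the write if the table changed after the write was planned.
        if base_snapshot_id != self.snapshot_id:
            raise ValidationError("table changed since the write was planned")
        self.snapshot_id += 1
        return self.snapshot_id


def commit_with_retry(table, max_attempts=5, base_delay=0.01, on_attempt=None):
    """Re-read the table state and retry the commit when validation fails.

    `on_attempt` is an illustrative hook (an assumption, used here to
    simulate a concurrent writer between refresh and commit).
    """
    for attempt in range(1, max_attempts + 1):
        base = table.refresh()  # re-plan against the latest state
        if on_attempt is not None:
            on_attempt(attempt)
        try:
            return table.commit(base)
        except ValidationError:
            if attempt == max_attempts:
                raise  # give up and surface the conflict to the caller
            # Exponential backoff with jitter so competing writers spread out.
            time.sleep(base_delay * 2 ** (attempt - 1) * random.random())
```

In this sketch the write is re-planned from a fresh snapshot on every attempt, which is the part that matters: retrying the same stale commit would fail the same validation again.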

Re: [VOTE] Release Apache Iceberg 1.7.1 RC1

2024-12-03 Thread Yufei Gu
+1 (binding) Verified signature, checksum, and license check. Build passed. Apache Polaris Test suites passed with the rc1. Yufei On Tue, Dec 3, 2024 at 3:58 PM Kevin Liu wrote: > +1 (non-binding) > Thanks for running the release! > > Verified signature, checksum, and license check. Built and

Re: [VOTE] Release Apache Iceberg 1.7.1 RC1

2024-12-03 Thread Kevin Liu
+1 (non-binding) Thanks for running the release! Verified signature, checksum, and license check. Built and tested using JDK 17 (`17.0.6-zulu`). I also tried building with Java 21 (`21.0.4-amzn`) but ran into the following failed tests. I don't think this is blocking the current RC since support

Re: Retry ValidationException with concurrent writes to the same partition

2024-12-03 Thread Yufei Gu
If you’re looking for finer-grained isolation beyond the snapshot level, the closest feature currently *WIP* is *Fine-Grained Commit* in the REST catalog. You can find more details here: Fine-Grained Commit Design Document

Retry ValidationException with concurrent writes to the same partition

2024-12-03 Thread Ha Cao
Hello, I have some concurrent writes to the same partition that overlap in data files, and the isolation level is snapshot. As expected, I get this ValidationException thrown from this line

Re: [VOTE] Release Apache PyIceberg 0.8.1rc1

2024-12-03 Thread Fokko Driesprong
+1 (binding) Checked checksums, signatures, and licenses. Honah, there are some open PRs to bump to the latest dependencies (e.g. Pandas 2.2.3). Except for the warning, everything works well. Would be good to get those bumped at some point :)

Re: [VOTE] Release Apache PyIceberg 0.8.1rc1

2024-12-03 Thread Honah J.
+1 (binding) Thanks for running the release, Kevin! - Verified signatures/checksum/license - Ran tests "make test-coverage" in python 3.11 I noticed that when running tests with latest dependencies: - pandas==2.2.3 - pyspark==3.5.4 - getdaft==0.3.15 some tests failed due to the following warning

Re: Storing catalog directly on object store

2024-12-03 Thread Nikhil Benesch
> And I'm also looking forward to what Jack is alluding to. AWS just announced *native* S3 support for Iceberg buckets! [0] This is almost surely what Jack was alluding to. This is very cool. It's a much deeper integration than I was expecting but nonetheless one that fully satisfies my use case

Re: [VOTE] Release Apache Iceberg 1.7.1 RC1

2024-12-03 Thread Jean-Baptiste Onofré
+1 (non-binding) Regards JB On Thu, Nov 21, 2024 at 2:35 PM Bryan Keller wrote: > > Hi Everyone, > > I propose that we release the following RC as the official Apache Iceberg > 1.7.1 release. > > The commit ID is 4a432839233f2343a9eae8255532f911f06358ef > * This corresponds to the tag: apache-i

Re: [DISCUSS] Apache Iceberg Summit 2025 - Selection Committee

2024-12-03 Thread Anurag Mantripragada
Thanks for the email. I’m happy to help as well. Anurag Mantripragada > On Dec 1, 2024, at 7:27 PM, Nick Riasanovsky wrote: > > Happy to volunteer as well. > > - Nick Riasanovsky > > On Sun, Dec 1, 2024 at 10:17 PM Anton Okolnychyi > wrote: >> Happy to voluntee

Re: [Proposal] Automating the PyIceberg Release Process

2024-12-03 Thread Fokko Driesprong
Hey Kevin, First of all, thanks for working on the releases, that's always much appreciated. Regarding the changes to the release process, I'm all for automating as much as possible, but I have some concerns. I also think it is important to split out nightly builds, and the release process in gen

Re: [VOTE] Release Apache Iceberg 1.7.1 RC1

2024-12-03 Thread Péter Váry
Built, verified the signature, checksums, rat and ran some tests. +1 Eduard Tudenhöfner wrote (on Tue, Dec 3, 2024, 8:47): > I actually verified everything (checksums, license checks, tests) last > week but forgot to vote, sorry about that. > Also thanks for running the release Bryan.