Re: Discuss proposal - IRC APIs for Multi-Statement Multi-Table Transactions

2025-05-05 Thread Maninderjit Singh
Thanks for getting back to me and probing further! I agree that the approaches are functionally equivalent and that we are debating whether to use a time-based or a non-time-based logical clock. You have brought forward a few good arguments that I want to discuss further: *1. Catalog is most

Proposal: Support Cascading Deletion of Namespaces

2025-05-05 Thread Maninderjit Singh
Hi all, In the Iceberg catalog community sync, we recently discussed the need to support cascading drops of namespaces so that different catalog implementations can precisely define their behavior. I have added a short proposal

Re: [DISCUSS] Finalizing the v3 spec

2025-05-05 Thread Anton Okolnychyi
DVs in Spark seem to behave reasonably, serving as a reference implementation of the V3 spec. There are areas for optimization/refinement but nothing was observed that requires changing the spec. I would also like to add the notion of content overhead/metadata (for Puffin/Parquet footers) to manife

Re: [Discuss] spec: Add missing 'spec_id' for data_file

2025-05-05 Thread Aihua Xu
Szehon pointed out that it's not serialized to the manifest file. So it's used internally but it doesn't need to be in the spec. Thanks, Aihua On Mon, May 5, 2025 at 10:34 AM Aihua Xu wrote: > Hi all, > > I notice that we are missing spec_id for data_file in the spec while > working on V3 featu

Re: [VOTE] Update partition stats spec for V3

2025-05-05 Thread Anton Okolnychyi
> > Does this mean that we can't upgrade v2 table to v3 table in a lazy > approach? That means it's not a mere table metadata upgrade, but we need to > upgrade all partition statistics? Renjie, it is simply a clarification in the spec. We have treated NULL as 0 in V2 while it wasn't very obvious.

Re: [VOTE] Update partition stats spec for V3

2025-05-05 Thread Anton Okolnychyi
My apologies for not following up earlier. This vote passed with 16 +1 (5 binding) and no 0/-1 votes. I am going to merge the spec PR later today. - Anton On Wed, Feb 5, 2025 at 22:14, Renjie Liu wrote: > >Make delete counts required to avoid ambiguity w.r.t NULL vs unknown. > > Thanks Anton for dr

Re: Discuss proposal - IRC APIs for Multi-Statement Multi-Table Transactions

2025-05-05 Thread Jagdeep Sidhu
Hi Maninder, I believe the two approaches, "Global CatalogSequenceNumbers" and the "Catalog timestamps" that you propose, are functionally equivalent. The difference is simply whether we use an actual timestamp or some monotonically increasing sequence/logical clock. I think we should use sequence numbe

[Discuss] spec: Add missing 'spec_id' for data_file

2025-05-05 Thread Aihua Xu
Hi all, I notice that we are missing spec_id for data_file in the spec while working on V3 features (https://github.com/apache/iceberg/pull/12970), and adding it seems obvious. But let me know if a vote is needed for such a change. Thanks, Aihua

[RESULT] [VOTE] Add encryption keys to table metadata

2025-05-05 Thread Ryan Blue
Thanks, everyone! This passes with 11 +1 votes and no -1 or +0. Ryan On Wed, Apr 30, 2025 at 8:44 PM Denny Lee wrote: > +1 (non-binding) > > On Wed, Apr 30, 2025 at 8:06 PM Aihua Xu wrote: > >> +1 (non-binding). >> >> On Wed, Apr 30, 2025 at 3:49 PM Daniel Weeks wrote: >> >>> +1 (binding) >>