Re: Spec changes for deletion vectors

2024-10-15 Thread Russell Spitzer
@Scott We would be able to read Delta vectors regardless of what we pick, since on the Iceberg side we really just need the bitmap and the offset at which it is located within a file; everything else could be in the Iceberg metadata. I don't think we have any disagreement on this aspect. The quest

Re: [DISCUSS] Formalized File IO Properties

2024-10-15 Thread Huang-Hsiang Cheng
Hi everyone, I tried to consolidate S3 properties in this PR: https://github.com/apache/iceberg/pull/11321 Hopefully we can start building a single source of truth from it. Thanks for your review, -Hsiang > On Aug 7, 2024, at 3:44 AM, Kevin Liu wrote: > > +1 on standardizing, and possibly e

[PROPOSAL] Add manifest-level statistics for CBO estimation

2024-10-15 Thread Xingyuan Lin
Hi everyone, Here's a doc for [Proposal] Add manifest-level statistics for CBO estimation. It's for more efficient derivation of stats for the CBO process. Original discussion thread

Re: Spec changes for deletion vectors

2024-10-15 Thread Scott Cowell
From an engine perspective I think compatibility between Delta and Iceberg on DVs is a great thing to have. The additions for cross-compat seem a minor thing to me that is vastly outweighed by a future where Delta tables with DVs were supported in Delta Uniform and could be read by any Iceberg V3

Re: Spec changes for deletion vectors

2024-10-15 Thread Anton Okolnychyi
Are there engines/vendors/companies in the community that support both Iceberg and Delta and would benefit from having one blob layout for DVs? - Anton On Tue, Oct 15, 2024 at 11:10, rdb...@gmail.com wrote: > Thanks, Szehon. > > To clarify on compatibility, using the same format for the blobs make

Re: [VOTE] Standardize vended credentials in OpenAPI spec

2024-10-15 Thread Yufei Gu
+1 Yufei On Tue, Oct 15, 2024 at 12:09 PM Daniel Weeks wrote: > +1 > > On Tue, Oct 15, 2024 at 10:42 AM Russell Spitzer < > russell.spit...@gmail.com> wrote: > >> +1 >> >> On Tue, Oct 15, 2024 at 12:28 PM Bryan Keller wrote: >> >>> +1 >>> >>> On Oct 15, 2024, at 10:14 AM, Eduard Tudenhöfner <

Re: [VOTE] Standardize vended credentials in OpenAPI spec

2024-10-15 Thread Daniel Weeks
+1 On Tue, Oct 15, 2024 at 10:42 AM Russell Spitzer wrote: > +1 > > On Tue, Oct 15, 2024 at 12:28 PM Bryan Keller wrote: > >> +1 >> >> On Oct 15, 2024, at 10:14 AM, Eduard Tudenhöfner < >> etudenhoef...@apache.org> wrote: >> >> Hey everyone, >> >> I'd like to vote on #10722

Re: Spec changes for deletion vectors

2024-10-15 Thread rdb...@gmail.com
Thanks, Szehon. To clarify on compatibility, using the same format for the blobs makes it so that existing Delta readers can read and use the DVs written by Iceberg. I'd love for Delta to adopt Puffin, but if we adopt the extra fields they would not need to change how readers work. That's why I th

Re: Spec changes for deletion vectors

2024-10-15 Thread Szehon Ho
This is awesome work by Anton and Ryan. It looks like a ton of effort has gone into the V3 position vector proposal to make it clean and efficient. It has been a long time coming, and I'm truly excited to see the great improvement in storage/perf. With respect to these fields, I think most of the concerns are already me

Re: [VOTE] Standardize vended credentials in OpenAPI spec

2024-10-15 Thread Russell Spitzer
+1 On Tue, Oct 15, 2024 at 12:28 PM Bryan Keller wrote: > +1 > > On Oct 15, 2024, at 10:14 AM, Eduard Tudenhöfner > wrote: > > Hey everyone, > > I'd like to vote on #10722 , > which has been open for quite a while now. > I believe we're in agreement

Re: [VOTE] Standardize vended credentials in OpenAPI spec

2024-10-15 Thread Bryan Keller
+1 > On Oct 15, 2024, at 10:14 AM, Eduard Tudenhöfner > wrote: > > Hey everyone, > > I'd like to vote on #10722 , > which has been open for quite a while now. > I believe we're in agreement on how we want to standardize credentials in the > Open

[VOTE] Standardize vended credentials in OpenAPI spec

2024-10-15 Thread Eduard Tudenhöfner
Hey everyone, I'd like to vote on #10722, which has been open for quite a while now. I believe we're in agreement on how we want to standardize credentials in the OpenAPI spec. Please vote +1 if you generally agree with the path forward. Please vote

Re: [DISCUSS] REST: Standardize vended credentials in Spec

2024-10-15 Thread Eduard Tudenhöfner
This proposal has been open for quite a while now and there's a broad set of feedback and discussions on the PR itself. It seems we're overall in agreement on the direction of how we want to standardize credentials in the OpenAPI spec, so I'd like to open a VOTE thread shortly in order to get this

Re: [DISCUSS] REST: Refreshing vended credentials

2024-10-15 Thread Eduard Tudenhöfner
Hey Yufei, for 1) a client would choose the longest matching prefix. In terms of failure I guess it really depends on what kind of credentials the server sent to the client. If the server sent multiple credentials for the same table (one generic (*prefix=s3*) and one narrowed-down one ( *prefix=s3
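
The "longest matching prefix" rule mentioned in the message above could be sketched roughly as follows. This is only an illustration, not the client implementation: the function name select_credential, the dict shape with "prefix" and "config" keys, the token values, and the bucket paths are all assumptions made up for the example.

```python
from typing import Optional


def select_credential(credentials: list[dict], location: str) -> Optional[dict]:
    """Return the credential whose prefix is the longest match for location.

    Hypothetical helper: each credential is assumed to be a dict with a
    "prefix" string and a "config" dict. Returns None if no prefix matches.
    """
    # Keep only credentials whose prefix actually matches the location.
    matching = [c for c in credentials if location.startswith(c["prefix"])]
    if not matching:
        return None
    # Among the matches, the longest prefix is the most specific one.
    return max(matching, key=lambda c: len(c["prefix"]))


# Hypothetical credentials: one generic, one narrowed down to a table path.
credentials = [
    {"prefix": "s3", "config": {"token": "generic"}},
    {"prefix": "s3://bucket/warehouse/db/table", "config": {"token": "narrow"}},
]

chosen = select_credential(
    credentials, "s3://bucket/warehouse/db/table/data/file.parquet"
)
print(chosen["config"]["token"])
```

With these inputs, a path under the table location picks the narrowed-down credential, a path in another bucket falls back to the generic one, and a non-S3 path matches nothing.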