Re: [DISCUSS] Allow Commit Conflicts in REPLACE TABLE transactions

2025-08-06 Thread Ryan Blue
Guy, I think you're right. The transaction should retry and update the latest metadata.json rather than blindly replacing it. Feel free to ping me for a review. On Wed, Aug 6, 2025 at 4:19 AM Guy Gadon wrote: > Hey all, > > Following up on the issue originally opened in > https://github.com/apac

RE: Re: [Discuss] Analytics Accelerator Library for Amazon S3 as default S3 Input Stream

2025-08-06 Thread Stubbs, Michael
I think Kevin has raised a few good points on the future of FileIO and the maintainability of the project going forward with the AAL default-on proposal. I think we should schedule a community sync about this. Thank you! On 2025/07/31 13:12:48 Steve Loughran wrote: > On Fri, 25 Jul 2025 at 17:2

Re: [DISCUSS] Restructuring Docs side navigation

2025-08-06 Thread Robin Moffatt
Thanks Manu. Unfortunately I missed this due to vacation. Looking at the docs as they stand now, I think we've got one more iteration on this to make :) I think there is an issue for users who are coming to Iceberg from a non-coding point of view. Things like "Concepts", "Introduction" and many of

Re: Clarification on the flexibility of table statistics information

2025-08-06 Thread Gábor Kaszab
Hi, Puffin files are defined in a way that in practice, engines can put anything into them, not just what is standardized by the spec. I recall Hive serializes its own stats object and stores them in Puffin. However, while this information is not standardized, it can't be expected that any other e

[DISCUSS] Allow Commit Conflicts in REPLACE TABLE transactions

2025-08-06 Thread Guy Gadon
Hey all, Following up on the issue originally opened in https://github.com/apache/iceberg/issues/13651. The current behavior, which ignores any potential conflicts, can be very dangerous with metadata changes. This behavior can cause many potential issues - it can "revive" snapshots that were expi

Re: [DISCUSS] V4 - Parquet as Metadata File Format

2025-08-06 Thread Sreeram Garlapati
+1 This will be a great progression for the Iceberg format, allowing efficient metadata pruning. Please count me in. On Tue, Jun 17, 2025 at 3:45 AM Jacky Lee wrote: > Count me in. This solution effectively addresses the small files issue > caused by high-frequency writes in our scenario, and it also gr

Re: Clarification on the flexibility of table statistics information

2025-08-06 Thread Ajantha Bhat
Hi, > From my read of the spec, which may be overly pedantic, it seems like > attaching anything other than NDV + an associated compact theta sketch is > *not* compliant with the spec: True. In the section on Table Statistics > it’s explicit tha