Re: [DISCUSS] Iceberg REST Catalog Idempotency

2025-09-19 Thread huaxin gao
Thanks, Peter and Yufei. I agree the main use case is network‑infrastructure retries. To keep the specification simple and move the proposal forward, let’s make the baseline key‑only idempotency. If there’s demand, we can add an optional payload‑binding mode (canonical JSON + SHA‑256), advertised v
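
As a rough illustration of the two modes mentioned above, here is a minimal Python sketch. It is not part of the proposal or of any Iceberg API: `IdempotencyStore`, `payload_fingerprint`, and the `bind_payload` flag are hypothetical names, the store is an in-memory dict, and "canonical JSON" is approximated with sorted keys and compact separators rather than a formal canonicalization scheme.

```python
import hashlib
import json


def payload_fingerprint(payload: dict) -> str:
    """SHA-256 hex digest of a simplified canonical JSON form of the payload.

    Assumption: sorted keys plus compact separators stand in for a real
    canonical-JSON scheme; the thread does not specify one.
    """
    canonical = json.dumps(
        payload, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class IdempotencyStore:
    """Hypothetical in-memory server-side store keyed by idempotency key."""

    def __init__(self, bind_payload: bool = False):
        # bind_payload=False: key-only baseline.
        # bind_payload=True: optional payload-binding mode.
        self.bind_payload = bind_payload
        self._seen = {}  # key -> (fingerprint or None, cached response)

    def execute(self, key: str, payload: dict, handler):
        fingerprint = payload_fingerprint(payload) if self.bind_payload else None
        if key in self._seen:
            stored_fingerprint, response = self._seen[key]
            if self.bind_payload and stored_fingerprint != fingerprint:
                # Same key, different body: reject instead of replaying.
                raise ValueError("idempotency key reused with a different payload")
            return response  # replay the first response; handler is not re-run
        response = handler(payload)
        self._seen[key] = (fingerprint, response)
        return response
```

In key-only mode a retry with the same key always replays the stored response, which matches the network-infrastructure retry case; the payload-binding mode additionally detects a regenerated request that reuses an old key.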

Re: [DISCUSS] Iceberg REST Catalog Idempotency

2025-09-19 Thread Dennis Huo
+1 to this mostly targeting a "low-level" retry semantic. Expanding on that, though, I'd say even "client-side retries" really have two distinct flavors: A. Business-logic-agnostic retries, e.g. in a common low-level HTTP client library - behaviorally, these should behave largely the same as "

Re: [DISCUSS] Removal of Individually Curated Blogs and Talks and Position on Vendor Documentation

2025-09-19 Thread Russell Spitzer
I'm not running into an error, I just didn't have time to check the linter so I was wondering if it would throw an error or if it's ok with orphan pages. On Fri, Sep 19, 2025 at 6:04 PM Kevin Liu wrote: > Assuming you're referring to this markdown linter from #13977 >

Re: [DISCUSS] Removal of Individually Curated Blogs and Talks and Position on Vendor Documentation

2025-09-19 Thread Kevin Liu
Assuming you're referring to this markdown linter from #13977 , I think you can change the path to `**/*.md` so it searches through all the markdown files. What error are yo

Re: [Discuss] Deprecating Spark 3.4

2025-09-19 Thread Kevin Liu
Thanks Anton and Eduard. I'm ok with being more aggressive with the deprecation schedule. Looking at the git history for `spark/v3.4/` , there are 5 new commits since the 1.10 release. Only 1 commit (3bbdee9

Re: [DISCUSS] Removal of Individually Curated Blogs and Talks and Position on Vendor Documentation

2025-09-19 Thread Russell Spitzer
Does anyone know if we can support an orphaned page in MkDocs without the new Markdown linter complaining? I'm testing out a build where we keep the page but disable robots/nofollow on it. On Fri, Sep 19, 2025 at 1:24 PM Kevin Liu wrote: > Thank you, Alex! I think we can proceed with the removal

Re: [DISCUSS] Removal of Individually Curated Blogs and Talks and Position on Vendor Documentation

2025-09-19 Thread Alex Merced
I have created a new home for continued development of the list. People will be able to make pull requests to add blogs, and it will also cover a few other Lakehouse-related OSS projects. I will post the details early next week, earlier if possible. *Alex Merced , * *H

[QUESTION] RESTTableOperations refresh does not enforce Table UUID check

2025-09-19 Thread Ma, Limin
Hi All, HiveTableOperations (BaseMetastoreTableOperations) enforces a table UUID check when refreshing TableMetadata: https://github.com/apache/iceberg/blob/main/core/src/main/java/org/apache/iceberg/BaseMetastoreTableOperations.java#L205 However, RESTTableOperations does not enforce Table UUID
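
To make the question concrete, the kind of refresh-time check being described can be sketched as follows. This is a hypothetical Python illustration, not the Iceberg Java code: `TableMetadata` here is a stand-in dataclass, and `checked_refresh` is an invented helper that assumes the check is skipped when either side has no UUID.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TableMetadata:
    """Stand-in for table metadata; only the fields the sketch needs."""
    uuid: Optional[str]
    location: str


def checked_refresh(
    current: Optional[TableMetadata], refreshed: TableMetadata
) -> TableMetadata:
    """Accept refreshed metadata only if it belongs to the same table.

    Assumption: the check applies only when both the current and the
    refreshed metadata carry a UUID.
    """
    if (
        current is not None
        and current.uuid is not None
        and refreshed.uuid is not None
        and current.uuid != refreshed.uuid
    ):
        raise RuntimeError(
            f"Table UUID does not match: current={current.uuid} "
            f"!= refreshed={refreshed.uuid}"
        )
    return refreshed
```

Without such a check, a table that was dropped and recreated under the same name could be picked up silently by a refresh, because the new metadata would simply replace the old.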

Re: [DISCUSS] Iceberg REST Catalog Idempotency

2025-09-19 Thread Yufei Gu
"*Network infrastructure retries*" would be the dominant use case. I would NOT recommend that clients retry with the same idempotency key if they regenerated the request; instead, clients should reload before retrying in that case. Yufei On Fri, Sep 19, 2025 at 2:05 AM Péter Váry wrote: > Hi Huaxin, > > Cou

Re: [DISCUSS] Removal of Individually Curated Blogs and Talks and Position on Vendor Documentation

2025-09-19 Thread Russell Spitzer
I could see us keeping a deprecated version of the page, but I think the rationale of boosting search engine impacts for blog posts that are already on the page is actually one of the reasons we should remove the page. As a community we don't want to have a set of "special" blog posts that the proj

Re: [Discuss] Deprecating Spark 3.4

2025-09-19 Thread Steven Wu
Following up on Manu's question, why not just remove Spark 3.4 for the next 1.11 release? Or do we usually wait for one more release and remove it in the 1.12 release after marking 3.4 as deprecated in the engine status doc page? On Fri, Sep 19, 2025 at 9:12 AM Kevin Liu wrote: > > Given the man

Re: [Discuss] Deprecating Spark 3.4

2025-09-19 Thread Anton Okolnychyi
I know we followed this rule of deprecating a Spark version in one release and then removing it in the next one. Shall we ask ourselves whether it is still the model we want to follow? My problem, like before, is that we release a new Iceberg jar that is supposed to contain the latest and greatest f

Re: [Discuss] Deprecating Spark 3.4

2025-09-19 Thread Eduard Tudenhöfner
I agree with Anton and I would be in favor of just removing it in the next release. By updating the docs now we can already signal immediately that Spark 3.4 is deprecated and people can always use Iceberg 1.10 when needing Spark 3.4 support. On Fri, Sep 19, 2025 at 7:06 PM Anton Okolnychyi wrote

Re: [DISCUSS] Removal of Individually Curated Blogs and Talks and Position on Vendor Documentation

2025-09-19 Thread Kevin Liu
Thank you, Alex! I think we can proceed with the removal first. I'm also +1 on an official blog for project announcements. Best, Kevin Liu On Fri, Sep 19, 2025 at 10:46 AM Alex Merced wrote: > I have new home for continued development of the list created that people > will be able to make pull

Re: [DISCUSS] Removal of Individually Curated Blogs and Talks and Position on Vendor Documentation

2025-09-19 Thread Kevin Liu
The relevant links are either the top-level pages: - https://iceberg.apache.org/blogs/ - https://iceberg.apache.org/talks/ or the individual posts they reference. Examples from each page: - https://iceberg.apache.org/blogs/#kafka-to-iceberg-exploring-the-options - https://iceberg.apache.org/talks/#

Re: [DISCUSS] Removal of Individually Curated Blogs and Talks and Position on Vendor Documentation

2025-09-19 Thread Anton Okolnychyi
I think the project is too big now for us to maintain the list in its current form. I believe the original intent was to include references to any mentions of Iceberg to boost visibility as there was no company that would sponsor any media coverage for Iceberg in early days. At that time the list o

Re: [Discuss] Deprecating Spark 3.4

2025-09-19 Thread Kevin Liu
> why not just remove Spark 3.4 for the next 1.11 release? Or do we usually wait for one more release and remove it in the 1.12 release after marking 3.4 as deprecated in the engine status doc page? My preference is to mark as deprecated for one release and remove in the following. To quote JB:

Re: [Discussion] Versioned SQL UDFs (Catalog routines) in Iceberg

2025-09-19 Thread Yufei Gu
Hi folks, I really appreciate the feedback from you all over the past few months. I've filed the initial PR for the UDF spec: https://github.com/apache/iceberg/pull/14117. It captures the consensus we've built and addresses the write amplification concern raised in our last discussion. Please take a l

Re: [Discuss] Deprecating Spark 3.4

2025-09-19 Thread Kevin Liu
Given the many +1's here, I've moved the PR to deprecate 3.4 to "ready for review", https://github.com/apache/iceberg/pull/14099 > Does it mean we will stop back-porting PRs to Spark 3.4 for 1.11? Not necessarily. There's a lot of Spark 3.4 backports already, https://github.com/apache/iceberg/com

[NOTICE] Critical issue in Flink IcebergSink with Flink 1.19.3 / Flink 1.20.2

2025-09-19 Thread Maximilian Michels
Scope: We recently discovered an issue [1] for users of the Flink V2 Iceberg sink [2], which originates from the Flink runtime in the following Flink patch releases: - Flink 1.19.3 - Flink 1.20.2 Notably, any releases prior to these patch releases are _not_ affected. For example, Flink 1.19

Re: [Discuss] Deprecating Spark 3.4

2025-09-19 Thread Amogh Jahagirdar
+1 On Fri, Sep 19, 2025 at 2:03 AM Péter Váry wrote: > +1 > > Eduard Tudenhöfner wrote (on Fri, Sep 19, 2025, 8:56): > >> +1 on deprecating Spark 3.4 >> >> On Thu, Sep 18, 2025 at 8:36 AM Steve wrote: >> >>> +1 >>> >>> On Wed, Sep 17, 2025 at 22:52 Jean-Baptiste Onofré >>> wrote:

[VOTE][C++] Release Apache Iceberg C++ 0.1.0 RC4

2025-09-19 Thread Gang Wu
Hi, I would like to propose the following release candidate (RC4) of Apache Iceberg C++ version 0.1.0. This is the first-ever release of Apache Iceberg for C++. Highlights include: - Core Metadata: Implementation of the fundamental Iceberg data model, including schemas, partition specs, and tab

Re: [VOTE] Deprecation of Position Deletes with Row Data

2025-09-19 Thread Scott Haines
+1. On Wed, Sep 10, 2025 at 10:32 PM huaxin gao wrote: > +1 (non-binding) > > On Wed, Sep 10, 2025 at 9:28 PM Prashant Singh > wrote: > >> +1 (non-binding) >> >> Best, >> Prashant Singh >> >> On Wed, Sep 10, 2025 at 6:58 PM Steve wrote: >> >>> +1 non-binding >>> >>> On Wed, Sep 10, 2025 at 20:

Re: [Discuss] Deprecating Spark 3.4

2025-09-19 Thread Péter Váry
+1 Eduard Tudenhöfner wrote (on Fri, Sep 19, 2025, 8:56): > +1 on deprecating Spark 3.4 > > On Thu, Sep 18, 2025 at 8:36 AM Steve wrote: > >> +1 >> >> On Wed, Sep 17, 2025 at 22:52 Jean-Baptiste Onofré >> wrote: >> >>> +1 >>> >>> I agree about the plan to "announce" the deprecation

Re: [DISCUSS] Iceberg REST Catalog Idempotency

2025-09-19 Thread Péter Váry
Hi Huaxin, Could you clarify the specific use cases we intend to support regarding retry checking? Here are a couple of possibilities I had in mind: - *Network infrastructure retries* – where the exact same request is retried. - *Client-side retries* – where the client regenerates the re

Re: [DISCUSS] v4 - Improved column statistics

2025-09-19 Thread Eduard Tudenhöfner
Hey everyone, I have updated the proposal with the following things: - removed *column_size*, since this hasn't been used anywhere in earlier versions. Please shout if you t