Re: [DISCUSS] V4 - indexing support

2025-07-15 Thread Sreeram Garlapati
Thanks Steven for starting this. I am interested in the indexing-related conversations. Here are some preliminary thoughts: 1. *Primary Index*: Conventionally, a primary index just means the table's primary storage layout/organization. Given that Iceberg supports Sort-order -

Re: Iceberg 1.10.0 release update - July 1, 2025

2025-07-15 Thread Ajantha Bhat
I have approached people at Confluent to help us publish the OSS Kafka Connect Iceberg sink plugin. It seems we have a CVE from a dependency that blocks us from publishing the plugin. Please include the below PR for the 1.10.0 release

Re: [DISCUSS] v4 - Improved column statistics

2025-07-15 Thread Eduard Tudenhöfner
Hey everyone, We met yesterday and talked about the column stats proposal. Please find the recording here and the notes here

Re: [DISCUSS] V4 - indexing support

2025-07-15 Thread Maximilian Michels
Thanks Steven for the summary. It would be great to extend the Iceberg spec with index files, such that they can be used for the different use cases. For my understanding, let me further outline the different types of use cases for index files: --- Topic 1: Accelerating the resolution of equality

Re: [discuss] ensure feature consistency across the 3 different spark versions

2025-07-15 Thread Wing Yew Poon
Kevin, Just a minor clarification: > I want to point out that Spark 4.0 is in an interesting state right now. Spark 4.0 is not yet the "latest supported version" since Iceberg 1.10 should be the first version that works with Spark 4.0 according to #13162

Re: [VOTE] Release Apache Iceberg 1.9.2 RC0

2025-07-15 Thread Yuya Ebihara
+1 (non-binding) Confirmed that Trino CI is green. It runs tests against several catalogs, including HMS, Glue, JDBC (PostgreSQL), REST (Polaris, Unity, S3 Tables, Tabular), Nessie, and Snowflake. Yuya On Wed, Jul 16, 2025 at 4:42 AM Kevin Liu wrote: > +1 (non-binding) > > - Verified signature

Re: [VOTE] Release Apache Iceberg 1.9.2 RC0

2025-07-15 Thread Kevin Liu
+1 (non-binding) - Verified signature, checksum, license. - Build + test passed using Java 17 - Ran a few examples on Spark - Ran pyiceberg integration tests > With only what I assume are 2 +1 (Prashant and Russell) does this pass? Releases (even a patch release like this one) require 3 binding

Re: [DISCUSS] V4 - indexing support

2025-07-15 Thread Anurag Mantripragada
Thanks for starting this thread, Steven! I have been interested in secondary indexing in Iceberg. There was an old proposal for secondary indexing [1]; we may need to revisit/redesign these structures. I agree this is a very broad topic and having indexing structures general enough to support a wide

Re: [discuss] ensure feature consistency across the 3 different spark versions

2025-07-15 Thread Kevin Liu
Thanks for the context, Wing and Anton! My main concern was around feature parity between the different Spark versions, especially if a feature is only implemented in an older version of Spark. > I believe the general practice is to implement a feature in the latest supported version (current