Hi Manu,

Yes, documenting spark.executor.userClassPathFirst=true seems like a
reasonable approach for now.
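
As a starting point for the docs, the entry could look roughly like the
sketch below, written in spark-defaults.conf form. Only the executor
property was exercised in this thread; the driver-side property is
included as an untested assumption, and Spark documents both as
experimental.

```properties
# Give user-supplied jars (including the Iceberg Spark runtime) precedence
# over Spark's bundled jars on executors. This is the setting tested in
# this thread.
spark.executor.userClassPathFirst  true

# Assumption: the driver-side equivalent may also be needed in some
# deployments. It was not tested against this issue.
# spark.driver.userClassPathFirst  true
```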

However, this solution means we will likely encounter similar
dependency conflicts in the future, particularly regarding the Parquet
version used across our various extensions (Spark, Flink, Kafka
Connect, etc.). We should keep this in mind when defining dependencies
for our extensions going forward.
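
To spot this kind of mismatch early, a small reflection check could
report which jar provides LogicalTypeAnnotation and whether it has the
variantType(byte) method from the stack trace. This is only a sketch:
the class and method names are copied from the error reported in this
thread, while ParquetVariantCheck and describe are hypothetical names,
and the class name would differ if the runtime jar relocates Parquet.

```java
import java.lang.reflect.Method;
import java.security.CodeSource;

public class ParquetVariantCheck {
    // Names copied from the NoSuchMethodError reported in this thread.
    static final String CLASS_NAME = "org.apache.parquet.schema.LogicalTypeAnnotation";
    static final String METHOD_NAME = "variantType";

    /** Describes what the given class loader resolves for the Parquet class. */
    static String describe(ClassLoader loader) {
        try {
            // Load without initializing; we only inspect the class.
            Class<?> cls = Class.forName(CLASS_NAME, false, loader);
            CodeSource src = cls.getProtectionDomain().getCodeSource();
            String origin = (src == null || src.getLocation() == null)
                    ? "unknown location"
                    : src.getLocation().toString();
            try {
                Method m = cls.getMethod(METHOD_NAME, byte.class);
                return "found " + m.getName() + "(byte) in " + origin;
            } catch (NoSuchMethodException e) {
                return "missing " + METHOD_NAME + "(byte) in " + origin;
            }
        } catch (ClassNotFoundException e) {
            return "class not found: " + CLASS_NAME;
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(Thread.currentThread().getContextClassLoader()));
    }
}
```

Running the same check inside a Spark task, with and without
userClassPathFirst, should show which Parquet jar the executors
actually pick up.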

Regards,
JB

On Fri, Nov 21, 2025 at 5:00 AM Manu Zhang <[email protected]> wrote:
>
> Hi JB,
>
> We've already excluded Spark's parquet dependency from 
> iceberg-spark-runtime[1], and I don't think we should go back and block 
> variant/geometry support.
> On the other hand, it's unlikely for Spark 4.0.x to bump to Parquet 1.16.x in 
> a bug-fix release[2].
> Hence, `spark.executor.userClassPathFirst=true` is the best solution I can 
> see. We can add it to our documentation.
>
> 1. https://github.com/apache/iceberg/blob/main/spark/v4.0/build.gradle#L80
> 2. https://github.com/apache/spark/pull/52165#issuecomment-3240831583
>
> Regards,
> Manu
>
> On Fri, Nov 21, 2025 at 2:23 AM Jean-Baptiste Onofré <[email protected]> 
> wrote:
>>
>> Hi,
>>
>> While testing the 1.10.x and main branches, I encountered an issue
>> regarding Parquet dependency versions that needs clarification.
>>
>> I noticed a mismatch in the Parquet versions used by Spark itself and
>> the Iceberg Spark extension:
>>
>> - Spark 4.0.1 uses Parquet version 1.15.2.
>> - Iceberg Spark 1.10.0 uses Parquet version 1.16.0.
>>
>> If I set spark.executor.userClassPathFirst=true, the execution is
>> fine. However, with the default setting (userClassPathFirst=false),
>> running table maintenance actions (such as expireSnapshots) results in
>> a java.lang.NoSuchMethodError:
>> org.apache.parquet.schema.LogicalTypeAnnotation$VariantLogicalTypeAnnotation
>> org.apache.parquet.schema.LogicalTypeAnnotation.variantType(byte).
>> This error originates within the Iceberg Spark
>> ParquetWithSparkSchemaVisitor. So, I suspect a change in variant (and
>> variant schema) in Parquet.
>>
>> This issue suggests there may be an incompatible change between
>> Parquet 1.15.2 and 1.16.0. Since shading does not seem to resolve
>> this, I wonder if we should enforce a consistent Parquet version
>> across both Spark and the Iceberg extension to prevent such conflicts.
>>
>> Do you have any thoughts on how to best address this dependency
>> mismatch for the upcoming 1.10.1 release?
>>
>> Regards,
>> JB
>>
>> On Mon, Sep 22, 2025 at 10:29 AM Amogh Jahagirdar <[email protected]> wrote:
>> >
>> > Hey folks,
>> >
>> > Iceberg 1.10 was released 2 weeks ago and there was one issue around 
>> > incorrect variant filtering reported that I think meets the criteria for a 
>> > patch release. The fix PR is in (thank you Drew). I wanted to kick this 
>> > discussion thread off in case folks had other issues in the 1.10 release 
>> > that they think warrant a patch release.
>> >
>> > I also think this PR is a good candidate for a patch release; this is for 
>> > addressing a long-standing issue where closing the S3FileIO during an 
>> > event like moving broadcast variables from memory to disk leads to an 
>> > unexpected closing of the http client. There's still some discussion on 
>> > the approach of the fix but there's general recognition that it's a 
>> > legitimate issue, so I think it'd be ideal to get this in for a patch 
>> > release as well.
>> >
>> > I've also created a milestone here.
>> >
>> > Thanks,
>> >
>> > Amogh Jahagirdar
