Hi,
We are testing Spark Connect with Iceberg. We tried Spark 3.5 with the
Iceberg 1.4.x releases (all of the iceberg-spark-runtime-3.5_2.12-1.4.x
jars).

With every 1.4.x jar we hit the following issue when running Iceberg
queries from a SparkSession created through Spark Connect (--remote
"sc://remote-master-node"):

org.apache.iceberg.spark.source.SerializableTableWithSize cannot be cast to org.apache.iceberg.Table
    at org.apache.iceberg.spark.source.SparkInputPartition.table(SparkInputPartition.java:88)
    at org.apache.iceberg.spark.source.BatchDataReader.<init>(BatchDataReader.java:50)
    at org.apache.iceberg.spark.source.SparkColumnarReaderFactory.createColumnarReader(SparkColumnarReaderFactory.java:52)
    at org.apache.spark.sql.execution.datasources.v2.DataSourceRDD$$anon$1.advanceToNextIter(DataSourceRDD.scala:79)
    at org.apache.spark.sql.execution.datasources.v2.DataSourceRDD$$anon$1.hasNext(DataSourceRDD.scala:63)
    at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:37)
    at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:460)
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.columnartorow_nextBatch_0$(Unknown Source)
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.hashAgg_doAggregateWithKeys_0$(Unknown Source)
    at ...

Someone else has reported this issue on GitHub as well:
https://github.com/apache/iceberg/issues/8978

It currently works with Spark 3.4 and Iceberg 1.3. However, it would
ideally work with Spark 3.5 as well, since 3.5 brings many improvements to
Spark Connect.

Thanks
Nirav
