Hi, I get a NoClassDefFoundError from IcebergSparkSessionExtensions when running Spark 3.3 with iceberg-spark-runtime-3.3_2.12-1.0.0.jar. I noticed this jar doesn't contain Scala classes, unlike the previous jar, iceberg-spark-runtime-3.3_2.12-0.14.1.jar.
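For anyone who wants to repeat the jar check, here is a minimal sketch of one way to inspect a runtime jar for bundled Scala classes. `ScalaClassCheck` and its method names are hypothetical helpers written for illustration, not part of Iceberg or Spark, and the jar path is whatever is passed on the command line. It relies only on `java.util.jar.JarFile` from the JDK. Since `scala.collection.SeqOps` belongs to the Scala 2.13 collections library and does not exist in Scala 2.12, the sketch also looks for that specific class.

```java
import java.io.IOException;
import java.util.jar.JarFile;

// Hypothetical helper for illustration; not part of Iceberg or Spark.
public class ScalaClassCheck {

    /** Counts the .class entries in the jar that live under the scala/ package root. */
    static long countBundledScalaClasses(String jarPath) throws IOException {
        try (JarFile jar = new JarFile(jarPath)) {
            return jar.stream()
                    .map(entry -> entry.getName())
                    .filter(name -> name.startsWith("scala/") && name.endsWith(".class"))
                    .count();
        }
    }

    /** Returns true if the jar bundles the given class, e.g. "scala.collection.SeqOps". */
    static boolean containsClass(String jarPath, String className) throws IOException {
        try (JarFile jar = new JarFile(jarPath)) {
            String entryName = className.replace('.', '/') + ".class";
            return jar.getEntry(entryName) != null;
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: ScalaClassCheck <path-to-jar>");
            return;
        }
        String jarPath = args[0];
        System.out.println("bundled scala/ classes: " + countBundledScalaClasses(jarPath));
        System.out.println("bundles scala.collection.SeqOps: "
                + containsClass(jarPath, "scala.collection.SeqOps"));
    }
}
```

With JDK 11 or later this can be run directly as `java ScalaClassCheck.java <path-to-jar>`. A count of zero only says that the jar bundles no Scala classes, in which case they have to come from the Spark distribution's own Scala library at runtime.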

scala> spark.sql("show databases").show
java.lang.NoClassDefFoundError: scala/collection/SeqOps
  at org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions.$anonfun$apply$2(IcebergSparkSessionExtensions.scala:50)
  at org.apache.spark.sql.SparkSessionExtensions.$anonfun$buildResolutionRules$1(SparkSessionExtensions.scala:152)
  at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:286)
  at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
  at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
  at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
  at scala.collection.TraversableLike.map(TraversableLike.scala:286)
  at scala.collection.TraversableLike.map$(TraversableLike.scala:279)
  at scala.collection.AbstractTraversable.map(Traversable.scala:108)
  at org.apache.spark.sql.SparkSessionExtensions.buildResolutionRules(SparkSessionExtensions.scala:152)
  at org.apache.spark.sql.internal.BaseSessionStateBuilder.customResolutionRules(BaseSessionStateBuilder.scala:216)
  at org.apache.spark.sql.hive.HiveSessionStateBuilder$$anon$1.<init>(HiveSessionStateBuilder.scala:94)
  at org.apache.spark.sql.hive.HiveSessionStateBuilder.analyzer(HiveSessionStateBuilder.scala:85)
  at org.apache.spark.sql.internal.BaseSessionStateBuilder.$anonfun$build$2(BaseSessionStateBuilder.scala:360)
  at org.apache.spark.sql.internal.SessionState.analyzer$lzycompute(SessionState.scala:87)
  at org.apache.spark.sql.internal.SessionState.analyzer(SessionState.scala:87)
  at org.apache.spark.sql.execution.QueryExecution.$anonfun$analyzed$1(QueryExecution.scala:76)
  at org.apache.spark.sql.catalyst.QueryPlanningTracker.measurePhase(QueryPlanningTracker.scala:111)
  at org.apache.spark.sql.execution.QueryExecution.$anonfun$executePhase$2(QueryExecution.scala:185)
  at org.apache.spark.sql.execution.QueryExecution$.withInternalError(QueryExecution.scala:510)
  at org.apache.spark.sql.execution.QueryExecution.$anonfun$executePhase$1(QueryExecution.scala:185)
  at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:779)
  at org.apache.spark.sql.execution.QueryExecution.executePhase(QueryExecution.scala:184)
  at org.apache.spark.sql.execution.QueryExecution.analyzed$lzycompute(QueryExecution.scala:76)
  at org.apache.spark.sql.execution.QueryExecution.analyzed(QueryExecution.scala:74)
  at org.apache.spark.sql.execution.QueryExecution.assertAnalyzed(QueryExecution.scala:66)
  at org.apache.spark.sql.Dataset$.$anonfun$ofRows$2(Dataset.scala:99)
  at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:779)
  at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:97)
  at org.apache.spark.sql.SparkSession.$anonfun$sql$1(SparkSession.scala:622)
  at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:779)
  at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:617)
  ... 47 elided
Caused by: java.lang.ClassNotFoundException: scala.collection.SeqOps
  at java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:581)
  at java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:178)
  at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:522)
  ... 79 more

Note: I usually verify by copying the spark-runtime jar to the Spark jars dir. I usually can't get the --packages flag to work as described at https://iceberg.apache.org/how-to-release/#verifying-with-spark, because the version is not released yet. Please let me know if I am using the wrong jar.

Thanks
Szehon

On Mon, Oct 10, 2022 at 9:22 AM Eduard Tudenhoefner <edu...@tabular.io> wrote:

> +1 (non-binding)
>
> - validated checksum and signature
> - checked license docs & ran RAT checks
> - ran build and tests with JDK11
>
>
> Eduard
>
> On Mon, Oct 10, 2022 at 8:01 AM Ajantha Bhat <ajanthab...@gmail.com>
> wrote:
>
>> +1 (non-binding)
>>
>>
>> - Verified the Spark runtime jar contents.
>> - Checked license docs, ran RAT checks.
>> - Validated checksum and signature.
>>
>>
>> Thanks,
>> Ajantha
>>
>> On Mon, Oct 10, 2022 at 10:45 AM Prashant Singh <prashant010...@gmail.com>
>> wrote:
>>
>>> Hello Everyone,
>>>
>>> Wanted to know your thoughts on whether we should also include the
>>> following bug fixes in this release as well:
>>>
>>> 1. MERGE INTO nullability fix, leads to query failure otherwise:
>>> *Reported instances :*
>>> a. https://stackoverflow.com/questions/73424454/spark-iceberg-merge-into-issue-caused-by-org-apache-spark-sql-analysisexcep
>>> b. https://github.com/apache/iceberg/issues/5739
>>> c. https://github.com/apache/iceberg/issues/5424#issuecomment-1220688298
>>>
>>> *PR's (Merged):*
>>> a. https://github.com/apache/iceberg/pull/5880
>>> b. https://github.com/apache/iceberg/pull/5679
>>>
>>> 2. QueryFailure when running RewriteManifestProcedure on Date /
>>> Timestamp partitioned table when
>>> `spark.sql.datetime.java8API.enabled` is true.
>>> *Reported instances :*
>>> a. https://github.com/apache/iceberg/issues/5104
>>> b. https://apache-iceberg.slack.com/archives/C025PH0G1D4/p1663982635731469
>>>
>>> *PR* :
>>> a. https://github.com/apache/iceberg/pull/5860
>>>
>>> Regards,
>>> Prashant Singh
>>>
>>> On Mon, Oct 10, 2022 at 4:15 AM Ryan Blue <b...@apache.org> wrote:
>>>
>>>> +1 (binding)
>>>>
>>>> - Checked license docs, ran RAT checks
>>>> - Validated checksum and signature
>>>> - Built and tested with Java 11
>>>> - Built binary artifacts with Java 8
>>>>
>>>>
>>>> On Sun, Oct 9, 2022 at 3:42 PM Ryan Blue <b...@apache.org> wrote:
>>>>
>>>>> Hi Everyone,
>>>>>
>>>>> I propose that we release the following RC as the official Apache
>>>>> Iceberg 1.0.0 release.
>>>>>
>>>>> The commit ID is e2bb9ad7e792efca419fa7c4a1afde7c4c44fa01
>>>>> * This corresponds to the tag: apache-iceberg-1.0.0-rc0
>>>>> * https://github.com/apache/iceberg/commits/apache-iceberg-1.0.0-rc0
>>>>> * https://github.com/apache/iceberg/tree/e2bb9ad7e792efca419fa7c4a1afde7c4c44fa01
>>>>>
>>>>> The release tarball, signature, and checksums are here:
>>>>> * https://dist.apache.org/repos/dist/dev/iceberg/apache-iceberg-1.0.0-rc0
>>>>>
>>>>> You can find the KEYS file here:
>>>>> * https://dist.apache.org/repos/dist/dev/iceberg/KEYS
>>>>>
>>>>> Convenience binary artifacts are staged on Nexus. The Maven repository
>>>>> URL is:
>>>>> * https://repository.apache.org/content/repositories/orgapacheiceberg-1106/
>>>>>
>>>>> Please download, verify, and test.
>>>>>
>>>>> This release is based on the latest 0.14.1 release. It includes
>>>>> changes to remove deprecated APIs and the following additional bug fixes:
>>>>> * Increase metrics limit to 100 columns
>>>>> * Bump Spark patch versions for CVE-2022-33891
>>>>> * Exclude Scala from Spark runtime Jars
>>>>>
>>>>> Please vote in the next 72 hours.
>>>>>
>>>>> [ ] +1 Release this as Apache Iceberg 1.0.0
>>>>> [ ] +0
>>>>> [ ] -1 Do not release this because...
>>>>>
>>>>>
>>>>> --
>>>>> Ryan Blue
>>>>>
>>>>
>>>>
>>>> --
>>>> Ryan Blue
>>>>
>>>