That was the only exception I saw when running from the command line. The error is easy to reproduce: all I did was generate the app from the Maven archetype, then run it on a baseline Dataproc 2.1 image.
Generate the app:

$ mvn archetype:generate -DarchetypeGroupId=org.apache.beam -DarchetypeArtifactId=beam-sdks-java-maven-archetypes-examples -DarchetypeVersion=2.50.0 -DgroupId=org.example -DartifactId=word-count-beam -Dversion="0.1" -Dpackage=org.apache.beam.examples -DinteractiveMode=false

Compile:

$ mvn package -Pspark-runner

Spark submit to baseline Dataproc:

$ gcloud dataproc jobs submit spark --cluster=cluster-221f --region=us-central1 --class=org.apache.beam.examples.WordCount --jars=./target/word-count-beam-bundled-0.1.jar -- --runner=SparkRunner

The same app ran fine on Dataproc 2.0 with the library versions from this page: https://beam.apache.org/documentation/runners/spark/ ("Note: This example executes successfully with Dataproc 2.0, Spark 3.1.2 and Beam 2.37.0.")

On Mon, Oct 9, 2023 at 12:12 PM Chamikara Jayalath via dev <dev@beam.apache.org> wrote:

>
> On Thu, Oct 5, 2023 at 2:05 PM L. C. <chen....@gmail.com> wrote:
>
>> I'm getting a class-not-found error while running the word count example on
>> Dataproc 2.1 with Beam 2.50.0. The class exists in the jar. Does
>> anyone know how to resolve this?
>>
>> This is a list of dependency versions:
>>
>> <beam.version>2.50.0</beam.version>
>> <bigquery.version>v2-rev20230520-2.0.0</bigquery.version>
>> <google-api-client.version>2.0.0</google-api-client.version>
>> <guava.version>32.1.2-jre</guava.version>
>> <hamcrest.version>2.1</hamcrest.version>
>> <jackson.version>2.14.1</jackson.version>
>> <joda.version>2.10.10</joda.version>
>> <junit.version>4.13.1</junit.version>
>> <kafka.version>2.4.1</kafka.version>
>> <libraries-bom.version>26.22.0</libraries-bom.version>
>> <maven-compiler-plugin.version>3.7.0</maven-compiler-plugin.version>
>> <maven-exec-plugin.version>1.6.0</maven-exec-plugin.version>
>> <maven-jar-plugin.version>3.0.2</maven-jar-plugin.version>
>> <maven-shade-plugin.version>3.1.0</maven-shade-plugin.version>
>> <mockito.version>3.7.7</mockito.version>
>> <pubsub.version>v1-rev20220904-2.0.0</pubsub.version>
>> <slf4j.version>1.7.30</slf4j.version>
>> <spark.version>3.2.2</spark.version>
>> <hadoop.version>2.10.2</hadoop.version>
>> <maven-surefire-plugin.version>3.0.0-M5</maven-surefire-plugin.version>
>> <nemo.version>0.1</nemo.version>
>> <flink.artifact.name>beam-runners-flink-1.16</flink.artifact.name>
>>
>> I used this to build a shaded jar:
>>
>> $ mvn compile -Pspark-runner package
>>
>> Here's the stack trace:
>>
>
> Given that this raised NoClassDefFoundError (and
> not ClassNotFoundException), it's possible that the class initialization
> failed. Is there another exception before this one (maybe at the first
> occurrence of NoClassDefFoundError)?
>
>> Waiting for job output...
>> Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/beam/sdk/coders/CoderProviderRegistrar
>>   at java.base/java.lang.ClassLoader.defineClass1(Native Method)
>>   at java.base/java.lang.ClassLoader.defineClass(ClassLoader.java:1017)
>>   at java.base/java.security.SecureClassLoader.defineClass(SecureClassLoader.java:174)
>>   at java.base/jdk.internal.loader.BuiltinClassLoader.defineClass(BuiltinClassLoader.java:800)
>>   at java.base/jdk.internal.loader.BuiltinClassLoader.findClassOnClassPathOrNull(BuiltinClassLoader.java:698)
>>   at java.base/jdk.internal.loader.BuiltinClassLoader.loadClassOrNull(BuiltinClassLoader.java:621)
>>   at java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:579)
>>   at java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:178)
>>   at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:576)
>>   at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:522)
>>   at java.base/java.lang.Class.forName0(Native Method)
>>   at java.base/java.lang.Class.forName(Class.java:398)
>>   at java.base/java.util.ServiceLoader$LazyClassPathLookupIterator.nextProviderClass(ServiceLoader.java:1210)
>>   at java.base/java.util.ServiceLoader$LazyClassPathLookupIterator.hasNextService(ServiceLoader.java:1221)
>>   at java.base/java.util.ServiceLoader$LazyClassPathLookupIterator.hasNext(ServiceLoader.java:1265)
>>   at java.base/java.util.ServiceLoader$2.hasNext(ServiceLoader.java:1300)
>>   at java.base/java.util.ServiceLoader$3.hasNext(ServiceLoader.java:1385)
>>   at org.apache.beam.vendor.guava.v32_1_2_jre.com.google.common.collect.Iterators.addAll(Iterators.java:366)
>>   at org.apache.beam.vendor.guava.v32_1_2_jre.com.google.common.collect.Lists.newArrayList(Lists.java:146)
>>   at org.apache.beam.vendor.guava.v32_1_2_jre.com.google.common.collect.Lists.newArrayList(Lists.java:132)
>>   at org.apache.beam.sdk.coders.CoderRegistry.<clinit>(CoderRegistry.java:168)
>>   at org.apache.beam.sdk.Pipeline.getCoderRegistry(Pipeline.java:334)
>>   at org.apache.beam.sdk.values.PCollection.finishSpecifyingOutput(PCollection.java:94)
>>   at org.apache.beam.sdk.runners.TransformHierarchy.setOutput(TransformHierarchy.java:173)
>>   at org.apache.beam.sdk.Pipeline.applyInternal(Pipeline.java:546)
>>   at org.apache.beam.sdk.Pipeline.applyTransform(Pipeline.java:479)
>>   at org.apache.beam.sdk.values.PBegin.apply(PBegin.java:44)
>>   at org.apache.beam.sdk.Pipeline.apply(Pipeline.java:175)
>>   at org.apache.beam.sdk.io.Read$Bounded.expand(Read.java:150)
>>   at org.apache.beam.sdk.io.Read$Bounded.expand(Read.java:134)
>>   at org.apache.beam.sdk.Pipeline.applyInternal(Pipeline.java:545)
>>   at org.apache.beam.sdk.Pipeline.applyTransform(Pipeline.java:496)
>>   at org.apache.beam.sdk.values.PBegin.apply(PBegin.java:56)
>>   at org.apache.beam.sdk.io.TextIO$Read.expand(TextIO.java:413)
>>   at org.apache.beam.sdk.io.TextIO$Read.expand(TextIO.java:275)
>>   at org.apache.beam.sdk.Pipeline.applyInternal(Pipeline.java:545)
>>   at org.apache.beam.sdk.Pipeline.applyTransform(Pipeline.java:496)
>>   at org.apache.beam.sdk.values.PBegin.apply(PBegin.java:56)
>>   at org.apache.beam.sdk.Pipeline.apply(Pipeline.java:190)
>>   at org.apache.beam.examples.WordCount.runWordCount(WordCount.java:201)
>>   at org.apache.beam.examples.WordCount.main(WordCount.java:213)
>>   at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>   at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>>   at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>   at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>>   at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
>>   at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:958)
>>   at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
>>   at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
>>   at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
>>   at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1046)
>>   at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1055)
>>   at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
>> Caused by: java.lang.ClassNotFoundException: org.apache.beam.sdk.coders.CoderProviderRegistrar
>>   at java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:581)
>>   at java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:178)
>>   at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:522)
>>   ... 53 more
>
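For context on Chamikara's question above: when a class's static initializer throws, the JVM marks the class erroneous, and every later reference to it raises NoClassDefFoundError with no attached cause — so the real exception only appears at the *first* failure. A minimal standalone Java sketch of that behavior (my own illustration, not code from the thread; the `Broken` class is hypothetical):

```java
public class InitFailureDemo {
    static class Broken {
        // Static initializer that always fails; the if (true) guard is the
        // standard trick to satisfy the "can complete normally" compile check.
        static { if (true) throw new RuntimeException("init failed"); }
        static int value = 42;
    }

    /** Touches Broken twice and records the error type seen each time. */
    public static String[] touchTwice() {
        String[] seen = new String[2];
        for (int i = 0; i < 2; i++) {
            try {
                int ignored = Broken.value; // first touch runs <clinit>
            } catch (Throwable t) {
                seen[i] = t.getClass().getSimpleName();
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        String[] seen = touchTwice();
        // First use wraps the real cause; second use carries no cause at all.
        System.out.println("first use:  " + seen[0]); // ExceptionInInitializerError
        System.out.println("second use: " + seen[1]); // NoClassDefFoundError
    }
}
```

This is why scanning the driver log for the earliest NoClassDefFoundError (or an ExceptionInInitializerError before it) can reveal the actual root cause that the later, cause-less errors hide.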