Thanks Alonso, I think this gives me some ideas. My code is written in Python, and I use spark-submit to submit it. I am not sure what code is written in Scala. Maybe the Phoenix driver, based on the stack trace? How do I tell which version of Scala it was compiled against?

Is there a jar that I need to add to the Spark or HBase classpath?
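One way to check which Scala a jar was built against, as a rough sketch (the META-INF path below is a guess at how the Phoenix pom is laid out inside the jar, so list the archive first to find the real path):

    # List the Maven descriptors bundled in the jar; the path encodes
    # groupId/artifactId, and a _2.11 / _2.12 suffix names the Scala target
    unzip -l phoenix-spark-5.0.0-HBase-2.0.jar | grep 'pom\.xml'

    # Print a bundled pom and look for a scala.version property
    # (assumed path; adjust to what the listing above actually shows)
    unzip -p phoenix-spark-5.0.0-HBase-2.0.jar \
        'META-INF/maven/org.apache.phoenix/phoenix-spark/pom.xml' | grep -i scala

    # A cruder tell: jars compiled with Scala 2.11 or earlier ship
    # Trait$class.class files; Scala 2.12+ builds do not emit them
    unzip -l phoenix-spark-5.0.0-HBase-2.0.jar | grep '\$class\.class'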
On Thursday, November 2, 2023 at 01:38:21 AM PDT, Aironman DirtDiver <alons...@gmail.com> wrote:

The error message

    Caused by: java.lang.ClassNotFoundException: scala.Product$class

indicates that the Spark job is trying to load a class that is not available on the classpath. This can happen when the job is compiled with a different version of Scala than the one used to run it.

You mentioned that you are using Spark 3.5.0, which is compatible with Scala 2.12. However, you also mentioned trying Scala 2.10, 2.11, 2.12, and 2.13, which suggests you may have multiple versions of Scala installed on your system.

To resolve the issue, make sure the Spark job is compiled and run with the same version of Scala. You can do this by setting the SPARK_SCALA_VERSION environment variable to the desired Scala version before starting the job. For example, to compile the job with Scala 2.12:

    SPARK_SCALA_VERSION=2.12 sbt compile

and to run it with Scala 2.12:

    SPARK_SCALA_VERSION=2.12 spark-submit spark-job.jar

If you are using Databricks, you can set the Scala version for the Spark cluster in the cluster creation settings. Once the job is compiled and run with the same version of Scala, the error should be resolved.

Some additional tips for troubleshooting Scala version conflicts:

* Make sure you are using the correct version of the Spark libraries. They must be compiled with the same version of Scala as the Spark job.
* If you are using a third-party library, make sure it is compatible with the version of Scala you are using.
* Check the Spark logs for ClassNotFoundExceptions. The logs may indicate the specific class that is missing from the classpath.
* Use a tool like sbt's dependencyTree task (or mvn dependency:tree for Maven builds) to view the dependencies of your Spark job. This can help you identify conflicting dependencies.
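For what it's worth, scala.Product$class only exists in Scala 2.11 and earlier; the trait encoding changed in 2.12, so this error is a reliable sign that a jar built for Scala 2.11 (here, phoenix-spark-5.0.0-HBase-2.0.jar, which dates from the Spark 2.x / Scala 2.11 era) is being loaded by a Scala 2.12/2.13 Spark such as 3.5.0. A minimal sketch of the fix, assuming the Spark 3 connector from the Apache phoenix-connectors project (the exact jar names below are assumptions; match them to your Phoenix 5.1.3 / HBase 2.5 downloads):

    # Remove the Scala 2.11 connector (phoenix-spark-5.0.0-HBase-2.0.jar)
    # from $SPARK_HOME/jars, then submit with a Spark 3 / Scala 2.12 build
    # of the connector and the matching Phoenix client on the classpath:
    spark-submit \
        --jars /path/to/phoenix5-spark3-connector.jar,/path/to/phoenix-client-embedded-hbase-2.5-5.1.3.jar \
        /hadoop/spark/spark-3.5.0-bin-hadoop3/copy_tables.py

Since the job itself is PySpark, there is nothing to recompile on your side; only the jars on the driver and executor classpath have to agree on a Scala version.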
On Thu, Nov 2, 2023 at 5:39 AM, Harry Jamison (<harryjamiso...@yahoo.com.invalid>) wrote:

> I am getting the error below when I try to run a Spark job connecting to
> Phoenix. It seems like some part of the code is expecting a different Scala
> version than the one I have.
>
> I am using Spark 3.5.0, and I have copied these Phoenix jars into the Spark
> lib directory:
> phoenix-server-hbase-2.5-5.1.3.jar
> phoenix-spark-5.0.0-HBase-2.0.jar
>
> I have tried Scala 2.10, 2.11, 2.12, and 2.13.
> I do not see the Scala version used in the logs, so I am not 100% sure that
> it is using the version I expect.
>
> Here is the exception that I am getting:
>
> 2023-11-01T16:13:00,391 INFO [Thread-4] handler.ContextHandler: Started o.s.j.s.ServletContextHandler@15cd3b2a{/static/sql,null,AVAILABLE,@Spark}
> Traceback (most recent call last):
>   File "/hadoop/spark/spark-3.5.0-bin-hadoop3/copy_tables.py", line 10, in <module>
>     .option("zkUrl", "namenode:2181").load()
>   File "/hadoop/spark/spark/python/lib/pyspark.zip/pyspark/sql/readwriter.py", line 314, in load
>   File "/hadoop/spark/spark/python/lib/py4j-0.10.9.7-src.zip/py4j/java_gateway.py", line 1322, in __call__
>   File "/hadoop/spark/spark/python/lib/pyspark.zip/pyspark/errors/exceptions/captured.py", line 179, in deco
>   File "/hadoop/spark/spark/python/lib/py4j-0.10.9.7-src.zip/py4j/protocol.py", line 326, in get_return_value
> py4j.protocol.Py4JJavaError: An error occurred while calling o28.load.
> : java.lang.NoClassDefFoundError: scala/Product$class
>     at org.apache.phoenix.spark.PhoenixRelation.<init>(PhoenixRelation.scala:29)
>     at org.apache.phoenix.spark.DefaultSource.createRelation(DefaultSource.scala:29)
>     at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:346)
>     at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:229)
>     at org.apache.spark.sql.DataFrameReader.$anonfun$load$2(DataFrameReader.scala:211)
>     at scala.Option.getOrElse(Option.scala:189)
>     at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:211)
>     at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:172)
>     at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>     at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>     at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
>     at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:374)
>     at py4j.Gateway.invoke(Gateway.java:282)
>     at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
>     at py4j.commands.CallCommand.execute(CallCommand.java:79)
>     at py4j.ClientServerConnection.waitForCommands(ClientServerConnection.java:182)
>     at py4j.ClientServerConnection.run(ClientServerConnection.java:106)
>     at java.base/java.lang.Thread.run(Thread.java:829)
> Caused by: java.lang.ClassNotFoundException: scala.Product$class
>     at java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:581)
>     at java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:178)
>     at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:527)
>     ... 20 more

--
Alonso Isidoro Roman
about.me/alonso.isidoro.roman

---------------------------------------------------------------------
To unsubscribe e-mail: user-unsubscr...@spark.apache.org