Yeah, I understand the problem. One way around it is to place the Spark Connect jar directly in the `$SPARK_HOME/jars` folder; that is how we run Spark Connect ourselves. The `--packages` and `--jars` options are flaky with Spark Connect.
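Since your node is air-gapped, you would fetch the jar on a machine that does have internet access and carry it over. A rough sketch of what I mean is below; the artifact name assumes Spark 3.5.1 with Scala 2.12 (matching the coordinates in your error output), so adjust the version to your build:

    # On a machine with internet access, download the Spark Connect jar
    # (artifact and version taken from the error output quoted below).
    curl -O https://repo1.maven.org/maven2/org/apache/spark/spark-connect_2.12/3.5.1/spark-connect_2.12-3.5.1.jar

    # Transfer the jar to the air-gapped node (scp, physical media, etc.)
    # and drop it into the directory Spark puts on the classpath at startup.
    cp spark-connect_2.12-3.5.1.jar "$SPARK_HOME/jars/"

    # Start the server WITHOUT --packages/--jars; the jar is now picked up
    # from $SPARK_HOME/jars automatically.
    "$SPARK_HOME/sbin/start-connect-server.sh"

    # Check that the server is listening on the default port (15002).
    ss -ltnp | grep 15002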
In short: manually place the relevant Spark Connect jar file in the `$SPARK_HOME/jars` directory and remove the `--packages` or `--jars` option from your start command.

On Mon, Jul 29, 2024 at 7:01 PM Ilango <elango...@gmail.com> wrote:

> Thanks Prabodh. Yes, I can see the Spark Connect logs in the
> $SPARK_HOME/logs path. It seems to be a Spark Connect dependency issue.
> My Spark node is air-gapped, so no internet is allowed. Can I download
> the Spark Connect jar and pom files locally and share the local paths?
> How can I share the local jars?
>
> Error message:
>
> :: problems summary ::
> :::: WARNINGS
>         module not found: org.apache.spark#spark-connect_2.12;3.5.1
>
>         ==== local-m2-cache: tried
>
>           file:/root/.m2/repository/org/apache/spark/spark-connect_2.12/3.5.1/spark-connect_2.12-3.5.1.pom
>
>           -- artifact org.apache.spark#spark-connect_2.12;3.5.1!spark-connect_2.12.jar:
>
>           file:/root/.m2/repository/org/apache/spark/spark-connect_2.12/3.5.1/spark-connect_2.12-3.5.1.jar
>
>         ==== local-ivy-cache: tried
>
>           /root/.ivy2/local/org.apache.spark/spark-connect_2.12/3.5.1/ivys/ivy.xml
>
>           -- artifact org.apache.spark#spark-connect_2.12;3.5.1!spark-connect_2.12.jar:
>
>           /root/.ivy2/local/org.apache.spark/spark-connect_2.12/3.5.1/jars/spark-connect_2.12.jar
>
>         ==== central: tried
>
>           https://repo1.maven.org/maven2/org/apache/spark/spark-connect_2.12/3.5.1/spark-connect_2.12-3.5.1.pom
>
>           -- artifact org.apache.spark#spark-connect_2.12;3.5.1!spark-connect_2.12.jar:
>
>           https://repo1.maven.org/maven2/org/apache/spark/spark-connect_2.12/3.5.1/spark-connect_2.12-3.5.1.jar
>
>         ==== spark-packages: tried
>
>           https://repos.spark-packages.org/org/apache/spark/spark-connect_2.12/3.5.1/spark-connect_2.12-3.5.1.pom
>
>           -- artifact org.apache.spark#spark-connect_2.12;3.5.1!spark-connect_2.12.jar:
>
>           https://repos.spark-packages.org/org/apache/spark/spark-connect_2.12/3.5.1/spark-connect_2.12-3.5.1.jar
>
>         ::::::::::::::::::::::::::::::::::::::::::::::
>         ::          UNRESOLVED DEPENDENCIES         ::
>         ::::::::::::::::::::::::::::::::::::::::::::::
>         :: org.apache.spark#spark-connect_2.12;3.5.1: not found
>         ::::::::::::::::::::::::::::::::::::::::::::::
>
> :::: ERRORS
>         Server access error at url https://repo1.maven.org/maven2/org/apache/spark/spark-connect_2.12/3.5.1/spark-connect_2.12-3.5.1.pom (java.net.ConnectException: Connection timed out (Connection timed out))
>
>         Server access error at url https://repo1.maven.org/maven2/org/apache/spark/spark-connect_2.12/3.5.1/spark-connect_2.12-3.5.1.jar (java.net.ConnectException: Connection timed out (Connection timed out))
>
>         Server access error at url https://repos.spark-packages.org/org/apache/spark/spark-connect_2.12/3.5.1/spark-connect_2.12-3.5.1.pom (java.net.ConnectException: Connection timed out (Connection timed out))
>
>         Server access error at url https://repos.spark-packages.org/org/apache/spark/spark-connect_2.12/3.5.1/spark-connect_2.12-3.5.1.jar (java.net.ConnectException: Connection timed out (Connection timed out))
>
> :: USE VERBOSE OR DEBUG MESSAGE LEVEL FOR MORE DETAILS
>
> Exception in thread "main" java.lang.RuntimeException: [unresolved dependency: org.apache.spark#spark-connect_2.12;3.5.1: not found]
>         at org.apache.spark.deploy.SparkSubmitUtils$.resolveMavenCoordinates(SparkSubmit.scala:1608)
>         at org.apache.spark.util.DependencyUtils$.resolveMavenDependencies(DependencyUtils.scala:185)
>         at org.apache.spark.deploy.SparkSubmit.prepareSubmitEnvironment(SparkSubmit.scala:334)
>         at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:964)
>         at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:194)
>         at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:217)
>         at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:91)
>         at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1120)
>         at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1129)
>         at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
>
> Thanks,
> Elango
>
> On Mon, 29 Jul 2024 at 6:45 PM, Prabodh Agarwal <prabodh1...@gmail.com> wrote:
>
>> The Spark Connect startup prints the log location. Is that not feasible
>> for you? For me, the logs come to $SPARK_HOME/logs.
>>
>> On Mon, 29 Jul, 2024, 15:30 Ilango, <elango...@gmail.com> wrote:
>>
>>> Hi all,
>>>
>>> I am facing issues with a Spark Connect application running on a Spark
>>> standalone cluster (without YARN and HDFS). After executing the
>>> start-connect-server.sh script with the specified packages, I observe a
>>> process ID for a short period but am unable to see the corresponding
>>> port (default 15002) associated with that PID. The process stops
>>> automatically after around 10 minutes.
>>>
>>> Since the Spark History Server is not enabled, I am unable to locate
>>> the relevant logs or error messages. The logs for currently running
>>> Spark applications are accessible from the Spark UI, but I am unsure
>>> where to find the logs for the Spark Connect application and service.
>>>
>>> Could you please advise on where to find the logs or error messages
>>> related to Spark Connect?
>>>
>>> Thanks,
>>> Elango