Thanks Prabodh. Yes, I can see the Spark Connect logs under $SPARK_HOME/logs.
It looks like a Spark Connect dependency resolution issue. My Spark node is
air-gapped, so no internet access is allowed. Can I download the Spark Connect
jar and pom files on a connected machine and point Spark at the local paths?
How do I supply the local jars?
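
Something like the following is what I have in mind (a rough sketch; the
mvn/rsync steps assume a separate machine with internet access, and the host
and paths are illustrative):

  # On a machine with internet access, pull the artifact and its
  # transitive dependencies into a local Maven repository.
  mvn dependency:get -Dartifact=org.apache.spark:spark-connect_2.12:3.5.1

  # Copy that repository to the air-gapped node, matching the
  # local-m2-cache layout the resolver already checks.
  rsync -a ~/.m2/repository/ root@<airgapped-host>:/root/.m2/repository/

  # Then --packages resolution should succeed from local-m2-cache,
  # without touching Maven Central.
  ./sbin/start-connect-server.sh \
    --packages org.apache.spark:spark-connect_2.12:3.5.1

Or would passing the jar directly with --jars be the better route?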
Error message:
:: problems summary ::
:::: WARNINGS
        module not found: org.apache.spark#spark-connect_2.12;3.5.1

    ==== local-m2-cache: tried
      file:/root/.m2/repository/org/apache/spark/spark-connect_2.12/3.5.1/spark-connect_2.12-3.5.1.pom
      -- artifact org.apache.spark#spark-connect_2.12;3.5.1!spark-connect_2.12.jar:
      file:/root/.m2/repository/org/apache/spark/spark-connect_2.12/3.5.1/spark-connect_2.12-3.5.1.jar

    ==== local-ivy-cache: tried
      /root/.ivy2/local/org.apache.spark/spark-connect_2.12/3.5.1/ivys/ivy.xml
      -- artifact org.apache.spark#spark-connect_2.12;3.5.1!spark-connect_2.12.jar:
      /root/.ivy2/local/org.apache.spark/spark-connect_2.12/3.5.1/jars/spark-connect_2.12.jar

    ==== central: tried
      https://repo1.maven.org/maven2/org/apache/spark/spark-connect_2.12/3.5.1/spark-connect_2.12-3.5.1.pom
      -- artifact org.apache.spark#spark-connect_2.12;3.5.1!spark-connect_2.12.jar:
      https://repo1.maven.org/maven2/org/apache/spark/spark-connect_2.12/3.5.1/spark-connect_2.12-3.5.1.jar

    ==== spark-packages: tried
      https://repos.spark-packages.org/org/apache/spark/spark-connect_2.12/3.5.1/spark-connect_2.12-3.5.1.pom
      -- artifact org.apache.spark#spark-connect_2.12;3.5.1!spark-connect_2.12.jar:
      https://repos.spark-packages.org/org/apache/spark/spark-connect_2.12/3.5.1/spark-connect_2.12-3.5.1.jar

        ::::::::::::::::::::::::::::::::::::::::::::::
        ::          UNRESOLVED DEPENDENCIES         ::
        ::::::::::::::::::::::::::::::::::::::::::::::
        :: org.apache.spark#spark-connect_2.12;3.5.1: not found
        ::::::::::::::::::::::::::::::::::::::::::::::

:::: ERRORS
    Server access error at url https://repo1.maven.org/maven2/org/apache/spark/spark-connect_2.12/3.5.1/spark-connect_2.12-3.5.1.pom
      (java.net.ConnectException: Connection timed out (Connection timed out))
    Server access error at url https://repo1.maven.org/maven2/org/apache/spark/spark-connect_2.12/3.5.1/spark-connect_2.12-3.5.1.jar
      (java.net.ConnectException: Connection timed out (Connection timed out))
    Server access error at url https://repos.spark-packages.org/org/apache/spark/spark-connect_2.12/3.5.1/spark-connect_2.12-3.5.1.pom
      (java.net.ConnectException: Connection timed out (Connection timed out))
    Server access error at url https://repos.spark-packages.org/org/apache/spark/spark-connect_2.12/3.5.1/spark-connect_2.12-3.5.1.jar
      (java.net.ConnectException: Connection timed out (Connection timed out))

:: USE VERBOSE OR DEBUG MESSAGE LEVEL FOR MORE DETAILS
Exception in thread "main" java.lang.RuntimeException: [unresolved dependency: org.apache.spark#spark-connect_2.12;3.5.1: not found]
    at org.apache.spark.deploy.SparkSubmitUtils$.resolveMavenCoordinates(SparkSubmit.scala:1608)
    at org.apache.spark.util.DependencyUtils$.resolveMavenDependencies(DependencyUtils.scala:185)
    at org.apache.spark.deploy.SparkSubmit.prepareSubmitEnvironment(SparkSubmit.scala:334)
    at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:964)
    at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:194)
    at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:217)
    at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:91)
    at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1120)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1129)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Thanks,
Elango
On Mon, 29 Jul 2024 at 6:45 PM, Prabodh Agarwal <[email protected]>
wrote:
> The Spark Connect startup prints the log location. Is that not an option
> for you?
> For me the logs go to $SPARK_HOME/logs.
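>
> For example, something like this should show the most recent Spark Connect
> server log (the exact file name depends on your user and host, so the glob
> is illustrative):
>
>   tail -n 100 $SPARK_HOME/logs/*SparkConnectServer*.out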
>
> On Mon, 29 Jul, 2024, 15:30 Ilango, <[email protected]> wrote:
>
>>
>> Hi all,
>>
>>
>> I am facing issues with a Spark Connect application running on a Spark
>> standalone cluster (without YARN and HDFS). After executing the
>> start-connect-server.sh script with the specified packages, I observe a
>> process ID for a short period but am unable to see the corresponding port
>> (default 15002) associated with that PID. The process automatically stops
>> after around 10 minutes.
>>
>> Since the Spark History server is not enabled, I am unable to locate the
>> relevant logs or error messages. The logs for currently running Spark
>> applications are accessible from the Spark UI, but I am unsure where to
>> find the logs for the Spark Connect application and service.
>>
>> Could you please advise on where to find the logs or error messages
>> related to Spark Connect?
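>>
>> For reference, this is roughly how I start the server and check for the
>> listener (the --packages coordinates are from the Spark 3.5 docs; the ss
>> check is just one way to verify the port):
>>
>>   ./sbin/start-connect-server.sh \
>>     --packages org.apache.spark:spark-connect_2.12:3.5.1
>>   ss -ltnp | grep 15002   # default Spark Connect port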
>>
>> Thanks,
>> Elango
>>
>