Hi,

Here is a small patch that solves this issue.
Looking at all the scripts, I'm not sure whether sbin/stop-workers.sh and
sbin/stop-worker.sh need a similar change.
Do they actually need SPARK_CONF_DIR to do their job?

Note that I have also removed the following part of the command passed
to workers.sh:
cd "${SPARK_HOME}" \;
It doesn't seem useful unless the current working directory matters to
the remote command, and it shouldn't.

In the same spirit of minimalism, I think it could also be removed
from these call sites:

sbin/start-workers.sh:46:"${SPARK_HOME}/sbin/workers.sh" cd "${SPARK_HOME}" \; "${SPARK_HOME}/sbin/start-worker.sh" "spark://$SPARK_MASTER_HOST:$SPARK_MASTER_PORT"
sbin/stop-workers.sh:28:"${SPARK_HOME}/sbin/workers.sh" cd "${SPARK_HOME}" \; "${SPARK_HOME}/sbin"/stop-worker.sh
sbin/spark-daemons.sh:36:exec "${SPARK_HOME}/sbin/workers.sh" cd "${SPARK_HOME}" \; "${SPARK_HOME}/sbin/spark-daemon.sh" "$@"

Regards,
Patrice

On Thu, Jul 18, 2024 at 15:34, Patrice Duroux
<patrice.dur...@gmail.com> wrote:
>
> Hi,
>
> I'm trying to build a SLURM script that starts a Spark environment
> (master + workers) configured dynamically by the job submitted to
> the queue. The problem is that, during job execution, the workers
> are all started with a default (empty) configuration.
> How could I "forward" SPARK_CONF_DIR at this step?
> Using SPARK_SSH_OPTS in sbin/workers.sh is of no help, because
> adding -o SendEnv requires a matching authorization in sshd.
> Is there any way to add options/parameters to the ssh command?
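>
> (For reference, this is what the SendEnv route would require; the
> sshd side is the blocker on a shared cluster:)
>
> # client side: ask ssh to forward the variable on each connection
> export SPARK_SSH_OPTS="-o SendEnv=SPARK_CONF_DIR"
> # server side: sshd ignores it unless /etc/ssh/sshd_config contains
> #   AcceptEnv SPARK_CONF_DIR
> # and changing that requires administrator rights on every node.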
> Currently, here is the corresponding call in start-workers.sh
>
> "${SPARK_HOME}/sbin/workers.sh" cd "${SPARK_HOME}" \;
> "${SPARK_HOME}/sbin/start-worker.sh"
> "spark://$SPARK_MASTER_HOST:$SPARK_MASTER_PORT"
>
> Also, modifying files like .profile or .bashrc is risky and is not
> a solution here anyway, because each job has its own conf_dir and
> multiple jobs may run in parallel.
>
> Many thanks!
>
> Regards,
> Patrice
>
> Here is a sample of such a script:
>
> #!/usr/bin/sh
>
> #SBATCH -N 2
> #SBATCH --time=00:05:00
>
> SPARK_HOME="$WORK"/spark-3.5.1-bin-hadoop3
>
> create_spark_conf(){
>   # Per-job scratch dir (created under the current directory)
>   export SPARK_LOCAL_DIRS=$(mktemp -d spark-XXXXXXXX)
>   export SPARK_CONF_DIR="$SPARK_LOCAL_DIRS"/conf
>   mkdir -p "$SPARK_CONF_DIR"
>   # Per-job spark-env.sh, read by load-spark-env.sh at daemon startup
>   echo "export SPARK_LOCAL_DIRS=\"$(realpath "$SPARK_LOCAL_DIRS")\"
> export SPARK_CONF_DIR=\"$(realpath "$SPARK_LOCAL_DIRS")/conf\"
> export SPARK_LOG_DIR=\"$(realpath "$SPARK_LOCAL_DIRS")/logs\"
> module load openjdk/11.0.2
> " > "$SPARK_CONF_DIR"/spark-env.sh
>   # One worker per node allocated by SLURM
>   scontrol show hostname "$SLURM_JOB_NODELIST" > "$SPARK_CONF_DIR"/workers
> }
>
> cd "$SCRATCH"
> create_spark_conf
> "$SPARK_HOME"/sbin/start-all.sh
> "$SPARK_HOME"/bin/spark-submit "$HOME"/testspark-0.0.1-SNAPSHOT.jar "$@"
> "$SPARK_HOME"/sbin/stop-all.sh
diff --git a/sbin/start-worker.sh b/sbin/start-worker.sh
index fd58f01..bb808d5 100755
--- a/sbin/start-worker.sh
+++ b/sbin/start-worker.sh
@@ -39,8 +39,8 @@ fi
 # Any changes need to be reflected there.
 CLASS="org.apache.spark.deploy.worker.Worker"
 
-if [[ $# -lt 1 ]] || [[ "$@" = *--help ]] || [[ "$@" = *-h ]]; then
-  echo "Usage: ./sbin/start-worker.sh <master> [options]"
+if [[ $# -lt 1 ]] || [[ "$@" = *--help ]] || [[ "$@" = *-h ]] || [[ "$@" = *--config ]]; then
+  echo "Usage: ./sbin/start-worker.sh [--config <conf-dir>] <master> [options]"
   pattern="Usage:"
   pattern+="\|Using Spark's default log4j profile:"
   pattern+="\|Started daemon with process name"
@@ -52,6 +52,22 @@ fi
 
 . "${SPARK_HOME}/sbin/spark-config.sh"
 
+# Check if --config is passed as an argument. It is an optional parameter.
+# Exit if the argument is not a directory.
+if [ "$1" == "--config" ]
+then
+  shift
+  conf_dir="$1"
+  if [ ! -d "$conf_dir" ]
+  then
+    echo "ERROR: $conf_dir is not a directory"
+    exit 1
+  else
+    export SPARK_CONF_DIR="$conf_dir"
+  fi
+  shift
+fi
+
 . "${SPARK_HOME}/bin/load-spark-env.sh"
 
 # First argument should be the master; we need to store it aside because we may
diff --git a/sbin/start-workers.sh b/sbin/start-workers.sh
index 3867ef3..c891545 100755
--- a/sbin/start-workers.sh
+++ b/sbin/start-workers.sh
@@ -43,4 +43,4 @@ if [ "$SPARK_MASTER_HOST" = "" ]; then
 fi
 
 # Launch the workers
-"${SPARK_HOME}/sbin/workers.sh" cd "${SPARK_HOME}" \; "${SPARK_HOME}/sbin/start-worker.sh" "spark://$SPARK_MASTER_HOST:$SPARK_MASTER_PORT"
+"${SPARK_HOME}/sbin/workers.sh" "${SPARK_HOME}/sbin/start-worker.sh" --config "$SPARK_CONF_DIR" "spark://$SPARK_MASTER_HOST:$SPARK_MASTER_PORT"