Greetings!

We're using Slurm 23.11.8 on a small cluster.
The control node shares the directory /clusterprograms with the compute 
nodes via NFS (mounted at the same path) to provide access to the available 
software.
Users are instructed to use 
lmod<https://lmod.readthedocs.io/en/latest/060_locating.html> (module 
load/avail/purge) to set up their job's runtime environment.
The software and modules under /clusterprograms are laid out like this:

/clusterprograms/
├── common/
├── gpu/
│   ├── cuda8/
│   └── cuda10/
└── cpu/
    ├── intel/
    └── amd/

We configured the /etc/profile.d/ directory on each node so that, when we log 
in to a node directly (via ssh), the only paths in MODULEPATH are these:

 *   /home/<user>/modulefiles
 *   /clusterprograms/common/modules
 *   /clusterprograms/<processor vendor or cuda version>/modules

The exception is the control/login node, which only has 
/home/<user>/modulefiles and /clusterprograms/common/modules.
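
To make the setup above concrete, here is a minimal sketch of a profile.d 
fragment that would produce those MODULEPATH values. It is illustrative only: 
the file name, the build_modulepath helper and the NODE_ARCH_DIR variable are 
hypothetical names chosen for this example, not a copy of our real files.

```shell
#!/bin/sh
# Hypothetical /etc/profile.d/zz-modulepath.sh (the file name is an assumption).

# build_modulepath HOME_DIR [ARCH_DIR]
# Prints the MODULEPATH for a node. ARCH_DIR is the node-specific subdirectory
# of /clusterprograms (for example "gpu/cuda10" or "cpu/amd"); when it is
# empty, only the user and common module directories are returned.
build_modulepath() (
    home_dir="$1"
    arch_dir="${2:-}"
    path="${home_dir}/modulefiles:/clusterprograms/common/modules"
    if [ -n "${arch_dir}" ]; then
        path="${path}:/clusterprograms/${arch_dir}/modules"
    fi
    printf '%s\n' "${path}"
)

# NODE_ARCH_DIR is a hypothetical per-node setting; it would be left unset on
# the control/login node.
MODULEPATH="$(build_modulepath "${HOME}" "${NODE_ARCH_DIR:-}")"
export MODULEPATH
```

With NODE_ARCH_DIR unset this yields the two-entry path of the control/login 
node; with it set, the node-specific modules directory is appended.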

The issue we have now is this:
We are looking for a safe and practical way to automatically reconfigure 
MODULEPATH<https://lmod.readthedocs.io/en/latest/077_ref_counting.html> on the 
compute nodes when a job is submitted.
So far, the only method that "works" is the following TaskProlog script, 
but it forces users to load their modules inside the scripts 
launched by srun steps, rather than being able to load them in the 
job's base step:

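#!/bin/bash
# TaskProlog: each "export NAME=value" line printed on stdout is added to
# the environment of the task being launched.
if [[ -n "${SLURM_JOB_CONSTRAINTS}" ]]; then
   # Keep the unmodified value so it can be restored later.
   echo "export ORIGINAL_MPATH=${MODULEPATH}"
   # Accumulate here so several matching constraints do not overwrite each other.
   new_mpath="${MODULEPATH}"
   IFS=',' read -ra JOB_ARCH_CONSTRAINTS <<< "${SLURM_JOB_CONSTRAINTS}"
   for constraint in "${JOB_ARCH_CONSTRAINTS[@]}"; do
       case "${constraint}" in
           cuda*)
               new_mpath="/clusterprograms/gpu/${constraint}:${new_mpath}"
           ;;
           intel-*)
               new_mpath="/clusterprograms/cpu/intel/${constraint}:${new_mpath}"
           ;;
           amd-*)
               new_mpath="/clusterprograms/cpu/amd/${constraint}:${new_mpath}"
           ;;
       esac
   done
   # Emit a single export, and only if some constraint matched.
   if [[ "${new_mpath}" != "${MODULEPATH}" ]]; then
       echo "export MODULEPATH=${new_mpath}"
   fi
fi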
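
For anyone who wants to check what such a script does without submitting a 
job, the sketch below roughly imitates how TaskProlog output is consumed: 
lines of the form "export NAME=value" set a variable for the task and 
"unset NAME" removes one. The apply_taskprolog_output helper is a 
hypothetical name made up for this example, not a Slurm tool, and the sample 
lines are illustrative values rather than real output from our cluster.

```shell
#!/bin/sh
# Rough local imitation of how TaskProlog stdout is interpreted.
# Only "export NAME=value" and "unset NAME" lines are handled; anything else
# is ignored.
apply_taskprolog_output() {
    while IFS= read -r line; do
        case "${line}" in
            "export "*=*)
                assignment="${line#export }"
                # NAME is everything before the first "=", value is the rest.
                export "${assignment%%=*}=${assignment#*=}"
                ;;
            "unset "*)
                unset "${line#unset }"
                ;;
        esac
    done
}

# Feed sample output through a here-document so the loop runs in the current
# shell (a pipe would run it in a subshell and lose the variables).
apply_taskprolog_output <<'EOF'
export ORIGINAL_MPATH=/clusterprograms/common/modules
export MODULEPATH=/clusterprograms/gpu/cuda10:/clusterprograms/common/modules
EOF
printf '%s\n' "${MODULEPATH}"
```

The same helper could be fed the real script's output captured to a file, 
after running the script by hand with SLURM_JOB_CONSTRAINTS set.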

Is there a better way to do this? Why doesn't it work if we use 
Prolog instead? I also tried the --export option, but it doesn't help, since 
the control node sets MODULEPATH in the main shell profiles.

Many thanks!


--

Daniel Garrapucho Lévy

IT Technician

Departament de Física de la Matèria Condensada
Facultat de Física

Martí i Franquès, 1
08028 Barcelona, SPAIN
Office V302
Email: daniel.garrapu...@ub.edu

-- 
slurm-users mailing list -- slurm-users@lists.schedmd.com