We've had good luck putting the modules on an NFS-mounted file system.
Along with that, I suggest creating /etc/profile.d/zmodule.sh containing the single line
module use <file-system>/modules
then symlink /etc/profile.d/zmodule.csh to it, and set this up on all
login and compute nodes.
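
A minimal sketch of that setup, for illustration only. MODULE_ROOT and PROFILE_DIR are placeholder variables that are not part of the advice above: MODULE_ROOT stands in for <file-system>, and PROFILE_DIR defaults to a scratch directory so the script can be tried without root. On real nodes PROFILE_DIR would be /etc/profile.d. The "z" prefix presumably makes the file sort after whichever profile.d script defines the module command, though that depends on how Environment Modules was installed.

```shell
#!/bin/sh
# Sketch: write the profile.d snippet and link the csh variant to it.
# MODULE_ROOT and PROFILE_DIR are assumed placeholders, not real site paths.
set -eu

MODULE_ROOT="${MODULE_ROOT:-/shared}"           # hypothetical shared mount point
PROFILE_DIR="${PROFILE_DIR:-./profile.d-demo}"  # /etc/profile.d on real nodes (needs root)

mkdir -p "$PROFILE_DIR"

# "module use <dir>" has the same syntax in sh and csh,
# which is why one file can serve both shells.
printf 'module use %s/modules\n' "$MODULE_ROOT" > "$PROFILE_DIR/zmodule.sh"

# Relative link: zmodule.csh points at zmodule.sh in the same directory.
ln -sf zmodule.sh "$PROFILE_DIR/zmodule.csh"

# Show what a csh login would read through the symlink.
cat "$PROFILE_DIR/zmodule.csh"
```

The same two files would then need to be present on every login and compute node, by whatever means the site uses to distribute files.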
Andy
On 10/04/2017 12:10 PM, Mike Cammilleri wrote:
Hi Everyone,
I'm in search of a best practice for setting up Environment Modules for our
Slurm 16.05.6 installation (we have not had the time to upgrade to 17.02 yet).
We're a small group and had no explicit need for this in the beginning, but as
we are growing larger with more users we clearly need something like this.
I see there are a couple ways to implement Environment Modules and I'm
wondering which would be the cleanest, most sensible way. I'll list my ideas
below:
1. Install the Environment Modules package and relevant modulefiles on the Slurm
head/submit/login node, perhaps in the default /usr/local/ location. The
modulefiles would define paths to various software packages that exist
in a location visible/readable to the compute nodes (NFS or similar). The user
then loads the modules manually at the command line on the submit/login node,
not in the Slurm submit script, and specifies #SBATCH --export=ALL so that the
environment is imported when the sbatch job is submitted.
2. Install the Environment Modules package in a location visible to the entire
cluster (NFS or similar), including the compute nodes. The user then
includes their 'module load' commands in their actual Slurm submit scripts,
since the command would be available on the compute nodes, and loads software
(either local or from network locations, depending on what they're loading)
that is visible to the nodes.
3. Another variation would be to use a configuration manager like bcfg2 to make
sure Environment Modules and necessary modulefiles and all configurations are
present on all compute/submit nodes. Seems like that's potential for a mess
though.
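
For illustration, a submit script under option 2 might look roughly like the sketch below. The module name myapp/1.0 and the myapp command are hypothetical, and the "-l" on the shebang line is an assumption: it asks bash for a login shell so that the /etc/profile.d scripts are sourced and the module command is defined on the compute node. Whether that is needed depends on how the site initializes Environment Modules.

```shell
#!/bin/bash -l
#SBATCH --job-name=module-demo
#SBATCH --ntasks=1
#SBATCH --time=00:10:00

# Start from a clean module state instead of relying on whatever
# was loaded in the shell that ran sbatch.
module purge

# Hypothetical module name; replace with a modulefile that exists on the cluster.
module load myapp/1.0

# Hypothetical program provided by the module above.
srun myapp --version
```

Under option 1 the two module lines would instead be typed at the login-node prompt before running sbatch, and the script would carry "#SBATCH --export=ALL".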
Is there a preferred approach? I see in the archives some folks have run into strange
behavior when a user uses --export=ALL, so it would seem to me that the cleaner
approach is to have the 'module load' command available on all compute nodes
and have users do this in their submit scripts. If this is the case, I'll need
to configure Environment Modules and relevant modulefiles to live in special
places when I build Environment Modules (./configure --prefix=/mounted-fs
--modulefilesdir=/mounted-fs, etc.).
We've been testing with modules-tcl-1.923.
Thanks for any advice,
mike