Gilles,
Adding an ess component list that excludes slurm and slurmd, I ran into a
connection issue. I guess I need slurm and slurmd in my runtime context!
Anyway, as you mentioned, that is not a good solution because of the MPI
processes left behind when using scancel, and I guess I would also lose
SLURM accounting.
"mpirun takes the #slots for each node from the slurm allocation."
Yes, this is my issue and what I was not expecting. But I will stick with
the --bynode solution.
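For reference, the kind of invocation I mean (process count and application
name are just placeholders):

    # --bynode assigns ranks round-robin across the allocated nodes
    # instead of filling all the slots on the first node first
    mpirun --bynode -np 4 ./my_app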
Thanks a lot for your help.
Regards,
Nicolas
2018-05-17 14:33 GMT+02:00 r...@open-mpi.org :
mpirun takes the #slots for each node from the slurm allocation. Your hostfile
(at least, what you provided) retained that information and shows 2 slots on
each node. So both the original allocation _and_ your constructed hostfile
are telling mpirun to assign 2 slots on each node.
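As an illustration, a hostfile like the one below (host names are
placeholders) already carries the slot counts on its own, independently of
the allocation:

    node01 slots=2
    node02 slots=2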
Like I s
Nicolas,
This looks odd at first glance, but as stated before, 1.6 is an obsolete
series.
A workaround could be to
mpirun --mca ess ...
And replace ... with a comma-separated list of ess components that excludes
both slurm and slurmd.
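For instance, something along these lines; the component names below are
only an example and should be checked against the ompi_info output of your
build:

    # list the ess components available in this install
    ompi_info | grep "MCA ess"
    # then pass every component except slurm and slurmd, e.g.
    mpirun --mca ess hnp,env,singleton ...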
Another workaround could be to remove the SLURM-related environment
variables.
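For example, a rough sketch for the launch script; it simply unsets every
SLURM_* variable so the SLURM components no longer detect the allocation:

    for v in $(env | grep '^SLURM' | cut -d= -f1); do
        unset "$v"
    done
    mpirun ...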
Hi all,
Thanks for your feedback.
About using "mpirun --mca ras ^slurm --mca plm ^slurm --mca ess
^slurm,slurmd ...": I am a bit confused since the syntax sounds good, but I
keep getting the following error at run time:
*-
The problem here is that you have made an incorrect assumption. In the older
OMPI versions, the -H option simply indicated that the specified hosts were
available for use - it did not imply the number of slots on that host. Since
you have specified 2 slots on each host, and you told mpirun to la
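As a small illustration of that older behaviour (host names and application
are placeholders):

    # in the 1.6 series, -H only marks these hosts as usable; how many
    # processes land on each one still comes from the slot counts in the
    # allocation / hostfile
    mpirun -H node01,node02 -np 4 ./my_app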
You can try to disable SLURM:
mpirun --mca ras ^slurm --mca plm ^slurm --mca ess ^slurm,slurmd ...
That will require that you are able to SSH between compute nodes.
Keep in mind this is far from ideal since it might leave some MPI processes
running on nodes if you cancel a job, and mess up SLURM accounting too.
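A quick way to check that SSH prerequisite from one of the compute nodes
(the node name is just an example):

    # should print the remote hostname without asking for a password
    ssh node02 hostname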
Hi all,
I am trying to run an MPI application through the SLURM job scheduler. Here
is my running sequence:
sbatch --> my_env_script.sh --> my_run_script.sh --> mpirun
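A stripped-down sketch of that chain (the SBATCH options, paths and my_app
are placeholders, not my real setup):

    my_env_script.sh, submitted with "sbatch my_env_script.sh":
        #!/bin/bash
        #SBATCH --nodes=2
        #SBATCH --ntasks-per-node=2
        export PATH=/opt/openmpi-1.6/bin:$PATH
        ./my_run_script.sh

    my_run_script.sh, the actual launch:
        #!/bin/bash
        mpirun ./my_app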
In order to minimize modifications to my production environment, I had to
set up the following hostlist management in the different scripts: