On 3/22/19 12:40 PM, Reuti wrote:
On 22.03.2019 at 16:20, Prentice Bisbal <pbis...@pppl.gov> wrote:
On 3/21/19 6:56 PM, Reuti wrote:
On 21.03.2019 at 23:43, Prentice Bisbal wrote:
Slurm-users,
My users here have developed a GUI application that serves as a front end
to various physics codes they use. From this GUI, they can submit jobs to
Slurm. On Tuesday, we upgraded Slurm from 18.08.5-2 to 18.08.6-2, and a user has
reported a problem when submitting Slurm jobs through this GUI app that does not
occur when the same sbatch script is submitted with sbatch on the command line.
[…]
When I replaced the mpirun command with an equivalent srun command, everything
works as desired, so the user can get back to work and be productive.
While srun is a suitable workaround, and is arguably the correct way to run an
MPI job, I'd like to understand what is going on here. Any idea what is going
wrong, or additional steps I can take to get more debug information?
Was an alias to `mpirun` introduced? It may shadow the real application: even
if `which mpirun` returns the correct path, that binary would never be executed.
$ type mpirun
$ alias mpirun
may tell in the jobscript.
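
To make the check above concrete, here is a minimal sketch in POSIX sh. It assumes the job script runs under sh or bash; it will not work as-is under tcsh. The helper name `how_resolved` and the stand-in function `fake_mpirun` are made up for illustration and are not part of Slurm or any MPI distribution.

```shell
#!/bin/sh
# Hypothetical helper: report how the current shell would resolve a name.
# `command -v` is specified by POSIX; it prints a path for executables,
# the bare name for functions and builtins, and an alias definition for aliases.
how_resolved() {
    resolved=$(command -v "$1" 2>/dev/null) || { echo "not found"; return 0; }
    case "$resolved" in
        /*)       echo "file: $resolved" ;;                         # executable found on PATH
        alias\ *) echo "$resolved" ;;                               # alias definition, printed as-is
        *)        echo "function, builtin, or keyword: $resolved" ;; # something shadows PATH lookup
    esac
}

# Stand-in for a function that would shadow the real mpirun.
fake_mpirun() { :; }

how_resolved fake_mpirun
how_resolved sh
```

Calling `how_resolved mpirun` near the top of the job script would show whether something other than the binary on the PATH is being picked up. In tcsh, the builtin `where mpirun` lists aliases, builtins, and executables on the path for that name.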
Unfortunately, the script is in tcsh,
Oh, I didn't notice this – correct.
so the 'type' command doesn't work since,
Is it really running in `tcsh`? The commands look generic and are
available in various shells. Does SLURM honor the first line of a script,
or does it use a default? In Bash, a function would shadow `mpirun` too.
(I'm more used to GridEngine, where how the scripts are started can be
configured either way.)
Yes, it's running /bin/tcsh as the interpreter. When I used 'type' as
you instructed, I got an error from tcsh, which I wouldn't have gotten
in bash. Slurm respects the interpreter line at the start of the script.
(I know you're more of a GridEngine guy. You helped me a lot through the
GridEngine mailing list when I used SGE. I was actually surprised to see
you here. Welcome to Slurm!)
In "tcsh" I see that a defined "jobcmd" alias can have some effect.
-- Reuti
it's a bash built-in function. I did use the 'alias' command to see all the
defined aliases, and mpirun and mpiexec are not aliased. Any other ideas?
Prentice