CentOS 7.7.1908
Slurm 18.08.8
When trying to run an interactive job, I get the following error:
srun: error: task 0 launch failed: Slurmd could not connect IO
Checking the log file on the compute node, I see the following:
[2020-03-25T01:42:08.262] launch task 13.0 request from UID:1
On 23/3/20 8:32 am, CB wrote:
I've looked at the heterogeneous job support, but it creates two separate
jobs.
Yes, but the web page does say:
# By default, the applications launched by a single execution of
# the srun command (even for different components of the
# heterogeneous job) are combi
Hello,
I am installing Slurm on CentOS. I installed all supporting libraries
successfully. I also installed MariaDB before installing Slurm.
I get the following error from sudo rpmbuild -ta slurm-20.02.0.tar.bz2:
error: File not found:
/root/rpmbuild/BUILDROOT/slurm-20.02.0-1.el7.x86_64/usr/lib64/
Thanks, Yair and Thomas. I’ll check out wrappers. My interest in this case is
primarily in job submission and control. I was hoping that by using an API into
Slurm, I would avoid problems I’ve had in the past with interpreting
inconsistent exit codes of command line executables, and parsing out
In fact, we ARE using the perl API, but there are some flaws.
E.g. the array_task_str of the jobinfo structure. Slurm abbreviates long
lists of array indices, as scontrol does:
e.g.
1-3,5-8,45-...
Yes, you really do find three dots there. In my opinion, this is ok for
a general tool like s
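To illustrate why the abbreviation is a problem for API consumers, here is a minimal Python sketch that expands such a string and reports whether it was cut off. The function name expand_array_task_str is hypothetical, and the input format is assumed only from the example above (comma-separated indices and ranges, with a trailing "..." marking truncation); step and throttle suffixes are not handled.

```python
from typing import List, Tuple


def expand_array_task_str(text: str) -> Tuple[List[int], bool]:
    """Expand a string such as '1-3,5-8,45-...' into explicit indices.

    Returns (indices, truncated). The format is an assumption based on
    the example in this thread, not on a documented grammar.
    """
    indices: List[int] = []
    truncated = False
    for part in text.split(","):
        part = part.strip()
        if part.endswith("..."):
            # The list was cut here; the remaining indices are unknown.
            truncated = True
            part = part[:-3].rstrip("-")
        if not part:
            continue
        if "-" in part:
            start, end = part.split("-", 1)
            indices.extend(range(int(start), int(end) + 1))
        else:
            indices.append(int(part))
    return indices, truncated


if __name__ == "__main__":
    print(expand_array_task_str("1-3,5-8,45-..."))
```

When the second return value is true, the caller knows the index list is incomplete and cannot be reconstructed from this string alone.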
Hi Michael,
Thanks for the comment.
I was just checking if there is any other way to do the job before
introducing another partition.
So it appears to me that creating a new partition is the way to go.
Thanks,
Chansup
On Mon, Mar 23, 2020 at 1:25 PM Renfro, Michael wrote:
> Others might have
Hi everyone,
We recently started using priority-based scheduling and, after solving some
final issues (see this post:
https://groups.google.com/forum/m/#!topic/slurm-users/N8r8MoyjQAU), everything
seems to be running quite smoothly now. However, we realized that the data
shown by sshare, e.g
I also haven't got along with the Perl API shipped with Slurm. I got it to
work, but there were things missing.
Currently I have some wrapper functions for most Slurm commands, and a
general parsing function for Slurm's common output formats (of scontrol,
sacctmgr, etc.).
Not in CPAN, but you can see it
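For readers wondering what such a general parsing function might look like, here is a minimal Python sketch (not the author's code). It assumes single-line Key=Value output, such as scontrol produces with its -o (one-liner) option; the name parse_scontrol_line and the sample record are made up for illustration.

```python
import re
from typing import Dict

# Split on whitespace only where the next token looks like "Key=".
# This keeps values containing spaces (e.g. a job name) in one piece.
_FIELD_START = re.compile(r"\s+(?=[A-Za-z_][\w:/]*=)")


def parse_scontrol_line(line: str) -> Dict[str, str]:
    """Parse one 'Key=Value Key=Value ...' record into a dict."""
    record: Dict[str, str] = {}
    for field in _FIELD_START.split(line.strip()):
        key, sep, value = field.partition("=")
        if sep:
            # Only the first '=' separates key and value, so nested
            # values such as TRES=cpu=1,mem=1G stay intact.
            record[key] = value
    return record


if __name__ == "__main__":
    sample = "JobId=42 JobName=my job Partition=debug TRES=cpu=1,mem=1G"
    print(parse_scontrol_line(sample))
```

A known weakness of this approach: a value that itself contains a space followed by something resembling Key= will be split incorrectly, which is one reason a proper API would be preferable to parsing command output.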
Hi Sean, Hi Marcus,
Changing from localhost to the actual IP seems to have solved the problem. Is
that because not only the slurmctld process on the control node but also the
slurmd processes on the compute nodes need to have access to the accounting
information?
Because although slurmdbd and