On 03/08/2018 04:49 PM, Renat Yakupov wrote:
Thank you, Ole.
That is exactly it. And it probably answers a lot of future questions,
since I know now how to see the configuration information.
Good to hear! The "scontrol show config" command shows many, but not all, Slurm
parameters. You may have to
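For example (a generic illustration, not part of the original mail), a single parameter can be picked out of the output like this:

scontrol show config | grep -i SelectType

Anything that does not appear there has to be looked up in slurm.conf (or the other *.conf files) directly.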
Hi,
I am trying to get interactive jobs to work from the machine we use as a login
node, i.e., where the users of the cluster log into and from where they
typically submit jobs.
I submit the job as follows:
vsc40075@test2802 (banette) ~> /bin/salloc -N1 -n1 /bin/srun bash -i
salloc: Granted
I'm confused. Why would you want to run an interactive program using srun?
Roger
-----Original Message-----
From: Andy Georges [mailto:andy.geor...@ugent.be]
Sent: Friday, March 09, 2018 12:20 PM
To: slurm-users@lists.schedmd.com
Subject: [slurm-users] Problem launching interactive jobs using
I think that the piece you may be missing is --pty, but I also don't
think that salloc is necessary.
The most simple command that I typically use is:
srun -N1 -n1 --pty bash -i
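If the interactive session also needs an explicit wall time or memory, those options can be added in the same way; the values below are only illustrative:

srun -N1 -n1 --time=1:00:00 --mem=4G --pty bash -i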
Mike
On 3/9/18 10:20 AM, Andy Georges wrote:
Hi,
I am trying to get interactive jobs to work from the machine we
Hi,
Adding --pty makes no difference. I do not get a prompt, and on the node the logs
show an error. With --pty the error is somewhat different from the one without it,
but the end result is the same.
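(As a generic aside, not part of the original report: the node-side log can be located and followed with something like the following; /var/log/slurmd.log is only the common default.)

scontrol show config | grep -i SlurmdLogFile
tail -f /var/log/slurmd.log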
My main issue is that giving the same command on the machines running slurmd
and slurmctld
I'm having the same issue. The salloc command hangs on my login nodes, but
works fine on the head node. My default salloc command is:
SallocDefaultCommand="/usr/bin/srun -n1 -N1 --pty --preserve-env $SHELL"
I'm on the OpenHPC slurm 17.02.9-69.2.
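Assuming the same slurm.conf is meant to be deployed on both machines, the effective value can be compared on the login node and the head node with:

scontrol show config | grep -i SallocDefaultCommand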
The log says the job is assigned, then eventually
Hi all,
Cranked up the debug level a bit
Job was not started when using:
vsc40075@test2802 (banette) ~> /bin/salloc -N1 -n1 /bin/srun --pty bash -i
salloc: Granted job allocation 42
salloc: Waiting for resource configuration
salloc: Nodes node2801 are ready for job
For comparison purposes, runn
Connection refused makes me think a firewall issue.
Assuming this is a test environment, could you try on the compute node:
# iptables-save > iptables.bak
# iptables -F && iptables -X
Then test to see if it works. To restore the firewall use:
# iptables-restore < iptables.bak
You may have to
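If the node happens to use firewalld rather than raw iptables (an assumption about the setup, not something stated above), the equivalent quick test would be:

systemctl stop firewalld
# ... retry salloc/srun ...
systemctl start firewalld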
In my case I tested firewall. But I'm wondering if the login nodes need to
appear in the slurm.conf, and also if slurmd needs to be running on the
login nodes in order for them to be a submit host? Either or both could be
my issue.
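As a generic sanity check (not taken from the earlier mails): to act as a submit host a machine should, to my understanding, only need the client commands, a matching slurm.conf and a working munge key, not a running slurmd. Basic connectivity to the controller can then be verified with:

scontrol ping
sinfo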
On Fri, Mar 9, 2018 at 12:58 PM, Nicholas McCollum
wrote:
> Conn
Hi,
> On 9 Mar 2018, at 21:58, Nicholas McCollum wrote:
>
> Connection refused makes me think a firewall issue.
>
> Assuming this is a test environment, could you try on the compute node:
>
> # iptables-save > iptables.bak
> # iptables -F && iptables -X
>
> Then test to see if it works. To
OK, I'm eating my words now. Perhaps I have had multiple issues before, but
at the moment stopping the firewall allows salloc to work. Can anyone
suggest an iptables rule specific to slurm? Or a way to restrict slurm
communications to the right network?
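A sketch of what such rules could look like, assuming the default SlurmctldPort=6817 and SlurmdPort=6818 and a cluster network of 10.0.0.0/24 (both placeholders for the actual site values):

# allow slurmctld/slurmd traffic from the cluster network only
iptables -A INPUT -p tcp -s 10.0.0.0/24 --dport 6817:6818 -j ACCEPT
# srun <-> slurmd traffic uses ephemeral ports unless SrunPortRange is set in slurm.conf,
# e.g. SrunPortRange=60001-63000, which can then be opened the same way:
iptables -A INPUT -p tcp -s 10.0.0.0/24 --dport 60001:63000 -j ACCEPT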
On Fri, Mar 9, 2018 at 1:10 PM, Mark M wrote:
The setup below worked for the CPU case: with OverSubscribe, I can have more than 4
processes in the running state. But if I add #SBATCH --gres=gpu:2 to the job file, only
1 process is in the running state and the others are pending.
It seems OverSubscribe can only be used for the cpu resource; whether it
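For reference, a minimal job file along the lines described above (the partition name and the application are placeholders):

#!/bin/bash
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --partition=gpu
#SBATCH --gres=gpu:2
srun ./my_app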