Re: [slurm-users] Problem launching interactive jobs using srun

2018-03-09 Thread Mark M
OK, I'm eating my words now. Perhaps I have had multiple issues before, but at the moment stopping the firewall allows salloc to work. Can anyone suggest an iptables rule specific to slurm? Or a way to restrict slurm communications to the right network? On Fri, Mar 9, 2018 at 1:10 PM, Mark M wrote: …
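A possible starting point, assuming the default ports (SlurmctldPort=6817, SlurmdPort=6818) and a placeholder cluster subnet of 10.0.0.0/24 — check slurm.conf for the values actually in use:

  # iptables -I INPUT -p tcp -s 10.0.0.0/24 --dport 6817:6818 -j ACCEPT

Note that srun/salloc on the submit host also listen on ephemeral ports that the compute nodes connect back to, so a login-node firewall needs those open as well. Setting SrunPortRange in slurm.conf (the range below is just an example) pins them to a known range that can then be allowed through:

  # iptables -I INPUT -p tcp -s 10.0.0.0/24 --dport 60001:63000 -j ACCEPT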

Re: [slurm-users] Problem launching interactive jobs using srun

2018-03-09 Thread Andy Georges
Hi,

> On 9 Mar 2018, at 21:58, Nicholas McCollum wrote:
>
> Connection refused makes me think it's a firewall issue.
>
> Assuming this is a test environment, could you try on the compute node:
>
> # iptables-save > iptables.bak
> # iptables -F && iptables -X
>
> Then test to see if it works. To …

Re: [slurm-users] Problem launching interactive jobs using srun

2018-03-09 Thread Mark M
In my case I tested the firewall. But I'm wondering if the login nodes need to appear in slurm.conf, and also if slurmd needs to be running on the login nodes in order for them to be a submit host. Either or both could be my issue. On Fri, Mar 9, 2018 at 12:58 PM, Nicholas McCollum wrote: …
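As far as I know, a submit host does not need slurmd running; it needs the Slurm client commands, a slurm.conf identical to the one on the cluster, and munged running with the shared key. Two quick sanity checks from the login node (compute node name assumed here):

  $ scontrol ping
  $ munge -n | ssh node2801 unmunge

The first confirms the login node can reach slurmctld; the second confirms the munge keys match between the two hosts.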

Re: [slurm-users] Problem launching interactive jobs using srun

2018-03-09 Thread Nicholas McCollum
Connection refused makes me think it's a firewall issue. Assuming this is a test environment, could you try on the compute node:

# iptables-save > iptables.bak
# iptables -F && iptables -X

Then test to see if it works. To restore the firewall use:

# iptables-restore < iptables.bak

You may have to …

Re: [slurm-users] Problem launching interactive jobs using srun

2018-03-09 Thread Andy Georges
Hi all, Cranked up the debug level a bit. The job was not started when using:

vsc40075@test2802 (banette) ~> /bin/salloc -N1 -n1 /bin/srun --pty bash -i
salloc: Granted job allocation 42
salloc: Waiting for resource configuration
salloc: Nodes node2801 are ready for job

For comparison purposes, running …
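In case it helps anyone reproducing this: the controller's debug level can be raised on the fly (the level name below is just an example):

  # scontrol setdebug debug3

and slurmd can be run in the foreground on the compute node with extra verbosity to see what happens when the step tries to launch:

  # slurmd -D -vvvv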

Re: [slurm-users] Problem launching interactive jobs using srun

2018-03-09 Thread Mark M
I'm having the same issue. The salloc command hangs on my login nodes, but works fine on the head node. My default salloc command is:

SallocDefaultCommand="/usr/bin/srun -n1 -N1 --pty --preserve-env $SHELL"

I'm on OpenHPC's slurm 17.02.9-69.2. The log says the job is assigned, then eventually …
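One quick check from the login node would be whether the slurmd port on a compute node is reachable at all (hostname and default port assumed):

  $ nc -vz node2801 6818

Even if that succeeds, the reverse direction matters too: srun on the submit host opens listening ports that the compute node connects back to, which is exactly where a firewall on the login node (but not on the head node) would cause a hang after the allocation is granted.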

Re: [slurm-users] Problem launching interactive jobs using srun

2018-03-09 Thread Andy Georges
Hi, Adding --pty makes no difference. I do not get a prompt, and on the node the logs show an error. If --pty is used, the error is somewhat different compared to not using it, but the end result is the same. My main issue is that giving the same command on the machines running slurmd and slurmctld …

Re: [slurm-users] Problem launching interactive jobs using srun

2018-03-09 Thread Michael Robbert
I think that the piece you may be missing is --pty, but I also don't think that salloc is necessary. The simplest command that I typically use is:

srun -N1 -n1 --pty bash -i

Mike

On 3/9/18 10:20 AM, Andy Georges wrote: Hi, I am trying to get interactive jobs to work from the machine we …

Re: [slurm-users] Problem launching interactive jobs using srun

2018-03-09 Thread Pickering, Roger (NIH/NIAAA) [E]
I'm confused. Why would you want to run an interactive program using srun?

Roger

-----Original Message-----
From: Andy Georges [mailto:andy.geor...@ugent.be]
Sent: Friday, March 09, 2018 12:20 PM
To: slurm-users@lists.schedmd.com
Subject: [slurm-users] Problem launching interactive jobs using srun …