On 6/28/19 9:57 AM, Valerio Bellizzomi wrote:
On Fri, 2019-06-28 at 09:39 +0200, Ole Holm Nielsen wrote:
On 6/28/19 9:18 AM, Valerio Bellizzomi wrote:
On Fri, 2019-06-28 at 08:51 +0200, Valerio Bellizzomi wrote:
On Thu, 2019-06-27 at 18:35 +0200, Valerio Bellizzomi wrote:
The nodes are now communicating; however, when I run the command
srun -w compute02 /bin/ls
it hangs and there is no output on the submit machine.
On compute02 there is a communication error and a timeout.
Network ports 6817 and 6818 are open.
According to the firewall logs, slurmctld tries to connect back to a range
of ports that are closed.
As a test, I stopped the firewall service on the submit machine, and now the
command above works fine.
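
A minimal sketch for checking individual ports without stopping the whole firewall. It assumes bash with /dev/tcp support and the coreutils timeout command; check_port is a hypothetical helper, not a Slurm tool, and 6817/6818 are only the default slurmctld and slurmd ports mentioned above.

```shell
#!/usr/bin/env bash
# check_port HOST PORT
# Tries to open a TCP connection to HOST:PORT using bash's /dev/tcp
# redirection. Prints "open" if the connection succeeds within 3 seconds,
# "closed" otherwise (refused, filtered by a firewall, or timed out).
check_port() {
    local host=$1 port=$2
    if timeout 3 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null; then
        echo "open"
    else
        echo "closed"
    fi
}
```

For example, running `check_port compute02 6818` on the submit machine probes the slurmd port on compute02. A "closed" result only says the connection did not succeed; it does not distinguish a firewall drop from a port with no listener, so the firewall log is still needed to confirm the cause.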
You may want to check your firewall settings according to Slurm's
requirements. I've summarized this in my Wiki page:
https://wiki.fysik.dtu.dk/niflheim/Slurm_configuration#configure-firewall-for-slurm-daemons
/Ole
I am using another system and another firewall.
No problem, but you *must* ensure that the correct ports are open in the
firewall! This information is in the above Wiki page.
/Ole
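
For reference, a sketch of how the port range that srun listens on can be fixed so that the firewall has a known range to allow. SrunPortRange, SlurmctldPort and SlurmdPort are slurm.conf parameters; the range 60001-63000 is only an illustrative value and is not stated anywhere in this thread.

```
# slurm.conf (should be the same on all nodes)
SlurmctldPort=6817          # default slurmctld port
SlurmdPort=6818             # default slurmd port
SrunPortRange=60001-63000   # illustrative range for srun's listening ports
```

With a range like this set, the firewall on the submit machine needs to accept incoming TCP connections on that range from the controller and the compute nodes. How to express that rule depends on the firewall in use.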