I think I know what is going wrong. The bug is actually not in Slurm or Rocks itself. It results from mismatches introduced by updating several pieces of software, including ssh, CentOS, Rocks and Slurm.
Recently, I updated my Rocks installation using "yum update", which fetched the latest CentOS 7 (1804) packages. After the update I could no longer ssh to the nodes, although passwordless ssh had worked before. See my discussion at [1]. With the help of Trevor [2], I was able to fix the problem. Please see his technical comment in that post.

So, I am *guessing* that the latest version of Slurm is not compatible with CentOS 1804. In other words, something has been added or fixed in the ssh library which is now causing mismatches.

>and do you have in ~/.ssh/ and an ssh key named "id_rsa" and is passwordless ?

Actually, there is no id_rsa in my home directory, since host-based authentication is used.

[mahmood@rocks7 ~]$ cat .ssh/known_hosts
localhost ecdsa-sha2-nistp256 AAAAE2VjZHNhLXNoYTItbmlzdHAyNTYAAAAIbmlzdHAyNTYAAABBBNLjMdBkI636zSWU1N/dns3qHqc7Dr8JH/ulb2xryFM39Rk8p/4DIbkaV05fMpS6nXeJUjSY2X7U14bPRoiEeU4=
[mahmood@rocks7 ~]$ ls .ssh/
known_hosts
[mahmood@rocks7 ~]$ ssh compute-0-3 -Y
Last login: Tue Nov 20 16:45:13 2018 from rocks7.local
Rocks Compute Node
Rocks 7.0 (Manzanita)
Profile built 10:57 15-Nov-2018

Kickstarted 11:17 15-Nov-2018
[mahmood@compute-0-3 ~]$ xclock
^C

>So that looks like for some reason your display is set to :0 (or similar). Are
>you by some chance trying to run this on an X server on the console of rocks7?

>That also looks like an error you should look into fixing first.

I think that is caused by the issue that I described above.

[1] https://lists.sdsc.edu/pipermail/npaci-rocks-discussion/2018-November/072367.html
[2] https://lists.sdsc.edu/pipermail/npaci-rocks-discussion/2018-November/072380.html

Regards,
Mahmood

On Tue, Nov 20, 2018 at 2:58 PM Chris Samuel <ch...@csamuel.org> wrote:

> On Tuesday, 20 November 2018 2:51:26 AM AEDT Mahmood Naderan wrote:
>
> > With and without --x11, I am not able to see xclock on a compute node.
> >
> > [mahmood@rocks7 ~]$ srun --x11 --nodelist=compute-0-3 -n 1 -c 6 --mem=8G -A y8 -p RUBY xclock
> > srun: error: Cannot forward to local display. Can only use X11 forwarding
> > with network displays.
>
> So that looks like for some reason your display is set to :0 (or similar). Are
> you by some chance trying to run this on an X server on the console of rocks7?
>
> > [mahmood@rocks7 ~]$ rocks run host compute-0-3 "yum list libssh2-devel"
> > Warning: untrusted X11 forwarding setup failed: xauth key data not generated
>
> That also looks like an error you should look into fixing first.
>
> All the best,
> Chris
> --
> Chris Samuel : http://www.csamuel.org/ : Melbourne, VIC