Got it! It was the firewall... Thanks to everyone for the suggestions.

Heath

Professor
Graduate Coordinator
Chemical and Biological Engineering
http://che.eng.ua.edu

University of Alabama
3448 SEC, Box 870203
Tuscaloosa, AL 35487
(205) 348-1733 (phone)
(205) 561-7450 (cell)
(205) 348-7558 (fax)
htur...@eng.ua.edu
http://turnerresearchgroup.ua.edu

-----Original Message-----
From: slurm-users [mailto:slurm-users-boun...@lists.schedmd.com] On Behalf Of Andy Riebs
Sent: Monday, May 21, 2018 10:22 AM
To: slurm-users@lists.schedmd.com
Subject: Re: [slurm-users] network/communication failure

Do you have a firewall running?

On 05/21/2018 11:05 AM, Turner, Heath wrote:
> If anyone has advice, I would really appreciate...
>
> I am running (just installed) slurm-11.17.6, with a master + 2 hosts. It
> works locally on the master (controller + execution). However, I cannot
> establish communication from master [triumph01] with the 2 hosts
> [triumph02,triumph03]. Here is some more info:
>
> 1. munge is running, and munge verification tests all pass.
> 2. system clocks are in sync on master/hosts.
> 3. identical slurm.conf files are on master/hosts.
> 4. configuration of resources (memory/cpus/etc) are correct and have been
>    confirmed on all machines (all hardware is identical).
> 5. I have attached:
>    a) slurm.conf
>    b) log file from master slurmctld
>    c) log file from host slurmd
>
> Any ideas about what to try next?
>
> Heath Turner
>
> Professor
> Graduate Coordinator
> Chemical and Biological Engineering
> http://che.eng.ua.edu
>
> University of Alabama
> 3448 SEC, Box 870203
> Tuscaloosa, AL 35487
> (205) 348-1733 (phone)
> (205) 561-7450 (cell)
> (205) 348-7558 (fax)
> htur...@eng.ua.edu
> http://turnerresearchgroup.ua.edu
>

--
Andy Riebs
andy.ri...@hpe.com
Hewlett-Packard Enterprise
High Performance Computing Software Engineering
+1 404 648 9024
My opinions are not necessarily those of HPE
May the source be with you!
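
For anyone landing on this thread with the same symptom (slurmctld works locally but cannot talk to the compute nodes), one quick way to test for a firewall problem is to check whether the Slurm daemon ports are reachable over TCP. The following is a minimal Python sketch, not an official Slurm tool. It assumes the default ports (6817 for slurmctld, 6818 for slurmd); the slurm.conf attached to the original post is not reproduced here, so adjust the constants if yours sets SlurmctldPort or SlurmdPort. The function names port_open and check_cluster are made up for illustration; the host names are the ones from the thread.

```python
import socket

# Default Slurm ports (SlurmctldPort / SlurmdPort). These are assumptions:
# change them if your slurm.conf overrides the defaults.
SLURMCTLD_PORT = 6817
SLURMD_PORT = 6818


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts (often a firewall silently
        # dropping packets), and name-resolution failures.
        return False


def check_cluster(controller, compute_nodes, timeout=3.0):
    """Map (host, port) -> reachability for the controller and each compute node."""
    results = {
        (controller, SLURMCTLD_PORT): port_open(controller, SLURMCTLD_PORT, timeout)
    }
    for node in compute_nodes:
        results[(node, SLURMD_PORT)] = port_open(node, SLURMD_PORT, timeout)
    return results


if __name__ == "__main__":
    # Host names taken from the thread; run this from each machine in turn,
    # since a firewall may block one direction only.
    checks = check_cluster("triumph01", ["triumph02", "triumph03"])
    for (host, port), ok in checks.items():
        print(f"{host}:{port} {'reachable' if ok else 'blocked or down'}")
```

A False result only says the TCP connection failed; it does not distinguish a firewall from a daemon that is not running, so check that slurmctld and slurmd are actually up before blaming the firewall. Also note that srun listens on ephemeral ports unless SrunPortRange is set in slurm.conf, so opening only these two ports may not be enough for interactive jobs.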