Just wanted to follow up. In addition to passing all traffic to and from the SLURM controller, I opened port 6818/TCP to all other compute nodes, and this seems to have resolved the issue. Thanks again, Matthieu!
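For anyone who hits the same thing, the node-side iptables policy now amounts to roughly the following. This is only a sketch: the 10.10.0.0/16 subnet is a stand-in for our actual cluster network, and 6818 is the default SlurmdPort.

    # Permit all traffic from the SLURM controller/DBD machine
    -A INPUT -s IP.of.SLURM.controller/32 -j ACCEPT
    # Permit slurmd-to-slurmd traffic (the communication tree fanout) from the other compute nodes
    -A INPUT -s 10.10.0.0/16 -p tcp --dport 6818 -j ACCEPT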
Best,
Sean

On Thu, May 17, 2018 at 8:06 PM, Sean Caron <sca...@umich.edu> wrote:

> Awesome tip. Thanks so much, Matthieu. I hadn't considered that. I will give that a shot and see what happens.
>
> Best,
>
> Sean
>
> On Thu, May 17, 2018 at 4:49 PM, Matthieu Hautreux <matthieu.hautr...@gmail.com> wrote:
>
>> Hi,
>>
>> Communications in Slurm are not performed only from the controller to slurmd and from slurmd to the controller. You need to ensure that your login nodes can reach the controller and the slurmd nodes, and also that the slurmd daemons on the various nodes can contact each other. This last requirement comes from the tree logic used in Slurm communication:
>>
>> - To ensure scalability, slurmctld uses a communication tree (see TreeWidth in "man slurm.conf"), used for example to periodically check that all the nodes are working properly.
>> - The exact same logic is used by srun when it contacts the various slurmd daemons involved in its step.
>> - Reversed tree communications are performed among the slurmds of a step at its end, to send accounting data and other information back to the controller.
>> - Only some communications are point-to-point between slurmd and the controller, notably the "registering call" performed at slurmd startup.
>>
>> When the slurmd daemons cannot contact each other because of network failures (partitioning) or overly restrictive filtering, you see the kind of flapping that you have. Point-to-point communication at slurmd registration makes nodes appear to the controller, tree-based checks make some of them disappear, and retries can lead to point-to-point communication with some nodes whenever the number of destination nodes contacted by the controller at the same time is lower than the configured TreeWidth, so nodes suddenly reappear... until the next check, and so on.
>>
>> Two options for you:
>>
>> - Be less restrictive in your filtering rules.
>> - Set TreeWidth to 1 in slurm.conf, but you will lose the performance/scalability of Slurm's internal communications.
>>
>> If your cluster is large, I would recommend the first one.
>>
>> HTH,
>> Matthieu
>>
>> PS: you can look at this presentation for a few details on the communication logic: https://slurm.schedmd.com/SUG14/message_aggregation.pdf
>>
>> 2018-05-17 22:21 GMT+02:00 Sean Caron <sca...@umich.edu>:
>>
>>> Sorry, how do you mean? The environment is very basic. Compute nodes and the SLURM controller are on an RFC1918 subnet. Gateways are dual-homed, with one leg on a public IP and one leg on the RFC1918 cluster network. It used to be that nodes with only a leg on the RFC1918 network (compute nodes and the SLURM controller) had no firewall at all, and dual-homed nodes were simply set to permit all traffic from the cluster-side NIC (i.e. an iptables rule like -A INPUT -i ethX -j ACCEPT).
>>>
>>> Now we're going back to the gateways and compute nodes and actually codifying, instead of just passing all traffic from the cluster-side NIC, which ports and protocols are actually in use, or at least what server-to-server communication is expected and normative, and then defining a rule set to permit those while dropping any other traffic not explicitly whitelisted.
>>>
>>> The compute and gateway nodes work fine with SLURM even when iptables is enabled and the policy is "permit all traffic from that NIC", but once we tighten it down just a little to "permit all traffic to and from the SLURM controller" we see these weird instances of node state flapping. It's not clear to me why this is the case, since from the standpoint of node-to-controller communications these policies are logically very similar, but there it is. The nodes shouldn't have to talk to anything else besides the SLURM controller for SLURM to work, as long as time is synched up between them and there are no issues with the nodes reaching slurm.conf.
>>>
>>> Best,
>>>
>>> Sean
>>>
>>> On Thu, May 17, 2018 at 1:21 PM, Patrick Goetz <pgo...@math.utexas.edu> wrote:
>>>
>>>> Does your SMS have a dedicated interface for node traffic?
>>>>
>>>> On 05/16/2018 04:00 PM, Sean Caron wrote:
>>>>
>>>>> I see some chatter on 6818/TCP from the compute node to the SLURM controller, and from the SLURM controller to the compute node.
>>>>>
>>>>> The policy is to permit all packets inbound from the SLURM controller regardless of port and protocol, and to perform no filtering whatsoever on any output packets to anywhere. I wouldn't expect this to interfere.
>>>>>
>>>>> Anyway, it's not that it NEVER works once the firewall is switched on. It's that it flaps. The firewall is clearly passing enough traffic to have the node marked as up some of the time. But why the periodic "not responding" ... "responding" cycles? Once it says "not responding" I can still scontrol ping from the compute node in question, and a standard ICMP ping from one to the other works as well.
>>>>>
>>>>> Best,
>>>>>
>>>>> Sean
>>>>>
>>>>> On Wed, May 16, 2018 at 2:13 PM, Alex Chekholko <a...@calicolabs.com> wrote:
>>>>>
>>>>> Add a logging rule to your iptables and look at what traffic is actually being blocked?
>>>>>
>>>>> On Wed, May 16, 2018 at 11:11 AM Sean Caron <sca...@umich.edu> wrote:
>>>>>
>>>>> Hi all,
>>>>>
>>>>> Does anyone use SLURM in a scenario where there is an iptables firewall on the compute nodes, on the same network they use to communicate with the SLURM controller and DBD machine?
>>>>>
>>>>> I have the very basic situation where ...
>>>>>
>>>>> 1. There is no iptables firewall enabled at all on the SLURM controller/DBD machine.
>>>>>
>>>>> 2. Compute nodes are set to permit all ports and protocols from the SLURM controller with a rule like:
>>>>>
>>>>> -A INPUT -s IP.of.SLURM.controller/32 -j ACCEPT
>>>>>
>>>>> If I enable this on the compute nodes, they flap up and down in "Not responding" state. If I switch off the firewall on the compute nodes, they work fine.
>>>>>
>>>>> When the firewall is up on the compute nodes, the SLURM controller can ping the compute nodes, no problem. I have no reason to believe all ports and protocols are not being passed. Time is synched. No trouble accessing slurm.conf on any of the clients.
>>>>>
>>>>> Has anyone seen this before? There seems to be very little information about SLURM's interactions with iptables. I know this is kind of a funky scenario, but regulatory requirements have me needing to tighten down our cluster network a little bit. Is this like a latency issue, or ...?
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Sean
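Matthieu's fallback option (the second one in his mail above) is a single slurm.conf setting. A sketch, with his caveat that it gives up the performance/scalability of Slurm's tree-based internal communication:

    # slurm.conf (keep the file identical on the controller and all nodes)
    # Per the suggestion above: set the communication tree fanout to 1 so the
    # cluster does not depend on slurmd-to-slurmd tree traffic
    TreeWidth=1

The updated file then has to be distributed to every node and the daemons restarted or reconfigured before it takes effect.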
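Alex's tip about a logging rule is the quickest way to see exactly what an overly tight policy is dropping. A minimal sketch (the rate limit, prefix string, and placement just before the final drop are only examples):

    # Log (rate-limited) whatever is about to fall through to the drop rule
    -A INPUT -m limit --limit 5/min -j LOG --log-prefix "iptables-dropped: " --log-level 4
    -A INPUT -j DROP

Then reproduce the flapping and watch the kernel log for hits, e.g. with "dmesg -w | grep iptables-dropped" or "journalctl -kf".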