Glad to hear that you made it work.

Regards,
Matthieu

2018-05-21 21:21 GMT+02:00 Sean Caron <sca...@umich.edu>:

Just wanted to follow up. In addition to passing all traffic to the SLURM
controller, we opened port 6818/TCP to all other compute nodes and this
seems to have resolved the issue. Thanks again, Matthieu!

Best,

Sean


On Thu, May 17, 2018 at 8:06 PM, Sean Caron <sca...@umich.edu> wrote:

Awesome tip. Thanks so much, Matthieu. I hadn't considered that. I will
give that a shot and see what happens.

Best,

Sean


On Thu, May 17, 2018 at 4:49 PM, Matthieu Hautreux
<matthieu.hautr...@gmail.com> wrote:

Hi,

Communications in Slurm are not performed only from the controller to
slurmd and from slurmd back to the controller. You need to ensure that your
login nodes can reach the controller and the slurmd nodes, and that the
slurmd daemons on the various nodes can contact each other. This last
requirement comes from the tree logic used in Slurm communication:

- to ensure scalability, slurmctld uses a communication tree (see TreeWidth
  in "man slurm.conf"), used for example to periodically check that all the
  nodes are working properly
- the same logic is used by srun when it contacts the various slurmd
  daemons involved in its step
- reverse-tree communications are performed among the slurmds of a step
  when it ends, to send accounting data and other information back to the
  controller
- only some communications are point-to-point between slurmd and the
  controller, notably the "registering call" performed at slurmd startup.

When slurmd daemons cannot contact each other because of network failures
(partitioning) or overly restrictive filtering, you see the kind of
flapping that you have. The point-to-point registration call makes a node
appear to the controller, the tree-based checks then make some nodes
disappear, and retries can fall back to point-to-point communication when
the number of destination nodes contacted by the controller at one time is
lower than the configured TreeWidth, so nodes suddenly reappear... until
the next check... and so on.

Two options for you:

- be less restrictive in your filtering rules
- set TreeWidth to 1 in slurm.conf, but you will lose the
  performance/scalability of Slurm's internal communication

If your cluster is large, I would recommend the first one.

HTH
Matthieu

PS: you can look at this presentation for a few details on the
communication logic:
https://slurm.schedmd.com/SUG14/message_aggregation.pdf

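In line with the first option (and with the fix Sean reports at the top of
the thread), a compute-node ruleset might look roughly like the sketch
below. The addresses and subnet are placeholders invented for this example,
and 6818/TCP is only Slurm's default SlurmdPort, so take the real values
from your own configuration:

    # Hypothetical compute-node INPUT chain, in the same iptables-save style
    # as the rules quoted in this thread (addresses are made up):
    # allow return traffic for connections this node initiated
    -A INPUT -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT
    # allow everything from the SLURM controller
    -A INPUT -s 10.0.0.10/32 -j ACCEPT
    # also allow slurmd-to-slurmd tree traffic from the other compute nodes;
    # 6818/TCP is the default SlurmdPort (verify with "scontrol show config")
    -A INPUT -s 10.0.0.0/24 -p tcp --dport 6818 -j ACCEPT

Login nodes running srun also need to reach both the controller and the
slurmd port on the compute nodes, as Matthieu notes above.
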
2018-05-17 22:21 GMT+02:00 Sean Caron <sca...@umich.edu>:

Sorry, how do you mean? The environment is very basic. Compute nodes and
the SLURM controller are on an RFC1918 subnet. Gateways are dual-homed,
with one leg on a public IP and one leg on the RFC1918 cluster network. It
used to be that nodes that only had a leg on the RFC1918 network (compute
nodes and the SLURM controller) had no firewall at all, and nodes that were
dual-homed were basically set to just permit all traffic from the
cluster-side NIC (i.e. an iptables rule like -A INPUT -i ethX -j ACCEPT).

Now we're trying to go back to the gateways and compute nodes and actually
codify, instead of just passing all traffic from the cluster-side NIC, what
ports and protocols are actually in use, or at least, what server-to-server
communication is expected and normative, and then define a rule set to
permit those while dropping other traffic not explicitly whitelisted.

The compute and gateway nodes work fine with SLURM even when iptables is
enabled and the policy is "permit all traffic from that NIC", but once we
tighten it down just a little bit to "permit all traffic to and from the
SLURM controller" we see these weird instances of node state flapping. It's
not clear to me why this is the case, since from the standpoint of
node-to-controller communications these policies are logically very
similar, but there it is. The nodes shouldn't have to talk to anything else
besides the SLURM controller for SLURM to work, so long as time is synched
up between them and there are no issues with the nodes getting to
slurm.conf.

Best,

Sean


On Thu, May 17, 2018 at 1:21 PM, Patrick Goetz <pgo...@math.utexas.edu>
wrote:

Does your SMS have a dedicated interface for node traffic?


On 05/16/2018 04:00 PM, Sean Caron wrote:

I see some chatter on 6818/TCP from the compute node to the SLURM
controller, and from the SLURM controller to the compute node.

The policy is to permit all packets inbound from the SLURM controller
regardless of port and protocol, and to perform no filtering whatsoever on
any output packets to anywhere. I wouldn't expect this to interfere.

Anyway, it's not that it NEVER works once the firewall is switched on. It's
that it flaps. The firewall is clearly passing enough traffic to have the
node marked as up some of the time. But why the periodic "not responding"
... "responding" cycles? Once it says "not responding" I can still scontrol
ping from the compute node in question, and standard ICMP ping from one to
the other works as well.

Best,

Sean


On Wed, May 16, 2018 at 2:13 PM, Alex Chekholko <a...@calicolabs.com>
wrote:

Add a logging rule to your iptables and look at what traffic is actually
being blocked?

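A minimal sketch of the kind of logging rule being suggested here, placed
just before a final DROP so that only packets about to be rejected get
logged; the rate limit and log prefix are arbitrary choices for this
example:

    # Log, then drop, anything not matched by the ACCEPT rules above.
    # The limit keeps the kernel log from being flooded.
    -A INPUT -m limit --limit 5/min --limit-burst 10 -j LOG --log-prefix "iptables-drop: "
    -A INPUT -j DROP

The logged packets show up in the kernel log (dmesg or "journalctl -k"),
which should reveal which source addresses and ports the flapping nodes are
actually being refused on.
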
On Wed, May 16, 2018 at 11:11 AM Sean Caron <sca...@umich.edu> wrote:

Hi all,

Does anyone use SLURM in a scenario where there is an iptables firewall on
the compute nodes, on the same network SLURM uses to communicate with the
SLURM controller and DBD machine?

I have the very basic situation where ...

1. There is no iptables firewall enabled at all on the SLURM controller/DBD
machine.

2. Compute nodes are set to permit all ports and protocols from the SLURM
controller with a rule like:

-A INPUT -s IP.of.SLURM.controller/32 -j ACCEPT

If I enable this on the compute nodes, they flap up and down in "Not
responding" state. If I switch off the firewall on the compute nodes, they
work fine.

When the firewall is up on the compute nodes, the SLURM controller can ping
the compute nodes, no problem. I have no reason to believe all ports and
protocols are not being passed. Time is synched. No trouble accessing
slurm.conf on any of the clients.

Has anyone seen this before? There seems to be very little information
about SLURM's interactions with iptables. I know this is kind of a funky
scenario, but regulatory requirements have me needing to tighten down our
cluster network a little bit. Is this like a latency issue, or ...?

Thanks,

Sean
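
For anyone landing on the same "what ports and protocols are actually in
use" question, the answer mostly comes from slurm.conf. The sketch below is
illustrative only: 6817/6818 are the usual defaults, the TreeWidth line
just shows Matthieu's second option, and the live values should be
confirmed with scontrol show config rather than assumed:

    # Sketch of the slurm.conf parameters that determine which ports need
    # to be reachable (example values; check the running configuration with
    # "scontrol show config | grep -Ei 'port|treewidth'"):
    SlurmctldPort=6817   # slurmctld listens here; slurmd and client commands
                         # connect to it
    SlurmdPort=6818      # slurmd listens here; the controller and, because of
                         # the communication tree, other slurmd daemons connect to it
    TreeWidth=1          # Matthieu's second option: makes controller/slurmd
                         # traffic effectively point-to-point, at the cost of
                         # scalability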