Hi everybody,

I am trying to use dynamic mode together with configless mode for slurmd, on Slurm 24.11.3 and then after upgrading to 24.11.4, and I have found a problem.

slurmctld runs in a Docker container, and my node is outside the container network. slurmctld registers the node's IP using a getpeeraddr function on the slurmctld socket. However, my connection reaches that socket through the Docker NAT/bridge, so slurmctld registers the bridged address instead of my real IP, that is to say the Docker gateway (172.20.0.1).

    scontrol show node
    -------------------------
    NodeName=ltlsbubble1 Arch=x86_64 CoresPerSocket=4..NodeAddr=172.20.0.1 NodeHostName=ltlsbubble1 Version=24.11.4

As a result, the node goes down after the "not pinging it" timeout.

I tried to update the configuration:

    scontrol update NodeName=ltlsbubble1 NodeAddr=xx.xx.xx.xx

but at the first "scontrol reconfigure" it reverts to NodeAddr=172.20.0.1.

In normal mode
-------------------

    scontrol show node
    NodeName=ltlsbubble1 Arch=x86_64 CoresPerSocket=4..NodeAddr=ltlsbubble1 NodeHostName=ltlsbubble1 Version=24.11.4

In normal mode, NodeAddr is the same as NodeName, so DNS resolution is used for communication.

To verify my hypothesis, I went into the Slurm C code, identified the registration function, and replaced the call with the same mechanism used for normal nodes. In src/slurmctld/node_mgr.c I replaced:

    set_node_comm_name(node_ptr, comm_name, reg_msg->hostname);

with:

    set_node_comm_name(node_ptr, NULL, reg_msg->hostname);

I rebuilt slurmctld with this patch and tried it in dynamic mode. It works as expected:

    scontrol show node
    NodeName=ltlsbubble1 Arch=x86_64 CoresPerSocket=4..NodeAddr=ltlsbubble1 NodeHostName=ltlsbubble1 Version=24.1

There is no IP in NodeAddr, only the node name, so DNS resolution is used. The node works fine and no longer goes down because of the ping timeout.

So my question: can we have an option to force DNS resolution instead of IP discovery in dynamic mode? (I tried the cloud_dns option, but that does not seem to be its purpose.)

Best regards,
Stephane
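
For readers who want to see the address behaviour outside Slurm, here is a minimal Python sketch. It is not Slurm code. It only illustrates that a server learns the peer address from the accepted socket, so any NAT or bridge in between decides what that address is, whereas a hostname sent by the client can be resolved through DNS later. The names observed_peer_address and choose_comm_addr are hypothetical and exist only for this illustration; the prefer_dns flag is an assumption modelling the option asked for above, not an existing Slurm setting.

```python
import socket


def observed_peer_address():
    """Open a loopback TCP connection and return (peer seen by server, client's local address)."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    server.listen(1)
    port = server.getsockname()[1]

    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client.connect(("127.0.0.1", port))
    conn, _ = server.accept()
    try:
        # The server only knows what the socket reports. Behind a Docker
        # bridge this would be the gateway address, not the node's real IP.
        seen = conn.getpeername()
        local = client.getsockname()
    finally:
        conn.close()
        client.close()
        server.close()
    return seen, local


def choose_comm_addr(peer_ip, hostname, prefer_dns):
    """Hypothetical policy: keep the reported hostname (resolved via DNS later) or the socket peer IP."""
    return hostname if prefer_dns else peer_ip


if __name__ == "__main__":
    seen, local = observed_peer_address()
    print("server sees peer:", seen, "client bound to:", local)
    print(choose_comm_addr("172.20.0.1", "ltlsbubble1", prefer_dns=True))
    print(choose_comm_addr("172.20.0.1", "ltlsbubble1", prefer_dns=False))
```

On loopback the two addresses match because nothing rewrites the packets. With a NAT in the path, the first value would be the translated address, which is the situation described in this message.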
--
slurm-users mailing list -- slurm-users@lists.schedmd.com
To unsubscribe send an email to slurm-users-le...@lists.schedmd.com