Hi Paulo,

On 11/1/23 01:12, Paulo Jose Braga Estrela wrote:
I think that you should use NetworkManager-wait-online.service in RHEL 8. Take 
a look at its man page. It only allows the system to reach network-online after 
all network interfaces are online. So, if your OPA interfaces are managed by 
NetworkManager, you can use it.

Unfortunately, NetworkManager-wait-online.service returns as soon as one network interface is up. It doesn't wait for any other networks, including the Infiniband/OPA network :-(

You can see that the NetworkManager-wait-online.service file executes:

ExecStart=/usr/bin/nm-online -s -q

and this is causing our problems with Infiniband/OPA networks. This is the reason why we need Max's workaround wait-for-interfaces.service.
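
To illustrate the idea of waiting for specific interfaces, here is a minimal Python sketch. It is not Max's actual script, and the function names, the timeout, and the interface names (ib0, eth0) are my own assumptions. It polls the kernel's /sys/class/net/<interface>/operstate file until every named interface reports "up":

```python
import time
from pathlib import Path


def interface_is_up(name, sysfs_root="/sys/class/net"):
    """Return True if the kernel reports the interface operstate as 'up'."""
    try:
        state = (Path(sysfs_root) / name / "operstate").read_text().strip()
    except OSError:
        # Interface not present (yet), e.g. the driver has not been loaded.
        return False
    # Note: some devices (such as lo) report 'unknown' and never 'up'.
    return state == "up"


def wait_for_interfaces(names, timeout=120.0, poll_interval=1.0,
                        sysfs_root="/sys/class/net"):
    """Poll until every interface in names is up, or the timeout expires.

    Returns True if all interfaces came up in time, False otherwise.
    """
    deadline = time.monotonic() + timeout
    while True:
        pending = [n for n in names if not interface_is_up(n, sysfs_root)]
        if not pending:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval)
```

A oneshot unit ordered before slurmd could call wait_for_interfaces(["ib0", "eth0"]) and exit non-zero when it returns False. The sysfs_root parameter exists only so that the function can be tried against a scratch directory instead of the real /sys tree.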

/Ole


-----Original Message-----
From: slurm-users <slurm-users-boun...@lists.schedmd.com> On Behalf Of Ole Holm 
Nielsen
Sent: Tuesday, October 31, 2023 07:00
To: Slurm User Community List <slurm-users@lists.schedmd.com>
Subject: Re: [slurm-users] How to delay the start of slurmd until 
Infiniband/OPA network is fully up?

Hi Jeffrey,

On 10/30/23 20:15, Jeffrey R. Lang wrote:
The service is available in RHEL 8 via the EPEL package repository as 
systemd-networkd, i.e.
systemd-networkd.x86_64    253.4-1.el8    epel

Thanks for the info.  We can install the systemd-networkd RPM from the EPEL 
repo as you suggest.

I tried to understand the properties of systemd-networkd before implementing it 
in our compute nodes.  While there are lots of networkd man-pages, it's harder 
to find an overview of the actual properties of networkd.  This is what I found:

* Networkd is a service included in recent versions of Systemd.  It seems to be 
an alternative to NetworkManager.

* Red Hat has stated that systemd-networkd is NOT going to be implemented in 
RHEL 8 or 9.

* Comparing systemd-networkd and NetworkManager:
https://fedoracloud.readthedocs.io/en/latest/networkd.html

* Networkd is described in the Wikipedia article
https://en.wikipedia.org/wiki/Systemd

While networkd seems to be really nifty, I hesitate to replace NetworkManager 
by networkd on our EL8 and EL9 systems because this is an unsupported and only 
lightly tested setup, and it may require additional work to keep our systems 
up-to-date in the future.

It seems to me that Max Rutkowski's solution in
https://github.com/maxlxl/network.target_wait-for-interfaces is less intrusive 
than converting to systemd-networkd.

Best regards,
Ole


-----Original Message-----
From: slurm-users <slurm-users-boun...@lists.schedmd.com> On Behalf Of
Ole Holm Nielsen
Sent: Monday, October 30, 2023 1:56 PM
To: slurm-users@lists.schedmd.com
Subject: Re: [slurm-users] How to delay the start of slurmd until 
Infiniband/OPA network is fully up?


Hi Jens,

Thanks for your feedback:

On 30-10-2023 15:52, Jens Elkner wrote:
Actually there is no need for such a script since
/lib/systemd/systemd-networkd-wait-online should be able to handle it.

It seems that systemd-networkd exists in Fedora 38, but not in
RHEL 8 and clones, AFAICT.
