I would like to report how the Infiniband/OPA network device starts up
step by step as reported by Max's Systemd service from
https://github.com/maxlxl/network.target_wait-for-interfaces
This is the sequence of events during boot:
$ grep wait-for-interfaces.sh /var/log/messages
Nov 1 16:13:39 d064 wait-for-interfaces.sh[1610]: Wait for network devices
Nov 1 16:13:39 d064 wait-for-interfaces.sh[1610]: Available connections are:
Nov 1 16:13:40 d064 wait-for-interfaces.sh[1613]: NAME UUID TYPE DEVICE
Nov 1 16:13:40 d064 wait-for-interfaces.sh[1613]: eno8403 1108d0aa-8841-4f2e-b42e-bd9509a2aba0 ethernet --
Nov 1 16:13:40 d064 wait-for-interfaces.sh[1613]: System eno8303 44931a14-005a-415d-a82b-8c1a2007a118 ethernet --
Nov 1 16:13:40 d064 wait-for-interfaces.sh[1613]: System ib0 2ab4abde-b8a5-6cbc-19b1-2bfb193e4e89 infiniband --
Nov 1 16:13:40 d064 wait-for-interfaces.sh[2011]: Error: Device 'ib0' not found.
Nov 1 16:13:41 d064 wait-for-interfaces.sh[2127]: Error: Device 'ib0' not found.
Nov 1 16:13:41 d064 wait-for-interfaces.sh[1610]: Waiting for interface ib0 to come online:
Nov 1 16:13:42 d064 wait-for-interfaces.sh[2134]: Error: Device 'ib0' not found.
Nov 1 16:13:42 d064 wait-for-interfaces.sh[1610]: Waiting for interface ib0 to come online:
Nov 1 16:13:43 d064 wait-for-interfaces.sh[2148]: Error: Device 'ib0' not found.
Nov 1 16:13:43 d064 wait-for-interfaces.sh[1610]: Waiting for interface ib0 to come online:
Nov 1 16:13:44 d064 wait-for-interfaces.sh[1610]: Waiting for interface ib0 to come online: 20 (unavailable)
Nov 1 16:13:45 d064 wait-for-interfaces.sh[1610]: Waiting for interface ib0 to come online: 20 (unavailable)
Nov 1 16:13:46 d064 wait-for-interfaces.sh[1610]: Waiting for interface ib0 to come online: 20 (unavailable)
Nov 1 16:13:47 d064 wait-for-interfaces.sh[1610]: Waiting for interface ib0 to come online: 20 (unavailable)
Nov 1 16:13:48 d064 wait-for-interfaces.sh[1610]: Waiting for interface ib0 to come online: 20 (unavailable)
Nov 1 16:13:49 d064 wait-for-interfaces.sh[1610]: Waiting for interface ib0 to come online: 20 (unavailable)
Nov 1 16:13:50 d064 wait-for-interfaces.sh[1610]: Waiting for interface ib0 to come online: 20 (unavailable)
Nov 1 16:13:51 d064 wait-for-interfaces.sh[1610]: Waiting for interface ib0 to come online: 20 (unavailable)
Nov 1 16:13:52 d064 wait-for-interfaces.sh[1610]: Waiting for interface ib0 to come online: 20 (unavailable)
Nov 1 16:13:53 d064 wait-for-interfaces.sh[1610]: Waiting for interface ib0 to come online: 80 (connecting (checking IP connectivity))
Nov 1 16:13:54 d064 wait-for-interfaces.sh[1610]: Waiting for interface ib0 to come online: 100 (connected)
As you can see there are many intermediate steps before the "100
(connected)" status reports that ib0 is up.
The slurmd service will only start after this, which is what we wanted.
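For readers who want to see the general shape of such a wait loop, here is a
minimal sketch. It is not the actual script from Max's repository. It only
assumes that the state strings come from "nmcli -g GENERAL.STATE device show
<iface>", which prints values like the ones in the log above, e.g.
"20 (unavailable)" and "100 (connected)". The function names state_code,
get_state and wait_for_interface, and the 60 second default timeout, are made
up for illustration.

```shell
# Sketch only: poll NetworkManager until a device reports state 100 (connected).

# Print the leading numeric code of a GENERAL.STATE string such as
# "20 (unavailable)"; print nothing if the string has no numeric prefix.
state_code() {
    local code="${1%% *}"
    case "$code" in
        ''|*[!0-9]*) ;;                 # empty or non-numeric: no code
        *) printf '%s\n' "$code" ;;
    esac
}

# Ask NetworkManager for the device state. The output is empty while the
# device does not exist yet ("Error: Device 'ib0' not found.").
get_state() {
    nmcli -g GENERAL.STATE device show "$1" 2>/dev/null || true
}

# wait_for_interface IFACE [TIMEOUT_SECONDS] [STATE_COMMAND]
# Returns 0 once the state code is 100, or 1 after TIMEOUT_SECONDS polls.
# STATE_COMMAND defaults to get_state and can be replaced for testing.
wait_for_interface() {
    local iface="$1" timeout="${2:-60}" query="${3:-get_state}"
    local elapsed=0 state
    while [ "$elapsed" -lt "$timeout" ]; do
        state="$("$query" "$iface")"
        echo "Waiting for interface $iface to come online: $state"
        if [ "$(state_code "$state")" = "100" ]; then
            return 0
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 1
}
```

It could be called as "wait_for_interface ib0 120" from a oneshot unit that
has to finish before slurmd is started.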
Best regards,
Ole
On 11/1/23 14:03, Paulo Jose Braga Estrela wrote:
Ole,
Look at the NetworkManager-wait-online.service man page below (from RHEL 8.8).
Maybe your IB interfaces aren't properly configured in NetworkManager. The ***
markers were added by me.
" NetworkManager-wait-online.service blocks until NetworkManager logs "startup
complete" and announces startup complete on D-Bus. How long that takes depends
on the network and the NetworkManager configuration. If it takes longer than
expected, then the reasons need to be investigated in NetworkManager.
There are various reasons what affects NetworkManager reaching "startup
complete" and how long NetworkManager-wait-online.service blocks.
· In general, ***startup complete is not reached as long as NetworkManager is
busy activating a device and as long as there are profiles in activating
state ***. During boot, NetworkManager starts autoactivating suitable profiles
that are ***configured to autoconnect***. If activation fails, NetworkManager
might retry right away (depending on connection.autoconnect-retries setting).
While trying and retrying, NetworkManager is busy until all profiles and
devices either reached an activated or disconnected state and no further
events are expected.
***Basically, as long as there are devices and connections in activating state
visible with nmcli device and nmcli connection, startup is still pending. ***"
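One way to check the last point from the man page on a node is to list the
devices that are still activating. The following is only a sketch: it assumes
the terse output of "nmcli -t -f DEVICE,STATE device" (one "device:state" pair
per line) and that activating devices report a state beginning with
"connecting". The helper name activating_devices is invented for this example.

```shell
# Read "device:state" lines on stdin and print the devices whose state
# begins with "connecting", i.e. those still being activated.
activating_devices() {
    awk -F: '$2 ~ /^connecting/ { print $1 }'
}

# Typical use on a node (requires NetworkManager):
#   nmcli -t -f DEVICE,STATE device | activating_devices
```

If this prints nothing while ib0 is still "unavailable", then NetworkManager
is not activating the device at that moment.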
-----Original Message-----
From: slurm-users <slurm-users-boun...@lists.schedmd.com> On Behalf Of Ole Holm
Nielsen
Sent: Wednesday, November 1, 2023 05:19
To: slurm-users@lists.schedmd.com
Subject: Re: [slurm-users] RES: How to delay the start of slurmd until
Infiniband/OPA network is fully up?
Hi Paulo,
--
Ole Holm Nielsen
PhD, Senior HPC Officer
Department of Physics, Technical University of Denmark,
Fysikvej Building 309, DK-2800 Kongens Lyngby, Denmark
E-mail: ole.h.niel...@fysik.dtu.dk
Homepage: http://dcwww.fysik.dtu.dk/~ohnielse/
Mobile: (+45) 5180 1620
On 11/1/23 01:12, Paulo Jose Braga Estrela wrote:
I think that you should use NetworkManager-wait-online.service in RHEL 8. Take
a look at its man page. It only allows the system to reach network-online after
all network interfaces are online. So, if your OPA interfaces are managed by
NetworkManager, you can use it.
Unfortunately NetworkManager-wait-online.service returns as soon as 1 network
interface is up. It doesn't wait for any other networks, including the
Infiniband/OPA network :-(
You can see that the NetworkManager-wait-online.service file executes:
ExecStart=/usr/bin/nm-online -s -q
and this is causing our problems with Infiniband/OPA networks. This is the
reason why we need Max's workaround wait-for-interfaces.service.
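If slurmd should be ordered explicitly after the workaround unit, a systemd
drop-in is one possible way to do it. This is a sketch under assumptions: the
units are named slurmd.service and wait-for-interfaces.service, the drop-in
file name is arbitrary, and the helper write_slurmd_dropin is made up here.
Max's unit may already handle the ordering on its own, in which case this is
not needed.

```shell
# write_slurmd_dropin [DIR]
# Write a drop-in that makes slurmd pull in and start after
# wait-for-interfaces.service. DIR defaults to the slurmd drop-in directory.
write_slurmd_dropin() {
    local dir="${1:-/etc/systemd/system/slurmd.service.d}"
    mkdir -p "$dir" || return 1
    cat > "$dir/wait-for-interfaces.conf" <<'EOF'
[Unit]
Wants=wait-for-interfaces.service
After=wait-for-interfaces.service
EOF
}
```

After running "write_slurmd_dropin" as root, "systemctl daemon-reload" makes
systemd pick up the new drop-in.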
/Ole
-----Original Message-----
From: slurm-users <slurm-users-boun...@lists.schedmd.com> On Behalf Of Ole
Holm Nielsen
Sent: Tuesday, October 31, 2023 07:00
To: Slurm User Community List <slurm-users@lists.schedmd.com>
Subject: Re: [slurm-users] How to delay the start of slurmd until
Infiniband/OPA network is fully up?
Hi Jeffrey,
On 10/30/23 20:15, Jeffrey R. Lang wrote:
The service is available in RHEL 8 via the EPEL package repository as
systemd-networkd, i.e. systemd-networkd.x86_64 253.4-1.el8 epel
Thanks for the info. We can install the systemd-networkd RPM from the EPEL
repo as you suggest.
I tried to understand the properties of systemd-networkd before implementing it
in our compute nodes. While there are lots of networkd man-pages, it's harder
to find an overview of the actual properties of networkd. This is what I found:
* Networkd is a service included in recent versions of Systemd. It seems to be
an alternative to NetworkManager.
* Red Hat has stated that systemd-networkd is NOT going to be implemented in
RHEL 8 or 9.
* Comparing systemd-networkd and NetworkManager:
https://fedoracloud.readthedocs.io/en/latest/networkd.html
* Networkd is described in the Wikipedia article
https://en.wikipedia.org/wiki/Systemd
While networkd seems to be really nifty, I hesitate to replace NetworkManager
by networkd on our EL8 and EL9 systems because this is an unsupported and only
lightly tested setup, and it may require additional work to keep our systems
up-to-date in the future.
It seems to me that Max Rutkowski's solution in
https://github.com/maxlxl/network.target_wait-for-interfaces is less intrusive
than converting to systemd-networkd.
Best regards,
Ole
-----Original Message-----
From: slurm-users <slurm-users-boun...@lists.schedmd.com> On Behalf
Of Ole Holm Nielsen
Sent: Monday, October 30, 2023 1:56 PM
To: slurm-users@lists.schedmd.com
Subject: Re: [slurm-users] How to delay the start of slurmd until
Infiniband/OPA network is fully up?
Hi Jens,
Thanks for your feedback:
On 30-10-2023 15:52, Jens Elkner wrote:
Actually there is no need for such a script since
/lib/systemd/systemd-networkd-wait-online should be able to handle it.
It seems that systemd-networkd exists in Fedora FC38 Linux, but not
in RHEL 8 and clones, AFAICT.