Ricardo,

Your --mgsnode specification with all commas implies that you have four NIDs on a single host. But the rest of your writeup indicates two hosts.
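
As a rough sketch of the difference, the small shell helper below joins the NIDs of one host with commas and separates the hosts with a colon, then prints (but does not run) a tunefs.lustre command using the same options as in your post. The grouping of .201/.202 on one host and .203/.204 on the other is purely an assumption for illustration, since your post does not say which NID belongs to which host, and build_mgsnode is a hypothetical helper, not a Lustre tool.

```shell
#!/bin/sh
# build_mgsnode: hypothetical helper, not part of Lustre.
# Each argument holds the NIDs of ONE host, separated by spaces.
build_mgsnode() {
    spec=""
    for host_nids in "$@"; do
        # NIDs that belong to the same host are joined with commas.
        joined=$(printf '%s' "$host_nids" | tr -s ' ' ',')
        # Different hosts (failover partners) are separated by a colon.
        spec="${spec:+$spec:}$joined"
    done
    printf '%s\n' "$spec"
}

# ASSUMPTION: .201/.202 are on one MGS host, .203/.204 on its failover partner.
MGSNODE=$(build_mgsnode \
    "10.10.10.201@o2ib 10.10.10.202@o2ib" \
    "10.10.10.203@o2ib 10.10.10.204@o2ib")

# Dry run: print the command for review instead of executing it.
echo "tunefs.lustre --erase-param mgsnode --writeconf --mgsnode=$MGSNODE zfs_R10_nvme0-4/dne_mdt1"
```

Check the printed command against your actual host layout before running anything on the target.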

From the Lustre manual, "13.12. Specifying NIDs and Failover":

> Where multiple NIDs are specified separated by commas (for example,
> 10.67.73.200@tcp,192.168.10.1@tcp), the two NIDs refer to the same host,
> and the Lustre software chooses the best one for communication. When a pair
> of NIDs is separated by a colon (for example,
> 10.67.73.200@tcp:10.67.73.201@tcp), the two NIDs refer to two different
> hosts and are treated as a failover pair (the Lustre software tries the
> first one, and if that fails, it tries the second one.)

Hope this helps,
Nathan

On Sat, Dec 4, 2021 at 5:27 AM Thomas Roth <[email protected]> wrote:

> Dear Ricardo,
>
> perhaps the syntax of the --mgsnode specification?
>
> Which Lustre version are you running? There might have been changes in the
> way mgsnodes are specified.
>
> And the four NIDs you mentioned, are these all failover partners? Or DNS
> nodes?
>
> Example from our site:
> We have three MDS, each a pair of active server and failover partner.
> The format command for the first (MGS+MDT0) read (under Lustre 2.10.6):
>
> ... --servicenode=10.20.3.0@o2ib5 --servicenode=10.20.3.1@o2ib5
> --mgsnode=10.20.3.0@o2ib5 --mgsnode=10.20.3.1@o2ib5 ...
> No comma, no colon.
> The format command for the second (MDT1) read:
>
> ... --servicenode=10.20.2.236@o2ib5 --servicenode=10.20.2.237@o2ib5
> --mgsnode=10.20.3.0@o2ib5 --mgsnode=10.20.3.1@o2ib5 ...
> Obviously the servicenodes are the IPs of MDT1 and its failover partner,
> the mgsnodes are again the IPs of MGS and its partner.
>
>
> Regards,
> Thomas
>
> On 11/30/21 19:05, Ricardo Brugman wrote:
> > Hi all,
> >
> > I’ve seen many questions/issues came by and I decided to post the issue that I encountered.
> >
> > Recently I tried updating the mgsnode IP address on a lustre node and although the command executed successfully, the old IP value remained.
> >
> > Old value: 10.10.10.2 (points to a server that is not a mgsnode)
> > New value: 10.10.10.201@o2ib,10.10.10.202@o2ib,10.10.10.203@o2ib,10.10.10.204@o2ib
> >
> > Please find the command and output below:
> >
> > [root@xxx ~]# tunefs.lustre --erase-param mgsnode --writeconf --mgsnode=10.10.10.201@o2ib,10.10.10.202@o2ib,10.10.10.203@o2ib,10.10.10.204@o2ib zfs_R10_nvme0-4/dne_mdt1
> > checking for existing Lustre data: found
> >
> > Read previous values:
> > Target: neohpfs-MDT0001
> > Index: 1
> > Lustre FS: neohpfs
> > Mount type: zfs
> > Flags: 0x1
> > (MDT )
> > Persistent mount opts:
> > Parameters: mgsnode=10.10.10.2@o2ib
> >
> > Permanent disk data:
> > Target: neohpfs=MDT0001
> > Index: 1
> > Lustre FS: neohpfs
> > Mount type: zfs
> > Flags: 0x141
> > (MDT update writeconf )
> > Persistent mount opts:
> > Parameters: mgsnode=:10.10.10.201@o2ib,10.10.10.202@o2ib,10.10.10.203@o2ib,10.10.10.204@o2ib
> > [root@xxx ~]#
> >
> > I did restart the lustre service thinking this would perhaps load the new value/config and although the service came up successfully, it still had not loaded the new value.
> >
> > Appreciate any help, suggestions you can provide as to why the new value was not saved/loaded. In case I made a mistake, or I followed the incorrect step(s)/process than please, feel free to point that out.
> >
> > Best Regards,
> > Ricardo
_______________________________________________
lustre-discuss mailing list
[email protected]
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
