Thank you Thomas and Nathan for your responses. Some more information regarding the setup:
Lustre version: 2.12.0

There are two Lustre nodes, each with four InfiniBand interfaces (NIDs), and there is only one MGS, which runs on the first Lustre node. The four NIDs correspond to the four IPs in the previously shared mgsnode syntax (.201, .202, etc.), all of which belong to the first Lustre node, so although none is a separate failover partner, each is at least a separate IB interface. There is no failover for the MGS.

@Thomas, I did not come across the --servicenode syntax in the information that I found, but I’ll look into it and use it for the new virtualized Lustre environment I’m building.

Thanks again for your help and insights,
Ricardo

From: lustre-discuss <[email protected]> on behalf of Nathan Dauchy - NOAA Affiliate via lustre-discuss <[email protected]>
Date: Monday, December 6, 2021 at 7:25 AM
To: lustre-discuss <[email protected]>
Subject: Re: [lustre-discuss] Updating mgsnode IP command completes successfully, but old IP remains

Ricardo,

Your --mgsnode specification with all commas implies that you have four NIDs on a single host. But the rest of your writeup indicates two hosts.

From the Lustre manual, "13.12. Specifying NIDs and Failover":

Where multiple NIDs are specified separated by commas (for example, 10.67.73.200@tcp,192.168.10.1@tcp), the two NIDs refer to the same host, and the Lustre software chooses the best one for communication. When a pair of NIDs is separated by a colon (for example, 10.67.73.200@tcp:10.67.73.201@tcp), the two NIDs refer to two different hosts and are treated as a failover pair (the Lustre software tries the first one, and if that fails, it tries the second one.)

Hope this helps,
Nathan

On Sat, Dec 4, 2021 at 5:27 AM Thomas Roth <[email protected]> wrote:

Dear Ricardo,

perhaps the syntax of the --mgsnode specification?
Which Lustre version are you running? There might have been changes in the way mgsnodes are specified.
And the four NIDs you mentioned, are these all failover partners? Or DNS nodes?

Example from our site: We have three MDS, each a pair of active server and failover partner.
The format command for the first (MGS+MDT0) read (under Lustre 2.10.6):

> ... --servicenode=10.20.3.0@o2ib5 --servicenode=10.20.3.1@o2ib5
> --mgsnode=10.20.3.0@o2ib5 --mgsnode=10.20.3.1@o2ib5 ...

No comma, no colon.

The format command for the second (MDT1) read:

> ... --servicenode=10.20.2.236@o2ib5 --servicenode=10.20.2.237@o2ib5
> --mgsnode=10.20.3.0@o2ib5 --mgsnode=10.20.3.1@o2ib5 ...

Obviously the servicenodes are the IPs of MDT1 and its failover partner, the mgsnodes are again the IPs of MGS and its partner.

Regards,
Thomas

On 11/30/21 19:05, Ricardo Brugman wrote:
> Hi all,
>
> I’ve seen many questions/issues come by and I decided to post the issue that
> I encountered.
>
> Recently I tried updating the mgsnode IP address on a lustre node and
> although the command executed successfully, the old IP value remained.
>
> Old value: 10.10.10.2 (points to a server that is not a mgsnode)
> New value:
> 10.10.10.201@o2ib,10.10.10.202@o2ib,10.10.10.203@o2ib,10.10.10.204@o2ib
>
> Please find the command and output below:
>
> [root@xxx ~]# tunefs.lustre --erase-param mgsnode --writeconf
> --mgsnode=10.10.10.201@o2ib,10.10.10.202@o2ib,10.10.10.203@o2ib,10.10.10.204@o2ib
> zfs_R10_nvme0-4/dne_mdt1
> checking for existing Lustre data: found
>
> Read previous values:
> Target: neohpfs-MDT0001
> Index: 1
> Lustre FS: neohpfs
> Mount type: zfs
> Flags: 0x1
> (MDT )
> Persistent mount opts:
> Parameters: mgsnode=10.10.10.2@o2ib
>
> Permanent disk data:
> Target: neohpfs=MDT0001
> Index: 1
> Lustre FS: neohpfs
> Mount type: zfs
> Flags: 0x141
> (MDT update writeconf )
> Persistent mount opts:
> Parameters:
> mgsnode=:10.10.10.201@o2ib,10.10.10.202@o2ib,10.10.10.203@o2ib,10.10.10.204@o2ib
> [root@xxx ~]#
>
> I did restart the lustre service thinking this would perhaps load the new
> value/config and although the service came up successfully, it still had not
> loaded the new value.
>
> I would appreciate any help or suggestions you can provide as to why the new
> value was not saved/loaded. In case I made a mistake, or I followed the
> incorrect step(s)/process, then please feel free to point that out.
>
> Best Regards,
> Ricardo
>
> _______________________________________________
> lustre-discuss mailing list
> [email protected]
> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
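The comma/colon rule quoted from the Lustre manual in this thread can be illustrated with a small sketch. It is not part of Lustre: `parse_nid_spec` is a hypothetical helper written only for this illustration, and it assumes IPv4-style NIDs (no colons inside an address). It groups a NID specification into hosts, so you can see how many hosts a given string describes.

```python
from typing import List


def parse_nid_spec(spec: str) -> List[List[str]]:
    """Group a NID specification into hosts, each a list of NIDs.

    Per the manual text quoted in this thread:
      - a colon separates NIDs of different hosts (a failover pair),
      - a comma separates NIDs that belong to the same host.
    Hypothetical helper; assumes IPv4-style NIDs such as 10.0.0.1@o2ib.
    """
    hosts: List[List[str]] = []
    for host_part in spec.split(":"):
        nids = [nid.strip() for nid in host_part.split(",") if nid.strip()]
        if nids:  # ignore empty segments, e.g. from a leading ':'
            hosts.append(nids)
    return hosts


# The all-comma specification from the original post: one host, four NIDs.
single_host = parse_nid_spec(
    "10.10.10.201@o2ib,10.10.10.202@o2ib,10.10.10.203@o2ib,10.10.10.204@o2ib"
)

# The colon example from the manual: two hosts forming a failover pair.
failover_pair = parse_nid_spec("10.67.73.200@tcp:10.67.73.201@tcp")

print(len(single_host), "host(s),", len(single_host[0]), "NIDs")  # 1 host(s), 4 NIDs
print(len(failover_pair), "host(s)")  # 2 host(s)
```

Read this way, the all-comma string describes four interfaces of a single MGS host, which matches the setup Ricardo describes (one MGS, four IB interfaces, no failover).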
