It looks like the writeconf flag is set on the OST you are trying to mount. Did you completely replace the OST with a newly formatted one, or did you set the writeconf flag on the existing OST?
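For reference, the flag is visible in the "options=" line of the verbose mount output quoted below. Here is a minimal sketch in shell of how one might check for it in a script. The helper name has_writeconf is made up for illustration and is not a Lustre tool; the option string is copied from the quoted mount output.

```shell
#!/bin/sh
# Hypothetical helper (not part of Lustre): succeeds if the comma-separated
# option string contains a bare "writeconf" entry.
has_writeconf() {
    case ",$1," in
        *,writeconf,*) return 0 ;;
        *) return 1 ;;
    esac
}

# Option string copied from the verbose mount output quoted in this thread.
opts='osd=osd-ldiskfs,,errors=remount-ro,mgsnode=172.21.49.70@tcp,writeconf,param=mgsnode=172.21.49.70@tcp,svname=ana04-OST0005,device=/dev/sdg'

if has_writeconf "$opts"; then
    echo "writeconf is set"
else
    echo "writeconf is not set"
fi
```

This only inspects text that mount.lustre already printed; it does not read or change anything on the device.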
The writeconf flag tells Lustre to regenerate the configuration logs, but the logs need to be regenerated on all the MDTs and OSTs, not on a single target. This is why the MDS logs contained the message to writeconf the MDT first. The Lustre manual contains the procedure on how to do this.

--
Rick Mohr
Senior HPC System Administrator
Joint Institute for Computational Sciences
University of Tennessee

> On Jun 1, 2020, at 12:05 PM, Quijano, Omar E. <[email protected]> wrote:
>
> [External Email]
>
> Dear Lustre Users,
>
> There was an issue with a degraded volume group.
> After replacing the failed disks and mount the OST in question, I get the
> following error:
>
> From OSS side:
> # mount -v -t lustre /ost_5
> arg[0] = /sbin/mount.lustre
> arg[1] = -v
> arg[2] = -o
> arg[3] = rw
> arg[4] = /dev/sdg
> arg[5] = /ost_5
> source = /dev/sdg (/dev/sdg), target = /ost_5
> options = rw
> checking for existing Lustre data: found
> Reading CONFIGS/mountdata
> mounting device /dev/sdg at /ost_5, flags=0x1000000
> options=osd=osd-ldiskfs,,errors=remount-ro,mgsnode=172.21.49.70@tcp,writeconf,param=mgsnode=172.21.49.70@tcp,svname=ana04-OST0005,device=/dev/sdg
> mount.lustre: mount /dev/sdg at /ost_5 failed: File exists retries left: 0
> mount.lustre: mount /dev/sdg at /ost_5 failed: File exists
>
> From the MDS Side:
> MGS: Connection restored to 172.21.52.57@o2ib (at 172.21.49.57@tcp)
> Jun 1 08:52:13 kernel: [283815.063427] Lustre: MGS: Regenerating
> ana04-OST0005 log by user request.
> Jun 1 08:52:13 kernel: [283815.063435] Lustre: Found index 5 for
> ana04-OST0005, updating log
> Jun 1 08:52:13 kernel: [283815.063588] Lustre: Client log for ana04-OST0005
> was not updated; writeconf the MDT first to regenerate it.
> Jun 1 08:52:16 kernel: [283818.785764] Lustre: ana04-MDT0000: Connection
> restored to 172.21.52.57@o2ib (at 172.21.49.57@tcp)
> Jun 1 08:56:56 kernel: [284098.343206] Lustre:
> 21769:0:(client.c:2063:ptlrpc_expire_one_request()) @@@ Request sent has
> timed out for slow reply: [sent 1591026960/real 1591026960]
> req@ffff8803cbb52d00 x1668283989359148/t0(0)
> o8->[email protected]@tcp:28/4 lens 520/544 e 0 to 1 dl
> 1591027016 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1
> Jun 1 08:56:56 kernel: [284098.343214] Lustre:
> 21769:0:(client.c:2063:ptlrpc_expire_one_request()) Skipped 96 previous
> similar messages
>
> Any input would be greatly appreciated it.
> Thank you,
> --
> Omar E. Quijano
> LCLS IT/Networking Department Head
> SLAC National Accelerator Laboratory
> T: (650) 926-5436
>
> _______________________________________________
> lustre-discuss mailing list
> [email protected]
> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
