If it is taking too long for the targets to sync up, you can speed things up by adjusting some osp tunables.
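As a rough sketch of what that could look like on the MDS (the parameter names sync_in_progress, destroys_in_flight and max_rpcs_in_progress are the ones named in this message; the `osp.*.` glob, the `sum_params` helper and the commented-out set_param line are illustrative assumptions, and no particular value is being recommended here):

```shell
#!/bin/sh
# Hypothetical helper: sum the values of "name=value" lines,
# which is the format `lctl get_param` prints for single-value parameters.
sum_params() {
    awk -F= '{ total += $2 } END { print total + 0 }'
}

# Only attempt the Lustre commands where lctl is installed (i.e. on the MDS).
if command -v lctl >/dev/null 2>&1; then
    # Show per-OST sync activity and queued object destroys.
    lctl get_param 'osp.*.sync_in_progress' 'osp.*.destroys_in_flight'

    # Total destroys still in flight across all OSTs.
    echo "destroys in flight: $(lctl get_param 'osp.*.destroys_in_flight' | sum_params)"

    # Current limit on concurrent sync RPCs per OST.
    lctl get_param 'osp.*.max_rpcs_in_progress'

    # Raise the limit only if the backlog drains slowly; the value is site-specific.
    # lctl set_param 'osp.*.max_rpcs_in_progress=<new value>'
fi
```

Running the get_param lines repeatedly while a large delete is in progress should show whether the counters are draining before you change anything.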
First, monitor osp sync_in_progress and destroys_in_flight to see if that’s what’s going on. Then you can tune up the MDS’s osp max_rpcs_in_progress if necessary.

-Cory

On 4/19/22, 7:31 PM, "lustre-discuss" <[email protected]> wrote:

One thing you can look at is running 'zpool iostat 1' (there are many options) to monitor that ZFS is still doing I/O during that time gap. With NVMe though, as Andreas said, I would expect that time gap to last seconds to minutes, not hours.

On 4/19/22 02:16, Einar Næss Jensen wrote:
> Thank you for answering Andreas.
>
> Lustre version is 2.12.8
>
> It was indeed when deleting io500 files that we discovered this, but we also
> see it when deleting other files, that the "df" lags 1-2 hours behind.
> We see it both on nvme and ssd drives. Haven't checked hdd drives/osts yet.
>
> This is a new lustre setup, and benchmarks are good (in our opinion). For now
> it is just this annoyance bugging us.
> We didn't notice on previous lustre setup but will check if we see it there
> also.
>
> Einar
>
> ________________________________________
> From: Andreas Dilger <[email protected]>
> Sent: Monday, April 11, 2022 18:01
> To: Einar Næss Jensen
> Cc: [email protected]
> Subject: Re: [lustre-discuss] question regarding du vs df on lustre
>
> Lustre is returning the file unlink from the MDS immediately, but deleting
> the objects from the OSTs asynchronously in the background.
>
> How many files are being deleted in this case? If you are running tests
> like IO500, where there are many millions of small files plus some huge
> files, then it may be that huge object deletion is behind small objects?
>
> That said, it probably shouldn't take hours to finish if the OST storage is
> NVMe based.
>
> Cheers, Andreas
>
>> On Apr 4, 2022, at 05:05, Einar Næss Jensen <[email protected]>
>> wrote:
>>
>> Hello lustre people.
>>
>> We are experimenting with lustre on nvme, and observe the following issue:
>> After running benchmarks and deleting benchmark files, we see that df and du
>> report different sizes:
>>
>> [root@idun-02-27 ~]# du -hs /nvme/
>> 38M /nvme/
>> [root@idun-02-27 ~]# df -h|grep nvme
>> 10.3.1.2@o2ib:/nvme 5.5T 3.9T 1.3T 76% /nvme
>>
>> It takes several hours before du and df agree.
>>
>> What is causing this?
>> How can we get updated records for df immediately when deleting files?
>>
>> Best regards
>> Einar
_______________________________________________
lustre-discuss mailing list
[email protected]
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
