Nathan, I've created a Jira issue for this, 
LU-13285<https://jira.whamcloud.com/browse/LU-13285>. In it I attached the 
output of an strace in which I was able to capture a series of both successful 
and failed df runs.
________________________________
From: Nathan Dauchy - NOAA Affiliate <[email protected]>
Sent: Thursday, February 20, 2020 2:35 PM
To: Konzem, Kevin P <[email protected]>
Cc: [email protected] <[email protected]>
Subject: [EXTERNAL] Re: [lustre-discuss] DF bug with lustre 2.12.4

On Thu, Feb 20, 2020 at 11:47 AM Konzem, Kevin P 
<[email protected]<mailto:[email protected]>> wrote:
test this by running 'while [ true ];do /bin/df -TP /performance;done' on two 
sessions on the same client. As soon as I start the second while loop, the 
outputs go from:
Filesystem                 Type   1024-blocks   Used Available Capacity Mounted on
192.168.0.181@tcp:/perform lustre    71467728 100416  67664944       1% /performance

to:
Filesystem                 Type   1024-blocks  Used Available Capacity Mounted on
192.168.0.181@tcp:/perform lustre           0    -0        -0      50% /performance

Kevin,

I can confirm seeing this issue intermittently as well, and usually with a 
re-run of df the results are once again reasonable.  It looks like you have a 
more reliable reproducer though, which is good!  A support ticket was opened 
with our vendor, and they said if we can capture a "strace" of it for a bad run 
that might be helpful... but I haven't caught it in the act yet.  With your 
reproducer, can you get that and open a Jira ticket to track the problem?
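
One way to catch a bad run in the act might look like the sketch below. It re-runs df under strace and stops as soon as the output reports 0 total blocks, so the trace file left behind belongs to a failing run. The function names and the default trace path are hypothetical, and the check assumes the single-line-per-filesystem layout that "df -P" produces.

```shell
#!/bin/sh
# Sketch only: the function names and trace path below are hypothetical.

# Exit 0 when `df -TP` output on stdin reports 0 total blocks, as in the
# failing sample. Field 3 of the data line is the 1024-blocks column.
is_bad_df_output() {
    awk 'NR == 2 && $3 == 0 { bad = 1 } END { exit bad ? 0 : 1 }'
}

# Re-run df under strace until a bad result appears. Each run overwrites
# the trace file, so the file that remains is from the bad run.
capture_bad_df() {
    mnt=$1
    trace=${2:-/tmp/df-bad.strace}   # hypothetical output path
    while :; do
        out=$(strace -f -tt -o "$trace" /bin/df -TP "$mnt")
        if printf '%s\n' "$out" | is_bad_df_output; then
            printf '%s\n' "$out"
            echo "bad df captured; strace saved to $trace"
            return 0
        fi
    done
}
```

With these functions loaded, running "capture_bad_df /performance" in one session while the plain df loop runs in a second session should, if the reproducer behaves as described, leave a trace of a failing call.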

As a workaround, try "lfs df" instead; it may take a different code path that 
avoids the bug.
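
For scripts that parse df output, the re-run and "lfs df" ideas could be combined roughly as follows. This is only a sketch: df_checked, its retry count, and the DF_CMD override are hypothetical, and note that "lfs df" prints a different, per-target layout than df does.

```shell
#!/bin/sh
# Sketch only: df_checked, its default retry count, and DF_CMD are hypothetical.

# Run df up to $2 times (default 3); print the first result whose total
# block count is non-zero. If every attempt looks bad, fall back to "lfs df".
df_checked() {
    mnt=$1
    tries=${2:-3}
    i=0
    while [ "$i" -lt "$tries" ]; do
        out=$(${DF_CMD:-/bin/df} -TP "$mnt") || return 1
        # Field 3 of the data line is the 1024-blocks total; 0 marks the bad case.
        if printf '%s\n' "$out" |
            awk 'NR == 2 && $3 == 0 { bad = 1 } END { exit bad ? 1 : 0 }'; then
            printf '%s\n' "$out"
            return 0
        fi
        i=$((i + 1))
    done
    # All attempts reported 0 blocks; use the Lustre-specific tool instead.
    lfs df "$mnt"
}
```

Usage would be "df_checked /performance". Whether "lfs df" really avoids the problem is untested here, as noted above.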

-Nathan

_______________________________________________
lustre-discuss mailing list
[email protected]
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
