Hi experts,

We have a CephFS (15.2.13) cluster mounted with the kernel client. When 2000+
processes read from the same CephFS path (call it /path/to/A/), all of the
processes hang and ls -lrth /path/to/A/ always gets stuck, while listing other
directories (e.g. /path/to/B/) works fine.

ceph health detail keeps reporting that the MDS has slow requests, and we have
to restart the MDS to fix the issue.

How can we fix this without restarting the MDS? Restarting it always impacts
other users.

Any suggestions are welcome! Thanks a ton!

Here is one entry from dump_ops_in_flight:

{
    "description": "client_request(client.100807215:2856632 getattr AsLsXsFs #0x200978a3326 2022-08-31T09:36:30.444927+0800 caller_id=2049, caller_gid=2049})",
    "initiated_at": "2022-08-31T09:36:30.454570+0800",
    "age": 17697.012491966001,
    "duration": 17697.012805568,
    "type_data": {
        "flag_point": "dispatched",
        "reqid": "client.100807215:2856632",
        "op_type": "client_request",
        "client_info": {
            "client": "client.100807215",
            "tid": 2856632
        },
        "events": [
            {
                "time": "2022-08-31T09:36:30.454570+0800",
                "event": "initiated"
            },
            {
                "time": "2022-08-31T09:36:30.454572+0800",
                "event": "throttled"
            },
            {
                "time": "2022-08-31T09:36:30.454570+0800",
                "event": "header read"
            },
            {
                "time": "2022-08-31T09:36:30.454580+0800",
                "event": "all_read"
            },
            {
                "time": "2022-08-31T09:36:30.454604+0800",
                "event": "dispatched"
            }
        ]
    }
}
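
To see which inode and which clients the stuck requests pile up on, the full
dump can be summarized with a small script. This is only a rough sketch: it
assumes the dump is JSON with the entries under a top-level "ops" key and that
each description looks like the one above. The names summarize_ops and
min_age, and the sample entry, are made up for illustration and are not part
of Ceph.

```python
import json
import re
from collections import Counter

# Pulls the client id, the operation name and the inode number out of a
# description such as:
#   client_request(client.100807215:2856632 getattr AsLsXsFs #0x200978a3326 ...)
DESC_RE = re.compile(r"client_request\((client\.\d+):\d+ (\w+) .*?#(0x[0-9a-f]+)")


def summarize_ops(dump, min_age=30.0):
    """Count ops at least min_age seconds old, per target and per client.

    Returns two Counters: one keyed by (inode, operation, flag_point) and one
    keyed by client id. Entries whose description does not match are skipped.
    """
    by_target = Counter()
    by_client = Counter()
    for op in dump.get("ops", []):  # assumption: entries live under "ops"
        if op.get("age", 0.0) < min_age:
            continue
        match = DESC_RE.search(op.get("description", ""))
        if match is None:
            continue
        client, op_name, inode = match.groups()
        flag_point = op.get("type_data", {}).get("flag_point", "unknown")
        by_target[(inode, op_name, flag_point)] += 1
        by_client[client] += 1
    return by_target, by_client


# Illustrative input built from the single entry shown in this post.
sample_dump = {
    "ops": [
        {
            "description": "client_request(client.100807215:2856632 getattr "
                           "AsLsXsFs #0x200978a3326 2022-08-31T09:36:30.444927+0800)",
            "age": 17697.012491966001,
            "type_data": {"flag_point": "dispatched"},
        }
    ]
}

targets, clients = summarize_ops(sample_dump)
for key, count in targets.most_common(10):
    print(count, *key)
for client, count in clients.most_common(10):
    print(count, client)
```

For a real dump, the sample dictionary would be replaced by json.load() on a
file holding the saved output. If most of the old requests point at the same
inode, that would at least confirm the hang is limited to one directory or
file rather than the whole MDS.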



Thanks,
Xiong
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io