Was it the servers that were migrated, or the clients? Or both? (Asking because we upgraded our clients to 9.6 (Rocky) very recently, but our servers are still on 8.10 (Rocky).)
We have not seen the behaviour you describe on our setup.

Best Regards
Einar

________________________________________
From: lustre-discuss <[email protected]> on behalf of Äkäslompolo Simppa via lustre-discuss <[email protected]>
Sent: Friday, August 29, 2025 11:05
To: lustre-discuss
Subject: [lustre-discuss] rhel9.5 --> rhel9.6 ==> unstable, ldlm_cnxx_yyy on metadata server

Hi!

I thought to give an early warning / cry for help in case others are facing similar issues.

Coincidence or not, our Lustre setup became unstable soon after we started migrating nodes from RHEL 9.5 to RHEL 9.6. The key symptom is high load on the metadata servers: processes like ldlm_cn03_017 take all available CPU time. Memory hogging also happened yesterday, which crashed the servers completely. The processes are distributed lock kernel "daemons".

Best regards,
--
- Simppa -
Mr. Simppa Äkäslompolo
High performance computing specialist
Doctor of Science (Tech.)
Aalto Scientific Computing
School of Science, Aalto University, Finland
+358-50-5311327
https://scicomp.aalto.fi/

_______________________________________________
lustre-discuss mailing list
[email protected]
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
