Hi,

We have Lustre clients running CentOS 6.2 (kernel 2.6.32-220) with Lustre client version 2.1.1, and a Lustre filesystem is mounted on them. The Lustre servers run CentOS 7.6 (kernel 3.10.0-957) with Lustre server version 2.10.7. The filesystem mounts on the clients without any issues, and reads and writes work. However, some I/O requests randomly get stuck as shown in the log below (possibly because the node runs out of memory); after that the CPU load keeps increasing and many processes hang.
########################################################################
LustreError: 6382:0:(client.c:1060:ptlrpc_import_delay_req()) @@@ IMP_INVALID req@ffff8807611c4000 x1657240365560063/t0(0) o-1->[email protected]@o2ib:28/4 lens 296/352 e 0 to 0 dl 0 ref 1 fl Rpc:/ffffffff/ffffffff rc 0/-1
Mar 16 13:34:50 ycn161 kernel: LustreError: 6382:0:(client.c:1060:ptlrpc_import_delay_req()) @@@ IMP_INVALID req@ffff880afe79ac00 x1657240365560079/t0(0) o-1->[email protected]@o2ib:28/4 lens 296/352 e 0 to 0 dl 0 ref 1 fl Rpc:/ffffffff/ffffffff rc 0/-1
Mar 16 13:34:50 ycn161 kernel: LustreError: 6382:0:(client.c:1060:ptlrpc_import_delay_req()) Skipped 4 previous similar messages
Mar 16 13:34:50 ycn161 kernel: pw.x invoked oom-killer: gfp_mask=0x0, order=0, oom_adj=0, oom_score_adj=0
Mar 16 13:34:50 ycn161 kernel: pw.x cpuset=/ mems_allowed=0-1
Mar 16 13:34:50 ycn161 kernel: Pid: 113644, comm: pw.x Tainted: GF ---------------- 2.6.32-220.el6.x86_64 #1
Mar 16 13:34:50 ycn161 kernel: Call Trace:
Mar 16 13:34:50 ycn161 kernel: [<ffffffff810c2cb1>] ? cpuset_print_task_mems_allowed+0x91/0xb0
Mar 16 13:34:50 ycn161 kernel: [<ffffffff81113a30>] ? dump_header+0x90/0x1b0
Mar 16 13:34:50 ycn161 kernel: [<ffffffff8120d97c>] ? security_real_capable_noaudit+0x3c/0x70
Mar 16 13:34:50 ycn161 kernel: [<ffffffff81113eba>] ? oom_kill_process+0x8a/0x2c0
Mar 16 13:34:50 ycn161 kernel: [<ffffffff81113dae>] ? select_bad_process+0x9e/0x120
Mar 16 13:34:50 ycn161 kernel: [<ffffffff81114310>] ? out_of_memory+0x220/0x3c0
Mar 16 13:34:50 ycn161 kernel: [<ffffffff81047187>] ? pte_alloc_one+0x37/0x50
Mar 16 13:34:50 ycn161 kernel: [<ffffffff81114575>] ? pagefault_out_of_memory+0xc5/0x110
Mar 16 13:34:50 ycn161 kernel: [<ffffffff8104277e>] ? mm_fault_error+0x4e/0x100
Mar 16 13:34:50 ycn161 kernel: [<ffffffff81042d36>] ? __do_page_fault+0x336/0x480
Mar 16 13:34:50 ycn161 kernel: [<ffffffff8100bc0e>] ? apic_timer_interrupt+0xe/0x20
Mar 16 13:34:50 ycn161 kernel: [<ffffffff814f246e>] ? do_page_fault+0x3e/0xa0
Mar 16 13:34:50 ycn161 kernel: [<ffffffff814ef825>] ? page_fault+0x25/0x30
Mar 16 13:34:50 ycn161 kernel: Mem-Info:
Mar 16 13:34:50 ycn161 kernel: Node 0 DMA per-cpu:
Mar 16 13:34:50 ycn161 kernel: CPU 0: hi: 0, btch: 1 usd: 0
########################################################################
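In case it helps with narrowing this down, here is a minimal Python sketch for checking whether the IMP_INVALID errors and the oom-killer invocations fall in the same second on the same node. It assumes the syslog prefix format seen in the excerpt ("Mar 16 13:34:50 ycn161 kernel: ..."); the function names and event labels are made up for illustration and are not part of Lustre or any existing tool.

```python
import re
from collections import Counter

# Assumed syslog prefix, as in the excerpt: "Mar 16 13:34:50 ycn161 kernel: <message>"
PREFIX = re.compile(r"^(\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+(\S+)\s+kernel:\s+(.*)$")


def classify(message):
    """Return a coarse (hypothetical) event label for a kernel message, or None."""
    if "IMP_INVALID" in message:
        return "lustre_imp_invalid"
    if "invoked oom-killer" in message:
        return "oom_killer"
    return None


def summarize(lines):
    """Count labelled events per (timestamp, host, label); skip everything else."""
    counts = Counter()
    for line in lines:
        match = PREFIX.match(line.strip())
        if not match:
            continue  # line without the expected syslog prefix
        timestamp, host, message = match.groups()
        label = classify(message)
        if label:
            counts[(timestamp, host, label)] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage (path is an example): python scan_log.py < /var/log/messages
    for (timestamp, host, label), n in sorted(summarize(sys.stdin).items()):
        print(timestamp, host, label, n)
```

Lines without the syslog prefix (such as the first LustreError line in the excerpt) are skipped, and "Skipped N previous similar messages" lines are not expanded, so the counts are only a lower bound.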
