Hello,

We are currently running Lustre 1.8.1.1 with kernel version 
2.6.18_128.7.1.el5_lustre.

We are experiencing problems when reading large files from our Lustre 
filesystem; small reads are not affected.

The read process hangs and the following message is reported in 
/var/log/messages:

Feb 22 15:59:38 leopard kernel: LustreError: 11-0: an error occurred while communicating with 192.168.13.200@o2ib. The obd_ping operation failed with -107
Feb 22 15:59:38 leopard kernel: Lustre: lustre-OST0000-osc-ffff81067e0eac00: Connection to service lustre-OST0000 via nid 192.168.13.200@o2ib was lost; in progress operations using this service will wait for recovery to complete.
Feb 22 15:59:38 leopard kernel: LustreError: 6811:0:(import.c:939:ptlrpc_connect_interpret()) lustre-OST0000_UUID went back in time (transno 476754140074 was previously committed, server now claims 0)!  See https://bugzilla.lustre.org/show_bug.cgi?id=9646
Feb 22 15:59:38 leopard kernel: LustreError: 167-0: This client was evicted by lustre-OST0000; in progress operations using this service will fail.
Feb 22 15:59:38 leopard kernel: Lustre: lustre-OST0000-osc-ffff81067e0eac00: Connection restored to service lustre-OST0000 using nid 192.168.13.200@o2ib.
Feb 22 15:59:38 leopard kernel: LustreError: 17592:0:(lov_request.c:196:lov_update_enqueue_set()) enqueue objid 0x18f87222 subobj 0x4d0c9f on OST idx 0: rc -5
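
For reference, the negative numbers in these messages (-107 and -5) look like
negated Linux errno values, which would make them ENOTCONN and EIO. Below is a
minimal Python sketch for pulling such codes out of log lines. The helper name
decode_lustre_rc and the small lookup table are hypothetical, written only for
illustration; the table assumes Linux errno numbering and covers just the two
codes seen above.

```python
import re

# Assumption: Linux errno numbering (values differ on other platforms).
# Only the two codes that appear in the log above are listed.
LINUX_ERRNO = {
    5: ("EIO", "Input/output error"),
    107: ("ENOTCONN", "Transport endpoint is not connected"),
}

# Matches the two forms seen in the log: "failed with -107" and "rc -5".
CODE_RE = re.compile(r"(?:failed with|rc) -(\d+)")


def decode_lustre_rc(line):
    """Return a list of (code, name, description) for return codes in a log line."""
    results = []
    for match in CODE_RE.finditer(line):
        code = int(match.group(1))
        name, text = LINUX_ERRNO.get(code, ("UNKNOWN", "not in this table"))
        results.append((code, name, text))
    return results


if __name__ == "__main__":
    sample = [
        "The obd_ping operation failed with -107",
        "enqueue objid 0x18f87222 subobj 0x4d0c9f on OST idx 0: rc -5",
    ]
    for line in sample:
        for code, name, text in decode_lustre_rc(line):
            print(f"-{code}: {name} ({text})")
```

If that reading is right, the ping failure means the client no longer had a
connection to the OST, and the enqueue error is the I/O error returned to the
reading process after the eviction.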

I have checked the Bugzilla report, but we have not had a disk crash and the 
system was not restarted. Could this be an underlying hardware problem that is 
not getting logged?

Any additional help on this matter would be much appreciated.

Kind Regards

Chris

