Re: [PATCH v3 07/13] scsi_transport_srp: Add transport layer error handling

2013-07-08 Thread David Dillow
On Thu, 2013-07-04 at 10:01 +0200, Bart Van Assche wrote:
> On 07/03/13 20:57, David Dillow wrote:
> > And I'm getting the strong sense that the answer to my question about
> > fast_io_fail_tmo >= 0 when dev_loss_tmo is that we should not allow that
> > combination, even if it doesn't break the ker

Re: [PATCH v3 07/13] scsi_transport_srp: Add transport layer error handling

2013-07-08 Thread Bart Van Assche
On 07/08/13 19:26, Vu Pham wrote:
> After running cable pull test on two local IB links for several hrs,
> I/Os got stuck.
> Further commands "multipath -ll" or "fdisk -l" got stuck and never return
> Here are the stack dump for srp-x kernel threads.
> I'll run with #DEBUG to get more debug info o

Re: [PATCH v3 07/13] scsi_transport_srp: Add transport layer error handling

2013-07-08 Thread Vu Pham
Though, now that I've unpacked it -- I don't think it is OK for dev_loss_tmo to be off, but fast IO to be on? That drops another conditional. The combination of dev_loss_tmo off and reconnect_delay > 0 worked fine in my tests. An I/O failure was detected shortly after the cable to the

Re: [PATCH v3 07/13] scsi_transport_srp: Add transport layer error handling

2013-07-04 Thread Bart Van Assche
On 07/04/13 10:01, Bart Van Assche wrote: On 07/03/13 20:57, David Dillow wrote: And I'm getting the strong sense that the answer to my question about fast_io_fail_tmo >= 0 when dev_loss_tmo is that we should not allow that combination, even if it doesn't break the kernel. If it doesn't make sen

Re: [PATCH v3 07/13] scsi_transport_srp: Add transport layer error handling

2013-07-04 Thread Bart Van Assche
On 07/03/13 20:57, David Dillow wrote: And I'm getting the strong sense that the answer to my question about fast_io_fail_tmo >= 0 when dev_loss_tmo is that we should not allow that combination, even if it doesn't break the kernel. If it doesn't make sense, there is no reason to create an opportu

Re: [PATCH v3 07/13] scsi_transport_srp: Add transport layer error handling

2013-07-03 Thread Vu Pham
David Dillow wrote: On Wed, 2013-07-03 at 20:24 +0200, Bart Van Assche wrote: On 07/03/13 19:27, David Dillow wrote: On Wed, 2013-07-03 at 18:00 +0200, Bart Van Assche wrote: The combination of dev_loss_tmo off and reconnect_delay > 0 worked fine in my tests. An I/O failure was

Re: [PATCH v3 07/13] scsi_transport_srp: Add transport layer error handling

2013-07-03 Thread David Dillow
On Wed, 2013-07-03 at 20:24 +0200, Bart Van Assche wrote:
> On 07/03/13 19:27, David Dillow wrote:
> > On Wed, 2013-07-03 at 18:00 +0200, Bart Van Assche wrote:
> >> The combination of dev_loss_tmo off and reconnect_delay > 0 worked fine
> >> in my tests. An I/O failure was detected shortly after t

Re: [PATCH v3 07/13] scsi_transport_srp: Add transport layer error handling

2013-07-03 Thread Bart Van Assche
On 07/03/13 19:27, David Dillow wrote: On Wed, 2013-07-03 at 18:00 +0200, Bart Van Assche wrote: The combination of dev_loss_tmo off and reconnect_delay > 0 worked fine in my tests. An I/O failure was detected shortly after the cable to the target was pulled. I/O resumed shortly after the cable

Re: [PATCH v3 07/13] scsi_transport_srp: Add transport layer error handling

2013-07-03 Thread David Dillow
On Wed, 2013-07-03 at 18:00 +0200, Bart Van Assche wrote:
> On 07/03/13 17:14, David Dillow wrote:
> > On Wed, 2013-07-03 at 14:54 +0200, Bart Van Assche wrote:
> >> +int srp_tmo_valid(int fast_io_fail_tmo, int dev_loss_tmo)
> >> +{
> >> + return (fast_io_fail_tmo < 0 || dev_loss_tmo < 0 ||
> >> +

Re: [PATCH v3 07/13] scsi_transport_srp: Add transport layer error handling

2013-07-03 Thread Bart Van Assche
On 07/03/13 17:14, David Dillow wrote:
On Wed, 2013-07-03 at 14:54 +0200, Bart Van Assche wrote:
+int srp_tmo_valid(int fast_io_fail_tmo, int dev_loss_tmo)
+{
+ return (fast_io_fail_tmo < 0 || dev_loss_tmo < 0 ||
+ fast_io_fail_tmo < dev_loss_tmo) &&
+ fast_io_f

Re: [PATCH v3 07/13] scsi_transport_srp: Add transport layer error handling

2013-07-03 Thread David Dillow
On Wed, 2013-07-03 at 14:54 +0200, Bart Van Assche wrote:
> +int srp_tmo_valid(int fast_io_fail_tmo, int dev_loss_tmo)
> +{
> + return (fast_io_fail_tmo < 0 || dev_loss_tmo < 0 ||
> + fast_io_fail_tmo < dev_loss_tmo) &&
> + fast_io_fail_tmo <= SCSI_DEVICE_BLOCK_MAX_TIMEO

[PATCH v3 07/13] scsi_transport_srp: Add transport layer error handling

2013-07-03 Thread Bart Van Assche
Add the necessary functions in the SRP transport module to allow an SRP initiator driver to implement transport layer error handling similar to the functionality already provided by the FC transport layer. This includes:
- Support for implementing fast_io_fail_tmo, the time that should elapse aft