Re: dm-mq and end_clone_request()

2016-08-05 Thread Bart Van Assche
On 08/05/2016 08:39 AM, Laurence Oberman wrote: > I completely forgot I had set no_path_retry=12, so after 12 retries it will > error out. > This is likely why I had different results seemingly affected by timing. > Mike reminded me of it this morning. > > What do you have set for no_path_retry,

Re: dm-mq and end_clone_request()

2016-08-05 Thread Laurence Oberman
- Original Message - > From: "Laurence Oberman" > To: "Mike Snitzer" > Cc: "Bart Van Assche" , dm-de...@redhat.com, > linux-scsi@vger.kernel.org > Sent: Friday, August 5, 2016 7:43:30 AM > Subject: Re: dm-mq and end_clone_request()

Re: dm-mq and end_clone_request()

2016-08-05 Thread Laurence Oberman
- Original Message - > From: "Laurence Oberman" > To: "Mike Snitzer" > Cc: "Bart Van Assche" , dm-de...@redhat.com, > linux-scsi@vger.kernel.org > Sent: Thursday, August 4, 2016 9:07:28 PM > Subject: Re: dm-mq and end_clone_request()

Re: dm-mq and end_clone_request()

2016-08-04 Thread Laurence Oberman
- Original Message - > From: "Mike Snitzer" > To: "Bart Van Assche" > Cc: dm-de...@redhat.com, "Laurence Oberman" , > linux-scsi@vger.kernel.org > Sent: Thursday, August 4, 2016 7:58:50 PM > Subject: Re: dm-mq and end_clone_request()

Re: dm-mq and end_clone_request()

2016-08-04 Thread Mike Snitzer
I've staged another fix, Laurence is seeing success with this added: https://git.kernel.org/cgit/linux/kernel/git/device-mapper/linux-dm.git/commit/?h=dm-4.8&id=d50a6450104c237db1dc75314d17b78c990a8c05 I'll be sending all the fixes I've queued to Linus tonight or early tomorrow (since I'll then be

Re: dm-mq and end_clone_request()

2016-08-04 Thread Bart Van Assche
On 08/04/2016 09:10 AM, Mike Snitzer wrote: Anyway, at this point you're having us test too many changes that aren't yet upstream: $ git diff bart/srp-initiator-for-next dm/dm-4.7-mpath-fixes -- drivers block include kernel | diffstat block/bio-integrity.c |1 block/blk-c

Re: dm-mq and end_clone_request()

2016-08-04 Thread Mike Snitzer
On Wed, Aug 03 2016 at 12:55pm -0400, Bart Van Assche wrote: > On 08/02/2016 05:40 PM, Mike Snitzer wrote: > >But I asked you to run the v4.7 kernel patches I > >pointed to _without_ any of your debug patches. > > I need several patches to fix bugs that are not related to the > device mapper, e.

Re: dm-mq and end_clone_request()

2016-08-03 Thread Laurence Oberman
- Original Message - > From: "Laurence Oberman" > To: "Mike Snitzer" > Cc: "Bart Van Assche" , dm-de...@redhat.com, > linux-scsi@vger.kernel.org > Sent: Tuesday, August 2, 2016 10:55:59 PM > Subject: Re: dm-mq and end_clone_request()

Re: dm-mq and end_clone_request()

2016-08-02 Thread Laurence Oberman
- Original Message - > From: "Laurence Oberman" > To: "Mike Snitzer" > Cc: "Bart Van Assche" , dm-de...@redhat.com, > linux-scsi@vger.kernel.org > Sent: Tuesday, August 2, 2016 10:18:30 PM > Subject: Re: dm-mq and end_clone_request()

Re: dm-mq and end_clone_request()

2016-08-02 Thread Laurence Oberman
- Original Message - > From: "Mike Snitzer" > To: "Laurence Oberman" > Cc: "Bart Van Assche" , dm-de...@redhat.com, > linux-scsi@vger.kernel.org > Sent: Tuesday, August 2, 2016 10:10:12 PM > Subject: Re: dm-mq and end_clone_req

Re: dm-mq and end_clone_request()

2016-08-02 Thread Mike Snitzer
On Tue, Aug 02 2016 at 9:33pm -0400, Laurence Oberman wrote: > Hi Bart > > I simplified the test to 2 simple scripts and only running against one XFS > file system. > Can you validate these and tell me if it's enough to emulate what you are > doing. > Perhaps our test-suite is too simple. > >

Re: dm-mq and end_clone_request()

2016-08-02 Thread Laurence Oberman
- Original Message - > From: "Mike Snitzer" > To: "Bart Van Assche" > Cc: dm-de...@redhat.com, "Laurence Oberman" , > linux-scsi@vger.kernel.org > Sent: Tuesday, August 2, 2016 8:40:14 PM > Subject: Re: dm-mq and end_clone_request()

Re: dm-mq and end_clone_request()

2016-08-02 Thread Mike Snitzer
On Tue, Aug 02 2016 at 8:19pm -0400, Bart Van Assche wrote: > On 08/02/2016 10:45 AM, Mike Snitzer wrote: > > Please do these same tests against a v4.7 kernel with the 4 patches from > > this branch applied (no need for your other debug patches): > > https://git.kernel.org/cgit/linux/kernel/git/

Re: dm-mq and end_clone_request()

2016-08-02 Thread Bart Van Assche
On 08/02/2016 10:45 AM, Mike Snitzer wrote: > Please do these same tests against a v4.7 kernel with the 4 patches from > this branch applied (no need for your other debug patches): > https://git.kernel.org/cgit/linux/kernel/git/device-mapper/linux-dm.git/log/?h=dm-4.7-mpath-fixes > > I've had good

Re: dm-mq and end_clone_request()

2016-08-02 Thread Mike Snitzer
On Mon, Aug 01 2016 at 6:41pm -0400, Bart Van Assche wrote: > On 08/01/2016 01:46 PM, Mike Snitzer wrote: > > Please retry both variants (CONFIG_DM_MQ_DEFAULT=y first) with this patch > > applied. Interested to see if things look better for you (WARN_ON_ONCEs > > added just to see if we hit the

Re: dm-mq and end_clone_request()

2016-08-01 Thread Mike Snitzer
On Mon, Aug 01 2016 at 2:55P -0400, Bart Van Assche wrote: > On 08/01/2016 10:59 AM, Mike Snitzer wrote: > >This says to me that must_push_back is returning false because > >dm_noflush_suspending() is false. When this happens -EIO will escape up > >the IO stack. > > > >And this confirms that mu

Re: dm-mq and end_clone_request()

2016-08-01 Thread Mike Snitzer
On Mon, Aug 01 2016 at 2:55pm -0400, Bart Van Assche wrote: > On 08/01/2016 10:59 AM, Mike Snitzer wrote: > >This says to me that must_push_back is returning false because > >dm_noflush_suspending() is false. When this happens -EIO will escape up > >the IO stack. > > > >And this confirms that m

Re: dm-mq and end_clone_request()

2016-08-01 Thread Bart Van Assche
On 08/01/2016 10:59 AM, Mike Snitzer wrote: This says to me that must_push_back is returning false because dm_noflush_suspending() is false. When this happens -EIO will escape up the IO stack. And this confirms that must_push_back() calling dm_noflush_suspending() is quite suspect given queue_i

Re: dm-mq and end_clone_request()

2016-08-01 Thread Mike Snitzer
With this debug patch on top of v4.7: diff --git a/drivers/md/dm-mpath.c b/drivers/md/dm-mpath.c index 52baf8a..22baf29 100644 --- a/drivers/md/dm-mpath.c +++ b/drivers/md/dm-mpath.c @@ -433,10 +433,22 @@ failed: */ static int must_push_back(struct multipath *m) { + bool queue_if_no_path

Re: dm-mq and end_clone_request()

2016-07-28 Thread Mike Snitzer
On Thu, Jul 28 2016 at 11:23am -0400, Bart Van Assche wrote: > On 07/28/2016 06:33 AM, Mike Snitzer wrote: > >On Wed, Jul 27 2016 at 7:05pm -0400, > >Bart Van Assche wrote: > >>Thanks again for having made this patch available. I will test it as > >>soon as I have the time. BTW, in the meantime

Re: dm-mq and end_clone_request()

2016-07-28 Thread Bart Van Assche
On 07/28/2016 06:33 AM, Mike Snitzer wrote: On Wed, Jul 27 2016 at 7:05pm -0400, Bart Van Assche wrote: Thanks again for having made this patch available. I will test it as soon as I have the time. BTW, in the meantime I ran a few tests with DM_MQ_DEFAULT=n since until now I ran all tests with

Re: dm-mq and end_clone_request()

2016-07-28 Thread Mike Snitzer
On Wed, Jul 27 2016 at 7:05pm -0400, Bart Van Assche wrote: > On 07/27/2016 01:09 PM, Mike Snitzer wrote: > > In addition to the above patch, please apply this patch and retest your > >4.7 kernel: > > > >diff --git a/drivers/md/dm-mpath.c b/drivers/md/dm-mpath.c > >index 287caa7..16583c1 100644

Re: dm-mq and end_clone_request()

2016-07-27 Thread Mike Snitzer
On Mon, Jul 25 2016 at 9:16pm -0400, Mike Snitzer wrote: > Hi Bart, > > Please try this patch to see if it fixes your issue, thanks. > > diff --git a/drivers/md/dm-mpath.c b/drivers/md/dm-mpath.c > index 52baf8a..287caa7 100644 > --- a/drivers/md/dm-mpath.c > +++ b/drivers/md/dm-mpath.c > @@

Re: dm-mq and end_clone_request()

2016-07-27 Thread Mike Snitzer
On Wed, Jul 27 2016 at 3:06pm -0400, Bart Van Assche wrote: > On 07/27/2016 08:52 AM, Benjamin Marzinski wrote: > >if you look in drivers/md/dm-ioctl.c at do_resume(), device mapper > >internally does a suspend when you call resume with a new table loaded. > >That's when these suspends are happe

Re: dm-mq and end_clone_request()

2016-07-27 Thread Mike Snitzer
On Tue, Jul 26 2016 at 6:51pm -0400, Bart Van Assche wrote: > On 07/25/2016 06:15 PM, Mike Snitzer wrote: > > Please try this patch to see if it fixes your issue, thanks. > > > > diff --git a/drivers/md/dm-mpath.c b/drivers/md/dm-mpath.c > > index 52baf8a..287caa7 100644 > > --- a/drivers/md/dm

Re: dm-mq and end_clone_request()

2016-07-26 Thread Bart Van Assche
On 07/25/2016 06:15 PM, Mike Snitzer wrote: > Please try this patch to see if it fixes your issue, thanks. > > diff --git a/drivers/md/dm-mpath.c b/drivers/md/dm-mpath.c > index 52baf8a..287caa7 100644 > --- a/drivers/md/dm-mpath.c > +++ b/drivers/md/dm-mpath.c > @@ -433,10 +433,17 @@ failed: >

Re: dm-mq and end_clone_request()

2016-07-25 Thread Mike Snitzer
On Mon, Jul 25 2016 at 6:00P -0400, Bart Van Assche wrote: > On 07/25/2016 02:23 PM, Mike Snitzer wrote: > >So I'd be curious to know if your debugging has enabled you to identify > >exactly where in the dm-mpath.c code the -EIO return is being > >established. do_end_io() is the likely candidat

Re: dm-mq and end_clone_request()

2016-07-25 Thread Bart Van Assche
On 07/25/2016 02:23 PM, Mike Snitzer wrote: So I'd be curious to know if your debugging has enabled you to identify exactly where in the dm-mpath.c code the -EIO return is being established. do_end_io() is the likely candidate -- but again the __must_push_back() check should prevent it and DM_EN

Re: dm-mq and end_clone_request()

2016-07-25 Thread Mike Snitzer
On Mon, Jul 25 2016 at 1:53pm -0400, Mike Snitzer wrote: > On Thu, Jul 21 2016 at 4:58pm -0400, > Bart Van Assche wrote: > > > On 07/20/2016 11:33 AM, Mike Snitzer wrote: > > >Would be interesting to know the error returned from map_request()'s > > >ti->type->clone_and_map_rq(). Really shoul

Re: dm-mq and end_clone_request()

2016-07-25 Thread Mike Snitzer
On Thu, Jul 21 2016 at 4:58pm -0400, Bart Van Assche wrote: > On 07/20/2016 11:33 AM, Mike Snitzer wrote: > >Would be interesting to know the error returned from map_request()'s > >ti->type->clone_and_map_rq(). Really should just be DM_MAPIO_REQUEUE. > >But the stack you've provided shows map_r

Re: dm-mq and end_clone_request()

2016-07-21 Thread Mike Snitzer
On Wed, Jul 20 2016 at 1:37pm -0400, Bart Van Assche wrote: > On 07/20/2016 07:27 AM, Mike Snitzer wrote: > >On Wed, Jul 20 2016 at 10:08am -0400, > >Mike Snitzer wrote: > >[ ... ] > >diff --git a/drivers/md/dm-rq.c b/drivers/md/dm-rq.c > >index 7a96618..347ff25 100644 > >--- a/drivers/md/dm-rq

Re: dm-mq and end_clone_request()

2016-07-20 Thread Mike Snitzer
On Wed, Jul 20 2016 at 1:37pm -0400, Bart Van Assche wrote: > On 07/20/2016 07:27 AM, Mike Snitzer wrote: > >On Wed, Jul 20 2016 at 10:08am -0400, > >Mike Snitzer wrote: > >[ ... ] > >diff --git a/drivers/md/dm-rq.c b/drivers/md/dm-rq.c > >index 7a96618..347ff25 100644 > >--- a/drivers/md/dm-rq

Re: dm-mq and end_clone_request()

2016-07-20 Thread Mike Snitzer
On Wed, Jul 20 2016 at 10:08am -0400, Mike Snitzer wrote: > On Tue, Jul 19 2016 at 6:57pm -0400, > Bart Van Assche wrote: > > > Hello Mike, > > > > If I run a fio data integrity test against kernel v4.7-rc7 then I > > see often that fio reports I/O errors if a path is removed despite > > queu

Re: dm-mq and end_clone_request()

2016-07-20 Thread Mike Snitzer
On Tue, Jul 19 2016 at 6:57pm -0400, Bart Van Assche wrote: > Hello Mike, > > If I run a fio data integrity test against kernel v4.7-rc7 then I > see often that fio reports I/O errors if a path is removed despite > queue_if_no_path having been set in /etc/multipath.conf. Further > analysis show