On a dual controller setup with multipath enabled, some MEDIUM ERRORs
caused both paths to be failed, thus I/O got queued/blocked since the
'queue_if_no_path' feature is enabled by default on IPR controllers.
This example disabled 'queue_if_no_path' so the I/O failure is seen at
the sg_dd program.
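A minimal sketch of the behaviour described above, not code from dm-multipath: it models what happens to an I/O once every path has failed, with and without 'queue_if_no_path'. The names io_outcome and dispatch_io are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical outcomes for one I/O submitted to a multipath device. */
enum io_outcome {
    IO_DISPATCHED, /* at least one usable path: send the I/O down */
    IO_QUEUED,     /* no usable path, queue_if_no_path set: hold the I/O */
    IO_FAILED      /* no usable path, feature off: error reaches the caller */
};

/* Decide the fate of an I/O from the usable path count and the feature flag. */
static enum io_outcome dispatch_io(int usable_paths, bool queue_if_no_path)
{
    assert(usable_paths >= 0);
    if (usable_paths > 0)
        return IO_DISPATCHED;
    return queue_if_no_path ? IO_QUEUED : IO_FAILED;
}
```

With both paths failed, dispatch_io(0, true) models the queued/blocked I/O, while dispatch_io(0, false) models the failure that becomes visible to sg_dd in this example.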
On Mon, 10 Apr 2017, James Bottomley wrote:
> On Tue, 2017-04-11 at 08:52 +0900, Tejun Heo wrote:
> [...]
> > > Any comments? Any clues on how to make the delay "smarter" to
> > > trigger only once during platform shutdown, but still trigger
> > > per-device when doing per-device hotswapping?
>
On Tue, 11 Apr 2017, Tejun Heo wrote:
> > The kernel then continues the shutdown path while the SSD is still
> > preparing itself to be powered off, and it becomes a race. When the
> > kernel + firmware wins, platform power is cut before the SSD has
> > finished (i.e. the SSD is subject to an uncl
On 04/10/2017 10:17 PM, Mauricio Faria de Oliveira wrote:
For documentation purposes, I'll reply to this cover letter with the analysis
of such cases of this problem, and the accompanying messages from kernel logs.
Here it goes, for anyone interested.
Scenario: 4 LUNs, 2 target port groups (PGs
Currently, alua_rtpg() can change the 'state' and 'preferred'
values for the current port group _and_ for other port groups
present in the response buffer/descriptors.
However, it reports such changes _only_ for the current port
group (i.e., only for 'pg' but not for 'tmp_pg').
This might cause un
Factor out the sdev_printk() statement with the RTPG information
in alua_rtpg() into a new function, alua_rtpg_print(), that will
also be used in the following patch.
The only functional difference is that the 'valid_states' value
is now referenced via a pointer, and can be NULL (optional), in
whi
This patch series resolves a problem in which all paths of a multipath device
became _permanently_ failed after a storage system had moved both controllers
into a _temporarily_ unavailable state (that is SCSI_ACCESS_STATE_UNAVAILABLE).
This happened because once scsi_dh_alua had set the 'pg->state
According to SPC-4 (5.15.2.4.5 Unavailable state), the unavailable
state may (or may not) transition to other states (e.g., microcode
downloading or hardware error, which may be temporary or permanent
conditions, respectively).
But, scsi_dh_alua currently fails the I/O requests early once that
sta
Path checkers will periodically check all paths to a target port group
in unavailable state more often (as they are 'failed'), possibly for a
long or indefinite period of time, or for a large number of paths.
That might end up flooding the kernel log with the scsi_dh_alua target
port group state m
On Thu, 2017-04-06 at 15:36 +0200, Hannes Reinecke wrote:
> The block layer always calls the timeout function from a workqueue
> context, so there is no need to have yet another workqueue for
> running command aborts.
>
> [ ... ]
> @@ -271,10 +266,14 @@ enum blk_eh_timer_return scsi_times_out(stru
On Thu, 2017-04-06 at 15:36 +0200, Hannes Reinecke wrote:
> +is invoked to schedule an asynchrous abort.
^^
Sorry that I hadn't noticed this before but if you have to repost this patch
please fix the spelling of this word.
Bart.
On Tue, 2017-04-11 at 08:52 +0900, Tejun Heo wrote:
[...]
> > Any comments? Any clues on how to make the delay "smarter" to
> > trigger only once during platform shutdown, but still trigger
> > per-device when doing per-device hotswapping?
>
> So, if this is actually an issue, sure, we can try
On Thu, 2017-04-06 at 15:36 +0200, Hannes Reinecke wrote:
> When a command has timed out we always should be sending an
> abort; with the previous code a failed abort might signal
> SCSI EH to start, and all other timed out commands will
> never be aborted, even though they might belong to a
> diff
On Thu, 2017-04-06 at 15:36 +0200, Hannes Reinecke wrote:
> If sd_eh_action() decides to take the device offline there is
> no point in returning FAILED, as taking the device offline
> is the ultimate step in SCSI EH anyway.
> So further escalation via SCSI EH is not likely to make a
> difference a
Hello,
On Mon, Apr 10, 2017 at 08:21:19PM -0300, Henrique de Moraes Holschuh wrote:
...
> Per spec (and device manuals), SCSI, SATA and ATA-attached SSDs must be
> informed of an imminent poweroff to checkpoint background tasks, flush
> RAM caches and close logs. For SCSI SSDs, you must issue a
On Mon, 10 Apr 2017, Bart Van Assche wrote:
> On Mon, 2017-04-10 at 20:21 -0300, Henrique de Moraes Holschuh wrote:
> > A proof of concept patch is attached
>
> Thank you for the very detailed write-up. Sorry but no patch was attached
> to the e-mail I received from you ...
Indeed. It should arr
Author: Henrique de Moraes Holschuh
Date: Wed Feb 1 20:42:02 2017 -0200
sd: wait for slow devices on shutdown path
Wait 1s during suspend/shutdown for the device to settle after
we issue the STOP command.
Otherwise we race ATA SSDs to powerdown, possibly causing damage
On Mon, 2017-04-10 at 20:21 -0300, Henrique de Moraes Holschuh wrote:
> A proof of concept patch is attached
Thank you for the very detailed write-up. Sorry but no patch was attached
to the e-mail I received from you ...
Bart.
Summary:
Linux properly issues the SSD prepare-to-poweroff command to SATA SSDs,
but it does not wait for long enough to ensure the SSD has carried it
through.
This causes a race between the platform power-off path, and the SSD
device. When the SSD loses the race, its power is cut while it is st
s/past/paste/
On 05/04/17 15:21, Christoph Hellwig wrote:
Copy and past the REQ_OP_WRITE_SAME code to prepare to implementations
that limit the write zeroes size.
Cheers,
Wol
https://bugzilla.kernel.org/show_bug.cgi?id=195285
himanshu.madh...@cavium.com (himanshu.madh...@qlogic.com) changed:
What|Removed |Added
CC|
The driver is sending a response to the aborted task response
along with LIO sending the tmr response.
ibmvscsis_tgt does not send the response to the client until
release_cmd time. The reason for this was because if we did it
at queue_status time, then the client would be free to reuse the
tag for
On Mon, 10 Apr 2017, 5:12pm -, Sebastian Andrzej Siewior wrote:
> This is a repost to get the patches applied against v4.11-rc6. mkp's scsi
> for-next tree can be merged with no conflicts.
>
> The last repost [0] was not merged and stalled after Martin pinged Chad
> [1]. He didn't even reply
Hello
I have had issues with the target mode working since moving to 4.10+.
I am using a qla25xx card at 8Gbit.
Latest testing with 4.11 RC6 sees the same issue.
Going back to 4.10.4 I can map targets but when I use my jammer I get into
other issues.
It's rock solid on 4.9 with the jammer.
I wil
scsi_target_unblock() uses starget_for_each_device(), which in turn uses
scsi_device_get(). Because scsi_device_get() fails once unloading of the
LLD kernel module has started, scsi_target_unblock() may skip devices
that were affected by scsi_target_block(). Ensure that _
Move the code for submitting a SCSI command from scsi_execute()
into scsi_build_rq(). Introduce scsi_execute_async(). This patch
does not change any functionality.
Signed-off-by: Bart Van Assche
Cc: Israel Rukshin
Cc: Max Gurtovoy
Cc: Hannes Reinecke
---
drivers/scsi/scsi_lib.c| 89 ++
Several weeks ago Israel Rukshin reported that __scsi_remove_device()
hangs if it is waiting for the SYNCHRONIZE CACHE command submitted by
the sd driver to finish if the block layer queue is stopped and does
not get restarted. This patch series prevents that hang.
Bart Van Assche (4):
This patch does not change any functionality.
Signed-off-by: Bart Van Assche
Cc: Israel Rukshin
Cc: Max Gurtovoy
Cc: Hannes Reinecke
---
drivers/scsi/scsi_lib.c | 25 +++--
drivers/scsi/scsi_priv.h | 1 +
2 files changed, 16 insertions(+), 10 deletions(-)
diff --git a/d
This patch prevents sd_shutdown() from hanging on the SYNCHRONIZE CACHE
command if the block layer queue has been stopped by
scsi_target_block().
Signed-off-by: Bart Van Assche
Cc: Israel Rukshin
Cc: Max Gurtovoy
Cc: Hannes Reinecke
---
drivers/scsi/sd.c | 25 +++--
1 file cha
On Fri, 2017-03-17 at 05:54 -0700, James Bottomley wrote:
> but if you want to pursue your approach fixing the
> race with module exit is a requirement.
Hello James,
Sorry that it took so long but I finally found the time to implement and
test an alternative. I will post the patches that implemen
From: Gabriel Krisman Bertazi
[ Upstream commit 36e1f3d107867b25c616c2fd294f5a1c9d4e5d09 ]
While stressing memory and IO at the same time we changed SMT settings,
we were able to consistently trigger deadlocks in the mm system, which
froze the entire machine.
I think that under memory stress co
- All symbols which are only used within one .c file are marked static
and removed from the bnx2fc.h file if possible.
- the declaration of bnx2fc_percpu is moved into the header file
This patch was only compile-tested due to -ENODEV.
Cc: qlogic-storage-upstr...@qlogic.com
Cc: Christoph Hellwig
This is not driven by the hotplug conversation but while I am at it
looks like a good candidate. Converting the thread to a workqueue user
also removes the kthread member from struct fcoe_percpu_s.
This driver uses the struct fcoe_percpu_s but it does not need the
crc_eof_page member, only the wor
The caller of bnx2fc_abts_cleanup() holds the tgt->tgt_lock lock and it
is expected to release the lock during wait_for_completion() and acquire
the lock afterwards.
This patch was only compile-tested due to -ENODEV.
Cc: qlogic-storage-upstr...@qlogic.com
Cc: Christoph Hellwig
Signed-off-by: Seb
The driver creates its own per-CPU threads which are updated based on CPU
hotplug events. It is also possible to use kworkers and remove some of the
infrastructure to get the same job done while saving a few lines of code.
bnx2fc_percpu_io_thread() becomes bnx2fc_percpu_io_work() which is
mostly the
This is a repost to get the patches applied against v4.11-rc6. mkp's scsi
for-next tree can be merged with no conflicts.
The last repost [0] was not merged and stalled after Martin pinged Chad
[1]. He didn't even reply after tglx pinged him approx two weeks later.
Johannes Thumshirn was so kind t
The driver creates its own per-CPU threads which are updated based on CPU
hotplug events. It is also possible to use kworkers and remove some of the
infrastructure to get the same job done while saving a few lines of code.
The DECLARE_PER_CPU() definition is moved into the header file where it
belong
Now that we are using REQ_OP_WRITE_ZEROES for all zeroing needs in the
kernel there is very little use left for REQ_OP_WRITE_SAME. We only
have two callers left, and both just export optional protocol features
to remote systems: DRBD and the target code.
Do we have any major users of those? If n
Linux only used it for zeroing, for which we have better methods now.
Signed-off-by: Christoph Hellwig
---
drivers/block/drbd/drbd_main.c | 28 ++
drivers/block/drbd/drbd_nl.c | 60 --
drivers/block/drbd/drbd_receiver.c | 38 +++--
Signed-off-by: Christoph Hellwig
---
drivers/md/dm-core.h | 1 -
drivers/md/dm-io.c| 21 +
drivers/md/dm-linear.c| 1 -
drivers/md/dm-mpath.c | 1 -
drivers/md/dm-rq.c| 3 ---
drivers/md/dm-stripe.c| 4 +---
drivers
Now that Write Same is gone and discard bios never have a payload we
can simply use bio_has_data as an indicator that the bio has bvecs
that need to be handled.
Signed-off-by: Christoph Hellwig
---
block/bio.c | 8 +---
block/blk-merge.c | 9 +
include/linux/bio.h | 21 ++
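As a rough illustration of the idea in this description, the sketch below models a "has data" predicate in user space. It is loosely modelled on the kernel's bio_has_data() but is not the kernel code: struct model_bio, enum model_op and model_bio_has_data() are hypothetical names, and the operation list is deliberately reduced.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified operation codes (not the kernel's enum). */
enum model_op { MODEL_READ, MODEL_WRITE, MODEL_DISCARD, MODEL_WRITE_ZEROES };

/* Hypothetical stand-in for struct bio: an operation and a byte count. */
struct model_bio {
    enum model_op op;
    unsigned int size; /* bytes remaining, in the spirit of bi_iter.bi_size */
};

/* True only for non-empty operations that carry a payload in bvecs. */
static bool model_bio_has_data(const struct model_bio *bio)
{
    return bio != NULL && bio->size != 0 &&
           bio->op != MODEL_DISCARD && bio->op != MODEL_WRITE_ZEROES;
}
```

In this model a caller walks the segment array only when model_bio_has_data() returns true, so payload-less operations such as discard need no special case of their own.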
Signed-off-by: Christoph Hellwig
---
drivers/md/linear.c| 1 -
drivers/md/md.h| 7 ---
drivers/md/multipath.c | 1 -
drivers/md/raid0.c | 2 --
drivers/md/raid1.c | 4 +---
drivers/md/raid10.c| 1 -
drivers/md/raid5.c | 1 -
7 files changed, 1 insertion(+), 16 dele
Now that we don't have to support the odd Write Same special case
we can simply increment the iter if the bio has data, else just
manipulate bi_size directly.
Signed-off-by: Christoph Hellwig
---
include/linux/bio.h | 13 +++--
1 file changed, 3 insertions(+), 10 deletions(-)
diff --git
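The rule in this description can be sketched in user space as follows. This is an illustrative model under stated assumptions, not the kernel implementation: struct model_iter, model_iter_advance() and the seg_len array are hypothetical stand-ins for struct bvec_iter, the iterator helpers in include/linux/bio.h and the bio's bvec array.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct bvec_iter. */
struct model_iter {
    unsigned long long sector; /* current 512-byte sector */
    unsigned int size;         /* bytes remaining */
    unsigned int idx;          /* current segment index */
    unsigned int done;         /* bytes consumed in the current segment */
};

/*
 * Advance the iterator by 'bytes'. If the bio has data, step through the
 * segments; otherwise there are no segments and only 'size' is adjusted.
 */
static void model_iter_advance(const unsigned int *seg_len, bool has_data,
                               struct model_iter *it, unsigned int bytes)
{
    assert(bytes <= it->size);
    it->sector += bytes >> 9;

    if (!has_data) {
        it->size -= bytes; /* payload-less bio: manipulate the size directly */
        return;
    }

    while (bytes) {
        unsigned int left = seg_len[it->idx] - it->done;
        unsigned int step = bytes < left ? bytes : left;

        bytes -= step;
        it->size -= step;
        it->done += step;
        if (it->done == seg_len[it->idx]) { /* segment exhausted */
            it->idx++;
            it->done = 0;
        }
    }
}
```

Both branches keep the sector and the remaining size in step; only the data-carrying case touches the segment position.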
Use the pscsi driver to support arbitrary command passthrough
instead.
Signed-off-by: Christoph Hellwig
---
drivers/target/target_core_iblock.c | 34 --
1 file changed, 34 deletions(-)
diff --git a/drivers/target/target_core_iblock.c
b/drivers/target/target_core
There are no more end-users of REQ_OP_WRITE_SAME left, so we can start
deleting it.
Signed-off-by: Christoph Hellwig
---
drivers/scsi/sd.c | 70 ---
drivers/scsi/sd_zbc.c | 1 -
2 files changed, 71 deletions(-)
diff --git a/drivers/scsi/sd.c
Signed-off-by: Christoph Hellwig
---
block/bio.c | 3 --
block/blk-core.c| 11 +-
block/blk-lib.c | 90 -
block/blk-merge.c | 32
block/blk-settings.c| 16
block/bl
On 10/04/17 02:29 AM, Sagi Grimberg wrote:
> What you are saying is surprising to me. The switch needs to preserve
> ordering across different switch ports ??
> You are suggesting that there is a *switch-wide* state that tracks
> MemRds never pass MemWrs across all the switch ports? That is a ve
On 04/10/2017 01:12 AM, Christoph Hellwig wrote:
>> +if (msecs == 0)
>> +kblockd_schedule_work_on(blk_mq_hctx_next_cpu(hctx),
>> + &hctx->run_work);
>> +else
>> +kblockd_schedule_delayed_work_on(blk_mq_hctx_next_cpu(hctx),
>> +
On Mon, Apr 10, 2017 at 09:21:57PM +0800, John Garry wrote:
> From: Xiaofei Tan
>
> This patch provides a workaround for a SoC bug where SATA IPTTs for
> different devices may conflict.
>
> The workaround solution requests the following:
> 1. SATA device id must be even and not equal to SAS IPTT.
>
On Mon, Apr 10, 2017 at 09:21:56PM +0800, John Garry wrote:
> From: Xiaofei Tan
>
> After resetting the controller, the process of scanning SATA disks
> attached to an expander may fail occasionally. The issue is that
> the controller can't close the STP link created by target if the
> max link t
From: Xiang Chen
For 1-bit ECC errors, the errors can be recovered by hw. But for
multi-bit ECC and AXI errors, there is something wrong with the whole
module or system, so try to reset the controller to recover from those
errors instead of calling panic().
Signed-off-by: Xiang Chen
Signed-off-by: John
From: Xiaofei Tan
This patch adds a workaround solution for a SoC bug which
may cause SoC logic fatal error when disabling a PHY.
Then we find internal abort IO timeout may occur, and the
controller IO breakpoint may be corrupted.
We work around this SoC bug by optimizing the flow of disabling
a
This patchset introduces some v2 hw bug workarounds. Mostly
they are related to SATA devices, but there is also a
workaround for a scenario when internal abort command may
time out.
The general rule for implementing workarounds was to do it
in the hw layer, as the next hw revision should not
includ
From: Xiaofei Tan
After resetting the controller, the process of scanning SATA disks
attached to an expander may fail occasionally. The issue is that
the controller can't close the STP link created by target if the
max link time is 0.
To work around this issue, we reject STP link after resetting
This patch is a workaround for a SoC bug where an internal abort
command may time out. In v2 hw, the channel should become idle in
order to finish abort process. If the target side has been sending
HOLD, host side channel failed to complete the frame to send, and
can not enter the idle state. Then i
From: Xiaofei Tan
This patch provides a workaround for a SoC bug where SATA IPTTs for
different devices may conflict.
The workaround solution requests the following:
1. SATA device id must be even and not equal to SAS IPTT.
2. SATA device can not share the same IPTT with other SAS or
SATA device.
B
If a TMF times out (maybe due to the unlikely scenario of an expander
being unplugged when TMF for remote device is active), when we
eventually try to free the slot, we crash as we dereference the
slot's task, which has already been released.
As a fix, add checks in the slot release code for a NULL tas
Sagi
As long as legA, legB and the RC are all connected to the same switch then
ordering will be preserved (I think many other topologies also work). Here is
how it would work for the problem case you are concerned about (which is a read
from the NVMe drive).
1. Disk device DMAs out the dat
Looks good,
Reviewed-by: Christoph Hellwig
On Fri, Apr 07, 2017 at 07:59:08PM +, Bart Van Assche wrote:
> On Wed, 2017-04-05 at 07:41 -0400, Martin K. Petersen wrote:
> > +static ssize_t
> > +zeroing_mode_store(struct device *dev, struct device_attribute *attr,
> > + const char *buf, size_t count)
> > +{
> > + struct scsi
Looks good,
Reviewed-by: Christoph Hellwig
> + if (msecs == 0)
> + kblockd_schedule_work_on(blk_mq_hctx_next_cpu(hctx),
> + &hctx->run_work);
> + else
> + kblockd_schedule_delayed_work_on(blk_mq_hctx_next_cpu(hctx),
> + &hctx->d
Looks good,
Reviewed-by: Christoph Hellwig
Looks good,
Reviewed-by: Christoph Hellwig
Looks good,
Reviewed-by: Christoph Hellwig