Acked-by: Sudarsana Kalluru
-----Original Message-----
From: Arnd Bergmann [mailto:a...@arndb.de]
Sent: 04 December 2017 20:17
To: Gurumurthy, Anil ; Kalluru, Sudarsana
; James E.J. Bottomley ;
Martin K. Petersen
Cc: Arnd Bergmann ; Hannes Reinecke ; Kees Cook
; Benjamin Poirier ; Mody, Rase
Hello Bart,
I am returning BLK_EH_HANDLED in iscsi_eh_cmd_timed_out(). Do you mean
something different ?
That paragraph means that I have tried to return BLK_EH_NOT_HANDLED first,
because that would be the other option instead of BLK_EH_RESET_TIMER (which is
causing this issue), but if I did
When unregistering a remote port the lpfc driver would eventually
wait for the remoteport_unreg done callback. But the driver never
completed the io aborts that would allow the connections to terminate
thus the unreg done callback was never issued. Turns out the coding
style of the driver allowed
Increased the sizes of the SCSI WQ's and CQ's so that SCSI operation is
similar to that used by NVME. However, size increase restricted only to
those newer adapters that can support the larger WQE size, thus bigger
queue sizes.
Signed-off-by: Dick Kennedy
Signed-off-by: James Smart
---
drivers/
In the lpfc discovery engine, when acting as an nvme target, the
driver was performing mailbox io with the adapter for port login
when an NVME PRLI was received from the host. Rather than queue and
eventually get back to sending a response after the mailbox traffic,
the driver rejected the io with an e
Update the driver version to 11.4.0.6
Signed-off-by: Dick Kennedy
Signed-off-by: James Smart
---
drivers/scsi/lpfc/lpfc_version.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/scsi/lpfc/lpfc_version.h b/drivers/scsi/lpfc/lpfc_version.h
index cc2f5cec98c5..c232bf0e8
When enabled for both SCSI and NVME support, and connected pt2pt to a
SCSI only target, the driver nodelist entry for the remote port is
left in PRLI_ISSUE state and no SCSI LUNs are discovered. Works fine
if only configured for SCSI support.
Error was due to some of the prli points still reflecti
If log verbose is not turned on, it's hard to tell when
certain error paths get hit. Add stats counters and
corresponding logic to debugfs/sysfs to aid understanding
what paths were traversed.
Signed-off-by: Dick Kennedy
Signed-off-by: James Smart
---
drivers/scsi/lpfc/lpfc_attr.c| 46 ++
Handling a rcv'ed PRLI incorrectly can cause the ndlp to end up
in the wrong state or the driver to ACC and PRLI when it should
send LS_RJT.
The cause was due to the driver not properly looking at the PRLI
type and taking the multiple protocol support into consideration.
Resolved by adding checks
The driver is all set to handle the defer_rcv api for the
nvmet_fc transport, yet didn't properly recognize the return
status when the defer_rcv occurred. The driver treated it simply
as an error and aborted the io. Several residual issues occurred
at that point.
Finish the defer_rcv support: reco
NVME targets appear to randomly disconnect from the initiator
when running heavy IO.
The error is due to the host aggregate (across all controllers)
io load being beyond the maximum exchange count for nvme on the
adapter. The driver was properly returning a resource busy status,
but the io load was
This patch set provides a number of bug fixes and 1 addition to
the driver.
The patches were cut against Martin's 4.16/scsi-queue tree.
There are no outside dependencies and are expected to be pulled
via Martin's tree.
James Smart (9):
lpfc: Fix random heartbeat timeouts during heavy IO
lp
From: Colin Ian King
The check for secs being less than zero is redundant for two reasons.
Firstly, secs is unsigned so the check is always going to be false.
Secondly, if secs was signed the proceeding calculation of secs is
never going to be negative. Hence we can remove this redundant check
a
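The first reason given above can be shown with a minimal sketch. The helper
names below are hypothetical and are not taken from the patch; only the
behaviour of comparing an unsigned value against zero is illustrated.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: has the same shape as the redundant check. */
bool secs_below_zero(unsigned int secs)
{
	/* An unsigned value is never less than zero, so this is always false. */
	return secs < 0;
}

/* Hypothetical demo: the check is false for every input we try. */
void check_examples(void)
{
	assert(!secs_below_zero(0u));
	assert(!secs_below_zero(1u));
	/* Converting -1 to unsigned yields UINT_MAX, not a negative value. */
	assert(!secs_below_zero((unsigned int)-1));
	assert((unsigned int)-1 == UINT_MAX);
}
```

Many compilers can flag such a comparison when extra warnings are enabled,
which is one way checks like this get noticed.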
On Thu, 2017-12-07 at 19:59 -0200, Rafael David Tinoco wrote:
> This happens because iscsi_eh_cmd_timed_out(), the transport layer
> timeout helper, would tell the queue timeout function (scsi_times_out)
> to reset the request timer over and over, until the session state is
> back to logged in stat
The scsi_sd_probe_domain allows waiting until all disk-probing
activity has finished system-wide. This slows down SCSI host removal
that occurs concurrently with SCSI disk probing because sd_remove()
waits on scsi_sd_probe_domain. Additionally, since each function that
waits on scsi_sd_probe_domain
From: Xiang Chen
It is required to initialize the dq spinlock before use, which
was not being done, so fix it. This issue can be detected when
CONFIG_DEBUG_SPINLOCK is enabled.
Signed-off-by: Xiang Chen
Signed-off-by: John Garry
---
drivers/scsi/hisi_sas/hisi_sas_main.c | 1 +
1 file changed,
From: Xiang Chen
For v3 hw, each NCQ will return a CQ, so there is no need to
acquire IPTT from ITCT, just acquire it from IPTT field of
CQ.
Signed-off-by: Xiang Chen
Signed-off-by: John Garry
---
drivers/scsi/hisi_sas/hisi_sas_v3_hw.c | 40 +-
1 file changed, 6 i
This patchset contains support for some new
features, and also some modifications and other
fixes.
Headline changes include:
- v3 hw Suspend and Resume support
- v3 hw RAS (PCI AER) support
- v2 hw HW port error handling support
- other misc fixes and tidy-up
Xiang Chen (8):
scsi: hisi_sas: ini
From: Xiang Chen
Do a couple of changes for when HISI_SAS_RESET_BIT is
set for HBA:
- Clearing ITCT is not necessary
- Remove internal abort as it will fail during reset
Flag sas_dev->dev_type is kept as SAS_PHY_UNUSED.
Signed-off-by: Xiang Chen
Signed-off-by: John Garry
---
drivers/scsi/his
From: Xiaofei Tan
Sometimes it is required to know when the controller reset
has completed and also if it has completed successfully.
For such places, we previously called hisi_sas_controller_reset() directly.
That may lead to multiple calls to this function.
This patch creates a per-reset structure wh
From: Xiaofei Tan
Currently refreshing the PHY port id after reset is
done in the rescan topology function, which is quite
late in the reset process. It could be moved earlier in
the process, as the port id can be refreshed once the
PHYs become ready.
In addition to this, we should set the hisi_
From: Xiang Chen
Add some print at some places such as error info and cq
of exception IO, device found etc, and also adjust some
log levels.
All this to assist debugging ability.
Signed-off-by: Xiang Chen
Signed-off-by: John Garry
---
drivers/scsi/hisi_sas/hisi_sas_main.c | 15 ++---
From: Xiaofei Tan
We should do link reset of PHY when identify timeout or
STP link timeout. They are internal events of SOC and are
notified to driver through interrupts of CHL_INT2.
Besides, we should add a delayed work to do link reset as
it needs sleep. So, this patch adds a new PHY event
HISI
From: Xiaofei Tan
We use PCIe AER to support RAS feature for v3 hw.
This driver should do following two things to support this:
1. Enable RAS interrupts, so that errors can be reported to
RAS module.
2. Realize err_handler for sas_v3_pci_driver. Then if non-fatal
error is detected, print error so
From: Xiaofei Tan
This patch makes the following optimizations to host controller reset:
1. Unblock scsi requests before rescanning topology, as SCSI
commands need to be used if a new device is found during rescanning
topology.
2. Remove drain_workqueue(hisi_hba->wq) and
drain_workqueue(shost->work_q), as th
From: Xiang Chen
In function sas_suspend_devices(), it requires
callback lldd_port_deformed to be
implemented if lldd_port_formed is
implemented.
So add a stub for lldd_port_deformed.
Callback lldd_port_deformed was not required as the
port deformation is done elsewhere in the LLDD.
From: Xiang Chen
This patch fixes the SAS_QUEUE_FULL problem. The test situation is
closing a port while running IO.
In sas_eh_handle_sas_errors(), SCSI EH will free sas_task of
the device if lldd_I_T_nexus_reset() returns
TMF_RESP_FUNC_COMPLETE or -ENODEV.
But in our SAS driver, we only free slots of the
From: Xiaofei Tan
In certain scenarios we may just want to clear the ITCT for
a device, and not free other resources like the SATA bitmap
used in v2 hw.
To facilitate this, this patch relocates the code of clearing
ITCT from free_device() to a new hw interface clear_itct().
Then for some hw, w
From: Xiaofei Tan
We should do internal abort dev before
TMF_ABORT_TASK_SET and TMF_LU_RESET. Because we may
only have done internal abort for single IO in the
earlier part of SCSI EH process. Even for the internal abort
of that single IO, we do not know whether it was
successful.
Besides, we shou
From: Xiaofei Tan
Use a general way to do delayed work for a PHY. Then it will
be easier to add new delayed work for a PHY in future.
Signed-off-by: Xiaofei Tan
Signed-off-by: John Garry
---
drivers/scsi/hisi_sas/hisi_sas.h | 9 -
drivers/scsi/hisi_sas/hisi_sas_main.c | 22
From: Xiaofei Tan
Change code format of int_chnl_int_v2_hw() to be consistent with
v3 hw to reduce a tab indent.
Signed-off-by: Xiaofei Tan
Signed-off-by: John Garry
---
drivers/scsi/hisi_sas/hisi_sas_v2_hw.c | 58 --
1 file changed, 28 insertions(+), 30 delet
From: Xiang Chen
The v3 hw SAS supports changing the power state
from D0 to D3 to enter Low Power status and
from D3 to D0 to quit Low Power status.
When the power state goes from D0 to D3, HW will send FLR to
clear the registers of ECAM and BAR space, and when
power state from D3 to
From: Xiaofei Tan
Normally, hardware should ensure that internal abort
timeout will never happen. If it happens, it would be an SoC
failure. What's more, HW will not process any other
commands if an internal abort hasn't returned a CQ, and they
will time out also.
So, we should judge the result of inter
From: Xiaofei Tan
Add port AXI errors handling for v2 hw. We do host controller
reset for such errors.
Besides, change port multi-bit ECC error handling, and we
should also do host reset for such error. So, this patch puts
them in the same struct with port AXI error.
Signed-off-by: Xiaofei Tan
From: Xiang Chen
For function dma_unmap_sg(), the parameter
should be number of elements in the scatterlist
prior to the mapping, not after the mapping.
Fix this usage.
Signed-off-by: Xiang Chen
Signed-off-by: John Garry
---
drivers/scsi/hisi_sas/hisi_sas_main.c | 6 --
1 file changed,
Looks good,
Reviewed-by: Johannes Thumshirn
--
Johannes Thumshirn Storage
jthumsh...@suse.de    +49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG N
Lee, Chris,
Some test results.
- Single unmounted disk, with transport connection wiped before final logout:
http://pastebin.ubuntu.com/26139576/
- Multiple mounted disks, multipath dev-mapper, all transport connections were
wiped before the final logout, with heavy write workload:
http://pas
On 08/12/2017 09:42, Jason Yan wrote:
Add a sysfs attr that the LLDD can configure for every host. We made
an example in hisi_sas. Other LLDDs using libsas can implement it if
they want.
Suggested-by: Hannes Reinecke
Signed-off-by: Jason Yan
CC: John Garry
CC: Johannes Thumshirn
CC: Ewan Milne
Hi Martin,
On Fri, Dec 08, 2017 at 04:44:55PM +0800, Ming Lei wrote:
> Hi Martin,
>
> On Thu, Dec 07, 2017 at 09:46:21PM -0500, Martin K. Petersen wrote:
> >
> > Ming,
> >
> > > As I explained in [1], the use-after-free is inevitable no matter if
> > > clearing 'SCpnt->cmnd' before mempool_free
From: Hannes Reinecke
For hardware only supporting active/optimized there's no point in
ever re-issuing RTPG as the only new state we can possibly read is
active/optimized.
This avoids spurious errors during path failover on such arrays.
Signed-off-by: Hannes Reinecke
---
drivers/scsi/device_ha
Now we are processing sas event and discover event in different workqueues.
It's safe to wait for the discover event to finish in the sas event work. Use
flush_workqueue() to ensure the disco and revalidate events are processed
synchronously so that the whole discover and revalidate process will not
be interrup
There are two places queuing the disco event DISCE_REVALIDATE_DOMAIN.
One is in sas_porte_broadcast_rcvd() and uses sas_chain_event() to queue
the event. The other is in sas_enable_revalidation() and uses
sas_queue_event() to queue the event. We have different work queues for
event and discovery now
Commit 87c8331fcf72 ("[SCSI] libsas: prevent domain rediscovery
competing with ata error handling") introduced the disco mutex to prevent
rediscovery competing with ata error handling and put the whole
revalidation in the mutex. But the rphy add/remove needs to wait for the
error handling which also
Now all libsas works are queued to the scsi host workqueue,
including sas event work posted by the LLDD and sas discovery
work, and a sas hotplug flow may be divided into several
works, e.g. when libsas receives a PORTE_BYTES_DMAED event,
currently we process it in the following steps:
sas_form_port --- run in work in s
If the PHY bursts too many events, we will alloc a lot of events for the
worker. This may lead to memory exhaustion.
Dan Williams suggested to shut down the PHY if the events reached the
threshold, because in this case the PHY may have gone into some
erroneous state. Users can re-enable the PHY by
Now the libsas hotplug has some issues. Dan Williams reported
a similar bug here before:
https://www.mail-archive.com/linux-scsi@vger.kernel.org/msg39187.html
The issues we have found
1. if the LLDD burst reports lots of phy-up/phy-down sas events, some events
may be lost because the same sas event is pen
On 07/12/2017 10:57, Jason Yan wrote:
The bsg_job_done() requires the length of payload received, but we give
it the untransferred residual.
As I understand, this patch fixes (SES) enclosure management for
libsas, so it's quite an important patch.
Thanks,
John
Fixes: 651a01364994 ("scsi
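The distinction drawn in the quoted description, between the length received
and the untransferred residual, can be sketched as simple arithmetic. This is
only an illustration: received_len() is a hypothetical helper, not a libsas or
bsg function, and it assumes the received length is the buffer length minus
the residual.

```c
#include <assert.h>

/*
 * Hypothetical helper, not part of libsas or bsg: derive the number of
 * bytes actually received from the buffer length and the residual
 * (the bytes that were not transferred).
 */
unsigned int received_len(unsigned int payload_len, unsigned int residual)
{
	/* Defensive assumption: a residual larger than the buffer means nothing usable. */
	if (residual > payload_len)
		return 0;
	return payload_len - residual;
}
```

With a 512-byte buffer and a residual of 128, the received length is 384;
passing the residual itself would report 128 instead.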
Now libsas hotplug work is static: every sas event type has its own
static work, and the LLDD driver queues the hotplug work into shost->work_q.
If the LLDD driver burst posts lots of hotplug events to libsas, the hotplug
events may be pending in the workqueue like
shost->work_q
new work[PORTE_BYTES_DMAED] --> |[PH
Add a sysfs attr that the LLDD can configure for every host. We made
an example in hisi_sas. Other LLDDs using libsas can implement it if
they want.
Suggested-by: Hannes Reinecke
Signed-off-by: Jason Yan
CC: John Garry
CC: Johannes Thumshirn
CC: Ewan Milne
CC: Christoph Hellwig
CC: Tomas Henzl
Hi Martin,
On Thu, Dec 07, 2017 at 09:46:21PM -0500, Martin K. Petersen wrote:
>
> Ming,
>
> > As I explained in [1], the use-after-free is inevitable no matter if
> > clearing 'SCpnt->cmnd' before mempool_free() in sd_uninit_command() or
> > not, so we need to comment the fact that cdb may poin