RE: [PATCH] scsi: bfa: convert to strlcpy/strlcat

2017-12-08 Thread Kalluru, Sudarsana
Acked-by: Sudarsana Kalluru -Original Message- From: Arnd Bergmann [mailto:a...@arndb.de] Sent: 04 December 2017 20:17 To: Gurumurthy, Anil ; Kalluru, Sudarsana ; James E.J. Bottomley ; Martin K. Petersen Cc: Arnd Bergmann ; Hannes Reinecke ; Kees Cook ; Benjamin Poirier ; Mody, Rase

Re: [PATCH] scsi: libiscsi: Allow sd_shutdown on bad transport

2017-12-08 Thread Rafael David Tinoco
Hello Bart, I am returning BLK_EH_HANDLED in iscsi_eh_cmd_timed_out(). Do you mean something different? That paragraph means that I tried returning BLK_EH_NOT_HANDLED first, because that would be the other option instead of BLK_EH_RESET_TIMER (which is causing this issue), but if I did

[PATCH 7/9] lpfc: Fix infinite wait when driver unregisters a remote NVME port.

2017-12-08 Thread James Smart
When unregistering a remote port the lpfc driver would eventually wait for the remoteport_unreg done callback. But the driver never completed the io aborts that would allow the connections to terminate thus the unreg done callback was never issued. Turns out the coding style of the driver allowed

[PATCH 4/9] lpfc: Increase SCSI CQ and WQ sizes.

2017-12-08 Thread James Smart
Increased the sizes of the SCSI WQs and CQs so that SCSI operation is similar to that used by NVME. However, the size increase is restricted to those newer adapters that can support the larger WQE size, and thus bigger queue sizes. Signed-off-by: Dick Kennedy Signed-off-by: James Smart --- drivers/

[PATCH 6/9] lpfc: Fix issues connecting with nvme initiator

2017-12-08 Thread James Smart
In the lpfc discovery engine, when acting as an NVME target, the driver may be performing mailbox io with the adapter for port login when an NVME PRLI is received from the host. Rather than queue and eventually get back to sending a response after the mailbox traffic, the driver rejected the io with an e

[PATCH 9/9] lpfc: update driver version to 11.4.0.6

2017-12-08 Thread James Smart
Update the driver version to 11.4.0.6 Signed-off-by: Dick Kennedy Signed-off-by: James Smart --- drivers/scsi/lpfc/lpfc_version.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/scsi/lpfc/lpfc_version.h b/drivers/scsi/lpfc/lpfc_version.h index cc2f5cec98c5..c232bf0e8

[PATCH 5/9] lpfc: Fix SCSI LUN discovery when SCSI and NVME enabled

2017-12-08 Thread James Smart
When enabled for both SCSI and NVME support, and connected pt2pt to a SCSI only target, the driver nodelist entry for the remote port is left in PRLI_ISSUE state and no SCSI LUNs are discovered. Works fine if only configured for SCSI support. Error was due to some of the prli points still reflecti

[PATCH 8/9] lpfc: Beef up stat counters for debug

2017-12-08 Thread James Smart
If log verbose is not turned on, it's hard to tell when certain error paths get hit. Add stats counters and corresponding logic to debugfs/sysfs to aid understanding what paths were traversed. Signed-off-by: Dick Kennedy Signed-off-by: James Smart --- drivers/scsi/lpfc/lpfc_attr.c | 46 ++

[PATCH 3/9] lpfc: Fix receive PRLI handling

2017-12-08 Thread James Smart
Handling a rcv'ed PRLI incorrectly can cause the ndlp to end up in the wrong state or the driver to ACC and PRLI when it should send LS_RJT. The cause was the driver not properly looking at the PRLI type and taking the multiple protocol support into consideration. Resolved by adding checks

[PATCH 2/9] lpfc: Fix -EOVERFLOW behavior for NVMET and defer_rcv

2017-12-08 Thread James Smart
The driver is all set to handle the defer_rcv api for the nvmet_fc transport, yet didn't properly recognize the return status when the defer_rcv occurred. The driver treated it simply as an error and aborted the io. Several residual issues occurred at that point. Finish the defer_rcv support: reco

[PATCH 1/9] lpfc: Fix random heartbeat timeouts during heavy IO

2017-12-08 Thread James Smart
NVME targets appear to randomly disconnect from the initiator when running heavy IO. The error occurs because the host's aggregate (across all controllers) io load was beyond the maximum exchange count for nvme on the adapter. The driver was properly returning a resource busy status, but the io load was

[PATCH 0/9] lpfc updates for 11.4.0.6

2017-12-08 Thread James Smart
This patch set provides a number of bug fixes and 1 addition to the driver. The patches were cut against Martin's 4.16/scsi-queue tree. There are no outside dependencies and are expected to be pulled via Martin's tree. James Smart (9): lpfc: Fix random heartbeat timeouts during heavy IO lp

[PATCH][next] scsi: arcmsr: remove redundant check for secs < 0

2017-12-08 Thread Colin King
From: Colin Ian King The check for secs being less than zero is redundant for two reasons. Firstly, secs is unsigned so the check is always going to be false. Secondly, even if secs were signed, the preceding calculation of secs is never going to be negative. Hence we can remove this redundant check a
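The first reason given above (an unsigned value is never less than zero) can be illustrated with a minimal userspace C sketch. Both function names are hypothetical and are not taken from the arcmsr driver; the sketch only shows the language rule, not the driver's actual code.

```c
#include <stdbool.h>

/*
 * Hypothetical helper, not arcmsr code: mimics a "secs < 0" check on an
 * unsigned value. An unsigned value can never be negative, so this
 * function returns false for every input. Compilers may flag the
 * comparison (for example GCC with -Wtype-limits).
 */
static bool unsigned_secs_is_negative(unsigned long secs)
{
	return secs < 0;
}

/* The same check on a signed value, for comparison: it can be true. */
static bool signed_secs_is_negative(long secs)
{
	return secs < 0;
}
```

Passing `(unsigned long)-1` to the first helper still yields false, because the conversion wraps the value to the largest unsigned long rather than keeping it negative.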

Re: [PATCH] scsi: libiscsi: Allow sd_shutdown on bad transport

2017-12-08 Thread Bart Van Assche
On Thu, 2017-12-07 at 19:59 -0200, Rafael David Tinoco wrote: > This happens because iscsi_eh_cmd_timed_out(), the transport layer > timeout helper, would tell the queue timeout function (scsi_times_out) > to reset the request timer over and over, until the session state is > back to logged in stat

[PATCH] sd: Increase SCSI disk probing concurrency

2017-12-08 Thread Bart Van Assche
The scsi_sd_probe_domain allows waiting until all disk-probing activity has finished system-wide. This slows down SCSI host removal that occurs concurrently with SCSI disk probing because sd_remove() waits on scsi_sd_probe_domain. Additionally, since each function that waits on scsi_sd_probe_domain

[PATCH 01/19] scsi: hisi_sas: initialize dq spinlock before use

2017-12-08 Thread John Garry
From: Xiang Chen It is required to initialize the dq spinlock before use, which was not being done, so fix it. This issue can be detected when CONFIG_DEBUG_SPINLOCK is enabled. Signed-off-by: Xiang Chen Signed-off-by: John Garry --- drivers/scsi/hisi_sas/hisi_sas_main.c | 1 + 1 file changed,

[PATCH 08/19] scsi: hisi_sas: change ncq process for v3 hw

2017-12-08 Thread John Garry
From: Xiang Chen For v3 hw, each NCQ will return a CQ, so there is no need to acquire the IPTT from the ITCT; just acquire it from the IPTT field of the CQ. Signed-off-by: Xiang Chen Signed-off-by: John Garry --- drivers/scsi/hisi_sas/hisi_sas_v3_hw.c | 40 +- 1 file changed, 6 i

[PATCH 00/19] hisi_sas: PM, RAS, and other misc changes

2017-12-08 Thread John Garry
This patchset contains support for some new features, and also some modifications and other fixes. Headline changes include: - v3 hw Suspend and Resume support - v3 hw RAS (PCI AER) support - v2 hw HW port error handling support - other misc fixes and tidy-up Xiang Chen (8): scsi: hisi_sas: ini

[PATCH 06/19] scsi: hisi_sas: modify hisi_sas_dev_gone() for reset

2017-12-08 Thread John Garry
From: Xiang Chen Do a couple of changes for when HISI_SAS_RESET_BIT is set for HBA: - Clearing ITCT is not necessary - Remove internal abort as it will fail during reset Flag sas_dev->dev_type is kept as SAS_PHY_UNUSED. Signed-off-by: Xiang Chen Signed-off-by: John Garry --- drivers/scsi/his

[PATCH 07/19] scsi: hisi_sas: add an mechanism to do reset work synchronously

2017-12-08 Thread John Garry
From: Xiaofei Tan Sometimes it is required to know when the controller reset has completed and also if it has completed successfully. For such places, we previously called hisi_sas_controller_reset() directly. That may lead to multiple calls to this function. This patch creates a per-reset structure wh

[PATCH 04/19] scsi: hisi_sas: optimise port id refresh function

2017-12-08 Thread John Garry
From: Xiaofei Tan Currently refreshing the PHY port id after reset is done in the rescan topology function, which is quite late in the reset process. It could be moved earlier in the process, as the port id can be refreshed once the PHYs become ready. In addition to this, we should set the hisi_

[PATCH 10/19] scsi: hisi_sas: add some print to enhance debugging

2017-12-08 Thread John Garry
From: Xiang Chen Add prints in some places, such as error info and the cq of exception IO, device found, etc., and also adjust some log levels. All of this is to assist debugging. Signed-off-by: Xiang Chen Signed-off-by: John Garry --- drivers/scsi/hisi_sas/hisi_sas_main.c | 15 ++---

[PATCH 14/19] scsi: hisi_sas: do link reset for some CHL_INT2 ints

2017-12-08 Thread John Garry
From: Xiaofei Tan We should do a link reset of the PHY on identify timeout or STP link timeout. These are internal events of the SoC and are reported to the driver through CHL_INT2 interrupts. Besides, we should add a delayed work to do the link reset, as it needs to sleep. So, this patch adds a new PHY event HISI

[PATCH 09/19] scsi: hisi_sas: add RAS feature for v3 hw

2017-12-08 Thread John Garry
From: Xiaofei Tan We use PCIe AER to support the RAS feature for v3 hw. This driver should do the following two things to support this: 1. Enable RAS interrupts, so that errors can be reported to the RAS module. 2. Implement err_handler for sas_v3_pci_driver. Then if a non-fatal error is detected, print error so

[PATCH 05/19] scsi: hisi_sas: some optimizations of host controller reset

2017-12-08 Thread John Garry
From: Xiaofei Tan This patch makes the following optimizations to host controller reset: 1. Unblock scsi requests before rescanning topology, as SCSI commands need to be used if a new device is found during rescanning topology. 2. Remove drain_workqueue(hisi_hba->wq) and drain_workqueue(shost->work_q), as th

[PATCH 18/19] scsi: hisi_sas: re-add the lldd_port_deformed()

2017-12-08 Thread John Garry
From: Xiang Chen Function sas_suspend_devices() requires the lldd_port_deformed callback to be implemented if lldd_port_formed is implemented. So add a stub for lldd_port_deformed. Callback lldd_port_deformed was not otherwise required, as the port deformation is done elsewhere in the LLDD.

[PATCH 17/19] scsi: hisi_sas: fix SAS_QUEUE_FULL problem while running IO

2017-12-08 Thread John Garry
From: Xiang Chen This patch fixes the SAS_QUEUE_FULL problem. The test scenario is closing a port while running IO. In sas_eh_handle_sas_errors(), SCSI EH will free the sas_task of the device if lldd_I_T_nexus_reset() returns TMF_RESP_FUNC_COMPLETE or -ENODEV. But in our SAS driver, we only free slots of the

[PATCH 03/19] scsi: hisi_sas: relocate clearing ITCT and freeing device

2017-12-08 Thread John Garry
From: Xiaofei Tan In certain scenarios we may just want to clear the ITCT for a device, and not free other resources like the SATA bitmap used in v2 hw. To facilitate this, this patch relocates the code for clearing the ITCT from free_device() to a new hw interface clear_itct(). Then for some hw, w

[PATCH 16/19] scsi: hisi_sas: add internal abort dev in some places

2017-12-08 Thread John Garry
From: Xiaofei Tan We should do internal abort dev before TMF_ABORT_TASK_SET and TMF_LU_RESET, because we may only have done an internal abort for a single IO in the earlier part of the SCSI EH process. Even for the internal abort of that single IO, we don't know whether it was successful. Besides, we shou

[PATCH 13/19] scsi: hisi_sas: use an general way to delay PHY work

2017-12-08 Thread John Garry
From: Xiaofei Tan Use a general way to do delayed work for a PHY. Then it will be easier to add new delayed work for a PHY in future. Signed-off-by: Xiaofei Tan Signed-off-by: John Garry --- drivers/scsi/hisi_sas/hisi_sas.h | 9 - drivers/scsi/hisi_sas/hisi_sas_main.c | 22

[PATCH 11/19] scsi: hisi_sas: improve int_chnl_int_v2_hw() consistency with v3 hw

2017-12-08 Thread John Garry
From: Xiaofei Tan Change the code format of int_chnl_int_v2_hw() to be consistent with v3 hw and to reduce one level of indentation. Signed-off-by: Xiaofei Tan Signed-off-by: John Garry --- drivers/scsi/hisi_sas/hisi_sas_v2_hw.c | 58 -- 1 file changed, 28 insertions(+), 30 delet

[PATCH 19/19] scsi: hisi_sas: add v3 hw suspend and resume

2017-12-08 Thread John Garry
From: Xiang Chen v3 hw SAS supports changing the power state from D0 to D3 to enter Low Power status, and from D3 to D0 to leave Low Power status. When the power state goes from D0 to D3, HW will send FLR to clear the registers of ECAM and BAR space, and when the power state goes from D3 to

[PATCH 15/19] scsi: hisi_sas: judge result of internal abort

2017-12-08 Thread John Garry
From: Xiaofei Tan Normally, hardware should ensure that an internal abort timeout will never happen. If it does happen, it would be an SoC failure. What's more, HW will not process any other commands if an internal abort hasn't returned a CQ, and they will time out also. So, we should judge the result of inter

[PATCH 12/19] scsi: hisi_sas: add v2 hw port AXI error handling support

2017-12-08 Thread John Garry
From: Xiaofei Tan Add port AXI error handling for v2 hw. We do a host controller reset for such errors. Besides, change port multi-bit ECC error handling, as we should also do a host reset for such errors. So, this patch puts them in the same struct with the port AXI error. Signed-off-by: Xiaofei Tan

[PATCH 02/19] scsi: hisi_sas: fix dma_unmap_sg() parameter

2017-12-08 Thread John Garry
From: Xiang Chen For function dma_unmap_sg(), the parameter should be number of elements in the scatterlist prior to the mapping, not after the mapping. Fix this usage. Signed-off-by: Xiang Chen Signed-off-by: John Garry --- drivers/scsi/hisi_sas/hisi_sas_main.c | 6 -- 1 file changed,
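The rule described above (unmap with the element count passed to the map call, not the count the map call returned) can be sketched with a toy userspace model. Everything here is hypothetical: `struct toy_sg`, `toy_map_sg()`, `toy_unmap_sg()` and `toy_count_mapped()` are stand-ins invented for illustration and are not the kernel DMA API or hisi_sas code. The model assumes that mapping may merge adjacent entries, so the returned segment count can be smaller than the original element count.

```c
/* Toy scatterlist entry: hypothetical, for illustration only. */
struct toy_sg {
	unsigned long addr;
	unsigned int len;
	int mapped;
};

/*
 * Marks all nents entries as mapped and returns the number of segments
 * left after merging physically contiguous neighbours. The return value
 * may therefore be smaller than nents.
 */
static int toy_map_sg(struct toy_sg *sg, int nents)
{
	int segments = 0;

	for (int i = 0; i < nents; i++) {
		sg[i].mapped = 1;
		if (i == 0 || sg[i - 1].addr + sg[i - 1].len != sg[i].addr)
			segments++;
	}
	return segments;
}

/* Unmaps the first nents entries; the caller must pass the ORIGINAL nents. */
static void toy_unmap_sg(struct toy_sg *sg, int nents)
{
	for (int i = 0; i < nents; i++)
		sg[i].mapped = 0;
}

/* Counts entries that are still mapped, i.e. leaked by a short unmap. */
static int toy_count_mapped(const struct toy_sg *sg, int nents)
{
	int count = 0;

	for (int i = 0; i < nents; i++)
		count += sg[i].mapped;
	return count;
}
```

In this model, mapping three entries of which the first two are contiguous returns 2; unmapping with that returned 2 leaves the third entry mapped, while unmapping with the original 3 releases everything.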

Re: [PATCH] scsi: bfa: convert to strlcpy/strlcat

2017-12-08 Thread Johannes Thumshirn
Looks good, Reviewed-by: Johannes Thumshirn -- Johannes Thumshirn Storage jthumsh...@suse.de +49 911 74053 689 SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: Felix Imendörffer, Jane Smithard, Graham Norton HRB 21284 (AG N

Re: [PATCH] scsi: libiscsi: Allow sd_shutdown on bad transport

2017-12-08 Thread Rafael David Tinoco
Lee, Chris, Some test results. - Single unmounted disk, with transport connection wiped before final logout: http://pastebin.ubuntu.com/26139576/ - Multiple mounted disks, multipath dev-mapper, all transport connections were wiped before the final logout, with heavy write workload: http://pas

Re: [PATCH v5 3/7] scsi: libsas: make the event threshold configurable

2017-12-08 Thread John Garry
On 08/12/2017 09:42, Jason Yan wrote: Add a sysfs attr that the LLDD can configure for every host. We made an example in hisi_sas. Other LLDDs using libsas can implement it if they want. Suggested-by: Hannes Reinecke Signed-off-by: Jason Yan CC: John Garry CC: Johannes Thumshirn CC: Ewan Milne

Re: [PATCH v2 1/3] scsi: Fix a scsi_show_rq() NULL pointer dereference

2017-12-08 Thread Ming Lei
Hi Martin, On Fri, Dec 08, 2017 at 04:44:55PM +0800, Ming Lei wrote: > Hi Martin, > > On Thu, Dec 07, 2017 at 09:46:21PM -0500, Martin K. Petersen wrote: > > > > Ming, > > > > > As I explained in [1], the use-after-free is inevitable no matter if > > > clearing 'SCpnt->cmnd' before mempool_free

[PATCH] scsi_dh_alua: skip RTPG for devices only supporting active/optimized

2017-12-08 Thread Hannes Reinecke
From: Hannes Reinecke For hardware only supporting active/optimized there's no point in ever re-issuing RTPG as the only new state we can possibly read is active/optimized. This avoids spurious errors during path failover on such arrays. Signed-off-by: Hannes Reinecke --- drivers/scsi/device_ha

[PATCH v5 5/7] scsi: libsas: use flush_workqueue to process disco events synchronously

2017-12-08 Thread Jason Yan
Now we are processing sas events and discover events in different workqueues. It's safe to wait for the discover event to finish in the sas event work. Use flush_workqueue() to ensure the disco and revalidate events are processed synchronously so that the whole discover and revalidate process will not be interrup

[PATCH v5 7/7] scsi: libsas: notify event PORTE_BROADCAST_RCVD in sas_enable_revalidation()

2017-12-08 Thread Jason Yan
There are two places queuing the disco event DISCE_REVALIDATE_DOMAIN. One is in sas_porte_broadcast_rcvd() and uses sas_chain_event() to queue the event. The other is in sas_enable_revalidation() and uses sas_queue_event() to queue the event. We have different work queues for event and discovery now

[PATCH v5 6/7] scsi: libsas: direct call probe and destruct

2017-12-08 Thread Jason Yan
Commit 87c8331fcf72 ("[SCSI] libsas: prevent domain rediscovery competing with ata error handling") introduced the disco mutex to prevent rediscovery competing with ata error handling and put the whole revalidation in the mutex. But the rphy add/remove needs to wait for the error handling which also

[PATCH v5 4/7] scsi: libsas: Use new workqueue to run sas event and disco event

2017-12-08 Thread Jason Yan
Now all libsas works are queued to the scsi host workqueue, including sas event work posted by the LLDD and sas discovery work, and a sas hotplug flow may be divided into several works, e.g. when libsas receives a PORTE_BYTES_DMAED event, currently we process it in the following steps: sas_form_port --- run in work in s

[PATCH v5 2/7] scsi: libsas: shut down the PHY if events reached the threshold

2017-12-08 Thread Jason Yan
If the PHY bursts too many events, we will alloc a lot of events for the worker. This may lead to memory exhaustion. Dan Williams suggested shutting down the PHY if the events reached the threshold, because in this case the PHY may have gone into some erroneous state. Users can re-enable the PHY by
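The threshold idea described above can be sketched as a small userspace C model. This is only an illustration under stated assumptions: `struct toy_phy`, its fields, and both functions are hypothetical names, not libsas structures or APIs, and the sketch deliberately leaves out how a PHY is re-enabled, since the excerpt does not say.

```c
#include <stdbool.h>

/* Hypothetical per-PHY state, not a libsas structure. */
struct toy_phy {
	int pending_events;	/* events allocated but not yet processed */
	int event_thres;	/* maximum number of pending events allowed */
	bool enabled;		/* false once the PHY has been shut down */
};

/*
 * Tries to queue one event for the PHY. Returns true if the event was
 * accepted. Once the pending count has reached the threshold, the PHY
 * is shut down and this event and all later ones are refused, which
 * bounds the memory spent on pending events.
 */
static bool toy_phy_notify_event(struct toy_phy *phy)
{
	if (!phy->enabled)
		return false;

	if (phy->pending_events >= phy->event_thres) {
		phy->enabled = false;
		return false;
	}

	phy->pending_events++;
	return true;
}

/* Called by the (hypothetical) worker when one event has been handled. */
static void toy_phy_event_done(struct toy_phy *phy)
{
	if (phy->pending_events > 0)
		phy->pending_events--;
}
```

With a threshold of 2, the first two notifications are accepted and the third shuts the PHY down; it stays disabled even after the worker drains pending events.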

[PATCH v5 0/7] Enhance libsas hotplug feature

2017-12-08 Thread Jason Yan
Now the libsas hotplug has some issues. Dan Williams reported a similar bug here before: https://www.mail-archive.com/linux-scsi@vger.kernel.org/msg39187.html The issues we have found 1. if LLDD burst reports lots of phy-up/phy-down sas events, some events may be lost because a same sas events is pen

Re: [PATCH v2] scsi: libsas: fix length error in sas_smp_handler()

2017-12-08 Thread John Garry
On 07/12/2017 10:57, Jason Yan wrote: The bsg_job_done() requires the length of payload received, but we give it the untransferred residual. As I understand, this patch fixes (SES) enclosure management for libsas, so it's quite an important patch. Thanks, John Fixes: 651a01364994 ("scsi

[PATCH v5 1/7] scsi: libsas: Use dynamic alloced work to avoid sas event lost

2017-12-08 Thread Jason Yan
Now libsas hotplug work is static: every sas event type has its own static work, and the LLDD driver queues the hotplug work into shost->work_q. If the LLDD driver burst-posts lots of hotplug events to libsas, the hotplug events may be pending in the workqueue like shost->work_q new work[PORTE_BYTES_DMAED] --> |[PH

[PATCH v5 3/7] scsi: libsas: make the event threshold configurable

2017-12-08 Thread Jason Yan
Add a sysfs attr that the LLDD can configure for every host. We made an example in hisi_sas. Other LLDDs using libsas can implement it if they want. Suggested-by: Hannes Reinecke Signed-off-by: Jason Yan CC: John Garry CC: Johannes Thumshirn CC: Ewan Milne CC: Christoph Hellwig CC: Tomas Henzl

Re: [PATCH v2 1/3] scsi: Fix a scsi_show_rq() NULL pointer dereference

2017-12-08 Thread Ming Lei
Hi Martin, On Thu, Dec 07, 2017 at 09:46:21PM -0500, Martin K. Petersen wrote: > > Ming, > > > As I explained in [1], the use-after-free is inevitable no matter if > > clearing 'SCpnt->cmnd' before mempool_free() in sd_uninit_command() or > > not, so we need to comment the fact that cdb may poin