[PATCH 1/3] mpt3sas: Eliminate conditional locking in mpt3sas_scsih_issue_tm()

2016-07-28 Thread Calvin Owens
This flag that conditionally acquires the mutex is confusing and prone to bugginess: refactor it into two separate function calls, and make the unlocked one complain if it's called outside the mutex. Signed-off-by: Calvin Owens --- drivers/scsi/mpt3sas/mpt3sas_base.h | 16 +++-- dr
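
A minimal userspace sketch of the refactor described above, assuming pthreads in place of the kernel mutex API. The names `struct adapter`, `issue_tm()` and `issue_tm_locked()` are hypothetical and are not taken from the patch; the `trylock`-based check is only a rough stand-in for an in-kernel assertion such as `lockdep_assert_held()`.

```c
#include <errno.h>
#include <pthread.h>
#include <stdio.h>

/* Hypothetical adapter state: one mutex serializing task management. */
struct adapter {
	pthread_mutex_t tm_mutex;
	int tm_issued;
};

/*
 * Best-effort check that tm_mutex is held by somebody. If trylock
 * succeeds, nobody held it, so release it again and report "not locked".
 */
static int tm_mutex_is_locked(struct adapter *a)
{
	if (pthread_mutex_trylock(&a->tm_mutex) == 0) {
		pthread_mutex_unlock(&a->tm_mutex);
		return 0;
	}
	return 1;
}

/* Unlocked variant: the caller must already hold tm_mutex. */
static int issue_tm(struct adapter *a)
{
	if (!tm_mutex_is_locked(a)) {
		fprintf(stderr, "issue_tm() called without tm_mutex held\n");
		return -EPERM;
	}
	a->tm_issued++;
	return 0;
}

/* Locked variant: takes the mutex, then calls the unlocked one. */
static int issue_tm_locked(struct adapter *a)
{
	int ret;

	pthread_mutex_lock(&a->tm_mutex);
	ret = issue_tm(a);
	pthread_mutex_unlock(&a->tm_mutex);
	return ret;
}
```

Splitting the function this way means no caller passes a flag saying whether to lock; the locking contract is visible in the function name and checked at runtime.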

[PATCH 3/3] mpt3sas: Fix warnings exposed by W=1

2016-07-28 Thread Calvin Owens
l with the potential error is non-trivial, so for now just WARN(). Signed-off-by: Calvin Owens --- drivers/scsi/mpt3sas/mpt3sas_base.c | 18 +++- drivers/scsi/mpt3sas/mpt3sas_config.c| 4 +- drivers/scsi/mpt3sas/mpt3sas_ctl.c | 29 ++--- drivers/scsi/mpt3sas/mpt3s

[PATCH 2/3] mpt3sas: Eliminate dead sleep_flag code

2016-07-28 Thread Calvin Owens
With the exception of a single call to wait_for_doorbell_int(), all this conditional sleeping code is dead. So delete it. Signed-off-by: Calvin Owens --- drivers/scsi/mpt3sas/mpt3sas_base.c | 241 +-- drivers/scsi/mpt3sas/mpt3sas_base.h | 6 +- drivers

[PATCH] mpt3sas: Ensure the connector_name string is NUL-terminated

2016-07-27 Thread Calvin Owens
e 2nd byte beyond our character array happens to be a NUL. Fix this by explicitly writing '\0' to the end of the string to ensure we don't run off the edge of the world in printk(). Signed-off-by: Calvin Owens --- drivers/scsi/mpt3sas/mpt3sas_base.h | 2 +- drivers/scsi/mpt3sas/
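
A small sketch of the fix described above, assuming a fixed-width name field that arrives without a terminator. The 4-byte width and the helper name `copy_connector_name()` are assumptions for illustration, not details confirmed by the preview.

```c
#include <string.h>

/* Assumed width of the unterminated name field. */
#define CONNECTOR_NAME_LEN 4

/*
 * Copy an unterminated fixed-width field into dst and terminate it.
 * dst must have room for CONNECTOR_NAME_LEN + 1 bytes.
 */
static void copy_connector_name(char *dst, const char *src)
{
	memcpy(dst, src, CONNECTOR_NAME_LEN);
	dst[CONNECTOR_NAME_LEN] = '\0';	/* explicit NUL so "%s" stops here */
}
```

The destination buffer is one byte larger than the field, so the terminator never overwrites data and printing with `%s` cannot read past the array.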

Re: [PATCH] ses: Fix racy cleanup of /sys in remove_dev()

2016-07-27 Thread Calvin Owens
On 06/15/2016 01:24 PM, Calvin Owens wrote: On Thursday 06/02 at 15:50 -0700, Calvin Owens wrote: On 05/13/2016 01:28 PM, Calvin Owens wrote: Currently we free the resources backing the enclosure device before we call device_unregister(). This is racy: during rmmod of low-level SCSI drivers

Re: [BUG] Slab corruption during XFS writeback under memory pressure

2016-07-19 Thread Calvin Owens
On 07/18/2016 07:05 PM, Calvin Owens wrote: On 07/17/2016 11:02 PM, Dave Chinner wrote: On Sun, Jul 17, 2016 at 10:00:03AM +1000, Dave Chinner wrote: On Fri, Jul 15, 2016 at 05:18:02PM -0700, Calvin Owens wrote: Hello all, I've found a nasty source of slab corruption. Based on seeing si

Re: [BUG] Slab corruption during XFS writeback under memory pressure

2016-07-18 Thread Calvin Owens
On 07/17/2016 11:02 PM, Dave Chinner wrote: On Sun, Jul 17, 2016 at 10:00:03AM +1000, Dave Chinner wrote: On Fri, Jul 15, 2016 at 05:18:02PM -0700, Calvin Owens wrote: Hello all, I've found a nasty source of slab corruption. Based on seeing similar symptoms on boxes at Facebook, I su

[BUG] Slab corruption during XFS writeback under memory pressure

2016-07-15 Thread Calvin Owens
Hello all, I've found a nasty source of slab corruption. Based on seeing similar symptoms on boxes at Facebook, I suspect it's been around since at least 3.10. It only reproduces under memory pressure so far as I can tell: the issue seems to be that XFS reclaims pages from buffers that are still

Re: [PATCH] ses: Fix racy cleanup of /sys in remove_dev()

2016-06-15 Thread Calvin Owens
On Thursday 06/02 at 15:50 -0700, Calvin Owens wrote: > On 05/13/2016 01:28 PM, Calvin Owens wrote: > > Currently we free the resources backing the enclosure device before we > > call device_unregister(). This is racy: during rmmod of low-level SCSI > > drivers that hook into

Re: [PATCH] ses: Fix racy cleanup of /sys in remove_dev()

2016-06-02 Thread Calvin Owens
On 05/13/2016 01:28 PM, Calvin Owens wrote: Currently we free the resources backing the enclosure device before we call device_unregister(). This is racy: during rmmod of low-level SCSI drivers that hook into enclosure, we end up with a small window of time during which writing to /sys can OOPS

Re: [PATCH] mpt3sas: Do scsi_remove_host() before deleting SAS PHY objects

2016-05-18 Thread Calvin Owens
On Wednesday 05/18 at 18:44 +0530, Sreekanth Reddy wrote: > On Tue, May 17, 2016 at 8:43 AM, Calvin Owens wrote: > > On Monday 05/16 at 15:51 -0600, Sathya Prakash Veerichetty wrote: > >> Our understanding is the relationship between the SCSI host and SAS end > >> de

Re: [PATCH] mpt3sas: Do scsi_remove_host() before deleting SAS PHY objects

2016-05-16 Thread Calvin Owens
e only driver besides mpt*sas that calls sas_delete_port() explicitly is HPSA, and it does it in the opposite order mpt3sas does: scsi_remove_host() first. Thanks, Calvin > -Original Message- > From: Calvin Owens [mailto:calvinow...@fb.com] > Sent: Monday, May 16, 2016 2:25 PM > To:

Re: [PATCH] mpt3sas: Do scsi_remove_host() before deleting SAS PHY objects

2016-05-16 Thread Calvin Owens
On Friday 05/13 at 21:17 +, Elliott, Robert (Persistent Memory) wrote: > > > > -Original Message- > > From: linux-kernel-ow...@vger.kernel.org [mailto:linux-kernel- > > ow...@vger.kernel.org] On Behalf Of Calvin Owens > > Sent: Friday, May 13, 2016 3:2

[PATCH] mpt3sas: Do scsi_remove_host() before deleting SAS PHY objects

2016-05-13 Thread Calvin Owens
mple: just call scsi_remove_host() before we call sas_port_delete() and/or sas_remove_host(). Signed-off-by: Calvin Owens --- drivers/scsi/mpt3sas/mpt3sas_scsih.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c b/drivers/sc

[PATCH] ses: Fix racy cleanup of /sys in remove_dev()

2016-05-13 Thread Calvin Owens
driver core holds a reference over ->remove_dev(), so AFAICT this is safe. Signed-off-by: Calvin Owens --- drivers/scsi/ses.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/scsi/ses.c b/drivers/scsi/ses.c index 53ef1cb..0e8601a 100644 --- a/drivers/scsi/ses.c +++

[PATCH] mpt3sas: Don't overreach ioc->reply_post[] during initialization

2016-03-19 Thread Calvin Owens
pt3sas] [] do_one_initcall+0x113/0x2b0 [] do_init_module+0x1d0/0x4d8 [] load_module+0x6729/0x8dc0 [] SYSC_init_module+0x183/0x1a0 [] SyS_init_module+0xe/0x10 [] entry_SYSCALL_64_fastpath+0x12/0x6a Fix this by pulling the value at the beginning of the loop. Signed-off-by:

[PATCH] sg: Fix double-free when drives detach during SG_IO

2015-10-30 Thread Calvin Owens
->cmd if it isn't embedded in the object itself. KASAN was extremely helpful in finding the root cause of this bug. Signed-off-by: Calvin Owens --- drivers/scsi/sg.c | 8 +++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c index 9d7b7db

Re: [PATCH 1/2] mpt3sas: Refcount sas_device objects and fix unsafe list usage

2015-08-26 Thread Calvin Owens
a safe way. > > This patch is a port of Calvin's PATCH-v4 for mpt2sas code. > > Cc: Calvin Owens > Cc: Christoph Hellwig > Cc: Sreekanth Reddy > Cc: MPT-FusionLinux.pdl > Signed-off-by: Nicholas Bellinger > --- > drivers/scsi/mpt3sas/mpt3s

Re: [PATCH 2/2] mpt3sas: Refcount fw_events and fix unsafe list usage

2015-08-26 Thread Calvin Owens
eanup_queue() such that it > no longer iterates over the list without holding the lock, since > _firmware_event_work() concurrently deletes items from the list. > > This patch is a port of Calvin's PATCH-v4 for mpt2sas code. > > Cc: Calvin Owens > Cc: Christoph Hellwig >

Re: [PATCH 0/2] mpt3sas: Reference counting fixes from in-flight mpt2sas

2015-08-26 Thread Calvin Owens
On Wednesday 08/26 at 04:09 +, Nicholas A. Bellinger wrote: > From: Nicholas Bellinger > > Hi James & Co, > > This series is a mpt3sas forward port of Calvin Owens' in-flight > reference counting bugfixes for mpt2sas LLD code here: > > [PATCH v4 0/2] Fixes

[PATCH v4 1/2] mpt2sas: Refcount sas_device objects and fix unsafe list usage

2015-08-13 Thread Calvin Owens
k, or we risk corrupting random memory if items are added or deleted as we iterate. This patch refactors _scsih_probe_sas() to use the sas_device_list in a safe way. Cc: Christoph Hellwig Cc: Bart Van Assche Cc: Joe Lawrence Signed-off-by: Calvin Owens --- Changes in v4: * Fix lack o

[PATCH v4 0/2] Fixes for memory corruption in mpt2sas

2015-08-13 Thread Calvin Owens
Hello all, This patchset attempts to address problems we've been having with panics due to memory corruption from the mpt2sas driver. Thanks, Calvin [PATCH v4 1/2] mpt2sas: Refcount sas_device objects and fix unsafe list [PATCH v4 2/2] mpt2sas: Refcount fw_events and fix unsafe list usage Tota

[PATCH v4 2/2] mpt2sas: Refcount fw_events and fix unsafe list usage

2015-08-13 Thread Calvin Owens
() concurrently deletes items from the list. Cc: Christoph Hellwig Signed-off-by: Calvin Owens --- Changes in v4: None Changes in v3: * Add a break condition to the REMOVE_UNRESPONDING_DEVICES fw_event, which can loop over a sleep forever (5m+ at least) at unloading. I don't

Re: [PATCH v3 1/2] mpt2sas: Refcount sas_device objects and fix unsafe list usage

2015-08-13 Thread Calvin Owens
On Monday 08/10 at 18:45 +0530, Sreekanth Reddy wrote: > On Sat, Aug 1, 2015 at 10:32 AM, Calvin Owens wrote: Sreekanth, Thanks for the review, responses below. I'll have a v4 out shortly. Calvin > > These objects can be referenced concurrently throughout the driver, we > >

[PATCH v3 1/2] mpt2sas: Refcount sas_device objects and fix unsafe list usage

2015-07-31 Thread Calvin Owens
k, or we risk corrupting random memory if items are added or deleted as we iterate. This patch refactors _scsih_probe_sas() to use the sas_device_list in a safe way. Cc: Christoph Hellwig Cc: Bart Van Assche Cc: Joe Lawrence Signed-off-by: Calvin Owens --- Changes in v3: * Dro

[PATCH v3 0/2] Fixes for memory corruption in mpt2sas

2015-07-31 Thread Calvin Owens
Hello all, This patchset attempts to address problems we've been having with panics due to memory corruption from the mpt2sas driver. Changes are noted in the individual patches, I realized putting them in the cover was probably a bit confusing. Thanks, Calvin Patches in this series: [PATCH v

[PATCH v3 2/2] mpt2sas: Refcount fw_events and fix unsafe list usage

2015-07-31 Thread Calvin Owens
() concurrently deletes items from the list. Cc: Christoph Hellwig Signed-off-by: Calvin Owens --- Changes in v3: * Add a break condition to the REMOVE_UNRESPONDING_DEVICES fw_event, which can loop over a sleep forever (5m+ at least) at unloading. I don't think anything prev

Re: [PATCH 1/2] mpt2sas: Refcount sas_device objects and fix unsafe list usage

2015-07-21 Thread Calvin Owens
On Sunday 07/12 at 23:52 -0700, Christoph Hellwig wrote: > On Sat, Jul 11, 2015 at 09:24:55PM -0700, Calvin Owens wrote: > > These objects can be referenced concurrently throughout the driver, we > > need a way to make sure threads can't delete them out from under each > >

Re: [PATCH 1/2] mpt2sas: Refcount sas_device objects and fix unsafe list usage

2015-07-21 Thread Calvin Owens
On Monday 07/13 at 11:05 -0400, Joe Lawrence wrote: > On 07/12/2015 12:24 AM, Calvin Owens wrote: > > These objects can be referenced concurrently throughout the driver, we > > need a way to make sure threads can't delete them out from under each > > other. This pa

Re: [PATCH 1/2] mpt2sas: Refcount sas_device objects and fix unsafe list usage

2015-07-21 Thread Calvin Owens
On Thursday 07/16 at 20:27 +0530, Sreekanth Reddy wrote: > On Sun, Jul 12, 2015 at 9:54 AM, Calvin Owens wrote: > > These objects can be referenced concurrently throughout the driver, we > > need a way to make sure threads can't delete them out from under each > >

[PATCH 1/2] mpt2sas: Refcount sas_device objects and fix unsafe list usage

2015-07-11 Thread Calvin Owens
k, or we risk corrupting random memory if items are added or deleted as we iterate. This patch refactors _scsih_probe_sas() to use the sas_device_list in a safe way. Cc: Christoph Hellwig Cc: Bart Van Assche Signed-off-by: Calvin Owens --- drivers/scsi/mpt2sas/mpt2sas_base.h | 22 +- dr

[PATCH 0/2 v2] Fixes for memory corruption in mpt2sas

2015-07-11 Thread Calvin Owens
Hello all, This patchset attempts to address problems we've been having with panics due to memory corruption from the mpt2sas driver. Thanks, Calvin Patches in this series: [PATCH 1/2] mpt2sas: Refcount sas_device objects and fix unsafe list usage [PATCH 2/2] mpt2sas: Refcount fw_events and fix

[PATCH 2/2] mpt2sas: Refcount fw_events and fix unsafe list usage

2015-07-11 Thread Calvin Owens
() concurrently deletes items from the list. Cc: Christoph Hellwig Cc: Bart Van Assche Signed-off-by: Calvin Owens --- drivers/scsi/mpt2sas/mpt2sas_scsih.c | 101 --- 1 file changed, 81 insertions(+), 20 deletions(-) diff --git a/drivers/scsi/mpt2sas/mpt2sas_scsih.c b

Re: [PATCH 6/6] Fix unsafe fw_event_list usage

2015-07-11 Thread Calvin Owens
On Friday 07/03 at 09:02 -0700, Christoph Hellwig wrote: > On Mon, Jun 08, 2015 at 08:50:56PM -0700, Calvin Owens wrote: > > Since the fw_event deletes itself from the list, cleanup_queue() can > > walk onto garbage pointers or walk off into freed memory. > > > >

Re: [PATCH 2/6] Refactor code to use new sas_device refcount

2015-07-11 Thread Calvin Owens
On Friday 07/03 at 08:38 -0700, Christoph Hellwig wrote: > > > > +struct _sas_device * > > +mpt2sas_scsih_sas_device_get_by_sas_address_nolock(struct MPT2SAS_ADAPTER > > *ioc, > > +u64 sas_address) > > Any chance to use a shorter name for this function? E.g. > __mpt2sas_get_sdev_by_addr ?

Re: [PATCH 5/6] Refactor code to use new fw_event refcount

2015-07-11 Thread Calvin Owens
Thanks for this, I'm sending a v2 shortly. On Friday 07/03 at 09:00 -0700, Christoph Hellwig wrote: > On Mon, Jun 08, 2015 at 08:50:55PM -0700, Calvin Owens wrote: > > This refactors the fw_event code to use the new refcount. > > I spent some time looking over this

[PATCH 4/6] Add refcount to fw_event_work struct

2015-06-08 Thread Calvin Owens
The fw_event_work struct is concurrently referenced at shutdown, so add a refcount to protect it. Signed-off-by: Calvin Owens --- drivers/scsi/mpt2sas/mpt2sas_scsih.c | 28 1 file changed, 28 insertions(+) diff --git a/drivers/scsi/mpt2sas/mpt2sas_scsih.c b

[PATCH 3/6] Fix unsafe sas_device_list usage

2015-06-08 Thread Calvin Owens
We cannot iterate over the list without holding a lock for the entire duration, or we risk corrupting random memory if items are added or deleted as we iterate. This refactors code such that it always holds the lock when iterating on or accessing the sas_device_list. Signed-off-by: Calvin Owens
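
The rule stated above can be sketched in userspace as follows, assuming a pthread mutex in place of the driver's lock and a plain singly linked list in place of the kernel list. `struct sas_dev`, `struct dev_list` and both helpers are hypothetical names.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical list node and list head protected by one lock. */
struct sas_dev {
	unsigned long long sas_address;
	struct sas_dev *next;
};

struct dev_list {
	pthread_mutex_t lock;
	struct sas_dev *head;
};

/* Insert at the head; the list is modified only with the lock held. */
static void dev_list_add(struct dev_list *l, struct sas_dev *d)
{
	pthread_mutex_lock(&l->lock);
	d->next = l->head;
	l->head = d;
	pthread_mutex_unlock(&l->lock);
}

/*
 * Search the list. The lock is held for the whole walk, so no node
 * can be added or removed while we follow ->next pointers.
 */
static int dev_list_contains(struct dev_list *l, unsigned long long addr)
{
	struct sas_dev *d;
	int found = 0;

	pthread_mutex_lock(&l->lock);
	for (d = l->head; d != NULL; d = d->next) {
		if (d->sas_address == addr) {
			found = 1;
			break;
		}
	}
	pthread_mutex_unlock(&l->lock);
	return found;
}
```

Returning only a flag, rather than a pointer into the list, avoids handing out a node that could be freed once the lock is dropped; returning the node safely would need a reference count as well.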

[PATCH 5/6] Refactor code to use new fw_event refcount

2015-06-08 Thread Calvin Owens
This refactors the fw_event code to use the new refcount. Signed-off-by: Calvin Owens --- drivers/scsi/mpt2sas/mpt2sas_scsih.c | 20 +--- 1 file changed, 13 insertions(+), 7 deletions(-) diff --git a/drivers/scsi/mpt2sas/mpt2sas_scsih.c b/drivers/scsi/mpt2sas/mpt2sas_scsih.c

[PATCH 6/6] Fix unsafe fw_event_list usage

2015-06-08 Thread Calvin Owens
Since the fw_event deletes itself from the list, cleanup_queue() can walk onto garbage pointers or walk off into freed memory. This refactors the code in _scsih_fw_event_cleanup_queue() to not iterate over the fw_event_list without a lock. Signed-off-by: Calvin Owens --- drivers/scsi/mpt2sas
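
One possible way to drain such a queue without walking it unlocked, sketched in userspace with pthreads. This is an assumption about the general technique, not the patch's actual code; `struct fw_event`, `struct fw_queue` and the helpers are hypothetical.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical event node and queue protected by one lock. */
struct fw_event {
	int id;
	struct fw_event *next;
};

struct fw_queue {
	pthread_mutex_t lock;
	struct fw_event *head;
};

/* Push an event onto the queue under the lock. */
static void fw_enqueue(struct fw_queue *q, struct fw_event *ev)
{
	pthread_mutex_lock(&q->lock);
	ev->next = q->head;
	q->head = ev;
	pthread_mutex_unlock(&q->lock);
}

/* Detach and return the first event, or NULL if the queue is empty. */
static struct fw_event *fw_dequeue(struct fw_queue *q)
{
	struct fw_event *ev;

	pthread_mutex_lock(&q->lock);
	ev = q->head;
	if (ev != NULL) {
		q->head = ev->next;
		ev->next = NULL;
	}
	pthread_mutex_unlock(&q->lock);
	return ev;
}

/*
 * Drain the queue one event at a time. Each event is unlinked under
 * the lock before it is touched, so we never follow a pointer that
 * another thread may have freed. Returns the number of events freed.
 */
static int fw_cleanup_queue(struct fw_queue *q)
{
	struct fw_event *ev;
	int freed = 0;

	while ((ev = fw_dequeue(q)) != NULL) {
		free(ev);
		freed++;
	}
	return freed;
}
```

Because the lock is dropped between dequeues, per-event work that may sleep can run outside the lock while the list itself is only ever read or modified with the lock held.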

[PATCH 2/6] Refactor code to use new sas_device refcount

2015-06-08 Thread Calvin Owens
This patch refactors the code in the driver to use the new reference count on the sas_device struct. Signed-off-by: Calvin Owens --- drivers/scsi/mpt2sas/mpt2sas_base.h | 4 +- drivers/scsi/mpt2sas/mpt2sas_scsih.c | 329 --- drivers/scsi/mpt2sas

[PATCH 1/6] Add refcount to sas_device struct

2015-06-08 Thread Calvin Owens
These objects can be referenced concurrently throughout the driver, we need a way to make sure threads can't delete them out from under each other. Signed-off-by: Calvin Owens --- drivers/scsi/mpt2sas/mpt2sas_base.h | 16 1 file changed, 16 insertions(+) diff --git a/dr
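
A minimal userspace sketch of the reference-counting idea described above, assuming C11 atomics in place of the kernel's `kref`. The struct and function names are hypothetical and do not come from the patch.

```c
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical refcounted object. */
struct sas_dev_obj {
	atomic_int refcount;
	unsigned long long sas_address;
};

/* Allocate an object holding one reference, owned by the caller. */
static struct sas_dev_obj *sas_dev_alloc(unsigned long long addr)
{
	struct sas_dev_obj *d = calloc(1, sizeof(*d));

	if (d == NULL)
		return NULL;
	atomic_init(&d->refcount, 1);
	d->sas_address = addr;
	return d;
}

/* Take an additional reference. */
static void sas_dev_get(struct sas_dev_obj *d)
{
	atomic_fetch_add(&d->refcount, 1);
}

/*
 * Drop a reference. The object is freed only when the last reference
 * goes away. Returns 1 if the object was freed, 0 otherwise.
 */
static int sas_dev_put(struct sas_dev_obj *d)
{
	if (atomic_fetch_sub(&d->refcount, 1) == 1) {
		free(d);
		return 1;
	}
	return 0;
}
```

Each thread that keeps a pointer to the object takes its own reference first, so a concurrent "delete" only drops the list's reference and the memory stays valid until the last user calls `sas_dev_put()`.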

[RESEND][PATCH 0/6] Fixes for memory corruption in mpt2sas

2015-06-08 Thread Calvin Owens
Hello all, This patchset attempts to address problems we've been having with panics due to memory corruption from the mpt2sas driver. I will provide a similar set of fixes for mpt3sas, since we see similar issues there as well. "Porting" this to mpt3sas will be trivial since the part of the drive