date:20190116

Re: [PATCH v1 5/8] scsi: ufs: qcom: Expose the reset controller for PHY

2019-01-16 Thread Kishon Vijay Abraham I




On 12/01/19 4:31 AM, Evan Green wrote:
> Expose a reset controller that the phy can use to perform its
> initialization in a single callback.
> 
> Also, change the use of the phy functions from ufs-qcom such that
> phy_poweron actually fires up the phy, and phy_poweroff actually
> powers it down.
> 
> Signed-off-by: Evan Green 

Can I get Ack for this patch from SCSI MAINTAINERS?

Thanks
Kishon

> 
> ---
> Note: This change depends on the remaining changes in this series,
> since UFS PHY reset now needs to be done by the PHY driver.
> 
>  drivers/scsi/ufs/Kconfig|   1 +
>  drivers/scsi/ufs/ufs-qcom.c | 110 +---
>  drivers/scsi/ufs/ufs-qcom.h |   4 ++
>  3 files changed, 71 insertions(+), 44 deletions(-)
> 
> diff --git a/drivers/scsi/ufs/Kconfig b/drivers/scsi/ufs/Kconfig
> index 2ddbb26d9c265..63c5c4115981f 100644
> --- a/drivers/scsi/ufs/Kconfig
> +++ b/drivers/scsi/ufs/Kconfig
> @@ -100,6 +100,7 @@ config SCSI_UFS_QCOM
>   tristate "QCOM specific hooks to UFS controller platform driver"
>   depends on SCSI_UFSHCD_PLATFORM && ARCH_QCOM
>   select PHY_QCOM_UFS
> + select RESET_CONTROLLER
>   help
> This selects the QCOM specific additions to UFSHCD platform driver.
> UFS host on QCOM needs some vendor specific configuration before
> diff --git a/drivers/scsi/ufs/ufs-qcom.c b/drivers/scsi/ufs/ufs-qcom.c
> index 3aeadb14aae1e..db46f9a64b54c 100644
> --- a/drivers/scsi/ufs/ufs-qcom.c
> +++ b/drivers/scsi/ufs/ufs-qcom.c
> @@ -16,6 +16,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #include "ufshcd.h"
>  #include "ufshcd-pltfrm.h"
> @@ -49,6 +50,11 @@ static void ufs_qcom_get_default_testbus_cfg(struct 
> ufs_qcom_host *host);
>  static int ufs_qcom_set_dme_vs_core_clk_ctrl_clear_div(struct ufs_hba *hba,
>  u32 clk_cycles);
>  
> +static struct ufs_qcom_host *rcdev_to_ufs_host(struct reset_controller_dev 
> *rcd)
> +{
> + return container_of(rcd, struct ufs_qcom_host, rcdev);
> +}
> +
>  static void ufs_qcom_dump_regs_wrapper(struct ufs_hba *hba, int offset, int 
> len,
>  const char *prefix, void *priv)
>  {
> @@ -255,11 +261,6 @@ static int ufs_qcom_power_up_sequence(struct ufs_hba 
> *hba)
>   if (is_rate_B)
>   phy_set_mode(phy, PHY_MODE_UFS_HS_B);
>  
> - /* Assert PHY reset and apply PHY calibration values */
> - ufs_qcom_assert_reset(hba);
> - /* provide 1ms delay to let the reset pulse propagate */
> - usleep_range(1000, 1100);
> -
>   /* phy initialization - calibrate the phy */
>   ret = phy_init(phy);
>   if (ret) {
> @@ -268,15 +269,6 @@ static int ufs_qcom_power_up_sequence(struct ufs_hba 
> *hba)
>   goto out;
>   }
>  
> - /* De-assert PHY reset and start serdes */
> - ufs_qcom_deassert_reset(hba);
> -
> - /*
> -  * after reset deassertion, phy will need all ref clocks,
> -  * voltage, current to settle down before starting serdes.
> -  */
> - usleep_range(1000, 1100);
> -
>   /* power on phy - start serdes and phy's power and clocks */
>   ret = phy_power_on(phy);
>   if (ret) {
> @@ -290,7 +282,6 @@ static int ufs_qcom_power_up_sequence(struct ufs_hba *hba)
>   return 0;
>  
>  out_disable_phy:
> - ufs_qcom_assert_reset(hba);
>   phy_exit(phy);
>  out:
>   return ret;
> @@ -554,21 +545,10 @@ static int ufs_qcom_suspend(struct ufs_hba *hba, enum 
> ufs_pm_op pm_op)
>   ufs_qcom_disable_lane_clks(host);
>   phy_power_off(phy);
>  
> - /* Assert PHY soft reset */
> - ufs_qcom_assert_reset(hba);
> - goto out;
> - }
> -
> - /*
> -  * If UniPro link is not active, PHY ref_clk, main PHY analog power
> -  * rail and low noise analog power rail for PLL can be switched off.
> -  */
> - if (!ufs_qcom_is_link_active(hba)) {
> + } else if (!ufs_qcom_is_link_active(hba)) {
>   ufs_qcom_disable_lane_clks(host);
> - phy_power_off(phy);
>   }
>  
> -out:
>   return ret;
>  }
>  
> @@ -578,21 +558,26 @@ static int ufs_qcom_resume(struct ufs_hba *hba, enum 
> ufs_pm_op pm_op)
>   struct phy *phy = host->generic_phy;
>   int err;
>  
> - err = phy_power_on(phy);
> - if (err) {
> - dev_err(hba->dev, "%s: failed enabling regs, err = %d\n",
> - __func__, err);
> - goto out;
> - }
> + if (ufs_qcom_is_link_off(hba)) {
> + err = phy_power_on(phy);
> + if (err) {
> + dev_err(hba->dev, "%s: failed enabling regs, err = 
> %d\n",
> + __func__, err);
> + return err;
> + }
>  
> - err = ufs_qcom_enable_lane_clks(host);
> - if (err)
> - goto out;
> + err = ufs_qcom_enable_lane_clks(host);
> + if (err)

Re: [PATCH v1] scsi: ufs: Use explicit access size in ufshcd_dump_regs

2019-01-16 Thread Marc Gonzalez

On 09/01/2019 16:38, Jeffrey Hugo wrote:

>> On 11/12/2018 15:18, Marc Gonzalez wrote:
>>
>>> memcpy_fromio() doesn't provide any control over access size.
>>> For example, on arm64, it is implemented using readb and readq.
>>> This may trigger a synchronous external abort:
>>>
>>> [3.729943] Internal error: synchronous external abort: 96000210 [#1] 
>>> PREEMPT SMP
>>> [3.737000] Modules linked in:
>>> [3.744371] CPU: 2 PID: 1 Comm: swapper/0 Tainted: G S
>>> 4.20.0-rc4 #16
>>> [3.747413] Hardware name: Qualcomm Technologies, Inc. MSM8998 v1 MTP 
>>> (DT)
>>> [3.755295] pstate: 0005 (nzcv daif -PAN -UAO)
>>> [3.761978] pc : __memcpy_fromio+0x68/0x80
>>> [3.766718] lr : ufshcd_dump_regs+0x50/0xb0
>>> [3.770767] sp : 0807ba00
>>> [3.774830] x29: 0807ba00 x28: fffb
>>> [3.778344] x27: 089db068 x26: 8000f6e58000
>>> [3.783728] x25: 000e x24: 0800
>>> [3.789023] x23: 8000f6e587c8 x22: 0800
>>> [3.794319] x21: 08908368 x20: 8000f6e1ab80
>>> [3.799615] x19: 006c x18: 
>>> [3.804910] x17:  x16: 
>>> [3.810206] x15: 09199648 x14: 89244187
>>> [3.815502] x13: 09244195 x12: 091ab000
>>> [3.820797] x11: 05f5e0ff x10: 091998a0
>>> [3.826093] x9 :  x8 : 8000f6e1ac00
>>> [3.831389] x7 :  x6 : 0068
>>> [3.836676] x5 : 8000f6e1abe8 x4 : 
>>> [3.841971] x3 : 0928c868 x2 : 8000f6e1abec
>>> [3.847267] x1 : 0928c868 x0 : 8000f6e1abe8
>>> [3.852567] Process swapper/0 (pid: 1, stack limit = 0x(ptrval))
>>> [3.857900] Call trace:
>>> [3.864473]  __memcpy_fromio+0x68/0x80
>>> [3.866683]  ufs_qcom_dump_dbg_regs+0x1c0/0x370
>>> [3.870522]  ufshcd_print_host_regs+0x168/0x190
>>> [3.874946]  ufshcd_init+0xd4c/0xde0
>>> [3.879459]  ufshcd_pltfrm_init+0x3c8/0x550
>>> [3.883264]  ufs_qcom_probe+0x24/0x60
>>> [3.887188]  platform_drv_probe+0x50/0xa0
>>>
>>> Assuming aligned 32-bit registers, let's use readl, after making sure
>>> that 'offset' and 'len' are indeed multiples of 4.
>>>
>>> Fixes: ba80917d9932d ("scsi: ufs: ufshcd_dump_regs to use memcpy_fromio")
>>> Signed-off-by: Marc Gonzalez 
>>> ---
>>>   drivers/scsi/ufs/ufshcd.c | 10 --
>>>   1 file changed, 8 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
>>> index 535180c01ce8..320bbd9849bc 100644
>>> --- a/drivers/scsi/ufs/ufshcd.c
>>> +++ b/drivers/scsi/ufs/ufshcd.c
>>> @@ -112,13 +112,19 @@
>>>   int ufshcd_dump_regs(struct ufs_hba *hba, size_t offset, size_t len,
>>>  const char *prefix)
>>>   {
>>> -   u8 *regs;
>>> +   u32 *regs;
>>> +   size_t pos;
>>> +
>>> +   if (offset % 4 != 0 || len % 4 != 0) /* keep readl happy */
>>> +   return -EINVAL;
> 
> Hmm.  It seems like these cases could be handled, but I guess we cannot 
> necessarily assume that reading past the bounds specified by the client 
> is safe, so this seems reasonable.
> 
>>>   
>>> regs = kzalloc(len, GFP_KERNEL);
>>> if (!regs)
>>> return -ENOMEM;
>>>   
>>> -   memcpy_fromio(regs, hba->mmio_base + offset, len);
>>> +   for (pos = 0; pos < len; pos += 4)
>>> +   regs[pos / 4] = ufshcd_readl(hba, offset + pos);
>>> +
>>> ufshcd_hex_dump(prefix, regs, len);
>>> kfree(regs);
>>>   
> 
> Reviewed-by: Jeffrey Hugo 

James, Martin,

I haven't heard back from Vinayak. (In fact, his last review dates back
to 2016. Should MAINTAINERS be updated? I can send a patch.)

Would you mind taking this patch through the SCSI tree?

Tomas, does "LGTM" translate to Acked-by? :-D

Regards.

RE: [PATCH v1] scsi: ufs: Use explicit access size in ufshcd_dump_regs

2019-01-16 Thread Winkler, Tomas

> 
> On 09/01/2019 16:38, Jeffrey Hugo wrote:
> 
> >> On 11/12/2018 15:18, Marc Gonzalez wrote:
> >>
> >>> memcpy_fromio() doesn't provide any control over access size.
> >>> For example, on arm64, it is implemented using readb and readq.
> >>> This may trigger a synchronous external abort:
> >>>
> >>> [3.729943] Internal error: synchronous external abort: 96000210 [#1]
> PREEMPT SMP
> >>> [3.737000] Modules linked in:
> >>> [3.744371] CPU: 2 PID: 1 Comm: swapper/0 Tainted: G S
> >>> 4.20.0-
> rc4 #16
> >>> [3.747413] Hardware name: Qualcomm Technologies, Inc. MSM8998 v1
> MTP (DT)
> >>> [3.755295] pstate: 0005 (nzcv daif -PAN -UAO)
> >>> [3.761978] pc : __memcpy_fromio+0x68/0x80
> >>> [3.766718] lr : ufshcd_dump_regs+0x50/0xb0
> >>> [3.770767] sp : 0807ba00
> >>> [3.774830] x29: 0807ba00 x28: fffb
> >>> [3.778344] x27: 089db068 x26: 8000f6e58000
> >>> [3.783728] x25: 000e x24: 0800
> >>> [3.789023] x23: 8000f6e587c8 x22: 0800
> >>> [3.794319] x21: 08908368 x20: 8000f6e1ab80
> >>> [3.799615] x19: 006c x18: 
> >>> [3.804910] x17:  x16: 
> >>> [3.810206] x15: 09199648 x14: 89244187
> >>> [3.815502] x13: 09244195 x12: 091ab000
> >>> [3.820797] x11: 05f5e0ff x10: 091998a0
> >>> [3.826093] x9 :  x8 : 8000f6e1ac00
> >>> [3.831389] x7 :  x6 : 0068
> >>> [3.836676] x5 : 8000f6e1abe8 x4 : 
> >>> [3.841971] x3 : 0928c868 x2 : 8000f6e1abec
> >>> [3.847267] x1 : 0928c868 x0 : 8000f6e1abe8
> >>> [3.852567] Process swapper/0 (pid: 1, stack limit = 
> >>> 0x(ptrval))
> >>> [3.857900] Call trace:
> >>> [3.864473]  __memcpy_fromio+0x68/0x80
> >>> [3.866683]  ufs_qcom_dump_dbg_regs+0x1c0/0x370
> >>> [3.870522]  ufshcd_print_host_regs+0x168/0x190
> >>> [3.874946]  ufshcd_init+0xd4c/0xde0
> >>> [3.879459]  ufshcd_pltfrm_init+0x3c8/0x550
> >>> [3.883264]  ufs_qcom_probe+0x24/0x60
> >>> [3.887188]  platform_drv_probe+0x50/0xa0
> >>>
> >>> Assuming aligned 32-bit registers, let's use readl, after making
> >>> sure that 'offset' and 'len' are indeed multiples of 4.
> >>>
> >>> Fixes: ba80917d9932d ("scsi: ufs: ufshcd_dump_regs to use
> >>> memcpy_fromio")
> >>> Signed-off-by: Marc Gonzalez 
> >>> ---
> >>>   drivers/scsi/ufs/ufshcd.c | 10 --
> >>>   1 file changed, 8 insertions(+), 2 deletions(-)
> >>>
> >>> diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
> >>> index 535180c01ce8..320bbd9849bc 100644
> >>> --- a/drivers/scsi/ufs/ufshcd.c
> >>> +++ b/drivers/scsi/ufs/ufshcd.c
> >>> @@ -112,13 +112,19 @@
> >>>   int ufshcd_dump_regs(struct ufs_hba *hba, size_t offset, size_t len,
> >>>const char *prefix)
> >>>   {
> >>> - u8 *regs;
> >>> + u32 *regs;
> >>> + size_t pos;
> >>> +
> >>> + if (offset % 4 != 0 || len % 4 != 0) /* keep readl happy */
> >>> + return -EINVAL;
> >
> > Hmm.  It seems like these cases could be handled, but I guess we
> > cannot necessarily assume that reading past the bounds specified by
> > the client is safe, so this seems reasonable.
> >
> >>>
> >>>   regs = kzalloc(len, GFP_KERNEL);
> >>>   if (!regs)
> >>>   return -ENOMEM;
> >>>
> >>> - memcpy_fromio(regs, hba->mmio_base + offset, len);
> >>> + for (pos = 0; pos < len; pos += 4)
> >>> + regs[pos / 4] = ufshcd_readl(hba, offset + pos);
> >>> +
> >>>   ufshcd_hex_dump(prefix, regs, len);
> >>>   kfree(regs);
> >>>
> >
> > Reviewed-by: Jeffrey Hugo 
> 
> James, Martin,
> 
> I haven't heard back from Vinayak. (In fact, his last review dates back to 
> 2016.
> Should MAINTAINERS be updated? I can send a patch.)
> 
> Would you mind taking this patch through the SCSI tree?
> 
> Tomas, does "LGTM" translate to Acked-by? :-D
Yes, you can add my Ack.

Thanks
Tomas

Re: [PATCH] sd: skip non-removable devices in sd_check_events()

2019-01-16 Thread Martin Wilck

On Wed, 2019-01-16 at 08:35 +0100, Hannes Reinecke wrote:
> If the device is _not_ removable we should not start the event
> poller as the media will not go away. Having the event poller running
> will block the open() call as it will try to flush outstanding
> events,
> which it can't if the device is in state 'BLOCKED'. So the open()
> call
> will be stalled until the device state changed, which might be quite
> some time depending on the transport.
> 
> Signed-off-by: Hannes Reinecke 
> ---
>  drivers/scsi/sd.c | 3 +++
>  1 file changed, 3 insertions(+)

Thanks - you lost patience with me, did you? :-)

I didn't submit this yet because Tejun (adding him to CC) had
explicitly removed this check in commit 2bae0093cab4, and I was still
trying to understand why.

Perhaps a follow-up discussion will clarify that.

Martin

> 
> diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
> index a1a44f52e0e8..521f0a384446 100644
> --- a/drivers/scsi/sd.c
> +++ b/drivers/scsi/sd.c
> @@ -1539,6 +1539,9 @@ static unsigned int sd_check_events(struct
> gendisk *disk, unsigned int clearing)
>   return 0;
>  
>   sdp = sdkp->device;
> + if (!sdp->removable)
> + return 0;
> +
>   SCSI_LOG_HLQUEUE(3, sd_printk(KERN_INFO, sdkp,
> "sd_check_events\n"));
>  
>   /*

Re: [PATCH] sd: skip non-removable devices in sd_check_events()

2019-01-16 Thread Hannes Reinecke


On 1/16/19 11:26 AM, Martin Wilck wrote:

On Wed, 2019-01-16 at 08:35 +0100, Hannes Reinecke wrote:

If the device is _not_ removable we should not start the event
poller as the media will not go away. Having the event poller running
will block the open() call as it will try to flush outstanding
events,
which it can't if the device is in state 'BLOCKED'. So the open()
call
will be stalled until the device state changed, which might be quite
some time depending on the transport.

Signed-off-by: Hannes Reinecke 
---
  drivers/scsi/sd.c | 3 +++
  1 file changed, 3 insertions(+)


Thanks - you lost patience with me, did you? :-)

I didn't submit this yet because Tejun (adding him to CC) had
explicitly removed this check in commit 2bae0093cab4, and I was still
trying to understand why.

Perhaps a follow-up discussion will clarify that.


To my understanding Tejun removed it as the check got moved into
set_media_not_present() function.
However, I'd be very much interested to hear on how a non-removable 
device can change ...
Sure, there might be non-compatible devices which do _not_ set the 
removable flag (despite them being removable) and the notorious aacraid 
which insists on claiming all devices being removable, but if the need 
arises we can always introduce a blacklist flag.


Cheers,

Hannes
--
Dr. Hannes ReineckeTeamlead Storage & Networking
h...@suse.de   +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)

Re: [PATCH] sd: skip non-removable devices in sd_check_events()

2019-01-16 Thread Martin Wilck

On Wed, 2019-01-16 at 11:32 +0100, Hannes Reinecke wrote:
> On 1/16/19 11:26 AM, Martin Wilck wrote:
> > On Wed, 2019-01-16 at 08:35 +0100, Hannes Reinecke wrote:
> > > If the device is _not_ removable we should not start the event
> > > poller as the media will not go away. Having the event poller
> > > running
> > > will block the open() call as it will try to flush outstanding
> > > events,
> > > which it can't if the device is in state 'BLOCKED'. So the open()
> > > call
> > > will be stalled until the device state changed, which might be
> > > quite
> > > some time depending on the transport.
> > > 
> > > Signed-off-by: Hannes Reinecke 
> > > ---
> > >   drivers/scsi/sd.c | 3 +++
> > >   1 file changed, 3 insertions(+)
> > 
> > Thanks - you lost patience with me, did you? :-)
> > 
> > I didn't submit this yet because Tejun (adding him to CC) had
> > explicitly removed this check in commit 2bae0093cab4, and I was
> > still
> > trying to understand why.
> > 
> > Perhaps a follow-up discussion will clarify that.
> > 
> To my understanding Tejun removed it as the check got moved into
> set_media_not_present() function.

Maybe, but as a matter of fact, after 2bae0093cab4, TUR could be
executed for non-removable disks in sd_check_events(), which was never
done in sd_media_changed() beforehand.

> However, I'd be very much interested to hear on how a non-removable 
> device can change ...

Me too. The specs at least don't seem to mandate that only removable
devices can return MEDIUM NOT PRESENT. But even then, it makes only
little sense to actively test such devices for a media change by
sending TURs.

> Sure, there might be non-compatible devices which do _not_ set the 
> removable flag (despite them being removable) and the notorious
> aacraid 
> which insists on claiming all devices being removable, but if the
> need 
> arises we can always introduce a blacklist flag.

I agree. Also, I suppose that if someone has a case for checking
for media change on non-removable devices, (s)he will bring it up.

Therefore:
Acked-by: Martin Wilck 

Regards
Martin

Re: [PATCH] scsi: isci: initialize shost fully before calling scsi_add_host()

2019-01-16 Thread John Garry


On 16/01/2019 02:54, Martin K. Petersen wrote:


Hi John,



Hi Martin,


So in this case I think that accessor functions are actually better
because they allow us to print a big fat warning when you twiddle
something you shouldn't post-initialization. So that's something I think
we could--and should--improve.


Sure, this is an alternative, but I would rather make it obvious when
these parameters should be set so that this would not be required.


I would like to have a mechanism in place that warns if you twiddle
things that break assumptions made at host registration time.


Yes, something more robust would be good.

That's not

a scenario the old registration interface was designed to handle.

I am not sure I agree with your assertion that setting these masks in
the struct prior to scsi_add_host() makes this ordering requirement much
more obvious.

It is not like you are passing in a list of parameters and

then receiving a separately instantiated immutable scsi_host object. You
are performing an operation on something you already have and own.

That's why I commented that the current intersection between
compile-time static host template, dynamically discovered
pre-registration scsi_host parameters, and the registered runtime
scsi_host struct is somewhat messy.

Btw. I have no attachment to the prot wrappers whatsoever. The reason
they exist is that the SCSI integrity support was optional. And
therefore we had stub functions so things could be compiled without any
of the integrity fields being present in the various SCSI structs.


I never noticed stubs for setting/getting 
Scsi_host.prot_{capabilities,guard_type}


So I

don't have any problem killing the wrappers except they may actually be
handy for regressions like this one where you could #error if the driver
writer violates the ordering requirement.



We set many Scsi_host parameters without such safeguarding, and I don't 
know what's special about these protection-related members.


Thanks,
John

[PATCH fix] scsi_lib: make sure scsi_request.sense valid

2019-01-16 Thread Douglas Gilbert

The block layer assumes scsi_request:sense is always a valid
pointer. This is set up once in scsi_mq_init_request() and the
containing scsi_cmnd object is used often, being re-initialized
by scsi_init_command(). That works unless some code re-purposes
part of the scsi_cmnd object for something else. And that is
what bidi handling does in scsi_mq_prep_fn(). The result is an
oops at some later time when the partly overwritten object is
re-used. The overwrite is from d285203cf647d but 'git blame'
does not show removed code, so that commit may not be the
culprit.

Signed-off-by: Douglas Gilbert 
---

This was found while injecting errors (thus generating sense data)
into a sequence of bidi commands. At some later time the block
layer blew up with a scsi_request::sense NULL dereference in
sg_rq_end_io(). Without testing I'm confident the bsg driver,
the osd ULD and exofs are exposed to this bug.

 drivers/scsi/scsi_lib.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index b13cc9288ba0..71259bd4040a 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1175,6 +1175,7 @@ void scsi_init_command(struct scsi_device *dev, struct 
scsi_cmnd *cmd)
 
cmd->device = dev;
cmd->sense_buffer = buf;
+   cmd->req.sense = buf;
cmd->prot_sdb = prot;
cmd->flags = flags;
INIT_DELAYED_WORK(&cmd->abort_work, scmd_eh_abort_handler);
-- 
2.17.1

[PATCH for-5.0] scsi: communicate max segment size to the DMA mapping code

2019-01-16 Thread Christoph Hellwig

When a host driver sets a maximum segment size we should not only
propagate that setting to the block layer, which can merge segments,
but also to the DMA mapping layer which can merge segments as well.

Fixes: 50c2e9107f ("scsi: introduce a max_segment_size host_template 
parameters")
Signed-off-by: Christoph Hellwig 
---
 drivers/ata/pata_macio.c |  9 -
 drivers/ata/sata_inic162x.c  | 22 +-
 drivers/firewire/sbp2.c  |  5 +
 drivers/scsi/aacraid/linit.c |  9 -
 drivers/scsi/scsi_lib.c  |  4 ++--
 5 files changed, 20 insertions(+), 29 deletions(-)

diff --git a/drivers/ata/pata_macio.c b/drivers/ata/pata_macio.c
index 8cc9c429ad95..9e7fc302430f 100644
--- a/drivers/ata/pata_macio.c
+++ b/drivers/ata/pata_macio.c
@@ -915,6 +915,10 @@ static struct scsi_host_template pata_macio_sht = {
.sg_tablesize   = MAX_DCMDS,
/* We may not need that strict one */
.dma_boundary   = ATA_DMA_BOUNDARY,
+   /* Not sure what the real max is but we know it's less than 64K, let's
+* use 64K minus 256
+*/
+   .max_segment_size   = MAX_DBDMA_SEG,
.slave_configure= pata_macio_slave_config,
 };
 
@@ -1044,11 +1048,6 @@ static int pata_macio_common_init(struct pata_macio_priv 
*priv,
/* Make sure we have sane initial timings in the cache */
pata_macio_default_timings(priv);
 
-   /* Not sure what the real max is but we know it's less than 64K, let's
-* use 64K minus 256
-*/
-   dma_set_max_seg_size(priv->dev, MAX_DBDMA_SEG);
-
/* Allocate libata host for 1 port */
memset(&pinfo, 0, sizeof(struct ata_port_info));
pmac_macio_calc_timing_masks(priv, &pinfo);
diff --git a/drivers/ata/sata_inic162x.c b/drivers/ata/sata_inic162x.c
index e0bcf9b2dab0..174e84ce4379 100644
--- a/drivers/ata/sata_inic162x.c
+++ b/drivers/ata/sata_inic162x.c
@@ -245,8 +245,15 @@ struct inic_port_priv {
 
 static struct scsi_host_template inic_sht = {
ATA_BASE_SHT(DRV_NAME),
-   .sg_tablesize   = LIBATA_MAX_PRD,   /* maybe it can be larger? */
-   .dma_boundary   = INIC_DMA_BOUNDARY,
+   .sg_tablesize   = LIBATA_MAX_PRD, /* maybe it can be larger? */
+
+   /*
+* This controller is braindamaged.  dma_boundary is 0x like others
+* but it will lock up the whole machine HARD if 65536 byte PRD entry
+* is fed.  Reduce maximum segment size.
+*/
+   .dma_boundary   = INIC_DMA_BOUNDARY,
+   .max_segment_size   = 65536 - 512,
 };
 
 static const int scr_map[] = {
@@ -868,17 +875,6 @@ static int inic_init_one(struct pci_dev *pdev, const 
struct pci_device_id *ent)
return rc;
}
 
-   /*
-* This controller is braindamaged.  dma_boundary is 0x
-* like others but it will lock up the whole machine HARD if
-* 65536 byte PRD entry is fed. Reduce maximum segment size.
-*/
-   rc = dma_set_max_seg_size(&pdev->dev, 65536 - 512);
-   if (rc) {
-   dev_err(&pdev->dev, "failed to set the maximum segment size\n");
-   return rc;
-   }
-
rc = init_controller(hpriv->mmio_base, hpriv->cached_hctl);
if (rc) {
dev_err(&pdev->dev, "failed to initialize controller\n");
diff --git a/drivers/firewire/sbp2.c b/drivers/firewire/sbp2.c
index 09b845e90114..a785ffd5af89 100644
--- a/drivers/firewire/sbp2.c
+++ b/drivers/firewire/sbp2.c
@@ -1144,10 +1144,6 @@ static int sbp2_probe(struct fw_unit *unit, const struct 
ieee1394_device_id *id)
if (device->is_local)
return -ENODEV;
 
-   if (dma_get_max_seg_size(device->card->device) > SBP2_MAX_SEG_SIZE)
-   WARN_ON(dma_set_max_seg_size(device->card->device,
-SBP2_MAX_SEG_SIZE));
-
shost = scsi_host_alloc(&scsi_driver_template, sizeof(*tgt));
if (shost == NULL)
return -ENOMEM;
@@ -1610,6 +1606,7 @@ static struct scsi_host_template scsi_driver_template = {
.eh_abort_handler   = sbp2_scsi_abort,
.this_id= -1,
.sg_tablesize   = SG_ALL,
+   .max_segment_size   = SBP2_MAX_SEG_SIZE,
.can_queue  = 1,
.sdev_attrs = sbp2_scsi_sysfs_attrs,
 };
diff --git a/drivers/scsi/aacraid/linit.c b/drivers/scsi/aacraid/linit.c
index 634ddb90e7aa..7e56a11836c1 100644
--- a/drivers/scsi/aacraid/linit.c
+++ b/drivers/scsi/aacraid/linit.c
@@ -1747,11 +1747,10 @@ static int aac_probe_one(struct pci_dev *pdev, const 
struct pci_device_id *id)
shost->max_sectors = (shost->sg_tablesize * 8) + 112;
}
 
-   error = dma_set_max_seg_size(&pdev->dev,
-   (aac->adapter_info.options & AAC_OPT_NEW_COMM) ?
-   (shost->max_sectors << 9) : 65536);
-   if (error)
-   goto out_deinit;
+   i

Re: [PATCH] sd: skip non-removable devices in sd_check_events()

2019-01-16 Thread Douglas Gilbert


On 2019-01-16 5:58 a.m., Martin Wilck wrote:

On Wed, 2019-01-16 at 11:32 +0100, Hannes Reinecke wrote:

On 1/16/19 11:26 AM, Martin Wilck wrote:

On Wed, 2019-01-16 at 08:35 +0100, Hannes Reinecke wrote:

If the device is _not_ removable we should not start the event
poller as the media will not go away. Having the event poller
running
will block the open() call as it will try to flush outstanding
events,
which it can't if the device is in state 'BLOCKED'. So the open()
call
will be stalled until the device state changed, which might be
quite
some time depending on the transport.

Signed-off-by: Hannes Reinecke 
---
   drivers/scsi/sd.c | 3 +++
   1 file changed, 3 insertions(+)


Thanks - you lost patience with me, did you? :-)

I didn't submit this yet because Tejun (adding him to CC) had
explicitly removed this check in commit 2bae0093cab4, and I was
still
trying to understand why.

Perhaps a follow-up discussion will clarify that.


To my understanding Tejun removed it as the check got moved into
set_media_not_present() function.


Maybe, but as a matter of fact, after 2bae0093cab4, TUR could be
executed for non-removable disks in sd_check_events(), which was never
done in sd_media_changed() beforehand.


However, I'd be very much interested to hear on how a non-removable
device can change ...


Me too. The specs at least don't seem to mandate that only removable
devices can return MEDIUM NOT PRESENT. But even then, it makes only
little sense to actively test such devices for a media change by
sending TURs.


So to be more precise, do you mean the LU disappearing while its
port(s) and containing target device remain online?

Otherwise a hot unplug meets the bill. Also some curious things can
happen at the transport level. For example I believe T10 have just
accepted a MODE SELECT command that will change a SAS wide port
to narrow port(s) (and vice versa). The device is taken offline for
a programmable period meant to be long enough for the scanning logic
to notice and then brought back in its new configuration.

Doug Gilbert


Sure, there might be non-compatible devices which do _not_ set the
removable flag (despite them being removable) and the notorious
aacraid
which insists on claiming all devices being removable, but if the
need
arises we can always introduce a blacklist flag.


I agree. Also, I suppose that if someone has a case for checking
for media change on non-removable devices, (s)he will bring it up.

Therefore:
Acked-by: Martin Wilck 

Regards
Martin

Re: lk 4.20.0 sd: 'Unaligned partial completion (resid=3584, sector_sz=4096)' forever

2019-01-16 Thread Douglas Gilbert


Ping 

If these types of disks (ones that can fast format between 512 and
4096 bytes) are not out in the field yet, they soon will be. This
problem requires a hard reboot (i.e. hold you finger on the power
button for 5-10 seconds). Sad 

Doug Gilbert


On 2018-12-24 9:19 p.m., Douglas Gilbert wrote:

The kernel first saw this disk as having 4096 logical block size:

   # lsscs -gs
[6:0:2:0]  disk  HGST  HUS726T4TAL4204  C430 /dev/sdd  /dev/sg3 4.00TB

Then about a minute later it had a logical block size of 512 bytes after:

   # sg_format --format --ffmt=1 --size=512 /dev/sg3

[The '--ffmt=1' option means fast format and it obviously doesn't visit
every block.] The sd driver was not amused and when I did:

  # fdisk /dev/sdd

That xterm froze in an uninterruptible state and from another xterm I could
see the log being filled with:

  [ 1004.409881] sd 6:0:2:0: [sdd] Unaligned partial completion (resid=3584, 
sector_sz=4096)


Rebooted but that locked up the machine, power cycle ...

Doug Gilbert

Re: lk 4.20.0 sd: 'Unaligned partial completion (resid=3584, sector_sz=4096)' forever

2019-01-16 Thread Martin K. Petersen



Doug,

> If these types of disks (ones that can fast format between 512 and
> 4096 bytes) are not out in the field yet, they soon will be. This
> problem requires a hard reboot (i.e. hold you finger on the power
> button for 5-10 seconds). Sad 

Do we get a notification from the drive that inquiry data has changed
once the format is complete?

-- 
Martin K. Petersen  Oracle Linux Engineering

Re: lk 4.20.0 sd: 'Unaligned partial completion (resid=3584, sector_sz=4096)' forever

2019-01-16 Thread Douglas Gilbert


On 2019-01-16 11:44 a.m., Martin K. Petersen wrote:


Doug,


If these types of disks (ones that can fast format between 512 and
4096 bytes) are not out in the field yet, they soon will be. This
problem requires a hard reboot (i.e. hold you finger on the power
button for 5-10 seconds). Sad 


Do we get a notification from the drive that inquiry data has changed
once the format is complete?


I need to reconfigure my box to use FreeBSD with those disks. Then I
should be able to answer that question.


BTW I'm told by the manufacturer of said disk, that the last firmware
update was to production firmware.

Re: [PATCH] sd: skip non-removable devices in sd_check_events()

2019-01-16 Thread Bart Van Assche

On Wed, 2019-01-16 at 08:35 +0100, Hannes Reinecke wrote:
> If the device is _not_ removable we should not start the event
> poller as the media will not go away. Having the event poller running
> will block the open() call as it will try to flush outstanding events,
> which it can't if the device is in state 'BLOCKED'. So the open() call
> will be stalled until the device state changed, which might be quite
> some time depending on the transport.
> 
> Signed-off-by: Hannes Reinecke 
> ---
>  drivers/scsi/sd.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
> index a1a44f52e0e8..521f0a384446 100644
> --- a/drivers/scsi/sd.c
> +++ b/drivers/scsi/sd.c
> @@ -1539,6 +1539,9 @@ static unsigned int sd_check_events(struct gendisk 
> *disk, unsigned int clearing)
>   return 0;
>  
>   sdp = sdkp->device;
> + if (!sdp->removable)
> + return 0;
> +
>   SCSI_LOG_HLQUEUE(3, sd_printk(KERN_INFO, sdkp, "sd_check_events\n"));
>  
>   /*

Hi Hannes,

Although this patch looks fine to me, wouldn't it be a better approach to
cause the events checker not to submit any SCSI commands to blocked devices?
That approach should improve the sd behavior for both removable and
non-removable devices.

Thanks,

Bart.

Re: lk 4.20.0 sd: 'Unaligned partial completion (resid=3584, sector_sz=4096)' forever

2019-01-16 Thread Douglas Gilbert


On 2019-01-16 11:55 a.m., Douglas Gilbert wrote:

On 2019-01-16 11:44 a.m., Martin K. Petersen wrote:


Doug,


If these types of disks (ones that can fast format between 512 and
4096 bytes) are not out in the field yet, they soon will be. This
problem requires a hard reboot (i.e. hold you finger on the power
button for 5-10 seconds). Sad 


Do we get a notification from the drive that inquiry data has changed
once the format is complete?


I need to reconfigure my box to use FreeBSD with those disks. Then I
should be able to answer that question.


BTW I'm told by the manufacturer of said disk, that the last firmware
update was to production firmware.


No sign of a Unit Attention on FreeBSD (12.0 Release) from either a
512->4096 or 4096->512 fast format. And fdisk (which is different on
FreeBSD) worked when the block size was 512 byte but said it could
find the block size when disk's LB size was 4096.

sbc4r16 chapter 4.10:
"Any time the READ CAPACITY (10) parameter data (see 5.19.2) or the
READ CAPACITY (16) parameter data (see 5.20.2) changes (e.g., when a
FORMAT UNIT command or a MODE SELECT command causes a change to the
logical block length or protection information, or when a vendor specific 
mechanism causes a change), then the device server shall establish a

unit attention condition for the SCSI initiator port (see SAM-5)
associated with each I_T nexus, except the I_T nexus on which the command 
causing the change was received with the additional sense code set to

CAPACITY DATA HAS CHANGED."

So seeing I'm using the I_T nexus that sent the FORMAT UNIT, then no Unit
Attention is expected, as per above.

So onto INQUIRY VPD pages: what has changed that will cause a UA?

And surely the sd driver or mid-level should "smell a rat" after the first:
   Unaligned partial completion
error message. The disk was unmounted so it should not be a big deal to
run initialisation again on it. If it was mounted then that would be an
interesting situation, since the user visible data (say just changed to
4096 byte LBs) would be the concatenation of 8 of the previous 512 byte
LBs. However there is nothing in SBC-4 that says that will be the case,
and I was told by the firmware people, just assume that was an "accident".
PI might even survive across LB size changes.

Doug Gilbert

Re: [PATCH] sd: skip non-removable devices in sd_check_events()

2019-01-16 Thread Martin Wilck

On Wed, 2019-01-16 at 11:32 -0500, Douglas Gilbert wrote:
> On 2019-01-16 5:58 a.m., Martin Wilck wrote:
> > On Wed, 2019-01-16 at 11:32 +0100, Hannes Reinecke wrote:
> > > On 1/16/19 11:26 AM, Martin Wilck wrote:
> > > > On Wed, 2019-01-16 at 08:35 +0100, Hannes Reinecke wrote:
> > > > > If the device is _not_ removable we should not start the
> > > > > event
> > > > > poller as the media will not go away. Having the event poller
> > > > > running
> > > > > will block the open() call as it will try to flush
> > > > > outstanding
> > > > > events,
> > > > > which it can't if the device is in state 'BLOCKED'. So the
> > > > > open()
> > > > > call
> > > > > will be stalled until the device state changed, which might
> > > > > be
> > > > > quite
> > > > > some time depending on the transport.
> > > > > 
> > > > > Signed-off-by: Hannes Reinecke 
> > > > > ---
> > > > >drivers/scsi/sd.c | 3 +++
> > > > >1 file changed, 3 insertions(+)
> > > > 
> > > > Thanks - you lost patience with me, did you? :-)
> > > > 
> > > > I didn't submit this yet because Tejun (adding him to CC) had
> > > > explicitly removed this check in commit 2bae0093cab4, and I was
> > > > still
> > > > trying to understand why.
> > > > 
> > > > Perhaps a follow-up discussion will clarify that.
> > > > 
> > > To my understanding Tejun removed it as the check got moved into
> > > set_media_not_present() function.
> > 
> > Maybe, but as a matter of fact, after 2bae0093cab4, TUR could be
> > executed for non-removable disks in sd_check_events(), which was
> > never
> > done in sd_media_changed() beforehand.
> > 
> > > However, I'd be very much interested to hear on how a non-
> > > removable
> > > device can change ...
> > 
> > Me too. The specs at least don't seem to mandate that only
> > removable
> > devices can return MEDIUM NOT PRESENT. But even then, it makes only
> > little sense to actively test such devices for a media change by
> > sending TURs.
> 
> So to be more precise, do you mean the LU disappearing while its
> port(s) and containing target device remain online?
> 
> Otherwise a hot unplug meets the bill. Also some curious things can
> happen at the transport level. For example I believe T10 have just
> accepted a MODE SELECT command that will change a SAS wide port
> to narrow port(s) (and vice versa). The device is taken offline for
> a programmable period meant to be long enough for the scanning logic
> to notice and then brought back in its new configuration.

Are these really cases where the sense code would be UNIT ATTENTION /
MEDIUM NOT PRESENT or NOT READY / MEDIUM NOT PRESENT? That's all that
matters here. Other sorts of device removal would be handled much
differently.

Cheers,
Martin

Re: ufshcd_queuecommand() triggering after ufshcd_suspend()?

2019-01-16 Thread John Stultz

On Sun, Jan 13, 2019 at 7:25 PM Zang Leigang  wrote:
>   I think there are two different issues:
>
>   1. clk_gating's state(including state's trace event) and is_suspended is not
>  wrapped by ufshcd_is_clkgating_allowed which Hisilicon's kirin platoform
>  soc does not need but is set and checked in a regular path.
>   2. I think SCSI is necessary to block queue to stop sending to ufs after
>  system suspend or, add a new state for ufs like UFSHCD_STATE_SUSPENDING
>  or what else. hba->is_sys_suspended is too late to stop 
> ufshcd_queuecommand
>

Hey Zang,
  Thanks for the suggestions. I'm not sure I have enough context to
try to come up with a patch. I'll try to take a stab at it, but I'd
appreciate it if  you have any more specific suggestions or patches
for me to try. :)

thanks
-john

Re: [PATCH v1 0/8] phy: qcom-ufs: Enable regulators to be off in suspend

2019-01-16 Thread Stephen Boyd

Quoting Evan Green (2019-01-11 15:01:21)
> 
> Because the UFS PHY reset bit is now toggled in the PHY, rather
> than in ufs-qcom, this also percolated to all other PHYs using
> ufs-qcom, which from what I can see is just 8996.
> 
> There are a couple of tradeoffs in this series that I'd welcome feedback
> on. First, it breaks compatibility with device trees that don't expose
> this new reset controller. Making this work with older device trees
> would be pretty ugly in the code, and given that the SDM845 UFS DT nodes
> aren't accepted upstream yet, the breakage seemed worth it. I'm not as
> sure about 8996.

It may take some minor surgery to the reset framework, but it should be
possible to register a reset lookup that can be used if nothing matches
in DT. Right now the reset code looks to only want to use DT if it's
there for the device requesting the reset. If there isn't a DT node for
the device it will look in the global lookup list. That could be changed
to fallback to the global lookup that uses device names (similar to clk
framework) when the DT lookup fails. Then it's just a matter of
registering the lookup for this reset with the handful of device names
that need the non-DT way of finding the reset.

Of course, that's another change and if breaking DT is simpler and
acceptable then I would say go for the path of least resistance and
don't try to modify reset framework for this.

[PATCH v2] scsi: be2iscsi: fix potential NULL pointer dereference

2019-01-16 Thread Dmitry Voytik

Fix potential NULL pointer dereference wich might happen in
beiscsi_alloc_mem(). If kmalloc_array() fails and mem_descr->mem_array
is set to NULL, then its dereferencing happens when passing
mem_descr->mem_array[] to dma_free_coherent().

Signed-off-by: Dmitry Voytik 
---

Changes since v1:
 - prevent double free by setting mem_arr_orig to NULL

 drivers/scsi/be2iscsi/be_main.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/be2iscsi/be_main.c b/drivers/scsi/be2iscsi/be_main.c
index 74e260027c7d..7b8c80318c38 100644
--- a/drivers/scsi/be2iscsi/be_main.c
+++ b/drivers/scsi/be2iscsi/be_main.c
@@ -2527,8 +2527,11 @@ static int beiscsi_alloc_mem(struct beiscsi_hba *phba)
mem_descr->size_in_bytes = phba->mem_req[i];
mem_descr->mem_array = kmalloc_array(j, sizeof(*mem_arr),
 GFP_KERNEL);
-   if (!mem_descr->mem_array)
+   if (!mem_descr->mem_array) {
+   mem_descr->mem_array = mem_arr_orig;
+   mem_arr_orig = NULL;
goto free_mem;
+   }
 
memcpy(mem_descr->mem_array, mem_arr_orig,
   sizeof(struct mem_array) * j);
-- 
2.20.1

Re: [PATCH] sd: skip non-removable devices in sd_check_events()

2019-01-16 Thread Martin Wilck

On Wed, 2019-01-16 at 08:35 +0100, Hannes Reinecke wrote:
> If the device is _not_ removable we should not start the event
> poller as the media will not go away. Having the event poller running
> will block the open() call as it will try to flush outstanding
> events,
> which it can't if the device is in state 'BLOCKED'. So the open()
> call
> will be stalled until the device state changed, which might be quite
> some time depending on the transport.
> 
> Signed-off-by: Hannes Reinecke 

I've pondered this some more. Meanwhile I think we should rather fix
the generic event handling code in genhd.c. The problem is that this
code schedules the check_events() function in some situations (notably,
on close(2) via blkdev_put()), even for devices which don't report any
supported events in the (struct gendisk)->events field (the sd driver
correctly sets the events field depending on the "removable" attribute,
but the that only influences the polling behavior, not the call upon
close()).

It looks weird to me that devices that don't report any supported
events would be checked for such events. The reason for this is 
7c88a168da80 and follow-ups (7eec77a1, 9fd097b1). For the sake of not
confusing userland, these commits sacrificed the information in the
generic block layer whether or not a given device actually supports
media change events.

I'm working on a patch to fix that.

Regards,
Martin

Re: [PATCH fix] scsi_lib: make sure scsi_request.sense valid

2019-01-16 Thread Bart Van Assche

On Wed, 2019-01-16 at 10:57 -0500, Douglas Gilbert wrote:
> The block layer assumes scsi_request:sense is always a valid
> pointer. This is set up once in scsi_mq_init_request() and the
> containing scsi_cmnd object is used often, being re-initialized
> by scsi_init_command(). That works unless some code re-purposes
> part of the scsi_cmnd object for something else. And that is
> what bidi handling does in scsi_mq_prep_fn(). The result is an
> oops at some later time when the partly overwritten object is
> re-used. The overwrite is from d285203cf647d but 'git blame'
> does not show removed code, so that commit may not be the
> culprit.
> 
> Signed-off-by: Douglas Gilbert 
> ---
> 
> This was found while injecting errors (thus generating sense data)
> into a sequence of bidi commands. At some later time the block
> layer blew up with a scsi_request::sense NULL dereference in
> sg_rq_end_io(). Without testing I'm confident the bsg driver,
> the osd ULD and exofs are exposed to this bug.
> 
>  drivers/scsi/scsi_lib.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> index b13cc9288ba0..71259bd4040a 100644
> --- a/drivers/scsi/scsi_lib.c
> +++ b/drivers/scsi/scsi_lib.c
> @@ -1175,6 +1175,7 @@ void scsi_init_command(struct scsi_device *dev, struct 
> scsi_cmnd *cmd)
>  
>   cmd->device = dev;
>   cmd->sense_buffer = buf;
> + cmd->req.sense = buf;
>   cmd->prot_sdb = prot;
>   cmd->flags = flags;
>   INIT_DELAYED_WORK(&cmd->abort_work, scmd_eh_abort_handler);

Hi Doug,

The description of this patch does not look correct to me. scsi_init_command()
does not overwrite the sense pointer. From the body of that function:

/* zero out the cmd, except for the embedded scsi_request */
memset((char *)cmd + sizeof(cmd->req), 0,
sizeof(*cmd) - sizeof(cmd->req) + dev->host->hostt->cmd_size);

It is not clear to me which code overwrites the sense pointer. I think that
needs to be figured out before discussion of this patch can continue.

Thanks,

Bart.

Re: [PATCH fix] scsi_lib: make sure scsi_request.sense valid

2019-01-16 Thread Douglas Gilbert


On 2019-01-16 6:56 p.m., Bart Van Assche wrote:

On Wed, 2019-01-16 at 10:57 -0500, Douglas Gilbert wrote:

The block layer assumes scsi_request:sense is always a valid
pointer. This is set up once in scsi_mq_init_request() and the
containing scsi_cmnd object is used often, being re-initialized
by scsi_init_command(). That works unless some code re-purposes
part of the scsi_cmnd object for something else. And that is
what bidi handling does in scsi_mq_prep_fn(). The result is an
oops at some later time when the partly overwritten object is
re-used. The overwrite is from d285203cf647d but 'git blame'
does not show removed code, so that commit may not be the
culprit.

Signed-off-by: Douglas Gilbert 
---

This was found while injecting errors (thus generating sense data)
into a sequence of bidi commands. At some later time the block
layer blew up with a scsi_request::sense NULL dereference in
sg_rq_end_io(). Without testing I'm confident the bsg driver,
the osd ULD and exofs are exposed to this bug.

  drivers/scsi/scsi_lib.c | 1 +
  1 file changed, 1 insertion(+)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index b13cc9288ba0..71259bd4040a 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1175,6 +1175,7 @@ void scsi_init_command(struct scsi_device *dev, struct 
scsi_cmnd *cmd)
  
  	cmd->device = dev;

cmd->sense_buffer = buf;
+   cmd->req.sense = buf;
cmd->prot_sdb = prot;
cmd->flags = flags;
INIT_DELAYED_WORK(&cmd->abort_work, scmd_eh_abort_handler);


Hi Doug,

The description of this patch does not look correct to me. scsi_init_command()
does not overwrite the sense pointer. From the body of that function:

/* zero out the cmd, except for the embedded scsi_request */
memset((char *)cmd + sizeof(cmd->req), 0,
sizeof(*cmd) - sizeof(cmd->req) + dev->host->hostt->cmd_size);

It is not clear to me which code overwrites the sense pointer. I think that
needs to be figured out before discussion of this patch can continue.


Bart,
Please re-read the explanation. The two sentences in the middle should tell
you that you are looking at the wrong memset().

And I'm waiting for the person who wrote the questionable code to comment.


If you don't believe me, try sending a device reset to a scsi_debug device.
Then use sg_raw *** to send a XDWRITREAD(10) bidi command to the same scsi_debug
device, you should get a Unit Attention concerning that device reset. If
you do, keep sending that bidi command. Make sure you have no important
files open because that machine will lock solid. Basically what happens
when a rq_end_io() callback de-references a NULL pointer. It looks like
it has been there since 2014 and took me 2 days to find. Sorry that I can't
explain it in one simple sentence.

Douglas


*** Use 'man sg_raw' and see the EXAMPLES section (second to last example)
You can use a bsg device as shown.

Re: [ANNOUNCE] v4 sg driver: ready for testing

2019-01-16 Thread Douglas Gilbert


There is an update to the SCSI Generic (sg) v4 driver adding synchronous
and asynchronous bidi command support. Plus lots of fixes and some minor
improvements. See:
http://sg.danny.cz/sg/sg_v40.html

The kernel code is split in two in the tarball below, one targeting
lk 5.0 and the other targeting lk 4.20 and earlier ***. Each section
contains the 3 files that represent the sg v4 driver plus a meandering
17 part patchset. Those patchsets reflect the driver's rewrite rather
than a logical progression.
http://sg.danny.cz/sg/p/sgv4_20190116.tgz

Plus there are updated testing utilities in sg3_utils-1.45 (beta,
revision 807) at the top of this page:
   http://sg.danny.cz/sg/index.html

Doug Gilbert


*** the reason for the split is the tree wide change to the access_ok()
function.


On 2018-12-25 2:39 a.m., Douglas Gilbert wrote:

There is an update to the sg v4 driver with some error fixes, SIGIO and
RT signals work plus single READ, multiple WRITE sharing support. See:
     http://sg.danny.cz/sg/sg_v40.html

with testing utilities in sg3_utils-1.45 (beta, revision 802) on the main
page:
     http://sg.danny.cz/sg/index.html

Doug Gilbert


On 2018-12-18 6:41 p.m., Douglas Gilbert wrote:

After an underwhelming response to my intermediate level patchsets to
modernize the sg driver in October this year (see "[PATCH 0/8] sg: major
cleanup, remove max_queue limit" followed by v2 and v3 between 20181019
and 20181028), I decided to move ahead and add the functionality proposed
for the version 4 sg driver. That means accepting interface objects of
type 'struct sg_io_v4' (as found in include/uapi/linux/bsg) plus two new
ioctls: SG_IOSUBMIT and SG_IORECEIVE as proposed by Linus Torvalds to
replace the unloved write(2)/read(2) asynchronous interface . There
is a new feature called "sharing" explained in the web page (see below).

Yes, there is a patchset available (14 part and growing) but even without
explanatory comments at the top of each patch, that patchset is 4 times
larger than the v4 sg driver (i.e. the finished product) and over 6
times larger than the original v3 sg driver! Part of the reason for
the patchset size is the multiple backtracks and rewrites associated
with a real development process. The cleanest patchset would have 3
parts:
   1) split the current include/scsi/sg.h into the end product headers:
  include/uapi/scsi/sg.h and include/scsi/sg.h
   2) delete drivers/scsi/sg.c
   3) add the v4 drivers/scsi/sg.c

After part 2) you could build a kernel and I can guarantee that no-one
will be able to find any sg driver bugs but some users might get upset
(but not the Linux security folks).

So there is a working v4 sg driver discussed here, with a download:
 http://sg.danny.cz/sg/sg_v40.html

I will keep that page up to date while the driver is in this phase.
There is a sg3_utils beta of 1.45 (revision 799) package in the News
section at the top of the main page:
 http://sg.danny.cz/sg/index.html

That sg3_utils beta package will use the v4 sg interface via sg devices
if the v4 driver is detected. There are also three test utilities in
the 'testing' directory designed to exercise the v4 extensions.

The degree of backward compatibility with the v3 driver should be high
but there are limits to backward compatibility. As an example, it is
possible that there are user apps that depend on hitting the 16
outstanding command limit (per fd) in the v3 driver and go "wild"
when v4 removes that ceiling. If so, a "high_v3_compat" driver option
could be added to put that ceiling back.

The only way to find out is for folks to try and if there is a failure,
contact me, or send mail to this list. Code reviews welcome as well.

Doug Gilbert


 I felt this was a better use of my time than trying to invent a new
  debug/trace mechanism for the whole SCSI subsystem. That is what
  _SCSI_ system maintainers are for, I'll stick to the sg driver (and
  scsi_debug). Add user space tools and there is more than enough work
  there ...

Re: [PATCH fix] scsi_lib: make sure scsi_request.sense valid

2019-01-16 Thread Bart Van Assche

On Wed, 2019-01-16 at 19:54 -0500, Douglas Gilbert wrote:
> On 2019-01-16 6:56 p.m., Bart Van Assche wrote:
> > On Wed, 2019-01-16 at 10:57 -0500, Douglas Gilbert wrote:
> > > The block layer assumes scsi_request:sense is always a valid
> > > pointer. This is set up once in scsi_mq_init_request() and the
> > > containing scsi_cmnd object is used often, being re-initialized
> > > by scsi_init_command(). That works unless some code re-purposes
> > > part of the scsi_cmnd object for something else. And that is
> > > what bidi handling does in scsi_mq_prep_fn(). The result is an
> > > oops at some later time when the partly overwritten object is
> > > re-used. The overwrite is from d285203cf647d but 'git blame'
> > > does not show removed code, so that commit may not be the
> > > culprit.
> > > 
> > > Signed-off-by: Douglas Gilbert 
> > > ---
> > > 
> > > This was found while injecting errors (thus generating sense data)
> > > into a sequence of bidi commands. At some later time the block
> > > layer blew up with a scsi_request::sense NULL dereference in
> > > sg_rq_end_io(). Without testing I'm confident the bsg driver,
> > > the osd ULD and exofs are exposed to this bug.
> > > 
> > >   drivers/scsi/scsi_lib.c | 1 +
> > >   1 file changed, 1 insertion(+)
> > > 
> > > diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> > > index b13cc9288ba0..71259bd4040a 100644
> > > --- a/drivers/scsi/scsi_lib.c
> > > +++ b/drivers/scsi/scsi_lib.c
> > > @@ -1175,6 +1175,7 @@ void scsi_init_command(struct scsi_device *dev, 
> > > struct scsi_cmnd *cmd)
> > >   
> > >   cmd->device = dev;
> > >   cmd->sense_buffer = buf;
> > > + cmd->req.sense = buf;
> > >   cmd->prot_sdb = prot;
> > >   cmd->flags = flags;
> > >   INIT_DELAYED_WORK(&cmd->abort_work, scmd_eh_abort_handler);
> > 
> > Hi Doug,
> > 
> > The description of this patch does not look correct to me. 
> > scsi_init_command()
> > does not overwrite the sense pointer. From the body of that function:
> > 
> > /* zero out the cmd, except for the embedded scsi_request */
> > memset((char *)cmd + sizeof(cmd->req), 0,
> > sizeof(*cmd) - sizeof(cmd->req) + dev->host->hostt->cmd_size);
> > 
> > It is not clear to me which code overwrites the sense pointer. I think that
> > needs to be figured out before discussion of this patch can continue.
> 
> Bart,
> Please re-read the explanation. The two sentences in the middle should tell
> you that you are looking at the wrong memset().
> 
> And I'm waiting for the person who wrote the questionable code to comment.
> 
> 
> If you don't believe me, try sending a device reset to a scsi_debug device.
> Then use sg_raw *** to send a XDWRITREAD(10) bidi command to the same 
> scsi_debug
> device, you should get a Unit Attention concerning that device reset. If
> you do, keep sending that bidi command. Make sure you have no important
> files open because that machine will lock solid. Basically what happens
> when a rq_end_io() callback de-references a NULL pointer. It looks like
> it has been there since 2014 and took me 2 days to find. Sorry that I can't
> explain it in one simple sentence.

Hi Doug,

Is this perhaps the memset you are referring to?

memset(bidi_sdb, 0, sizeof(struct scsi_data_buffer));

Bart.

Re: lk 4.20.0 sd: 'Unaligned partial completion (resid=3584, sector_sz=4096)' forever

2019-01-16 Thread Martin K. Petersen



Doug,

> except the I_T nexus on which the command causing the change was
> received with the additional sense code set to CAPACITY DATA HAS
> CHANGED."
>
> So seeing I'm using the I_T nexus that sent the FORMAT UNIT, then no
> Unit Attention is expected, as per above.

But that assumes that there is a single entity sending commands. Which
is not the case here given that sg, unbeknownst to sd, is sending a
FORMAT UNIT.

We are not intercepting commands sent via sg to see if they may affect
the ULDs. So we need some other way of finding out that the device has
been reformatted behind our backs.

> And surely the sd driver or mid-level should "smell a rat" after the
> first: Unaligned partial completion error message. The disk was
> unmounted so it should not be a big deal to run initialisation again
> on it.

While I agree that this particular error should be a red flag, we'll
only see it if the new format deviates significantly from the old
one. So smelling rats seems like papering over a limited subset of the
possible issues resulting from a FORMAT UNIT.

I wonder if it wouldn't be better to teach sg_format to revalidate a
device once FORMAT UNIT completes?

> If it was mounted then that would be an interesting situation, since
> the user visible data (say just changed to 4096 byte LBs) would be the
> concatenation of 8 of the previous 512 byte LBs.

Maybe sg_format should check whether a block device is in use before it
proceeds to send FORMAT UNIT?

-- 
Martin K. Petersen  Oracle Linux Engineering

Re: [PATCH v1 5/8] scsi: ufs: qcom: Expose the reset controller for PHY

2019-01-16 Thread Martin K. Petersen



Kishon,

> On 12/01/19 4:31 AM, Evan Green wrote:
>> Expose a reset controller that the phy can use to perform its
>> initialization in a single callback.
>> 
>> Also, change the use of the phy functions from ufs-qcom such that
>> phy_poweron actually fires up the phy, and phy_poweroff actually
>> powers it down.
>> 
>> Signed-off-by: Evan Green 
>
> Can I get Ack for this patch from SCSI MAINTAINERS?

No objection from me if there is general consensus that moving reset to
the phy is the right thing to do.

-- 
Martin K. Petersen  Oracle Linux Engineering

Re: [PATCH v2 0/7] sd patches for kernel v5.1

2019-01-16 Thread Martin K. Petersen



Bart,

> If any changes have been left out that is the result of an
> oversight. Can you clarify which changes you think are missing?

It appears you got everything. Things were just sliced and diced
slightly differently.

I'll run some tests and apply unless something breaks.

Thanks again for dusting these off!

-- 
Martin K. Petersen  Oracle Linux Engineering

Fix suspend/resume of ACB_ADAPTER_TYPE_B part 2

2019-01-16 Thread Ching Huang

This patch series are against to mkp's 5.1/scsi-queue.

1. Due to dma_zalloc_coherent will be phase out, so use dma_alloc_coherent to 
replace it.
2. Fix suspend/resume of ACB_ADAPTER_TYPE_B part 2.
3. Update driver version to v1.40.00.10-20190116
---

[PATCH 1/3] scsi: arcmsr: Use dma_alloc_coherent to replace dma_zalloc_coherent

2019-01-16 Thread Ching Huang

>From Ching Huang 

Due to dma_zalloc_coherent will be phase out, so use dma_alloc_coherent to 
replace
 it.

Signed-off-by: Ching Huang 
---

diff --git a/drivers/scsi/arcmsr/arcmsr_hba.c b/drivers/scsi/arcmsr/arcmsr_hba.c
index 9f85d5a..5736434 100755
--- a/drivers/scsi/arcmsr/arcmsr_hba.c
+++ b/drivers/scsi/arcmsr/arcmsr_hba.c
@@ -642,7 +642,7 @@ static bool arcmsr_alloc_io_queue(struct 
AdapterControlBlock *acb)
switch (acb->adapter_type) {
case ACB_ADAPTER_TYPE_B: {
acb->ioqueue_size = roundup(sizeof(struct MessageUnit_B), 32);
-   dma_coherent = dma_zalloc_coherent(&pdev->dev, 
acb->ioqueue_size,
+   dma_coherent = dma_alloc_coherent(&pdev->dev, acb->ioqueue_size,
&dma_coherent_handle, GFP_KERNEL);
if (!dma_coherent) {
pr_notice("arcmsr%d: DMA allocation failed\n", 
acb->host->host_no);
@@ -656,7 +656,7 @@ static bool arcmsr_alloc_io_queue(struct 
AdapterControlBlock *acb)
break;
case ACB_ADAPTER_TYPE_D: {
acb->ioqueue_size = roundup(sizeof(struct MessageUnit_D), 32);
-   dma_coherent = dma_zalloc_coherent(&pdev->dev, 
acb->ioqueue_size,
+   dma_coherent = dma_alloc_coherent(&pdev->dev, acb->ioqueue_size,
&dma_coherent_handle, GFP_KERNEL);
if (!dma_coherent) {
pr_notice("arcmsr%d: DMA allocation failed\n", 
acb->host->host_no);
@@ -672,7 +672,7 @@ static bool arcmsr_alloc_io_queue(struct 
AdapterControlBlock *acb)
uint32_t completeQ_size;
completeQ_size = sizeof(struct deliver_completeQ) * 
ARCMSR_MAX_HBE_DONEQUEUE + 128;
acb->ioqueue_size = roundup(completeQ_size, 32);
-   dma_coherent = dma_zalloc_coherent(&pdev->dev, 
acb->ioqueue_size,
+   dma_coherent = dma_alloc_coherent(&pdev->dev, acb->ioqueue_size,
&dma_coherent_handle, GFP_KERNEL);
if (!dma_coherent){
pr_notice("arcmsr%d: DMA allocation failed\n", 
acb->host->host_no);

[PATCH 2/3] scsi: arcmsr: Fix suspend/resume of ACB_ADAPTER_TYPE_B part 2

2019-01-16 Thread Ching Huang

>From Ching Huang 

Fix suspend/resume of ACB_ADAPTER_TYPE_B part 2.

Signed-off-by: Ching Huang 
---

diff --git a/drivers/scsi/arcmsr/arcmsr.h b/drivers/scsi/arcmsr/arcmsr.h
index a94c513..b98c632 100755
--- a/drivers/scsi/arcmsr/arcmsr.h
+++ b/drivers/scsi/arcmsr/arcmsr.h
@@ -508,9 +508,9 @@ struct MessageUnit_A
 struct MessageUnit_B
 {
uint32_tpost_qbuffer[ARCMSR_MAX_HBB_POSTQUEUE];
-   uint32_tdone_qbuffer[ARCMSR_MAX_HBB_POSTQUEUE];
+   volatile uint32_t   done_qbuffer[ARCMSR_MAX_HBB_POSTQUEUE];
uint32_tpostq_index;
-   uint32_tdoneq_index;
+   volatile uint32_t   doneq_index;
uint32_t__iomem *drv2iop_doorbell;
uint32_t__iomem *drv2iop_doorbell_mask;
uint32_t__iomem *iop2drv_doorbell;
diff --git a/drivers/scsi/arcmsr/arcmsr_hba.c b/drivers/scsi/arcmsr/arcmsr_hba.c
index 5736434..88053b1 100755
--- a/drivers/scsi/arcmsr/arcmsr_hba.c
+++ b/drivers/scsi/arcmsr/arcmsr_hba.c
@@ -1113,7 +1113,11 @@ static int arcmsr_resume(struct pci_dev *pdev)
switch (acb->adapter_type) {
case ACB_ADAPTER_TYPE_B: {
struct MessageUnit_B *reg = acb->pmuB;
-   reg->post_qbuffer[0] = 0;
+   uint32_t i;
+   for (i = 0; i < ARCMSR_MAX_HBB_POSTQUEUE; i++) {
+   reg->post_qbuffer[i] = 0;
+   reg->done_qbuffer[i] = 0;
+   }
reg->postq_index = 0;
reg->doneq_index = 0;
break;

[PATCH 3/3] scsi: arcmsr: Update driver version to v1.40.00.10-20190116

2019-01-16 Thread Ching Huang

>From Ching Huang 

Update driver version to v1.40.00.10-20190116.

Signed-off-by: Ching Huang 
---

diff --git a/drivers/scsi/arcmsr/arcmsr.h b/drivers/scsi/arcmsr/arcmsr.h
index b98c632..6033bcc 100755
--- a/drivers/scsi/arcmsr/arcmsr.h
+++ b/drivers/scsi/arcmsr/arcmsr.h
@@ -49,7 +49,7 @@ struct device_attribute;
 #define ARCMSR_MAX_OUTSTANDING_CMD 1024
 #define ARCMSR_DEFAULT_OUTSTANDING_CMD 128
 #define ARCMSR_MIN_OUTSTANDING_CMD 32
-#define ARCMSR_DRIVER_VERSION  "v1.40.00.10-20181217"
+#define ARCMSR_DRIVER_VERSION  "v1.40.00.10-20190116"
 #define ARCMSR_SCSI_INITIATOR_ID   255
 #define ARCMSR_MAX_XFER_SECTORS512
 #define ARCMSR_MAX_XFER_SECTORS_B  4096

[PATCH 0/3] scsi: arcmsr: Fix suspend/resume of ACB_ADAPTER_TYPE_B part 2

2019-01-16 Thread Ching Huang

This patch series are against to mkp's 5.1/scsi-queue.

1. Due to dma_zalloc_coherent will be phase out, so use dma_alloc_coherent to 
replace
 it.
2. Fix suspend/resume of ACB_ADAPTER_TYPE_B part 2.
3. Update driver version to v1.40.00.10-20190116
---

RE: [PATCH] scsi: ufs: revamp string descriptor reading

2019-01-16 Thread Winkler, Tomas

> > Hello Tomas,
> >
> > >
> > > Define new a type: uc_string_id for easier string handling and less
> > > casting. Reduce number or string copies in price of a dynamic
> > > allocation.
> > >
> > > Signed-off-by: Tomas Winkler 
> > Tested-by: Avri Altman 
> >
> > Just one nit - doesn't really matters.
> >
> > Cheers,
> > Avri
> >
> > > ---
> > >  drivers/scsi/ufs/ufs-sysfs.c |  20 ++---
> > >  drivers/scsi/ufs/ufs.h   |   2 +-
> > >  drivers/scsi/ufs/ufshcd.c| 164 +--
> > >  drivers/scsi/ufs/ufshcd.h|   9 +-
> > >  4 files changed, 115 insertions(+), 80 deletions(-)
> >
> >
> > >
> > >   ufs_fixup_device_setup(hba, &card);
> > > + ufs_put_device_desc(&card);
> > ufs_get_device_desc() and ufs_put_device_desc() actually serves the
> > quirks setup.
> > Make sense to call it from within so the logic is clear and in one place.
> > Might also save ufs_put_device_desc().
> 
> You are right from the current perspective , just I'd need it also for the 
> RPMB
> patches that should follow, then it will have bit larger span than the quirks.
> 

Is this okay, can we merge?
Thanks
Tomas

Re: [PATCH v1 5/8] scsi: ufs: qcom: Expose the reset controller for PHY

Re: [PATCH v1] scsi: ufs: Use explicit access size in ufshcd_dump_regs

RE: [PATCH v1] scsi: ufs: Use explicit access size in ufshcd_dump_regs

Re: [PATCH] sd: skip non-removable devices in sd_check_events()

Re: [PATCH] sd: skip non-removable devices in sd_check_events()

Re: [PATCH] sd: skip non-removable devices in sd_check_events()

Re: [PATCH] scsi: isci: initialize shost fully before calling scsi_add_host()

[PATCH fix] scsi_lib: make sure scsi_request.sense valid

[PATCH for-5.0] scsi: communicate max segment size to the DMA mapping code

Re: [PATCH] sd: skip non-removable devices in sd_check_events()

Re: lk 4.20.0 sd: 'Unaligned partial completion (resid=3584, sector_sz=4096)' forever

Re: lk 4.20.0 sd: 'Unaligned partial completion (resid=3584, sector_sz=4096)' forever

Re: lk 4.20.0 sd: 'Unaligned partial completion (resid=3584, sector_sz=4096)' forever

Re: [PATCH] sd: skip non-removable devices in sd_check_events()

Re: lk 4.20.0 sd: 'Unaligned partial completion (resid=3584, sector_sz=4096)' forever

Re: [PATCH] sd: skip non-removable devices in sd_check_events()

Re: ufshcd_queuecommand() triggering after ufshcd_suspend()?

Re: [PATCH v1 0/8] phy: qcom-ufs: Enable regulators to be off in suspend

[PATCH v2] scsi: be2iscsi: fix potential NULL pointer dereference

Re: [PATCH] sd: skip non-removable devices in sd_check_events()

Re: [PATCH fix] scsi_lib: make sure scsi_request.sense valid

Re: [PATCH fix] scsi_lib: make sure scsi_request.sense valid

Re: [ANNOUNCE] v4 sg driver: ready for testing

Re: [PATCH fix] scsi_lib: make sure scsi_request.sense valid

Re: lk 4.20.0 sd: 'Unaligned partial completion (resid=3584, sector_sz=4096)' forever

Re: [PATCH v1 5/8] scsi: ufs: qcom: Expose the reset controller for PHY

Re: [PATCH v2 0/7] sd patches for kernel v5.1

Fix suspend/resume of ACB_ADAPTER_TYPE_B part 2

[PATCH 1/3] scsi: arcmsr: Use dma_alloc_coherent to replace dma_zalloc_coherent

[PATCH 2/3] scsi: arcmsr: Fix suspend/resume of ACB_ADAPTER_TYPE_B part 2

[PATCH 3/3] scsi: arcmsr: Update driver version to v1.40.00.10-20190116

[PATCH 0/3] scsi: arcmsr: Fix suspend/resume of ACB_ADAPTER_TYPE_B part 2

RE: [PATCH] scsi: ufs: revamp string descriptor reading

33 matches

Site Navigation

Mail list logo

Footer information