Re: [PATCH v1 5/8] scsi: ufs: qcom: Expose the reset controller for PHY
On 12/01/19 4:31 AM, Evan Green wrote: > Expose a reset controller that the phy can use to perform its > initialization in a single callback. > > Also, change the use of the phy functions from ufs-qcom such that > phy_poweron actually fires up the phy, and phy_poweroff actually > powers it down. > > Signed-off-by: Evan Green Can I get Ack for this patch from SCSI MAINTAINERS? Thanks Kishon > > --- > Note: This change depends on the remaining changes in this series, > since UFS PHY reset now needs to be done by the PHY driver. > > drivers/scsi/ufs/Kconfig| 1 + > drivers/scsi/ufs/ufs-qcom.c | 110 +--- > drivers/scsi/ufs/ufs-qcom.h | 4 ++ > 3 files changed, 71 insertions(+), 44 deletions(-) > > diff --git a/drivers/scsi/ufs/Kconfig b/drivers/scsi/ufs/Kconfig > index 2ddbb26d9c265..63c5c4115981f 100644 > --- a/drivers/scsi/ufs/Kconfig > +++ b/drivers/scsi/ufs/Kconfig > @@ -100,6 +100,7 @@ config SCSI_UFS_QCOM > tristate "QCOM specific hooks to UFS controller platform driver" > depends on SCSI_UFSHCD_PLATFORM && ARCH_QCOM > select PHY_QCOM_UFS > + select RESET_CONTROLLER > help > This selects the QCOM specific additions to UFSHCD platform driver. > UFS host on QCOM needs some vendor specific configuration before > diff --git a/drivers/scsi/ufs/ufs-qcom.c b/drivers/scsi/ufs/ufs-qcom.c > index 3aeadb14aae1e..db46f9a64b54c 100644 > --- a/drivers/scsi/ufs/ufs-qcom.c > +++ b/drivers/scsi/ufs/ufs-qcom.c > @@ -16,6 +16,7 @@ > #include > #include > #include > +#include > > #include "ufshcd.h" > #include "ufshcd-pltfrm.h" > @@ -49,6 +50,11 @@ static void ufs_qcom_get_default_testbus_cfg(struct > ufs_qcom_host *host); > static int ufs_qcom_set_dme_vs_core_clk_ctrl_clear_div(struct ufs_hba *hba, > u32 clk_cycles); > > +static struct ufs_qcom_host *rcdev_to_ufs_host(struct reset_controller_dev > *rcd) > +{ > + return container_of(rcd, struct ufs_qcom_host, rcdev); > +} > + > static void ufs_qcom_dump_regs_wrapper(struct ufs_hba *hba, int offset, int > len, > const char *prefix, void *priv) > { > @@ -255,11 +261,6 @@ static int ufs_qcom_power_up_sequence(struct ufs_hba > *hba) > if (is_rate_B) > phy_set_mode(phy, PHY_MODE_UFS_HS_B); > > - /* Assert PHY reset and apply PHY calibration values */ > - ufs_qcom_assert_reset(hba); > - /* provide 1ms delay to let the reset pulse propagate */ > - usleep_range(1000, 1100); > - > /* phy initialization - calibrate the phy */ > ret = phy_init(phy); > if (ret) { > @@ -268,15 +269,6 @@ static int ufs_qcom_power_up_sequence(struct ufs_hba > *hba) > goto out; > } > > - /* De-assert PHY reset and start serdes */ > - ufs_qcom_deassert_reset(hba); > - > - /* > - * after reset deassertion, phy will need all ref clocks, > - * voltage, current to settle down before starting serdes. > - */ > - usleep_range(1000, 1100); > - > /* power on phy - start serdes and phy's power and clocks */ > ret = phy_power_on(phy); > if (ret) { > @@ -290,7 +282,6 @@ static int ufs_qcom_power_up_sequence(struct ufs_hba *hba) > return 0; > > out_disable_phy: > - ufs_qcom_assert_reset(hba); > phy_exit(phy); > out: > return ret; > @@ -554,21 +545,10 @@ static int ufs_qcom_suspend(struct ufs_hba *hba, enum > ufs_pm_op pm_op) > ufs_qcom_disable_lane_clks(host); > phy_power_off(phy); > > - /* Assert PHY soft reset */ > - ufs_qcom_assert_reset(hba); > - goto out; > - } > - > - /* > - * If UniPro link is not active, PHY ref_clk, main PHY analog power > - * rail and low noise analog power rail for PLL can be switched off. > - */ > - if (!ufs_qcom_is_link_active(hba)) { > + } else if (!ufs_qcom_is_link_active(hba)) { > ufs_qcom_disable_lane_clks(host); > - phy_power_off(phy); > } > > -out: > return ret; > } > > @@ -578,21 +558,26 @@ static int ufs_qcom_resume(struct ufs_hba *hba, enum > ufs_pm_op pm_op) > struct phy *phy = host->generic_phy; > int err; > > - err = phy_power_on(phy); > - if (err) { > - dev_err(hba->dev, "%s: failed enabling regs, err = %d\n", > - __func__, err); > - goto out; > - } > + if (ufs_qcom_is_link_off(hba)) { > + err = phy_power_on(phy); > + if (err) { > + dev_err(hba->dev, "%s: failed enabling regs, err = > %d\n", > + __func__, err); > + return err; > + } > > - err = ufs_qcom_enable_lane_clks(host); > - if (err) > - goto out; > + err = ufs_qcom_enable_lane_clks(host); > + if (err)
Re: [PATCH v1] scsi: ufs: Use explicit access size in ufshcd_dump_regs
On 09/01/2019 16:38, Jeffrey Hugo wrote: >> On 11/12/2018 15:18, Marc Gonzalez wrote: >> >>> memcpy_fromio() doesn't provide any control over access size. >>> For example, on arm64, it is implemented using readb and readq. >>> This may trigger a synchronous external abort: >>> >>> [3.729943] Internal error: synchronous external abort: 96000210 [#1] >>> PREEMPT SMP >>> [3.737000] Modules linked in: >>> [3.744371] CPU: 2 PID: 1 Comm: swapper/0 Tainted: G S >>> 4.20.0-rc4 #16 >>> [3.747413] Hardware name: Qualcomm Technologies, Inc. MSM8998 v1 MTP >>> (DT) >>> [3.755295] pstate: 0005 (nzcv daif -PAN -UAO) >>> [3.761978] pc : __memcpy_fromio+0x68/0x80 >>> [3.766718] lr : ufshcd_dump_regs+0x50/0xb0 >>> [3.770767] sp : 0807ba00 >>> [3.774830] x29: 0807ba00 x28: fffb >>> [3.778344] x27: 089db068 x26: 8000f6e58000 >>> [3.783728] x25: 000e x24: 0800 >>> [3.789023] x23: 8000f6e587c8 x22: 0800 >>> [3.794319] x21: 08908368 x20: 8000f6e1ab80 >>> [3.799615] x19: 006c x18: >>> [3.804910] x17: x16: >>> [3.810206] x15: 09199648 x14: 89244187 >>> [3.815502] x13: 09244195 x12: 091ab000 >>> [3.820797] x11: 05f5e0ff x10: 091998a0 >>> [3.826093] x9 : x8 : 8000f6e1ac00 >>> [3.831389] x7 : x6 : 0068 >>> [3.836676] x5 : 8000f6e1abe8 x4 : >>> [3.841971] x3 : 0928c868 x2 : 8000f6e1abec >>> [3.847267] x1 : 0928c868 x0 : 8000f6e1abe8 >>> [3.852567] Process swapper/0 (pid: 1, stack limit = 0x(ptrval)) >>> [3.857900] Call trace: >>> [3.864473] __memcpy_fromio+0x68/0x80 >>> [3.866683] ufs_qcom_dump_dbg_regs+0x1c0/0x370 >>> [3.870522] ufshcd_print_host_regs+0x168/0x190 >>> [3.874946] ufshcd_init+0xd4c/0xde0 >>> [3.879459] ufshcd_pltfrm_init+0x3c8/0x550 >>> [3.883264] ufs_qcom_probe+0x24/0x60 >>> [3.887188] platform_drv_probe+0x50/0xa0 >>> >>> Assuming aligned 32-bit registers, let's use readl, after making sure >>> that 'offset' and 'len' are indeed multiples of 4. >>> >>> Fixes: ba80917d9932d ("scsi: ufs: ufshcd_dump_regs to use memcpy_fromio") >>> Signed-off-by: Marc Gonzalez >>> --- >>> drivers/scsi/ufs/ufshcd.c | 10 -- >>> 1 file changed, 8 insertions(+), 2 deletions(-) >>> >>> diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c >>> index 535180c01ce8..320bbd9849bc 100644 >>> --- a/drivers/scsi/ufs/ufshcd.c >>> +++ b/drivers/scsi/ufs/ufshcd.c >>> @@ -112,13 +112,19 @@ >>> int ufshcd_dump_regs(struct ufs_hba *hba, size_t offset, size_t len, >>> const char *prefix) >>> { >>> - u8 *regs; >>> + u32 *regs; >>> + size_t pos; >>> + >>> + if (offset % 4 != 0 || len % 4 != 0) /* keep readl happy */ >>> + return -EINVAL; > > Hmm. It seems like these cases could be handled, but I guess we cannot > necessarily assume that reading past the bounds specified by the client > is safe, so this seems reasonable. > >>> >>> regs = kzalloc(len, GFP_KERNEL); >>> if (!regs) >>> return -ENOMEM; >>> >>> - memcpy_fromio(regs, hba->mmio_base + offset, len); >>> + for (pos = 0; pos < len; pos += 4) >>> + regs[pos / 4] = ufshcd_readl(hba, offset + pos); >>> + >>> ufshcd_hex_dump(prefix, regs, len); >>> kfree(regs); >>> > > Reviewed-by: Jeffrey Hugo James, Martin, I haven't heard back from Vinayak. (In fact, his last review dates back to 2016. Should MAINTAINERS be updated? I can send a patch.) Would you mind taking this patch through the SCSI tree? Tomas, does "LGTM" translate to Acked-by? :-D Regards.
RE: [PATCH v1] scsi: ufs: Use explicit access size in ufshcd_dump_regs
> > On 09/01/2019 16:38, Jeffrey Hugo wrote: > > >> On 11/12/2018 15:18, Marc Gonzalez wrote: > >> > >>> memcpy_fromio() doesn't provide any control over access size. > >>> For example, on arm64, it is implemented using readb and readq. > >>> This may trigger a synchronous external abort: > >>> > >>> [3.729943] Internal error: synchronous external abort: 96000210 [#1] > PREEMPT SMP > >>> [3.737000] Modules linked in: > >>> [3.744371] CPU: 2 PID: 1 Comm: swapper/0 Tainted: G S > >>> 4.20.0- > rc4 #16 > >>> [3.747413] Hardware name: Qualcomm Technologies, Inc. MSM8998 v1 > MTP (DT) > >>> [3.755295] pstate: 0005 (nzcv daif -PAN -UAO) > >>> [3.761978] pc : __memcpy_fromio+0x68/0x80 > >>> [3.766718] lr : ufshcd_dump_regs+0x50/0xb0 > >>> [3.770767] sp : 0807ba00 > >>> [3.774830] x29: 0807ba00 x28: fffb > >>> [3.778344] x27: 089db068 x26: 8000f6e58000 > >>> [3.783728] x25: 000e x24: 0800 > >>> [3.789023] x23: 8000f6e587c8 x22: 0800 > >>> [3.794319] x21: 08908368 x20: 8000f6e1ab80 > >>> [3.799615] x19: 006c x18: > >>> [3.804910] x17: x16: > >>> [3.810206] x15: 09199648 x14: 89244187 > >>> [3.815502] x13: 09244195 x12: 091ab000 > >>> [3.820797] x11: 05f5e0ff x10: 091998a0 > >>> [3.826093] x9 : x8 : 8000f6e1ac00 > >>> [3.831389] x7 : x6 : 0068 > >>> [3.836676] x5 : 8000f6e1abe8 x4 : > >>> [3.841971] x3 : 0928c868 x2 : 8000f6e1abec > >>> [3.847267] x1 : 0928c868 x0 : 8000f6e1abe8 > >>> [3.852567] Process swapper/0 (pid: 1, stack limit = > >>> 0x(ptrval)) > >>> [3.857900] Call trace: > >>> [3.864473] __memcpy_fromio+0x68/0x80 > >>> [3.866683] ufs_qcom_dump_dbg_regs+0x1c0/0x370 > >>> [3.870522] ufshcd_print_host_regs+0x168/0x190 > >>> [3.874946] ufshcd_init+0xd4c/0xde0 > >>> [3.879459] ufshcd_pltfrm_init+0x3c8/0x550 > >>> [3.883264] ufs_qcom_probe+0x24/0x60 > >>> [3.887188] platform_drv_probe+0x50/0xa0 > >>> > >>> Assuming aligned 32-bit registers, let's use readl, after making > >>> sure that 'offset' and 'len' are indeed multiples of 4. > >>> > >>> Fixes: ba80917d9932d ("scsi: ufs: ufshcd_dump_regs to use > >>> memcpy_fromio") > >>> Signed-off-by: Marc Gonzalez > >>> --- > >>> drivers/scsi/ufs/ufshcd.c | 10 -- > >>> 1 file changed, 8 insertions(+), 2 deletions(-) > >>> > >>> diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c > >>> index 535180c01ce8..320bbd9849bc 100644 > >>> --- a/drivers/scsi/ufs/ufshcd.c > >>> +++ b/drivers/scsi/ufs/ufshcd.c > >>> @@ -112,13 +112,19 @@ > >>> int ufshcd_dump_regs(struct ufs_hba *hba, size_t offset, size_t len, > >>>const char *prefix) > >>> { > >>> - u8 *regs; > >>> + u32 *regs; > >>> + size_t pos; > >>> + > >>> + if (offset % 4 != 0 || len % 4 != 0) /* keep readl happy */ > >>> + return -EINVAL; > > > > Hmm. It seems like these cases could be handled, but I guess we > > cannot necessarily assume that reading past the bounds specified by > > the client is safe, so this seems reasonable. > > > >>> > >>> regs = kzalloc(len, GFP_KERNEL); > >>> if (!regs) > >>> return -ENOMEM; > >>> > >>> - memcpy_fromio(regs, hba->mmio_base + offset, len); > >>> + for (pos = 0; pos < len; pos += 4) > >>> + regs[pos / 4] = ufshcd_readl(hba, offset + pos); > >>> + > >>> ufshcd_hex_dump(prefix, regs, len); > >>> kfree(regs); > >>> > > > > Reviewed-by: Jeffrey Hugo > > James, Martin, > > I haven't heard back from Vinayak. (In fact, his last review dates back to > 2016. > Should MAINTAINERS be updated? I can send a patch.) > > Would you mind taking this patch through the SCSI tree? > > Tomas, does "LGTM" translate to Acked-by? :-D Yes, you can add my Ack. Thanks Tomas
Re: [PATCH] sd: skip non-removable devices in sd_check_events()
On Wed, 2019-01-16 at 08:35 +0100, Hannes Reinecke wrote: > If the device is _not_ removable we should not start the event > poller as the media will not go away. Having the event poller running > will block the open() call as it will try to flush outstanding > events, > which it can't if the device is in state 'BLOCKED'. So the open() > call > will be stalled until the device state changed, which might be quite > some time depending on the transport. > > Signed-off-by: Hannes Reinecke > --- > drivers/scsi/sd.c | 3 +++ > 1 file changed, 3 insertions(+) Thanks - you lost patience with me, did you? :-) I didn't submit this yet because Tejun (adding him to CC) had explicitly removed this check in commit 2bae0093cab4, and I was still trying to understand why. Perhaps a follow-up discussion will clarify that. Martin > > diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c > index a1a44f52e0e8..521f0a384446 100644 > --- a/drivers/scsi/sd.c > +++ b/drivers/scsi/sd.c > @@ -1539,6 +1539,9 @@ static unsigned int sd_check_events(struct > gendisk *disk, unsigned int clearing) > return 0; > > sdp = sdkp->device; > + if (!sdp->removable) > + return 0; > + > SCSI_LOG_HLQUEUE(3, sd_printk(KERN_INFO, sdkp, > "sd_check_events\n")); > > /*
Re: [PATCH] sd: skip non-removable devices in sd_check_events()
On 1/16/19 11:26 AM, Martin Wilck wrote: On Wed, 2019-01-16 at 08:35 +0100, Hannes Reinecke wrote: If the device is _not_ removable we should not start the event poller as the media will not go away. Having the event poller running will block the open() call as it will try to flush outstanding events, which it can't if the device is in state 'BLOCKED'. So the open() call will be stalled until the device state changed, which might be quite some time depending on the transport. Signed-off-by: Hannes Reinecke --- drivers/scsi/sd.c | 3 +++ 1 file changed, 3 insertions(+) Thanks - you lost patience with me, did you? :-) I didn't submit this yet because Tejun (adding him to CC) had explicitly removed this check in commit 2bae0093cab4, and I was still trying to understand why. Perhaps a follow-up discussion will clarify that. To my understanding Tejun removed it as the check got moved into set_media_not_present() function. However, I'd be very much interested to hear on how a non-removable device can change ... Sure, there might be non-compatible devices which do _not_ set the removable flag (despite them being removable) and the notorious aacraid which insists on claiming all devices being removable, but if the need arises we can always introduce a blacklist flag. Cheers, Hannes -- Dr. Hannes ReineckeTeamlead Storage & Networking h...@suse.de +49 911 74053 688 SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton HRB 21284 (AG Nürnberg)
Re: [PATCH] sd: skip non-removable devices in sd_check_events()
On Wed, 2019-01-16 at 11:32 +0100, Hannes Reinecke wrote: > On 1/16/19 11:26 AM, Martin Wilck wrote: > > On Wed, 2019-01-16 at 08:35 +0100, Hannes Reinecke wrote: > > > If the device is _not_ removable we should not start the event > > > poller as the media will not go away. Having the event poller > > > running > > > will block the open() call as it will try to flush outstanding > > > events, > > > which it can't if the device is in state 'BLOCKED'. So the open() > > > call > > > will be stalled until the device state changed, which might be > > > quite > > > some time depending on the transport. > > > > > > Signed-off-by: Hannes Reinecke > > > --- > > > drivers/scsi/sd.c | 3 +++ > > > 1 file changed, 3 insertions(+) > > > > Thanks - you lost patience with me, did you? :-) > > > > I didn't submit this yet because Tejun (adding him to CC) had > > explicitly removed this check in commit 2bae0093cab4, and I was > > still > > trying to understand why. > > > > Perhaps a follow-up discussion will clarify that. > > > To my understanding Tejun removed it as the check got moved into > set_media_not_present() function. Maybe, but as a matter of fact, after 2bae0093cab4, TUR could be executed for non-removable disks in sd_check_events(), which was never done in sd_media_changed() beforehand. > However, I'd be very much interested to hear on how a non-removable > device can change ... Me too. The specs at least don't seem to mandate that only removable devices can return MEDIUM NOT PRESENT. But even then, it makes only little sense to actively test such devices for a media change by sending TURs. > Sure, there might be non-compatible devices which do _not_ set the > removable flag (despite them being removable) and the notorious > aacraid > which insists on claiming all devices being removable, but if the > need > arises we can always introduce a blacklist flag. I agree. Also, I suppose that if someone has a case for checking for media change on non-removable devices, (s)he will bring it up. Therefore: Acked-by: Martin Wilck Regards Martin
Re: [PATCH] scsi: isci: initialize shost fully before calling scsi_add_host()
On 16/01/2019 02:54, Martin K. Petersen wrote: Hi John, Hi Martin, So in this case I think that accessor functions are actually better because they allow us to print a big fat warning when you twiddle something you shouldn't post-initialization. So that's something I think we could--and should--improve. Sure, this is an alternative, but I would rather make it obvious when these parameters should be set so that this would not be required. I would like to have a mechanism in place that warns if you twiddle things that break assumptions made at host registration time. Yes, something more robust would be good. That's not a scenario the old registration interface was designed to handle. I am not sure I agree with your assertion that setting these masks in the struct prior to scsi_add_host() makes this ordering requirement much more obvious. It is not like you are passing in a list of parameters and then receiving a separately instantiated immutable scsi_host object. You are performing an operation on something you already have and own. That's why I commented that the current intersection between compile-time static host template, dynamically discovered pre-registration scsi_host parameters, and the registered runtime scsi_host struct is somewhat messy. Btw. I have no attachment to the prot wrappers whatsoever. The reason they exist is that the SCSI integrity support was optional. And therefore we had stub functions so things could be compiled without any of the integrity fields being present in the various SCSI structs. I never noticed stubs for setting/getting Scsi_host.prot_{capabilities,guard_type} So I don't have any problem killing the wrappers except they may actually be handy for regressions like this one where you could #error if the driver writer violates the ordering requirement. We set many Scsi_host parameters without such safeguarding, and I don't know what's special about these protection-related members. Thanks, John
[PATCH fix] scsi_lib: make sure scsi_request.sense valid
The block layer assumes scsi_request:sense is always a valid pointer. This is set up once in scsi_mq_init_request() and the containing scsi_cmnd object is used often, being re-initialized by scsi_init_command(). That works unless some code re-purposes part of the scsi_cmnd object for something else. And that is what bidi handling does in scsi_mq_prep_fn(). The result is an oops at some later time when the partly overwritten object is re-used. The overwrite is from d285203cf647d but 'git blame' does not show removed code, so that commit may not be the culprit. Signed-off-by: Douglas Gilbert --- This was found while injecting errors (thus generating sense data) into a sequence of bidi commands. At some later time the block layer blew up with a scsi_request::sense NULL dereference in sg_rq_end_io(). Without testing I'm confident the bsg driver, the osd ULD and exofs are exposed to this bug. drivers/scsi/scsi_lib.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c index b13cc9288ba0..71259bd4040a 100644 --- a/drivers/scsi/scsi_lib.c +++ b/drivers/scsi/scsi_lib.c @@ -1175,6 +1175,7 @@ void scsi_init_command(struct scsi_device *dev, struct scsi_cmnd *cmd) cmd->device = dev; cmd->sense_buffer = buf; + cmd->req.sense = buf; cmd->prot_sdb = prot; cmd->flags = flags; INIT_DELAYED_WORK(&cmd->abort_work, scmd_eh_abort_handler); -- 2.17.1
[PATCH for-5.0] scsi: communicate max segment size to the DMA mapping code
When a host driver sets a maximum segment size we should not only propagate that setting to the block layer, which can merge segments, but also to the DMA mapping layer which can merge segments as well. Fixes: 50c2e9107f ("scsi: introduce a max_segment_size host_template parameters") Signed-off-by: Christoph Hellwig --- drivers/ata/pata_macio.c | 9 - drivers/ata/sata_inic162x.c | 22 +- drivers/firewire/sbp2.c | 5 + drivers/scsi/aacraid/linit.c | 9 - drivers/scsi/scsi_lib.c | 4 ++-- 5 files changed, 20 insertions(+), 29 deletions(-) diff --git a/drivers/ata/pata_macio.c b/drivers/ata/pata_macio.c index 8cc9c429ad95..9e7fc302430f 100644 --- a/drivers/ata/pata_macio.c +++ b/drivers/ata/pata_macio.c @@ -915,6 +915,10 @@ static struct scsi_host_template pata_macio_sht = { .sg_tablesize = MAX_DCMDS, /* We may not need that strict one */ .dma_boundary = ATA_DMA_BOUNDARY, + /* Not sure what the real max is but we know it's less than 64K, let's +* use 64K minus 256 +*/ + .max_segment_size = MAX_DBDMA_SEG, .slave_configure= pata_macio_slave_config, }; @@ -1044,11 +1048,6 @@ static int pata_macio_common_init(struct pata_macio_priv *priv, /* Make sure we have sane initial timings in the cache */ pata_macio_default_timings(priv); - /* Not sure what the real max is but we know it's less than 64K, let's -* use 64K minus 256 -*/ - dma_set_max_seg_size(priv->dev, MAX_DBDMA_SEG); - /* Allocate libata host for 1 port */ memset(&pinfo, 0, sizeof(struct ata_port_info)); pmac_macio_calc_timing_masks(priv, &pinfo); diff --git a/drivers/ata/sata_inic162x.c b/drivers/ata/sata_inic162x.c index e0bcf9b2dab0..174e84ce4379 100644 --- a/drivers/ata/sata_inic162x.c +++ b/drivers/ata/sata_inic162x.c @@ -245,8 +245,15 @@ struct inic_port_priv { static struct scsi_host_template inic_sht = { ATA_BASE_SHT(DRV_NAME), - .sg_tablesize = LIBATA_MAX_PRD, /* maybe it can be larger? */ - .dma_boundary = INIC_DMA_BOUNDARY, + .sg_tablesize = LIBATA_MAX_PRD, /* maybe it can be larger? */ + + /* +* This controller is braindamaged. dma_boundary is 0x like others +* but it will lock up the whole machine HARD if 65536 byte PRD entry +* is fed. Reduce maximum segment size. +*/ + .dma_boundary = INIC_DMA_BOUNDARY, + .max_segment_size = 65536 - 512, }; static const int scr_map[] = { @@ -868,17 +875,6 @@ static int inic_init_one(struct pci_dev *pdev, const struct pci_device_id *ent) return rc; } - /* -* This controller is braindamaged. dma_boundary is 0x -* like others but it will lock up the whole machine HARD if -* 65536 byte PRD entry is fed. Reduce maximum segment size. -*/ - rc = dma_set_max_seg_size(&pdev->dev, 65536 - 512); - if (rc) { - dev_err(&pdev->dev, "failed to set the maximum segment size\n"); - return rc; - } - rc = init_controller(hpriv->mmio_base, hpriv->cached_hctl); if (rc) { dev_err(&pdev->dev, "failed to initialize controller\n"); diff --git a/drivers/firewire/sbp2.c b/drivers/firewire/sbp2.c index 09b845e90114..a785ffd5af89 100644 --- a/drivers/firewire/sbp2.c +++ b/drivers/firewire/sbp2.c @@ -1144,10 +1144,6 @@ static int sbp2_probe(struct fw_unit *unit, const struct ieee1394_device_id *id) if (device->is_local) return -ENODEV; - if (dma_get_max_seg_size(device->card->device) > SBP2_MAX_SEG_SIZE) - WARN_ON(dma_set_max_seg_size(device->card->device, -SBP2_MAX_SEG_SIZE)); - shost = scsi_host_alloc(&scsi_driver_template, sizeof(*tgt)); if (shost == NULL) return -ENOMEM; @@ -1610,6 +1606,7 @@ static struct scsi_host_template scsi_driver_template = { .eh_abort_handler = sbp2_scsi_abort, .this_id= -1, .sg_tablesize = SG_ALL, + .max_segment_size = SBP2_MAX_SEG_SIZE, .can_queue = 1, .sdev_attrs = sbp2_scsi_sysfs_attrs, }; diff --git a/drivers/scsi/aacraid/linit.c b/drivers/scsi/aacraid/linit.c index 634ddb90e7aa..7e56a11836c1 100644 --- a/drivers/scsi/aacraid/linit.c +++ b/drivers/scsi/aacraid/linit.c @@ -1747,11 +1747,10 @@ static int aac_probe_one(struct pci_dev *pdev, const struct pci_device_id *id) shost->max_sectors = (shost->sg_tablesize * 8) + 112; } - error = dma_set_max_seg_size(&pdev->dev, - (aac->adapter_info.options & AAC_OPT_NEW_COMM) ? - (shost->max_sectors << 9) : 65536); - if (error) - goto out_deinit; + i
Re: [PATCH] sd: skip non-removable devices in sd_check_events()
On 2019-01-16 5:58 a.m., Martin Wilck wrote: On Wed, 2019-01-16 at 11:32 +0100, Hannes Reinecke wrote: On 1/16/19 11:26 AM, Martin Wilck wrote: On Wed, 2019-01-16 at 08:35 +0100, Hannes Reinecke wrote: If the device is _not_ removable we should not start the event poller as the media will not go away. Having the event poller running will block the open() call as it will try to flush outstanding events, which it can't if the device is in state 'BLOCKED'. So the open() call will be stalled until the device state changed, which might be quite some time depending on the transport. Signed-off-by: Hannes Reinecke --- drivers/scsi/sd.c | 3 +++ 1 file changed, 3 insertions(+) Thanks - you lost patience with me, did you? :-) I didn't submit this yet because Tejun (adding him to CC) had explicitly removed this check in commit 2bae0093cab4, and I was still trying to understand why. Perhaps a follow-up discussion will clarify that. To my understanding Tejun removed it as the check got moved into set_media_not_present() function. Maybe, but as a matter of fact, after 2bae0093cab4, TUR could be executed for non-removable disks in sd_check_events(), which was never done in sd_media_changed() beforehand. However, I'd be very much interested to hear on how a non-removable device can change ... Me too. The specs at least don't seem to mandate that only removable devices can return MEDIUM NOT PRESENT. But even then, it makes only little sense to actively test such devices for a media change by sending TURs. So to be more precise, do you mean the LU disappearing while its port(s) and containing target device remain online? Otherwise a hot unplug meets the bill. Also some curious things can happen at the transport level. For example I believe T10 have just accepted a MODE SELECT command that will change a SAS wide port to narrow port(s) (and vice versa). The device is taken offline for a programmable period meant to be long enough for the scanning logic to notice and then brought back in its new configuration. Doug Gilbert Sure, there might be non-compatible devices which do _not_ set the removable flag (despite them being removable) and the notorious aacraid which insists on claiming all devices being removable, but if the need arises we can always introduce a blacklist flag. I agree. Also, I suppose that if someone has a case for checking for media change on non-removable devices, (s)he will bring it up. Therefore: Acked-by: Martin Wilck Regards Martin
Re: lk 4.20.0 sd: 'Unaligned partial completion (resid=3584, sector_sz=4096)' forever
Ping If these types of disks (ones that can fast format between 512 and 4096 bytes) are not out in the field yet, they soon will be. This problem requires a hard reboot (i.e. hold you finger on the power button for 5-10 seconds). Sad Doug Gilbert On 2018-12-24 9:19 p.m., Douglas Gilbert wrote: The kernel first saw this disk as having 4096 logical block size: # lsscs -gs [6:0:2:0] disk HGST HUS726T4TAL4204 C430 /dev/sdd /dev/sg3 4.00TB Then about a minute later it had a logical block size of 512 bytes after: # sg_format --format --ffmt=1 --size=512 /dev/sg3 [The '--ffmt=1' option means fast format and it obviously doesn't visit every block.] The sd driver was not amused and when I did: # fdisk /dev/sdd That xterm froze in an uninterruptible state and from another xterm I could see the log being filled with: [ 1004.409881] sd 6:0:2:0: [sdd] Unaligned partial completion (resid=3584, sector_sz=4096) Rebooted but that locked up the machine, power cycle ... Doug Gilbert
Re: lk 4.20.0 sd: 'Unaligned partial completion (resid=3584, sector_sz=4096)' forever
Doug, > If these types of disks (ones that can fast format between 512 and > 4096 bytes) are not out in the field yet, they soon will be. This > problem requires a hard reboot (i.e. hold you finger on the power > button for 5-10 seconds). Sad Do we get a notification from the drive that inquiry data has changed once the format is complete? -- Martin K. Petersen Oracle Linux Engineering
Re: lk 4.20.0 sd: 'Unaligned partial completion (resid=3584, sector_sz=4096)' forever
On 2019-01-16 11:44 a.m., Martin K. Petersen wrote: Doug, If these types of disks (ones that can fast format between 512 and 4096 bytes) are not out in the field yet, they soon will be. This problem requires a hard reboot (i.e. hold you finger on the power button for 5-10 seconds). Sad Do we get a notification from the drive that inquiry data has changed once the format is complete? I need to reconfigure my box to use FreeBSD with those disks. Then I should be able to answer that question. BTW I'm told by the manufacturer of said disk, that the last firmware update was to production firmware.
Re: [PATCH] sd: skip non-removable devices in sd_check_events()
On Wed, 2019-01-16 at 08:35 +0100, Hannes Reinecke wrote: > If the device is _not_ removable we should not start the event > poller as the media will not go away. Having the event poller running > will block the open() call as it will try to flush outstanding events, > which it can't if the device is in state 'BLOCKED'. So the open() call > will be stalled until the device state changed, which might be quite > some time depending on the transport. > > Signed-off-by: Hannes Reinecke > --- > drivers/scsi/sd.c | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c > index a1a44f52e0e8..521f0a384446 100644 > --- a/drivers/scsi/sd.c > +++ b/drivers/scsi/sd.c > @@ -1539,6 +1539,9 @@ static unsigned int sd_check_events(struct gendisk > *disk, unsigned int clearing) > return 0; > > sdp = sdkp->device; > + if (!sdp->removable) > + return 0; > + > SCSI_LOG_HLQUEUE(3, sd_printk(KERN_INFO, sdkp, "sd_check_events\n")); > > /* Hi Hannes, Although this patch looks fine to me, wouldn't it be a better approach to cause the events checker not to submit any SCSI commands to blocked devices? That approach should improve the sd behavior for both removable and non-removable devices. Thanks, Bart.
Re: lk 4.20.0 sd: 'Unaligned partial completion (resid=3584, sector_sz=4096)' forever
On 2019-01-16 11:55 a.m., Douglas Gilbert wrote: On 2019-01-16 11:44 a.m., Martin K. Petersen wrote: Doug, If these types of disks (ones that can fast format between 512 and 4096 bytes) are not out in the field yet, they soon will be. This problem requires a hard reboot (i.e. hold you finger on the power button for 5-10 seconds). Sad Do we get a notification from the drive that inquiry data has changed once the format is complete? I need to reconfigure my box to use FreeBSD with those disks. Then I should be able to answer that question. BTW I'm told by the manufacturer of said disk, that the last firmware update was to production firmware. No sign of a Unit Attention on FreeBSD (12.0 Release) from either a 512->4096 or 4096->512 fast format. And fdisk (which is different on FreeBSD) worked when the block size was 512 byte but said it could find the block size when disk's LB size was 4096. sbc4r16 chapter 4.10: "Any time the READ CAPACITY (10) parameter data (see 5.19.2) or the READ CAPACITY (16) parameter data (see 5.20.2) changes (e.g., when a FORMAT UNIT command or a MODE SELECT command causes a change to the logical block length or protection information, or when a vendor specific mechanism causes a change), then the device server shall establish a unit attention condition for the SCSI initiator port (see SAM-5) associated with each I_T nexus, except the I_T nexus on which the command causing the change was received with the additional sense code set to CAPACITY DATA HAS CHANGED." So seeing I'm using the I_T nexus that sent the FORMAT UNIT, then no Unit Attention is expected, as per above. So onto INQUIRY VPD pages: what has changed that will cause a UA? And surely the sd driver or mid-level should "smell a rat" after the first: Unaligned partial completion error message. The disk was unmounted so it should not be a big deal to run initialisation again on it. If it was mounted then that would be an interesting situation, since the user visible data (say just changed to 4096 byte LBs) would be the concatenation of 8 of the previous 512 byte LBs. However there is nothing in SBC-4 that says that will be the case, and I was told by the firmware people, just assume that was an "accident". PI might even survive across LB size changes. Doug Gilbert
Re: [PATCH] sd: skip non-removable devices in sd_check_events()
On Wed, 2019-01-16 at 11:32 -0500, Douglas Gilbert wrote: > On 2019-01-16 5:58 a.m., Martin Wilck wrote: > > On Wed, 2019-01-16 at 11:32 +0100, Hannes Reinecke wrote: > > > On 1/16/19 11:26 AM, Martin Wilck wrote: > > > > On Wed, 2019-01-16 at 08:35 +0100, Hannes Reinecke wrote: > > > > > If the device is _not_ removable we should not start the > > > > > event > > > > > poller as the media will not go away. Having the event poller > > > > > running > > > > > will block the open() call as it will try to flush > > > > > outstanding > > > > > events, > > > > > which it can't if the device is in state 'BLOCKED'. So the > > > > > open() > > > > > call > > > > > will be stalled until the device state changed, which might > > > > > be > > > > > quite > > > > > some time depending on the transport. > > > > > > > > > > Signed-off-by: Hannes Reinecke > > > > > --- > > > > >drivers/scsi/sd.c | 3 +++ > > > > >1 file changed, 3 insertions(+) > > > > > > > > Thanks - you lost patience with me, did you? :-) > > > > > > > > I didn't submit this yet because Tejun (adding him to CC) had > > > > explicitly removed this check in commit 2bae0093cab4, and I was > > > > still > > > > trying to understand why. > > > > > > > > Perhaps a follow-up discussion will clarify that. > > > > > > > To my understanding Tejun removed it as the check got moved into > > > set_media_not_present() function. > > > > Maybe, but as a matter of fact, after 2bae0093cab4, TUR could be > > executed for non-removable disks in sd_check_events(), which was > > never > > done in sd_media_changed() beforehand. > > > > > However, I'd be very much interested to hear on how a non- > > > removable > > > device can change ... > > > > Me too. The specs at least don't seem to mandate that only > > removable > > devices can return MEDIUM NOT PRESENT. But even then, it makes only > > little sense to actively test such devices for a media change by > > sending TURs. > > So to be more precise, do you mean the LU disappearing while its > port(s) and containing target device remain online? > > Otherwise a hot unplug meets the bill. Also some curious things can > happen at the transport level. For example I believe T10 have just > accepted a MODE SELECT command that will change a SAS wide port > to narrow port(s) (and vice versa). The device is taken offline for > a programmable period meant to be long enough for the scanning logic > to notice and then brought back in its new configuration. Are these really cases where the sense code would be UNIT ATTENTION / MEDIUM NOT PRESENT or NOT READY / MEDIUM NOT PRESENT? That's all that matters here. Other sorts of device removal would be handled much differently. Cheers, Martin
Re: ufshcd_queuecommand() triggering after ufshcd_suspend()?
On Sun, Jan 13, 2019 at 7:25 PM Zang Leigang wrote: > I think there are two different issues: > > 1. clk_gating's state(including state's trace event) and is_suspended is not > wrapped by ufshcd_is_clkgating_allowed which Hisilicon's kirin platoform > soc does not need but is set and checked in a regular path. > 2. I think SCSI is necessary to block queue to stop sending to ufs after > system suspend or, add a new state for ufs like UFSHCD_STATE_SUSPENDING > or what else. hba->is_sys_suspended is too late to stop > ufshcd_queuecommand > Hey Zang, Thanks for the suggestions. I'm not sure I have enough context to try to come up with a patch. I'll try to take a stab at it, but I'd appreciate it if you have any more specific suggestions or patches for me to try. :) thanks -john
Re: [PATCH v1 0/8] phy: qcom-ufs: Enable regulators to be off in suspend
Quoting Evan Green (2019-01-11 15:01:21) > > Because the UFS PHY reset bit is now toggled in the PHY, rather > than in ufs-qcom, this also percolated to all other PHYs using > ufs-qcom, which from what I can see is just 8996. > > There are a couple of tradeoffs in this series that I'd welcome feedback > on. First, it breaks compatibility with device trees that don't expose > this new reset controller. Making this work with older device trees > would be pretty ugly in the code, and given that the SDM845 UFS DT nodes > aren't accepted upstream yet, the breakage seemed worth it. I'm not as > sure about 8996. It may take some minor surgery to the reset framework, but it should be possible to register a reset lookup that can be used if nothing matches in DT. Right now the reset code looks to only want to use DT if it's there for the device requesting the reset. If there isn't a DT node for the device it will look in the global lookup list. That could be changed to fallback to the global lookup that uses device names (similar to clk framework) when the DT lookup fails. Then it's just a matter of registering the lookup for this reset with the handful of device names that need the non-DT way of finding the reset. Of course, that's another change and if breaking DT is simpler and acceptable then I would say go for the path of least resistance and don't try to modify reset framework for this.
[PATCH v2] scsi: be2iscsi: fix potential NULL pointer dereference
Fix potential NULL pointer dereference wich might happen in beiscsi_alloc_mem(). If kmalloc_array() fails and mem_descr->mem_array is set to NULL, then its dereferencing happens when passing mem_descr->mem_array[] to dma_free_coherent(). Signed-off-by: Dmitry Voytik --- Changes since v1: - prevent double free by setting mem_arr_orig to NULL drivers/scsi/be2iscsi/be_main.c | 5 - 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/drivers/scsi/be2iscsi/be_main.c b/drivers/scsi/be2iscsi/be_main.c index 74e260027c7d..7b8c80318c38 100644 --- a/drivers/scsi/be2iscsi/be_main.c +++ b/drivers/scsi/be2iscsi/be_main.c @@ -2527,8 +2527,11 @@ static int beiscsi_alloc_mem(struct beiscsi_hba *phba) mem_descr->size_in_bytes = phba->mem_req[i]; mem_descr->mem_array = kmalloc_array(j, sizeof(*mem_arr), GFP_KERNEL); - if (!mem_descr->mem_array) + if (!mem_descr->mem_array) { + mem_descr->mem_array = mem_arr_orig; + mem_arr_orig = NULL; goto free_mem; + } memcpy(mem_descr->mem_array, mem_arr_orig, sizeof(struct mem_array) * j); -- 2.20.1
Re: [PATCH] sd: skip non-removable devices in sd_check_events()
On Wed, 2019-01-16 at 08:35 +0100, Hannes Reinecke wrote: > If the device is _not_ removable we should not start the event > poller as the media will not go away. Having the event poller running > will block the open() call as it will try to flush outstanding > events, > which it can't if the device is in state 'BLOCKED'. So the open() > call > will be stalled until the device state changed, which might be quite > some time depending on the transport. > > Signed-off-by: Hannes Reinecke I've pondered this some more. Meanwhile I think we should rather fix the generic event handling code in genhd.c. The problem is that this code schedules the check_events() function in some situations (notably, on close(2) via blkdev_put()), even for devices which don't report any supported events in the (struct gendisk)->events field (the sd driver correctly sets the events field depending on the "removable" attribute, but the that only influences the polling behavior, not the call upon close()). It looks weird to me that devices that don't report any supported events would be checked for such events. The reason for this is 7c88a168da80 and follow-ups (7eec77a1, 9fd097b1). For the sake of not confusing userland, these commits sacrificed the information in the generic block layer whether or not a given device actually supports media change events. I'm working on a patch to fix that. Regards, Martin
Re: [PATCH fix] scsi_lib: make sure scsi_request.sense valid
On Wed, 2019-01-16 at 10:57 -0500, Douglas Gilbert wrote: > The block layer assumes scsi_request:sense is always a valid > pointer. This is set up once in scsi_mq_init_request() and the > containing scsi_cmnd object is used often, being re-initialized > by scsi_init_command(). That works unless some code re-purposes > part of the scsi_cmnd object for something else. And that is > what bidi handling does in scsi_mq_prep_fn(). The result is an > oops at some later time when the partly overwritten object is > re-used. The overwrite is from d285203cf647d but 'git blame' > does not show removed code, so that commit may not be the > culprit. > > Signed-off-by: Douglas Gilbert > --- > > This was found while injecting errors (thus generating sense data) > into a sequence of bidi commands. At some later time the block > layer blew up with a scsi_request::sense NULL dereference in > sg_rq_end_io(). Without testing I'm confident the bsg driver, > the osd ULD and exofs are exposed to this bug. > > drivers/scsi/scsi_lib.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c > index b13cc9288ba0..71259bd4040a 100644 > --- a/drivers/scsi/scsi_lib.c > +++ b/drivers/scsi/scsi_lib.c > @@ -1175,6 +1175,7 @@ void scsi_init_command(struct scsi_device *dev, struct > scsi_cmnd *cmd) > > cmd->device = dev; > cmd->sense_buffer = buf; > + cmd->req.sense = buf; > cmd->prot_sdb = prot; > cmd->flags = flags; > INIT_DELAYED_WORK(&cmd->abort_work, scmd_eh_abort_handler); Hi Doug, The description of this patch does not look correct to me. scsi_init_command() does not overwrite the sense pointer. From the body of that function: /* zero out the cmd, except for the embedded scsi_request */ memset((char *)cmd + sizeof(cmd->req), 0, sizeof(*cmd) - sizeof(cmd->req) + dev->host->hostt->cmd_size); It is not clear to me which code overwrites the sense pointer. I think that needs to be figured out before discussion of this patch can continue. Thanks, Bart.
Re: [PATCH fix] scsi_lib: make sure scsi_request.sense valid
On 2019-01-16 6:56 p.m., Bart Van Assche wrote: On Wed, 2019-01-16 at 10:57 -0500, Douglas Gilbert wrote: The block layer assumes scsi_request:sense is always a valid pointer. This is set up once in scsi_mq_init_request() and the containing scsi_cmnd object is used often, being re-initialized by scsi_init_command(). That works unless some code re-purposes part of the scsi_cmnd object for something else. And that is what bidi handling does in scsi_mq_prep_fn(). The result is an oops at some later time when the partly overwritten object is re-used. The overwrite is from d285203cf647d but 'git blame' does not show removed code, so that commit may not be the culprit. Signed-off-by: Douglas Gilbert --- This was found while injecting errors (thus generating sense data) into a sequence of bidi commands. At some later time the block layer blew up with a scsi_request::sense NULL dereference in sg_rq_end_io(). Without testing I'm confident the bsg driver, the osd ULD and exofs are exposed to this bug. drivers/scsi/scsi_lib.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c index b13cc9288ba0..71259bd4040a 100644 --- a/drivers/scsi/scsi_lib.c +++ b/drivers/scsi/scsi_lib.c @@ -1175,6 +1175,7 @@ void scsi_init_command(struct scsi_device *dev, struct scsi_cmnd *cmd) cmd->device = dev; cmd->sense_buffer = buf; + cmd->req.sense = buf; cmd->prot_sdb = prot; cmd->flags = flags; INIT_DELAYED_WORK(&cmd->abort_work, scmd_eh_abort_handler); Hi Doug, The description of this patch does not look correct to me. scsi_init_command() does not overwrite the sense pointer. From the body of that function: /* zero out the cmd, except for the embedded scsi_request */ memset((char *)cmd + sizeof(cmd->req), 0, sizeof(*cmd) - sizeof(cmd->req) + dev->host->hostt->cmd_size); It is not clear to me which code overwrites the sense pointer. I think that needs to be figured out before discussion of this patch can continue. Bart, Please re-read the explanation. The two sentences in the middle should tell you that you are looking at the wrong memset(). And I'm waiting for the person who wrote the questionable code to comment. If you don't believe me, try sending a device reset to a scsi_debug device. Then use sg_raw *** to send a XDWRITREAD(10) bidi command to the same scsi_debug device, you should get a Unit Attention concerning that device reset. If you do, keep sending that bidi command. Make sure you have no important files open because that machine will lock solid. Basically what happens when a rq_end_io() callback de-references a NULL pointer. It looks like it has been there since 2014 and took me 2 days to find. Sorry that I can't explain it in one simple sentence. Douglas *** Use 'man sg_raw' and see the EXAMPLES section (second to last example) You can use a bsg device as shown.
Re: [ANNOUNCE] v4 sg driver: ready for testing
There is an update to the SCSI Generic (sg) v4 driver adding synchronous and asynchronous bidi command support. Plus lots of fixes and some minor improvements. See: http://sg.danny.cz/sg/sg_v40.html The kernel code is split in two in the tarball below, one targeting lk 5.0 and the other targeting lk 4.20 and earlier ***. Each section contains the 3 files that represent the sg v4 driver plus a meandering 17 part patchset. Those patchsets reflect the driver's rewrite rather than a logical progression. http://sg.danny.cz/sg/p/sgv4_20190116.tgz Plus there are updated testing utilities in sg3_utils-1.45 (beta, revision 807) at the top of this page: http://sg.danny.cz/sg/index.html Doug Gilbert *** the reason for the split is the tree wide change to the access_ok() function. On 2018-12-25 2:39 a.m., Douglas Gilbert wrote: There is an update to the sg v4 driver with some error fixes, SIGIO and RT signals work plus single READ, multiple WRITE sharing support. See: http://sg.danny.cz/sg/sg_v40.html with testing utilities in sg3_utils-1.45 (beta, revision 802) on the main page: http://sg.danny.cz/sg/index.html Doug Gilbert On 2018-12-18 6:41 p.m., Douglas Gilbert wrote: After an underwhelming response to my intermediate level patchsets to modernize the sg driver in October this year (see "[PATCH 0/8] sg: major cleanup, remove max_queue limit" followed by v2 and v3 between 20181019 and 20181028), I decided to move ahead and add the functionality proposed for the version 4 sg driver. That means accepting interface objects of type 'struct sg_io_v4' (as found in include/uapi/linux/bsg) plus two new ioctls: SG_IOSUBMIT and SG_IORECEIVE as proposed by Linus Torvalds to replace the unloved write(2)/read(2) asynchronous interface . There is a new feature called "sharing" explained in the web page (see below). Yes, there is a patchset available (14 part and growing) but even without explanatory comments at the top of each patch, that patchset is 4 times larger than the v4 sg driver (i.e. the finished product) and over 6 times larger than the original v3 sg driver! Part of the reason for the patchset size is the multiple backtracks and rewrites associated with a real development process. The cleanest patchset would have 3 parts: 1) split the current include/scsi/sg.h into the end product headers: include/uapi/scsi/sg.h and include/scsi/sg.h 2) delete drivers/scsi/sg.c 3) add the v4 drivers/scsi/sg.c After part 2) you could build a kernel and I can guarantee that no-one will be able to find any sg driver bugs but some users might get upset (but not the Linux security folks). So there is a working v4 sg driver discussed here, with a download: http://sg.danny.cz/sg/sg_v40.html I will keep that page up to date while the driver is in this phase. There is a sg3_utils beta of 1.45 (revision 799) package in the News section at the top of the main page: http://sg.danny.cz/sg/index.html That sg3_utils beta package will use the v4 sg interface via sg devices if the v4 driver is detected. There are also three test utilities in the 'testing' directory designed to exercise the v4 extensions. The degree of backward compatibility with the v3 driver should be high but there are limits to backward compatibility. As an example, it is possible that there are user apps that depend on hitting the 16 outstanding command limit (per fd) in the v3 driver and go "wild" when v4 removes that ceiling. If so, a "high_v3_compat" driver option could be added to put that ceiling back. The only way to find out is for folks to try and if there is a failure, contact me, or send mail to this list. Code reviews welcome as well. Doug Gilbert I felt this was a better use of my time than trying to invent a new debug/trace mechanism for the whole SCSI subsystem. That is what _SCSI_ system maintainers are for, I'll stick to the sg driver (and scsi_debug). Add user space tools and there is more than enough work there ...
Re: [PATCH fix] scsi_lib: make sure scsi_request.sense valid
On Wed, 2019-01-16 at 19:54 -0500, Douglas Gilbert wrote: > On 2019-01-16 6:56 p.m., Bart Van Assche wrote: > > On Wed, 2019-01-16 at 10:57 -0500, Douglas Gilbert wrote: > > > The block layer assumes scsi_request:sense is always a valid > > > pointer. This is set up once in scsi_mq_init_request() and the > > > containing scsi_cmnd object is used often, being re-initialized > > > by scsi_init_command(). That works unless some code re-purposes > > > part of the scsi_cmnd object for something else. And that is > > > what bidi handling does in scsi_mq_prep_fn(). The result is an > > > oops at some later time when the partly overwritten object is > > > re-used. The overwrite is from d285203cf647d but 'git blame' > > > does not show removed code, so that commit may not be the > > > culprit. > > > > > > Signed-off-by: Douglas Gilbert > > > --- > > > > > > This was found while injecting errors (thus generating sense data) > > > into a sequence of bidi commands. At some later time the block > > > layer blew up with a scsi_request::sense NULL dereference in > > > sg_rq_end_io(). Without testing I'm confident the bsg driver, > > > the osd ULD and exofs are exposed to this bug. > > > > > > drivers/scsi/scsi_lib.c | 1 + > > > 1 file changed, 1 insertion(+) > > > > > > diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c > > > index b13cc9288ba0..71259bd4040a 100644 > > > --- a/drivers/scsi/scsi_lib.c > > > +++ b/drivers/scsi/scsi_lib.c > > > @@ -1175,6 +1175,7 @@ void scsi_init_command(struct scsi_device *dev, > > > struct scsi_cmnd *cmd) > > > > > > cmd->device = dev; > > > cmd->sense_buffer = buf; > > > + cmd->req.sense = buf; > > > cmd->prot_sdb = prot; > > > cmd->flags = flags; > > > INIT_DELAYED_WORK(&cmd->abort_work, scmd_eh_abort_handler); > > > > Hi Doug, > > > > The description of this patch does not look correct to me. > > scsi_init_command() > > does not overwrite the sense pointer. From the body of that function: > > > > /* zero out the cmd, except for the embedded scsi_request */ > > memset((char *)cmd + sizeof(cmd->req), 0, > > sizeof(*cmd) - sizeof(cmd->req) + dev->host->hostt->cmd_size); > > > > It is not clear to me which code overwrites the sense pointer. I think that > > needs to be figured out before discussion of this patch can continue. > > Bart, > Please re-read the explanation. The two sentences in the middle should tell > you that you are looking at the wrong memset(). > > And I'm waiting for the person who wrote the questionable code to comment. > > > If you don't believe me, try sending a device reset to a scsi_debug device. > Then use sg_raw *** to send a XDWRITREAD(10) bidi command to the same > scsi_debug > device, you should get a Unit Attention concerning that device reset. If > you do, keep sending that bidi command. Make sure you have no important > files open because that machine will lock solid. Basically what happens > when a rq_end_io() callback de-references a NULL pointer. It looks like > it has been there since 2014 and took me 2 days to find. Sorry that I can't > explain it in one simple sentence. Hi Doug, Is this perhaps the memset you are referring to? memset(bidi_sdb, 0, sizeof(struct scsi_data_buffer)); Bart.
Re: lk 4.20.0 sd: 'Unaligned partial completion (resid=3584, sector_sz=4096)' forever
Doug, > except the I_T nexus on which the command causing the change was > received with the additional sense code set to CAPACITY DATA HAS > CHANGED." > > So seeing I'm using the I_T nexus that sent the FORMAT UNIT, then no > Unit Attention is expected, as per above. But that assumes that there is a single entity sending commands. Which is not the case here given that sg, unbeknownst to sd, is sending a FORMAT UNIT. We are not intercepting commands sent via sg to see if they may affect the ULDs. So we need some other way of finding out that the device has been reformatted behind our backs. > And surely the sd driver or mid-level should "smell a rat" after the > first: Unaligned partial completion error message. The disk was > unmounted so it should not be a big deal to run initialisation again > on it. While I agree that this particular error should be a red flag, we'll only see it if the new format deviates significantly from the old one. So smelling rats seems like papering over a limited subset of the possible issues resulting from a FORMAT UNIT. I wonder if it wouldn't be better to teach sg_format to revalidate a device once FORMAT UNIT completes? > If it was mounted then that would be an interesting situation, since > the user visible data (say just changed to 4096 byte LBs) would be the > concatenation of 8 of the previous 512 byte LBs. Maybe sg_format should check whether a block device is in use before it proceeds to send FORMAT UNIT? -- Martin K. Petersen Oracle Linux Engineering
Re: [PATCH v1 5/8] scsi: ufs: qcom: Expose the reset controller for PHY
Kishon, > On 12/01/19 4:31 AM, Evan Green wrote: >> Expose a reset controller that the phy can use to perform its >> initialization in a single callback. >> >> Also, change the use of the phy functions from ufs-qcom such that >> phy_poweron actually fires up the phy, and phy_poweroff actually >> powers it down. >> >> Signed-off-by: Evan Green > > Can I get Ack for this patch from SCSI MAINTAINERS? No objection from me if there is general consensus that moving reset to the phy is the right thing to do. -- Martin K. Petersen Oracle Linux Engineering
Re: [PATCH v2 0/7] sd patches for kernel v5.1
Bart, > If any changes have been left out that is the result of an > oversight. Can you clarify which changes you think are missing? It appears you got everything. Things were just sliced and diced slightly differently. I'll run some tests and apply unless something breaks. Thanks again for dusting these off! -- Martin K. Petersen Oracle Linux Engineering
Fix suspend/resume of ACB_ADAPTER_TYPE_B part 2
This patch series are against to mkp's 5.1/scsi-queue. 1. Due to dma_zalloc_coherent will be phase out, so use dma_alloc_coherent to replace it. 2. Fix suspend/resume of ACB_ADAPTER_TYPE_B part 2. 3. Update driver version to v1.40.00.10-20190116 ---
[PATCH 1/3] scsi: arcmsr: Use dma_alloc_coherent to replace dma_zalloc_coherent
>From Ching Huang Due to dma_zalloc_coherent will be phase out, so use dma_alloc_coherent to replace it. Signed-off-by: Ching Huang --- diff --git a/drivers/scsi/arcmsr/arcmsr_hba.c b/drivers/scsi/arcmsr/arcmsr_hba.c index 9f85d5a..5736434 100755 --- a/drivers/scsi/arcmsr/arcmsr_hba.c +++ b/drivers/scsi/arcmsr/arcmsr_hba.c @@ -642,7 +642,7 @@ static bool arcmsr_alloc_io_queue(struct AdapterControlBlock *acb) switch (acb->adapter_type) { case ACB_ADAPTER_TYPE_B: { acb->ioqueue_size = roundup(sizeof(struct MessageUnit_B), 32); - dma_coherent = dma_zalloc_coherent(&pdev->dev, acb->ioqueue_size, + dma_coherent = dma_alloc_coherent(&pdev->dev, acb->ioqueue_size, &dma_coherent_handle, GFP_KERNEL); if (!dma_coherent) { pr_notice("arcmsr%d: DMA allocation failed\n", acb->host->host_no); @@ -656,7 +656,7 @@ static bool arcmsr_alloc_io_queue(struct AdapterControlBlock *acb) break; case ACB_ADAPTER_TYPE_D: { acb->ioqueue_size = roundup(sizeof(struct MessageUnit_D), 32); - dma_coherent = dma_zalloc_coherent(&pdev->dev, acb->ioqueue_size, + dma_coherent = dma_alloc_coherent(&pdev->dev, acb->ioqueue_size, &dma_coherent_handle, GFP_KERNEL); if (!dma_coherent) { pr_notice("arcmsr%d: DMA allocation failed\n", acb->host->host_no); @@ -672,7 +672,7 @@ static bool arcmsr_alloc_io_queue(struct AdapterControlBlock *acb) uint32_t completeQ_size; completeQ_size = sizeof(struct deliver_completeQ) * ARCMSR_MAX_HBE_DONEQUEUE + 128; acb->ioqueue_size = roundup(completeQ_size, 32); - dma_coherent = dma_zalloc_coherent(&pdev->dev, acb->ioqueue_size, + dma_coherent = dma_alloc_coherent(&pdev->dev, acb->ioqueue_size, &dma_coherent_handle, GFP_KERNEL); if (!dma_coherent){ pr_notice("arcmsr%d: DMA allocation failed\n", acb->host->host_no);
[PATCH 2/3] scsi: arcmsr: Fix suspend/resume of ACB_ADAPTER_TYPE_B part 2
>From Ching Huang Fix suspend/resume of ACB_ADAPTER_TYPE_B part 2. Signed-off-by: Ching Huang --- diff --git a/drivers/scsi/arcmsr/arcmsr.h b/drivers/scsi/arcmsr/arcmsr.h index a94c513..b98c632 100755 --- a/drivers/scsi/arcmsr/arcmsr.h +++ b/drivers/scsi/arcmsr/arcmsr.h @@ -508,9 +508,9 @@ struct MessageUnit_A struct MessageUnit_B { uint32_tpost_qbuffer[ARCMSR_MAX_HBB_POSTQUEUE]; - uint32_tdone_qbuffer[ARCMSR_MAX_HBB_POSTQUEUE]; + volatile uint32_t done_qbuffer[ARCMSR_MAX_HBB_POSTQUEUE]; uint32_tpostq_index; - uint32_tdoneq_index; + volatile uint32_t doneq_index; uint32_t__iomem *drv2iop_doorbell; uint32_t__iomem *drv2iop_doorbell_mask; uint32_t__iomem *iop2drv_doorbell; diff --git a/drivers/scsi/arcmsr/arcmsr_hba.c b/drivers/scsi/arcmsr/arcmsr_hba.c index 5736434..88053b1 100755 --- a/drivers/scsi/arcmsr/arcmsr_hba.c +++ b/drivers/scsi/arcmsr/arcmsr_hba.c @@ -1113,7 +1113,11 @@ static int arcmsr_resume(struct pci_dev *pdev) switch (acb->adapter_type) { case ACB_ADAPTER_TYPE_B: { struct MessageUnit_B *reg = acb->pmuB; - reg->post_qbuffer[0] = 0; + uint32_t i; + for (i = 0; i < ARCMSR_MAX_HBB_POSTQUEUE; i++) { + reg->post_qbuffer[i] = 0; + reg->done_qbuffer[i] = 0; + } reg->postq_index = 0; reg->doneq_index = 0; break;
[PATCH 3/3] scsi: arcmsr: Update driver version to v1.40.00.10-20190116
>From Ching Huang Update driver version to v1.40.00.10-20190116. Signed-off-by: Ching Huang --- diff --git a/drivers/scsi/arcmsr/arcmsr.h b/drivers/scsi/arcmsr/arcmsr.h index b98c632..6033bcc 100755 --- a/drivers/scsi/arcmsr/arcmsr.h +++ b/drivers/scsi/arcmsr/arcmsr.h @@ -49,7 +49,7 @@ struct device_attribute; #define ARCMSR_MAX_OUTSTANDING_CMD 1024 #define ARCMSR_DEFAULT_OUTSTANDING_CMD 128 #define ARCMSR_MIN_OUTSTANDING_CMD 32 -#define ARCMSR_DRIVER_VERSION "v1.40.00.10-20181217" +#define ARCMSR_DRIVER_VERSION "v1.40.00.10-20190116" #define ARCMSR_SCSI_INITIATOR_ID 255 #define ARCMSR_MAX_XFER_SECTORS512 #define ARCMSR_MAX_XFER_SECTORS_B 4096
[PATCH 0/3] scsi: arcmsr: Fix suspend/resume of ACB_ADAPTER_TYPE_B part 2
This patch series are against to mkp's 5.1/scsi-queue. 1. Due to dma_zalloc_coherent will be phase out, so use dma_alloc_coherent to replace it. 2. Fix suspend/resume of ACB_ADAPTER_TYPE_B part 2. 3. Update driver version to v1.40.00.10-20190116 ---
RE: [PATCH] scsi: ufs: revamp string descriptor reading
> > Hello Tomas, > > > > > > > > Define new a type: uc_string_id for easier string handling and less > > > casting. Reduce number or string copies in price of a dynamic > > > allocation. > > > > > > Signed-off-by: Tomas Winkler > > Tested-by: Avri Altman > > > > Just one nit - doesn't really matters. > > > > Cheers, > > Avri > > > > > --- > > > drivers/scsi/ufs/ufs-sysfs.c | 20 ++--- > > > drivers/scsi/ufs/ufs.h | 2 +- > > > drivers/scsi/ufs/ufshcd.c| 164 +-- > > > drivers/scsi/ufs/ufshcd.h| 9 +- > > > 4 files changed, 115 insertions(+), 80 deletions(-) > > > > > > > > > > ufs_fixup_device_setup(hba, &card); > > > + ufs_put_device_desc(&card); > > ufs_get_device_desc() and ufs_put_device_desc() actually serves the > > quirks setup. > > Make sense to call it from within so the logic is clear and in one place. > > Might also save ufs_put_device_desc(). > > You are right from the current perspective , just I'd need it also for the > RPMB > patches that should follow, then it will have bit larger span than the quirks. > Is this okay, can we merge? Thanks Tomas