[PATCH -next] advansys: Add missing include file
Fix: drivers/scsi/advansys.c: In function 'adv_isr_callback': drivers/scsi/advansys.c:6089:3: error: implicit declaration of function 'dma_pool_free' drivers/scsi/advansys.c: In function 'adv_get_sglist': drivers/scsi/advansys.c:7654:3: error: implicit declaration of function 'dma_pool_alloc' drivers/scsi/advansys.c: In function 'advansys_wide_init_chip': drivers/scsi/advansys.c:10866:2: error: implicit declaration of function 'dma_pool_create' Fixes: 0ce538226b7a ("advansys: Use dma_pool for sg elements") Cc: Hannes Reinecke Signed-off-by: Guenter Roeck --- drivers/scsi/advansys.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/scsi/advansys.c b/drivers/scsi/advansys.c index 14d3aa54daa5..af2cb5e9e3f7 100644 --- a/drivers/scsi/advansys.c +++ b/drivers/scsi/advansys.c @@ -36,6 +36,7 @@ #include #include #include +#include #include #include -- 2.1.0 -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] scsi: Fix qemu boot hang problem
The latest kernel fails to boot qemu arm images when using scsi for disk access. Boot gets stuck after the following messages. brd: module loaded sym53c8xx :00:0c.0: enabling device (0100 -> 0103) sym0: <895a> rev 0x0 at pci :00:0c.0 irq 93 sym0: No NVRAM, ID 7, Fast-40, LVD, parity checking sym0: SCSI BUS has been reset. scsi host0: sym-2.2.3 Bisect points to commit 71e75c97f97a ("scsi: convert device_busy to atomic_t"). Code inspection shows the following suspicious change in scsi_request_fn. out_delay: - if (sdev->device_busy == 0 && !scsi_device_blocked(sdev)) + if (atomic_read(&sdev->device_busy) && !scsi_device_blocked(sdev)) blk_delay_queue(q, SCSI_QUEUE_DELAY); } 'sdev->device_busy == 0' was replaced with 'atomic_read(&sdev->device_busy)', meaning the logic was reversed. Changing this expression to '!atomic_read(&sdev->device_busy)' fixes the problem. Cc: Christoph Hellwig Cc: Hannes Reinecke Cc: Webb Scales Cc: Jens Axboe Signed-off-by: Guenter Roeck --- drivers/scsi/scsi_lib.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c index 9c44392..ce62e87 100644 --- a/drivers/scsi/scsi_lib.c +++ b/drivers/scsi/scsi_lib.c @@ -1774,7 +1774,7 @@ static void scsi_request_fn(struct request_queue *q) blk_requeue_request(q, req); atomic_dec(&sdev->device_busy); out_delay: - if (atomic_read(&sdev->device_busy) && !scsi_device_blocked(sdev)) + if (!atomic_read(&sdev->device_busy) && !scsi_device_blocked(sdev)) blk_delay_queue(q, SCSI_QUEUE_DELAY); } -- 1.9.1 -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] scsi: Fix qemu boot hang problem
On 08/11/2014 08:33 AM, Venkatesh Srinivas wrote: Should we wrap the sdev->device_busy read and test in a new scsi_device_busy(), so as to preserve the old polarity? I don't see a difference in polarity, or at least replacing '== 0' with '!' is not a reverse in polarity for me. Anyway, I don't really care if or how this is wrapped, as long as it is getting fixed. Thanks, Guenter -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] uas: replace WARN_ON_ONCE() with assert_spin_locked().
On Mon, Aug 11, 2014 at 01:29:26PM +0530, Sanjeev Sharma wrote: > spin_is_locked() always return false in uniprocessor configuration and > therefore it > would be advise to repalce with assert_spin_locked(). > > Signed-off-by: Sanjeev Sharma > --- > drivers/usb/storage/uas.c | 8 > 1 file changed, 4 insertions(+), 4 deletions(-) > > diff --git a/drivers/usb/storage/uas.c b/drivers/usb/storage/uas.c > index 3f42785..8e5877d 100644 > --- a/drivers/usb/storage/uas.c > +++ b/drivers/usb/storage/uas.c > @@ -154,7 +154,7 @@ static void uas_mark_cmd_dead(struct uas_dev_info > *devinfo, > struct scsi_cmnd *cmnd = container_of(scp, struct scsi_cmnd, SCp); > > uas_log_cmd_state(cmnd, caller); > - WARN_ON_ONCE(!spin_is_locked(&devinfo->lock)); > + assert_spin_locked(&devinfo->lock); Seems to me that replacing WARN_ON_ONCE (which may be annoying but only creates a traceback, and only once) with assert_spin_locked (which crashes the kernel) is a bit drastic. Guenter -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] uas: replace WARN_ON_ONCE() with assert_spin_locked().
On Mon, Aug 11, 2014 at 08:55:07PM +0200, Hans de Goede wrote: > Hi, > > On 08/11/2014 08:19 PM, Guenter Roeck wrote: > > On Mon, Aug 11, 2014 at 01:29:26PM +0530, Sanjeev Sharma wrote: > >> spin_is_locked() always return false in uniprocessor configuration and > >> therefore it > >> would be advise to repalce with assert_spin_locked(). > >> > >> Signed-off-by: Sanjeev Sharma > >> --- > >> drivers/usb/storage/uas.c | 8 > >> 1 file changed, 4 insertions(+), 4 deletions(-) > >> > >> diff --git a/drivers/usb/storage/uas.c b/drivers/usb/storage/uas.c > >> index 3f42785..8e5877d 100644 > >> --- a/drivers/usb/storage/uas.c > >> +++ b/drivers/usb/storage/uas.c > >> @@ -154,7 +154,7 @@ static void uas_mark_cmd_dead(struct uas_dev_info > >> *devinfo, > >>struct scsi_cmnd *cmnd = container_of(scp, struct scsi_cmnd, SCp); > >> > >>uas_log_cmd_state(cmnd, caller); > >> - WARN_ON_ONCE(!spin_is_locked(&devinfo->lock)); > >> + assert_spin_locked(&devinfo->lock); > > > > Seems to me that replacing WARN_ON_ONCE (which may be annoying but only > > creates a traceback, and only once) with assert_spin_locked (which > > crashes the kernel) is a bit drastic. > > I can see your point, but so far these paranoia checks have never triggered, > and having them trigger _always_ one some uni-processor (which is the reason > for this patch) to me seems the worse problem of the 2. > If those are just paranoia checks, it might make sense to use lockdep_assert_held() to reduce runtime overhead if lockdep debugging is disabled. > Ideally we would have a warn_spin_not_locked or such ... > There is WARN_ON_SMP, which might have been a better choice if the _ONCE isn't that important (which one should think if it is ok to crash the kernel if the problem is seen). Guenter -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2] uas: replace WARN_ON_ONCE() with lockdep_assert_held()
On Tue, Aug 12, 2014 at 02:01:53PM +0800, Greg KH wrote: > On Tue, Aug 12, 2014 at 11:38:37AM +0530, Sanjeev Sharma wrote: > > spin_is_locked() always return false in uniprocessor configuration and > > therefore it > > would be advise to replace with lockdep_assert_held(). > > Add "on some architectures" in here somewhere, as it's not broken on the > large majority of UP cpus :) > FWIW, it is confirmed broken on mips (32 and 64 bit), ppc, and sparc64. I have not tested on x86. Might be worth trying. arm64 seems to be ok, unless I did something wrong in my test, as well as at least some of the architectures which don't support smp to start with (such as sparc32 or microblaze). Guenter -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] scsi: Fix qemu boot hang problem
ping ... the problem fixed by this patch still affects the upstream kernel (v3.16-11383-gc9d2642) as well as -next (20140815). Guenter On 08/10/2014 05:54 AM, Guenter Roeck wrote: The latest kernel fails to boot qemu arm images when using scsi for disk access. Boot gets stuck after the following messages. brd: module loaded sym53c8xx :00:0c.0: enabling device (0100 -> 0103) sym0: <895a> rev 0x0 at pci :00:0c.0 irq 93 sym0: No NVRAM, ID 7, Fast-40, LVD, parity checking sym0: SCSI BUS has been reset. scsi host0: sym-2.2.3 Bisect points to commit 71e75c97f97a ("scsi: convert device_busy to atomic_t"). Code inspection shows the following suspicious change in scsi_request_fn. out_delay: - if (sdev->device_busy == 0 && !scsi_device_blocked(sdev)) + if (atomic_read(&sdev->device_busy) && !scsi_device_blocked(sdev)) blk_delay_queue(q, SCSI_QUEUE_DELAY); } 'sdev->device_busy == 0' was replaced with 'atomic_read(&sdev->device_busy)', meaning the logic was reversed. Changing this expression to '!atomic_read(&sdev->device_busy)' fixes the problem. Cc: Christoph Hellwig Cc: Hannes Reinecke Cc: Webb Scales Cc: Jens Axboe Signed-off-by: Guenter Roeck --- drivers/scsi/scsi_lib.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c index 9c44392..ce62e87 100644 --- a/drivers/scsi/scsi_lib.c +++ b/drivers/scsi/scsi_lib.c @@ -1774,7 +1774,7 @@ static void scsi_request_fn(struct request_queue *q) blk_requeue_request(q, req); atomic_dec(&sdev->device_busy); out_delay: - if (atomic_read(&sdev->device_busy) && !scsi_device_blocked(sdev)) + if (!atomic_read(&sdev->device_busy) && !scsi_device_blocked(sdev)) blk_delay_queue(q, SCSI_QUEUE_DELAY); } -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] scsi: Fix qemu boot hang problem
On 08/15/2014 12:22 PM, Christoph Hellwig wrote: On Fri, Aug 15, 2014 at 12:34:22PM -0600, Jens Axboe wrote: On 2014-08-15 12:22, Guenter Roeck wrote: ping ... the problem fixed by this patch still affects the upstream kernel (v3.16-11383-gc9d2642) as well as -next (20140815). James, could you upstream this one in time for -rc1? It's in the core-for-3.17 updates I sent to James yesterday, just needs to be forwarded.. Great, thanks a lot! Guenter -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/2] scsi: two SG_CHAIN related fixes
On Tue, Jun 04, 2019 at 04:23:06PM +0800, Ming Lei wrote: > Hi, > > Guenter reported scsi boot issue caused by commit c3288dd8c232 > ("scsi: core: avoid pre-allocating big SGL for data"). > > Turns out there are at least two issues. > > The 1st patch fixes issue in case that NO_SG_CHAIN on some ARCHs, > such as alpha, arm and parisc. > > The 2nd patch makes esp scsi working with SG_CHAIN. > Both patches applied on top of next-20190604. Results on alpha: Waiting for root device /dev/sda... [ cut here ] WARNING: CPU: 0 PID: 7 at mm/slab.h:359 kmem_cache_free+0x120/0x2a0 virt_to_cache: Object is not a Slab page! Modules linked in: CPU: 0 PID: 7 Comm: ksoftirqd/0 Not tainted 5.2.0-rc3-next-20190604-2-gab1fef861ee2 #1 fc000787fb38 fc0007d12658 fc333d50 fc4205b0 fc333dd8 fc0007eb82a8 fc3c3234 fceedb20 0080 0001 0167 fc4205b0 fc4205b0 fc0007d908f8 fccbe231 fc000787fbf8 0018 fc6d8e38 fc0007d908f8 fcb8d610 0001 005a Trace: [] __warn+0x160/0x190 [] kmem_cache_free+0x120/0x2a0 [] warn_slowpath_fmt+0x58/0x70 [] mempool_free_slab+0x24/0x40 [] kmem_cache_free+0x120/0x2a0 [] kmem_cache_free+0x120/0x2a0 [] bio_free+0x98/0xc0 [] mempool_free_slab+0x24/0x40 [] mempool_free+0x48/0x100 [] sg_pool_free+0x9c/0xd0 [] sg_pool_free+0x0/0xd0 [] __sg_free_table+0xb0/0xf0 [] sg_free_table_chained+0x48/0x60 [] scsi_mq_uninit_cmd+0xc0/0xd0 [] scsi_end_request+0x94/0x270 [] scsi_io_completion+0xd0/0x740 [] scsi_finish_command+0x104/0x160 [] scsi_finish_command+0x40/0x160 [] scsi_softirq_done+0x18c/0x1c0 [] blk_done_softirq+0xbc/0xf0 [] run_ksoftirqd+0x48/0x80 [] smpboot_thread_fn+0x184/0x230 [] kthread+0x168/0x1d0 [] smpboot_thread_fn+0x0/0x230 [] ret_from_kernel_thread+0x18/0x20 [] kthread+0x0/0x1d0 ---[ end trace 1fadf5f1c983cdae ]--- scsi 0:0:0:0: Direct-Access QEMU QEMU HARDDISK2.5+ PQ: 0 ANSI: 5 sd 0:0:0:0: Power-on or device reset occurred sd 0:0:0:0: [sda] 20480 512-byte logical blocks: (10.5 MB/10.0 MiB) random: crng init done sd 0:0:0:0: [sda] Write Protect is off sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA sd 0:0:0:0: [sda] Attached SCSI disk EXT4-fs (sda): mounted filesystem without journal. Opts: (null) VFS: Mounted root (ext4 filesystem) readonly on device 8:0. Unable to handle kernel paging request at virtual address kworker/0:1H(45): Oops 1 pc = [] ra = [] ps = Tainted: G W pc is at loop+0x0/0x10 ra is at scsi_queue_rq+0x6fc/0xaa0 v0 = t0 = 0060 t1 = 0004 t2 = 000c t3 = 98010028 t4 = t5 = 0060 t6 = fc0007eb83d4 t7 = fc0007ed s0 = s1 = fc0007eb82c0 s2 = fc0007d72998 s3 = fc0007dd2b28 s4 = fc0007eb83d0 s5 = fc000782a328 s6 = fc000782a328 a0 = a1 = a2 = a3 = a4 = a5 = fc0007edbeb0 t8 = fc0007eb83d3 t9 = 0002 t10= fc0007d12068 t11= pv = fcb7f6e0 at = gp = fcee88d8 sp = f0901f3d Disabling lock debugging due to kernel taint Trace: [] blk_mq_dispatch_rq_list+0x454/0x810 [] blk_mq_dispatch_rq_list+0x3dc/0x810 [] blk_mq_do_dispatch_sched+0xc4/0x170 [] blk_mq_do_dispatch_sched+0x60/0x170 [] blk_mq_do_dispatch_sched+0x90/0x170 [] blk_mq_sched_dispatch_requests+0x140/0x220 [] __blk_mq_run_hw_queue+0x6c/0x250 [] process_one_work+0x1f0/0x520 [] worker_thread+0x60/0x750 [] kthread+0x168/0x1d0 [] worker_thread+0x0/0x750 [] ret_from_kernel_thread+0x18/0x20 [] kthread+0x0/0x1d0 On the plus side, boot tests test are passing with m68k and sparc, so "scsi: esp: make it working on SG_CHAIN" seems to work. Guenter
Re: [PATCH 2/2] scsi: esp: make it working on SG_CHAIN
On Tue, Jun 04, 2019 at 04:23:08PM +0800, Ming Lei wrote: > The driver supporses that there isn't sg chain, and itereate the > list one by one. This way is obviously wrong. > > Fixes it by sgl helper. > > Cc: Christoph Hellwig > Cc: Bart Van Assche > Cc: Ewan D. Milne > Cc: Hannes Reinecke > Cc: Guenter Roeck > Reported-by: Guenter Roeck > Signed-off-by: Ming Lei Tested-by: Guenter Roeck > --- > drivers/scsi/esp_scsi.c | 32 +--- > 1 file changed, 25 insertions(+), 7 deletions(-) > > diff --git a/drivers/scsi/esp_scsi.c b/drivers/scsi/esp_scsi.c > index 76e7ca864d6a..58b4e059dcfb 100644 > --- a/drivers/scsi/esp_scsi.c > +++ b/drivers/scsi/esp_scsi.c > @@ -371,6 +371,7 @@ static void esp_map_dma(struct esp *esp, struct scsi_cmnd > *cmd) > struct esp_cmd_priv *spriv = ESP_CMD_PRIV(cmd); > struct scatterlist *sg = scsi_sglist(cmd); > int total = 0, i; > + struct scatterlist *sgt; > > if (cmd->sc_data_direction == DMA_NONE) > return; > @@ -381,14 +382,15 @@ static void esp_map_dma(struct esp *esp, struct > scsi_cmnd *cmd) >* a dma address, so perform an identity mapping. >*/ > spriv->num_sg = scsi_sg_count(cmd); > - for (i = 0; i < spriv->num_sg; i++) { > - sg[i].dma_address = (uintptr_t)sg_virt(&sg[i]); > - total += sg_dma_len(&sg[i]); > + > + scsi_for_each_sg(cmd, sgt, spriv->num_sg, i) { > + sgt->dma_address = (uintptr_t)sg_virt(sgt); > + total += sg_dma_len(sgt); > } > } else { > spriv->num_sg = scsi_dma_map(cmd); > - for (i = 0; i < spriv->num_sg; i++) > - total += sg_dma_len(&sg[i]); > + scsi_for_each_sg(cmd, sgt, spriv->num_sg, i) > + total += sg_dma_len(sgt); > } > spriv->cur_residue = sg_dma_len(sg); > spriv->cur_sg = sg; > @@ -444,7 +446,7 @@ static void esp_advance_dma(struct esp *esp, struct > esp_cmd_entry *ent, > p->tot_residue = 0; > } > if (!p->cur_residue && p->tot_residue) { > - p->cur_sg++; > + p->cur_sg = sg_next(p->cur_sg); > p->cur_residue = sg_dma_len(p->cur_sg); > } > } > @@ -1610,6 +1612,22 @@ static void esp_msgin_extended(struct esp *esp) > scsi_esp_cmd(esp, ESP_CMD_SATN); > } > > +static struct scatterlist *esp_sg_prev(struct scsi_cmnd *cmd, > + struct scatterlist *sg) > +{ > + int i; > + struct scatterlist *tmp; > + struct scatterlist *prev = NULL; > + > + scsi_for_each_sg(cmd, tmp, scsi_sg_count(cmd), i) { > + if (tmp == sg) > + break; > + prev = tmp; > + } > + > + return prev; > +} > + > /* Analyze msgin bytes received from target so far. Return non-zero > * if there are more bytes needed to complete the message. > */ > @@ -1647,7 +1665,7 @@ static int esp_msgin_process(struct esp *esp) > spriv = ESP_CMD_PRIV(ent->cmd); > > if (spriv->cur_residue == sg_dma_len(spriv->cur_sg)) { > - spriv->cur_sg--; > + spriv->cur_sg = esp_sg_prev(ent->cmd, spriv->cur_sg); > spriv->cur_residue = 1; > } else > spriv->cur_residue++; > -- > 2.20.1 >
Re: [PATCH 1/2] scsi: core: don't pre-allocate small SGL in case of NO_SG_CHAIN
On Tue, Jun 04, 2019 at 04:23:07PM +0800, Ming Lei wrote: > The pre-allocated small SGL depends on SG_CHAIN, so if the ARCH doesn't > support SG_CHAIN, pre-allocation of small SGL can't work at all. > > Fixes this issue by not using small pre-allocation in case of > NO_SG_CHAIN. > > Cc: Christoph Hellwig > Cc: Bart Van Assche > Cc: Ewan D. Milne > Cc: Hannes Reinecke > Cc: Guenter Roeck > Fixes: c3288dd8c232 ("scsi: core: avoid pre-allocating big SGL for data") > Reported-by: Guenter Roeck > Signed-off-by: Ming Lei alpha images still crash with this patch applied (see response to patch 0/2) for traceback). Guenter
Re: [PATCH 0/2] scsi: two SG_CHAIN related fixes
On Tue, Jun 04, 2019 at 11:18:44PM +0800, Ming Lei wrote: > Hi Guenter, > > Thanks for your test and report! > > On Tue, Jun 04, 2019 at 06:38:49AM -0700, Guenter Roeck wrote: > > On Tue, Jun 04, 2019 at 04:23:06PM +0800, Ming Lei wrote: > > > Hi, > > > > > > Guenter reported scsi boot issue caused by commit c3288dd8c232 > > > ("scsi: core: avoid pre-allocating big SGL for data"). > > > > > > Turns out there are at least two issues. > > > > > > The 1st patch fixes issue in case that NO_SG_CHAIN on some ARCHs, > > > such as alpha, arm and parisc. > > > > > > The 2nd patch makes esp scsi working with SG_CHAIN. > > > > > > > Both patches applied on top of next-20190604. > > > > Results on alpha: > > > > Waiting for root device /dev/sda... > > [ cut here ] > > WARNING: CPU: 0 PID: 7 at mm/slab.h:359 kmem_cache_free+0x120/0x2a0 > > virt_to_cache: Object is not a Slab page! > > Please apply the following patch against the posted two: > > diff --git a/lib/sg_pool.c b/lib/sg_pool.c > index 47eecbe094d8..e042a1722615 100644 > --- a/lib/sg_pool.c > +++ b/lib/sg_pool.c > @@ -122,7 +122,7 @@ int sg_alloc_table_chained(struct sg_table *table, int > nents, > } > > /* User supposes that the 1st SGL includes real entry */ > - if (nents_first_chunk == 1) { > + if (nents_first_chunk <= 1) { > first_chunk = NULL; > nents_first_chunk = 0; > } > That fixes the problem. Thanks, Guenter
Re: [PATCH V2 0/3] scsi: three SG_CHAIN related fixes
On Wed, Jun 05, 2019 at 09:06:20AM +0800, Ming Lei wrote: > Hi, > > Guenter reported scsi boot issue caused by commit c3288dd8c232 > ("scsi: core: avoid pre-allocating big SGL for data"). > > Turns out there are at least three issues. > > The 1st patch fixes sg_alloc_table_chained() which may try to use > the pre-allocation SGL even though user passes zero to 'nents_first_chunk'. > > The 2nd patch fixes issue in case that NO_SG_CHAIN on some ARCHs, > such as alpha, arm and parisc. > > The 3rd patch makes esp scsi working with SG_CHAIN. > > V2: > - add the patch1, which is verified by Guenter > - add .prv_sg to store the previous sg for esp_scsi > > Ming Lei (3): > scsi: lib/sg_pool.c: clear 'first_chunk' in case of no pre-allocation > scsi: core: don't pre-allocate small SGL in case of NO_SG_CHAIN > scsi: esp: make it working on SG_CHAIN > Running my tests on next-20190605, I get: Qemu test results: total: 349 pass: 296 fail: 53 The same tests on next-20190605 plus this series results in: Qemu test results: total: 349 pass: 347 fail: 2 Failed tests: sh:rts7751r2dplus_defconfig:usb:rootfs sh:rts7751r2dplus_defconfig:usb-hub:rootfs The remaining failures are consistent across re-runs. The failure is only seen when booting from usb using the sm501-usb controller (see below). usb 1-2.1: reset full-speed USB device number 4 using sm501-usb sd 1:0:0:0: [sda] tag#0 UNKNOWN(0x2003) Result: hostbyte=0x03 driverbyte=0x00 sd 1:0:0:0: [sda] tag#0 CDB: opcode=0x28 28 00 00 00 08 7c 00 00 f0 00 print_req_error: I/O error, dev sda, sector 2172 flags 80700 usb 1-2.1: reset full-speed USB device number 4 using sm501-usb sd 1:0:0:0: [sda] tag#0 UNKNOWN(0x2003) Result: hostbyte=0x03 driverbyte=0x00 sd 1:0:0:0: [sda] tag#0 CDB: opcode=0x28 28 00 00 00 01 da 00 00 f0 00 print_req_error: I/O error, dev sda, sector 474 flags 84700 usb 1-2.1: reset full-speed USB device number 4 using sm501-usb sd 1:0:0:0: [sda] tag#0 UNKNOWN(0x2003) Result: hostbyte=0x03 driverbyte=0x00 sd 1:0:0:0: [sda] tag#0 CDB: opcode=0x28 28 00 00 00 02 da 00 00 f0 00 print_req_error: I/O error, dev sda, sector 730 flags 84700 usb 1-2.1: reset full-speed USB device number 4 using sm501-usb sd 1:0:0:0: [sda] tag#0 UNKNOWN(0x2003) Result: hostbyte=0x03 driverbyte=0x00 sd 1:0:0:0: [sda] tag#0 CDB: opcode=0x28 28 00 00 00 0b 50 00 00 f0 00 print_req_error: I/O error, dev sda, sector 2896 flags 84700 Presumably that means that either the sm501-usb emulation has a subtle bug associated with SG, or something is wrong with the sm501-usb driver. The qemu command line in the failure case is: qemu-system-sh4 -M r2d \ -kernel ./arch/sh/boot/zImage \ -snapshot \ -usb -device usb-storage,drive=d0 \ -drive file=rootfs.ext2,if=none,id=d0,format=raw \ -append 'panic=-1 slub_debug=FZPUA root=/dev/sda rootwait console=ttySC1,115200 earlycon=scif,mmio16,0xffe8 noiotrap' \ -serial null -serial stdio \ -net nic,model=rtl8139 -net user -nographic -monitor null I'll be happy to provide kernel configuration and a pointer to the root file system if needed. Thanks, Guenter
Re: [PATCH V2 0/3] scsi: three SG_CHAIN related fixes
On Wed, Jun 05, 2019 at 09:25:37AM -0700, Guenter Roeck wrote: > On Wed, Jun 05, 2019 at 09:06:20AM +0800, Ming Lei wrote: > > Hi, > > > > Guenter reported scsi boot issue caused by commit c3288dd8c232 > > ("scsi: core: avoid pre-allocating big SGL for data"). > > > > Turns out there are at least three issues. > > > > The 1st patch fixes sg_alloc_table_chained() which may try to use > > the pre-allocation SGL even though user passes zero to 'nents_first_chunk'. > > > > The 2nd patch fixes issue in case that NO_SG_CHAIN on some ARCHs, > > such as alpha, arm and parisc. > > > > The 3rd patch makes esp scsi working with SG_CHAIN. > > > > V2: > > - add the patch1, which is verified by Guenter > > - add .prv_sg to store the previous sg for esp_scsi > > > > Ming Lei (3): > > scsi: lib/sg_pool.c: clear 'first_chunk' in case of no pre-allocation > > scsi: core: don't pre-allocate small SGL in case of NO_SG_CHAIN > > scsi: esp: make it working on SG_CHAIN > > > > Running my tests on next-20190605, I get: > > Qemu test results: > total: 349 pass: 296 fail: 53 > > The same tests on next-20190605 plus this series results in: > > Qemu test results: > total: 349 pass: 347 fail: 2 > Failed tests: > sh:rts7751r2dplus_defconfig:usb:rootfs > sh:rts7751r2dplus_defconfig:usb-hub:rootfs > > The remaining failures are consistent across re-runs. The failure is only > seen when booting from usb using the sm501-usb controller (see below). > > usb 1-2.1: reset full-speed USB device number 4 using sm501-usb > sd 1:0:0:0: [sda] tag#0 UNKNOWN(0x2003) Result: hostbyte=0x03 driverbyte=0x00 > sd 1:0:0:0: [sda] tag#0 CDB: opcode=0x28 28 00 00 00 08 7c 00 00 f0 00 > print_req_error: I/O error, dev sda, sector 2172 flags 80700 > usb 1-2.1: reset full-speed USB device number 4 using sm501-usb > sd 1:0:0:0: [sda] tag#0 UNKNOWN(0x2003) Result: hostbyte=0x03 driverbyte=0x00 > sd 1:0:0:0: [sda] tag#0 CDB: opcode=0x28 28 00 00 00 01 da 00 00 f0 00 > print_req_error: I/O error, dev sda, sector 474 flags 84700 > usb 1-2.1: reset full-speed USB device number 4 using sm501-usb > sd 1:0:0:0: [sda] tag#0 UNKNOWN(0x2003) Result: hostbyte=0x03 driverbyte=0x00 > sd 1:0:0:0: [sda] tag#0 CDB: opcode=0x28 28 00 00 00 02 da 00 00 f0 00 > print_req_error: I/O error, dev sda, sector 730 flags 84700 > usb 1-2.1: reset full-speed USB device number 4 using sm501-usb > sd 1:0:0:0: [sda] tag#0 UNKNOWN(0x2003) Result: hostbyte=0x03 driverbyte=0x00 > sd 1:0:0:0: [sda] tag#0 CDB: opcode=0x28 28 00 00 00 0b 50 00 00 f0 00 > print_req_error: I/O error, dev sda, sector 2896 flags 84700 > > Presumably that means that either the sm501-usb emulation has a subtle bug > associated with SG, or something is wrong with the sm501-usb driver. > Turns out the above is caused by other (unrelated) patches in the sm501 USB driver. After reverting those, everything is fine. Sorry for the noise. Please feel free to add Tested-by: Guenter Roeck to all three patches. Thanks, Guenter > The qemu command line in the failure case is: > > qemu-system-sh4 -M r2d \ > -kernel ./arch/sh/boot/zImage \ > -snapshot \ > -usb -device usb-storage,drive=d0 \ > -drive file=rootfs.ext2,if=none,id=d0,format=raw \ > -append 'panic=-1 slub_debug=FZPUA root=/dev/sda rootwait > console=ttySC1,115200 earlycon=scif,mmio16,0xffe8 noiotrap' \ > -serial null -serial stdio \ > -net nic,model=rtl8139 -net user -nographic -monitor null > > I'll be happy to provide kernel configuration and a pointer to the root file > system if needed. > > Thanks, > Guenter
Re: [PATCH 15/22] hwmon: applesmc: fix format string overflow
On 07/14/2017 05:07 AM, Arnd Bergmann wrote: gcc-7 warns that the key might exceed five bytes for lage index values: drivers/hwmon/applesmc.c: In function 'applesmc_show_fan_position': drivers/hwmon/applesmc.c:906:18: error: '%d' directive writing between 1 and 5 bytes into a region of size 4 [-Werror=format-overflow=] sprintf(newkey, FAN_ID_FMT, to_index(attr)); ^~~ drivers/hwmon/applesmc.c:906:18: note: directive argument in the range [0, 65535] drivers/hwmon/applesmc.c:906:2: note: 'sprintf' output between 5 and 9 bytes into a destination of size 5 As the key is required to be four characters plus trailing zero, we know that the index has to be small here. I'm using snprintf() to avoid the warning. This would truncate the string instead of overflowing. Signed-off-by: Arnd Bergmann I submitted a more comprehensive patch a couple of days ago. There are other similar sprintf() calls in the driver which gcc doesn't report. Guenter --- drivers/hwmon/applesmc.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/hwmon/applesmc.c b/drivers/hwmon/applesmc.c index 0af7fd311979..515163b9a89f 100644 --- a/drivers/hwmon/applesmc.c +++ b/drivers/hwmon/applesmc.c @@ -903,7 +903,7 @@ static ssize_t applesmc_show_fan_position(struct device *dev, char newkey[5]; u8 buffer[17]; - sprintf(newkey, FAN_ID_FMT, to_index(attr)); + snprintf(newkey, sizeof(newkey), FAN_ID_FMT, to_index(attr)); ret = applesmc_read_key(newkey, buffer, 16); buffer[16] = 0;
Re: [PATCH v7] scsi: Add hwmon support for SMART temperature sensors
[-cc] Hi Linus, On 11/22/18 5:49 AM, Linus Walleij wrote: [ ... ] (It's also worth noting that HDD temperature sensors are notoriously unreliable). I am sorry if you think that D-Link does bad engineering, what I am trying to achieve is upstream support for this device, without any out-of-tree patches. The D-Link DIR-685 uses the harddisk sensor for this, whether we like it or not. Following the above argument (not yours, the one about accuracy), I guess we should drop support for all CPU temperature sensors from the Linux kernel. After all, they are known to be much more inaccurate than a HDD temperature sensor could ever be. Seriously, any argument about (lack of) sensor accuracy should be silently ignored. I may be overly pessimistic nowadays, but whenever I see such an argument, I read it as "we'll never accept your patch, so better stop wasting your time and forget about it". I hope I am wrong, but it seems to me that this is where things are going. Can you possibly extract this as pure hwmon driver outside scsi control ? I'll be happy to accept it as standalone hwmon driver. Thanks, Guenter
Re: [PATCH v7] scsi: Add hwmon support for SMART temperature sensors
On 11/23/18 12:18 AM, Linus Walleij wrote: On Thu, Nov 22, 2018 at 9:00 PM Guenter Roeck wrote: Can you possibly extract this as pure hwmon driver outside scsi control ? I'll be happy to accept it as standalone hwmon driver. That should be last resort, what the SCSI people want is noble, and they did a tremendous (impressive) work by hiding all the ATA drives behind SCSI emulation with libata, so they want me to keep up that tradition by also making the temperature reading behave "as if it was a SCSI drive" too so I'm on board with trying that out even if I think the bar is a bit high for causal contributors. No problem, your call. I would just very much dislike your work to get lost, that is all. Thanks, Guenter
kobject lifetime issues in blk-mq
Hi, Images with CONFIG_DEBUG_KOBJECT=y, CONFIG_DEBUG_KOBJECT_RELEASE=y enabled report kobject lifetime issues when booting an image in qemu using virtio-scsi-pci. Bisect points to two different commits, depending on the kernel configuration. 1st configuration: defconfig plus: CONFIG_VIRTIO_BLK=y CONFIG_SCSI_LOWLEVEL=y CONFIG_SCSI_VIRTIO=y CONFIG_VIRTIO_PCI=y CONFIG_VIRTIO_MMIO=y CONFIG_DEBUG_OBJECTS=y CONFIG_DEBUG_OBJECTS_FREE=y CONFIG_DEBUG_OBJECTS_TIMERS=y CONFIG_DEBUG_KOBJECT=y CONFIG_DEBUG_KOBJECT_RELEASE=y Bisecting this configuration points to commit b5b6e8c8d3b4 ("scsi: virtio_scsi: fix IO hang caused by automatic irq vector affinity") as the culprit. This commit enforces the use of blk-mq for virtio_scsi; the bisect log is therefore a bit misleading. 2nd configuration: As above, plus: CONFIG_SCSI_MQ_DEFAULT=y With this configuration, bisect points to commit 7ea5fe31c12d ("blk-mq: make lifetime consitent between q/ctx and its kobject"). The qemu command line used to reproduce the problem is as follows. The qemu version does not seem to matter (I tested with qemu versions 2.5, 2.11, and 3.0). qemu-system-x86_64 \ -kernel arch/x86/boot/bzImage \ -device virtio-scsi-pci,id=scsi \ -device scsi-hd,bus=scsi.0,drive=d0 \ -drive file=./wheezy.img,format=raw,if=none,id=d0 \ -snapshot \ -m 2G -smp 4 -enable-kvm \ -net user,host=10.0.2.10,hostfwd=tcp::10022-:22 -net nic \ -nographic -monitor none \ -no-reboot \ -append "console=ttyS0 root=/dev/sda earlyprintk=serial panic_on_warn=1 panic=-1" Root file system: https://storage.googleapis.com/syzkaller/wheezy.img or: https://github.com/groeck/linux-build-test/blob/master/rootfs/x86_64/rootfs.ext2.gz It seems that any root file system can be used to reproduce the problem. Also, the problem is not limited to virtio-scsi-pci. I also tried am53c974, lsi53c810, and lsi53c895a (after enabling the respective drivers), with the same result. Overall, this suggests that the problem is related to blk-mq and was indeed introduced with commit 7ea5fe31c12d. For reference, I attached an actual crash log as well as a set of bisect logs below. Unfortunately, my understanding of blk-mq is not good enough to find the underlying problem and suggest a fix. Please let me know if there is anything else I can do to help fixing the problem. Note that the problem was originally reported by syzbot running a test on chromeos-4.14. It may well be that the problem has already been reported and is being worked on. If so, please apologize the noise. Thanks, Guenter --- [8.700038] [ cut here ] [8.700376] ODEBUG: free active (active state 0) object type: timer_list hint: delayed_work_timer_fn+0x0/0x10 [8.701032] WARNING: CPU: 0 PID: 0 at lib/debugobjects.c:329 debug_print_object+0x6a/0x80 [8.701032] Kernel panic - not syncing: panic_on_warn set ... [8.701032] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.20.0-rc1 #30 [8.701032] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014 [8.701032] Call Trace: [8.701032] [8.701032] dump_stack+0x46/0x5b [8.701032] panic+0xf3/0x246 [8.701032] ? debug_print_object+0x6a/0x80 [8.701032] __warn+0xeb/0xf0 [8.701032] ? debug_print_object+0x6a/0x80 [8.701032] ? debug_print_object+0x6a/0x80 [8.701032] report_bug+0xb1/0x120 [8.701032] fixup_bug.part.10+0x13/0x30 [8.701032] do_error_trap+0x8f/0xb0 [8.701032] do_invalid_op+0x31/0x40 [8.701032] ? debug_print_object+0x6a/0x80 [8.701032] invalid_op+0x14/0x20 [8.701032] RIP: 0010:debug_print_object+0x6a/0x80 [8.701032] Code: 8b 43 10 83 c2 01 8b 4b 14 89 15 c1 e2 78 01 4c 8b 45 00 4c 89 e6 48 c7 c7 f0 ca 41 a6 48 8b 14 c5 a0 f7 24 a6 e8 06 8e cc ff <0f> 0b 5b 83 05 70 47 1a 01 01 5d 41 5c c3 83 05 65 47 1a 01 01 c3 [8.701032] RSP: 0018:89b9fda03e48 EFLAGS: 00010082 [8.701032] RAX: RBX: 89b9fd49d0c8 RCX: a6646e18 [8.701032] RDX: 0001 RSI: 0082 RDI: a6cc596c [8.701032] RBP: a664a5c0 R08: 79616c6564203a74 R09: 5f6b726f775f6465 [8.701032] R10: 89b9fda03f10 R11: 3178302f3078302b R12: a6403e22 [8.701032] R13: 0001 R14: a6d4d458 R15: 89b9fc4c8780 [8.701032] ? debug_print_object+0x6a/0x80 [8.701032] debug_check_no_obj_freed+0x1b8/0x1ea [8.701032] ? rcu_process_callbacks+0x2a7/0x750 [8.701032] kmem_cache_free+0x6e/0x1a0 [8.701032] ? __blk_release_queue+0x150/0x150 [8.701032] rcu_process_callbacks+0x2a7/0x750 [8.701032] __do_softirq+0xf2/0x2c7 [8.701032] irq_exit+0xb7/0xc0 [8.701032] smp_apic_timer_interrupt+0x67/0x130 [8.701032] apic_timer_interrupt+0xf/0x20 [8.701032] [8.701032] RIP: 0010:default_idle+0x12/0x140 [8.701032] Code: fe ff ff c6 43 08 00 fb eb b0
Re: kobject lifetime issues in blk-mq
On Mon, Nov 12, 2018 at 05:20:52PM +0800, Ming Lei wrote: > On Fri, Nov 09, 2018 at 12:35:18PM -0800, Guenter Roeck wrote: > > Hi, > > > > Images with CONFIG_DEBUG_KOBJECT=y, CONFIG_DEBUG_KOBJECT_RELEASE=y > > enabled report kobject lifetime issues when booting an image in qemu > > using virtio-scsi-pci. > > > > Bisect points to two different commits, depending on the kernel > > configuration. > > > > 1st configuration: defconfig plus: > > > > CONFIG_VIRTIO_BLK=y > > CONFIG_SCSI_LOWLEVEL=y > > CONFIG_SCSI_VIRTIO=y > > CONFIG_VIRTIO_PCI=y > > CONFIG_VIRTIO_MMIO=y > > CONFIG_DEBUG_OBJECTS=y > > CONFIG_DEBUG_OBJECTS_FREE=y > > CONFIG_DEBUG_OBJECTS_TIMERS=y > > CONFIG_DEBUG_KOBJECT=y > > CONFIG_DEBUG_KOBJECT_RELEASE=y > > > > Bisecting this configuration points to commit b5b6e8c8d3b4 ("scsi: > > virtio_scsi: fix IO hang caused by automatic irq vector affinity") > > as the culprit. This commit enforces the use of blk-mq for virtio_scsi; > > the bisect log is therefore a bit misleading. > > > > 2nd configuration: As above, plus: > > CONFIG_SCSI_MQ_DEFAULT=y > > > > With this configuration, bisect points to commit 7ea5fe31c12d > > ("blk-mq: make lifetime consitent between q/ctx and its kobject"). > > > > The qemu command line used to reproduce the problem is as follows. > > The qemu version does not seem to matter (I tested with qemu versions > > 2.5, 2.11, and 3.0). > > > > qemu-system-x86_64 \ > > -kernel arch/x86/boot/bzImage \ > > -device virtio-scsi-pci,id=scsi \ > > -device scsi-hd,bus=scsi.0,drive=d0 \ > > -drive file=./wheezy.img,format=raw,if=none,id=d0 \ > > -snapshot \ > > -m 2G -smp 4 -enable-kvm \ > > -net user,host=10.0.2.10,hostfwd=tcp::10022-:22 -net nic \ > > -nographic -monitor none \ > > -no-reboot \ > > -append "console=ttyS0 root=/dev/sda earlyprintk=serial panic_on_warn=1 > > panic=-1" > > > > Root file system: > > https://storage.googleapis.com/syzkaller/wheezy.img > > or: > > > > https://github.com/groeck/linux-build-test/blob/master/rootfs/x86_64/rootfs.ext2.gz > > > > It seems that any root file system can be used to reproduce the problem. > > Also, the problem is not limited to virtio-scsi-pci. I also tried > > am53c974, lsi53c810, and lsi53c895a (after enabling the respective > > drivers), with the same result. > > > > Overall, this suggests that the problem is related to blk-mq and was > > indeed introduced with commit 7ea5fe31c12d. > > > > For reference, I attached an actual crash log as well as a set of bisect > > logs below. > > > > Unfortunately, my understanding of blk-mq is not good enough to find the > > underlying problem and suggest a fix. Please let me know if there is > > anything else I can do to help fixing the problem. > > > > Note that the problem was originally reported by syzbot running a test > > on chromeos-4.14. It may well be that the problem has already been > > reported and is being worked on. If so, please apologize the noise. > > > > Thanks, > > Guenter > > > > --- > > [8.700038] [ cut here ] > > [8.700376] ODEBUG: free active (active state 0) object type: timer_list > > hint: delayed_work_timer_fn+0x0/0x10 > > [8.701032] WARNING: CPU: 0 PID: 0 at lib/debugobjects.c:329 > > debug_print_object+0x6a/0x80 > > [8.701032] Kernel panic - not syncing: panic_on_warn set ... > > [8.701032] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.20.0-rc1 #30 > > [8.701032] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS > > Ubuntu-1.8.2-1ubuntu1 04/01/2014 > > [8.701032] Call Trace: > > [8.701032] > > [8.701032] dump_stack+0x46/0x5b > > [8.701032] panic+0xf3/0x246 > > [8.701032] ? debug_print_object+0x6a/0x80 > > [8.701032] __warn+0xeb/0xf0 > > [8.701032] ? debug_print_object+0x6a/0x80 > > [8.701032] ? debug_print_object+0x6a/0x80 > > [8.701032] report_bug+0xb1/0x120 > > [8.701032] fixup_bug.part.10+0x13/0x30 > > [8.701032] do_error_trap+0x8f/0xb0 > > [8.701032] do_invalid_op+0x31/0x40 > > [8.701032] ? debug_print_object+0x6a/0x80 > > [8.701032] invalid_op+0x14/0x20 > > [8.701032] RIP: 0010:debug_print_object+0x6a/0x80 > > [8.701032] Code: 8b 43 10 83 c2 01 8b 4b 14 89 15 c1 e2 78 01 4c 8b 45 > > 00 4c > > 89 e
Re: kobject lifetime issues in blk-mq
On Mon, Nov 12, 2018 at 05:44:07PM +0800, Ming Lei wrote: > On Mon, Nov 12, 2018 at 05:20:51PM +0800, Ming Lei wrote: > > On Fri, Nov 09, 2018 at 12:35:18PM -0800, Guenter Roeck wrote: > > > Hi, > > > > > > Images with CONFIG_DEBUG_KOBJECT=y, CONFIG_DEBUG_KOBJECT_RELEASE=y > > > enabled report kobject lifetime issues when booting an image in qemu > > > using virtio-scsi-pci. > > > > > > Bisect points to two different commits, depending on the kernel > > > configuration. > > > > > > 1st configuration: defconfig plus: > > > > > > CONFIG_VIRTIO_BLK=y > > > CONFIG_SCSI_LOWLEVEL=y > > > CONFIG_SCSI_VIRTIO=y > > > CONFIG_VIRTIO_PCI=y > > > CONFIG_VIRTIO_MMIO=y > > > CONFIG_DEBUG_OBJECTS=y > > > CONFIG_DEBUG_OBJECTS_FREE=y > > > CONFIG_DEBUG_OBJECTS_TIMERS=y > > > CONFIG_DEBUG_KOBJECT=y > > > CONFIG_DEBUG_KOBJECT_RELEASE=y > > > > > > Bisecting this configuration points to commit b5b6e8c8d3b4 ("scsi: > > > virtio_scsi: fix IO hang caused by automatic irq vector affinity") > > > as the culprit. This commit enforces the use of blk-mq for virtio_scsi; > > > the bisect log is therefore a bit misleading. > > > > > > 2nd configuration: As above, plus: > > > CONFIG_SCSI_MQ_DEFAULT=y > > > > > > With this configuration, bisect points to commit 7ea5fe31c12d > > > ("blk-mq: make lifetime consitent between q/ctx and its kobject"). > > > > > > The qemu command line used to reproduce the problem is as follows. > > > The qemu version does not seem to matter (I tested with qemu versions > > > 2.5, 2.11, and 3.0). > > > > > > qemu-system-x86_64 \ > > > -kernel arch/x86/boot/bzImage \ > > > -device virtio-scsi-pci,id=scsi \ > > > -device scsi-hd,bus=scsi.0,drive=d0 \ > > > -drive file=./wheezy.img,format=raw,if=none,id=d0 \ > > > -snapshot \ > > > -m 2G -smp 4 -enable-kvm \ > > > -net user,host=10.0.2.10,hostfwd=tcp::10022-:22 -net nic \ > > > -nographic -monitor none \ > > > -no-reboot \ > > > -append "console=ttyS0 root=/dev/sda earlyprintk=serial panic_on_warn=1 > > > panic=-1" > > > > > > Root file system: > > > https://storage.googleapis.com/syzkaller/wheezy.img > > > or: > > > > > > https://github.com/groeck/linux-build-test/blob/master/rootfs/x86_64/rootfs.ext2.gz > > > > > > It seems that any root file system can be used to reproduce the problem. > > > Also, the problem is not limited to virtio-scsi-pci. I also tried > > > am53c974, lsi53c810, and lsi53c895a (after enabling the respective > > > drivers), with the same result. > > > > > > Overall, this suggests that the problem is related to blk-mq and was > > > indeed introduced with commit 7ea5fe31c12d. > > > > > > For reference, I attached an actual crash log as well as a set of bisect > > > logs below. > > > > > > Unfortunately, my understanding of blk-mq is not good enough to find the > > > underlying problem and suggest a fix. Please let me know if there is > > > anything else I can do to help fixing the problem. > > > > > > Note that the problem was originally reported by syzbot running a test > > > on chromeos-4.14. It may well be that the problem has already been > > > reported and is being worked on. If so, please apologize the noise. > > > > > > Thanks, > > > Guenter > > > > > > --- > > > [8.700038] [ cut here ] > > > [8.700376] ODEBUG: free active (active state 0) object type: > > > timer_list hint: delayed_work_timer_fn+0x0/0x10 > > > [8.701032] WARNING: CPU: 0 PID: 0 at lib/debugobjects.c:329 > > > debug_print_object+0x6a/0x80 > > > [8.701032] Kernel panic - not syncing: panic_on_warn set ... > > > [8.701032] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.20.0-rc1 #30 > > > [8.701032] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), > > > BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014 > > > [8.701032] Call Trace: > > > [8.701032] > > > [8.701032] dump_stack+0x46/0x5b > > > [8.701032] panic+0xf3/0x246 > > > [8.701032] ? debug_print_object+0x6a/0x80 > > > [8.701032] __warn+0xeb/0xf0 > > > [8.701032] ? debug_print_object+0x6a/0x80 > > > [8.701
Re: BUG in scsi_lib.c due to a bad commit
On 11/11/2014 04:17 PM, Bjorn Helgaas wrote: [+cc Guenter, linux-scsi] On Tue, Nov 11, 2014 at 4:33 PM, Barto wrote: Hello everyone, I notice a bug since kernel 3.17 ( and also with 3.18 branch ), a random hang at boot on some PC configurations, I did a "git bisect" and I found that the culprit is : [045065d8a300a37218c548e9aa7becd581c6a0e8] [SCSI] fix qemu boot hang problem http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=045065d8a300a37218c548e9aa7becd581c6a0e8 the author of this commit has choosen to inverse the logic of the if statement in the file drivers/scsi/scsi_lib.c in order to solve an issue with qemu : --- a/drivers/scsi/scsi_lib.c +++ b/drivers/scsi/scsi_lib.c @@ -1774,7 +1774,7 @@ static void scsi_request_fn(struct request_queue *q) blk_requeue_request(q, req); atomic_dec(&sdev->device_busy); out_delay: - if (atomic_read(&sdev->device_busy) && !scsi_device_blocked(sdev)) + if (!atomic_read(&sdev->device_busy) && !scsi_device_blocked(sdev)) blk_delay_queue(q, SCSI_QUEUE_DELAY); } The above commit was a follow-up to commit 71e75c97f97a, which did the following: - if (sdev->device_busy == 0 && !scsi_device_blocked(sdev)) + if (atomic_read(&sdev->device_busy) && !scsi_device_blocked(sdev)) meaning it reversed the polarity of the check in the original code. Since that commit created problems as observed in qemu, I thought it would be obvious that restoring the original polarity would make sense. This is explained in my patch, so I am somewhat at loss why ... this change triggers a bug on my PC ( I don't have SCSI devices, but only 3 SATA harddisks and 2 IDE harddisks, SATA disks are on an ICH7 sata controler on the motherboard gigabyte GA-P31-DS3L, and IDE disk on a JMicron JMB363/368 Sata/IDE PCIe card ), every 5~10 boots the boot stops suddenly because of this commit, If I revert this commit then the bug is gone, more details can be found here, where I created a patch who reverts commit 045065d8 : https://bugzilla.kernel.org/show_bug.cgi?id=87581 my question: why Guenter Roeck ( the author of the bad commit ) has choosen to inverse the logic in the if statement ? this question is asked. Having said that, I don't really care one way or another. I am fine with reverting my commit; if that is done, I'll simply remove the related scsi tests from my qemu test runs. Guenter before his commit the if statement was like this : if (atomic_read(&sdev->device_busy) && !scsi_device_blocked(sdev)) blk_delay_queue(q, SCSI_QUEUE_DELAY); if his decision to inverse the logic of "atomic_read(&sdev->device_busy)" is acceptable then the real bug is probably located elsewhere in the scsi source code, and we must solve this mistery because there is obviously a bug regression in SCSI code because with older kernels ( 3.16.7 and lower ) I don't have the random hang boot bug with my configuration, another user in archlinux forums has also this bug and he has a more modern PC ( intel i7 core cpu, SSD device ), my fear is when linux distros will move to kernel 3.17 then more users will have this weird random bug who can occur only on boot and only with a specific PC configuration, if the boot step is passed despite the random bug then the bug will not occur, it occurs only during the boot process, which probably means that the faulty source code is only called during the boot process, thanks for anyone who wants to dig this problem with me -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/ -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] cxlflash: Select IRQ_POLL
The driver now uses IRQ_POLL and needs to select it to avoid the following build error. ERROR: ".irq_poll_complete" [drivers/scsi/cxlflash/cxlflash.ko] undefined! ERROR: ".irq_poll_sched" [drivers/scsi/cxlflash/cxlflash.ko] undefined! ERROR: ".irq_poll_disable" [drivers/scsi/cxlflash/cxlflash.ko] undefined! ERROR: ".irq_poll_init" [drivers/scsi/cxlflash/cxlflash.ko] undefined! Fixes: cba06e6de403 ("scsi: cxlflash: Implement IRQ polling for RRQ processing") Signed-off-by: Guenter Roeck --- drivers/scsi/cxlflash/Kconfig | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/scsi/cxlflash/Kconfig b/drivers/scsi/cxlflash/Kconfig index c052104e523e..a011c5dbf214 100644 --- a/drivers/scsi/cxlflash/Kconfig +++ b/drivers/scsi/cxlflash/Kconfig @@ -5,6 +5,7 @@ config CXLFLASH tristate "Support for IBM CAPI Flash" depends on PCI && SCSI && CXL && EEH + select IRQ_POLL default m help Allows CAPI Accelerated IO to Flash -- 2.7.4
Re: [PATCH 09/13] timer: Remove users of expire and data arguments to DEFINE_TIMER
On 10/04/2017 04:27 PM, Kees Cook wrote: The expire and data arguments of DEFINE_TIMER are only used in two places and are ignored by the code (malta-display.c only uses mod_timer(), never add_timer(), so the preset expires value is ignored). Set both sets of arguments to zero. Cc: Ralf Baechle Cc: Wim Van Sebroeck Cc: Guenter Roeck Cc: Geert Uytterhoeven Cc: linux-m...@linux-mips.org Cc: linux-watch...@vger.kernel.org Signed-off-by: Kees Cook For watchdog: Acked-by: Guenter Roeck --- arch/mips/mti-malta/malta-display.c | 6 +++--- drivers/watchdog/alim7101_wdt.c | 4 ++-- 2 files changed, 5 insertions(+), 5 deletions(-) diff --git a/arch/mips/mti-malta/malta-display.c b/arch/mips/mti-malta/malta-display.c index d4f807191ecd..ac813158b9b8 100644 --- a/arch/mips/mti-malta/malta-display.c +++ b/arch/mips/mti-malta/malta-display.c @@ -36,10 +36,10 @@ void mips_display_message(const char *str) } } -static void scroll_display_message(unsigned long data); -static DEFINE_TIMER(mips_scroll_timer, scroll_display_message, HZ, 0); +static void scroll_display_message(unsigned long unused); +static DEFINE_TIMER(mips_scroll_timer, scroll_display_message, 0, 0); -static void scroll_display_message(unsigned long data) +static void scroll_display_message(unsigned long unused) { mips_display_message(&display_string[display_count++]); if (display_count == max_display_count) diff --git a/drivers/watchdog/alim7101_wdt.c b/drivers/watchdog/alim7101_wdt.c index 665e0e7dfe1e..3c1f6ac68ea9 100644 --- a/drivers/watchdog/alim7101_wdt.c +++ b/drivers/watchdog/alim7101_wdt.c @@ -71,7 +71,7 @@ MODULE_PARM_DESC(use_gpio, "Use the gpio watchdog (required by old cobalt boards)."); static void wdt_timer_ping(unsigned long); -static DEFINE_TIMER(timer, wdt_timer_ping, 0, 1); +static DEFINE_TIMER(timer, wdt_timer_ping, 0, 0); static unsigned long next_heartbeat; static unsigned long wdt_is_open; static char wdt_expect_close; @@ -87,7 +87,7 @@ MODULE_PARM_DESC(nowayout, *Whack the dog */ -static void wdt_timer_ping(unsigned long data) +static void wdt_timer_ping(unsigned long unused) { /* If we got a heartbeat pulse within the WDT_US_INTERVAL * we agree to ping the WDT
Re: [PATCH 10/13] timer: Remove expires and data arguments from DEFINE_TIMER
On 10/04/2017 04:27 PM, Kees Cook wrote: Drop the arguments from the macro and adjust all callers with the following script: perl -pi -e 's/DEFINE_TIMER\((.*), 0, 0\);/DEFINE_TIMER($1);/g;' \ $(git grep DEFINE_TIMER | cut -d: -f1 | sort -u | grep -v timer.h) Signed-off-by: Kees Cook Acked-by: Geert Uytterhoeven # for m68k parts For watchdog: Acked-by: Guenter Roeck --- arch/arm/mach-ixp4xx/dsmg600-setup.c | 2 +- arch/arm/mach-ixp4xx/nas100d-setup.c | 2 +- arch/m68k/amiga/amisound.c| 2 +- arch/m68k/mac/macboing.c | 2 +- arch/mips/mti-malta/malta-display.c | 2 +- arch/parisc/kernel/pdc_cons.c | 2 +- arch/s390/mm/cmm.c| 2 +- drivers/atm/idt77105.c| 4 ++-- drivers/atm/iphase.c | 2 +- drivers/block/ataflop.c | 8 drivers/char/dtlk.c | 2 +- drivers/char/hangcheck-timer.c| 2 +- drivers/char/nwbutton.c | 2 +- drivers/char/rtc.c| 2 +- drivers/input/touchscreen/s3c2410_ts.c| 2 +- drivers/net/cris/eth_v10.c| 6 +++--- drivers/net/hamradio/yam.c| 2 +- drivers/net/wireless/atmel/at76c50x-usb.c | 2 +- drivers/staging/speakup/main.c| 2 +- drivers/staging/speakup/synth.c | 2 +- drivers/tty/cyclades.c| 2 +- drivers/tty/isicom.c | 2 +- drivers/tty/moxa.c| 2 +- drivers/tty/rocket.c | 2 +- drivers/tty/vt/keyboard.c | 2 +- drivers/tty/vt/vt.c | 2 +- drivers/watchdog/alim7101_wdt.c | 2 +- drivers/watchdog/machzwd.c| 2 +- drivers/watchdog/mixcomwd.c | 2 +- drivers/watchdog/sbc60xxwdt.c | 2 +- drivers/watchdog/sc520_wdt.c | 2 +- drivers/watchdog/via_wdt.c| 2 +- drivers/watchdog/w83877f_wdt.c| 2 +- drivers/xen/grant-table.c | 2 +- fs/pstore/platform.c | 2 +- include/linux/timer.h | 4 ++-- kernel/irq/spurious.c | 2 +- lib/random32.c| 2 +- net/atm/mpc.c | 2 +- net/decnet/dn_route.c | 2 +- net/ipv6/ip6_flowlabel.c | 2 +- net/netrom/nr_loopback.c | 2 +- security/keys/gc.c| 2 +- sound/oss/midibuf.c | 2 +- sound/oss/soundcard.c | 2 +- sound/oss/sys_timer.c | 2 +- sound/oss/uart6850.c | 2 +- 47 files changed, 54 insertions(+), 54 deletions(-) diff --git a/arch/arm/mach-ixp4xx/dsmg600-setup.c b/arch/arm/mach-ixp4xx/dsmg600-setup.c index b3bd0e137f6d..b3689a141ec6 100644 --- a/arch/arm/mach-ixp4xx/dsmg600-setup.c +++ b/arch/arm/mach-ixp4xx/dsmg600-setup.c @@ -174,7 +174,7 @@ static int power_button_countdown; #define PBUTTON_HOLDDOWN_COUNT 4 /* 2 secs */ static void dsmg600_power_handler(unsigned long data); -static DEFINE_TIMER(dsmg600_power_timer, dsmg600_power_handler, 0, 0); +static DEFINE_TIMER(dsmg600_power_timer, dsmg600_power_handler); static void dsmg600_power_handler(unsigned long data) { diff --git a/arch/arm/mach-ixp4xx/nas100d-setup.c b/arch/arm/mach-ixp4xx/nas100d-setup.c index 4e0f762bc651..562d05f9888e 100644 --- a/arch/arm/mach-ixp4xx/nas100d-setup.c +++ b/arch/arm/mach-ixp4xx/nas100d-setup.c @@ -197,7 +197,7 @@ static int power_button_countdown; #define PBUTTON_HOLDDOWN_COUNT 4 /* 2 secs */ static void nas100d_power_handler(unsigned long data); -static DEFINE_TIMER(nas100d_power_timer, nas100d_power_handler, 0, 0); +static DEFINE_TIMER(nas100d_power_timer, nas100d_power_handler); static void nas100d_power_handler(unsigned long data) { diff --git a/arch/m68k/amiga/amisound.c b/arch/m68k/amiga/amisound.c index 90a60d758f8b..a23f48181fd6 100644 --- a/arch/m68k/amiga/amisound.c +++ b/arch/m68k/amiga/amisound.c @@ -66,7 +66,7 @@ void __init amiga_init_sound(void) } static void nosound( unsigned long ignored ); -static DEFINE_TIMER(sound_timer, nosound, 0, 0); +static DEFINE_TIMER(sound_timer, nosound); void amiga_mksound( unsigned int hz, unsigned int ticks ) { diff --git a/arch/m68k/mac/macboing.c b/arch/m68k/mac/macboing.c index ffaa1f6439ae..9a52aff183d0 100644 --- a/arch/m68k/mac/macboing.c +++ b/arch/m68k/mac/macboing.c @@ -56,7 +56,7 @@ static void ( *mac_special_bell )( unsigned int, unsigned int, unsigned int ); /* * our timer to start/continue/stop the bell */ -static DEFINE_TIMER(mac_sound_timer, mac_nosound, 0, 0); +static DEFINE_TIMER(mac_sound_timer, mac_nosound); /* * Sort of initialize the sound chip (called from mac_mksound on the first