On Tue, Aug 04, 2020 at 11:58:47AM +0100, Marc Zyngier wrote: > On 2020-08-04 09:56, Jason Liu wrote: > > No need to do the irq_chip->irq_mask() if it already masked. > > BTW, unconditionally do the irq_chip->irq_mask() will also bring issues > > when the irq_chip in the runtime PM suspend. Accessing registers of the > > irq_chip will bring in the exceptions. For example on the i.MX: > > > > root@imx8qmmek:~# echo c > /proc/sysrq-trigger > > [ 177.796182] sysrq: Trigger a crash > > [ 177.799596] Kernel panic - not syncing: sysrq triggered crash > > [ 177.875616] SMP: stopping secondary CPUs > > [ 177.891936] Internal error: synchronous external abort: 96000210 > > [#1] PREEMPT SMP > > [ 177.899429] Modules linked in: crct10dif_ce mxc_jpeg_encdec > > [ 177.905018] CPU: 1 PID: 944 Comm: sh Kdump: loaded Not tainted > > [ 177.913457] Hardware name: Freescale i.MX8QM MEK (DT) > > [ 177.918517] pstate: a0000085 (NzCv daIf -PAN -UAO) > > [ 177.923318] pc : imx_irqsteer_irq_mask+0x50/0x80 > > [ 177.927944] lr : imx_irqsteer_irq_mask+0x38/0x80 > > [ 177.932561] sp : ffff800011fe3a50 > > [ 177.935880] x29: ffff800011fe3a50 x28: ffff0008f7708e00 > > [ 177.941196] x27: 0000000000000000 x26: 0000000000000000 > > [ 177.946513] x25: ffff800011a30c80 x24: 0000000000000000 > > [ 177.951830] x23: ffff800011fe3af8 x22: ffff0008f24469d4 > > [ 177.957147] x21: ffff0008f2446880 x20: ffff0008f25f5658 > > [ 177.962463] x19: ffff800012611004 x18: 0000000000000001 > > [ 177.967780] x17: 0000000000000000 x16: 0000000000000000 > > [ 177.973097] x15: ffff0008f7709270 x14: 0000000060000085 > > [ 177.978414] x13: ffff800010177570 x12: ffff800011fe3ab0 > > [ 177.983730] x11: ffff80001017749c x10: 0000000000000040 > > [ 177.989047] x9 : ffff8000119f1c80 x8 : ffff8000119f1c78 > > [ 177.994364] x7 : ffff0008f46bedf8 x6 : 0000000000000000 > > [ 177.999681] x5 : ffff0008f46beda0 x4 : 0000000000000000 > > [ 178.004997] x3 : ffff0008f24469d4 x2 : ffff800012611000 > > [ 178.010314] x1 : 0000000000000080 x0 : 0000000000000080 > > [ 178.015630] Call trace: > > [ 178.018077] imx_irqsteer_irq_mask+0x50/0x80 > > [ 178.022352] machine_crash_shutdown+0xa8/0x100 > > [ 178.026802] __crash_kexec+0x6c/0x118 > > [ 178.030464] panic+0x19c/0x324 > > [ 178.033524] sysrq_handle_reboot+0x0/0x20 > > [ 178.037537] __handle_sysrq+0x88/0x180 > > [ 178.041290] write_sysrq_trigger+0x8c/0xb0 > > [ 178.045389] proc_reg_write+0x78/0xb0 > > [ 178.049055] __vfs_write+0x18/0x40 > > [ 178.052461] vfs_write+0xdc/0x1c8 > > [ 178.055779] ksys_write+0x68/0xf0 > > [ 178.059098] __arm64_sys_write+0x18/0x20 > > [ 178.063027] el0_svc_common.constprop.0+0x68/0x160 > > [ 178.067821] el0_svc_handler+0x20/0x80 > > [ 178.071573] el0_svc+0x8/0xc > > [ 178.074463] Code: 93407e73 91001273 aa0003e1 8b130053 (b9400260) > > [ 178.080567] ---[ end trace 652333f6c6d6b05d ]--- > > > > Signed-off-by: Jason Liu <jason.hui....@nxp.com> > > Cc: <sta...@vger.kernel.org> > > Cc: Catalin Marinas <catalin.mari...@arm.com> > > Cc: Will Deacon <w...@kernel.org> > > Cc: Sasha Levin <sas...@kernel.org> > > --- > > arch/arm64/kernel/machine_kexec.c | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/arch/arm64/kernel/machine_kexec.c > > b/arch/arm64/kernel/machine_kexec.c > > index a0b144cfaea7..8ab263c733bf 100644 > > --- a/arch/arm64/kernel/machine_kexec.c > > +++ b/arch/arm64/kernel/machine_kexec.c > > @@ -236,7 +236,7 @@ static void machine_kexec_mask_interrupts(void) > > chip->irq_eoi) > > chip->irq_eoi(&desc->irq_data); > > > > - if (chip->irq_mask) > > + if (chip->irq_mask && !irqd_irq_masked(&desc->irq_data)) > > chip->irq_mask(&desc->irq_data); > > > > if (chip->irq_disable && !irqd_irq_disabled(&desc->irq_data)) > > This is pretty dodgy. irq_mask() should be an idempotent action > (masking twice must not be harmful). >
That was my understanding too, but was not totally against adding it here. > Even more, it really isn't obvious to me how this can work at all, > as even if the interrupt isn't masked, the irqsteer could well be > suspended. > Indeed, the runtime PM ops in that driver looks dodgy. Any calls to mask_irq from drivers or anywhere with irqchip suspended with just blows up the system. > So as is, this change is just papering over a much deeper issue > in your driver. > Thanks for confirming -- Regards, Sudeep