Hi All, While working on Whiskey Cove PMIC support for Cherrytrail devices I noticed that the irq handling in the Broxton Whiskey Cove MFD code seems to be wrong.
There seems to be some misunderstanding here about how the irqchip stuff should work. If I understand the hardware correctly, there are 2 levels of interrupt masks first there is BXTWC_MIRQLVL1 which creates a number of interrupt groups, like thermal, bcu, etc. Then per interrupt group there is not only 1 or more write 1 to clear status registers to indicate the exact reason of the interrupt but also to possibility to mask out certain status bits from contributing to raising the interrupt for that group if not masked in BXTWC_MIRQLVL1. The current code defines 2 separate interrupt chips for this (well 3 really, but the 3th one is not relevant for this discussion). But instead of chaining these interrupt chips, as they are in hardware, they are setup as separate chips sharing the same interrupt. And all interrupt listeners (callers of request_interrupt on the irq-chip registered interrupts) only listen to the 2nd level irq-chip irqs. Since no one has requested any interrupts on the 1st level irq chip, this will cause ALL interrupt groups to get masked there by the interrupt core (unused interrupts are always masked), causing the 2nd level interrupts to never trigger. Currently this is being worked around for charging interrupts by the last hunk of this commit: https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/drivers/mfd/intel_soc_pmic_bxtwc.c?id=9c6235c8633210cc2da0882e2e9d6ff90aa37503 Which blames the hardware for masking the lvl1 interrupt, but this really is expected behavior of the interrupt chip since no-one has requested the BXTWC_CHGR_LVL1_IRQ. I hope the above properly explains what I believe is wrong here. If not please ask. ### Now on to a solution, if you look at almost all the consumers then they are actually interested in all interrupts in a group, so it seems that there really is no need for the bxtwc_regmap_irq_chip_level2 Taking the thermal irq as example we could simple replace: static struct resource thermal_resources[] = { DEFINE_RES_IRQ(BXTWC_THRM0_IRQ), DEFINE_RES_IRQ(BXTWC_THRM1_IRQ), DEFINE_RES_IRQ(BXTWC_THRM2_IRQ), }; With: static struct resource thermal_resources[] = { DEFINE_RES_IRQ(BXTWC_THRM_LVL1_IRQ), }; And make the thermal driver responsible for masking any status bits it is not interested in and write-1-clearing BXTWC_THRM0IRQ - BXTWC_THRM2IRQ in its interrupt handler, where it needs to read them anyway to find out the exact cause(s) of the interrupt. Note the driver needs to do a read, check set bits (and act upon them) and then write back the read value for these registers. It is important to write back the read value and not e.g. 0xff to makes sure that no interrupts are missed due to races, e.g. blindy writing 0xff to ack may cause some status bits which get set while the interrupt handler runs to get missed by the driver. This is how this is done in devices similar like this in general and I believe that the same should be done for the bxtwc code. One special case here seems to be the CHGR interrupt which gets used by 2 sub-devices. The standard solution for this would be to simple have both drivers request the same level1 irq with the SHARED flag, and have them both only check + write-1-clear the bits they are interested in. This does mean that both drivers must be loaded (or none) to avoid an interrupt storm. Alternatively you could register a 2nd level interrupt chip chained to the level1 chip (so listening to the BXTWC_CHGR_LVL1_IRQ not to pmic->irq) and have that split out things into separate irqs, that avoids the need to have both drivers successfully loaded to avoid interrupt storms. But this needs to be chained to the lvl1 chgr irq and not directly listen to pmic->irq. Another option would be to keep the current design and add a 2nd level irqchip per 1st level irq listening to the 1st level irq, but that would require 8 additional irq chips and for all consumers except the chgr consumers there is no added value in doing that. Regards, Hans p.s. There also is a bug in the code adding the TMU irqchip: https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/drivers/mfd/intel_soc_pmic_bxtwc.c?id=957ae5098185e763b5c06be6c3b4b6e98c048712 Since this is a new irqchip it should use a new irq enum and not add its irqs to the bxtwc_irqs_level2 enum. Note that irqchip can be removed entirely instead directly listening to the lvl1 TMU irq, which would also fix this.