On 09/05/16 13:23, Marc Zyngier wrote:
> On 04/05/16 17:25, Jon Hunter wrote:
>> Some IRQ chips, such as GPIO controllers or secondary level interrupt
>> controllers, may require require additional runtime power management
>> control to ensure they are accessible. For such IRQ chips, it makes sense
>> to enable the IRQ chip when interrupts are requested and disabled them
>> again once all interrupts have been freed.
>>
>> When mapping an IRQ, the IRQ type settings are read and then programmed.
>> The mapping of the IRQ happens before the IRQ is requested and so the
>> programming of the type settings occurs before the IRQ is requested. This
>> is a problem for IRQ chips that require additional power management
>> control because they may not be accessible yet. Therefore, when mapping
>> the IRQ, don't program the type settings, just save them and then program
>> these saved settings when the IRQ is requested (so long as if they are not
>> overridden via the call to request the IRQ).
>>
>> Add a stub function for irq_domain_free_irqs() to avoid any compilation
>> errors when CONFIG_IRQ_DOMAIN_HIERARCHY is not selected.
>>
>> Signed-off-by: Jon Hunter <jonath...@nvidia.com>
>> ---
>>  include/linux/irqdomain.h |  3 +++
>>  kernel/irq/irqdomain.c    | 23 ++++++++++++++++++-----
>>  2 files changed, 21 insertions(+), 5 deletions(-)
>>
>> diff --git a/include/linux/irqdomain.h b/include/linux/irqdomain.h
>> index 2aed04396210..fc66876d1965 100644
>> --- a/include/linux/irqdomain.h
>> +++ b/include/linux/irqdomain.h
>> @@ -440,6 +440,9 @@ static inline int irq_domain_alloc_irqs(struct 
>> irq_domain *domain,
>>      return -1;
>>  }
>>  
>> +static inline void irq_domain_free_irqs(unsigned int virq,
>> +                                    unsigned int nr_irqs) { }
>> +
>>  static inline bool irq_domain_is_hierarchy(struct irq_domain *domain)
>>  {
>>      return false;
>> diff --git a/kernel/irq/irqdomain.c b/kernel/irq/irqdomain.c
>> index d68371213fc9..bbf5b9b8ac3d 100644
>> --- a/kernel/irq/irqdomain.c
>> +++ b/kernel/irq/irqdomain.c
>> @@ -564,6 +564,7 @@ static void of_phandle_args_to_fwspec(struct 
>> of_phandle_args *irq_data,
>>  unsigned int irq_create_fwspec_mapping(struct irq_fwspec *fwspec)
>>  {
>>      struct irq_domain *domain;
>> +    struct irq_data *irq_data;
>>      irq_hw_number_t hwirq;
>>      unsigned int type = IRQ_TYPE_NONE;
>>      int virq;
>> @@ -613,7 +614,11 @@ unsigned int irq_create_fwspec_mapping(struct 
>> irq_fwspec *fwspec)
>>               * it now and return the interrupt number.
>>               */
>>              if (irq_get_trigger_type(virq) == IRQ_TYPE_NONE) {
>> -                    irq_set_irq_type(virq, type);
>> +                    irq_data = irq_get_irq_data(virq);
>> +                    if (!irq_data)
>> +                            return 0;
>> +
>> +                    irqd_set_trigger_type(irq_data, type);
>>                      return virq;
>>              }
>>  
>> @@ -633,10 +638,18 @@ unsigned int irq_create_fwspec_mapping(struct 
>> irq_fwspec *fwspec)
>>                      return virq;
>>      }
>>  
>> -    /* Set type if specified and different than the current one */
>> -    if (type != IRQ_TYPE_NONE &&
>> -        type != irq_get_trigger_type(virq))
>> -            irq_set_irq_type(virq, type);
>> +    irq_data = irq_get_irq_data(virq);
>> +    if (!irq_data) {
>> +            if (irq_domain_is_hierarchy(domain))
>> +                    irq_domain_free_irqs(virq, 1);
>> +            else
>> +                    irq_dispose_mapping(virq);
>> +            return 0;
>> +    }
>> +
>> +    /* Store trigger type */
>> +    irqd_set_trigger_type(irq_data, type);
>> +
>>      return virq;
>>  }
>>  EXPORT_SYMBOL_GPL(irq_create_fwspec_mapping);
>>
> 
> This patch have the effect of making misconfigured PPIs absolutely
> obvious. I still need to wrap my head around the root cause, but here's
> the findings I have so far:
>
> - kvmtool generates a DT with the wrong trigger information (edge
> instead of level) for the timer.
> - with this patch applied, "cyclictest -S" reliably locks up when run in
> a guest (missing a timer interrupt, goodbye CPU).
> - Either fixing kvmtool or reverting that patch makes it work reliably
> again.
> 
> My gut feeling is that until that patch, the failing irq_set_irq_type()
> wasn't affecting the kernel's view of the trigger (it was still treated
> as level). With this patch, the kernel now trusts whatever is coming
> from the firmware, and the misconfiguration becomes obvious. And just
> grepping through the DT files for arm and arm64 sends makes me thing
> "Holly effin' crap!".
> 
> I'm not saying that we shouldn't perform this change though. But it is
> quite obvious that it is going to break an awful lot of existing code
> and platforms. I'm also cooking a small patch for the arch timer (which
> seems to be described in DT with a fairly high level of brokenness), so
> that we can mop-up most of the brain damage.

Hmmm ... yes I see. I wonder if we should make the setting of the type
here dependent upon PM being enabled for an irqchip? We could check to
see if the .parent_device is populated and if so only then save the type
and otherwise just set it as we do today.

We could add a WARN to the existing irq_set_irq_type() or may be just a
pr_warn() if a WARN is too verbose so people can fix up any issues.

I am also wondering if patch 4/17 "iqdomain: Fix handling of type
settings for existing mappings" could generate a lot of reports
interrupts failing due to bad firmware? I wonder if I should tone this
patch down to a warning message as well as opposed to a complete failure.

Cheers
Jon




Reply via email to