That's right, the maintenance interrupt handler is not called, but it doesn't do anything anyway, so we are fine. The important thing is that an interrupt is sent and gic_clear_lrs gets called on hypervisor entry.
On Wed, 19 Nov 2014, Andrii Tseglytskyi wrote:
> The only ambiguity left - maintenance interrupt handler is not called.
> It was requested for a specific IRQ number, retrieved from the device tree.
> But when we trigger GICH_HCR_UIE - we get a maintenance interrupt for
> spurious number 1023.
>
> Regards,
> Andrii
>
> On Wed, Nov 19, 2014 at 7:47 PM, Andrii Tseglytskyi
> <andrii.tseglyts...@globallogic.com> wrote:
> > On Wed, Nov 19, 2014 at 7:42 PM, Stefano Stabellini
> > <stefano.stabell...@eu.citrix.com> wrote:
> >> On Wed, 19 Nov 2014, Andrii Tseglytskyi wrote:
> >>> Hi Stefano,
> >>>
> >>> On Wed, Nov 19, 2014 at 7:07 PM, Stefano Stabellini
> >>> <stefano.stabell...@eu.citrix.com> wrote:
> >>> > I think that's OK: it looks like that on your board for some reason
> >>> > when UIE is set you get irq 1023 (spurious interrupt) instead of your
> >>> > normal maintenance interrupt.
> >>>
> >>> OK, but I think this should be investigated too. What do you think?
> >>
> >> I think it is harmless: my guess is that if we clear UIE before reading
> >> GICC_IAR, GICC_IAR returns a spurious interrupt instead of the maintenance
> >> interrupt. But it doesn't really matter to us.
> >
> > OK. I think catching this will be a good exercise for someone )) But
> > out of scope for this issue.
> >
> >>
> >>> >
> >>> > But everything should work anyway without issues.
> >>> >
> >>> > This is the same patch as before but on top of the latest xen-unstable
> >>> > tree. Please confirm if it works.
> >>> >
> >>> > diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
> >>> > index 70d10d6..df140b9 100644
> >>> > --- a/xen/arch/arm/gic.c
> >>> > +++ b/xen/arch/arm/gic.c
> >>> > @@ -403,6 +403,8 @@ void gic_clear_lrs(struct vcpu *v)
> >>> >      if ( is_idle_vcpu(v) )
> >>> >          return;
> >>> >
> >>> > +    gic_hw_ops->update_hcr_status(GICH_HCR_UIE, 0);
> >>> > +
> >>> >      spin_lock_irqsave(&v->arch.vgic.lock, flags);
> >>> >
> >>> >      while ((i = find_next_bit((const unsigned long *) &this_cpu(lr_mask),
> >>> > @@ -527,8 +529,6 @@ void gic_inject(void)
> >>> >
> >>> >      if ( !list_empty(&current->arch.vgic.lr_pending) && lr_all_full() )
> >>> >          gic_hw_ops->update_hcr_status(GICH_HCR_UIE, 1);
> >>> > -    else
> >>> > -        gic_hw_ops->update_hcr_status(GICH_HCR_UIE, 0);
> >>> >  }
> >>> >
> >>>
> >>> I confirm - it works fine. Will this be the final fix?
> >>
> >> Yep :-)
> >> Many thanks for your help on this!
> >
> > Thank you Stefano. This issue was really critical for us :)
> >
> > Regards,
> > Andrii
> >
> >>
> >>> Regards,
> >>> Andrii
> >>>
> >>> >  static void do_sgi(struct cpu_user_regs *regs, enum gic_sgi sgi)
> >>> >
> >>> > On Wed, 19 Nov 2014, Andrii Tseglytskyi wrote:
> >>> >> I got this strange log:
> >>> >>
> >>> >> (XEN) received maintenance interrupt irq=1023
> >>> >>
> >>> >> And platform does not hang due to this:
> >>> >> +    hcr = GICH[GICH_HCR];
> >>> >> +    if ( hcr & GICH_HCR_UIE )
> >>> >> +    {
> >>> >> +        GICH[GICH_HCR] &= ~GICH_HCR_UIE;
> >>> >> +        uie_on = 1;
> >>> >> +    }
> >>> >>
> >>> >> On Wed, Nov 19, 2014 at 6:50 PM, Stefano Stabellini
> >>> >> <stefano.stabell...@eu.citrix.com> wrote:
> >>> >> > On Wed, 19 Nov 2014, Andrii Tseglytskyi wrote:
> >>> >> >> On Wed, Nov 19, 2014 at 6:13 PM, Stefano Stabellini
> >>> >> >> <stefano.stabell...@eu.citrix.com> wrote:
> >>> >> >> > On Wed, 19 Nov 2014, Andrii Tseglytskyi wrote:
> >>> >> >> >> On Wed, Nov 19, 2014 at 6:01 PM, Andrii Tseglytskyi
> >>> >> >> >> <andrii.tseglyts...@globallogic.com> wrote:
> >>> >> >> >> > On Wed, Nov 19, 2014 at 5:41 PM, Stefano Stabellini
> >>> >> >> >> > <stefano.stabell...@eu.citrix.com> wrote:
> >>> >> >> >> >> On Wed, 19 Nov 2014, Andrii Tseglytskyi wrote:
> >>> >> >> >> >>> Hi Stefano,
> >>> >> >> >> >>>
> >>> >> >> >> >>> On Wed, Nov 19, 2014 at 4:52 PM, Stefano Stabellini
> >>> >> >> >> >>> <stefano.stabell...@eu.citrix.com> wrote:
> >>> >> >> >> >>> > On Wed, 19 Nov 2014, Andrii Tseglytskyi wrote:
> >>> >> >> >> >>> >> Hi Stefano,
> >>> >> >> >> >>> >>
> >>> >> >> >> >>> >> > >      if ( !list_empty(&current->arch.vgic.lr_pending) && lr_all_full() )
> >>> >> >> >> >>> >> > > -        GICH[GICH_HCR] |= GICH_HCR_UIE;
> >>> >> >> >> >>> >> > > +        GICH[GICH_HCR] |= GICH_HCR_NPIE;
> >>> >> >> >> >>> >> > >      else
> >>> >> >> >> >>> >> > > -        GICH[GICH_HCR] &= ~GICH_HCR_UIE;
> >>> >> >> >> >>> >> > > +        GICH[GICH_HCR] &= ~GICH_HCR_NPIE;
> >>> >> >> >> >>> >> > >
> >>> >> >> >> >>> >> > >  }
> >>> >> >> >> >>> >> >
> >>> >> >> >> >>> >> > Yes, exactly
> >>> >> >> >> >>> >>
> >>> >> >> >> >>> >> I tried, hang still occurs with this change
> >>> >> >> >> >>> >
> >>> >> >> >> >>> > We need to figure out why during the hang you still have all the LRs
> >>> >> >> >> >>> > busy even if you are getting maintenance interrupts that should cause
> >>> >> >> >> >>> > them to be cleared.
> >>> >> >> >> >>> >
> >>> >> >> >> >>>
> >>> >> >> >> >>> I see that I have free LRs during the maintenance interrupt:
> >>> >> >> >> >>>
> >>> >> >> >> >>> (XEN) gic.c:871:d0v0 maintenance interrupt
> >>> >> >> >> >>> (XEN) GICH_LRs (vcpu 0) mask=0
> >>> >> >> >> >>> (XEN)    HW_LR[0]=9a015856
> >>> >> >> >> >>> (XEN)    HW_LR[1]=0
> >>> >> >> >> >>> (XEN)    HW_LR[2]=0
> >>> >> >> >> >>> (XEN)    HW_LR[3]=0
> >>> >> >> >> >>> (XEN) Inflight irq=86 lr=0
> >>> >> >> >> >>> (XEN) Inflight irq=2 lr=255
> >>> >> >> >> >>> (XEN) Pending irq=2
> >>> >> >> >> >>>
> >>> >> >> >> >>> But I see that after I get the hang - maintenance interrupts are generated
> >>> >> >> >> >>> continuously. The platform continues printing the same log till reboot.
> >>> >> >> >> >>
> >>> >> >> >> >> Exactly the same log? As in the one above you just pasted?
> >>> >> >> >> >> That is very very suspicious.
> >>> >> >> >> >
> >>> >> >> >> > Yes, exactly the same log. And it looks like it means that LRs are flushed
> >>> >> >> >> > correctly.
> >>> >> >> >> >
> >>> >> >> >> >>
> >>> >> >> >> >> I am thinking that we are not handling GICH_HCR_UIE correctly and
> >>> >> >> >> >> something we do in Xen, maybe writing to an LR register, might trigger a
> >>> >> >> >> >> new maintenance interrupt immediately, causing an infinite loop.
> >>> >> >> >> >>
> >>> >> >> >> >
> >>> >> >> >> > Yes, this is what I'm thinking about. Taking into account all the collected
> >>> >> >> >> > debug info, it looks like once LRs are overloaded with SGIs -
> >>> >> >> >> > a maintenance interrupt occurs.
> >>> >> >> >> > And then it is not handled properly, and occurs again and again - so
> >>> >> >> >> > the platform hangs inside its handler.
> >>> >> >> >> >
> >>> >> >> >> >> Could you please try this patch? It disables GICH_HCR_UIE immediately on
> >>> >> >> >> >> hypervisor entry.
> >>> >> >> >> >>
> >>> >> >> >> >
> >>> >> >> >> > Now trying.
> >>> >> >> >> >
> >>> >> >> >> >>
> >>> >> >> >> >> diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
> >>> >> >> >> >> index 4d2a92d..6ae8dc4 100644
> >>> >> >> >> >> --- a/xen/arch/arm/gic.c
> >>> >> >> >> >> +++ b/xen/arch/arm/gic.c
> >>> >> >> >> >> @@ -701,6 +701,8 @@ void gic_clear_lrs(struct vcpu *v)
> >>> >> >> >> >>      if ( is_idle_vcpu(v) )
> >>> >> >> >> >>          return;
> >>> >> >> >> >>
> >>> >> >> >> >> +    GICH[GICH_HCR] &= ~GICH_HCR_UIE;
> >>> >> >> >> >> +
> >>> >> >> >> >>      spin_lock_irqsave(&v->arch.vgic.lock, flags);
> >>> >> >> >> >>
> >>> >> >> >> >>      while ((i = find_next_bit((const unsigned long *) &this_cpu(lr_mask),
> >>> >> >> >> >> @@ -821,12 +823,8 @@ void gic_inject(void)
> >>> >> >> >> >>
> >>> >> >> >> >>      gic_restore_pending_irqs(current);
> >>> >> >> >> >>
> >>> >> >> >> >> -
> >>> >> >> >> >>      if ( !list_empty(&current->arch.vgic.lr_pending) && lr_all_full() )
> >>> >> >> >> >>          GICH[GICH_HCR] |= GICH_HCR_UIE;
> >>> >> >> >> >> -    else
> >>> >> >> >> >> -        GICH[GICH_HCR] &= ~GICH_HCR_UIE;
> >>> >> >> >> >> -
> >>> >> >> >> >>  }
> >>> >> >> >> >>
> >>> >> >> >> >>  static void do_sgi(struct cpu_user_regs *regs, int othercpu, enum gic_sgi sgi)
> >>> >> >> >> >
> >>> >> >> >>
> >>> >> >> >> Heh - I don't see hangs with this patch :) But also I see that
> >>> >> >> >> the maintenance interrupt doesn't occur (and no hang as a result).
> >>> >> >> >> Stefano - is this expected?
> >>> >> >> >
> >>> >> >> > No maintenance interrupts at all? That's strange. You should be
> >>> >> >> > receiving them when LRs are full and you still have interrupts pending
> >>> >> >> > to be added to them.
> >>> >> >> >
> >>> >> >> > You could add another printk here to see if you should be receiving
> >>> >> >> > them:
> >>> >> >> >
> >>> >> >> >      if ( !list_empty(&current->arch.vgic.lr_pending) && lr_all_full() )
> >>> >> >> > +    {
> >>> >> >> > +        gdprintk(XENLOG_DEBUG, "requesting maintenance interrupt\n");
> >>> >> >> >          GICH[GICH_HCR] |= GICH_HCR_UIE;
> >>> >> >> > -    else
> >>> >> >> > -        GICH[GICH_HCR] &= ~GICH_HCR_UIE;
> >>> >> >> > -
> >>> >> >> > +    }
> >>> >> >> >  }
> >>> >> >> >
> >>> >> >>
> >>> >> >> Requested properly:
> >>> >> >>
> >>> >> >> (XEN) gic.c:756:d0v0 requesting maintenance interrupt
> >>> >> >> (XEN) gic.c:756:d0v0 requesting maintenance interrupt
> >>> >> >> (XEN) gic.c:756:d0v0 requesting maintenance interrupt
> >>> >> >> (XEN) gic.c:756:d0v0 requesting maintenance interrupt
> >>> >> >> (XEN) gic.c:756:d0v0 requesting maintenance interrupt
> >>> >> >> (XEN) gic.c:756:d0v0 requesting maintenance interrupt
> >>> >> >> (XEN) gic.c:756:d0v0 requesting maintenance interrupt
> >>> >> >>
> >>> >> >> But it does not occur
> >>> >> >
> >>> >> > OK, let's see what's going on then by printing the irq number of the
> >>> >> > maintenance interrupt:
> >>> >> >
> >>> >> > diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
> >>> >> > index 4d2a92d..fed3167 100644
> >>> >> > --- a/xen/arch/arm/gic.c
> >>> >> > +++ b/xen/arch/arm/gic.c
> >>> >> > @@ -55,6 +55,7 @@ static struct {
> >>> >> >  static DEFINE_PER_CPU(uint64_t, lr_mask);
> >>> >> >
> >>> >> >  static uint8_t nr_lrs;
> >>> >> > +static bool uie_on;
> >>> >> >  #define lr_all_full() (this_cpu(lr_mask) == ((1 << nr_lrs) - 1))
> >>> >> >
> >>> >> >  /* The GIC mapping of CPU interfaces does not necessarily match the
> >>> >> > @@ -694,6 +695,7 @@ void gic_clear_lrs(struct vcpu *v)
> >>> >> >  {
> >>> >> >      int i = 0;
> >>> >> >      unsigned long flags;
> >>> >> > +    unsigned long hcr;
> >>> >> >
> >>> >> >      /* The idle domain has no LRs to be cleared. Since gic_restore_state
> >>> >> >       * doesn't write any LR registers for the idle domain they could be
> >>> >> > @@ -701,6 +703,13 @@ void gic_clear_lrs(struct vcpu *v)
> >>> >> >      if ( is_idle_vcpu(v) )
> >>> >> >          return;
> >>> >> >
> >>> >> > +    hcr = GICH[GICH_HCR];
> >>> >> > +    if ( hcr & GICH_HCR_UIE )
> >>> >> > +    {
> >>> >> > +        GICH[GICH_HCR] &= ~GICH_HCR_UIE;
> >>> >> > +        uie_on = 1;
> >>> >> > +    }
> >>> >> > +
> >>> >> >      spin_lock_irqsave(&v->arch.vgic.lock, flags);
> >>> >> >
> >>> >> >      while ((i = find_next_bit((const unsigned long *) &this_cpu(lr_mask),
> >>> >> > @@ -865,6 +873,11 @@ void gic_interrupt(struct cpu_user_regs *regs, int is_fiq)
> >>> >> >      intack = GICC[GICC_IAR];
> >>> >> >      irq = intack & GICC_IA_IRQ;
> >>> >> >
> >>> >> > +    if ( uie_on )
> >>> >> > +    {
> >>> >> > +        uie_on = 0;
> >>> >> > +        printk("received maintenance interrupt irq=%d\n", irq);
> >>> >> > +    }
> >>> >> >      if ( likely(irq >= 16 && irq < 1021) )
> >>> >> >      {
> >>> >> >          local_irq_enable();
> >>> >>
> >>> >> --
> >>> >> Andrii Tseglytskyi | Embedded Dev
> >>> >> GlobalLogic
> >>> >> www.globallogic.com
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel