That's right, the maintenance interrupt handler is not called, but it
doesn't do anything so we are fine. The important thing is that an
interrupt is sent and gic_clear_lrs gets called on hypervisor entry.
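
To make that flow concrete, here is a rough standalone sketch of the
set/clear logic in the final patch. It is not Xen code: the sim_* names,
the struct and the "all LRs become free on entry" shortcut are
assumptions made up for illustration, and the UIE bit position is taken
to be bit 1 as in GICv2.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for hardware state; not Xen's real definitions. */
#define SIM_HCR_UIE (1u << 1)   /* assumed to mirror GICv2 GICH_HCR.UIE */
#define SIM_NR_LRS  4u

struct sim_gic {
    uint32_t hcr;       /* simulated GICH_HCR */
    uint32_t lr_mask;   /* one bit per list register in use */
    unsigned pending;   /* interrupts still waiting for a free LR */
};

static bool sim_lr_all_full(const struct sim_gic *g)
{
    return g->lr_mask == ((1u << SIM_NR_LRS) - 1);
}

/* Hypervisor entry: drop UIE first, then recycle the LRs. */
static void sim_clear_lrs(struct sim_gic *g)
{
    g->hcr &= ~SIM_HCR_UIE;
    g->lr_mask = 0;     /* simplification: every LR is treated as finished */
}

/* Return to guest: fill free LRs, request an interrupt if work is left. */
static void sim_inject(struct sim_gic *g)
{
    while (g->pending > 0 && !sim_lr_all_full(g)) {
        /* claim the lowest free LR */
        for (unsigned i = 0; i < SIM_NR_LRS; i++) {
            if (!(g->lr_mask & (1u << i))) {
                g->lr_mask |= 1u << i;
                break;
            }
        }
        g->pending--;
    }

    if (g->pending > 0 && sim_lr_all_full(g))
        g->hcr |= SIM_HCR_UIE;
    /* no else branch: UIE is only ever cleared in sim_clear_lrs() */
}
```

With six pending interrupts and four LRs, the first sim_inject() leaves
two pending and sets UIE; sim_clear_lrs() then clears UIE before
anything else, so whichever IRQ number the hardware reports for the
resulting interrupt does not matter to this logic.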

On Wed, 19 Nov 2014, Andrii Tseglytskyi wrote:
> The only ambiguity left - maintenance interrupt handler is not called.
> It was requested for specific IRQ number, retrieved from device tree.
> But when we trigger GICH_HCR_UIE - we got maintenance interrupt for
> spurious number 1023.
> 
> Regards,
> Andrii
> 
> On Wed, Nov 19, 2014 at 7:47 PM, Andrii Tseglytskyi
> <andrii.tseglyts...@globallogic.com> wrote:
> > On Wed, Nov 19, 2014 at 7:42 PM, Stefano Stabellini
> > <stefano.stabell...@eu.citrix.com> wrote:
> >> On Wed, 19 Nov 2014, Andrii Tseglytskyi wrote:
> >>> Hi Stefano,
> >>>
> >>> On Wed, Nov 19, 2014 at 7:07 PM, Stefano Stabellini
> >>> <stefano.stabell...@eu.citrix.com> wrote:
> >>> > I think that's OK: it looks like that on your board for some reasons
> >>> > when UIE is set you get irq 1023 (spurious interrupt) instead of your
> >>> > normal maintenance interrupt.
> >>>
> >>> OK, but I think this should be investigated too. What do you think ?
> >>
> >> I think it is harmless: my guess is that if we clear UIE before reading
> >> GICC_IAR, GICC_IAR returns spurious interrupt instead of maintenance
> >> interrupt. But it doesn't really matter to us.
> >
> > OK. I think catching this will be a good exercise for someone )) But
> > out of scope for this issue.
> >
> >>
> >>> >
> >>> > But everything should work anyway without issues.
> >>> >
> >>> > This is the same patch as before but on top of the latest xen-unstable
> >>> > tree. Please confirm if it works.
> >>> >
> >>> > diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
> >>> > index 70d10d6..df140b9 100644
> >>> > --- a/xen/arch/arm/gic.c
> >>> > +++ b/xen/arch/arm/gic.c
> >>> > @@ -403,6 +403,8 @@ void gic_clear_lrs(struct vcpu *v)
> >>> >      if ( is_idle_vcpu(v) )
> >>> >          return;
> >>> >
> >>> > +    gic_hw_ops->update_hcr_status(GICH_HCR_UIE, 0);
> >>> > +
> >>> >      spin_lock_irqsave(&v->arch.vgic.lock, flags);
> >>> >
> >>> >      while ((i = find_next_bit((const unsigned long *) &this_cpu(lr_mask),
> >>> > @@ -527,8 +529,6 @@ void gic_inject(void)
> >>> >
> >>> >      if ( !list_empty(&current->arch.vgic.lr_pending) && lr_all_full() )
> >>> >          gic_hw_ops->update_hcr_status(GICH_HCR_UIE, 1);
> >>> > -    else
> >>> > -        gic_hw_ops->update_hcr_status(GICH_HCR_UIE, 0);
> >>> >  }
> >>> >
> >>>
> >>> I confirm - it works fine. Will this be a final fix ?
> >>
> >> Yep :-)
> >> Many thanks for your help on this!
> >
> > Thank you Stefano. This issue was really critical for us :)
> >
> > Regards,
> > Andrii
> >
> >>
> >>
> >>> Regards,
> >>> Andrii
> >>>
> >>> >  static void do_sgi(struct cpu_user_regs *regs, enum gic_sgi sgi)
> >>> >
> >>> > On Wed, 19 Nov 2014, Andrii Tseglytskyi wrote:
> >>> >> I got this strange log:
> >>> >>
> >>> >> (XEN) received maintenance interrupt irq=1023
> >>> >>
> >>> >> And platform does not hang due to this:
> >>> >> +    hcr = GICH[GICH_HCR];
> >>> >> +    if ( hcr & GICH_HCR_UIE )
> >>> >> +    {
> >>> >> +        GICH[GICH_HCR] &= ~GICH_HCR_UIE;
> >>> >> +        uie_on = 1;
> >>> >> +    }
> >>> >>
> >>> >> On Wed, Nov 19, 2014 at 6:50 PM, Stefano Stabellini
> >>> >> <stefano.stabell...@eu.citrix.com> wrote:
> >>> >> > On Wed, 19 Nov 2014, Andrii Tseglytskyi wrote:
> >>> >> >> On Wed, Nov 19, 2014 at 6:13 PM, Stefano Stabellini
> >>> >> >> <stefano.stabell...@eu.citrix.com> wrote:
> >>> >> >> > On Wed, 19 Nov 2014, Andrii Tseglytskyi wrote:
> >>> >> >> >> On Wed, Nov 19, 2014 at 6:01 PM, Andrii Tseglytskyi
> >>> >> >> >> <andrii.tseglyts...@globallogic.com> wrote:
> >>> >> >> >> > On Wed, Nov 19, 2014 at 5:41 PM, Stefano Stabellini
> >>> >> >> >> > <stefano.stabell...@eu.citrix.com> wrote:
> >>> >> >> >> >> On Wed, 19 Nov 2014, Andrii Tseglytskyi wrote:
> >>> >> >> >> >>> Hi Stefano,
> >>> >> >> >> >>>
> >>> >> >> >> >>> On Wed, Nov 19, 2014 at 4:52 PM, Stefano Stabellini
> >>> >> >> >> >>> <stefano.stabell...@eu.citrix.com> wrote:
> >>> >> >> >> >>> > On Wed, 19 Nov 2014, Andrii Tseglytskyi wrote:
> >>> >> >> >> >>> >> Hi Stefano,
> >>> >> >> >> >>> >>
> >>> >> >> >> >>> >> > >      if ( !list_empty(&current->arch.vgic.lr_pending) && lr_all_full() )
> >>> >> >> >> >>> >> > > -        GICH[GICH_HCR] |= GICH_HCR_UIE;
> >>> >> >> >> >>> >> > > +        GICH[GICH_HCR] |= GICH_HCR_NPIE;
> >>> >> >> >> >>> >> > >      else
> >>> >> >> >> >>> >> > > -        GICH[GICH_HCR] &= ~GICH_HCR_UIE;
> >>> >> >> >> >>> >> > > +        GICH[GICH_HCR] &= ~GICH_HCR_NPIE;
> >>> >> >> >> >>> >> > >
> >>> >> >> >> >>> >> > >  }
> >>> >> >> >> >>> >> >
> >>> >> >> >> >>> >> > Yes, exactly
> >>> >> >> >> >>> >>
> >>> >> >> >> >>> >> I tried, hang still occurs with this change
> >>> >> >> >> >>> >
> >>> >> >> >> >>> > We need to figure out why during the hang you still have all the LRs
> >>> >> >> >> >>> > busy even if you are getting maintenance interrupts that should cause
> >>> >> >> >> >>> > them to be cleared.
> >>> >> >> >> >>> >
> >>> >> >> >> >>>
> >>> >> >> >> >>> I see that I have free LRs during maintenance interrupt
> >>> >> >> >> >>>
> >>> >> >> >> >>> (XEN) gic.c:871:d0v0 maintenance interrupt
> >>> >> >> >> >>> (XEN) GICH_LRs (vcpu 0) mask=0
> >>> >> >> >> >>> (XEN)    HW_LR[0]=9a015856
> >>> >> >> >> >>> (XEN)    HW_LR[1]=0
> >>> >> >> >> >>> (XEN)    HW_LR[2]=0
> >>> >> >> >> >>> (XEN)    HW_LR[3]=0
> >>> >> >> >> >>> (XEN) Inflight irq=86 lr=0
> >>> >> >> >> >>> (XEN) Inflight irq=2 lr=255
> >>> >> >> >> >>> (XEN) Pending irq=2
> >>> >> >> >> >>>
> >>> >> >> >> >>> But I see that after I got hang - maintenance interrupts are generated
> >>> >> >> >> >>> continuously. Platform continues printing the same log till reboot.
> >>> >> >> >> >>
> >>> >> >> >> >> Exactly the same log? As in the one above you just pasted?
> >>> >> >> >> >> That is very very suspicious.
> >>> >> >> >> >
> >>> >> >> >> > Yes exactly the same log. And looks like it means that LRs are flushed
> >>> >> >> >> > correctly.
> >>> >> >> >> >
> >>> >> >> >> >>
> >>> >> >> >> >> I am thinking that we are not handling GICH_HCR_UIE correctly and
> >>> >> >> >> >> something we do in Xen, maybe writing to an LR register, might trigger a
> >>> >> >> >> >> new maintenance interrupt immediately causing an infinite loop.
> >>> >> >> >> >>
> >>> >> >> >> >
> >>> >> >> >> > Yes, this is what I'm thinking about. Taking in account all collected
> >>> >> >> >> > debug info it looks like once LRs are overloaded with SGIs -
> >>> >> >> >> > maintenance interrupt occurs.
> >>> >> >> >> > And then it is not handled properly, and occurs again and again - so
> >>> >> >> >> > platform hangs inside its handler.
> >>> >> >> >> >
> >>> >> >> >> >> Could you please try this patch? It disables GICH_HCR_UIE immediately on
> >>> >> >> >> >> hypervisor entry.
> >>> >> >> >> >>
> >>> >> >> >> >
> >>> >> >> >> > Now trying.
> >>> >> >> >> >
> >>> >> >> >> >>
> >>> >> >> >> >> diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
> >>> >> >> >> >> index 4d2a92d..6ae8dc4 100644
> >>> >> >> >> >> --- a/xen/arch/arm/gic.c
> >>> >> >> >> >> +++ b/xen/arch/arm/gic.c
> >>> >> >> >> >> @@ -701,6 +701,8 @@ void gic_clear_lrs(struct vcpu *v)
> >>> >> >> >> >>      if ( is_idle_vcpu(v) )
> >>> >> >> >> >>          return;
> >>> >> >> >> >>
> >>> >> >> >> >> +    GICH[GICH_HCR] &= ~GICH_HCR_UIE;
> >>> >> >> >> >> +
> >>> >> >> >> >>      spin_lock_irqsave(&v->arch.vgic.lock, flags);
> >>> >> >> >> >>
> >>> >> >> >> >>      while ((i = find_next_bit((const unsigned long *) &this_cpu(lr_mask),
> >>> >> >> >> >> @@ -821,12 +823,8 @@ void gic_inject(void)
> >>> >> >> >> >>
> >>> >> >> >> >>      gic_restore_pending_irqs(current);
> >>> >> >> >> >>
> >>> >> >> >> >> -
> >>> >> >> >> >>      if ( !list_empty(&current->arch.vgic.lr_pending) && lr_all_full() )
> >>> >> >> >> >>          GICH[GICH_HCR] |= GICH_HCR_UIE;
> >>> >> >> >> >> -    else
> >>> >> >> >> >> -        GICH[GICH_HCR] &= ~GICH_HCR_UIE;
> >>> >> >> >> >> -
> >>> >> >> >> >>  }
> >>> >> >> >> >>
> >>> >> >> >> >>  static void do_sgi(struct cpu_user_regs *regs, int othercpu, enum gic_sgi sgi)
> >>> >> >> >> >
> >>> >> >> >>
> >>> >> >> >> Heh - I don't see hangs with this patch :) But also I see that
> >>> >> >> >> maintenance interrupt doesn't occur (and no hang as result)
> >>> >> >> >> Stefano - is this expected?
> >>> >> >> >
> >>> >> >> > No maintenance interrupts at all? That's strange. You should be
> >>> >> >> > receiving them when LRs are full and you still have interrupts pending
> >>> >> >> > to be added to them.
> >>> >> >> >
> >>> >> >> > You could add another printk here to see if you should be receiving
> >>> >> >> > them:
> >>> >> >> >
> >>> >> >> >      if ( !list_empty(&current->arch.vgic.lr_pending) && lr_all_full() )
> >>> >> >> > +    {
> >>> >> >> > +        gdprintk(XENLOG_DEBUG, "requesting maintenance interrupt\n");
> >>> >> >> >          GICH[GICH_HCR] |= GICH_HCR_UIE;
> >>> >> >> > -    else
> >>> >> >> > -        GICH[GICH_HCR] &= ~GICH_HCR_UIE;
> >>> >> >> > -
> >>> >> >> > +    }
> >>> >> >> >  }
> >>> >> >> >
> >>> >> >>
> >>> >> >> Requested properly:
> >>> >> >>
> >>> >> >> (XEN) gic.c:756:d0v0 requesting maintenance interrupt
> >>> >> >> (XEN) gic.c:756:d0v0 requesting maintenance interrupt
> >>> >> >> (XEN) gic.c:756:d0v0 requesting maintenance interrupt
> >>> >> >> (XEN) gic.c:756:d0v0 requesting maintenance interrupt
> >>> >> >> (XEN) gic.c:756:d0v0 requesting maintenance interrupt
> >>> >> >> (XEN) gic.c:756:d0v0 requesting maintenance interrupt
> >>> >> >> (XEN) gic.c:756:d0v0 requesting maintenance interrupt
> >>> >> >>
> >>> >> >> But does not occur
> >>> >> >
> >>> >> > OK, let's see what's going on then by printing the irq number of the
> >>> >> > maintenance interrupt:
> >>> >> >
> >>> >> > diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
> >>> >> > index 4d2a92d..fed3167 100644
> >>> >> > --- a/xen/arch/arm/gic.c
> >>> >> > +++ b/xen/arch/arm/gic.c
> >>> >> > @@ -55,6 +55,7 @@ static struct {
> >>> >> >  static DEFINE_PER_CPU(uint64_t, lr_mask);
> >>> >> >
> >>> >> >  static uint8_t nr_lrs;
> >>> >> > +static bool uie_on;
> >>> >> >  #define lr_all_full() (this_cpu(lr_mask) == ((1 << nr_lrs) - 1))
> >>> >> >
> >>> >> >  /* The GIC mapping of CPU interfaces does not necessarily match the
> >>> >> > @@ -694,6 +695,7 @@ void gic_clear_lrs(struct vcpu *v)
> >>> >> >  {
> >>> >> >      int i = 0;
> >>> >> >      unsigned long flags;
> >>> >> > +    unsigned long hcr;
> >>> >> >
> >>> >> >      /* The idle domain has no LRs to be cleared. Since gic_restore_state
> >>> >> >       * doesn't write any LR registers for the idle domain they could be
> >>> >> > @@ -701,6 +703,13 @@ void gic_clear_lrs(struct vcpu *v)
> >>> >> >      if ( is_idle_vcpu(v) )
> >>> >> >          return;
> >>> >> >
> >>> >> > +    hcr = GICH[GICH_HCR];
> >>> >> > +    if ( hcr & GICH_HCR_UIE )
> >>> >> > +    {
> >>> >> > +        GICH[GICH_HCR] &= ~GICH_HCR_UIE;
> >>> >> > +        uie_on = 1;
> >>> >> > +    }
> >>> >> > +
> >>> >> >      spin_lock_irqsave(&v->arch.vgic.lock, flags);
> >>> >> >
> >>> >> >      while ((i = find_next_bit((const unsigned long *) &this_cpu(lr_mask),
> >>> >> > @@ -865,6 +873,11 @@ void gic_interrupt(struct cpu_user_regs *regs, int is_fiq)
> >>> >> >          intack = GICC[GICC_IAR];
> >>> >> >          irq = intack & GICC_IA_IRQ;
> >>> >> >
> >>> >> > +        if ( uie_on )
> >>> >> > +        {
> >>> >> > +            uie_on = 0;
> >>> >> > +            printk("received maintenance interrupt irq=%d\n", irq);
> >>> >> > +        }
> >>> >> >          if ( likely(irq >= 16 && irq < 1021) )
> >>> >> >          {
> >>> >> >              local_irq_enable();
> >>> >>
> >>> >>
> >>> >>
> >>> >> --
> >>> >>
> >>> >> Andrii Tseglytskyi | Embedded Dev
> >>> >> GlobalLogic
> >>> >> www.globallogic.com
> >>> >>

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel
