Hello Dave,

It looks like we identified the problem.

We are working on a fix and will send it as soon as it is ready.

~Dmitry.

> On 15 May 2017, at 12:22, Dr. David Alan Gilbert <dgilb...@redhat.com> wrote:
> 
> * Dmitry Fleytman (dmi...@daynix.com) wrote:
>> Hello Dave,
> 
> Hi Dmitry,
>  Thanks for the reply.
> 
>> We are trying to reproduce this issue on our systems but with no luck so far…
> 
> Note our QE hit this with both a Win8.1 and a Win2012r2 guest, although
> the 2012r2 guest is reported to have recovered after a few minutes.
> 2016 apparently works OK.
> 
>> From what you describe it looks like some bit in ICR is not being cleared by 
>> the driver.
>> This usually means that this bit should never be set in that specific 
>> interrupt mode.
>> 
>> Could you please check which bit is not cleared and who sets it?
> 
> The full set of e1000e_irq_pending_interrupts after migration is:
> 23004@1494519346.673905:e1000e_irq_pending_interrupts ICR PENDING: 0x100000 
> (ICR: 0x80100082, IMS: 0x1f00004)
> 23004@1494519346.674787:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 
> 0x80100082, IMS: 0x1e00004)
> 23004@1494519346.674946:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 
> 0x80100082, IMS: 0x1e00004)
> 23004@1494519346.675119:e1000e_irq_pending_interrupts ICR PENDING: 0x200000 
> (ICR: 0x80300082, IMS: 0x1e00004)
> 23004@1494519346.675302:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 
> 0x80100082, IMS: 0x1c00004)
>  <repeated lots>
> 23004@1494519346.716279:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 
> 0x80300082, IMS: 0x1c00004)
> 23004@1494519346.716380:e1000e_irq_pending_interrupts ICR PENDING: 0x1000000 
> (ICR: 0x813000c2, IMS: 0x1c00004)
> 23004@1494519346.717040:e1000e_irq_pending_interrupts ICR PENDING: 0x1000000 
> (ICR: 0x813000c2, IMS: 0x1400004)
> 23004@1494519346.717276:e1000e_irq_pending_interrupts ICR PENDING: 0x1000000 
> (ICR: 0x813000c2, IMS: 0x1000004)
> 23004@1494519346.717443:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 
> 0x813000c2, IMS: 0x4)
> 23004@1494519346.717567:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 
> 0x813000c2, IMS: 0x4)
> 23004@1494519346.717782:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 
> 0x813000c2, IMS: 0x4)
> 23004@1494519346.717918:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 
> 0x813000c2, IMS: 0x4)
> 23004@1494519346.718319:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 
> 0x813000c2, IMS: 0x4)
> 23004@1494519346.718523:e1000e_irq_pending_interrupts ICR PENDING: 0x200000 
> (ICR: 0x813000c2, IMS: 0xa00004)
> 23004@1494519346.718684:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 
> 0x811000c2, IMS: 0x4)
> 23004@1494519346.718890:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 
> 0x811000c2, IMS: 0x4)
> 23004@1494519346.719034:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 
> 0x811000c2, IMS: 0xa00004)
> 23004@1494519346.719130:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 
> 0x811000c2, IMS: 0xa00004)
>  <repeats>
> 23004@1494519346.722699:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 
> 0x811000c2, IMS: 0xa00004)
> 23004@1494519346.722868:e1000e_irq_pending_interrupts ICR PENDING: 0x200000 
> (ICR: 0x813000c2, IMS: 0xa00004)
> 23004@1494519346.723068:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 
> 0x811000c2, IMS: 0x800004)
>  <repeats>
> 23004@1494519346.731198:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 
> 0x813000c2, IMS: 0x800004)
> 23004@1494519346.731422:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 
> 0x813000c2, IMS: 0x4)
> 23004@1494519346.731930:e1000e_irq_pending_interrupts ICR PENDING: 0x200000 
> (ICR: 0x813000c2, IMS: 0xa00004)
> 23004@1494519346.732082:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 
> 0x811000c2, IMS: 0x4)
> 23004@1494519346.732274:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 
> 0x811000c2, IMS: 0x4)
> 23004@1494519346.732404:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 
> 0x811000c2, IMS: 0xa00004)
> 23004@1494519346.732504:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 
> 0x811000c2, IMS: 0xa00004)
> 23004@1494519346.784150:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 
> 0x815000c2, IMS: 0xa00004)
> 23004@1494519346.786506:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 
> 0x815000c2, IMS: 0xa00004)
> 23004@1494519346.786534:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 
> 0x815000c2, IMS: 0xa00004)
> 23004@1494519346.789644:e1000e_irq_pending_interrupts ICR PENDING: 0x1000000 
> (ICR: 0x815000c2, IMS: 0x1a00004)
> 23004@1494519346.789864:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 
> 0x815000c2, IMS: 0xa00004)
> 23004@1494519346.789992:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 
> 0x815000c2, IMS: 0xa00004)
> 23004@1494519346.790413:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 
> 0x815000c2, IMS: 0xa00004)
> 23004@1494519346.790539:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 
> 0x815000c2, IMS: 0xa00004)
> 23004@1494519346.792593:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 
> 0x815000c2, IMS: 0xa00004)
> 23004@1494519346.792620:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 
> 0x815000c2, IMS: 0xa00004)
> 23004@1494519346.795943:e1000e_irq_pending_interrupts ICR PENDING: 0x1000000 
> (ICR: 0x815000c2, IMS: 0x1a00004)
> 
> and then I think we get stuck in a cycle where this one (0x1000000) is
> always the one that fires repeatedly.  I think that's the 'Other'
> interrupt firing, and I think it fires because of the receive overrun.
> One thing I've not figured out is why the receive overrun happens: is it
> because we really have a very heavy packet rate, or because something
> has stopped receiving the packets?
> The network I'm testing on does have a fair amount of broadcast traffic.
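The PENDING values in the trace above are consistent with PENDING = ICR & IMS, which is a quick way to see which cause is actually unmasked on each line. Below is a minimal Python sketch that checks this on trace text pasted from a mail; the names `parse_trace`, `mismatches` and `SAMPLE` are made up for illustration, and the regular expression only assumes the line format shown in this thread.

```python
import re

# Matches "ICR PENDING: 0x.. (ICR: 0x.., IMS: 0x..)" as printed in the trace.
TRACE_RE = re.compile(
    r"ICR PENDING:\s*(0x[0-9a-fA-F]+)\s*"
    r"\(ICR:\s*(0x[0-9a-fA-F]+),\s*IMS:\s*(0x[0-9a-fA-F]+)\)"
)

# Two entries copied from the trace above, including quoting and line wraps.
SAMPLE = """\
> 23004@1494519346.673905:e1000e_irq_pending_interrupts ICR PENDING: 0x100000
> (ICR: 0x80100082, IMS: 0x1f00004)
> 23004@1494519346.795943:e1000e_irq_pending_interrupts ICR PENDING: 0x1000000
> (ICR: 0x815000c2, IMS: 0x1a00004)
"""


def parse_trace(text):
    """Return a list of (pending, icr, ims) integers found in the text."""
    # Drop mail quote markers and rejoin entries wrapped by the mail client.
    flat = " ".join(line.lstrip("> ").strip() for line in text.splitlines())
    return [tuple(int(g, 16) for g in m.groups()) for m in TRACE_RE.finditer(flat)]


def mismatches(entries):
    """Return the entries where PENDING differs from ICR & IMS."""
    return [e for e in entries if e[0] != (e[1] & e[2])]


if __name__ == "__main__":
    for pending, icr, ims in parse_trace(SAMPLE):
        print(f"pending={pending:#x} icr&ims={icr & ims:#x}")
    print("mismatches:", mismatches(parse_trace(SAMPLE)))
```

For the last line of the trace, only bit 24 is set in both ICR and IMS, which matches the 0x1000000 that keeps firing.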
> 
> Dave
> 
>> Regards,
>> Dmitry
>> 
>>> On 11 May 2017, at 15:36 PM, Dr. David Alan Gilbert <dgilb...@redhat.com> 
>>> wrote:
>>> 
>>> Hi Dmitry,
>>> Have you seen any problems with e1000e migration under Windows?
>>> I've got a repeatable case where, after migration with e1000e, Windows
>>> hangs or almost hangs.
>>> I'm seeing the e1000e generate interrupts at a very high
>>> rate (maybe ~1000 per second?) after migration.
>>> 
>>> Some versions of qemu do it and some don't, but my attempts
>>> at bisection lead me to code that should be irrelevant.
>>> 
>>> Prior to migration I see:
>>> 
>>> 36461@1494504466.711929:e1000e_irq_pending_interrupts ICR PENDING: 0x100000 
>>> (ICR: 0x80100082, IMS: 0x1f00004)
>>> 36461@1494504466.711992:e1000e_irq_pending_interrupts ICR PENDING: 0x0 
>>> (ICR: 0x80000082, IMS: 0x1a00004)
>>> 36461@1494504466.712076:e1000e_irq_pending_interrupts ICR PENDING: 0x0 
>>> (ICR: 0x80000082, IMS: 0x1f00004)
>>> 36461@1494504466.712245:e1000e_irq_pending_interrupts ICR PENDING: 0x0 
>>> (ICR: 0x80000082, IMS: 0x1a00004)
>>> 36461@1494504466.712332:e1000e_irq_pending_interrupts ICR PENDING: 0x0 
>>> (ICR: 0x80000082, IMS: 0x1f00004)
>>> 
>>> where I think the ICR bits mean:
>>>     31 - int asserted
>>>     20 - RxQ0 - receive queue 0 interrupt
>>>     7  - RXT0 - receiver timer interrupt
>>>     1  - TXQE - Transmit Queue empty
>>> 
>>> After migration it varies more; I'm mostly seeing:
>>> 21977@1494504516.320707:e1000e_irq_pending_interrupts ICR PENDING: 
>>> 0x1000000 (ICR: 0x815000c2, IMS: 0x1a00004)
>>>     31 - int asserted
>>>     24 - 'Other'
>>>     22 - TxQ0 interrupt
>>>     20 - RxQ0 interrupt
>>>     07 - RXT0 Receiver timer interrupt
>>>     06 - RXO - Receiver overrun
>>>     01 - TXQE - Transmit queue empty
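The bit breakdown above can be reproduced mechanically. The following is a minimal Python sketch, not taken from QEMU: the table holds only the bit names listed in this message (it is not the full ICR layout), and `ICR_BITS` and `decode_icr` are hypothetical names chosen for illustration.

```python
# Bit names as listed in this message; any other bit is reported as "unknown".
ICR_BITS = {
    31: "INT_ASSERTED",
    24: "OTHER",
    22: "TXQ0",
    20: "RXQ0",
    7: "RXT0",
    6: "RXO",
    1: "TXQE",
}


def decode_icr(value):
    """Return (bit, name) pairs for every set bit, highest bit first."""
    decoded = []
    for bit in range(31, -1, -1):
        if value & (1 << bit):
            decoded.append((bit, ICR_BITS.get(bit, "unknown")))
    return decoded


if __name__ == "__main__":
    # ICR value seen after migration in the trace line above.
    for bit, name in decode_icr(0x815000C2):
        print(f"{bit:2d} - {name}")
```

Running `decode_icr` on the pre-migration value 0x80100082 gives only bits 31, 20, 7 and 1, so the post-migration value differs by bits 24, 22 and 6.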
>>> 
>>> For reference this is https://bugzilla.redhat.com/show_bug.cgi?id=1447935
>>> 
>>> Dave
>>> --
>>> Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
>> 
> --
> Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
