* Ronciak, John ([email protected]) wrote:
> Hi Dave,
> 
> Please see my comments in-line below.
> 
> Cheers,
> John
> 
> 
> > -----Original Message-----
> > From: Dr. David Alan Gilbert [mailto:[email protected]]
> > Sent: Monday, February 03, 2014 12:11 PM
> > To: Ronciak, John
> > Cc: [email protected]
> > Subject: Re: [E1000-devel] dual e100 'exec cuc_dump_reset' vs PCI
> > latency (possibly vs Tulip)
> > 
> > * Ronciak, John ([email protected]) wrote:
> > > Some list removed for now
> > 
> > Hi John,
> >   Thanks for the reply.
> > 
> > > What do the HW stats for the failing port say?
> > > Is it receiving what it thinks are packets that are a problem in some way?
> > 
> > I'm fairly sure they weren't incrementing at all, and I took a tcpdump
> > that was showing nothing coming from the e100's at that point.
> > Let me know which counters/debug to collect and I'll be happy to gather
> > it.
> Output the stats using 'ethtool -S <ethx>'.  Do this before the failure and 
> then again after. You can also get us the stack stats using 'netstat -s'. 

The log of 'ethtool -S' and 'netstat -s' is at:
http://www.treblig.org/daveG/bad-e100.log

That's sitting in a loop collecting both once a minute (the 'NIC statistics'
sections are the output of the ethtool).

To line those times up, here are the lines from the dmesg:

I powered up the switch/machines on the end of the e100 here:
[Sat Feb  8 19:40:08 2014] e100 0000:08:04.0 ethdad: NIC Link is Up 100 Mbps Full Duplex

and it was watching the camera a few minutes later, until it died here:

[Sat Feb  8 19:58:06 2014] e100 0000:08:04.0 ethdad: exec cuc_dump_reset failed

<repeats regularly>

[Sat Feb  8 19:58:10 2014] e100 0000:08:04.0 ethdad: exec cuc_dump_reset failed
[Sat Feb  8 19:59:54 2014] e100 0000:08:04.0 ethdad: No space for CB
[Sat Feb  8 19:59:54 2014] e100 0000:08:04.0 ethdad: scb.status=0x50

It recovered by itself about here (I'd not seen it recover before, nor seen
the 'No space for CB'/scb.status messages before):

[Sat Feb  8 19:59:54 2014] e100 0000:08:04.0 ethdad: NIC Link is Up 100 Mbps Full Duplex
[Sat Feb  8 20:01:58 2014] e100 0000:08:04.0 ethdad: NIC Link is Down
[Sat Feb  8 20:05:40 2014] e100 0000:08:04.0 ethdad: NIC Link is Up 100 Mbps Full Duplex

If you search bad-e100.log for 19:56, that's before it failed, and everything
seems to be chugging along OK until 19:57, but then there is no change in the
ethtool output between 19:58 and 19:59.
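
If it helps with reading the log, here is a rough Python sketch of the
comparison I mean: parse two 'ethtool -S' snapshots and report which counters
moved between them. It assumes the usual 'name: value' layout of ethtool -S
output; the function names and the sample numbers are made up for
illustration and are not taken from bad-e100.log.

```python
def parse_ethtool_stats(text):
    """Parse 'ethtool -S' style output into a {counter_name: int} dict."""
    stats = {}
    for line in text.splitlines():
        # Split on the last colon; lines without one are ignored.
        name, sep, value = line.strip().rpartition(":")
        if not sep:
            continue
        value = value.strip()
        # Header lines such as 'NIC statistics:' have no numeric value.
        if value.lstrip("-").isdigit():
            stats[name.strip()] = int(value)
    return stats


def changed_counters(before, after):
    """Return {counter_name: delta} for counters that differ between snapshots."""
    return {
        name: after[name] - before[name]
        for name in before
        if name in after and after[name] != before[name]
    }


# Hypothetical snapshots (illustrative values only).
snap_a = """NIC statistics:
     rx_packets: 1000
     tx_packets: 500
     rx_errors: 0
"""
snap_b = """NIC statistics:
     rx_packets: 1200
     tx_packets: 650
     rx_errors: 0
"""
snap_c = snap_b  # a stalled interface: nothing moves between samples

print(changed_counters(parse_ethtool_stats(snap_a), parse_ethtool_stats(snap_b)))
print(changed_counters(parse_ethtool_stats(snap_b), parse_ethtool_stats(snap_c)))
```

An empty result for a pair of consecutive samples is the "no change" case
described above.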

Note:
    There is other traffic on other interfaces that the netstat -s is seeing.

    I've only got one e100 in use tonight, and the other Tulips seem to be
    carrying on OK even when the e100 is upset.

Dave
-- 
 -----Open up your eyes, open up your mind, open up your code -------   
/ Dr. David Alan Gilbert    |       Running GNU/Linux       | Happy  \ 
\ gro.gilbert @ treblig.org |                               | In Hex /
 \ _________________________|_____ http://www.treblig.org   |_______/

