On Wed, 24 May 2006, Jeff Garzik wrote:

> Brent Cook wrote:
> > Note that this is just clearing the hardware statistics on the interface,
> > and would not require any kind of atomic_increment addition for interfaces
> > that support that. It would be kind-of awkward to implement this on drivers
> > that increment stats in software though (lo, vlan, br, etc.) This also
> > brings up the question of resetting the stats for 'netstat -s'
>
> If you don't atomically clear the statistics, then you are leaving open
> a window where the stats could easily be corrupted, if the network
> interface is under load.
>
> This 'clearing' operation has implications on the rest of the statistics
> usage.
>
> More complexity, and breaking of apps, when we could just use the
> existing, working system? I'll take the "do nothing, break nothing,
> everything still works" route any day.

I'll admit to not knowing all the intricacies of the kernel coding involved, but I don't offhand see how zeroing the stats would be significantly more complex than updating the stats during normal usage. But I'll have to leave that argument to the experts.

To me the main argument is that such a stat zeroing feature would be extremely useful. When trying to track down nasty networking problems that traverse a multitude of devices, it is often highly desirable to zero the interface statistics on all the interfaces in the path (a capability available on all the networking switches and routers I have worked with), run some kind of stress test across the path, and then examine the packet and error counters on all the involved interfaces. This makes it easy to pinpoint where packets are getting lost or errors are being introduced, especially when there are scores of stats per device and you may not even know a priori exactly what you are looking for. Using such a scheme, the human mind can quickly discern patterns in the data and focus in on any likely problem areas.

The human mind (at least speaking for myself) is not nearly as adept when having to deal with deltas. Yes, you can record the initial state of all the devices, run the stress test, record the new state of all the devices, spend a large amount of time devising a script to calculate the deltas for all the scores of variables on all the involved devices, and then finally try to figure out what is wrong. But it would be so much better, easier, and more efficient if the kernel simply provided a feature that almost all other networking devices provide.

I also think the SNMP/mgt apps argument is specious. A) SNMP isn't even an issue with all networks. B) As has been pointed out by others, there is no requirement to use such a new stats zeroing feature.
It would simply be a tool in the network engineer's toolbelt, just like taking an interface down and back up to see if that corrects a problem. The network engineer has to balance the potential benefit/harm of any action he chooses to take, but let him have that choice. And C) I don't think any decent SNMP/mgt app will be particularly bothered by zeroing interface stats. I believe they are fairly good at dealing with such events (I don't recall our MRTG graphs getting any giant spikes when I've zeroed interface stats on our GigE/10-GigE switches). I think the main harm in such a case would be the loss of a sampling interval.

-Bill
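
P.S. For anyone who wants to try the snapshot-and-delta route described above in the meantime, here is a rough sketch of what such a script might look like on a single Linux host. It assumes the usual /proc/net/dev layout (two header lines, then one line per interface with 16 counters); the function names and the field labels are my own invention, not anything the kernel defines, and it does not try to handle counter wrap or interfaces that appear or disappear mid-test.

```python
# Hypothetical field labels, in the column order of /proc/net/dev
# (8 receive counters followed by 8 transmit counters).
FIELDS = [
    "rx_bytes", "rx_packets", "rx_errs", "rx_drop",
    "rx_fifo", "rx_frame", "rx_compressed", "rx_multicast",
    "tx_bytes", "tx_packets", "tx_errs", "tx_drop",
    "tx_fifo", "tx_colls", "tx_carrier", "tx_compressed",
]


def parse_net_dev(text):
    """Return {interface: {field: count}} from /proc/net/dev-style text."""
    stats = {}
    for line in text.splitlines():
        if ":" not in line:
            continue  # the two header lines contain no colon
        name, _, rest = line.partition(":")
        values = [int(v) for v in rest.split()]
        stats[name.strip()] = dict(zip(FIELDS, values))
    return stats


def snapshot(path="/proc/net/dev"):
    """Read and parse the current counters (path is an assumption)."""
    with open(path) as fh:
        return parse_net_dev(fh.read())


def counter_deltas(before, after):
    """Return only the counters that changed between two snapshots."""
    deltas = {}
    for iface, new in after.items():
        old = before.get(iface, {})
        changed = {
            field: value - old.get(field, 0)
            for field, value in new.items()
            if value != old.get(field, 0)
        }
        if changed:
            deltas[iface] = changed
    return deltas
```

The idea would be to call snapshot() before the stress test, call it again afterwards, and print counter_deltas(before, after), so that only the counters that moved show up. This still has to be repeated on every device in the path.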