On Thu, Aug 18, 2005 at 03:01:39PM -0700, David S. Miller wrote:

> Yes, it all boils down to what we intend rx_errors to mean.
> Purely hardware errors?  Or also include errors due to running
> out of resources necessary for the card to continue operating
> correctly (ie. packet buffer memory exhaustion).
 
My original thought was that the stats in question were motivated by
MIB-II (RFC1213, RFC1573) or inspired by the same.  The corresponding
MIB-II definition would be ifInDiscards: "The number of inbound packets
which were chosen to be discarded even though no errors had been
detected to prevent their being deliverable to a higher-layer protocol.
One possible reason for discarding such a packet could be to free up
buffer space."  That definition would not seem to make any distinction
between hardware buffer constraints and OS allocation failures.  So,
I was thinking that these frames should be counted in rx_discards.

> The commentary in net_device_stats's definition, and the fact that
> a seperate "rx_dropped" defined as "no space in linux buffers"
> exists in the structure, leads me to conclude that you are right
> in that rx_errors should not be incremented for the rx_dropped
> case.
> 
> On the other hand, these statistics you are adding to rx_dropped
> from rx_errors are generated by the hardware, long before the
> drivers gets a chance to try and process the packet in any way.
 
Along the way, I did note the existence of another field in the
net_device_stats structure called rx_missed_errors.  Some drivers
were recording this class of frame under that counter, including e1000.

I looked at the net-snmp sources to determine how they were collecting
MIB-II stats.  They are parsing /proc/net/dev for those stats, with
ifInDiscards coming directly from the discards field of that table.
Looking deeper reveals that the value in that field of the table is
the sum of rx_discards and rx_missed_errors.

> I really think rx_dropped is meant to be a purely software driver
> piece of state meaning "could not allocate RX replenish buffer,
> therefore dropped packet and recycled it back to the chip",
> and nothing else.

So, now I'm basically convinced that frames dropped by hardware should
be counted in the rx_missed_errors field.  SNMP agents and others
wanting the equivalent of ifInDiscards just have to either take the
info from /proc/net/dev or be smart enough to sum the rx_discards
and rx_missed_errors fields of the structure.  I'll include another
sample patch for discussion that includes this change from e100,
e1000, and tg3.

The question that I think remains is whether or not frames counted as
rx_missed_errors should also be counted as part of rx_errors, or not.
Your statement above would seem to favor _not_ counting them as part
of rx_errors.  If so, is there any appetite for a patch to rename
rx_missed_errors as simply rx_missed?  That would seem more clear,
but I dunno if that definition is getting used in userland...?

Anyway, I've got the patch for e100/e1000/tg3 below for discussion
only.  If no one has any problems with it, I'll repackage it in a
more formal post.

Thanks,

John

diff --git a/drivers/net/e100.c b/drivers/net/e100.c
--- a/drivers/net/e100.c
+++ b/drivers/net/e100.c
@@ -1198,13 +1198,13 @@ static void e100_update_stats(struct nic
                ns->collisions += nic->tx_collisions;
                ns->tx_errors += le32_to_cpu(s->tx_max_collisions) +
                        le32_to_cpu(s->tx_lost_crs);
-               ns->rx_dropped += le32_to_cpu(s->rx_resource_errors);
                ns->rx_length_errors += le32_to_cpu(s->rx_short_frame_errors) +
                        nic->rx_over_length_errors;
                ns->rx_crc_errors += le32_to_cpu(s->rx_crc_errors);
                ns->rx_frame_errors += le32_to_cpu(s->rx_alignment_errors);
                ns->rx_over_errors += le32_to_cpu(s->rx_overrun_errors);
                ns->rx_fifo_errors += le32_to_cpu(s->rx_overrun_errors);
+               ns->rx_missed_errors += le32_to_cpu(s->rx_resource_errors);
                ns->rx_errors += le32_to_cpu(s->rx_crc_errors) +
                        le32_to_cpu(s->rx_alignment_errors) +
                        le32_to_cpu(s->rx_short_frame_errors) +
diff --git a/drivers/net/e1000/e1000_main.c b/drivers/net/e1000/e1000_main.c
--- a/drivers/net/e1000/e1000_main.c
+++ b/drivers/net/e1000/e1000_main.c
@@ -2542,9 +2542,7 @@ e1000_update_stats(struct e1000_adapter 
 
        adapter->net_stats.rx_errors = adapter->stats.rxerrc +
                adapter->stats.crcerrs + adapter->stats.algnerrc +
-               adapter->stats.rlec + adapter->stats.mpc + 
-               adapter->stats.cexterr;
-       adapter->net_stats.rx_dropped = adapter->stats.mpc;
+               adapter->stats.rlec + adapter->stats.cexterr;
        adapter->net_stats.rx_length_errors = adapter->stats.rlec;
        adapter->net_stats.rx_crc_errors = adapter->stats.crcerrs;
        adapter->net_stats.rx_frame_errors = adapter->stats.algnerrc;
diff --git a/drivers/net/tg3.c b/drivers/net/tg3.c
--- a/drivers/net/tg3.c
+++ b/drivers/net/tg3.c
@@ -6848,8 +6848,7 @@ static struct net_device_stats *tg3_get_
                get_stat64(&hw_stats->tx_octets);
 
        stats->rx_errors = old_stats->rx_errors +
-               get_stat64(&hw_stats->rx_errors) +
-               get_stat64(&hw_stats->rx_discards);
+               get_stat64(&hw_stats->rx_errors);
        stats->tx_errors = old_stats->tx_errors +
                get_stat64(&hw_stats->tx_errors) +
                get_stat64(&hw_stats->tx_mac_errors) +
@@ -6877,6 +6876,9 @@ static struct net_device_stats *tg3_get_
        stats->rx_crc_errors = old_stats->rx_crc_errors +
                calc_crc_errors(tp);
 
+       stats->rx_missed_errors = old_stats->rx_missed_errors +
+               get_stat64(&hw_stats->rx_discards);
+
        return stats;
 }
 
-- 
John W. Linville
[EMAIL PROTECTED]
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Reply via email to