i was poked to take a look at this draft. despite lack of yang fu,
here are some cursory comments.
While certain types of packet loss, such as policy-based discards,
are intentional and part of normal network operation, unintended
packet loss can impact customer services.
intentional drops also impact the customer. who are we kidding here?
when debugging loss, ignoring intentional drops can hide a
misconfigured drop policy.
The scope of this document is limited to reporting packet loss at
Layer 3 and frames discarded at Layer 2. This document considers
only the signals that may trigger automated mitigation actions and
not how the actions are defined or executed.
The fundamental problem for network operators is how to automatically
detect when unintended packet loss is occurring
^ and where
FEATURE-DISCARD-CLASS: The type or class of discards, which is
crucial for selecting the appropriate of mitigation - for
example:
error discards may require taking faulty components out of
service; no-buffer discards may require traffic redistribution;
policy discards typically require no automated action
policy discards may be due to misconfiguration of policies
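to make the point concrete, here is a minimal sketch of a class-to-action
mapping in which policy discards are not blindly ignored. everything in
it is my assumption, not the draft's: the action names, the
policy_baseline_pps parameter, and the tolerance factor are all
hypothetical.

```python
from enum import Enum


class DiscardClass(Enum):
    ERROR = "error"
    NO_BUFFER = "no-buffer"
    POLICY = "policy"


# hypothetical action names, loosely following the draft's examples
ACTIONS = {
    DiscardClass.ERROR: "take-out-of-service",
    DiscardClass.NO_BUFFER: "redistribute-traffic",
}


def select_action(discard_class, rate_pps, policy_baseline_pps=None, tolerance=2.0):
    """Pick a mitigation for a discard signal.

    Policy discards normally need no action, but a rate well above the
    expected baseline (an assumed input) is flagged for a config audit.
    """
    if discard_class is DiscardClass.POLICY:
        if policy_baseline_pps is not None and rate_pps > tolerance * policy_baseline_pps:
            return "audit-policy-config"
        return "none"
    return ACTIONS[discard_class]
```

with no baseline supplied, policy discards fall back to "none", which is
the draft's current behavior.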
The discard reporting can be organized into several types: control
plane, interface, flow, and device.
is a drop reported in multiple types? i.e. on the device, the interface
where it was dropped, the flow it affected, ...? while 5.2 makes this
clear, comment here might be helpful.
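if the answer is yes, the consequence for anyone consuming the counters
is roughly the following. this is only a sketch under that assumption;
the function, key names, and addresses are mine, not the draft's.

```python
from collections import Counter


def record_drop(counters, device, interface, flow):
    """Count a single dropped packet once under each reporting type."""
    counters[("device", device)] += 1
    counters[("interface", f"{device}:{interface}")] += 1
    counters[("flow", flow)] += 1


counters = Counter()
flow = ("192.0.2.1", "198.51.100.7")  # documentation addresses
record_drop(counters, "r1", "eth0", flow)
record_drop(counters, "r1", "eth1", flow)

# two packets were dropped; read the total from one type only
drops_on_device = counters[("device", "r1")]
# summing across types counts every drop three times
naive_total = sum(counters.values())
```

a collector that adds the types together overstates loss, which is why a
sentence here would help.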
The "ietf-packet-discard-reporting-sx" module uses the "sx" structure
defined in [RFC8791].
the Features list in 4.3 is not the same order as the abstract data
model structure in 4.1
identity ingress {
base direction;
description
"Reports statistics for the received from the network
packets.";
}
identity egress {
base direction;
description
"Reports statistics for the sent to the network
packets.";
}
in a complex device, i wonder if ingress and egress could be a bit
confusing
grouping qos {
description
"Quality of Service (QoS) traffic counters.";
not differentiated from security/acl, no-route, etc. drops?
grouping errors-l3-rx {
...
leaf no-route {
no-route on a received packet? on tx, sure, but rx? or are you
expecting a route back toward the source?
i wonder about being able to differentiate between drops due to static
vs dynamic (e.g. flow export) security/TE policies.
If all of the requirements listed in Section 5.2 are met, a "good"
unicast IPv4 packet received would increment:
...
the analogous rules for counting L2 frames are not formally described
+-----------+
| |
| CPU |
| |
+---+---^---+
from_cpu | | to_cpu
i suspect the intent is s/CPU/Control Plane/
+----+----+ +----------+ +---------+ +----------+ +----+----+
| | | | | | | | | |
Rx--> PHY/MAC +--> Ingress +--> Buffers +--> Egress +--> PHY/MAC +-> Tx
| | | Pipeline | | | | Pipeline | | |
+---------+ +----------+ +---------+ +----------+ +---------+
on complex devices, there are more buffers. at a minimum input vs
output buffers. C.3 starts to address this.
worse, the control plane can be more complex than ingress and egress.
punt path is good, but what about rib to (distributed) fib?
The effectiveness of automated mitigation depends on correctly
mapping discard signals to root causes and appropriate actions.
Table 1 gives example discard signal-to-mitigation action mappings
based on the features described in section 3.
i wonder about the effects of different mitigation actions across
different vendors in a multi-vendor environment. with more coffee, i
suspect one could posit undesirable behavior.
i abuse the excuse of not being a yang expert to not dive deeply into
the model presentations :)
and again, i am a n00b here. but no refunds will be provided. :)
randy