Hi Chris,

Evaluating packet processing time in software is a very challenging problem: as Dave mentioned, the measurement itself is likely to impact the performance we are trying to evaluate.

I worked on this problem and have an academic paper (unpublished, currently under review) proposing a solution based on the NetFPGA-SUME platform. Basically, I built a custom FPGA design mimicking a NIC that timestamps every packet at ingress and egress (immediately after a packet arrives from the wire, and immediately before it departs on the wire). I also wrote a DPDK driver for that NIC and made it work with VPP, so that the latency introduced by VPP plus the PCI-based DMA can be evaluated. I experimented with this design and VPP in various configurations (l2-patch, l2 cross-connect, and l3 forwarding), and I think it could be an interesting tool for diagnosing latency issues on a per-packet basis. The downside, of course, is that from VPP's perspective this is a custom NIC with a custom driver (not necessarily highly optimised), so the measured forwarding latency includes the driver's performance.
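To give a rough idea of how the hardware timestamps reach the software side: the driver's RX path essentially copies the FPGA's wire-arrival timestamp into each mbuf. The sketch below is illustrative only; the descriptor layout and all the sume_* names are invented here, not the actual driver code, and it assumes a DPDK 19.x-era mbuf with the static timestamp field and PKT_RX_TIMESTAMP flag.

    #include <rte_mbuf.h>
    #include <rte_ethdev.h>

    /* Hypothetical RX descriptor layout: assume the FPGA writes a 64-bit
     * ingress timestamp alongside each completed descriptor. */
    struct sume_rx_desc
    {
      uint64_t dma_addr;
      uint64_t hw_rx_timestamp;	/* written by the FPGA at wire arrival */
      uint32_t length;
      uint32_t status;
    };

    /* Invented helpers standing in for the driver's descriptor-ring logic. */
    struct sume_rx_desc *sume_next_completed (void *rxq);
    struct rte_mbuf *sume_desc_to_mbuf (void *rxq, struct sume_rx_desc *d);

    /* Hypothetical rx_burst callback. */
    static uint16_t
    sume_rx_burst (void *rxq, struct rte_mbuf **pkts, uint16_t n_pkts)
    {
      uint16_t n = 0;

      while (n < n_pkts)
        {
          struct sume_rx_desc *d = sume_next_completed (rxq);
          if (d == NULL)
            break;

          struct rte_mbuf *m = sume_desc_to_mbuf (rxq, d);

          /* Propagate the wire-arrival timestamp so the application can
           * later compute a per-packet residence time at egress. */
          m->timestamp = d->hw_rx_timestamp;
          m->ol_flags |= PKT_RX_TIMESTAMP;

          pkts[n++] = m;
        }
      return n;
    }

With the timestamp carried in the mbuf, the egress side of the design only needs to compare the NIC's departure timestamp against it to get a per-packet latency, independent of any software clock.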
If you are interested in discussing this work, I can give you more details and resources in unicast; don't hesitate to contact me :)

Cheers,

Mohammed Hawari
Software Engineer & PhD student
Cisco Systems

> On 18 Apr 2020, at 22:14, Dave Barach via lists.fd.io <dbarach=cisco....@lists.fd.io> wrote:
>
> If you turn on the main-loop dispatch event logs and look at the results in the g2 viewer [or dump them in ascii], you can make pretty accurate lap-time estimates for any workload. Roughly speaking, packets take one lap time to arrive and then leave.
>
> The “circuit-node <node-name>” game produces one elog event per frame, so you can look at several million frame circuit times.
>
> Individually timestamping packets would be more precise, but calling clib_cpu_time_now(...) (rdtsc instrs on x86_64) twice per packet would almost certainly affect forwarding performance.
>
> See https://fd.io/docs/vpp/master/gettingstarted/developers/eventviewer.html
>
> /*?
>  * Control event logging of api, cli, and thread barrier events.
>  * With no arguments, displays the current trace status.
>  * Name the event groups you wish to trace or stop tracing.
>  *
>  * @cliexpar
>  * @clistart
>  * elog trace api cli barrier
>  * elog trace api cli barrier disable
>  * elog trace dispatch
>  * elog trace circuit-node ethernet-input
>  * elog trace
>  * @cliend
>  * @cliexcmd{elog trace [api][cli][barrier][disable]}
>  ?*/
> /* *INDENT-OFF* */
>
> From: vpp-dev@lists.fd.io <vpp-dev@lists.fd.io> On Behalf Of Christian Hopps
> Sent: Saturday, April 18, 2020 3:14 PM
> To: vpp-dev <vpp-dev@lists.fd.io>
> Cc: Christian Hopps <cho...@chopps.org>
> Subject: [vpp-dev] Packet processing time.
>
> The recent discussion on reference counting and barrier timing has got me interested in packet processing time. I realize there's a way to use "show runtime" along with knowledge of the arc a packet follows, but I'm curious whether something more straightforward has been attempted, where packets are timestamped on ingress (or creation) and stats are collected on egress (transmission)?
>
> I also have an unrelated interest in hooking into the graph immediately post-transmission: I'd like to adjust an input queue size only when the packet enqueued on it has actually been transmitted on the wire, and not just handed off downstream on the arc. This would likely be the same place packet stat collection might occur. :)
>
> Thanks,
> Chris.
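To make the overhead Dave mentions concrete: the naive per-packet variant Chris asks about would amount to two TSC reads per packet inside the graph, roughly like the sketch below. This is not existing VPP code; stashing the stamp in the buffer's opaque2 area is an invented convention for illustration, and a real patch would want a dedicated metadata field.

    #include <vlib/vlib.h>
    #include <vppinfra/time.h>

    /* Sketch: stamp a buffer at ingress, e.g. from a device-input feature. */
    static_always_inline void
    stamp_rx (vlib_buffer_t * b)
    {
      /* First rdtsc for this packet. */
      ((u64 *) b->opaque2)[0] = clib_cpu_time_now ();
    }

    /* Sketch: at egress (e.g. near interface output), compute the
     * per-packet residence time in CPU cycles. */
    static_always_inline u64
    residence_cycles (vlib_buffer_t * b)
    {
      /* Second rdtsc for this packet -- this per-packet pair is what
       * would measurably slow down forwarding. */
      return clib_cpu_time_now () - ((u64 *) b->opaque2)[0];
    }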