On Sat, 12 Apr 2025 13:06:55 +0200 Morten Brørup <m...@smartsharesystems.com> wrote:
> > From: Stephen Hemminger [mailto:step...@networkplumber.org]
> > Sent: Saturday, 12 April 2025 01.45
> >
> > This is a rework of how packet capture is done in DPDK.
> > The existing mechanism using callbacks has a number of problems;
> > the main one is that any packets sent/received in a secondary process
> > are not visible. The root cause is that callbacks can not function
> > across process boundaries, they are specific to the role.
> >
> > The new mechanism builds on the concept of port mirroring used
> > in Linux/FreeBSD and also router vendors (SPAN ports). The
> > infrastructure
> > is built around a new ethdev call to mirror a port.
> >
> > The internals of dumpcap are redone but the program command
> > syntax is unchanged (no doc change needed). Using the new mirror
> > mechanism,
> > the dumpcap program creates a port using ring PMD and
> > then directs mirror to that. Then the packets are extracted from the
> > ring.
> > If capturing on multiple devices, they all get mirrored to
> > the same ring PMD.
>
> Here's some general feedback...
>
> I very much like the concept of using a shared ring for carrying the mirrored
> packets.
> It allows other types of future consumers to process the mirrored packets,
> e.g. encapsulating and forwarding them into an L2 or L3 tunnel, or Wireshark
> remote capture.
>
> Using a Ring PMD instead of setting up a dedicated ring also has some
> advantages, such as the ability to set up multiple separate mirror target
> instances.
> For performance reasons, we should ensure that a lightweight Ring PMD is
> available for mirroring, in case the Ring PMD is extended with new features
> affecting its performance.
> Or maybe create a new type of mirror/capture virtual PMD. This would allow
> applications to enqueue packets from non-ethdev interfaces into it.

The new modifications are simple, and do not impact performance of ring PMD.
We don't need another PMD for this.
It did uncover some broken APIs in ring PMD, but those can be marked deprecated
and disappear in some future version.

>
> I'm not convinced you should undo the VLAN offloads when enqueueing a mirror
> packet... If you do, you should also undo QinQ offloads. Undoing offloads is
> never going to end.
> And if you create a dedicated PMD type for carrying mirrored packets, you can
> ensure that the offload fields remain intact on dequeue.

VLAN needs to be unoffloaded before filtering and writing to file.
Can move that logic to the other end of the ring (i.e. in dumpcap).

The issue with filtering is that current DPDK BPF has no way to reference
offload metadata. There is a way to do that with Linux kernel BPF (via
negative offsets). Looked into doing this in the DPDK BPF, but there are
several blockers: not all the offloads are the same, and more importantly the
pcap library to build filters (pcap_compile) has some internal issues.
The pcap library "knows" when it is building a direct Linux socket filter,
versus just classic BPF. For example, if you call pcap_compile() with
"vlan 13", it will generate different code based on whether it is a Linux
filter or not.

> You should consider sampling and VLAN filtering as typical mirroring features.
> It would improve the performance if such filtering is done before copying the
> packets.

The problem is that filtering before going into the ring leads to creating
a BPF dependency inside ethdev, which is a build nightmare.
Tried it and was not successful.

>
> PS: I agree with your choice of copying (rather than cloning by refcount)
> when mirroring the packets.
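
To make "unoffloading" VLAN concrete, here is a minimal standalone sketch of
putting a stripped tag back into the frame bytes. It is an illustration only,
not the code in the patch: the helper name vlan_reinsert() and the flat buffer
are hypothetical, and in DPDK proper the equivalent work is done on mbufs
(rte_vlan_insert() in rte_ether.h is the existing helper for that).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ETHER_ADDR_BYTES 12      /* destination MAC + source MAC */
#define VLAN_TAG_BYTES   4       /* TPID (2 bytes) + TCI (2 bytes) */
#define TPID_8021Q       0x8100  /* IEEE 802.1Q tag protocol identifier */

/*
 * Hypothetical helper: re-insert an 802.1Q tag that the NIC stripped.
 * 'frame' holds 'len' bytes of an untagged Ethernet frame in a buffer of
 * 'cap' bytes; 'tci' is the tag control information (priority + VLAN id)
 * that the driver reported out of band.
 * Returns the new frame length, or 0 if the frame is too short or the
 * buffer has no room for the extra 4 bytes.
 */
static size_t
vlan_reinsert(uint8_t *frame, size_t len, size_t cap, uint16_t tci)
{
	if (len < ETHER_ADDR_BYTES + 2 || len + VLAN_TAG_BYTES > cap)
		return 0;

	/* Shift EtherType and payload back to open a 4 byte gap. */
	memmove(frame + ETHER_ADDR_BYTES + VLAN_TAG_BYTES,
		frame + ETHER_ADDR_BYTES, len - ETHER_ADDR_BYTES);

	/* Write TPID and TCI in network byte order. */
	frame[ETHER_ADDR_BYTES + 0] = (uint8_t)(TPID_8021Q >> 8);
	frame[ETHER_ADDR_BYTES + 1] = (uint8_t)(TPID_8021Q & 0xff);
	frame[ETHER_ADDR_BYTES + 2] = (uint8_t)(tci >> 8);
	frame[ETHER_ADDR_BYTES + 3] = (uint8_t)(tci & 0xff);

	return len + VLAN_TAG_BYTES;
}
```

Once the tag is back in the packet data, a classic BPF filter that only sees
frame bytes can match on it, and the pcap file records the tagged frame.
QinQ would need the same step repeated for the outer tag.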