After several hours (6 hours on average) of running a DPDK application, rte_eth_rx_burst suddenly returns a single mbuf with no data, that is, an mbuf with mbuf->pool == NULL, m->buf_physaddr == 0 and m->buf_addr == NULL.
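The symptom shows up right at the RX call. Below is a minimal sketch of how the corrupt mbuf can be detected straight after the burst; the burst size, port/queue ids and the logging are illustrative, not our exact code:

    #include <rte_ethdev.h>
    #include <rte_mbuf.h>
    #include <rte_log.h>

    #define RX_BURST 32  /* illustrative burst size */

    static void
    rx_and_check(uint8_t port_id, uint16_t queue_id)
    {
        struct rte_mbuf *pkts[RX_BURST];
        uint16_t nb_rx, i;

        nb_rx = rte_eth_rx_burst(port_id, queue_id, pkts, RX_BURST);

        for (i = 0; i < nb_rx; i++) {
            struct rte_mbuf *m = pkts[i];

            /* The broken mbuf we observe: all three fields are zeroed. */
            if (m->pool == NULL || m->buf_physaddr == 0 || m->buf_addr == NULL) {
                RTE_LOG(ERR, USER1,
                    "corrupt mbuf %p on port %u queue %u\n",
                    (void *)m, (unsigned)port_id, (unsigned)queue_id);
                /* rte_mbuf_sanity_check() would abort here; skipping the
                 * mbuf instead lets us inspect the surrounding state. */
                continue;
            }
            /* ... hash on the source IP and hand m to a worker ... */
        }
    }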
Obviously a corrupt mbuf like that breaks our application (rte_mbuf_sanity_check aborts the program). How can we track down the source of this kind of misbehavior?

We are using DPDK 1.6.0r2 together with the QoS framework API. The NIC is an 82599ES 10-Gigabit SFI/SFP+ receiving tapped traffic: our client uses a traffic tap to divert a copy of around 10 Gbps of traffic to our appliance.

Our rx/tx code is similar to the load_balancer example. We read from a pair of rx queues and use a hash function over the source IP field to deliver each packet to a worker core. When a worker core finishes processing a packet, it delivers it to the tx core. The tx core enqueues the packet into the QoS framework and, just a few lines of code later, dequeues several packets from the QoS scheduler.

Because we are using a tap to divert a copy of the traffic, the tx path to the physical NIC is disabled, so when we dequeue packets from the QoS scheduler we simply drop all of them. There is of course a reason why we run the QoS scheduler without physically transmitting anything: we only want some statistics about the QoS framework's behaviour. A simplified sketch of the tx core follows.
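Roughly, the tx core does the following, assuming a single ring from the workers; the ring, burst size and function names here are illustrative, and the QoS classification step (rte_sched_port_pkt_write) is omitted for brevity:

    #include <rte_ring.h>
    #include <rte_sched.h>
    #include <rte_mbuf.h>

    #define TX_BURST 32  /* illustrative burst size */

    /* Simplified tx-core loop: pull packets from the workers' ring,
     * push them into the QoS scheduler, then drop whatever the
     * scheduler hands back, since tx to the physical NIC is disabled. */
    static void
    tx_core_loop(struct rte_ring *from_workers, struct rte_sched_port *qos_port)
    {
        struct rte_mbuf *in[TX_BURST], *out[TX_BURST];
        unsigned nb_in;
        int nb_out, i;

        for (;;) {
            /* Packets handed over by the worker cores. */
            nb_in = rte_ring_sc_dequeue_burst(from_workers, (void **)in, TX_BURST);
            if (nb_in > 0)
                rte_sched_port_enqueue(qos_port, in, nb_in);

            /* A few lines later: dequeue from the scheduler and free
             * every packet instead of sending it to the NIC. */
            nb_out = rte_sched_port_dequeue(qos_port, out, TX_BURST);
            for (i = 0; i < nb_out; i++)
                rte_pktmbuf_free(out[i]);
        }
    }

Any ideas?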