This RFC describes a proposal to improve packet processing in bonding mode 4 using hardware offloads.

1. Overview

In the current receive path for bonding in mode 4, every packet must be
parsed in the fast path in order to classify and process LACP packets. The
idea is to use hardware offloads to perform that classification and so
improve performance.

2. Scope of work

a) Optimization of software LACP packet classification by using packet_type
   metadata, to eliminate the requirement of parsing each packet in the
   received burst.

b) Implementation of a classification mechanism using flow director to
   redirect LACP packets to a dedicated queue (not visible to the
   application):
   - choosing the filter pattern (not all filters are supported by all
     devices),
   - changing the processing path to speed up processing of non-LACP
     packets,
   - handling LACP packets from the dedicated Rx queue and sending them to
     the dedicated Tx queue.

c) Creation of a fallback mechanism that selects the most preferable
   available method of processing:
   - flow director,
   - packet type metadata,
   - software parsing.

3. Proposal

3.1. Packet type

The packet_type approach would improve performance because packet data
would no longer need to be read. However, the bonded driver would still
need to look at the mbuf of each packet, which limits the achievable Rx
performance.

There is no packet_type value that describes LACP packets directly.
However, packet_type can be used to limit the number of packets that must
be parsed, e.g. when it indicates a packet carrying headers above L2. This
should improve performance, because packets known not to be LACP can be
skipped without looking into their data.

3.2. Flow director

Using the rte_flow API and a pattern on the Ethernet type of the packet
(0x8809), we can configure flow director to redirect slow packets to a
separate queue.

An independent Rx queue for LACP would remove the requirement to filter all
ingress traffic in software, which should result in a performance increase.
Other queues stay untouched, and processing of packets on the fast path is
reduced to simply collecting packets from the slaves.
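
As an illustration of the steering decision described above, the following
is a minimal software model, not rte_flow code: it splits a received burst
into a data queue and a dedicated LACP queue by matching EtherType 0x8809,
which is what the flow rule would ask the NIC to do in hardware. The
struct frame type and the function names are hypothetical stand-ins chosen
for this sketch, and frames are assumed to be untagged Ethernet.

```c
#include <stddef.h>
#include <stdint.h>

#define ETHERTYPE_SLOW 0x8809u /* Slow Protocols EtherType used as the match pattern */
#define ETH_HDR_LEN    14u     /* dst MAC (6) + src MAC (6) + EtherType (2) */

/* Hypothetical stand-in for a received frame; not a DPDK structure. */
struct frame {
    const uint8_t *data;
    size_t len;
};

/* Return 1 if the (untagged) frame carries the Slow Protocols EtherType. */
static int is_slow_frame(const struct frame *f)
{
    if (f->len < ETH_HDR_LEN)
        return 0;
    /* EtherType is stored big-endian at byte offsets 12 and 13. */
    uint16_t ethertype = (uint16_t)((f->data[12] << 8) | f->data[13]);
    return ethertype == ETHERTYPE_SLOW;
}

/*
 * Model of the steering decision: slow frames go to the dedicated LACP
 * queue, everything else goes to the data queue. Both output arrays must
 * have room for n entries. Returns the number of frames in lacp_q and
 * stores the number of frames in data_q in *n_data.
 */
static size_t steer_burst(const struct frame *rx, size_t n,
                          const struct frame **data_q, size_t *n_data,
                          const struct frame **lacp_q)
{
    size_t n_lacp = 0;

    *n_data = 0;
    for (size_t i = 0; i < n; i++) {
        if (is_slow_frame(&rx[i]))
            lacp_q[n_lacp++] = &rx[i];
        else
            data_q[(*n_data)++] = &rx[i];
    }
    return n_lacp;
}
```

With the rule installed in hardware, the loop above disappears from the
fast path: the data queue never contains slow frames, so the bonded Rx
burst only has to collect packets.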

A separate Tx queue for the LACP daemon allows LACP responses to be sent
immediately, without interfering with the Tx fast path.

                             RECEIVE

       .---------------.
       | Slave 0       |
       |      .------. |
       |  Fd  | Rxq  | |
 Rx ======o==>|      |==============.
       |  |   +======+ |            |      .---------------.
       |  `-->| LACP |--------.     |      | Bonding       |
       |      `------' |      |     |      |      .------. |
       `---------------'      |     |      |      |      | |
                              |     >============>|      |=======> Rx
       .---------------.      |     |      |      +======+ |
       | Slave 1       |      |     |      |      | XXXX | |
       |      .------. |      |     |      |      `------' |
       |  Fd  | Rxq  | |      |     |      `---------------'
 Rx ======o==>|      |=============='      .---------------.
       |  |   +======+ |      |            |               |
       |  `-->| LACP |--------+----------->+ LACP DAEMON   |
       |      `------' |             Tx <--|               |
       `---------------'                   `---------------'

All slow packets received by the slaves of the bonded device are redirected
to the separate queue using flow director. Other packets are collected from
the slaves and exposed to the application through Rx burst on the bonded
device.

                             TRANSMIT

       .---------------.
       | Slave 0       |
       |      .------. |
       |      |      | |
 Tx <=====+===|      |<=============.
       |  |   |------| |            |      .---------------.
       |  `---| LACP |<-------.     |      | Bonding       |
       |      `------' |      |     |      |      .------. |
       `---------------'      |     |      |      |      | |
                              |     +<============|      |<====== Tx
       .---------------.      |     |      |      +======+ |
       | Slave 1       |      |     |      |      | XXXX | |
       |      .------. |      |     |      |      `------' |
       |      |      | |      |     |      `---------------'
 Tx <=====+===|      |<============='   Rx .---------------.
       |  |   |------| |      |         `->|               |
       |  `---| LACP |<-------+------------+ LACP DAEMON   |
       |      `------' |                   |               |
       `---------------'                   `---------------'

On transmit, packets are propagated to the slaves. Because there is a
separate Tx queue for LACP responses, they can be sent independently of the
fast path.
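
The fallback order from 2 c) and the packet_type pre-filter from 3.1 can be
sketched together as follows. This is only an illustration under stated
assumptions: the capability flags, the enum and the PTYPE_L3_MASK value are
hypothetical names defined locally for the example and are not taken from
the bonding driver or from any DPDK header.

```c
#include <stdint.h>

/* Hypothetical capability flags for a slave port (assumed for this sketch). */
#define CAP_FLOW_RULE   (1u << 0) /* can steer EtherType 0x8809 to a dedicated queue */
#define CAP_PACKET_TYPE (1u << 1) /* fills packet_type metadata on receive */

/* Hypothetical packet_type layout: a non-zero value under this mask means
 * that a header above L2 was recognised, so the packet cannot be LACP. */
#define PTYPE_L3_MASK 0x000000f0u

enum lacp_method {
    LACP_FLOW_DIRECTOR, /* most preferable: no per-packet check at all */
    LACP_PACKET_TYPE,   /* per-packet metadata check, parse only when unsure */
    LACP_SW_PARSING     /* last resort: parse every packet */
};

/* Pick the most preferable method that the given capabilities allow. */
static enum lacp_method select_method(uint32_t caps)
{
    if (caps & CAP_FLOW_RULE)
        return LACP_FLOW_DIRECTOR;
    if (caps & CAP_PACKET_TYPE)
        return LACP_PACKET_TYPE;
    return LACP_SW_PARSING;
}

/* Return 1 if the fast path must still read packet data to look for LACP. */
static int needs_parse(enum lacp_method method, uint32_t packet_type)
{
    switch (method) {
    case LACP_FLOW_DIRECTOR:
        return 0; /* slow packets never reach the data queue */
    case LACP_PACKET_TYPE:
        return (packet_type & PTYPE_L3_MASK) == 0; /* unknown above L2: parse */
    default:
        return 1; /* software parsing inspects everything */
    }
}
```

With the packet_type method, only packets for which the metadata reports
nothing above L2 are parsed; everything else is passed to the application
without touching its data.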