Regards
_Sugesh
> -----Original Message-----
> From: dev [mailto:dev-boun...@openvswitch.org] On Behalf Of Bodireddy, Bhanuprakash
> Sent: Thursday, August 18, 2016 4:59 PM
> To: dev@openvswitch.org
> Cc: Ilya Maximets <i.maxim...@samsung.com>
> Subject: [ovs-dev] OVS DPDK performance drop with multiple flows
>
> Hello All,
>
> I found a significant performance drop with OVS DPDK when testing with
> multiple IXIA streams and matching flow rules.
> Example: for a packet stream with src ip 2.2.2.1 and dst ip 3.3.3.1, the
> corresponding flow rule is set up as below:
> $ ovs-ofctl add-flow br0 dl_type=0x0800,nw_src=2.2.2.1,actions=output:2
>
> From the implementation, I see that after emc_lookup() the packets are
> grouped by matching flow and processed in batches with
> packet_batch_execute().
>
> During testing on OVS 2.6 I observed that netdev_send() gets called with
> only a few packets in a batch; it internally invokes rte_eth_tx_burst(),
> which incurs an expensive MMIO write. I was told that OVS 2.5 has an
> intermediate queue feature that queues and bursts as many packets as it
> can in order to amortize the cost of the MMIO write. When tested on
> OVS 2.5, the performance drop is still seen in spite of the intermediate
> queue implementation, for the reason below.
>
> With a single queue in use, txq_needs_locking is 'false' and flush_tx is
> always 'true'. With flush_tx always 'true', the intermediate queue flushes
> the packets of each batch using dpdk_queue_flush__() instead of queueing
> them, so the behavior is the same as OVS 2.6. This may not be what was
> intended by the original intermediate queue logic built around
> dpdk_queue_pkts().

[Sugesh] I feel we need the queueing logic, at least for the DPDK PHY ports,
to amortize the cost of the MMIO write. The performance impact is significant
when many flows share the same action. I guess in 2.5 the logic that sets
'flush_tx' should be

    netdev->tx_q[i].flush_tx = netdev->socket_id != numa_id;

rather than

    netdev->tx_q[i].flush_tx = netdev->socket_id == numa_id;

This way flush_tx is set only when socket_id is not equal to the current
numa_id. Maybe that is why flush_tx is 'true' all the time?

On the latest master the queue selection logic has changed to XPS, so there
is a slight change needed to implement the queueing logic, as the Tx queue is
no longer mapped to the PMD core id. There should be a way to identify the
queues that have to be flushed out after every burst operation. I can see a
~2 Mpps performance improvement on a PHY-PHY test with 5 flows, using a hack
implementation of queueing on master (a rough sketch of the idea is appended
at the end of this mail). Is there any other impact of enabling the queueing
logic, at least on PHY ports?

>
> Appreciate your comments on this.
>
> Regards,
> Bhanu Prakash.
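
For clarity, here is a minimal sketch of the intermediate Tx queue idea
discussed above, assuming a fixed-size per-txq buffer: packets are
accumulated and rte_eth_tx_burst() is called only when the buffer fills up or
the queue is marked flush_tx, so the MMIO doorbell write is amortized over
many packets. Only rte_eth_tx_burst() is the real DPDK API here; the struct
and helper names are illustrative and this is not the actual OVS 2.5
netdev-dpdk.c code.

    /*
     * Illustrative sketch only -- not the OVS implementation.  Shows how an
     * intermediate Tx queue amortizes the MMIO write: packets are buffered
     * per Tx queue and handed to rte_eth_tx_burst() in one large burst
     * instead of one small burst per rx batch.
     */
    #include <stdbool.h>
    #include <stdint.h>
    #include <rte_ethdev.h>
    #include <rte_mbuf.h>

    #define INTERIM_QUEUE_SIZE 32        /* illustrative burst size */

    struct tx_queue_sketch {
        struct rte_mbuf *pkts[INTERIM_QUEUE_SIZE];
        int count;                       /* packets currently queued */
        bool flush_tx;                   /* flush after every enqueue? */
    };

    /* Push everything queued to the NIC; this is where the expensive MMIO
     * doorbell write happens, so we want to get here with a full burst. */
    static void
    tx_queue_flush(uint16_t port_id, uint16_t queue_id,
                   struct tx_queue_sketch *txq)
    {
        int sent = 0;

        while (sent < txq->count) {
            sent += rte_eth_tx_burst(port_id, queue_id,
                                     txq->pkts + sent, txq->count - sent);
            /* A real implementation would bound the retries and free any
             * mbufs it gives up on. */
        }
        txq->count = 0;
    }

    /* Queue a batch of packets and flush only when the buffer is full or
     * the queue is marked flush_tx.  With flush_tx stuck at 'true' (the
     * single-queue case described above) this degenerates to one small
     * rte_eth_tx_burst() per rx batch. */
    static void
    tx_queue_pkts(uint16_t port_id, uint16_t queue_id,
                  struct tx_queue_sketch *txq,
                  struct rte_mbuf **pkts, int cnt)
    {
        int i;

        for (i = 0; i < cnt; i++) {
            if (txq->count == INTERIM_QUEUE_SIZE) {
                tx_queue_flush(port_id, queue_id, txq);
            }
            txq->pkts[txq->count++] = pkts[i];
        }

        if (txq->flush_tx) {
            tx_queue_flush(port_id, queue_id, txq);
        }
    }

The point of the sketch is the conditional at the end of tx_queue_pkts():
the queue is only flushed early when flush_tx is set, which is why the value
of flush_tx (and, on master, the XPS-based queue mapping) decides whether the
MMIO cost is amortized over a full burst or paid once per rx batch.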