> -----Original Message-----
> From: Daniele Di Proietto [mailto:diproiet...@vmware.com]
> Sent: Tuesday, June 16, 2015 6:45 PM
> To: Pravin Shelar; Wei li; Gray, Mark D
> Cc: d...@openvswitch.com
> Subject: Re: [ovs-dev] [PATCH v2] Do not flush tx queue which is shared
> among CPUs since it is always flushed
>
> On 16/06/2015 07:40, "Pravin Shelar" <pshe...@nicira.com> wrote:
>
> >On Mon, Jun 8, 2015 at 7:42 PM, Pravin Shelar <pshe...@nicira.com> wrote:
> >> On Mon, Jun 8, 2015 at 6:13 PM, Wei li <l...@dtdream.com> wrote:
> >>> When the tx queue is shared among CPUs, the packets are always
> >>> flushed in 'netdev_dpdk_eth_send', so it is unnecessary to flush
> >>> again in netdev_dpdk_rxq_recv. Otherwise the tx queue would be
> >>> accessed without locking.
> >>>
> >>> Signed-off-by: Wei li <l...@dtdream.com>
> >>> ---
> >>> v1->v2: fix typo
> >>>
> >>>  lib/netdev-dpdk.c | 7 +++++--
> >>>  1 file changed, 5 insertions(+), 2 deletions(-)
> >>>
> >>> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
> >>> index 63243d8..1e0a483 100644
> >>> --- a/lib/netdev-dpdk.c
> >>> +++ b/lib/netdev-dpdk.c
> >>> @@ -892,8 +892,11 @@ netdev_dpdk_rxq_recv(struct netdev_rxq *rxq_, struct dp_packet **packets,
> >>>      int nb_rx;
> >>>
> >>>      /* There is only one tx queue for this core. Do not flush other
> >>> -     * queueus. */
> >>> -    if (rxq_->queue_id == rte_lcore_id()) {
> >>> +     * queues.
> >>> +     * Do not flush tx queue which is shared among CPUs
> >>> +     * since it is always flushed */
> >>> +    if (rxq_->queue_id == rte_lcore_id() &&
> >>> +        OVS_LIKELY(!dev->txq_needs_locking)) {
> >>>          dpdk_queue_flush(dev, rxq_->queue_id);
> >>>      }
> >>
> >> The patch looks good, but Daniele has a plan to get rid of the queue
> >> flushing logic. Do you see a problem with doing this?
> >
> >Daniele,
> >I am not sure if we should fix this queue flushing logic if we are
> >going to remove it, so I would like to discuss it first.
> >You mentioned that removing the flushing logic results in a performance
> >drop. Can you explain it? How big is the drop, and in which case does
> >it occur?
>
> Hi Pravin,
>
> sorry for the late reply. I suspected that removing the queues in
> netdev-dpdk was going to have a performance impact in the following
> scenario: a batch of packets from different megaflows with the same
> action (output on port 1).
> Here's what happens:
>
> 1) With the queues in netdev-dpdk: dpif-netdev calls netdev_send() for
> each packet. netdev_dpdk_send() just copies the pointer into the queue.
> Before receiving the next batch, dpdk_queue_flush() calls
> rte_eth_tx_burst() with a whole batch of packets.
> 2) Without the queues in netdev-dpdk: dpif-netdev calls netdev_send()
> for each packet. netdev_dpdk_send() calls rte_eth_tx_burst() for each
> packet.
>
> Scenario 2 could be slower because rte_eth_tx_burst() is called for each
> packet (instead of once per batch). After some testing, this turns out
> not to be the case.
> It appears that the bottleneck is not rte_eth_tx_burst(), and copying
> the pointers into the temporary queue makes the code slower.
>
> What do you think? Should we just remove the queues?
> CC'ing Mark since he expressed interest in the issue.
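To make the two paths concrete, here is a minimal sketch of what is being compared. This is not the actual netdev-dpdk code: struct sketch_txq, SKETCH_TXQ_LEN and the function names are assumptions made up for illustration, and the DPDK calls use their current signatures.

/*
 * Sketch of the two send paths discussed above (illustrative only).
 */
#include <stdint.h>

#include <rte_ethdev.h>
#include <rte_mbuf.h>

#define SKETCH_TXQ_LEN 64          /* assumed intermediate queue depth */

struct sketch_txq {
    struct rte_mbuf *pkts[SKETCH_TXQ_LEN];
    uint16_t count;
};

/* Scenario 1: the send only copies the mbuf pointer into a software
 * queue; the NIC is touched later, once per batch, by the flush. */
static void
queued_send(struct sketch_txq *txq, struct rte_mbuf *pkt)
{
    if (txq->count < SKETCH_TXQ_LEN) {
        txq->pkts[txq->count++] = pkt;
    } else {
        rte_pktmbuf_free(pkt);     /* queue full: drop (real code would flush) */
    }
}

static void
queued_flush(uint16_t port_id, uint16_t queue_id, struct sketch_txq *txq)
{
    uint16_t sent = rte_eth_tx_burst(port_id, queue_id, txq->pkts,
                                     txq->count);
    while (sent < txq->count) {
        rte_pktmbuf_free(txq->pkts[sent++]);   /* free what the NIC refused */
    }
    txq->count = 0;
}

/* Scenario 2: no software queue; every packet causes its own
 * rte_eth_tx_burst() call. */
static void
direct_send(uint16_t port_id, uint16_t queue_id, struct rte_mbuf *pkt)
{
    if (rte_eth_tx_burst(port_id, queue_id, &pkt, 1) == 0) {
        rte_pktmbuf_free(pkt);
    }
}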
Every call to rte_eth_tx_burst() will generate an expensive MMIO write. Best practice would be to burst as many packets as you can to amortize the cost of the MMIO write. I am surprised at your findings for (2) above. Maybe in this case the cost of queuing is greater than the MMIO write?
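For reference, the kind of batched transmit Mark describes, one rte_eth_tx_burst() call (and so one doorbell/MMIO write) per batch with the unsent remainder retried, might look like the sketch below. This is illustrative only and is not taken from OVS or DPDK; send_batch() and the retry limit are assumptions.

/*
 * Sketch of amortizing the doorbell write over a whole batch.
 */
#include <stdint.h>

#include <rte_ethdev.h>
#include <rte_mbuf.h>

static void
send_batch(uint16_t port_id, uint16_t queue_id,
           struct rte_mbuf **pkts, uint16_t nb_pkts)
{
    uint16_t sent = 0;
    int retries;

    /* A single call (and therefore a single MMIO doorbell write) can push
     * the whole batch; retry if the tx ring is temporarily full. */
    for (retries = 0; sent < nb_pkts && retries < 4; retries++) {
        sent += rte_eth_tx_burst(port_id, queue_id, pkts + sent,
                                 nb_pkts - sent);
    }

    /* Free whatever could not be transmitted. */
    while (sent < nb_pkts) {
        rte_pktmbuf_free(pkts[sent++]);
    }
}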