I'll rebase, study and correct to OVS coding style and repost. There is a very good reason for putting constants on the left hand side of a compare statement. For example: if (NULL = x) will be a compiler error, while the following will compile and need debugging: if (x = NULL)
Although I try not making the comparison mistakes, I have recently made that exact mistake and had to debug. If I had used the second format, the complier would have output an error and saved the time of debugging. Mike Polehn -----Original Message----- From: Ben Pfaff [mailto:b...@nicira.com] Sent: Wednesday, June 25, 2014 1:48 PM To: Polehn, Mike A Cc: dev@openvswitch.org Subject: Re: [ovs-dev] [PATCH 1/1] PMD dpdk TX output SMP dpdk queue use and packet surge assoprtion. On Fri, Jun 20, 2014 at 10:24:33PM +0000, Polehn, Mike A wrote: > Put in a DPDK queue to receive from multiple core SMP input from vSwitch for > NIC TX output. > Eliminated the inside polling loop SMP TX output lock (DPDK queue handles > SMP). > Added a SMP lock for non-polling operation to allow TX output by the > non-polling thread > when interface not being polled. Lock accessed only when polling is not > enabled. > Added new netdev subroutine to control polling lock and enable and disable > flag. > Packets do not get discarded between TX pre-queue and NIC queue to handle > surges. > > Measured improved average PMD port to port 2544 zero loss packet rate > of 268,000 for packets 512 bytes and smaller. Predict double that when using > 1 cpu core/interface. > > Observed better persistence of obtaining 100% 10 GbE for larger > packets with the added DPDK queue, consistent with other tests outside > of OVS where large surges from fast path interfaces transferring > larger sized packets from VMs were being absorbed in the NIC TX pre-queue and > TX queue and packet loss was suppressed. > > Signed-off-by: Mike A. Polehn <mike.a.pol...@intel.com> This doesn't apply to the current tree. You'll need to rebase and repost it. I have some stylistic comments. Most of the following are cut-and-paste from CONTRIBUTING or CodingStyle (please read both). Many of them apply in multiple places, but I only pointed them out once. Please limit lines in the commit message to 79 characters in width. Comments should be written as full sentences that start with a capital letter and end with a period: > + /* get poll ownership */ Enclose single statements in braces: if (a > b) { return a; } else { return b; } > + for (i = 0; i < poll_cnt; i++) > + netdev_rxq_do_polling(poll_list[i].rx, true); > + > for (;;) { > unsigned int c_port_seq; > int i; When using a relational operator like "<" or "==", put an expression or variable argument on the left and a constant argument on the right, e.g. "x == 0", *not* "0 == x": > + if (NULL == dev->tx_q[i].tx_preq) { > + VLOG_ERR("eth dev tx pre-queue alloc error"); > + return -ENOMEM; > + } > } We don't generally put "inline" on functions in C files, since it suppresses otherwise useful "function not used" warnings and doesn't usually help code generation: > inline static void > -dpdk_queue_flush(struct netdev_dpdk *dev, int qid) > +dpdk_port_out(struct dpdk_tx_queue *tx_q, int qid) > { > - struct dpdk_tx_queue *txq = &dev->tx_q[qid]; > - uint32_t nb_tx; > + /* get packets from NIC tx staging queue */ > + if (likely(tx_q->count == 0)) > + tx_q->count = rte_ring_sc_dequeue_burst(tx_q->tx_preq, > + (void **)&tx_q->tx_trans[0], NIC_TX_PRE_Q_TRANS); > + > + /* send packets to NIC tx queue */ > + if (likely(tx_q->count != 0)) { > + unsigned sent = rte_eth_tx_burst(tx_q->port_id, qid, tx_q->tx_trans, > + tx_q->count); > + tx_q->count -= sent; > + > + if (unlikely((tx_q->count != 0) && (sent > 0))) > + /* move unsent packets to front of list */ > + memmove(&tx_q->tx_trans[0], &tx_q->tx_trans[sent], > + (sizeof(struct rte_mbuf *) * tx_q->count)); > + } > +} > > - if (txq->count == 0) { > - return; Put the return type, function name, and the braces that surround the function's code on separate lines, all starting in column 0: > +static void netdev_dpdk_do_poll(struct netdev_rxq *rxq_, unsigned > +enable) { > + struct netdev_rxq_dpdk *rx = netdev_rxq_dpdk_cast(rxq_); > + struct netdev *netdev = rx->up.netdev; > + struct netdev_dpdk *dev = netdev_dpdk_cast(netdev); > + struct dpdk_tx_queue *tx_q = &dev->tx_q[rxq_->queue_id]; > + > + if (enable) { > + tx_q->is_polled = true; > + /* get polling ownership */ > + rte_spinlock_lock(&tx_q->tx_lock); > + } else { > + tx_q->is_polled = false; > + /* clear queue after flag for race of the non-poll queuer */ > + dpdk_port_out(tx_q, rxq_->queue_id); > + rte_spinlock_unlock(&tx_q->tx_lock); > } > - rte_spinlock_lock(&txq->tx_lock); > - nb_tx = rte_eth_tx_burst(dev->port_id, qid, txq->burst_pkts, txq->count); > - if (nb_tx != txq->count) { > - /* free buffers if we couldn't transmit packets */ > - rte_mempool_put_bulk(dev->dpdk_mp->mp, > - (void **) &txq->burst_pkts[nb_tx], > - (txq->count - nb_tx)); > +} I guess that this "likely" macro is coming from the Linux kernel headers. Please use OVS_LIKELY instead: > + if (likely(tx_q->is_polled)) > + dpdk_port_out(tx_q, rxq_->queue_id); > > nb_rx = rte_eth_rx_burst(rx->port_id, rxq_->queue_id, > (struct rte_mbuf **) packets, Ditto OVS_UNLIKELY here: > - if (!pkt) { > + if (unlikely(!pkt)) { > ovs_mutex_lock(&dev->mutex); > dev->stats.tx_dropped++; > ovs_mutex_unlock(&dev->mutex); This is much too short to explain how the client is supposed to use it: > + /* Get poll ownership for PMD, enable before starting RX polling loop and > + * disable after exiting the polling loop. NULL if not supported. */ > + void (*rxq_do_polling)(struct netdev_rxq *rx, bool enable); > }; Ditto here: > +/* Tell when entering and exiting polling mode for 'rx' interface */ > +void netdev_rxq_do_polling(struct netdev_rxq *rx, bool enable) { > + if (rx->netdev->netdev_class->rxq_do_polling) > + rx->netdev->netdev_class->rxq_do_polling(rx, enable); } Omit parameter names from function prototypes when the names do not give useful information, e.g.: int netdev_get_mtu(const struct netdev *, int *mtup); > +void netdev_rxq_do_polling(struct netdev_rxq *rx, bool enable); _______________________________________________ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev