> -----Original Message----- > From: Daniele Di Proietto [mailto:diproiet...@vmware.com] > Sent: Tuesday, May 24, 2016 9:39 PM > To: Stokes, Ian > Cc: dev@openvswitch.org > Subject: Re: [PATCH v3 2/2] netdev-dpdk.c: Add ingress-policing > functionality. > > Thanks for the patch! > > I realized that we also need to delete the policer in > netdev_dpdk_destruct, so I folded in the following: > > diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c index > 900ec64..01da275 100644 > --- a/lib/netdev-dpdk.c > +++ b/lib/netdev-dpdk.c > @@ -929,6 +929,8 @@ netdev_dpdk_destruct(struct netdev *netdev) > > ovs_mutex_lock(&dev->mutex); > rte_eth_dev_stop(dev->port_id); > + free(ovsrcu_get_protected(struct ingress_policer *, > + &dev->ingress_policer)); > ovs_mutex_unlock(&dev->mutex); > > ovs_mutex_lock(&dpdk_mutex); > @@ -958,6 +960,11 @@ netdev_dpdk_vhost_destruct(struct netdev *netdev) > fatal_signal_remove_file_to_unlink(dev->vhost_id); > } > > + ovs_mutex_lock(&dev->mutex); > + free(ovsrcu_get_protected(struct ingress_policer *, > + &dev->ingress_policer)); > + ovs_mutex_unlock(&dev->mutex); > + > ovs_mutex_lock(&dpdk_mutex); > rte_free(dev->tx_q); > ovs_list_remove(&dev->list_node); > > > with other minor style changes and applied this to master. > > Thanks, > > Daniele
Thanks for the reviews and applying this Daniele, there's a few lessons in here around RCU and style fixes that I can look to apply to our existing QoS egress solution, I'll take a look at these and see what can be done. Thanks Ian > > > On 24/05/2016 09:36, "Ian Stokes" <ian.sto...@intel.com> wrote: > > >This patch provides the modifications required in netdev-dpdk.c and > >vswitch.xml to enable ingress policing for DPDK interfaces. > > > >This patch implements the necessary netdev functions to netdev-dpdk.c > >as well as various helper functions required for ingress policing. > > > >The vswitch.xml has been modified to explain the expected parameters > >and behaviour when using ingress policing. > > > >The INSTALL.DPDK.md guide has been modified to provide an example > >configuration of ingress policing. > > > >Signed-off-by: Ian Stokes <ian.sto...@intel.com> > >--- > > > >v3: > >- Rebase to master. > > > >*netdev-dpdk.c > >- Modified calls to function ingress_policer_run() to use pointer to > > ingress_policer struct rather than pointer to netdev-dpdk struct. > >- Clean up policer check by removing '!= NULL' in if conditions in > > netdev_dpdk_rxq_recv(). > >- Clean up policer check by removing '!= NULL' in if conditions in > > netdev_dpdk_vhost_rxq_recv(). > >- Remove '__' suffix from 'netdev_dpdk_policer_construct()'. > >- Modified 'netdev_dpdk_policer_construct()' to return pointer to > > ingress_policer. > >- Modified 'netdev_dpdk_policer_construct()' to flag VLOG_ERR if > > creation of rte_meter fails. > >- Removed function netdev_dpdk_policer_destruct__() as it is not > > required any longer. > >- Use correct variable names for netdev-dpdk *dev and netdev *netdev > > though out all new functions. > >- Remove variable err from netdev_dpdk_set_policing() as function > >cannot > > fail. > >- Simplify netdev_dpdk_set_policing() function by using single call to > > free the existing policer before creating new policer, set the rate > > and burst once before function exits. > >- Remove comment for set policing in NETDEV_DPDK DEFINE as function is > >no > > longer null. > > > >v2: > >- Rebase to latest master. > > > >*netdev-dpdk.c > >- Modified 'netdev_dpdk_set_policing()' to set default value of > > policer_burst to 8000 kbits. > >- Modified 'netdev_dpdk_policer_construct__()' to use '1000/8' for > >clarity > > instead of '125' when converting kbits to bytes for rate and burst. > > > >v1: > > > >*INSTALL.DPDK.md > >- Add ingress_policing usage example. > > > >*NEWS > >- Add support for ingress_policing to DPDK section. > > > >*netdev-dpdk.c > >- Add ingress_policer struct. > >- Modify struct netdev-dpdk to include a ovsrcu pointer to an ingress > > policer. > >- Modify netdev_dpdk_init() to initialise ingress policer to null. > >- Add function 'ingress_policer_run()'to process packets with a given > > netdev, rte_mbuf and packet count. > >- Modified 'netdev_dpdk_vhost_update_rx_counters()' to accept new > > parameter int dropped representing number of packets dropped. > >- Modified 'netdev_dpdk_vhost_update_rx_counters()' to update number of > > dropped packets in stats->rx_dropped. > >- Modified 'netdev_dpdk_vhost_rxq_recv()' to check and call > > ingress_policing functionality with 'ingress_policer_run()'. > >- Modified 'netdev_dpdk_rxq_recv()' to check and call > > ingress_policing functionality with 'ingress_policer_run()'. > >- Modified 'netdev_dpdk_rxq_recv()' to update rx dropped stats. > >- Modified 'netdev_dpdk_vhost_get_stats()' to add support for > rx_dropped. > >- Add function 'netdev_dpdk_policer_construct__()' to create an > > ingress_policer. > >- Add function 'netdev_dpdk_policer_destruct__()' to destroy an ingress > > policer. > >- Add function 'netdev_dpdk_set_policing()' responsible for calling > > helper functions to create and destroy an ingress policer for the > > device. > >- Add function 'netdev_dpdk_get_ingress_policer()' to return a pointer > > to the ingress_policer using the ovsrcu_get function. > > > >*vswitch.xml > >- Modify ingress policing section to include support in OVS with DPDK. > >--- > > INSTALL.DPDK.md | 19 +++++++ > > NEWS | 1 + > > lib/netdev-dpdk.c | 148 > ++++++++++++++++++++++++++++++++++++++++++++++++-- > > vswitchd/vswitch.xml | 4 +- > > 4 files changed, 166 insertions(+), 6 deletions(-) > > > >diff --git a/INSTALL.DPDK.md b/INSTALL.DPDK.md index 93f92e4..68735cc > >100644 > >--- a/INSTALL.DPDK.md > >+++ b/INSTALL.DPDK.md > >@@ -267,6 +267,25 @@ Using the DPDK with ovs-vswitchd: > > For more details regarding egress-policer parameters please refer > to the > > vswitch.xml. > > > >+9. Ingress Policing Example > >+ > >+ Assuming you have a vhost-user port receiving traffic consisting of > >+ packets of size 64 bytes, the following command would limit the > reception > >+ rate of the port to ~1,000,000 packets per second: > >+ > >+ `ovs-vsctl set interface vhost-user0 ingress_policing_rate=368000 > >+ ingress_policing_burst=1000` > >+ > >+ To examine the ingress policer configuration of the port: > >+ > >+ `ovs-vsctl list interface vhost-user0` > >+ > >+ To clear the ingress policer configuration from the port use the > following: > >+ > >+ `ovs-vsctl set interface vhost-user0 ingress_policing_rate=0` > >+ > >+ For more details regarding ingress-policer see the vswitch.xml. > >+ > > Performance Tuning: > > ------------------- > > > >diff --git a/NEWS b/NEWS > >index 4e81cad..ba201cf 100644 > >--- a/NEWS > >+++ b/NEWS > >@@ -32,6 +32,7 @@ Post-v2.5.0 > > * DB entries have been added for many of the DPDK EAL command > line > > arguments. Additional arguments can be passed via the dpdk- > extra > > entry. > >+ * Add ingress policing functionality. > > - ovs-benchmark: This utility has been removed due to lack of use > and > > bitrot. > > - ovs-appctl: > >diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c index > >c78f892..0389bb7 100644 > >--- a/lib/netdev-dpdk.c > >+++ b/lib/netdev-dpdk.c > >@@ -328,6 +328,12 @@ struct dpdk_ring { > > struct ovs_list list_node OVS_GUARDED_BY(dpdk_mutex); }; > > > >+struct ingress_policer { > >+ struct rte_meter_srtcm_params app_srtcm_params; > >+ struct rte_meter_srtcm in_policer; > >+ rte_spinlock_t policer_lock; > >+}; > >+ > > struct netdev_dpdk { > > struct netdev up; > > int port_id; > >@@ -378,6 +384,11 @@ struct netdev_dpdk { > > * netdev_dpdk*_reconfigure() is called */ > > int requested_n_txq; > > int requested_n_rxq; > >+ > >+ /* Ingress Policer */ > >+ OVSRCU_TYPE(struct ingress_policer *) ingress_policer; > >+ uint32_t policer_rate; > >+ uint32_t policer_burst; > > }; > > > > struct netdev_rxq_dpdk { > >@@ -391,6 +402,9 @@ static int netdev_dpdk_construct(struct netdev *); > > > > struct virtio_net * netdev_dpdk_get_virtio(const struct netdev_dpdk > > *dev); > > > >+struct ingress_policer * > >+netdev_dpdk_get_ingress_policer(const struct netdev_dpdk *dev); > >+ > > static bool > > is_dpdk_class(const struct netdev_class *class) { @@ -764,6 +778,11 > >@@ netdev_dpdk_init(struct netdev *netdev, unsigned int port_no, > > dev->qos_conf = NULL; > > rte_spinlock_init(&dev->qos_lock); > > > >+ /* Initialise rcu pointer for ingress policer to NULL */ > >+ ovsrcu_init(&dev->ingress_policer, NULL); > >+ dev->policer_rate = 0; > >+ dev->policer_burst = 0; > >+ > > netdev->n_txq = NR_QUEUE; > > netdev->n_rxq = NR_QUEUE; > > dev->requested_n_rxq = NR_QUEUE; > >@@ -1136,6 +1155,19 @@ netdev_dpdk_policer_run(struct rte_meter_srtcm > *meter, > > return cnt; > > } > > > >+static int > >+ingress_policer_run(struct ingress_policer *policer, struct rte_mbuf > **pkts, > >+ int pkt_cnt) > >+{ > >+ int cnt = 0; > >+ > >+ rte_spinlock_lock(&policer->policer_lock); > >+ cnt = netdev_dpdk_policer_run(&policer->in_policer, pkts, > pkt_cnt); > >+ rte_spinlock_unlock(&policer->policer_lock); > >+ > >+ return cnt; > >+} > >+ > > static bool > > is_vhost_running(struct virtio_net *virtio_dev) { @@ -1170,13 > >+1202,15 @@ netdev_dpdk_vhost_update_rx_size_counters(struct > >netdev_stats *stats, > > > > static inline void > > netdev_dpdk_vhost_update_rx_counters(struct netdev_stats *stats, > >- struct dp_packet **packets, int > count) > >+ struct dp_packet **packets, int > count, > >+ int dropped) > > { > > int i; > > unsigned int packet_size; > > struct dp_packet *packet; > > > > stats->rx_packets += count; > >+ stats->rx_dropped += dropped; > > for (i = 0; i < count; i++) { > > packet = packets[i]; > > packet_size = dp_packet_size(packet); @@ -1211,7 +1245,9 @@ > >netdev_dpdk_vhost_rxq_recv(struct netdev_rxq *rxq, > > struct netdev_dpdk *dev = netdev_dpdk_cast(rxq->netdev); > > struct virtio_net *virtio_dev = netdev_dpdk_get_virtio(dev); > > int qid = rxq->queue_id; > >+ struct ingress_policer *policer = > >+ netdev_dpdk_get_ingress_policer(dev); > > uint16_t nb_rx = 0; > >+ uint16_t dropped = 0; > > > > if (OVS_UNLIKELY(!is_vhost_running(virtio_dev))) { > > return EAGAIN; > >@@ -1229,8 +1265,14 @@ netdev_dpdk_vhost_rxq_recv(struct netdev_rxq > *rxq, > > return EAGAIN; > > } > > > >+ if (policer) { > >+ dropped = nb_rx; > >+ nb_rx = ingress_policer_run(policer, (struct rte_mbuf > **)packets, nb_rx); > >+ dropped -= nb_rx; > >+ } > >+ > > rte_spinlock_lock(&dev->stats_lock); > >- netdev_dpdk_vhost_update_rx_counters(&dev->stats, packets, nb_rx); > >+ netdev_dpdk_vhost_update_rx_counters(&dev->stats, packets, nb_rx, > >+ dropped); > > rte_spinlock_unlock(&dev->stats_lock); > > > > *c = (int) nb_rx; > >@@ -1243,7 +1285,9 @@ netdev_dpdk_rxq_recv(struct netdev_rxq *rxq, > >struct dp_packet **packets, { > > struct netdev_rxq_dpdk *rx = netdev_rxq_dpdk_cast(rxq); > > struct netdev_dpdk *dev = netdev_dpdk_cast(rxq->netdev); > >+ struct ingress_policer *policer = > >+ netdev_dpdk_get_ingress_policer(dev); > > int nb_rx; > >+ int dropped = 0; > > > > /* There is only one tx queue for this core. Do not flush other > > * queues. > >@@ -1261,6 +1305,19 @@ netdev_dpdk_rxq_recv(struct netdev_rxq *rxq, > struct dp_packet **packets, > > return EAGAIN; > > } > > > >+ if (policer) { > >+ dropped = nb_rx; > >+ nb_rx = ingress_policer_run(policer, (struct rte_mbuf **) > packets, nb_rx); > >+ dropped -= nb_rx; > >+ } > >+ > >+ /* Update stats to reflect dropped packets */ > >+ if (OVS_UNLIKELY(dropped)) { > >+ rte_spinlock_lock(&dev->stats_lock); > >+ dev->stats.rx_dropped += dropped; > >+ rte_spinlock_unlock(&dev->stats_lock); > >+ } > >+ > > *c = nb_rx; > > > > return 0; > >@@ -1694,6 +1751,7 @@ netdev_dpdk_vhost_get_stats(const struct netdev > *netdev, > > /* Supported Stats */ > > stats->rx_packets += dev->stats.rx_packets; > > stats->tx_packets += dev->stats.tx_packets; > >+ stats->rx_dropped = dev->stats.rx_dropped; > > stats->tx_dropped += dev->stats.tx_dropped; > > stats->multicast = dev->stats.multicast; > > stats->rx_bytes = dev->stats.rx_bytes; @@ -1821,11 +1879,12 @@ > >netdev_dpdk_get_stats(const struct netdev *netdev, struct netdev_stats > >*stats) > > > > rte_spinlock_lock(&dev->stats_lock); > > stats->tx_dropped = dev->stats.tx_dropped; > >+ stats->rx_dropped = dev->stats.rx_dropped; > > rte_spinlock_unlock(&dev->stats_lock); > > > > /* These are the available DPDK counters for packets not received > due to > > * local resource constraints in DPDK and NIC respectively. */ > >- stats->rx_dropped = rte_stats.rx_nombuf + rte_stats.imissed; > >+ stats->rx_dropped += rte_stats.rx_nombuf + rte_stats.imissed; > > stats->rx_missed_errors = rte_stats.imissed; > > > > ovs_mutex_unlock(&dev->mutex); > >@@ -1879,6 +1938,81 @@ netdev_dpdk_get_features(const struct netdev > *netdev, > > return 0; > > } > > > >+static struct ingress_policer * > >+netdev_dpdk_policer_construct(uint32_t rate, uint32_t burst) { > >+ struct ingress_policer *policer = NULL; > >+ uint64_t rate_bytes; > >+ uint64_t burst_bytes; > >+ int err = 0; > >+ > >+ policer = xmalloc(sizeof *policer); > >+ rte_spinlock_init(&policer->policer_lock); > >+ > >+ /* rte_meter requires bytes so convert kbits rate and burst to > bytes. */ > >+ rate_bytes = rate * 1000/8; > >+ burst_bytes = burst * 1000/8; > >+ > >+ policer->app_srtcm_params.cir = rate_bytes; > >+ policer->app_srtcm_params.cbs = burst_bytes; > >+ policer->app_srtcm_params.ebs = 0; > >+ err = rte_meter_srtcm_config(&policer->in_policer, > >+ &policer->app_srtcm_params); > >+ if(err) { > >+ VLOG_ERR("Could not create rte meter for ingress policer"); > >+ return NULL; > >+ } > >+ > >+ return policer; > >+} > >+ > >+static int > >+netdev_dpdk_set_policing(struct netdev* netdev, uint32_t policer_rate, > >+ uint32_t policer_burst) { > >+ struct netdev_dpdk *dev = netdev_dpdk_cast(netdev); > >+ struct ingress_policer *policer; > >+ > >+ /* Force to 0 if no rate specified, > >+ * default to 8000 kbits if burst is 0, > >+ * else stick with user-specified value. > >+ */ > >+ policer_burst = (!policer_rate ? 0 > >+ : !policer_burst ? 8000 > >+ : policer_burst); > >+ > >+ ovs_mutex_lock(&dev->mutex); > >+ > >+ policer = ovsrcu_get_protected(struct ingress_policer *, > >+ &dev->ingress_policer); > >+ > >+ if (dev->policer_rate == policer_rate && > >+ dev->policer_burst == policer_burst) { > >+ /* Assume that settings haven't changed since we last set > them. */ > >+ ovs_mutex_unlock(&dev->mutex); > >+ return 0; > >+ } > >+ > >+ /* Destroy any existing ingress policer for the device if one > exists > >+ * and the policer_rate and policer_burst are being set to 0 > >+ */ > >+ if (policer) { > >+ ovsrcu_postpone(free, policer); > >+ } > >+ > >+ if (policer_rate != 0) { > >+ policer = netdev_dpdk_policer_construct(policer_rate, > policer_burst); > >+ } else { > >+ policer = NULL; > >+ } > >+ ovsrcu_set(&dev->ingress_policer, policer); > >+ dev->policer_rate = policer_rate; > >+ dev->policer_burst = policer_burst; > >+ ovs_mutex_unlock(&dev->mutex); > >+ > >+ return 0; > >+} > >+ > > static int > > netdev_dpdk_get_ifindex(const struct netdev *netdev) { @@ -2326,6 > >+2460,12 @@ netdev_dpdk_get_virtio(const struct netdev_dpdk *dev) > > return ovsrcu_get(struct virtio_net *, &dev->virtio_dev); } > > > >+struct ingress_policer * > >+netdev_dpdk_get_ingress_policer(const struct netdev_dpdk *dev) { > >+ return ovsrcu_get(struct ingress_policer *, > >+&dev->ingress_policer); } > >+ > > /* > > * These callbacks allow virtio-net devices to be added to vhost ports > >when > > * configuration has been fully complete. > >@@ -2839,7 +2979,7 @@ netdev_dpdk_vhost_cuse_reconfigure(struct netdev > *netdev) > > GET_FEATURES, \ > > NULL, /* set_advertisements */ \ > > \ > >- NULL, /* set_policing */ \ > >+ netdev_dpdk_set_policing, \ > > netdev_dpdk_get_qos_types, \ > > NULL, /* get_qos_capabilities */ \ > > netdev_dpdk_get_qos, \ > >diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml index > >944d8ec..0c9e60c 100644 > >--- a/vswitchd/vswitch.xml > >+++ b/vswitchd/vswitch.xml > >@@ -2502,8 +2502,8 @@ > > table="Queue"/> tables). > > </p> > > <p> > >- Policing is currently implemented only on Linux. The Linux > >- implementation uses a simple ``token bucket'' approach: > >+ Policing is currently implemented on Linux and OVS with DPDK. > Both > >+ implementations use a simple ``token bucket'' approach: > > </p> > > <ul> > > <li> > >-- > >1.7.4.1 > > _______________________________________________ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev