Thanks for the patch! I realized that we also need to delete the policer in netdev_dpdk_destruct, so I folded in the following:
diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c index 900ec64..01da275 100644 --- a/lib/netdev-dpdk.c +++ b/lib/netdev-dpdk.c @@ -929,6 +929,8 @@ netdev_dpdk_destruct(struct netdev *netdev) ovs_mutex_lock(&dev->mutex); rte_eth_dev_stop(dev->port_id); + free(ovsrcu_get_protected(struct ingress_policer *, + &dev->ingress_policer)); ovs_mutex_unlock(&dev->mutex); ovs_mutex_lock(&dpdk_mutex); @@ -958,6 +960,11 @@ netdev_dpdk_vhost_destruct(struct netdev *netdev) fatal_signal_remove_file_to_unlink(dev->vhost_id); } + ovs_mutex_lock(&dev->mutex); + free(ovsrcu_get_protected(struct ingress_policer *, + &dev->ingress_policer)); + ovs_mutex_unlock(&dev->mutex); + ovs_mutex_lock(&dpdk_mutex); rte_free(dev->tx_q); ovs_list_remove(&dev->list_node); with other minor style changes and applied this to master. Thanks, Daniele On 24/05/2016 09:36, "Ian Stokes" <ian.sto...@intel.com> wrote: >This patch provides the modifications required in netdev-dpdk.c and >vswitch.xml to enable ingress policing for DPDK interfaces. > >This patch implements the necessary netdev functions to netdev-dpdk.c as >well as various helper functions required for ingress policing. > >The vswitch.xml has been modified to explain the expected parameters and >behaviour when using ingress policing. > >The INSTALL.DPDK.md guide has been modified to provide an example >configuration of ingress policing. > >Signed-off-by: Ian Stokes <ian.sto...@intel.com> >--- > >v3: >- Rebase to master. > >*netdev-dpdk.c >- Modified calls to function ingress_policer_run() to use pointer to > ingress_policer struct rather than pointer to netdev-dpdk struct. >- Clean up policer check by removing '!= NULL' in if conditions in > netdev_dpdk_rxq_recv(). >- Clean up policer check by removing '!= NULL' in if conditions in > netdev_dpdk_vhost_rxq_recv(). >- Remove '__' suffix from 'netdev_dpdk_policer_construct()'. >- Modified 'netdev_dpdk_policer_construct()' to return pointer to > ingress_policer. >- Modified 'netdev_dpdk_policer_construct()' to flag VLOG_ERR if > creation of rte_meter fails. >- Removed function netdev_dpdk_policer_destruct__() as it is not > required any longer. >- Use correct variable names for netdev-dpdk *dev and netdev *netdev > though out all new functions. >- Remove variable err from netdev_dpdk_set_policing() as function cannot > fail. >- Simplify netdev_dpdk_set_policing() function by using single call to > free the existing policer before creating new policer, set the rate > and burst once before function exits. >- Remove comment for set policing in NETDEV_DPDK DEFINE as function is no > longer null. > >v2: >- Rebase to latest master. > >*netdev-dpdk.c >- Modified 'netdev_dpdk_set_policing()' to set default value of > policer_burst to 8000 kbits. >- Modified 'netdev_dpdk_policer_construct__()' to use '1000/8' for clarity > instead of '125' when converting kbits to bytes for rate and burst. > >v1: > >*INSTALL.DPDK.md >- Add ingress_policing usage example. > >*NEWS >- Add support for ingress_policing to DPDK section. > >*netdev-dpdk.c >- Add ingress_policer struct. >- Modify struct netdev-dpdk to include a ovsrcu pointer to an ingress > policer. >- Modify netdev_dpdk_init() to initialise ingress policer to null. >- Add function 'ingress_policer_run()'to process packets with a given > netdev, rte_mbuf and packet count. >- Modified 'netdev_dpdk_vhost_update_rx_counters()' to accept new > parameter int dropped representing number of packets dropped. >- Modified 'netdev_dpdk_vhost_update_rx_counters()' to update number of > dropped packets in stats->rx_dropped. >- Modified 'netdev_dpdk_vhost_rxq_recv()' to check and call > ingress_policing functionality with 'ingress_policer_run()'. >- Modified 'netdev_dpdk_rxq_recv()' to check and call > ingress_policing functionality with 'ingress_policer_run()'. >- Modified 'netdev_dpdk_rxq_recv()' to update rx dropped stats. >- Modified 'netdev_dpdk_vhost_get_stats()' to add support for rx_dropped. >- Add function 'netdev_dpdk_policer_construct__()' to create an > ingress_policer. >- Add function 'netdev_dpdk_policer_destruct__()' to destroy an ingress > policer. >- Add function 'netdev_dpdk_set_policing()' responsible for calling > helper functions to create and destroy an ingress policer for the > device. >- Add function 'netdev_dpdk_get_ingress_policer()' to return a pointer > to the ingress_policer using the ovsrcu_get function. > >*vswitch.xml >- Modify ingress policing section to include support in OVS with DPDK. >--- > INSTALL.DPDK.md | 19 +++++++ > NEWS | 1 + > lib/netdev-dpdk.c | 148 ++++++++++++++++++++++++++++++++++++++++++++++++-- > vswitchd/vswitch.xml | 4 +- > 4 files changed, 166 insertions(+), 6 deletions(-) > >diff --git a/INSTALL.DPDK.md b/INSTALL.DPDK.md >index 93f92e4..68735cc 100644 >--- a/INSTALL.DPDK.md >+++ b/INSTALL.DPDK.md >@@ -267,6 +267,25 @@ Using the DPDK with ovs-vswitchd: > For more details regarding egress-policer parameters please refer to the > vswitch.xml. > >+9. Ingress Policing Example >+ >+ Assuming you have a vhost-user port receiving traffic consisting of >+ packets of size 64 bytes, the following command would limit the reception >+ rate of the port to ~1,000,000 packets per second: >+ >+ `ovs-vsctl set interface vhost-user0 ingress_policing_rate=368000 >+ ingress_policing_burst=1000` >+ >+ To examine the ingress policer configuration of the port: >+ >+ `ovs-vsctl list interface vhost-user0` >+ >+ To clear the ingress policer configuration from the port use the following: >+ >+ `ovs-vsctl set interface vhost-user0 ingress_policing_rate=0` >+ >+ For more details regarding ingress-policer see the vswitch.xml. >+ > Performance Tuning: > ------------------- > >diff --git a/NEWS b/NEWS >index 4e81cad..ba201cf 100644 >--- a/NEWS >+++ b/NEWS >@@ -32,6 +32,7 @@ Post-v2.5.0 > * DB entries have been added for many of the DPDK EAL command line > arguments. Additional arguments can be passed via the dpdk-extra > entry. >+ * Add ingress policing functionality. > - ovs-benchmark: This utility has been removed due to lack of use and > bitrot. > - ovs-appctl: >diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c >index c78f892..0389bb7 100644 >--- a/lib/netdev-dpdk.c >+++ b/lib/netdev-dpdk.c >@@ -328,6 +328,12 @@ struct dpdk_ring { > struct ovs_list list_node OVS_GUARDED_BY(dpdk_mutex); > }; > >+struct ingress_policer { >+ struct rte_meter_srtcm_params app_srtcm_params; >+ struct rte_meter_srtcm in_policer; >+ rte_spinlock_t policer_lock; >+}; >+ > struct netdev_dpdk { > struct netdev up; > int port_id; >@@ -378,6 +384,11 @@ struct netdev_dpdk { > * netdev_dpdk*_reconfigure() is called */ > int requested_n_txq; > int requested_n_rxq; >+ >+ /* Ingress Policer */ >+ OVSRCU_TYPE(struct ingress_policer *) ingress_policer; >+ uint32_t policer_rate; >+ uint32_t policer_burst; > }; > > struct netdev_rxq_dpdk { >@@ -391,6 +402,9 @@ static int netdev_dpdk_construct(struct netdev *); > > struct virtio_net * netdev_dpdk_get_virtio(const struct netdev_dpdk *dev); > >+struct ingress_policer * >+netdev_dpdk_get_ingress_policer(const struct netdev_dpdk *dev); >+ > static bool > is_dpdk_class(const struct netdev_class *class) > { >@@ -764,6 +778,11 @@ netdev_dpdk_init(struct netdev *netdev, unsigned int >port_no, > dev->qos_conf = NULL; > rte_spinlock_init(&dev->qos_lock); > >+ /* Initialise rcu pointer for ingress policer to NULL */ >+ ovsrcu_init(&dev->ingress_policer, NULL); >+ dev->policer_rate = 0; >+ dev->policer_burst = 0; >+ > netdev->n_txq = NR_QUEUE; > netdev->n_rxq = NR_QUEUE; > dev->requested_n_rxq = NR_QUEUE; >@@ -1136,6 +1155,19 @@ netdev_dpdk_policer_run(struct rte_meter_srtcm *meter, > return cnt; > } > >+static int >+ingress_policer_run(struct ingress_policer *policer, struct rte_mbuf **pkts, >+ int pkt_cnt) >+{ >+ int cnt = 0; >+ >+ rte_spinlock_lock(&policer->policer_lock); >+ cnt = netdev_dpdk_policer_run(&policer->in_policer, pkts, pkt_cnt); >+ rte_spinlock_unlock(&policer->policer_lock); >+ >+ return cnt; >+} >+ > static bool > is_vhost_running(struct virtio_net *virtio_dev) > { >@@ -1170,13 +1202,15 @@ netdev_dpdk_vhost_update_rx_size_counters(struct >netdev_stats *stats, > > static inline void > netdev_dpdk_vhost_update_rx_counters(struct netdev_stats *stats, >- struct dp_packet **packets, int count) >+ struct dp_packet **packets, int count, >+ int dropped) > { > int i; > unsigned int packet_size; > struct dp_packet *packet; > > stats->rx_packets += count; >+ stats->rx_dropped += dropped; > for (i = 0; i < count; i++) { > packet = packets[i]; > packet_size = dp_packet_size(packet); >@@ -1211,7 +1245,9 @@ netdev_dpdk_vhost_rxq_recv(struct netdev_rxq *rxq, > struct netdev_dpdk *dev = netdev_dpdk_cast(rxq->netdev); > struct virtio_net *virtio_dev = netdev_dpdk_get_virtio(dev); > int qid = rxq->queue_id; >+ struct ingress_policer *policer = netdev_dpdk_get_ingress_policer(dev); > uint16_t nb_rx = 0; >+ uint16_t dropped = 0; > > if (OVS_UNLIKELY(!is_vhost_running(virtio_dev))) { > return EAGAIN; >@@ -1229,8 +1265,14 @@ netdev_dpdk_vhost_rxq_recv(struct netdev_rxq *rxq, > return EAGAIN; > } > >+ if (policer) { >+ dropped = nb_rx; >+ nb_rx = ingress_policer_run(policer, (struct rte_mbuf **)packets, >nb_rx); >+ dropped -= nb_rx; >+ } >+ > rte_spinlock_lock(&dev->stats_lock); >- netdev_dpdk_vhost_update_rx_counters(&dev->stats, packets, nb_rx); >+ netdev_dpdk_vhost_update_rx_counters(&dev->stats, packets, nb_rx, >dropped); > rte_spinlock_unlock(&dev->stats_lock); > > *c = (int) nb_rx; >@@ -1243,7 +1285,9 @@ netdev_dpdk_rxq_recv(struct netdev_rxq *rxq, struct >dp_packet **packets, > { > struct netdev_rxq_dpdk *rx = netdev_rxq_dpdk_cast(rxq); > struct netdev_dpdk *dev = netdev_dpdk_cast(rxq->netdev); >+ struct ingress_policer *policer = netdev_dpdk_get_ingress_policer(dev); > int nb_rx; >+ int dropped = 0; > > /* There is only one tx queue for this core. Do not flush other > * queues. >@@ -1261,6 +1305,19 @@ netdev_dpdk_rxq_recv(struct netdev_rxq *rxq, struct >dp_packet **packets, > return EAGAIN; > } > >+ if (policer) { >+ dropped = nb_rx; >+ nb_rx = ingress_policer_run(policer, (struct rte_mbuf **) packets, >nb_rx); >+ dropped -= nb_rx; >+ } >+ >+ /* Update stats to reflect dropped packets */ >+ if (OVS_UNLIKELY(dropped)) { >+ rte_spinlock_lock(&dev->stats_lock); >+ dev->stats.rx_dropped += dropped; >+ rte_spinlock_unlock(&dev->stats_lock); >+ } >+ > *c = nb_rx; > > return 0; >@@ -1694,6 +1751,7 @@ netdev_dpdk_vhost_get_stats(const struct netdev *netdev, > /* Supported Stats */ > stats->rx_packets += dev->stats.rx_packets; > stats->tx_packets += dev->stats.tx_packets; >+ stats->rx_dropped = dev->stats.rx_dropped; > stats->tx_dropped += dev->stats.tx_dropped; > stats->multicast = dev->stats.multicast; > stats->rx_bytes = dev->stats.rx_bytes; >@@ -1821,11 +1879,12 @@ netdev_dpdk_get_stats(const struct netdev *netdev, >struct netdev_stats *stats) > > rte_spinlock_lock(&dev->stats_lock); > stats->tx_dropped = dev->stats.tx_dropped; >+ stats->rx_dropped = dev->stats.rx_dropped; > rte_spinlock_unlock(&dev->stats_lock); > > /* These are the available DPDK counters for packets not received due to > * local resource constraints in DPDK and NIC respectively. */ >- stats->rx_dropped = rte_stats.rx_nombuf + rte_stats.imissed; >+ stats->rx_dropped += rte_stats.rx_nombuf + rte_stats.imissed; > stats->rx_missed_errors = rte_stats.imissed; > > ovs_mutex_unlock(&dev->mutex); >@@ -1879,6 +1938,81 @@ netdev_dpdk_get_features(const struct netdev *netdev, > return 0; > } > >+static struct ingress_policer * >+netdev_dpdk_policer_construct(uint32_t rate, uint32_t burst) >+{ >+ struct ingress_policer *policer = NULL; >+ uint64_t rate_bytes; >+ uint64_t burst_bytes; >+ int err = 0; >+ >+ policer = xmalloc(sizeof *policer); >+ rte_spinlock_init(&policer->policer_lock); >+ >+ /* rte_meter requires bytes so convert kbits rate and burst to bytes. */ >+ rate_bytes = rate * 1000/8; >+ burst_bytes = burst * 1000/8; >+ >+ policer->app_srtcm_params.cir = rate_bytes; >+ policer->app_srtcm_params.cbs = burst_bytes; >+ policer->app_srtcm_params.ebs = 0; >+ err = rte_meter_srtcm_config(&policer->in_policer, >+ &policer->app_srtcm_params); >+ if(err) { >+ VLOG_ERR("Could not create rte meter for ingress policer"); >+ return NULL; >+ } >+ >+ return policer; >+} >+ >+static int >+netdev_dpdk_set_policing(struct netdev* netdev, uint32_t policer_rate, >+ uint32_t policer_burst) >+{ >+ struct netdev_dpdk *dev = netdev_dpdk_cast(netdev); >+ struct ingress_policer *policer; >+ >+ /* Force to 0 if no rate specified, >+ * default to 8000 kbits if burst is 0, >+ * else stick with user-specified value. >+ */ >+ policer_burst = (!policer_rate ? 0 >+ : !policer_burst ? 8000 >+ : policer_burst); >+ >+ ovs_mutex_lock(&dev->mutex); >+ >+ policer = ovsrcu_get_protected(struct ingress_policer *, >+ &dev->ingress_policer); >+ >+ if (dev->policer_rate == policer_rate && >+ dev->policer_burst == policer_burst) { >+ /* Assume that settings haven't changed since we last set them. */ >+ ovs_mutex_unlock(&dev->mutex); >+ return 0; >+ } >+ >+ /* Destroy any existing ingress policer for the device if one exists >+ * and the policer_rate and policer_burst are being set to 0 >+ */ >+ if (policer) { >+ ovsrcu_postpone(free, policer); >+ } >+ >+ if (policer_rate != 0) { >+ policer = netdev_dpdk_policer_construct(policer_rate, policer_burst); >+ } else { >+ policer = NULL; >+ } >+ ovsrcu_set(&dev->ingress_policer, policer); >+ dev->policer_rate = policer_rate; >+ dev->policer_burst = policer_burst; >+ ovs_mutex_unlock(&dev->mutex); >+ >+ return 0; >+} >+ > static int > netdev_dpdk_get_ifindex(const struct netdev *netdev) > { >@@ -2326,6 +2460,12 @@ netdev_dpdk_get_virtio(const struct netdev_dpdk *dev) > return ovsrcu_get(struct virtio_net *, &dev->virtio_dev); > } > >+struct ingress_policer * >+netdev_dpdk_get_ingress_policer(const struct netdev_dpdk *dev) >+{ >+ return ovsrcu_get(struct ingress_policer *, &dev->ingress_policer); >+} >+ > /* > * These callbacks allow virtio-net devices to be added to vhost ports when > * configuration has been fully complete. >@@ -2839,7 +2979,7 @@ netdev_dpdk_vhost_cuse_reconfigure(struct netdev *netdev) > GET_FEATURES, \ > NULL, /* set_advertisements */ \ > \ >- NULL, /* set_policing */ \ >+ netdev_dpdk_set_policing, \ > netdev_dpdk_get_qos_types, \ > NULL, /* get_qos_capabilities */ \ > netdev_dpdk_get_qos, \ >diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml >index 944d8ec..0c9e60c 100644 >--- a/vswitchd/vswitch.xml >+++ b/vswitchd/vswitch.xml >@@ -2502,8 +2502,8 @@ > table="Queue"/> tables). > </p> > <p> >- Policing is currently implemented only on Linux. The Linux >- implementation uses a simple ``token bucket'' approach: >+ Policing is currently implemented on Linux and OVS with DPDK. Both >+ implementations use a simple ``token bucket'' approach: > </p> > <ul> > <li> >-- >1.7.4.1 > _______________________________________________ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev