Thanks for the patch!

I realized that we also need to delete the policer in netdev_dpdk_destruct,
so I folded in the following:

diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index 900ec64..01da275 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -929,6 +929,8 @@ netdev_dpdk_destruct(struct netdev *netdev)

     ovs_mutex_lock(&dev->mutex);
     rte_eth_dev_stop(dev->port_id);
+    free(ovsrcu_get_protected(struct ingress_policer *,
+                              &dev->ingress_policer));
     ovs_mutex_unlock(&dev->mutex);

     ovs_mutex_lock(&dpdk_mutex);
@@ -958,6 +960,11 @@ netdev_dpdk_vhost_destruct(struct netdev *netdev)
         fatal_signal_remove_file_to_unlink(dev->vhost_id);
     }

+    ovs_mutex_lock(&dev->mutex);
+    free(ovsrcu_get_protected(struct ingress_policer *,
+                              &dev->ingress_policer));
+    ovs_mutex_unlock(&dev->mutex);
+
     ovs_mutex_lock(&dpdk_mutex);
     rte_free(dev->tx_q);
     ovs_list_remove(&dev->list_node);


with other minor style changes and applied this to master.

Thanks,

Daniele


On 24/05/2016 09:36, "Ian Stokes" <ian.sto...@intel.com> wrote:

>This patch provides the modifications required in netdev-dpdk.c and
>vswitch.xml to enable ingress policing for DPDK interfaces.
>
>This patch implements the necessary netdev functions to netdev-dpdk.c as
>well as various helper functions required for ingress policing.
>
>The vswitch.xml has been modified to explain the expected parameters and
>behaviour when using ingress policing.
>
>The INSTALL.DPDK.md guide has been modified to provide an example
>configuration of ingress policing.
>
>Signed-off-by: Ian Stokes <ian.sto...@intel.com>
>---
>
>v3:
>- Rebase to master.
>
>*netdev-dpdk.c
>- Modified calls to function ingress_policer_run() to use pointer to
>  ingress_policer struct rather than pointer to netdev-dpdk struct.
>- Clean up policer check by removing '!= NULL' in if conditions in
>  netdev_dpdk_rxq_recv().
>- Clean up policer check by removing '!= NULL' in if conditions in
>  netdev_dpdk_vhost_rxq_recv().
>- Remove '__' suffix from 'netdev_dpdk_policer_construct()'.
>- Modified 'netdev_dpdk_policer_construct()' to return pointer to
>  ingress_policer.
>- Modified 'netdev_dpdk_policer_construct()' to flag VLOG_ERR if
>  creation of rte_meter fails.
>- Removed function netdev_dpdk_policer_destruct__() as it is not
>  required any longer.
>- Use correct variable names for netdev-dpdk *dev and netdev *netdev
>  though out all new functions.
>- Remove variable err from netdev_dpdk_set_policing() as function cannot
>  fail.
>- Simplify netdev_dpdk_set_policing() function by using single call to
>  free the existing policer before creating new policer, set the rate
>  and burst once before function exits.
>- Remove comment for set policing in NETDEV_DPDK DEFINE as function is no
>  longer null.
>
>v2:
>- Rebase to latest master.
>
>*netdev-dpdk.c
>- Modified 'netdev_dpdk_set_policing()' to set default value of
>  policer_burst to 8000 kbits.
>- Modified 'netdev_dpdk_policer_construct__()' to use '1000/8' for clarity
>  instead of '125' when converting kbits to bytes for rate and burst.
>
>v1:
>
>*INSTALL.DPDK.md
>- Add ingress_policing usage example.
>
>*NEWS
>- Add support for ingress_policing to DPDK section.
>
>*netdev-dpdk.c
>- Add ingress_policer struct.
>- Modify struct netdev-dpdk to include a ovsrcu pointer to an ingress
>  policer.
>- Modify netdev_dpdk_init() to initialise ingress policer to null.
>- Add function 'ingress_policer_run()'to process packets with a given
>  netdev, rte_mbuf and packet count.
>- Modified 'netdev_dpdk_vhost_update_rx_counters()' to accept new
>  parameter int dropped representing number of packets dropped.
>- Modified 'netdev_dpdk_vhost_update_rx_counters()' to update number of
>  dropped packets in stats->rx_dropped.
>- Modified 'netdev_dpdk_vhost_rxq_recv()' to check and call
>  ingress_policing functionality with 'ingress_policer_run()'.
>- Modified 'netdev_dpdk_rxq_recv()' to check and call
>  ingress_policing functionality with 'ingress_policer_run()'.
>- Modified 'netdev_dpdk_rxq_recv()' to update rx dropped stats.
>- Modified 'netdev_dpdk_vhost_get_stats()' to add support for rx_dropped.
>- Add function 'netdev_dpdk_policer_construct__()' to create an
>  ingress_policer.
>- Add function 'netdev_dpdk_policer_destruct__()' to destroy an ingress
>  policer.
>- Add function 'netdev_dpdk_set_policing()' responsible for calling
>  helper functions to create and destroy an ingress policer for the
>  device.
>- Add function 'netdev_dpdk_get_ingress_policer()' to return a pointer
>  to the ingress_policer using the ovsrcu_get function.
>
>*vswitch.xml
>- Modify ingress policing section to include support in OVS with DPDK.
>---
> INSTALL.DPDK.md      |   19 +++++++
> NEWS                 |    1 +
> lib/netdev-dpdk.c    |  148 ++++++++++++++++++++++++++++++++++++++++++++++++--
> vswitchd/vswitch.xml |    4 +-
> 4 files changed, 166 insertions(+), 6 deletions(-)
>
>diff --git a/INSTALL.DPDK.md b/INSTALL.DPDK.md
>index 93f92e4..68735cc 100644
>--- a/INSTALL.DPDK.md
>+++ b/INSTALL.DPDK.md
>@@ -267,6 +267,25 @@ Using the DPDK with ovs-vswitchd:
>    For more details regarding egress-policer parameters please refer to the
>    vswitch.xml.
> 
>+9. Ingress Policing Example
>+
>+   Assuming you have a vhost-user port receiving traffic consisting of
>+   packets of size 64 bytes, the following command would limit the reception
>+   rate of the port to ~1,000,000 packets per second:
>+
>+   `ovs-vsctl set interface vhost-user0 ingress_policing_rate=368000
>+    ingress_policing_burst=1000`
>+
>+   To examine the ingress policer configuration of the port:
>+
>+   `ovs-vsctl list interface vhost-user0`
>+
>+   To clear the ingress policer configuration from the port use the following:
>+
>+   `ovs-vsctl set interface vhost-user0 ingress_policing_rate=0`
>+
>+   For more details regarding ingress-policer see the vswitch.xml.
>+
> Performance Tuning:
> -------------------
> 
>diff --git a/NEWS b/NEWS
>index 4e81cad..ba201cf 100644
>--- a/NEWS
>+++ b/NEWS
>@@ -32,6 +32,7 @@ Post-v2.5.0
>      * DB entries have been added for many of the DPDK EAL command line
>        arguments. Additional arguments can be passed via the dpdk-extra
>        entry.
>+     * Add ingress policing functionality.
>    - ovs-benchmark: This utility has been removed due to lack of use and
>      bitrot.
>    - ovs-appctl:
>diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
>index c78f892..0389bb7 100644
>--- a/lib/netdev-dpdk.c
>+++ b/lib/netdev-dpdk.c
>@@ -328,6 +328,12 @@ struct dpdk_ring {
>     struct ovs_list list_node OVS_GUARDED_BY(dpdk_mutex);
> };
> 
>+struct ingress_policer {
>+    struct rte_meter_srtcm_params app_srtcm_params;
>+    struct rte_meter_srtcm in_policer;
>+    rte_spinlock_t policer_lock;
>+};
>+
> struct netdev_dpdk {
>     struct netdev up;
>     int port_id;
>@@ -378,6 +384,11 @@ struct netdev_dpdk {
>      * netdev_dpdk*_reconfigure() is called */
>     int requested_n_txq;
>     int requested_n_rxq;
>+
>+    /* Ingress Policer */
>+    OVSRCU_TYPE(struct ingress_policer *) ingress_policer;
>+    uint32_t policer_rate;
>+    uint32_t policer_burst;
> };
> 
> struct netdev_rxq_dpdk {
>@@ -391,6 +402,9 @@ static int netdev_dpdk_construct(struct netdev *);
> 
> struct virtio_net * netdev_dpdk_get_virtio(const struct netdev_dpdk *dev);
> 
>+struct ingress_policer *
>+netdev_dpdk_get_ingress_policer(const struct netdev_dpdk *dev);
>+
> static bool
> is_dpdk_class(const struct netdev_class *class)
> {
>@@ -764,6 +778,11 @@ netdev_dpdk_init(struct netdev *netdev, unsigned int 
>port_no,
>     dev->qos_conf = NULL;
>     rte_spinlock_init(&dev->qos_lock);
> 
>+    /* Initialise rcu pointer for ingress policer to NULL */
>+    ovsrcu_init(&dev->ingress_policer, NULL);
>+    dev->policer_rate = 0;
>+    dev->policer_burst = 0;
>+
>     netdev->n_txq = NR_QUEUE;
>     netdev->n_rxq = NR_QUEUE;
>     dev->requested_n_rxq = NR_QUEUE;
>@@ -1136,6 +1155,19 @@ netdev_dpdk_policer_run(struct rte_meter_srtcm *meter,
>     return cnt;
> }
> 
>+static int
>+ingress_policer_run(struct ingress_policer *policer, struct rte_mbuf **pkts,
>+                        int pkt_cnt)
>+{
>+    int cnt = 0;
>+
>+    rte_spinlock_lock(&policer->policer_lock);
>+    cnt = netdev_dpdk_policer_run(&policer->in_policer, pkts, pkt_cnt);
>+    rte_spinlock_unlock(&policer->policer_lock);
>+
>+    return cnt;
>+}
>+
> static bool
> is_vhost_running(struct virtio_net *virtio_dev)
> {
>@@ -1170,13 +1202,15 @@ netdev_dpdk_vhost_update_rx_size_counters(struct 
>netdev_stats *stats,
> 
> static inline void
> netdev_dpdk_vhost_update_rx_counters(struct netdev_stats *stats,
>-                                     struct dp_packet **packets, int count)
>+                                     struct dp_packet **packets, int count,
>+                                     int dropped)
> {
>     int i;
>     unsigned int packet_size;
>     struct dp_packet *packet;
> 
>     stats->rx_packets += count;
>+    stats->rx_dropped += dropped;
>     for (i = 0; i < count; i++) {
>         packet = packets[i];
>         packet_size = dp_packet_size(packet);
>@@ -1211,7 +1245,9 @@ netdev_dpdk_vhost_rxq_recv(struct netdev_rxq *rxq,
>     struct netdev_dpdk *dev = netdev_dpdk_cast(rxq->netdev);
>     struct virtio_net *virtio_dev = netdev_dpdk_get_virtio(dev);
>     int qid = rxq->queue_id;
>+    struct ingress_policer *policer = netdev_dpdk_get_ingress_policer(dev);
>     uint16_t nb_rx = 0;
>+    uint16_t dropped = 0;
> 
>     if (OVS_UNLIKELY(!is_vhost_running(virtio_dev))) {
>         return EAGAIN;
>@@ -1229,8 +1265,14 @@ netdev_dpdk_vhost_rxq_recv(struct netdev_rxq *rxq,
>         return EAGAIN;
>     }
> 
>+    if (policer) {
>+        dropped = nb_rx;
>+        nb_rx = ingress_policer_run(policer, (struct rte_mbuf **)packets, 
>nb_rx);
>+        dropped -= nb_rx;
>+    }
>+
>     rte_spinlock_lock(&dev->stats_lock);
>-    netdev_dpdk_vhost_update_rx_counters(&dev->stats, packets, nb_rx);
>+    netdev_dpdk_vhost_update_rx_counters(&dev->stats, packets, nb_rx, 
>dropped);
>     rte_spinlock_unlock(&dev->stats_lock);
> 
>     *c = (int) nb_rx;
>@@ -1243,7 +1285,9 @@ netdev_dpdk_rxq_recv(struct netdev_rxq *rxq, struct 
>dp_packet **packets,
> {
>     struct netdev_rxq_dpdk *rx = netdev_rxq_dpdk_cast(rxq);
>     struct netdev_dpdk *dev = netdev_dpdk_cast(rxq->netdev);
>+    struct ingress_policer *policer = netdev_dpdk_get_ingress_policer(dev);
>     int nb_rx;
>+    int dropped = 0;
> 
>     /* There is only one tx queue for this core.  Do not flush other
>      * queues.
>@@ -1261,6 +1305,19 @@ netdev_dpdk_rxq_recv(struct netdev_rxq *rxq, struct 
>dp_packet **packets,
>         return EAGAIN;
>     }
> 
>+    if (policer) {
>+        dropped = nb_rx;
>+        nb_rx = ingress_policer_run(policer, (struct rte_mbuf **) packets, 
>nb_rx);
>+        dropped -= nb_rx;
>+    }
>+
>+    /* Update stats to reflect dropped packets */
>+    if (OVS_UNLIKELY(dropped)) {
>+        rte_spinlock_lock(&dev->stats_lock);
>+        dev->stats.rx_dropped += dropped;
>+        rte_spinlock_unlock(&dev->stats_lock);
>+    }
>+
>     *c = nb_rx;
> 
>     return 0;
>@@ -1694,6 +1751,7 @@ netdev_dpdk_vhost_get_stats(const struct netdev *netdev,
>     /* Supported Stats */
>     stats->rx_packets += dev->stats.rx_packets;
>     stats->tx_packets += dev->stats.tx_packets;
>+    stats->rx_dropped = dev->stats.rx_dropped;
>     stats->tx_dropped += dev->stats.tx_dropped;
>     stats->multicast = dev->stats.multicast;
>     stats->rx_bytes = dev->stats.rx_bytes;
>@@ -1821,11 +1879,12 @@ netdev_dpdk_get_stats(const struct netdev *netdev, 
>struct netdev_stats *stats)
> 
>     rte_spinlock_lock(&dev->stats_lock);
>     stats->tx_dropped = dev->stats.tx_dropped;
>+    stats->rx_dropped = dev->stats.rx_dropped;
>     rte_spinlock_unlock(&dev->stats_lock);
> 
>     /* These are the available DPDK counters for packets not received due to
>      * local resource constraints in DPDK and NIC respectively. */
>-    stats->rx_dropped = rte_stats.rx_nombuf + rte_stats.imissed;
>+    stats->rx_dropped += rte_stats.rx_nombuf + rte_stats.imissed;
>     stats->rx_missed_errors = rte_stats.imissed;
> 
>     ovs_mutex_unlock(&dev->mutex);
>@@ -1879,6 +1938,81 @@ netdev_dpdk_get_features(const struct netdev *netdev,
>     return 0;
> }
> 
>+static struct ingress_policer *
>+netdev_dpdk_policer_construct(uint32_t rate, uint32_t burst)
>+{
>+    struct ingress_policer *policer = NULL;
>+    uint64_t rate_bytes;
>+    uint64_t burst_bytes;
>+    int err = 0;
>+
>+    policer = xmalloc(sizeof *policer);
>+    rte_spinlock_init(&policer->policer_lock);
>+
>+    /* rte_meter requires bytes so convert kbits rate and burst to bytes. */
>+    rate_bytes = rate * 1000/8;
>+    burst_bytes = burst * 1000/8;
>+
>+    policer->app_srtcm_params.cir = rate_bytes;
>+    policer->app_srtcm_params.cbs = burst_bytes;
>+    policer->app_srtcm_params.ebs = 0;
>+    err = rte_meter_srtcm_config(&policer->in_policer,
>+                                    &policer->app_srtcm_params);
>+    if(err) {
>+        VLOG_ERR("Could not create rte meter for ingress policer");
>+        return NULL;
>+    }
>+
>+    return policer;
>+}
>+
>+static int
>+netdev_dpdk_set_policing(struct netdev* netdev, uint32_t policer_rate,
>+                            uint32_t policer_burst)
>+{
>+    struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
>+    struct ingress_policer *policer;
>+
>+    /* Force to 0 if no rate specified,
>+     * default to 8000 kbits if burst is 0,
>+     * else stick with user-specified value.
>+     */
>+    policer_burst = (!policer_rate ? 0
>+                    : !policer_burst ? 8000
>+                    : policer_burst);
>+
>+    ovs_mutex_lock(&dev->mutex);
>+
>+    policer = ovsrcu_get_protected(struct ingress_policer *,
>+                                    &dev->ingress_policer);
>+
>+    if (dev->policer_rate == policer_rate &&
>+    dev->policer_burst == policer_burst) {
>+        /* Assume that settings haven't changed since we last set them. */
>+        ovs_mutex_unlock(&dev->mutex);
>+        return 0;
>+    }
>+
>+    /* Destroy any existing ingress policer for the device if one exists
>+     * and the policer_rate and policer_burst are being set to 0
>+     */
>+    if (policer) {
>+        ovsrcu_postpone(free, policer);
>+    }
>+
>+    if (policer_rate != 0) {
>+        policer = netdev_dpdk_policer_construct(policer_rate, policer_burst);
>+    } else {
>+        policer = NULL;
>+    }
>+    ovsrcu_set(&dev->ingress_policer, policer);
>+    dev->policer_rate = policer_rate;
>+    dev->policer_burst = policer_burst;
>+    ovs_mutex_unlock(&dev->mutex);
>+
>+    return 0;
>+}
>+
> static int
> netdev_dpdk_get_ifindex(const struct netdev *netdev)
> {
>@@ -2326,6 +2460,12 @@ netdev_dpdk_get_virtio(const struct netdev_dpdk *dev)
>     return ovsrcu_get(struct virtio_net *, &dev->virtio_dev);
> }
> 
>+struct ingress_policer *
>+netdev_dpdk_get_ingress_policer(const struct netdev_dpdk *dev)
>+{
>+    return ovsrcu_get(struct ingress_policer *, &dev->ingress_policer);
>+}
>+
> /*
>  * These callbacks allow virtio-net devices to be added to vhost ports when
>  * configuration has been fully complete.
>@@ -2839,7 +2979,7 @@ netdev_dpdk_vhost_cuse_reconfigure(struct netdev *netdev)
>     GET_FEATURES,                                             \
>     NULL,                       /* set_advertisements */      \
>                                                               \
>-    NULL,                       /* set_policing */            \
>+    netdev_dpdk_set_policing,                                 \
>     netdev_dpdk_get_qos_types,                                \
>     NULL,                       /* get_qos_capabilities */    \
>     netdev_dpdk_get_qos,                                      \
>diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml
>index 944d8ec..0c9e60c 100644
>--- a/vswitchd/vswitch.xml
>+++ b/vswitchd/vswitch.xml
>@@ -2502,8 +2502,8 @@
>         table="Queue"/> tables).
>       </p>
>       <p>
>-        Policing is currently implemented only on Linux.  The Linux
>-        implementation uses a simple ``token bucket'' approach:
>+        Policing is currently implemented on Linux and OVS with DPDK. Both
>+        implementations use a simple ``token bucket'' approach:
>       </p>
>       <ul>
>         <li>
>-- 
>1.7.4.1
>
_______________________________________________
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev

Reply via email to