[dpdk-dev] packet loss in usvhost dpdk interface

2015-04-06 Thread Srinivasreddy R
Hi,
I have observed packet loss with usvhost DPDK interfaces, even at a very
low packet rate. This happens because the virtqueue [dev->virtqueue] is
found to be full, so my application has to drop the packets.

Could someone please tell me how I can avoid this loss?


thanks
srinivas.


[dpdk-dev] rte_ring's dequeue appears to be slow

2015-04-06 Thread Dor Green
I have an app which captures packets on a single core and then passes
them to multiple workers on different lcores, using the ring queues.

While I manage to capture packets at 10Gbps, when I send them to the
processing lcores there is substantial packet loss. At first I figured
it was the processing I do on the packets and optimized that, which did
help a little but did not alleviate the problem.

I used Intel VTune amplifier to profile the program, and on all
profiling checks that I did there, the majority of the time in the
program is spent in "__rte_ring_sc_do_dequeue" (about 70%). I was
wondering if anyone can tell me how to optimize this, or if I'm using
the queues incorrectly, or maybe even doing the profiling wrong
(because I do find it weird that this dequeuing is so slow).

My program architecture is as follows (replaced consts with actual values):

A queue is created for each processing lcore:
  rte_ring_create(qname, 1024*1024, NUMA_SOCKET,
  RING_F_SP_ENQ | RING_F_SC_DEQ);

The capture core enqueues packets one by one to each of the queues
(the packet burst size is 256):
 rte_ring_sp_enqueue(lc[queue_index].queue, (void *const)pkts[i]);

Which are then dequeued in bulk in the processor lcores:
 rte_ring_sc_dequeue_bulk(lc->queue, (void**) &mbufs, 128);

I'm using 16 1GB hugepages, running the new 2.0 version. If there's
any further info required about the program, let me know.

Thank you.


[dpdk-dev] [PATCH 0/5] bonding corrections and additions

2015-04-06 Thread Eric Kinzie
This patchset makes a couple of small corrections to the bonding driver
and introduces the ability to use an external state machine for mode
4 operation.

Eric Kinzie (5):
  bond: use existing enslaved device queues
  bond mode 4: copy entire config structure
  bond mode 4: do not ignore multicast
  bond mode 4: allow external state machine
  bond mode 4: tests for external state machine

 app/test/test_link_bonding_mode4.c|  208 +++--
 lib/librte_pmd_bond/rte_eth_bond_8023ad.c |  176 +
 lib/librte_pmd_bond/rte_eth_bond_8023ad.h |   44 +
 lib/librte_pmd_bond/rte_eth_bond_8023ad_private.h |2 +
 lib/librte_pmd_bond/rte_eth_bond_pmd.c|   11 +-
 5 files changed, 427 insertions(+), 14 deletions(-)

-- 
1.7.10.4



[dpdk-dev] [PATCH 1/5] bond: use existing enslaved device queues

2015-04-06 Thread Eric Kinzie
If a device to be enslaved already has transmit and/or receive queues
allocated, use those and then create any additional queues that are
necessary.

Signed-off-by: Eric Kinzie 
---
 lib/librte_pmd_bond/rte_eth_bond_pmd.c |8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/lib/librte_pmd_bond/rte_eth_bond_pmd.c b/lib/librte_pmd_bond/rte_eth_bond_pmd.c
index c937e6b..4fd7d97 100644
--- a/lib/librte_pmd_bond/rte_eth_bond_pmd.c
+++ b/lib/librte_pmd_bond/rte_eth_bond_pmd.c
@@ -1318,7 +1318,9 @@ slave_configure(struct rte_eth_dev *bonded_eth_dev,
}

/* Setup Rx Queues */
-   for (q_id = 0; q_id < bonded_eth_dev->data->nb_rx_queues; q_id++) {
+   /* Use existing queues, if any */
+   for (q_id = slave_eth_dev->data->nb_rx_queues;
+q_id < bonded_eth_dev->data->nb_rx_queues; q_id++) {
bd_rx_q = (struct bond_rx_queue *)bonded_eth_dev->data->rx_queues[q_id];

errval = rte_eth_rx_queue_setup(slave_eth_dev->data->port_id, q_id,
@@ -1334,7 +1336,9 @@ slave_configure(struct rte_eth_dev *bonded_eth_dev,
}

/* Setup Tx Queues */
-   for (q_id = 0; q_id < bonded_eth_dev->data->nb_tx_queues; q_id++) {
+   /* Use existing queues, if any */
+   for (q_id = slave_eth_dev->data->nb_tx_queues;
+q_id < bonded_eth_dev->data->nb_tx_queues; q_id++) {
bd_tx_q = (struct bond_tx_queue *)bonded_eth_dev->data->tx_queues[q_id];

errval = rte_eth_tx_queue_setup(slave_eth_dev->data->port_id, q_id,
-- 
1.7.10.4



[dpdk-dev] [PATCH 2/5] bond mode 4: copy entire config structure

2015-04-06 Thread Eric Kinzie
  Copy all needed fields from the mode8023ad_private structure in
  bond_mode_8023ad_conf_get().

Signed-off-by: Eric Kinzie 
---
 lib/librte_pmd_bond/rte_eth_bond_8023ad.c |1 +
 1 file changed, 1 insertion(+)

diff --git a/lib/librte_pmd_bond/rte_eth_bond_8023ad.c b/lib/librte_pmd_bond/rte_eth_bond_8023ad.c
index 97a828e..1009d5b 100644
--- a/lib/librte_pmd_bond/rte_eth_bond_8023ad.c
+++ b/lib/librte_pmd_bond/rte_eth_bond_8023ad.c
@@ -1013,6 +1013,7 @@ bond_mode_8023ad_conf_get(struct rte_eth_dev *dev,
conf->aggregate_wait_timeout_ms = mode4->aggregate_wait_timeout / ms_ticks;
conf->tx_period_ms = mode4->tx_period_timeout / ms_ticks;
conf->update_timeout_ms = mode4->update_timeout_us / 1000;
+   conf->rx_marker_period_ms = mode4->rx_marker_timeout / ms_ticks;
 }

 void
-- 
1.7.10.4



[dpdk-dev] [PATCH 3/5] bond mode 4: do not ignore multicast

2015-04-06 Thread Eric Kinzie
The bonding PMD in mode 4 puts all enslaved interfaces into promiscuous
mode in order to receive LACPDUs and must filter unwanted packets
after the traffic has been "collected".  Allow broadcast and multicast
through so that ARP and IPv6 neighbor discovery continue to work.

Signed-off-by: Eric Kinzie 
---
 app/test/test_link_bonding_mode4.c |7 +--
 lib/librte_pmd_bond/rte_eth_bond_pmd.c |3 ++-
 2 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/app/test/test_link_bonding_mode4.c b/app/test/test_link_bonding_mode4.c
index 02380f9..5a726af 100644
--- a/app/test/test_link_bonding_mode4.c
+++ b/app/test/test_link_bonding_mode4.c
@@ -755,8 +755,11 @@ test_mode4_rx(void)
rte_eth_macaddr_get(test_params.bonded_port_id, &bonded_mac);
ether_addr_copy(&bonded_mac, &dst_mac);

-   /* Assert that dst address is not bonding address */
-   dst_mac.addr_bytes[0]++;
+   /* Assert that dst address is not bonding address.  Do not set the
+* least significant bit of the zero byte as this would create a
+* multicast address.
+*/
+   dst_mac.addr_bytes[0] += 2;

/* First try with promiscuous mode enabled.
 * Add 2 packets to each slave. First with bonding MAC address, second with
diff --git a/lib/librte_pmd_bond/rte_eth_bond_pmd.c b/lib/librte_pmd_bond/rte_eth_bond_pmd.c
index 4fd7d97..8631e12 100644
--- a/lib/librte_pmd_bond/rte_eth_bond_pmd.c
+++ b/lib/librte_pmd_bond/rte_eth_bond_pmd.c
@@ -170,7 +170,8 @@ bond_ethdev_rx_burst_8023ad(void *queue, struct rte_mbuf **bufs,
 * mode and packet address does not match. */
if (unlikely(hdr->ether_type == ether_type_slow_be ||
!collecting || (!promisc &&
-   !is_same_ether_addr(&bond_mac, &hdr->d_addr)))) {
+   (!is_multicast_ether_addr(&hdr->d_addr) &&
+    !is_same_ether_addr(&bond_mac, &hdr->d_addr))))) {

if (hdr->ether_type == ether_type_slow_be) {
bond_mode_8023ad_handle_slow_pkt(internals, slaves[i],
-- 
1.7.10.4



[dpdk-dev] [PATCH 4/5] bond mode 4: allow external state machine

2015-04-06 Thread Eric Kinzie
  Provide functions to allow an external 802.3ad state machine to transmit
  and receive LACPDUs and to set the collection/distribution flags on
  slave interfaces.

Signed-off-by: Eric Kinzie 
---
 lib/librte_pmd_bond/rte_eth_bond_8023ad.c |  175 +
 lib/librte_pmd_bond/rte_eth_bond_8023ad.h |   44 ++
 lib/librte_pmd_bond/rte_eth_bond_8023ad_private.h |2 +
 3 files changed, 221 insertions(+)

diff --git a/lib/librte_pmd_bond/rte_eth_bond_8023ad.c b/lib/librte_pmd_bond/rte_eth_bond_8023ad.c
index 1009d5b..29cd962 100644
--- a/lib/librte_pmd_bond/rte_eth_bond_8023ad.c
+++ b/lib/librte_pmd_bond/rte_eth_bond_8023ad.c
@@ -42,6 +42,8 @@

 #include "rte_eth_bond_private.h"

+static void bond_mode_8023ad_ext_periodic_cb(void *arg);
+
 #ifdef RTE_LIBRTE_BOND_DEBUG_8023AD
 #define MODE4_DEBUG(fmt, ...) RTE_LOG(DEBUG, PMD, "%6u [Port %u: %s] " fmt, \
bond_dbg_get_time_diff_ms(), slave_id, \
@@ -1014,6 +1016,8 @@ bond_mode_8023ad_conf_get(struct rte_eth_dev *dev,
conf->tx_period_ms = mode4->tx_period_timeout / ms_ticks;
conf->update_timeout_ms = mode4->update_timeout_us / 1000;
conf->rx_marker_period_ms = mode4->rx_marker_timeout / ms_ticks;
+   conf->slowrx_cb = mode4->slowrx_cb;
+   conf->external_sm = mode4->external_sm;
 }

 void
@@ -1035,6 +1039,8 @@ bond_mode_8023ad_setup(struct rte_eth_dev *dev,
conf->tx_period_ms = BOND_8023AD_TX_MACHINE_PERIOD_MS;
conf->rx_marker_period_ms = BOND_8023AD_RX_MARKER_PERIOD_MS;
conf->update_timeout_ms = BOND_MODE_8023AX_UPDATE_TIMEOUT_MS;
+   conf->slowrx_cb = NULL;
+   conf->external_sm = 0;
}

mode4->fast_periodic_timeout = conf->fast_periodic_ms * ms_ticks;
@@ -1045,6 +1051,8 @@ bond_mode_8023ad_setup(struct rte_eth_dev *dev,
mode4->tx_period_timeout = conf->tx_period_ms * ms_ticks;
mode4->rx_marker_timeout = conf->rx_marker_period_ms * ms_ticks;
mode4->update_timeout_us = conf->update_timeout_ms * 1000;
+   mode4->slowrx_cb = conf->slowrx_cb;
+   mode4->external_sm = conf->external_sm;
 }

 int
@@ -1062,6 +1070,13 @@ bond_mode_8023ad_enable(struct rte_eth_dev *bond_dev)
 int
 bond_mode_8023ad_start(struct rte_eth_dev *bond_dev)
 {
+   struct bond_dev_private *internals = bond_dev->data->dev_private;
+   struct mode8023ad_private *mode4 = &internals->mode4;
+
+   if (mode4->external_sm)
+   return rte_eal_alarm_set(BOND_MODE_8023AX_UPDATE_TIMEOUT_MS * 1000,
+   &bond_mode_8023ad_ext_periodic_cb, bond_dev);
+
return rte_eal_alarm_set(BOND_MODE_8023AX_UPDATE_TIMEOUT_MS * 1000,
&bond_mode_8023ad_periodic_cb, bond_dev);
 }
@@ -1069,6 +1084,13 @@ bond_mode_8023ad_start(struct rte_eth_dev *bond_dev)
 void
 bond_mode_8023ad_stop(struct rte_eth_dev *bond_dev)
 {
+   struct bond_dev_private *internals = bond_dev->data->dev_private;
+   struct mode8023ad_private *mode4 = &internals->mode4;
+
+   if (mode4->external_sm) {
+   rte_eal_alarm_cancel(&bond_mode_8023ad_ext_periodic_cb, bond_dev);
+   return;
+   }
rte_eal_alarm_cancel(&bond_mode_8023ad_periodic_cb, bond_dev);
 }

@@ -1215,3 +1237,156 @@ rte_eth_bond_8023ad_slave_info(uint8_t port_id, uint8_t slave_id,
info->agg_port_id = port->aggregator_port_id;
return 0;
 }
+
+int
+rte_eth_bond_8023ad_ext_collect(uint8_t port_id, uint8_t slave_id, int enabled)
+{
+   struct rte_eth_dev *bond_dev;
+   struct bond_dev_private *internals;
+   struct mode8023ad_private *mode4;
+   struct port *port;
+
+   if (valid_bonded_port_id(port_id) != 0 ||
+   rte_eth_bond_mode_get(port_id) != BONDING_MODE_8023AD)
+   return -EINVAL;
+
+   bond_dev = &rte_eth_devices[port_id];
+
+   internals = bond_dev->data->dev_private;
+   if (find_slave_by_id(internals->active_slaves,
+   internals->active_slave_count, slave_id) ==
+   internals->active_slave_count)
+   return -EINVAL;
+
+   mode4 = &internals->mode4;
+   if (mode4->slowrx_cb == NULL || !mode4->external_sm)
+   return -EINVAL;
+
+   port = &mode_8023ad_ports[slave_id];
+
+   if (enabled)
+   ACTOR_STATE_SET(port, COLLECTING);
+   else
+   ACTOR_STATE_CLR(port, COLLECTING);
+
+   return 0;
+}
+
+int
+rte_eth_bond_8023ad_ext_distrib(uint8_t port_id, uint8_t slave_id, int enabled)
+{
+   struct rte_eth_dev *bond_dev;
+   struct bond_dev_private *internals;
+   struct mode8023ad_private *mode4;
+   struct port *port;
+
+   if (valid_bonded_port_id(port_id) != 0 ||
+   rte_eth_bond_mode_get(port_id) != BONDING_MODE_8023AD)
+   return -EINVAL;
+
+   bond_dev = &rte_eth_devices[port_id];
+
+   internals = bond

[dpdk-dev] [PATCH 5/5] bond mode 4: tests for external state machine

2015-04-06 Thread Eric Kinzie
  This adds test cases for exercising the external state machine API to
  the mode 4 autotest.

Signed-off-by: Eric Kinzie 
---
 app/test/test_link_bonding_mode4.c |  201 ++--
 1 file changed, 192 insertions(+), 9 deletions(-)

diff --git a/app/test/test_link_bonding_mode4.c b/app/test/test_link_bonding_mode4.c
index 5a726af..a37b59c 100644
--- a/app/test/test_link_bonding_mode4.c
+++ b/app/test/test_link_bonding_mode4.c
@@ -155,6 +155,8 @@ static struct rte_eth_conf default_pmd_conf = {
.lpbk_mode = 0,
 };

+static uint8_t lacpdu_rx_count[RTE_MAX_ETHPORTS] = {0, };
+
 #define FOR_EACH(_i, _item, _array, _size) \
for (_i = 0, _item = &_array[0]; _i < _size && (_item = &_array[_i]); _i++)

@@ -324,8 +326,16 @@ remove_slave(struct slave_conf *slave)
return 0;
 }

+static void
+lacp_recv_cb(uint8_t slave_id, struct rte_mbuf *lacp_pkt)
+{
+   lacpdu_rx_count[slave_id]++;
+   RTE_VERIFY(lacp_pkt != NULL);
+   rte_pktmbuf_free(lacp_pkt);
+}
+
 static int
-initialize_bonded_device_with_slaves(uint8_t slave_count, uint8_t start)
+initialize_bonded_device_with_slaves(uint8_t slave_count, uint8_t external_sm)
 {
uint8_t i;

@@ -341,9 +351,18 @@ initialize_bonded_device_with_slaves(uint8_t slave_count, uint8_t start)
rte_eth_bond_8023ad_setup(test_params.bonded_port_id, NULL);
rte_eth_promiscuous_disable(test_params.bonded_port_id);

-   if (start)
-           TEST_ASSERT_SUCCESS(rte_eth_dev_start(test_params.bonded_port_id),
-                   "Failed to start bonded device");
+   if (external_sm) {
+   struct rte_eth_bond_8023ad_conf conf;
+
+   rte_eth_bond_8023ad_conf_get(test_params.bonded_port_id, &conf);
+   conf.external_sm = 1;
+   conf.slowrx_cb = lacp_recv_cb;
+   rte_eth_bond_8023ad_setup(test_params.bonded_port_id, &conf);
+
+   }
+
+   TEST_ASSERT_SUCCESS(rte_eth_dev_start(test_params.bonded_port_id),
+   "Failed to start bonded device");

return TEST_SUCCESS;
 }
@@ -648,7 +667,7 @@ test_mode4_lacp(void)
 {
int retval;

-   retval = initialize_bonded_device_with_slaves(TEST_LACP_SLAVE_COUT, 1);
+   retval = initialize_bonded_device_with_slaves(TEST_LACP_SLAVE_COUT, 0);
TEST_ASSERT_SUCCESS(retval, "Failed to initialize bonded device");

/* Test LACP handshake function */
@@ -746,7 +765,7 @@ test_mode4_rx(void)
struct ether_addr dst_mac;
struct ether_addr bonded_mac;

-   retval = initialize_bonded_device_with_slaves(TEST_PROMISC_SLAVE_COUNT, 1);
+   retval = initialize_bonded_device_with_slaves(TEST_PROMISC_SLAVE_COUNT, 0);
TEST_ASSERT_SUCCESS(retval, "Failed to initialize bonded device");

retval = bond_handshake();
@@ -923,7 +942,7 @@ test_mode4_tx_burst(void)
struct ether_addr dst_mac = { { 0x00, 0xFF, 0x00, 0xFF, 0x00, 0x00 } };
struct ether_addr bonded_mac;

-   retval = initialize_bonded_device_with_slaves(TEST_TX_SLAVE_COUNT, 1);
+   retval = initialize_bonded_device_with_slaves(TEST_TX_SLAVE_COUNT, 0);
TEST_ASSERT_SUCCESS(retval, "Failed to initialize bonded device");

retval = bond_handshake();
@@ -1107,7 +1126,7 @@ test_mode4_marker(void)
uint8_t i, j;
const uint16_t ethtype_slow_be = rte_be_to_cpu_16(ETHER_TYPE_SLOW);

-   retval = initialize_bonded_device_with_slaves(TEST_MARKER_SLAVE_COUT, 1);
+   retval = initialize_bonded_device_with_slaves(TEST_MARKER_SLAVE_COUT, 0);
TEST_ASSERT_SUCCESS(retval, "Failed to initialize bonded device");

/* Test LACP handshake function */
@@ -1192,7 +1211,7 @@ test_mode4_expired(void)

struct rte_eth_bond_8023ad_conf conf;

-   retval = initialize_bonded_device_with_slaves(TEST_EXPIRED_SLAVE_COUNT, 1);
+   retval = initialize_bonded_device_with_slaves(TEST_EXPIRED_SLAVE_COUNT, 0);
/* Set custom timeouts to make test last shorter. */
rte_eth_bond_8023ad_conf_get(test_params.bonded_port_id, &conf);
conf.fast_periodic_ms = 100;
@@ -1274,6 +1293,156 @@ test_mode4_expired(void)
 }

 static int
+test_mode4_ext_ctrl(void)
+{
+   /*
+* configure bonded interface without the external sm enabled
+*   . try to transmit lacpdu (should fail)
+*   . try to set collecting and distributing flags (should fail)
+* reconfigure w/external sm
+*   . transmit one lacpdu on each slave using new api
+*   . make sure each slave receives one lacpdu using the callback api
+*   . transmit one data pdu on each slave (should fail)
+*   . enable distribution and collection, send one data pdu each again
+*/
+
+   int retval;
+   struct slave_conf *slave = NULL;
+   uint8_t i;
+
+   struct rte_mbuf *lacp_tx_buf[SLAVE_COUNT];
+   struct ether_addr src_mac, dst_mac;
+   struct lacpdu_header lacpdu = {
+ 

[dpdk-dev] [PATCH] eth_dev: make ether dev_ops const

2015-04-06 Thread Stephen Hemminger
Ethernet device function tables should be immutable for correctness
and security. Special case for the test code driver.

Also reindent a couple of places where the table was indented
in a non-standard way.

Signed-off-by: Stephen Hemminger 
---
 app/test/virtual_pmd.c   | 78 +++-
 lib/librte_ether/rte_ethdev.h|  2 +-
 lib/librte_pmd_af_packet/rte_eth_af_packet.c |  2 +-
 lib/librte_pmd_e1000/em_ethdev.c |  2 +-
 lib/librte_pmd_e1000/igb_ethdev.c|  4 +-
 lib/librte_pmd_enic/enic_ethdev.c|  2 +-
 lib/librte_pmd_fm10k/fm10k_ethdev.c  |  2 +-
 lib/librte_pmd_i40e/i40e_ethdev.c|  2 +-
 lib/librte_pmd_i40e/i40e_ethdev_vf.c |  2 +-
 lib/librte_pmd_ixgbe/ixgbe_ethdev.c  |  5 +-
 lib/librte_pmd_mlx4/mlx4.c   |  2 +-
 lib/librte_pmd_null/rte_eth_null.c   | 24 -
 lib/librte_pmd_pcap/rte_eth_pcap.c   | 26 +-
 lib/librte_pmd_ring/rte_eth_ring.c   | 32 ++--
 lib/librte_pmd_virtio/virtio_ethdev.c|  2 +-
 lib/librte_pmd_vmxnet3/vmxnet3_ethdev.c  |  2 +-
 lib/librte_pmd_xenvirt/rte_eth_xenvirt.c | 26 +-
 17 files changed, 111 insertions(+), 104 deletions(-)

diff --git a/app/test/virtual_pmd.c b/app/test/virtual_pmd.c
index f163562..f579558 100644
--- a/app/test/virtual_pmd.c
+++ b/app/test/virtual_pmd.c
@@ -241,33 +241,39 @@ virtual_ethdev_promiscuous_mode_disable(struct rte_eth_dev *dev __rte_unused)
 {}


-static struct eth_dev_ops virtual_ethdev_default_dev_ops = {
-   .dev_configure = virtual_ethdev_configure_success,
-   .dev_start = virtual_ethdev_start_success,
-   .dev_stop = virtual_ethdev_stop,
-   .dev_close = virtual_ethdev_close,
-   .dev_infos_get = virtual_ethdev_info_get,
-   .rx_queue_setup = virtual_ethdev_rx_queue_setup_success,
-   .tx_queue_setup = virtual_ethdev_tx_queue_setup_success,
-   .rx_queue_release = virtual_ethdev_rx_queue_release,
-   .tx_queue_release = virtual_ethdev_tx_queue_release,
-   .link_update = virtual_ethdev_link_update_success,
-   .stats_get = virtual_ethdev_stats_get,
-   .stats_reset = virtual_ethdev_stats_reset,
-   .promiscuous_enable = virtual_ethdev_promiscuous_mode_enable,
-   .promiscuous_disable = virtual_ethdev_promiscuous_mode_disable
+static const struct eth_dev_ops virtual_ethdev_default_dev_ops = {
+   .dev_configure = virtual_ethdev_configure_success,
+   .dev_start = virtual_ethdev_start_success,
+   .dev_stop = virtual_ethdev_stop,
+   .dev_close = virtual_ethdev_close,
+   .dev_infos_get = virtual_ethdev_info_get,
+   .rx_queue_setup = virtual_ethdev_rx_queue_setup_success,
+   .tx_queue_setup = virtual_ethdev_tx_queue_setup_success,
+   .rx_queue_release = virtual_ethdev_rx_queue_release,
+   .tx_queue_release = virtual_ethdev_tx_queue_release,
+   .link_update = virtual_ethdev_link_update_success,
+   .stats_get = virtual_ethdev_stats_get,
+   .stats_reset = virtual_ethdev_stats_reset,
+   .promiscuous_enable = virtual_ethdev_promiscuous_mode_enable,
+   .promiscuous_disable = virtual_ethdev_promiscuous_mode_disable
 };

-
+/* This driver uses private mutable eth_dev_ops for each
+ * instance so it is safe to override const here.
+ */
+#pragma GCC diagnostic push
+#pragma GCC diagnostic ignored "-Wcast-qual"
 void
 virtual_ethdev_start_fn_set_success(uint8_t port_id, uint8_t success)
 {
struct rte_eth_dev *vrtl_eth_dev = &rte_eth_devices[port_id];
+   struct eth_dev_ops *dev_ops
+           = (struct eth_dev_ops *) vrtl_eth_dev->dev_ops;

if (success)
-   vrtl_eth_dev->dev_ops->dev_start = virtual_ethdev_start_success;
+   dev_ops->dev_start = virtual_ethdev_start_success;
else
-   vrtl_eth_dev->dev_ops->dev_start = virtual_ethdev_start_fail;
+   dev_ops->dev_start = virtual_ethdev_start_fail;

 }

@@ -275,50 +281,54 @@ void
 virtual_ethdev_configure_fn_set_success(uint8_t port_id, uint8_t success)
 {
struct rte_eth_dev *vrtl_eth_dev = &rte_eth_devices[port_id];
+   struct eth_dev_ops *dev_ops
+           = (struct eth_dev_ops *) vrtl_eth_dev->dev_ops;

if (success)
-   vrtl_eth_dev->dev_ops->dev_configure = virtual_ethdev_configure_success;
+   dev_ops->dev_configure = virtual_ethdev_configure_success;
else
-   vrtl_eth_dev->dev_ops->dev_configure = virtual_ethdev_configure_fail;
+   dev_ops->dev_configure = virtual_ethdev_configure_fail;
 }

 void
 virtual_ethdev_rx_queue_setup_fn_set_success(uint8_t port_id, uint8_t success)
 {
struct rte_eth_dev *vrtl_eth_dev = &rte_eth_devices[port_id];
+   struct eth_dev_ops *dev_ops
+   = (struct

[dpdk-dev] [PATCH] eal: fix log level check

2015-04-06 Thread David Marchand
From: Jean Dao 

According to the API, rte_log() / rte_vlog() are supposed to check the log
level and type, but they were not doing so. This check was only done in the
RTE_LOG macro, whereas that macro is only meant to remove log messages at
build time.

rte_log() always calls rte_vlog(), so move the check into rte_vlog() only.

Signed-off-by: Jean Dao 
Signed-off-by: David Marchand 
---
 lib/librte_eal/common/eal_common_log.c  |8 +---
 lib/librte_eal/common/include/rte_log.h |4 +---
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_log.c b/lib/librte_eal/common/eal_common_log.c
index ff44d23..fe3d7d5 100644
--- a/lib/librte_eal/common/eal_common_log.c
+++ b/lib/librte_eal/common/eal_common_log.c
@@ -265,14 +265,15 @@ rte_log_dump_history(FILE *out)
  * defined by the previous call to rte_openlog_stream().
  */
 int
-rte_vlog(__attribute__((unused)) uint32_t level,
-__attribute__((unused)) uint32_t logtype,
-  const char *format, va_list ap)
+rte_vlog(uint32_t level, uint32_t logtype, const char *format, va_list ap)
 {
int ret;
FILE *f = rte_logs.file;
unsigned lcore_id;

+   if ((level > rte_logs.level) || !(logtype & rte_logs.type))
+   return 0;
+
/* save loglevel and logtype in a global per-lcore variable */
lcore_id = rte_lcore_id();
if (lcore_id < RTE_MAX_LCORE) {
@@ -288,6 +289,7 @@ rte_vlog(__attribute__((unused)) uint32_t level,
 /*
 * Generates a log message. The message will be sent in the stream
  * defined by the previous call to rte_openlog_stream().
+ * No need to check level here, done by rte_vlog().
  */
 int
 rte_log(uint32_t level, uint32_t logtype, const char *format, ...)
diff --git a/lib/librte_eal/common/include/rte_log.h b/lib/librte_eal/common/include/rte_log.h
index f83a0d9..3b467c1 100644
--- a/lib/librte_eal/common/include/rte_log.h
+++ b/lib/librte_eal/common/include/rte_log.h
@@ -299,9 +299,7 @@ int rte_vlog(uint32_t level, uint32_t logtype, const char *format, va_list ap)
  *   - Negative on error.
  */
 #define RTE_LOG(l, t, ...) \
-   (void)(((RTE_LOG_ ## l <= RTE_LOG_LEVEL) && \
- (RTE_LOG_ ## l <= rte_logs.level) &&  \
- (RTE_LOGTYPE_ ## t & rte_logs.type)) ?\
+   (void)((RTE_LOG_ ## l <= RTE_LOG_LEVEL) ?   \
 rte_log(RTE_LOG_ ## l, \
 RTE_LOGTYPE_ ## t, # t ": " __VA_ARGS__) : \
 0)
-- 
1.7.10.4



[dpdk-dev] rte_ring's dequeue appears to be slow

2015-04-06 Thread Stephen Hemminger
On Mon, 6 Apr 2015 15:18:21 +0300
Dor Green  wrote:

> I have an app which captures packets on a single core and then passes
> to multiple workers on different lcores, using the ring queues.
> 
> While I manage to capture packets at 10Gbps, when I send it to the
> processing lcores there is substantial packet loss. At first I figured
> it's the processing I do on the packets and optimized that, which did
> help it a little but did not alleviate the problem.
> 
> I used Intel VTune amplifier to profile the program, and on all
> profiling checks that I did there, the majority of the time in the
> program is spent in "__rte_ring_sc_do_dequeue" (about 70%). I was
> wondering if anyone can tell me how to optimize this, or if I'm using
> the queues incorrectly, or maybe even doing the profiling wrong
> (because I do find it weird that this dequeuing is so slow).
> 
> My program architecture is as follows (replaced consts with actual values):
> 
> A queue is created for each processing lcore:
>   rte_ring_create(qname, 1024*1024, NUMA_SOCKET,
> RING_F_SP_ENQ | RING_F_SC_DEQ);
> 
> The processing core enqueues packets one by one, to each of the queues
> (the packet burst size is 256):
>  rte_ring_sp_enqueue(lc[queue_index].queue, (void *const)pkts[i]);
> 
> Which are then dequeued in bulk in the processor lcores:
>  rte_ring_sc_dequeue_bulk(lc->queue, (void**) &mbufs, 128);
> 
> I'm using 16 1GB hugepages, running the new 2.0 version. If there's
> any further info required about the program, let me know.
> 
> Thank you.

First off, make sure you are enqueuing and dequeuing in bursts
if possible. That saves a lot of the overhead.

Also, with polling applications, the dequeue function can be
falsely blamed for taking CPU, if most of the time the poll does
not succeed in finding any data.


[dpdk-dev] [PATCH v3 1/5] mbuf: fix clone support when application uses private mbuf data

2015-04-06 Thread Olivier MATZ
Hi Konstantin,

Thanks for your comments.

On 04/02/2015 07:21 PM, Ananyev, Konstantin wrote:
> Hi Olivier,
> 
>> -Original Message-
>> From: Olivier Matz [mailto:olivier.matz at 6wind.com]
>> Sent: Tuesday, March 31, 2015 8:23 PM
>> To: dev at dpdk.org
>> Cc: Ananyev, Konstantin; zoltan.kiss at linaro.org; Richardson, Bruce; 
>> Olivier Matz
>> Subject: [PATCH v3 1/5] mbuf: fix clone support when application uses 
>> private mbuf data
>>
>> From: Olivier Matz 
>>
>> Add a new private_size field in mbuf structure that should
>> be initialized at mbuf pool creation. This field contains the
>> size of the application private data in mbufs.
>>
>> Introduce new static inline functions rte_mbuf_from_indirect()
>> and rte_mbuf_to_baddr() to replace the existing macros, which
>> take the private size in account when attaching and detaching
>> mbufs.
>>
>> Signed-off-by: Olivier Matz 
>> ---
>>  app/test-pmd/testpmd.c |  1 +
>>  examples/vhost/main.c  |  4 +--
>>  lib/librte_mbuf/rte_mbuf.c |  1 +
>>  lib/librte_mbuf/rte_mbuf.h | 77 
>> +++---
>>  4 files changed, 63 insertions(+), 20 deletions(-)
>>
>> diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
>> index 3057791..c5a195a 100644
>> --- a/app/test-pmd/testpmd.c
>> +++ b/app/test-pmd/testpmd.c
>> @@ -425,6 +425,7 @@ testpmd_mbuf_ctor(struct rte_mempool *mp,
>>  mb->tx_offload   = 0;
>>  mb->vlan_tci = 0;
>>  mb->hash.rss = 0;
>> +mb->priv_size= 0;
>>  }
>>
>>  static void
>> diff --git a/examples/vhost/main.c b/examples/vhost/main.c
>> index c3fcb80..e44e82f 100644
>> --- a/examples/vhost/main.c
>> +++ b/examples/vhost/main.c
>> @@ -139,7 +139,7 @@
>>  /* Number of descriptors per cacheline. */
>>  #define DESC_PER_CACHELINE (RTE_CACHE_LINE_SIZE / sizeof(struct vring_desc))
>>
>> -#define MBUF_EXT_MEM(mb)   (RTE_MBUF_FROM_BADDR((mb)->buf_addr) != (mb))
>> +#define MBUF_EXT_MEM(mb)   (rte_mbuf_from_indirect(mb) != (mb))
>>
>>  /* mask of enabled ports */
>>  static uint32_t enabled_port_mask = 0;
>> @@ -1550,7 +1550,7 @@ attach_rxmbuf_zcp(struct virtio_net *dev)
>>  static inline void pktmbuf_detach_zcp(struct rte_mbuf *m)
>>  {
>>  const struct rte_mempool *mp = m->pool;
>> -void *buf = RTE_MBUF_TO_BADDR(m);
>> +void *buf = rte_mbuf_to_baddr(m);
>>  uint32_t buf_ofs;
>>  uint32_t buf_len = mp->elt_size - sizeof(*m);
>>  m->buf_physaddr = rte_mempool_virt2phy(mp, m) + sizeof(*m);
>> diff --git a/lib/librte_mbuf/rte_mbuf.c b/lib/librte_mbuf/rte_mbuf.c
>> index 526b18d..e095999 100644
>> --- a/lib/librte_mbuf/rte_mbuf.c
>> +++ b/lib/librte_mbuf/rte_mbuf.c
>> @@ -125,6 +125,7 @@ rte_pktmbuf_init(struct rte_mempool *mp,
>>  m->pool = mp;
>>  m->nb_segs = 1;
>>  m->port = 0xff;
>> +m->priv_size = 0;
> 
> Why is it 0?
> Shouldn't it be the same calculations as in detach() below:
> m->priv_size = /*get private size from mempool private*/;
> m->buf_addr = (char *)m + sizeof(struct rte_mbuf) + m->priv_size;
> m->buf_len = mp->elt_size - sizeof(struct rte_mbuf) - m->priv_size;
> ?

It's 0 because we also have in the function (not visible in the
patch):

  m->buf_addr = (char *)m + sizeof(struct rte_mbuf);

It means that an application that wants to use a private area has
to provide another init function derived from this default function.
This was already the case before the patch series.

As we discussed in previous mail, I plan to propose a rework of
mbuf pool initialization in another series, and my initial idea was to
change this at the same time. But on the other hand it does not hurt
to do this change now. I'll include it in next version.


> BTW, don't see changes in rte_pktmbuf_pool_init() to setup
> mbp_priv->mbuf_data_room_size properly.
> Without that changes, how can people start using that feature?
> It seems that the only way now is to set up priv_size and buf_len for each
> mbuf manually.

It's the same reason as above. To use a private area, the user has
to provide their own function that sets up data_room_size, derived from
this default pool_init function. This was also the case before the
patch series.


> 
>>  }
>>
>>  /* do some sanity checks on a mbuf: panic if it fails */
>> diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
>> index 17ba791..932fe58 100644
>> --- a/lib/librte_mbuf/rte_mbuf.h
>> +++ b/lib/librte_mbuf/rte_mbuf.h
>> @@ -317,18 +317,51 @@ struct rte_mbuf {
>>  /* uint64_t unused:8; */
>>  };
>>  };
>> +
>> +/** Size of the application private data. In case of an indirect
>> + * mbuf, it stores the direct mbuf private data size. */
>> +uint16_t priv_size;
>>  } __rte_cache_aligned;
>>
>>  /**
>> - * Given the buf_addr returns the pointer to corresponding mbuf.
>> + * Return the mbuf owning the data buffer address of an indirect mbuf.
>> + *
>> + * @param mi
>> + *   The pointer to the indirect mbuf.
>> + * @return
>> + *   The address of the dir