Re: [dpdk-dev] [PATCH] build: enable Arm NEON flags when __aarch64__ is defined

2018-08-29 Thread Gavin Hu



> -Original Message-
> From: Honnappa Nagarahalli 
> Sent: Wednesday, August 22, 2018 11:01 PM
> To: bruce.richard...@intel.com
> Cc: dev@dpdk.org; Gavin Hu; rasl...@mellanox.com;
> therb...@redhat.com; Honnappa Nagarahalli
> Subject: [PATCH] build: enable Arm NEON flags when __aarch64__ is defined
> 
> GCC version 4.8.5 does not pre-define __ARM_NEON. NEON is not optional
> for ArmV8. Hence NEON related code can be enabled when __aarch64__ is
> defined.
> 
> Bugzilla ID: 82
> 
> Signed-off-by: Honnappa Nagarahalli 
> Reviewed-by: Phil Yang 
> Reviewed-by: Gavin Hu 
> Reported-by: Raslan Darawsheh 
> Reported-by: Thomas F Herbert 
Acked-by: Gavin Hu 
> ---
>  config/arm/meson.build | 3 ++-
>  mk/rte.cpuflags.mk | 2 +-
>  2 files changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/config/arm/meson.build b/config/arm/meson.build
> index 40dbc87f7..94cca490e 100644
> --- a/config/arm/meson.build
> +++ b/config/arm/meson.build
> @@ -157,7 +157,8 @@ else
>  endif
>  message(machine_args)
> 
> -if cc.get_define('__ARM_NEON', args: machine_args) != ''
> +if (cc.get_define('__ARM_NEON', args: machine_args) != '' or
> +cc.get_define('__aarch64__', args: machine_args) != '')
>   dpdk_conf.set('RTE_MACHINE_CPUFLAG_NEON', 1)
>   compile_time_cpuflags += ['RTE_CPUFLAG_NEON']
>  endif
> diff --git a/mk/rte.cpuflags.mk b/mk/rte.cpuflags.mk
> index 60713137d..43ed84155 100644
> --- a/mk/rte.cpuflags.mk
> +++ b/mk/rte.cpuflags.mk
> @@ -89,7 +89,7 @@ CPUFLAGS += VSX
>  endif
> 
>  # ARM flags
> -ifneq ($(filter $(AUTO_CPUFLAGS),__ARM_NEON),)
> +ifneq ($(filter __ARM_NEON __aarch64__,$(AUTO_CPUFLAGS)),)
>  CPUFLAGS += NEON
>  endif
> 
> --
> 2.17.1
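
For readers following the build logic, the same decision can be written as a C preprocessor check. This is only a sketch: NEON_AVAILABLE and neon_flag_enabled() are hypothetical names used for illustration, not DPDK symbols, and the patch itself only touches the meson and make files.

```c
#include <stdbool.h>

/*
 * Illustrative only: NEON_AVAILABLE is a hypothetical macro, not a DPDK name.
 * NEON is mandatory on AArch64, so __aarch64__ implies NEON even when an
 * older compiler does not pre-define __ARM_NEON.
 */
#if defined(__ARM_NEON) || defined(__aarch64__)
#define NEON_AVAILABLE 1
#else
#define NEON_AVAILABLE 0
#endif

/* The same decision as a plain function, so the truth table is easy to check. */
static bool
neon_flag_enabled(bool has_arm_neon_define, bool has_aarch64_define)
{
	return has_arm_neon_define || has_aarch64_define;
}
```

With this shape, a compiler that defines only __aarch64__ still ends up with the NEON flag set, which is the case the patch is meant to cover.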



Re: [dpdk-dev] [PATCH] mbuf: add IGMP packet type

2018-08-29 Thread Andrew Rybchenko

On 08/27/2018 03:38 PM, Jerin Jacob wrote:

Add support for IGMP packet type.

Signed-off-by: Jerin Jacob 


Acked-by: Andrew Rybchenko 



[dpdk-dev] [PATCH 0/4] net/failsafe: support deferred queue start

2018-08-29 Thread Andrew Rybchenko
Ian Dolzhansky (4):
  app/testpmd: add queue deferred start switch
  net/failsafe: add checks for deferred queue setup
  net/failsafe: add Rx queue start and stop functions
  net/failsafe: add Tx queue start and stop functions

 app/test-pmd/cmdline.c |  91 ++
 doc/guides/nics/features/failsafe.ini  |   1 +
 doc/guides/rel_notes/release_18_11.rst |  13 ++
 drivers/net/failsafe/failsafe_ether.c  |  88 +
 drivers/net/failsafe/failsafe_ops.c| 167 -
 5 files changed, 359 insertions(+), 1 deletion(-)

-- 
2.17.1



[dpdk-dev] [PATCH 1/4] app/testpmd: add queue deferred start switch

2018-08-29 Thread Andrew Rybchenko
From: Ian Dolzhansky 

Signed-off-by: Ian Dolzhansky 
Signed-off-by: Andrew Rybchenko 
---
 app/test-pmd/cmdline.c | 91 ++
 doc/guides/rel_notes/release_18_11.rst |  6 ++
 2 files changed, 97 insertions(+)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index 589121d69..f47ec99f1 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -883,6 +883,10 @@ static void cmd_help_long_parsed(void *parsed_result,
"Start/stop a rx/tx queue of port X. Only take effect"
" when port X is started\n\n"
 
+   "port (port_id) (rxq|txq) (queue_id) deferred_start (on|off)\n"
+   "Switch on/off a deferred start of port X rx/tx queue. Only"
+   " take effect when port X is stopped.\n\n"
+
"port (port_id) (rxq|txq) (queue_id) setup\n"
"Setup a rx/tx queue of port X.\n\n"
 
@@ -2441,6 +2445,92 @@ cmdline_parse_inst_t cmd_config_rxtx_queue = {
},
 };
 
+/* *** configure port rxq/txq deferred start on/off *** */
+struct cmd_config_deferred_start_rxtx_queue {
+   cmdline_fixed_string_t port;
+   portid_t port_id;
+   cmdline_fixed_string_t rxtxq;
+   uint16_t qid;
+   cmdline_fixed_string_t opname;
+   cmdline_fixed_string_t state;
+};
+
+static void
+cmd_config_deferred_start_rxtx_queue_parsed(void *parsed_result,
+   __attribute__((unused)) struct cmdline *cl,
+   __attribute__((unused)) void *data)
+{
+   struct cmd_config_deferred_start_rxtx_queue *res = parsed_result;
+   struct rte_port *port;
+   uint8_t isrx;
+   uint8_t ison;
+   uint8_t needreconfig = 0;
+
+   if (port_id_is_invalid(res->port_id, ENABLED_WARN))
+   return;
+
+   if (port_is_started(res->port_id) != 0) {
+   printf("Please stop port %u first\n", res->port_id);
+   return;
+   }
+
+   port = &ports[res->port_id];
+
+   isrx = !strcmp(res->rxtxq, "rxq");
+
+   if (isrx && rx_queue_id_is_invalid(res->qid))
+   return;
+   else if (!isrx && tx_queue_id_is_invalid(res->qid))
+   return;
+
+   ison = !strcmp(res->state, "on");
+
+   if (isrx && port->rx_conf[res->qid].rx_deferred_start != ison) {
+   port->rx_conf[res->qid].rx_deferred_start = ison;
+   needreconfig = 1;
+   } else if (!isrx && port->tx_conf[res->qid].tx_deferred_start != ison) {
+   port->tx_conf[res->qid].tx_deferred_start = ison;
+   needreconfig = 1;
+   }
+
+   if (needreconfig)
+   cmd_reconfig_device_queue(res->port_id, 0, 1);
+}
+
+cmdline_parse_token_string_t cmd_config_deferred_start_rxtx_queue_port =
+   TOKEN_STRING_INITIALIZER(struct cmd_config_deferred_start_rxtx_queue,
+   port, "port");
+cmdline_parse_token_num_t cmd_config_deferred_start_rxtx_queue_port_id =
+   TOKEN_NUM_INITIALIZER(struct cmd_config_deferred_start_rxtx_queue,
+   port_id, UINT16);
+cmdline_parse_token_string_t cmd_config_deferred_start_rxtx_queue_rxtxq =
+   TOKEN_STRING_INITIALIZER(struct cmd_config_deferred_start_rxtx_queue,
+   rxtxq, "rxq#txq");
+cmdline_parse_token_num_t cmd_config_deferred_start_rxtx_queue_qid =
+   TOKEN_NUM_INITIALIZER(struct cmd_config_deferred_start_rxtx_queue,
+   qid, UINT16);
+cmdline_parse_token_string_t cmd_config_deferred_start_rxtx_queue_opname =
+   TOKEN_STRING_INITIALIZER(struct cmd_config_deferred_start_rxtx_queue,
+   opname, "deferred_start");
+cmdline_parse_token_string_t cmd_config_deferred_start_rxtx_queue_state =
+   TOKEN_STRING_INITIALIZER(struct cmd_config_deferred_start_rxtx_queue,
+   state, "on#off");
+
+cmdline_parse_inst_t cmd_config_deferred_start_rxtx_queue = {
+   .f = cmd_config_deferred_start_rxtx_queue_parsed,
+   .data = NULL,
+   .help_str = "port <port_id> rxq|txq <queue_id> deferred_start on|off",
+   .tokens = {
+   (void *)&cmd_config_deferred_start_rxtx_queue_port,
+   (void *)&cmd_config_deferred_start_rxtx_queue_port_id,
+   (void *)&cmd_config_deferred_start_rxtx_queue_rxtxq,
+   (void *)&cmd_config_deferred_start_rxtx_queue_qid,
+   (void *)&cmd_config_deferred_start_rxtx_queue_opname,
+   (void *)&cmd_config_deferred_start_rxtx_queue_state,
+   NULL,
+   },
+};
+
 /* *** configure port rxq/txq setup *** */
 struct cmd_setup_rxtx_queue {
cmdline_fixed_string_t port;
@@ -17711,6 +17801,7 @@ cmdline_parse_ctx_t main_ctx[] = {
(cmdline_parse_inst_t *)&cmd_config_rss,

[dpdk-dev] [PATCH 2/4] net/failsafe: add checks for deferred queue setup

2018-08-29 Thread Andrew Rybchenko
From: Ian Dolzhansky 

Fixes: a46f8d584eb8 ("net/failsafe: add fail-safe PMD")
Cc: sta...@dpdk.org

Signed-off-by: Ian Dolzhansky 
Signed-off-by: Andrew Rybchenko 
---
 drivers/net/failsafe/failsafe_ops.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/drivers/net/failsafe/failsafe_ops.c b/drivers/net/failsafe/failsafe_ops.c
index 24e91c931..f7cce0d8f 100644
--- a/drivers/net/failsafe/failsafe_ops.c
+++ b/drivers/net/failsafe/failsafe_ops.c
@@ -340,6 +340,11 @@ fs_rx_queue_setup(struct rte_eth_dev *dev,
uint8_t i;
int ret;
 
+   if (rx_conf->rx_deferred_start) {
+   ERROR("Rx queue deferred start is not supported");
+   return -EINVAL;
+   }
+
fs_lock(dev, 0);
rxq = dev->data->rx_queues[rx_queue_id];
if (rxq != NULL) {
@@ -497,6 +502,11 @@ fs_tx_queue_setup(struct rte_eth_dev *dev,
uint8_t i;
int ret;
 
+   if (tx_conf->tx_deferred_start) {
+   ERROR("Tx queue deferred start is not supported");
+   return -EINVAL;
+   }
+
fs_lock(dev, 0);
txq = dev->data->tx_queues[tx_queue_id];
if (txq != NULL) {
-- 
2.17.1



[dpdk-dev] [PATCH 4/4] net/failsafe: add Tx queue start and stop functions

2018-08-29 Thread Andrew Rybchenko
From: Ian Dolzhansky 

Support Tx queue deferred start.

Signed-off-by: Ian Dolzhansky 
Signed-off-by: Andrew Rybchenko 
---
 doc/guides/nics/features/failsafe.ini  |  2 +-
 doc/guides/rel_notes/release_18_11.rst |  4 +-
 drivers/net/failsafe/failsafe_ether.c  | 44 +++
 drivers/net/failsafe/failsafe_ops.c| 77 --
 4 files changed, 120 insertions(+), 7 deletions(-)

diff --git a/doc/guides/nics/features/failsafe.ini b/doc/guides/nics/features/failsafe.ini
index 712c0b7f7..74eae4a62 100644
--- a/doc/guides/nics/features/failsafe.ini
+++ b/doc/guides/nics/features/failsafe.ini
@@ -7,7 +7,7 @@
 Link status  = Y
 Link status event= Y
 Rx interrupt = Y
-Queue start/stop = P
+Queue start/stop = Y
 MTU update   = Y
 Jumbo frame  = Y
 Promiscuous mode = Y
diff --git a/doc/guides/rel_notes/release_18_11.rst b/doc/guides/rel_notes/release_18_11.rst
index 882ef8ac6..ad08a204f 100644
--- a/doc/guides/rel_notes/release_18_11.rst
+++ b/doc/guides/rel_notes/release_18_11.rst
@@ -58,8 +58,8 @@ New Features
 
   Updated the failsafe driver including the following changes:
 
-  * Support for Rx queues start and stop.
-  * Support for Rx queues deferred start.
+  * Support for Rx and Tx queues start and stop.
+  * Support for Rx and Tx queues deferred start.
 
 * **Added ability to switch queue deferred start flag on testpmd app.**
 
diff --git a/drivers/net/failsafe/failsafe_ether.c b/drivers/net/failsafe/failsafe_ether.c
index 305deed63..191f95f14 100644
--- a/drivers/net/failsafe/failsafe_ether.c
+++ b/drivers/net/failsafe/failsafe_ether.c
@@ -407,6 +407,47 @@ failsafe_eth_dev_rx_queues_sync(struct rte_eth_dev *dev)
return 0;
 }
 
+static int
+failsafe_eth_dev_tx_queues_sync(struct rte_eth_dev *dev)
+{
+   struct txq *txq;
+   int ret;
+   uint16_t i;
+
+   for (i = 0; i < dev->data->nb_tx_queues; i++) {
+   txq = dev->data->tx_queues[i];
+
+   if (txq->info.conf.tx_deferred_start &&
+   dev->data->tx_queue_state[i] ==
+   RTE_ETH_QUEUE_STATE_STARTED) {
+   /*
+* The subdevice Tx queue does not launch on device
+* start if deferred start flag is set. It needs to be
+* started manually in case an appropriate failsafe Tx
+* queue has been started earlier.
+*/
+   ret = dev->dev_ops->tx_queue_start(dev, i);
+   if (ret) {
+   ERROR("Could not synchronize Tx queue %d", i);
+   return ret;
+   }
+   } else if (dev->data->tx_queue_state[i] ==
+   RTE_ETH_QUEUE_STATE_STOPPED) {
+   /*
+* The subdevice Tx queue needs to be stopped manually
+* in case an appropriate failsafe Tx queue has been
+* stopped earlier.
+*/
+   ret = dev->dev_ops->tx_queue_stop(dev, i);
+   if (ret) {
+   ERROR("Could not synchronize Tx queue %d", i);
+   return ret;
+   }
+   }
+   }
+   return 0;
+}
+
 int
 failsafe_eth_dev_state_sync(struct rte_eth_dev *dev)
 {
@@ -466,6 +507,9 @@ failsafe_eth_dev_state_sync(struct rte_eth_dev *dev)
if (ret)
goto err_remove;
ret = failsafe_eth_dev_rx_queues_sync(dev);
+   if (ret)
+   goto err_remove;
+   ret = failsafe_eth_dev_tx_queues_sync(dev);
if (ret)
goto err_remove;
return 0;
diff --git a/drivers/net/failsafe/failsafe_ops.c b/drivers/net/failsafe/failsafe_ops.c
index 412d522cf..4d30eb22d 100644
--- a/drivers/net/failsafe/failsafe_ops.c
+++ b/drivers/net/failsafe/failsafe_ops.c
@@ -174,6 +174,7 @@ static void
 fs_set_queues_state_start(struct rte_eth_dev *dev)
 {
struct rxq *rxq;
+   struct txq *txq;
uint16_t i;
 
for (i = 0; i < dev->data->nb_rx_queues; i++) {
@@ -182,6 +183,12 @@ fs_set_queues_state_start(struct rte_eth_dev *dev)
dev->data->rx_queue_state[i] =
RTE_ETH_QUEUE_STATE_STARTED;
}
+   for (i = 0; i < dev->data->nb_tx_queues; i++) {
+   txq = dev->data->tx_queues[i];
+   if (!txq->info.conf.tx_deferred_start)
+   dev->data->tx_queue_state[i] =
+   RTE_ETH_QUEUE_STATE_STARTED;
+   }
 }
 
 static int
@@ -234,6 +241,8 @@ fs_set_queues_state_stop(struct rte_eth_dev *dev)
 
for (i = 0; i < dev->data->nb_rx_queues; i++)
dev->data->rx_queue_state[i] = RTE

[dpdk-dev] [PATCH 3/4] net/failsafe: add Rx queue start and stop functions

2018-08-29 Thread Andrew Rybchenko
From: Ian Dolzhansky 

Support Rx queue deferred start.

Signed-off-by: Ian Dolzhansky 
Signed-off-by: Andrew Rybchenko 
---
 doc/guides/nics/features/failsafe.ini  |  1 +
 doc/guides/rel_notes/release_18_11.rst |  7 ++
 drivers/net/failsafe/failsafe_ether.c  | 44 
 drivers/net/failsafe/failsafe_ops.c| 96 --
 4 files changed, 143 insertions(+), 5 deletions(-)

diff --git a/doc/guides/nics/features/failsafe.ini b/doc/guides/nics/features/failsafe.ini
index 39ee57965..712c0b7f7 100644
--- a/doc/guides/nics/features/failsafe.ini
+++ b/doc/guides/nics/features/failsafe.ini
@@ -7,6 +7,7 @@
 Link status  = Y
 Link status event= Y
 Rx interrupt = Y
+Queue start/stop = P
 MTU update   = Y
 Jumbo frame  = Y
 Promiscuous mode = Y
diff --git a/doc/guides/rel_notes/release_18_11.rst b/doc/guides/rel_notes/release_18_11.rst
index 1f17befd8..882ef8ac6 100644
--- a/doc/guides/rel_notes/release_18_11.rst
+++ b/doc/guides/rel_notes/release_18_11.rst
@@ -54,6 +54,13 @@ New Features
  Also, make sure to start the actual text at the margin.
  =
 
+* **Updated failsafe driver.**
+
+  Updated the failsafe driver including the following changes:
+
+  * Support for Rx queues start and stop.
+  * Support for Rx queues deferred start.
+
 * **Added ability to switch queue deferred start flag on testpmd app.**
 
   Added a console command to testpmd app, giving ability to switch
diff --git a/drivers/net/failsafe/failsafe_ether.c b/drivers/net/failsafe/failsafe_ether.c
index 5b5cb3b49..305deed63 100644
--- a/drivers/net/failsafe/failsafe_ether.c
+++ b/drivers/net/failsafe/failsafe_ether.c
@@ -366,6 +366,47 @@ failsafe_dev_remove(struct rte_eth_dev *dev)
}
 }
 
+static int
+failsafe_eth_dev_rx_queues_sync(struct rte_eth_dev *dev)
+{
+   struct rxq *rxq;
+   int ret;
+   uint16_t i;
+
+   for (i = 0; i < dev->data->nb_rx_queues; i++) {
+   rxq = dev->data->rx_queues[i];
+
+   if (rxq->info.conf.rx_deferred_start &&
+   dev->data->rx_queue_state[i] ==
+   RTE_ETH_QUEUE_STATE_STARTED) {
+   /*
+* The subdevice Rx queue does not launch on device
+* start if deferred start flag is set. It needs to be
+* started manually in case an appropriate failsafe Rx
+* queue has been started earlier.
+*/
+   ret = dev->dev_ops->rx_queue_start(dev, i);
+   if (ret) {
+   ERROR("Could not synchronize Rx queue %d", i);
+   return ret;
+   }
+   } else if (dev->data->rx_queue_state[i] ==
+   RTE_ETH_QUEUE_STATE_STOPPED) {
+   /*
+* The subdevice Rx queue needs to be stopped manually
+* in case an appropriate failsafe Rx queue has been
+* stopped earlier.
+*/
+   ret = dev->dev_ops->rx_queue_stop(dev, i);
+   if (ret) {
+   ERROR("Could not synchronize Rx queue %d", i);
+   return ret;
+   }
+   }
+   }
+   return 0;
+}
+
 int
 failsafe_eth_dev_state_sync(struct rte_eth_dev *dev)
 {
@@ -422,6 +463,9 @@ failsafe_eth_dev_state_sync(struct rte_eth_dev *dev)
if (PRIV(dev)->state < DEV_STARTED)
return 0;
ret = dev->dev_ops->dev_start(dev);
+   if (ret)
+   goto err_remove;
+   ret = failsafe_eth_dev_rx_queues_sync(dev);
if (ret)
goto err_remove;
return 0;
diff --git a/drivers/net/failsafe/failsafe_ops.c b/drivers/net/failsafe/failsafe_ops.c
index f7cce0d8f..412d522cf 100644
--- a/drivers/net/failsafe/failsafe_ops.c
+++ b/drivers/net/failsafe/failsafe_ops.c
@@ -170,6 +170,20 @@ fs_dev_configure(struct rte_eth_dev *dev)
return 0;
 }
 
+static void
+fs_set_queues_state_start(struct rte_eth_dev *dev)
+{
+   struct rxq *rxq;
+   uint16_t i;
+
+   for (i = 0; i < dev->data->nb_rx_queues; i++) {
+   rxq = dev->data->rx_queues[i];
+   if (!rxq->info.conf.rx_deferred_start)
+   dev->data->rx_queue_state[i] =
+   RTE_ETH_QUEUE_STATE_STARTED;
+   }
+}
+
 static int
 fs_dev_start(struct rte_eth_dev *dev)
 {
@@ -204,13 +218,24 @@ fs_dev_start(struct rte_eth_dev *dev)
}
sdev->state = DEV_STARTED;
}
-   if (PRIV(dev)->state < DEV_STARTED)
+   if (PRIV(dev)->state < DEV_STARTED) {
PRIV(dev)->state = DEV_STA

Re: [dpdk-dev] [PATCH] build: enable Arm NEON flags when __aarch64__ is defined

2018-08-29 Thread Jerin Jacob
-Original Message-
> Date: Wed, 22 Aug 2018 10:01:07 -0500
> From: Honnappa Nagarahalli 
> To: bruce.richard...@intel.com
> CC: dev@dpdk.org, gavin...@arm.com, rasl...@mellanox.com,
>  therb...@redhat.com, honnappa.nagaraha...@arm.com
> Subject: [dpdk-dev] [PATCH] build: enable Arm NEON flags when __aarch64__
>  is defined
> X-Mailer: git-send-email 2.7.4
> 
> GCC version 4.8.5 does not pre-define __ARM_NEON. NEON is not
> optional for ArmV8. Hence NEON related code can be enabled
> when __aarch64__ is defined.
> 
> Bugzilla ID: 82
> 
> Signed-off-by: Honnappa Nagarahalli 
> Reviewed-by: Phil Yang 
> Reviewed-by: Gavin Hu 
> Reported-by: Raslan Darawsheh 
> Reported-by: Thomas F Herbert 

Fixes:
Cc: stable
blank line
Reported-by:
Suggested-by:
Signed-off-by:

In general, please follow the above order.

With the above change:
Acked-by: Jerin Jacob 

> ---
>  config/arm/meson.build | 3 ++-
>  mk/rte.cpuflags.mk | 2 +-
>  2 files changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/config/arm/meson.build b/config/arm/meson.build
> index 40dbc87f7..94cca490e 100644
> --- a/config/arm/meson.build
> +++ b/config/arm/meson.build
> @@ -157,7 +157,8 @@ else
>  endif
>  message(machine_args)
> 
> -if cc.get_define('__ARM_NEON', args: machine_args) != ''
> +if (cc.get_define('__ARM_NEON', args: machine_args) != '' or
> +cc.get_define('__aarch64__', args: machine_args) != '')
> dpdk_conf.set('RTE_MACHINE_CPUFLAG_NEON', 1)
> compile_time_cpuflags += ['RTE_CPUFLAG_NEON']
>  endif
> diff --git a/mk/rte.cpuflags.mk b/mk/rte.cpuflags.mk
> index 60713137d..43ed84155 100644
> --- a/mk/rte.cpuflags.mk
> +++ b/mk/rte.cpuflags.mk
> @@ -89,7 +89,7 @@ CPUFLAGS += VSX
>  endif
> 
>  # ARM flags
> -ifneq ($(filter $(AUTO_CPUFLAGS),__ARM_NEON),)
> +ifneq ($(filter __ARM_NEON __aarch64__,$(AUTO_CPUFLAGS)),)
>  CPUFLAGS += NEON
>  endif
> 
> --
> 2.17.1
> 


Re: [dpdk-dev] [PATCH v2] kni: fix kni Rx fifo producer synchronization

2018-08-29 Thread Ola Liljedahl
Is the rte_kni kernel/user binary interface subject to backwards compatibility 
requirements? Or can we change it for a new DPDK release?

-- Ola

From: "Kokkilagadda, Kiran" 
Date: Wednesday, 29 August 2018 at 07:50
To: Honnappa Nagarahalli, Gavin Hu, Ferruh Yigit, "Jacob, Jerin"
Cc: "dev@dpdk.org", nd, Ola Liljedahl, Steve Capper
Subject: Re: [dpdk-dev] [PATCH v2] kni: fix kni Rx fifo producer synchronization


Agreed. Please go ahead and make the changes. You need to make the same change 
on the kernel side as well. Please use the C11 ring mechanism (see rte_ring) so 
that it won't impact other platforms such as Intel. We need this change only 
for Arm and PPC.


From: Honnappa Nagarahalli 
Sent: Wednesday, August 29, 2018 10:29 AM
To: Gavin Hu; Kokkilagadda, Kiran; Ferruh Yigit; Jacob, Jerin
Cc: dev@dpdk.org; nd; Ola Liljedahl; Steve Capper
Subject: RE: [dpdk-dev] [PATCH v2] kni: fix kni Rx fifo producer synchronization


I agree with Gavin here. Stores to fifo->write and fifo->read can get hoisted, 
resulting in accesses to invalid buffer array entries or overwriting of 
buffer array entries.

IMO, we should solve this using C11 atomics. This will also help remove the use 
of ‘volatile’ from the ‘rte_kni_fifo’ structure.



If you want us to put together a patch with this idea, please let us know.



Thank you,

Honnappa



From: Gavin Hu
Sent: Tuesday, August 28, 2018 2:31 PM
To: Kokkilagadda, Kiran; Ferruh Yigit; Jacob, Jerin
Cc: dev@dpdk.org; Honnappa Nagarahalli; nd; Ola Liljedahl; Steve Capper
Subject: RE: [dpdk-dev] [PATCH v2] kni: fix kni Rx fifo producer synchronization



Assuming the reader and writer may execute on different CPUs, this becomes a 
standard multithreaded programming problem.

We are concerned that the read pointer may be updated too early (weak ordering 
may reorder the update before the reads from the slots). The slots are then 
released and may immediately be overwritten by the writer, so the reader gets 
“too new” data and loses the old data.
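
To make the ordering concern concrete, here is a minimal single-producer, single-consumer FIFO sketch using C11 atomics. It is not the rte_kni_fifo code: struct sketch_fifo, sketch_fifo_put() and sketch_fifo_get() are hypothetical names, and the full/empty accounting (free-running counters, power-of-two length) is chosen for brevity rather than copied from KNI. The point is only where the acquire and release operations sit.

```c
#include <stdatomic.h>
#include <stddef.h>

#define SKETCH_FIFO_LEN 8u /* hypothetical size; must be a power of two */

struct sketch_fifo {
	atomic_uint write; /* count of items produced so far */
	atomic_uint read;  /* count of items consumed so far */
	void *buffer[SKETCH_FIFO_LEN];
};

/* Producer: fill the slot first, then publish it with a release store. */
static int
sketch_fifo_put(struct sketch_fifo *f, void *item)
{
	unsigned int w = atomic_load_explicit(&f->write, memory_order_relaxed);
	/* Acquire pairs with the consumer's release store to 'read'. */
	unsigned int r = atomic_load_explicit(&f->read, memory_order_acquire);

	if (w - r == SKETCH_FIFO_LEN)
		return -1; /* full */
	f->buffer[w & (SKETCH_FIFO_LEN - 1)] = item;
	/* The slot write above cannot be reordered after this store. */
	atomic_store_explicit(&f->write, w + 1, memory_order_release);
	return 0;
}

/* Consumer: read the slot first, then hand it back with a release store. */
static int
sketch_fifo_get(struct sketch_fifo *f, void **item)
{
	unsigned int r = atomic_load_explicit(&f->read, memory_order_relaxed);
	/* Acquire pairs with the producer's release store to 'write'. */
	unsigned int w = atomic_load_explicit(&f->write, memory_order_acquire);

	if (r == w)
		return -1; /* empty */
	*item = f->buffer[r & (SKETCH_FIFO_LEN - 1)];
	/* The slot read above cannot be reordered after this store. */
	atomic_store_explicit(&f->read, r + 1, memory_order_release);
	return 0;
}
```

The release store to 'read' is what keeps the slot from being handed back to the writer before the reader has finished loading from it, which is the "too new" data case described above.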



From: Kokkilagadda, Kiran <kiran.kokkilaga...@cavium.com>
Sent: Tuesday, August 28, 2018 6:44 PM
To: Gavin Hu <gavin...@arm.com>; Ferruh Yigit <ferruh.yi...@intel.com>; Jacob, Jerin <jerin.jacobkollanukka...@cavium.com>
Cc: dev@dpdk.org; Honnappa Nagarahalli <honnappa.nagaraha...@arm.com>
Subject: Re: [dpdk-dev] [PATCH v2] kni: fix kni Rx fifo producer synchronization



In this instance there won't be any problem: until the value of fifo->write 
changes, this loop won't be executed. So far we haven't seen any issue with 
it, and for performance reasons we don't want to have a read barrier.







From: Gavin Hu <gavin...@arm.com>
Sent: Monday, August 27, 2018 9:10 PM
To: Ferruh Yigit; Kokkilagadda, Kiran; Jacob, Jerin
Cc: dev@dpdk.org; Honnappa Nagarahalli
Subject: RE: [dpdk-dev] [PATCH v2] kni: fix kni Rx fifo producer synchronization



This fix is not complete: kni_fifo_get also requires a read fence; otherwise it 
may get stale data on a weakly ordered platform.

> -Original Message-
> From: dev <dev-boun...@dpdk.org> On Behalf Of Ferruh Yigit
> Sent: Monday, August 27, 2018 10:08 PM
> To: Kiran Kumar <kkokkilaga...@caviumnetworks.com>;
> jerin.ja...@caviumnetworks.com
> Cc: dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v2] kni: fix kni Rx fifo producer
> synchronization
>
> On 8/16/2018 10:55 AM, Kiran Kumar wrote:
> > With the existing code in kni_fifo_put, rx_q values are not being updated
> > before updating fifo_write. While reading rx_q in kni_net_rx_normal,
> > this causes a sync issue on the other core. So add a write
> > barrier to make sure the values are synced before updating fifo_write.
> >
> > Fixes: 3fc5ca2f6352 ("kni: initial import")
> >
> > Signed-off-by: Kiran Kumar <kkokkilaga...@caviumnetworks.com>
> > Acked-by: Jerin Jacob <jerin.ja...@caviumnetworks.com>
>
> Acked-by: Ferruh Yigit <ferruh.yi...@intel.com>


[dpdk-dev] [PATCH 1/2] net/sfc: support runtime Rx queue setup

2018-08-29 Thread Andrew Rybchenko
From: Igor Romanov 

Signed-off-by: Igor Romanov 
Signed-off-by: Andrew Rybchenko 
---
 doc/guides/nics/features/sfc_efx.ini | 1 +
 drivers/net/sfc/sfc_ethdev.c | 6 ++
 drivers/net/sfc/sfc_rx.c | 6 --
 3 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/doc/guides/nics/features/sfc_efx.ini b/doc/guides/nics/features/sfc_efx.ini
index 8a497ee05..5d2e90102 100644
--- a/doc/guides/nics/features/sfc_efx.ini
+++ b/doc/guides/nics/features/sfc_efx.ini
@@ -9,6 +9,7 @@ Link status  = Y
 Link status event= Y
 Fast mbuf free   = Y
 Queue start/stop = Y
+Runtime Rx queue setup = Y
 MTU update   = Y
 Jumbo frame  = Y
 Scattered Rx = Y
diff --git a/drivers/net/sfc/sfc_ethdev.c b/drivers/net/sfc/sfc_ethdev.c
index 9decbf5af..9b5324ca6 100644
--- a/drivers/net/sfc/sfc_ethdev.c
+++ b/drivers/net/sfc/sfc_ethdev.c
@@ -171,6 +171,8 @@ sfc_dev_infos_get(struct rte_eth_dev *dev, struct rte_eth_dev_info *dev_info)
sa->dp_rx->get_dev_info(dev_info);
if (sa->dp_tx->get_dev_info != NULL)
sa->dp_tx->get_dev_info(dev_info);
+
+   dev_info->dev_capa = RTE_ETH_DEV_CAPA_RUNTIME_RX_QUEUE_SETUP;
 }
 
 static const uint32_t *
@@ -1143,6 +1145,9 @@ sfc_rx_queue_start(struct rte_eth_dev *dev, uint16_t rx_queue_id)
if (sa->state != SFC_ADAPTER_STARTED)
goto fail_not_started;
 
+   if (sa->rxq_info[rx_queue_id].rxq == NULL)
+   goto fail_not_setup;
+
rc = sfc_rx_qstart(sa, rx_queue_id);
if (rc != 0)
goto fail_rx_qstart;
@@ -1154,6 +1159,7 @@ sfc_rx_queue_start(struct rte_eth_dev *dev, uint16_t rx_queue_id)
return 0;
 
 fail_rx_qstart:
+fail_not_setup:
 fail_not_started:
sfc_adapter_unlock(sa);
SFC_ASSERT(rc > 0);
diff --git a/drivers/net/sfc/sfc_rx.c b/drivers/net/sfc/sfc_rx.c
index d8503e201..c6321d174 100644
--- a/drivers/net/sfc/sfc_rx.c
+++ b/drivers/net/sfc/sfc_rx.c
@@ -673,6 +673,7 @@ sfc_rx_qstart(struct sfc_adapter *sa, unsigned int sw_index)
 
rxq_info = &sa->rxq_info[sw_index];
rxq = rxq_info->rxq;
+   SFC_ASSERT(rxq != NULL);
SFC_ASSERT(rxq->state == SFC_RXQ_INITIALIZED);
 
evq = rxq->evq;
@@ -763,7 +764,7 @@ sfc_rx_qstop(struct sfc_adapter *sa, unsigned int sw_index)
rxq_info = &sa->rxq_info[sw_index];
rxq = rxq_info->rxq;
 
-   if (rxq->state == SFC_RXQ_INITIALIZED)
+   if (rxq == NULL || rxq->state == SFC_RXQ_INITIALIZED)
return;
SFC_ASSERT(rxq->state & SFC_RXQ_STARTED);
 
@@ -1363,7 +1364,8 @@ sfc_rx_start(struct sfc_adapter *sa)
goto fail_rss_config;
 
for (sw_index = 0; sw_index < sa->rxq_count; ++sw_index) {
-   if ((!sa->rxq_info[sw_index].deferred_start ||
+   if (sa->rxq_info[sw_index].rxq != NULL &&
+   (!sa->rxq_info[sw_index].deferred_start ||
 sa->rxq_info[sw_index].deferred_started)) {
rc = sfc_rx_qstart(sa, sw_index);
if (rc != 0)
-- 
2.17.1



[dpdk-dev] [PATCH 2/2] net/sfc: support runtime Tx queue setup

2018-08-29 Thread Andrew Rybchenko
From: Igor Romanov 

Signed-off-by: Igor Romanov 
Signed-off-by: Andrew Rybchenko 
---
 doc/guides/nics/features/sfc_efx.ini | 1 +
 drivers/net/sfc/sfc_ethdev.c | 7 ++-
 drivers/net/sfc/sfc_tx.c | 8 +---
 3 files changed, 12 insertions(+), 4 deletions(-)

diff --git a/doc/guides/nics/features/sfc_efx.ini b/doc/guides/nics/features/sfc_efx.ini
index 5d2e90102..d1aa83313 100644
--- a/doc/guides/nics/features/sfc_efx.ini
+++ b/doc/guides/nics/features/sfc_efx.ini
@@ -10,6 +10,7 @@ Link status event= Y
 Fast mbuf free   = Y
 Queue start/stop = Y
 Runtime Rx queue setup = Y
+Runtime Tx queue setup = Y
 MTU update   = Y
 Jumbo frame  = Y
 Scattered Rx = Y
diff --git a/drivers/net/sfc/sfc_ethdev.c b/drivers/net/sfc/sfc_ethdev.c
index 9b5324ca6..435bde67f 100644
--- a/drivers/net/sfc/sfc_ethdev.c
+++ b/drivers/net/sfc/sfc_ethdev.c
@@ -172,7 +172,8 @@ sfc_dev_infos_get(struct rte_eth_dev *dev, struct rte_eth_dev_info *dev_info)
if (sa->dp_tx->get_dev_info != NULL)
sa->dp_tx->get_dev_info(dev_info);
 
-   dev_info->dev_capa = RTE_ETH_DEV_CAPA_RUNTIME_RX_QUEUE_SETUP;
+   dev_info->dev_capa = RTE_ETH_DEV_CAPA_RUNTIME_RX_QUEUE_SETUP |
+RTE_ETH_DEV_CAPA_RUNTIME_TX_QUEUE_SETUP;
 }
 
 static const uint32_t *
@@ -1197,6 +1198,9 @@ sfc_tx_queue_start(struct rte_eth_dev *dev, uint16_t tx_queue_id)
if (sa->state != SFC_ADAPTER_STARTED)
goto fail_not_started;
 
+   if (sa->txq_info[tx_queue_id].txq == NULL)
+   goto fail_not_setup;
+
rc = sfc_tx_qstart(sa, tx_queue_id);
if (rc != 0)
goto fail_tx_qstart;
@@ -1208,6 +1212,7 @@ sfc_tx_queue_start(struct rte_eth_dev *dev, uint16_t tx_queue_id)
 
 fail_tx_qstart:
 
+fail_not_setup:
 fail_not_started:
sfc_adapter_unlock(sa);
SFC_ASSERT(rc > 0);
diff --git a/drivers/net/sfc/sfc_tx.c b/drivers/net/sfc/sfc_tx.c
index 6d42a1a65..8af08b37c 100644
--- a/drivers/net/sfc/sfc_tx.c
+++ b/drivers/net/sfc/sfc_tx.c
@@ -421,6 +421,7 @@ sfc_tx_qstart(struct sfc_adapter *sa, unsigned int sw_index)
 
txq = txq_info->txq;
 
+   SFC_ASSERT(txq != NULL);
SFC_ASSERT(txq->state == SFC_TXQ_INITIALIZED);
 
evq = txq->evq;
@@ -501,7 +502,7 @@ sfc_tx_qstop(struct sfc_adapter *sa, unsigned int sw_index)
 
txq = txq_info->txq;
 
-   if (txq->state == SFC_TXQ_INITIALIZED)
+   if (txq == NULL || txq->state == SFC_TXQ_INITIALIZED)
return;
 
SFC_ASSERT(txq->state & SFC_TXQ_STARTED);
@@ -578,8 +579,9 @@ sfc_tx_start(struct sfc_adapter *sa)
goto fail_efx_tx_init;
 
for (sw_index = 0; sw_index < sa->txq_count; ++sw_index) {
-   if (!(sa->txq_info[sw_index].deferred_start) ||
-   sa->txq_info[sw_index].deferred_started) {
+   if (sa->txq_info[sw_index].txq != NULL &&
+   (!(sa->txq_info[sw_index].deferred_start) ||
+sa->txq_info[sw_index].deferred_started)) {
rc = sfc_tx_qstart(sa, sw_index);
if (rc != 0)
goto fail_tx_qstart;
-- 
2.17.1



[dpdk-dev] [PATCH] net/failsafe: limit device capabilities to really supported

2018-08-29 Thread Andrew Rybchenko
From: Igor Romanov 

Failsafe driver does not support any device capabilities yet.
Make fs_dev_infos_get() consider default ones to limit advertised
device capabilities to really supported instead of unconditional
inheritance from sub-devices.

Fixes: cac923cfea47 ("ethdev: support runtime queue setup")
Cc: sta...@dpdk.org

Signed-off-by: Igor Romanov 
Signed-off-by: Andrew Rybchenko 
---
 drivers/net/failsafe/failsafe_ops.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/drivers/net/failsafe/failsafe_ops.c b/drivers/net/failsafe/failsafe_ops.c
index 24e91c931..2df8b55d9 100644
--- a/drivers/net/failsafe/failsafe_ops.c
+++ b/drivers/net/failsafe/failsafe_ops.c
@@ -716,6 +716,8 @@ fs_stats_reset(struct rte_eth_dev *dev)
  *  all sub_devices and the default capabilities.
  *  Uses a logical AND of TX capabilities among
  *  the active probed sub_device and the default capabilities.
+ *  Uses a logical AND of device capabilities among
+ *  all sub_devices and the default capabilities.
  *
  */
 static void
@@ -734,10 +736,12 @@ fs_dev_infos_get(struct rte_eth_dev *dev,
uint64_t rx_offload_capa;
uint64_t rxq_offload_capa;
uint64_t rss_hf_offload_capa;
+   uint64_t dev_capa;
 
rx_offload_capa = default_infos.rx_offload_capa;
rxq_offload_capa = default_infos.rx_queue_offload_capa;
rss_hf_offload_capa = default_infos.flow_type_rss_offloads;
+   dev_capa = default_infos.dev_capa;
FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_PROBED) {
rte_eth_dev_info_get(PORT_ID(sdev),
&PRIV(dev)->infos);
@@ -746,12 +750,14 @@ fs_dev_infos_get(struct rte_eth_dev *dev,
PRIV(dev)->infos.rx_queue_offload_capa;
rss_hf_offload_capa &=
PRIV(dev)->infos.flow_type_rss_offloads;
+   dev_capa &= PRIV(dev)->infos.dev_capa;
}
sdev = TX_SUBDEV(dev);
rte_eth_dev_info_get(PORT_ID(sdev), &PRIV(dev)->infos);
PRIV(dev)->infos.rx_offload_capa = rx_offload_capa;
PRIV(dev)->infos.rx_queue_offload_capa = rxq_offload_capa;
PRIV(dev)->infos.flow_type_rss_offloads = rss_hf_offload_capa;
+   PRIV(dev)->infos.dev_capa = dev_capa;
PRIV(dev)->infos.tx_offload_capa &=
default_infos.tx_offload_capa;
PRIV(dev)->infos.tx_queue_offload_capa &=
-- 
2.17.1
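
The capability computation described in the commit message can be read as a fold with bitwise AND. The sketch below is a standalone illustration under that reading; sketch_combine_dev_capa() and its parameters are hypothetical and do not exist in the failsafe driver.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Start from the default capabilities and AND in each sub-device's
 * capabilities. A bit survives only if the defaults and every
 * sub-device advertise it.
 */
static uint64_t
sketch_combine_dev_capa(uint64_t default_capa, const uint64_t *subdev_capa,
			size_t nb_subdevs)
{
	uint64_t capa = default_capa;
	size_t i;

	for (i = 0; i < nb_subdevs; i++)
		capa &= subdev_capa[i];
	return capa;
}
```

If the default capability mask is zero, as the commit message implies for failsafe today, the result is zero regardless of what the sub-devices report.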



[dpdk-dev] [PATCH] net/bonding: use evenly distributed default RSS RETA

2018-08-29 Thread Andrew Rybchenko
From: Igor Romanov 

The default Redirection Table that is set in the bonding driver is distributed
evenly over all Rx queues only within every RETA group (the first RETA
entry in every group always starts at zero). But in most drivers,
the default RETA is distributed over all Rx queues without sequence
resets at the beginning of a new group, which implies more balanced
per-core load.

Change the default RETA to be evenly distributed over all Rx queues
considering the whole table.

Fixes: 734ce47f71e0 ("bonding: support RSS dynamic configuration")
Cc: sta...@dpdk.org

Signed-off-by: Igor Romanov 
Signed-off-by: Andrew Rybchenko 
---
 drivers/net/bonding/rte_eth_bond_pmd.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/net/bonding/rte_eth_bond_pmd.c b/drivers/net/bonding/rte_eth_bond_pmd.c
index b84f32263..0f5ab09e3 100644
--- a/drivers/net/bonding/rte_eth_bond_pmd.c
+++ b/drivers/net/bonding/rte_eth_bond_pmd.c
@@ -3293,7 +3293,9 @@ bond_ethdev_configure(struct rte_eth_dev *dev)
for (i = 0; i < RTE_DIM(internals->reta_conf); i++) {
internals->reta_conf[i].mask = ~0LL;
for (j = 0; j < RTE_RETA_GROUP_SIZE; j++)
-   internals->reta_conf[i].reta[j] = j % dev->data->nb_rx_queues;
+   internals->reta_conf[i].reta[j] =
+   (i * RTE_RETA_GROUP_SIZE + j) %
+   dev->data->nb_rx_queues;
}
}
 
-- 
2.17.1
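
A small standalone sketch of the two index formulas may help when comparing the old and new behaviour. SKETCH_RETA_GROUP_SIZE stands in for RTE_RETA_GROUP_SIZE and is assumed to be 64 here; the two helper functions are hypothetical names, not part of the bonding driver.

```c
#include <stdint.h>

#define SKETCH_RETA_GROUP_SIZE 64u /* assumed stand-in for RTE_RETA_GROUP_SIZE */

/* Old formula: the queue sequence restarts at 0 in every group. */
static uint16_t
sketch_reta_old(unsigned int group, unsigned int idx, uint16_t nb_rx_queues)
{
	(void)group; /* the group index was ignored */
	return (uint16_t)(idx % nb_rx_queues);
}

/* New formula: the sequence continues across group boundaries. */
static uint16_t
sketch_reta_new(unsigned int group, unsigned int idx, uint16_t nb_rx_queues)
{
	return (uint16_t)((group * SKETCH_RETA_GROUP_SIZE + idx) % nb_rx_queues);
}
```

As an example, take 3 Rx queues. Entry 63 of group 0 maps to queue 0 under both formulas. The first entry of group 1 maps to queue 0 again under the old formula, but to queue 1 under the new one. Over eight such groups (512 entries) the old formula gives queue 0 176 entries against 168 for each of the others, while the new one gives 171, 171 and 170.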



[dpdk-dev] [PATCH] net/bonding: don't ignore RSS key on device configuration

2018-08-29 Thread Andrew Rybchenko
From: Igor Romanov 

The bonding driver ignores the RSS key set in the port RSS configuration
in bond_ethdev_configure(), so the only way to set a non-default RSS key
is to call rss_hash_update(). This is not the expected behaviour.

Make bond_ethdev_configure() set the default RSS key only if the
requested key is NULL.

Fixes: 734ce47f71e0 ("bonding: support RSS dynamic configuration")
Cc: sta...@dpdk.org

Signed-off-by: Igor Romanov 
Signed-off-by: Andrew Rybchenko 
---
 drivers/net/bonding/rte_eth_bond_pmd.c | 27 ++
 1 file changed, 19 insertions(+), 8 deletions(-)

diff --git a/drivers/net/bonding/rte_eth_bond_pmd.c 
b/drivers/net/bonding/rte_eth_bond_pmd.c
index b84f32263..ad670cc20 100644
--- a/drivers/net/bonding/rte_eth_bond_pmd.c
+++ b/drivers/net/bonding/rte_eth_bond_pmd.c
@@ -1778,12 +1778,11 @@ slave_configure(struct rte_eth_dev *bonded_eth_dev,
 
/* If RSS is enabled for bonding, try to enable it for slaves  */
if (bonded_eth_dev->data->dev_conf.rxmode.mq_mode & ETH_MQ_RX_RSS_FLAG) 
{
-   if 
(bonded_eth_dev->data->dev_conf.rx_adv_conf.rss_conf.rss_key_len
-   != 0) {
+   if (internals->rss_key_len != 0) {

slave_eth_dev->data->dev_conf.rx_adv_conf.rss_conf.rss_key_len =
-   
bonded_eth_dev->data->dev_conf.rx_adv_conf.rss_conf.rss_key_len;
+   internals->rss_key_len;

slave_eth_dev->data->dev_conf.rx_adv_conf.rss_conf.rss_key =
-   
bonded_eth_dev->data->dev_conf.rx_adv_conf.rss_conf.rss_key;
+   internals->rss_key;
} else {

slave_eth_dev->data->dev_conf.rx_adv_conf.rss_conf.rss_key = NULL;
}
@@ -3284,11 +3283,23 @@ bond_ethdev_configure(struct rte_eth_dev *dev)
 
unsigned i, j;
 
-   /* If RSS is enabled, fill table and key with default values */
+   /*
+* If RSS is enabled, fill table with default values and
+* set key to the value specified in port RSS configuration.
+* Fall back to default RSS key if the key is not specified
+*/
if (dev->data->dev_conf.rxmode.mq_mode & ETH_MQ_RX_RSS) {
-   dev->data->dev_conf.rx_adv_conf.rss_conf.rss_key = 
internals->rss_key;
-   dev->data->dev_conf.rx_adv_conf.rss_conf.rss_key_len = 0;
-   memcpy(internals->rss_key, default_rss_key, 40);
+   if (dev->data->dev_conf.rx_adv_conf.rss_conf.rss_key != NULL) {
+   internals->rss_key_len =
+   
dev->data->dev_conf.rx_adv_conf.rss_conf.rss_key_len;
+   memcpy(internals->rss_key,
+  dev->data->dev_conf.rx_adv_conf.rss_conf.rss_key,
+  internals->rss_key_len);
+   } else {
+   internals->rss_key_len = sizeof(default_rss_key);
+   memcpy(internals->rss_key, default_rss_key,
+  internals->rss_key_len);
+   }
 
for (i = 0; i < RTE_DIM(internals->reta_conf); i++) {
internals->reta_conf[i].mask = ~0LL;
-- 
2.17.1
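
The key selection described above can be sketched outside the driver as
follows. This is a minimal sketch under stated assumptions: struct key_store,
key_store_set(), KEY_MAX and the placeholder default key are hypothetical,
and the length check is an addition of the sketch, not part of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define KEY_MAX 40 /* hypothetical size of the internal key buffer */

/* Placeholder default key; the real driver has its own default_rss_key. */
static const uint8_t default_key[KEY_MAX] = { 0x6d, 0x5a };

struct key_store {
	uint8_t key[KEY_MAX];
	uint8_t key_len;
};

/*
 * Store the requested key if one is given, otherwise fall back to the
 * default key. Returns 0 on success, -1 if the requested key is too long.
 */
static int
key_store_set(struct key_store *ks, const uint8_t *req_key, size_t req_len)
{
	assert(ks != NULL);

	if (req_key != NULL) {
		/* Bounds check added by this sketch. */
		if (req_len > sizeof(ks->key))
			return -1;
		memcpy(ks->key, req_key, req_len);
		ks->key_len = (uint8_t)req_len;
	} else {
		memcpy(ks->key, default_key, sizeof(default_key));
		ks->key_len = sizeof(default_key);
	}
	return 0;
}
```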



Re: [dpdk-dev] [PATCH 01/11] telemetry: initial telemetry infrastructure

2018-08-29 Thread Gaëtan Rivet
On Tue, Aug 28, 2018 at 04:54:33PM +, Van Haaren, Harry wrote:
> Hi Gaetan,
> 
> > -Original Message-
> > From: Gaëtan Rivet [mailto:gaetan.ri...@6wind.com]
> > Sent: Tuesday, August 28, 2018 12:47 PM
> > To: Power, Ciara 
> > Cc: Van Haaren, Harry ; Archbold, Brian
> > ; Kenny, Emma ; dev@dpdk.org
> > Subject: Re: [dpdk-dev] [PATCH 01/11] telemetry: initial telemetry
> > infrastructure
> > 
> > Hi Ciara,
> > 
> > On Thu, Aug 23, 2018 at 01:08:03PM +0100, Ciara Power wrote:
> > > This patch adds the infrastructure and initial code for the
> > > telemetry library.
> > >
> > > A virtual device is used for telemetry, which is included in
> > > this patch. To use telemetry, a command-line argument must be
> > > used - "--vdev=telemetry".
> > >
> > 
> > Why use a virtual device?
> > 
> > It seems that you are only using the device semantic as a way to carry
> > around a tag telling the DPDK framework to init your library once it has
> > finished its initialization.
> > 
> > I guess you wanted to avoid having to add the call to rte_telemetry_init
> > to all applications. In the absence of a proper EAL option framework,
> > you can workaround by adding a --telemetry EAL parameter, setting a flag
> > on, and checking this flag from librte_telemetry, within a routine
> > declared with RTE_INIT_PRIO.
> 
> I suppose that an EAL --flag could work too, it would mean that EAL would
> depend on this library. The --vdev trick keeps the library standalone.
> 
> I don't have a strong opinion either way. :)
> 

This was done already for specific EAL configuration items such as
vfio intr_mode or PCI uio configuration.

Of course this is ugly, but the --telemetry parameter can exist without
compiling the lib. You can add a warning if the TELEMETRY Mconfig
item is not set to mitigate. The main issue is that you need to add
getters because you cannot declare an external *struct internal_config*
reference.

I agree this is awkward, and this is exactly the reason we need a
way for libraries to register options in the EAL, but this is not
yet done.

The virtual device solution however is a crutch used to emulate this
absent framework. This will complicate developing the proper solution
and its adoption once done. It would not be clear then to the devs that they
can translate the telemetry shim parameter to the new framework, without
having to rework the whole infrastructure of the lib (and this is without
talking about reworking the build system to remove the telemetry driver).

Even having to add a new driver subsection only for telemetry is awkward.

So we might certainly wait for second or third opinions, but I am firmly
convinced the project would be easier to maintain (from the EAL, systems
and library standpoints) without the vdev trick.

> 
> > I only see afterward the selftest being triggered via kvargs. I haven't
> > yet looked at the testing done, but if it is only unit test, the "test"
> > app would be better suited. If it is integration testing to verify the
> > behavior of the library with other PMDs, you probably need specific
> > context, thus selftest being insufficient on its own and useless for
> > other users.
> 
> Correct, self tests are triggered by kvargs. This same model is used
> in eg: eventdev PMDs to run selftests, where the tests are pretty complex
> and specific to the device under test.
> 
> Again, I don't have a strong opinion but I don't see any issue with it
> being included in the vdev / telemetry library. We could write a shim
> test that the "official" test binary runs the telemetry tests if that is
> your concern?
> 
> 

Okay, I have no strong opinion about this (actually I prefer having the
test code close to the code-under-test), but eventdev can spawn device
objects to drive the test and provide configuration.

It would be more complicated using the same logic with a pure library,
without the vdev.

> > > Control threads are used to get CPU cycles for telemetry, which
> > > are configured in this patch also.
> > >
> > > Signed-off-by: Ciara Power 
> > > Signed-off-by: Brian Archbold 
> > 
> > Regards,
> > --
> > Gaëtan Rivet
> > 6WIND
> 
> Thanks for review, and there's a lightning talk at Userspace so please
> do provide input there too :) -Harry

-- 
Gaëtan Rivet
6WIND


Re: [dpdk-dev] 18.08 build error on ppc64el - bool as vector type

2018-08-29 Thread Christian Ehrhardt
On Tue, Aug 28, 2018 at 5:02 PM Adrien Mazarguil 
wrote:

> On Tue, Aug 28, 2018 at 02:38:35PM +0200, Christian Ehrhardt wrote:
> > On Tue, Aug 28, 2018 at 1:44 PM Adrien Mazarguil <
> adrien.mazarg...@6wind.com>
> > wrote:
> >
> > > On Tue, Aug 28, 2018 at 01:30:12PM +0200, Christian Ehrhardt wrote:
> > > > On Mon, Aug 27, 2018 at 2:22 PM Adrien Mazarguil <
> > > adrien.mazarg...@6wind.com>
> > > > wrote:
> > > >
> > > > > Hi Christian,
> > > > >
> > > > > On Wed, Aug 22, 2018 at 05:11:41PM +0200, Christian Ehrhardt wrote:
> > > > > > Just FYI the simple change hits similar issues later on.
> > > > > >
> > > > > > The (not really) proposed patch would have to be extended to be
> as
> > > > > > following.
> > > > > > We really need a better solution (or somebody has to convince me
> > > that my
> > > > > > change is better than a band aid).
> > > > >
> > > > > Thanks for reporting. I've made a quick investigation on my own and
> > > believe
> > > > > it's a toolchain issue which may affect more than this PMD;
> > > potentially all
> > > > > users of stdbool.h (C11) on this platform.
> > > > >
> > > >
> > > > Yeah I assumed as much, which is why I was hoping that some of the
> arch
> > > > experts would jump in and say "yeah this is a common thing and
> correctly
> > > > handled like "
> > > > I'll continue trying to reach out to people that should know better
> still
> > > > ...
> > > >
> > > >
> > > > > C11's stdbool.h defines a bool macro as _Bool (big B) along with
> > > > > true/false. On PPC targets, another file (altivec.h) defines bool
> as
> > > _bool
> > > > > (small b) but not true/false:
> > > > >
> > > > >  #if !defined(__APPLE_ALTIVEC__)
> > > > >  /* You are allowed to undef these for C++ compatibility.  */
> > > > >  #define vector __vector
> > > > >  #define pixel __pixel
> > > > >  #define bool __bool
> > > > >  #endif
> > > > >
> > > > > mlx5_nl.c explicitly includes stdbool.h to get the above
> definitions
> > > then
> > > > > includes mlx5.h -> rte_ether.h -> ppc_64/rte_memcpy.h -> altivec.h.
> > > > >
> > > > > For some reason the conflicting bool redefinition doesn't seem to
> > > raise any
> > > > > warnings, but results in mismatching bool and true/false
> definitions;
> > > an
> > > > > integer value cannot be assigned to a bool variable anymore, hence
> the
> > > > > build
> > > > > failure.
> > > > >
> > > > > The inability to assign integer values to bool is, in my opinion, a
> > > > > fundamental issue caused by altivec.h. If there is no way to fix
> this
> > > on
> > > > > the
> > > > > system, there are a couple of workarounds for DPDK, by order of
> > > preference:
> > > > >
> > > > > 1. Always #undef bool after including altivec.h in
> > > > >lib/librte_eal/common/include/arch/ppc_64/rte_memcpy.h. I do not
> > > think
> > > > >anyone expects this type to be unusable with true/false or
> integer
> > > > > values
> > > > >anyway. The version of altivec.h I have doesn't rely on this
> macro
> > > at
> > > > >all so it's probably not a big loss.
> > > > >
> > > >
> > > > The undef of a definition in header A by header B can lead to most
> > > > interesting, still broken effects.
> > > > If e.g. one does
> > > > #include 
> > > > #include "mlx5.h"
> > > >
> > > > or similar then it would undefine that of stdbool as well right?
> > > > In any case, the undefine not only would be suspicious it also fails
> > > right
> > > > away:
> > > >
> > > > In file included from
> > > > /home/ubuntu/deb_dpdk/lib/librte_eal/common/malloc_heap.c:27:
> > > > /home/ubuntu/deb_dpdk/lib/librte_eal/common/eal_memalloc.h:30:15:
> > > > error: unknown
> > > > type name ‘bool’; did you mean ‘_Bool’?
> > > >   int socket, bool exact);
> > > >   ^~~~
> > > >   _Bool
> > > > [...]
> > > >
> > > >
> > > >
> > > > >Ditto for "pixel" and "vector" keywords. Alternatively you could
> > > #define
> > > > >__APPLE_ALTIVEC__ before including altivec.h to prevent them
> from
> > > > > getting
> > > > >defined in the first place.
> > > > >
> > > >
> > > > Interesting I got plenty of these:
> > > > In file included from
> > > > /home/ubuntu/deb_dpdk/lib/librte_eal/common/eal_common_options.c:25:
> > > >
> /home/ubuntu/deb_dpdk/debian/build/static-root/include/rte_memcpy.h:39:
> > > > warning:
> > > > "__APPLE_ALTIVEC__" redefined
> > > > #define __APPLE_ALTIVEC__
> > > >
> > > > With a few of it being even errors, but the position of the original
> > > define
> > > > is interesting.
> > > >
> /home/ubuntu/deb_dpdk/debian/build/static-root/include/rte_memcpy.h:39:
> > > error:
> > > > "__APPLE_ALTIVEC__" redefined [-Werror]
> > > > #define __APPLE_ALTIVEC__
> > > > : note: this is the location of the previous definition
> > > >
> > > > So if being a built-in, shouldn't it ALWAYS be defined and never
> > > > over-declare the bool type?
> > > >
> > > > Checking GCC on the platform:
> > > > $ gcc -dM -E - < /dev/null | grep ALTI
> > > > #define __ALTIVEC__ 1
> > > > #d

Re: [dpdk-dev] [PATCH v2] kni: fix kni Rx fifo producer synchronization

2018-08-29 Thread Jerin Jacob
-Original Message-
> Date: Wed, 29 Aug 2018 07:34:34 +
> From: Ola Liljedahl 
> To: "Kokkilagadda, Kiran" , Honnappa
>  Nagarahalli , Gavin Hu ,
>  Ferruh Yigit , "Jacob,  Jerin"
>  
> CC: "dev@dpdk.org" , nd , Steve Capper
>  
> Subject: Re: [dpdk-dev] [PATCH v2] kni: fix kni Rx fifo producer
>  synchronization
> user-agent: Microsoft-MacOutlook/10.10.0.180812
> 
> Is the rte_kni kernel/user binary interface subject to backwards 
> compatibility requirements? Or can we change it for a new DPDK release?

What would be the change in the interface? If it is removing the volatile
for the C11 case, then you can use an anonymous union OR #define to keep
the size and offset of the elements intact.

struct rte_kni_fifo {
#ifndef RTE_C11...
	volatile unsigned write; /**< Next position to be written */
	volatile unsigned read;  /**< Next position to be read */
#else
	unsigned write;          /**< Next position to be written */
	unsigned read;           /**< Next position to be read */
#endif
	unsigned len;            /**< Circular buffer length */
	unsigned elem_size;      /**< Pointer size - for 32/64 bit OS */
	void *volatile buffer[]; /**< The buffer contains mbuf pointers */
};

Anonymous union example:
https://git.dpdk.org/dpdk/tree/lib/librte_mbuf/rte_mbuf.h#n461

You can check the ABI breakage by devtools/validate-abi.sh
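
As a quick sanity check of the layout argument, a standalone sketch such as
the one below can assert at compile time that dropping the volatile qualifier
leaves the size and member offsets unchanged. The two struct names are
hypothetical and only mirror the fields quoted above; this is not the KNI
header itself.

```c
#include <assert.h>
#include <stddef.h>

/* Layout with volatile indexes, mirroring the fields quoted above. */
struct fifo_volatile {
	volatile unsigned write;
	volatile unsigned read;
	unsigned len;
	unsigned elem_size;
	void *volatile buffer[];
};

/* Same fields without volatile on the indexes (the C11 variant). */
struct fifo_plain {
	unsigned write;
	unsigned read;
	unsigned len;
	unsigned elem_size;
	void *volatile buffer[];
};

/* A type qualifier is not expected to change size or member offsets. */
_Static_assert(sizeof(struct fifo_volatile) == sizeof(struct fifo_plain),
	       "size differs");
_Static_assert(offsetof(struct fifo_volatile, write) ==
	       offsetof(struct fifo_plain, write), "write offset differs");
_Static_assert(offsetof(struct fifo_volatile, read) ==
	       offsetof(struct fifo_plain, read), "read offset differs");
_Static_assert(offsetof(struct fifo_volatile, buffer) ==
	       offsetof(struct fifo_plain, buffer), "buffer offset differs");
```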

> 
> -- Ola
> 
> From: "Kokkilagadda, Kiran" 
> Date: Wednesday, 29 August 2018 at 07:50
> To: Honnappa Nagarahalli , Gavin Hu 
> , Ferruh Yigit , "Jacob, Jerin" 
> 
> Cc: "dev@dpdk.org" , nd , Ola Liljedahl 
> , Steve Capper 
> Subject: Re: [dpdk-dev] [PATCH v2] kni: fix kni Rx fifo producer 
> synchronization
> 
> 
> Agreed. Please go ahead and make the changes. You need to make the same change 
> in kernel side also. And please use c11 ring (see rte_ring) mechanism so that 
> it won't impact other platforms like intel. We need this change just for arm 
> and ppc.
> 
> 
> From: Honnappa Nagarahalli 
> Sent: Wednesday, August 29, 2018 10:29 AM
> To: Gavin Hu; Kokkilagadda, Kiran; Ferruh Yigit; Jacob, Jerin
> Cc: dev@dpdk.org; nd; Ola Liljedahl; Steve Capper
> Subject: RE: [dpdk-dev] [PATCH v2] kni: fix kni Rx fifo producer 
> synchronization
> 
> 
> External Email
> 
> I agree with Gavin here. Store to fifo->write and fifo->read can get hoisted 
> resulting in accessing invalid buffer array entries or over writing of the 
> buffer array entries.
> 
> IMO, we should solve this using c11 atomics. This will also help remove the 
> use of ‘volatile’ from ‘rte_kni_fifo’ structure.
> 
> 
> 
> If you want us to put together a patch with this idea, please let us know.
> 
> 
> 
> Thank you,
> 
> Honnappa
> 
> 
> 
> From: Gavin Hu
> Sent: Tuesday, August 28, 2018 2:31 PM
> To: Kokkilagadda, Kiran ; Ferruh Yigit 
> ; Jacob, Jerin 
> Cc: dev@dpdk.org; Honnappa Nagarahalli ; nd 
> ; Ola Liljedahl ; Steve Capper 
> 
> Subject: RE: [dpdk-dev] [PATCH v2] kni: fix kni Rx fifo producer 
> synchronization
> 
> 
> 
> Assuming the reader and writer may execute on different CPUs, this becomes a 
> standard multithreaded programming problem.
> 
> Our concern is that the read pointer may be updated too early (weak ordering 
> may reorder the update before the reads from the slots). The slots are then 
> released and may immediately be overwritten by the writer, so the reader gets 
> “too new” data and loses the old data.
> 
> 
> 
> From: Kokkilagadda, Kiran 
> mailto:kiran.kokkilaga...@cavium.com>>
> Sent: Tuesday, August 28, 2018 6:44 PM
> To: Gavin Hu mailto:gavin...@arm.com>>; Ferruh Yigit 
> mailto:ferruh.yi...@intel.com>>; Jacob, Jerin 
> mailto:jerin.jacobkollanukka...@cavium.com>>
> Cc: dev@dpdk.org; Honnappa Nagarahalli 
> mailto:honnappa.nagaraha...@arm.com>>
> Subject: Re: [dpdk-dev] [PATCH v2] kni: fix kni Rx fifo producer 
> synchronization
> 
> 
> 
> In this instance there won't be any problem, as until the value of 
> fifo->write changes, this loop won't get executed. As of now we didn't see 
> any issue with it and for performance reasons, we don't want to keep read 
> barrier.
> 
> 
> 
> 
> 
> 
> 
> From: Gavin Hu mailto:gavin...@arm.com>>
> Sent: Monday, August 27, 2018 9:10 PM
> To: Ferruh Yigit; Kokkilagadda, Kiran; Jacob, Jerin
> Cc: dev@dpdk.org; Honnappa Nagarahalli
> Subject: RE: [dpdk-dev] [PATCH v2] kni: fix kni Rx fifo producer 
> synchronization
> 
> 
> 
> External Email
> 
> This fix is not complete, kni_fifo_get requires a read fence also, otherwise 
> it probably gets stale data on a weak ordering platform.
> 
> > -Original Message-
> > From: dev mailto:dev-boun...@dpdk.org>> On Behalf Of 
> > Ferruh Yigit
> > Sent: Monday, August 27, 2018 10:08 PM
> > To: Kiran Kumar 
> > mailto:kkokkilaga...@caviumnetworks.com>>;
> > jerin.ja...@caviumnetworks.com
> > Cc: dev@dpdk.org
> > Subject:

Re: [dpdk-dev] [RFC] ethdev: add action to swap source and destination MAC to flow API

2018-08-29 Thread Rahul Lakkireddy
On Tuesday, August 08/28/18, 2018 at 16:27:43 +0530, Andrew Rybchenko wrote:
>On 08/27/2018 03:54 PM, Rahul Lakkireddy wrote:
> 
>  From: Shagun Agrawal
> 
>  This action is useful for offloading loopback mode, where the hardware
>  will swap source and destination MAC address before looping back the
>  packet. This action can be used in conjunction with other rewrite
>  actions to achieve MAC layer transparent NAT where the MAC addresses
>  are swapped before either the source or destination MAC address
>  is rewritten and NAT is performed.
> 
>  Signed-off-by: Shagun Agrawal
>  Signed-off-by: Rahul Lakkireddy
>  ---
>   app/test-pmd/cmdline_flow.c |  9 +
>   app/test-pmd/config.c   |  1 +
>   doc/guides/prog_guide/rte_flow.rst  | 15 +++
>   doc/guides/testpmd_app_ug/testpmd_funcs.rst |  2 ++
>   lib/librte_ethdev/rte_flow.c|  1 +
>   lib/librte_ethdev/rte_flow.h|  7 +++
>   6 files changed, 35 insertions(+)
> 
>  diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
>  index f9260600e..4b83b55c4 100644
>  --- a/app/test-pmd/cmdline_flow.c
>  +++ b/app/test-pmd/cmdline_flow.c
>  @@ -243,6 +243,7 @@ enum index {
>  ACTION_VXLAN_DECAP,
>  ACTION_NVGRE_ENCAP,
>  ACTION_NVGRE_DECAP,
>  +   ACTION_MAC_SWAP,
>   };
> 
>   /** Maximum size for pattern in struct rte_flow_item_raw. */
>  @@ -816,6 +817,7 @@ static const enum index next_action[] = {
>  ACTION_VXLAN_DECAP,
>  ACTION_NVGRE_ENCAP,
>  ACTION_NVGRE_DECAP,
>  +   ACTION_MAC_SWAP,
>  ZERO,
>   };
> 
>  @@ -2470,6 +2472,13 @@ static const struct token token_list[] = {
>  .next = NEXT(NEXT_ENTRY(ACTION_NEXT)),
>  .call = parse_vc,
>  },
>  +   [ACTION_MAC_SWAP] = {
>  +   .name = "mac_swap",
>  +   .help = "swap source and destination mac address",
>  +   .priv = PRIV_ACTION(MAC_SWAP, 0),
>  +   .next = NEXT(NEXT_ENTRY(ACTION_NEXT)),
>  +   .call = parse_vc,
>  +   },
>   };
> 
>   /** Remove and return last entry from argument stack. */
>  diff --git a/app/test-pmd/config.c b/app/test-pmd/config.c
>  index 14ccd6864..b7393967a 100644
>  --- a/app/test-pmd/config.c
>  +++ b/app/test-pmd/config.c
>  @@ -1153,6 +1153,7 @@ static const struct {
> sizeof(struct rte_flow_action_of_pop_mpls)),
>  MK_FLOW_ACTION(OF_PUSH_MPLS,
> sizeof(struct rte_flow_action_of_push_mpls)),
>  +   MK_FLOW_ACTION(MAC_SWAP, 0),
>   };
> 
>   /** Compute storage space needed by action configuration and copy it. */
>  diff --git a/doc/guides/prog_guide/rte_flow.rst 
> b/doc/guides/prog_guide/rte_flow.rst
>  index b305a72a5..530dbc504 100644
>  --- a/doc/guides/prog_guide/rte_flow.rst
>  +++ b/doc/guides/prog_guide/rte_flow.rst
>  @@ -2076,6 +2076,21 @@ RTE_FLOW_ERROR_TYPE_ACTION error should be returned.
> 
>   This action modifies the payload of matched flows.
> 
>  +Action: ``MAC_SWAP``
>  +^
>  +
>  +Swap source and destination mac address.
>  +
>  +.. _table_rte_flow_action_mac_swap:
>  +
>  +.. table:: MAC_SWAP
>  +
>  +   +---+
>  +   | Field |
>  +   +===+
>  +   | no properties |
>  +   +---+
>  +
>   Negative types
>   ~~
> 
>  diff --git a/doc/guides/testpmd_app_ug/testpmd_funcs.rst 
> b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
>  index dde205a2b..4f0da4fb6 100644
>  --- a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
>  +++ b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
>  @@ -3697,6 +3697,8 @@ This section lists supported actions and their 
> attributes, if any.
>   - ``nvgre_decap``: Performs a decapsulation action by stripping all headers 
> of
> the NVGRE tunnel network overlay from the matched flow.
> 
>  +- ``mac_swap``: Swap source and destination mac address.
>  +
>   Destroying flow rules
>   ~
> 
>  diff --git a/lib/librte_ethdev/rte_flow.c b/lib/librte_ethdev/rte_flow.c
>  index cff4b5209..04b0b40ea 100644
>  --- a/lib/librte_ethdev/rte_flow.c
>  +++ b/lib/librte_ethdev/rte_flow.c
>  @@ -109,6 +109,7 @@ static const struct rte_flow_desc_data 
> rte_flow_desc_action[] = {
> sizeof(struct rte_flow_action_of_pop_mpls)),
>  MK_FLOW_ACTION(OF_PUSH_MPLS,
> sizeof(struct rte_flow_action_of_push_mpls)),
>  +   MK_FLOW_ACTION(MAC_SWAP, 0),
>   };
> 
>   static int
>  diff --git a/lib/librte_ethdev/rte_flow.h b/lib/librte_ethdev/rte_flow.h
>  index f8ba71cdb..e1fa17b7e 100644
>  --- a/lib/librte_ethdev/rte_flow.h
>  +++ b/lib/librte_ethdev/rte_flow.h
>  @@ -1505,6 +1505,13 @@ enum rte_flow_action_type {
>   * error.
>   */
>  RTE_FLOW_ACTION_TYPE_NVGRE_DECAP,
>  +
>  +   /**
>  +* swap the source and destination mac address in ethernet header
> 
>Swap the source and 

Re: [dpdk-dev] [PATCH v2] kni: fix kni Rx fifo producer synchronization

2018-08-29 Thread Ola Liljedahl
There was a mention of rte_ring which is a different data structure. But 
perhaps I misunderstood why this was mentioned and the idea was only to use the 
C11 memory model as is also used in rte_ring nowadays.

But why would we have different code for x86 and for other architectures (ARM, 
Power)? If we use the C11 memory model (and e.g. GCC __atomic builtins), the 
code generated for x86 will be the same. __atomic_load(__ATOMIC_ACQUIRE) and 
__atomic_store(__ATOMIC_RELEASE) should translate to plain loads and stores on 
x86?
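
To make the acquire/release idea concrete, here is a minimal sketch of a
single-producer, single-consumer fifo using the GCC __atomic builtins
(__atomic_load_n and __atomic_store_n). It is not the KNI code: the struct,
the function names and the fixed FIFO_LEN are assumptions made for the
example.

```c
#include <assert.h>
#include <stddef.h>

#define FIFO_LEN 8 /* assumed; must be a power of two */

struct sketch_fifo {
	unsigned write;          /* next position to be written */
	unsigned read;           /* next position to be read */
	void *buffer[FIFO_LEN];
};

/* Producer side: returns 1 if the object was stored, 0 if the fifo is full. */
static int
sketch_fifo_put(struct sketch_fifo *f, void *obj)
{
	/* Only the producer modifies 'write', so a relaxed load is enough. */
	unsigned w = __atomic_load_n(&f->write, __ATOMIC_RELAXED);
	unsigned next = (w + 1) & (FIFO_LEN - 1);

	assert(f != NULL);
	/* Acquire pairs with the consumer's release store to 'read': the
	 * consumer has finished reading a slot before it is reused. */
	if (next == __atomic_load_n(&f->read, __ATOMIC_ACQUIRE))
		return 0;
	f->buffer[w] = obj;
	/* Release: the slot write above is visible before the new index. */
	__atomic_store_n(&f->write, next, __ATOMIC_RELEASE);
	return 1;
}

/* Consumer side: returns 1 and sets *obj if an object was available. */
static int
sketch_fifo_get(struct sketch_fifo *f, void **obj)
{
	/* Only the consumer modifies 'read'. */
	unsigned r = __atomic_load_n(&f->read, __ATOMIC_RELAXED);

	assert(f != NULL && obj != NULL);
	/* Acquire pairs with the producer's release store to 'write'. */
	if (r == __atomic_load_n(&f->write, __ATOMIC_ACQUIRE))
		return 0;
	*obj = f->buffer[r];
	/* Release: the slot read above completes before the slot is freed. */
	__atomic_store_n(&f->read, (r + 1) & (FIFO_LEN - 1), __ATOMIC_RELEASE);
	return 1;
}
```

The release store to the read index is what keeps the consumer's slot read
from being reordered after the slot is handed back to the producer, which is
the hazard discussed earlier in this thread.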

-- Ola

On 29/08/2018, 10:28, "Jerin Jacob"  wrote:

-Original Message-
> Date: Wed, 29 Aug 2018 07:34:34 +
> From: Ola Liljedahl 
> To: "Kokkilagadda, Kiran" , Honnappa
>  Nagarahalli , Gavin Hu ,
>  Ferruh Yigit , "Jacob,  Jerin"
>  
> CC: "dev@dpdk.org" , nd , Steve Capper
>  
> Subject: Re: [dpdk-dev] [PATCH v2] kni: fix kni Rx fifo producer
>  synchronization
> user-agent: Microsoft-MacOutlook/10.10.0.180812
> 
> Is the rte_kni kernel/user binary interface subject to backwards 
compatibility requirements? Or can we change it for a new DPDK release?

What would be the change in interface? Is it removing the volatile for
C11 case, Then you can use anonymous union OR #define to keep the size 
and offset of the element intact.

struct rte_kni_fifo { 
#ifndef RTE_C11...
volatile unsigned write; /**< Next position to be written*/
volatile unsigned read;  /**< Next position to be read */
#else
unsigned write; /**< Next position to be written*/
unsigned read;  /**< Next position to be read */
#endif
unsigned len;/**< Circular buffer length */
unsigned elem_size;  /**< Pointer size - for 32/64 bitOS */
void *volatile buffer[]; /**< The buffer contains mbuf
pointers */
};

Anonymous union example:
https://git.dpdk.org/dpdk/tree/lib/librte_mbuf/rte_mbuf.h#n461

You can check the ABI breakage by devtools/validate-abi.sh

> 
> -- Ola
> 
> From: "Kokkilagadda, Kiran" 
> Date: Wednesday, 29 August 2018 at 07:50
> To: Honnappa Nagarahalli , Gavin Hu 
, Ferruh Yigit , "Jacob, Jerin" 

> Cc: "dev@dpdk.org" , nd , Ola Liljedahl 
, Steve Capper 
> Subject: Re: [dpdk-dev] [PATCH v2] kni: fix kni Rx fifo producer 
synchronization
> 
> 
> Agreed. Please go a head and make the changes. You need to make same 
change in kernel side also. And please use c11 ring (see rte_ring) mechanism so 
that it won't impact other platforms like intel. We need this change just for 
arm and ppc.
> 
> 
> From: Honnappa Nagarahalli 
> Sent: Wednesday, August 29, 2018 10:29 AM
> To: Gavin Hu; Kokkilagadda, Kiran; Ferruh Yigit; Jacob, Jerin
> Cc: dev@dpdk.org; nd; Ola Liljedahl; Steve Capper
> Subject: RE: [dpdk-dev] [PATCH v2] kni: fix kni Rx fifo producer 
synchronization
> 
> 
> External Email
> 
> I agree with Gavin here. Store to fifo->write and fifo->read can get 
hoisted resulting in accessing invalid buffer array entries or over writing of 
the buffer array entries.
> 
> IMO, we should solve this using c11 atomics. This will also help remove 
the use of ‘volatile’ from ‘rte_kni_fifo’ structure.
> 
> 
> 
> If you want us to put together a patch with this idea, please let us know.
> 
> 
> 
> Thank you,
> 
> Honnappa
> 
> 
> 
> From: Gavin Hu
> Sent: Tuesday, August 28, 2018 2:31 PM
> To: Kokkilagadda, Kiran ; Ferruh Yigit 
; Jacob, Jerin 
> Cc: dev@dpdk.org; Honnappa Nagarahalli ; nd 
; Ola Liljedahl ; Steve Capper 

> Subject: RE: [dpdk-dev] [PATCH v2] kni: fix kni Rx fifo producer 
synchronization
> 
> 
> 
> Assuming reader and writer may execute on different CPU's, this become 
standard multithreaded programming.
> 
> We are concerned about that update the reader pointer too early(weak 
ordering may reorder it before reading from the slots), that means the slots 
are released and may immediately overwritten by the writer then you get “too 
new” data and get lost of the old data.
> 
> 
> 
> From: Kokkilagadda, Kiran 
mailto:kiran.kokkilaga...@cavium.com>>
> Sent: Tuesday, August 28, 2018 6:44 PM
> To: Gavin Hu mailto:gavin...@arm.com>>; Ferruh Yigit 
mailto:ferruh.yi...@intel.com>>; Jacob, Jerin 
mailto:jerin.jacobkollanukka...@cavium.com>>
> Cc: dev@dpdk.org; Honnappa Nagarahalli 
mailto:honnappa.nagaraha...@arm.com>>
> Subject: Re: [dpdk-dev] [PATCH v2] kni: fix kni Rx fifo producer 
synchronization
> 
> 
> 
> In this instance there won't be any problem, as until the value of 
fifo->write changes, this loop won't get executed. As of now we didn't see any 
issue w

Re: [dpdk-dev] [PATCH] checkpatches: don't assume bash syntax

2018-08-29 Thread Hunt, David

Hi Stephen,


On 13/8/2018 4:47 PM, Stephen Hemminger wrote:

The read -d option is a bash extension and not available in other
shells. On Debian, /bin/sh is dash and checkpatches would
fail with:
./devtools/checkpatches.sh: 52: read: Illegal option -d

Fix by using awk -e and adding necessary double backslash.

Fixes: 7413e7f2aeb3 ("devtools: alert on new calls to exit from libs")
Signed-off-by: Stephen Hemminger 


--snip--

The flavour of awk that's installed by default on Ubuntu is 'mawk', which
does not seem to have a '-e' option. However, for anyone trying to run
checkpatch on Ubuntu with this patch, the quickest workaround is to install
'gawk', which does have the -e option; then this patch works great.

Rgds,
Dave.




[dpdk-dev] [RFC v2] ethdev: add action to swap source and destination MAC to flow API

2018-08-29 Thread Rahul Lakkireddy
From: Shagun Agrawal 

This action is useful for offloading loopback mode, where the hardware
will swap source and destination MAC addresses in the outermost Ethernet
header before looping back the packet. This action can be used in
conjunction with other rewrite actions to achieve MAC layer transparent
NAT where the MAC addresses are swapped before either the source or
destination MAC address is rewritten and NAT is performed.

Must be used with a valid RTE_FLOW_ITEM_TYPE_ETH flow pattern item.
Otherwise, RTE_FLOW_ERROR_TYPE_ACTION error should be returned by the
PMDs.
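
For reference, the effect of the action on the outermost Ethernet header can
be sketched in software as below. This only illustrates the rewrite that the
hardware is expected to perform; struct eth_hdr_sketch and mac_swap_sketch()
are hypothetical names, not part of the proposed API.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAC_LEN 6

/* Simplified Ethernet header used only for this sketch. */
struct eth_hdr_sketch {
	uint8_t dst[MAC_LEN];
	uint8_t src[MAC_LEN];
	uint16_t ether_type;
};

/* Swap source and destination MAC addresses in place. */
static void
mac_swap_sketch(struct eth_hdr_sketch *hdr)
{
	uint8_t tmp[MAC_LEN];

	assert(hdr != NULL);
	memcpy(tmp, hdr->dst, MAC_LEN);
	memcpy(hdr->dst, hdr->src, MAC_LEN);
	memcpy(hdr->src, tmp, MAC_LEN);
}
```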

Signed-off-by: Shagun Agrawal 
Signed-off-by: Rahul Lakkireddy 
---
v2:
- Updated all comments and doc to indicate outermost Ethernet header's
  source and destination MAC addresses are swapped and that a valid
  RTE_FLOW_ITEM_TYPE_ETH must be specified. Otherwise,
  RTE_FLOW_ERROR_TYPE_ACTION error should be returned by the PMDs.

 app/test-pmd/cmdline_flow.c | 10 ++
 app/test-pmd/config.c   |  1 +
 doc/guides/prog_guide/rte_flow.rst  | 19 +++
 doc/guides/testpmd_app_ug/testpmd_funcs.rst |  3 +++
 lib/librte_ethdev/rte_flow.c|  1 +
 lib/librte_ethdev/rte_flow.h| 11 +++
 6 files changed, 45 insertions(+)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index f9260600e..196c76de1 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -243,6 +243,7 @@ enum index {
ACTION_VXLAN_DECAP,
ACTION_NVGRE_ENCAP,
ACTION_NVGRE_DECAP,
+   ACTION_MAC_SWAP,
 };
 
 /** Maximum size for pattern in struct rte_flow_item_raw. */
@@ -816,6 +817,7 @@ static const enum index next_action[] = {
ACTION_VXLAN_DECAP,
ACTION_NVGRE_ENCAP,
ACTION_NVGRE_DECAP,
+   ACTION_MAC_SWAP,
ZERO,
 };
 
@@ -2470,6 +2472,14 @@ static const struct token token_list[] = {
.next = NEXT(NEXT_ENTRY(ACTION_NEXT)),
.call = parse_vc,
},
+   [ACTION_MAC_SWAP] = {
+   .name = "mac_swap",
+   .help = "Swap the source and destination MAC addresses"
+   " in the outermost Ethernet header",
+   .priv = PRIV_ACTION(MAC_SWAP, 0),
+   .next = NEXT(NEXT_ENTRY(ACTION_NEXT)),
+   .call = parse_vc,
+   },
 };
 
 /** Remove and return last entry from argument stack. */
diff --git a/app/test-pmd/config.c b/app/test-pmd/config.c
index 14ccd6864..b7393967a 100644
--- a/app/test-pmd/config.c
+++ b/app/test-pmd/config.c
@@ -1153,6 +1153,7 @@ static const struct {
   sizeof(struct rte_flow_action_of_pop_mpls)),
MK_FLOW_ACTION(OF_PUSH_MPLS,
   sizeof(struct rte_flow_action_of_push_mpls)),
+   MK_FLOW_ACTION(MAC_SWAP, 0),
 };
 
 /** Compute storage space needed by action configuration and copy it. */
diff --git a/doc/guides/prog_guide/rte_flow.rst 
b/doc/guides/prog_guide/rte_flow.rst
index b305a72a5..d09806d38 100644
--- a/doc/guides/prog_guide/rte_flow.rst
+++ b/doc/guides/prog_guide/rte_flow.rst
@@ -2076,6 +2076,25 @@ RTE_FLOW_ERROR_TYPE_ACTION error should be returned.
 
 This action modifies the payload of matched flows.
 
+Action: ``MAC_SWAP``
+^
+
+Swap the source and destination MAC addresses in the outermost Ethernet
+header.
+
+It must be used with a valid RTE_FLOW_ITEM_TYPE_ETH flow pattern item.
+Otherwise, RTE_FLOW_ERROR_TYPE_ACTION error will be returned.
+
+.. _table_rte_flow_action_mac_swap:
+
+.. table:: MAC_SWAP
+
+   +---+
+   | Field |
+   +===+
+   | no properties |
+   +---+
+
 Negative types
 ~~
 
diff --git a/doc/guides/testpmd_app_ug/testpmd_funcs.rst 
b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
index dde205a2b..f32c6d11e 100644
--- a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
+++ b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
@@ -3697,6 +3697,9 @@ This section lists supported actions and their 
attributes, if any.
 - ``nvgre_decap``: Performs a decapsulation action by stripping all headers of
   the NVGRE tunnel network overlay from the matched flow.
 
+- ``mac_swap``: Swap the source and destination MAC addresses in the outermost
+  Ethernet header.
+
 Destroying flow rules
 ~
 
diff --git a/lib/librte_ethdev/rte_flow.c b/lib/librte_ethdev/rte_flow.c
index cff4b5209..04b0b40ea 100644
--- a/lib/librte_ethdev/rte_flow.c
+++ b/lib/librte_ethdev/rte_flow.c
@@ -109,6 +109,7 @@ static const struct rte_flow_desc_data 
rte_flow_desc_action[] = {
   sizeof(struct rte_flow_action_of_pop_mpls)),
MK_FLOW_ACTION(OF_PUSH_MPLS,
   sizeof(struct rte_flow_action_of_push_mpls)),
+   MK_FLOW_ACTION(MAC_SWAP, 0),
 };
 
 static int
diff --git a/lib/librte_ethdev/rte_flow.h b/lib/librte_ethdev/rte_flow.h
index f8ba71cdb..c743f818e 100644
--- a/lib/librte_ethdev/rte_fl

Re: [dpdk-dev] [PATCH v2] kni: fix kni Rx fifo producer synchronization

2018-08-29 Thread Jerin Jacob
-Original Message-
> Date: Wed, 29 Aug 2018 08:47:56 +
> From: Ola Liljedahl 
> To: Jerin Jacob 
> CC: "Kokkilagadda, Kiran" , Honnappa
>  Nagarahalli , Gavin Hu ,
>  Ferruh Yigit , "Jacob,  Jerin"
>  , "dev@dpdk.org" , nd
>  , Steve Capper 
> Subject: Re: [dpdk-dev] [PATCH v2] kni: fix kni Rx fifo producer
>  synchronization
> user-agent: Microsoft-MacOutlook/10.10.0.180812
> 
> 
> There was a mention of rte_ring which is a different data structure. But 
> perhaps I misunderstood why this was mentioned and the idea was only to use 
> the C11 memory model as is also used in rte_ring nowadays.
> 
> But why would we have different code for x86 and for other architectures 
> (ARM, Power)? If we use the C11 memory model (and e.g. GCC __atomic 
> builtins), the code generated for x86 will be the same. 
> __atomic_load(__ATOMIC_ACQUIRE) and __atomic_store(__ATOMIC_RELEASE) should 
> translate to plain loads and stores on x86?

# One reason was that the __atomic builtins were first implemented in gcc 4.7,
and x86 would like to keep supporting gcc versions older than 4.7 and the ICC
compiler.
# The theme was no change in the existing code for x86. I am not sure about the
code generation for x86 with __atomic builtins, so I will let the x86
maintainers comment on this.
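
For illustration, the acquire/release idea discussed in this thread can be
sketched on a single-producer/single-consumer ring using the GCC __atomic
builtins. This is only a sketch under stated assumptions: struct demo_fifo,
demo_fifo_init(), demo_fifo_put() and demo_fifo_get() are hypothetical names,
not the real rte_kni_fifo layout or API, and the fixed FIFO_LEN is chosen just
to keep the example self-contained.

```c
#include <assert.h>
#include <stddef.h>

#define FIFO_LEN 8 /* hypothetical size; must be a power of two */

/* Hypothetical stand-in for a KNI-style fifo, without 'volatile'. */
struct demo_fifo {
	unsigned write;         /* next position to be written */
	unsigned read;          /* next position to be read */
	unsigned len;           /* circular buffer length */
	void *buffer[FIFO_LEN]; /* slots holding object pointers */
};

static void
demo_fifo_init(struct demo_fifo *f)
{
	assert((FIFO_LEN & (FIFO_LEN - 1)) == 0);
	f->write = 0;
	f->read = 0;
	f->len = FIFO_LEN;
}

/* Producer side: returns 1 on success, 0 if the fifo is full. */
static int
demo_fifo_put(struct demo_fifo *f, void *obj)
{
	/* Only the producer modifies 'write', so a relaxed load is enough. */
	unsigned w = __atomic_load_n(&f->write, __ATOMIC_RELAXED);
	unsigned next = (w + 1) & (f->len - 1);
	/* Acquire pairs with the consumer's release store to 'read'. */
	unsigned r = __atomic_load_n(&f->read, __ATOMIC_ACQUIRE);

	if (next == r)
		return 0;
	f->buffer[w] = obj;
	/* Release: the slot write above cannot be reordered after this store. */
	__atomic_store_n(&f->write, next, __ATOMIC_RELEASE);
	return 1;
}

/* Consumer side: returns 1 on success, 0 if the fifo is empty. */
static int
demo_fifo_get(struct demo_fifo *f, void **obj)
{
	unsigned r = __atomic_load_n(&f->read, __ATOMIC_RELAXED);
	/* Acquire pairs with the producer's release store to 'write'. */
	unsigned w = __atomic_load_n(&f->write, __ATOMIC_ACQUIRE);

	if (r == w)
		return 0;
	*obj = f->buffer[r];
	/* Release: the slot read above completes before the slot is freed. */
	__atomic_store_n(&f->read, (r + 1) & (f->len - 1), __ATOMIC_RELEASE);
	return 1;
}
```

The release store to the index is what prevents the index update from being
hoisted above the buffer access. On x86 an acquire load and a release store are
expected to compile to plain loads and stores, which matches the question
raised in the quoted mail; that is worth confirming on the compilers in use.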


> 
> -- Ola
> 
> On 29/08/2018, 10:28, "Jerin Jacob"  wrote:
> 
> -Original Message-
> > Date: Wed, 29 Aug 2018 07:34:34 +
> > From: Ola Liljedahl 
> > To: "Kokkilagadda, Kiran" , Honnappa
> >  Nagarahalli , Gavin Hu 
> ,
> >  Ferruh Yigit , "Jacob,  Jerin"
> >  
> > CC: "dev@dpdk.org" , nd , Steve Capper
> >  
> > Subject: Re: [dpdk-dev] [PATCH v2] kni: fix kni Rx fifo producer
> >  synchronization
> > user-agent: Microsoft-MacOutlook/10.10.0.180812
> >
> > Is the rte_kni kernel/user binary interface subject to backwards 
> compatibility requirements? Or can we change it for a new DPDK release?
> 
> What would be the change in the interface? If it is removing the volatile for
> the C11 case, then you can use an anonymous union OR #define to keep the size
> and offset of the elements intact.
> 
> struct rte_kni_fifo {
> #ifndef RTE_C11...
> volatile unsigned write; /**< Next position to be written*/
> volatile unsigned read;  /**< Next position to be read */
> #else
> unsigned write; /**< Next position to be written*/
> unsigned read;  /**< Next position to be read */
> #endif
> unsigned len;/**< Circular buffer length */
> unsigned elem_size;  /**< Pointer size - for 32/64 bitOS 
> */
> void *volatile buffer[]; /**< The buffer contains mbuf
> pointers */
> };
> 
> Anonymous union example:
> https://git.dpdk.org/dpdk/tree/lib/librte_mbuf/rte_mbuf.h#n461
> 
> You can check the ABI breakage by devtools/validate-abi.sh
> 
> >
> > -- Ola
> >
> > From: "Kokkilagadda, Kiran" 
> > Date: Wednesday, 29 August 2018 at 07:50
> > To: Honnappa Nagarahalli , Gavin Hu 
> , Ferruh Yigit , "Jacob, Jerin" 
> 
> > Cc: "dev@dpdk.org" , nd , Ola Liljedahl 
> , Steve Capper 
> > Subject: Re: [dpdk-dev] [PATCH v2] kni: fix kni Rx fifo producer 
> synchronization
> >
> >
> > Agreed. Please go ahead and make the changes. You need to make the same 
> change on the kernel side also. And please use the c11 ring (see rte_ring) mechanism 
> so that it won't impact other platforms like intel. We need this change just 
> for arm and ppc.
> >
> > 
> > From: Honnappa Nagarahalli 
> > Sent: Wednesday, August 29, 2018 10:29 AM
> > To: Gavin Hu; Kokkilagadda, Kiran; Ferruh Yigit; Jacob, Jerin
> > Cc: dev@dpdk.org; nd; Ola Liljedahl; Steve Capper
> > Subject: RE: [dpdk-dev] [PATCH v2] kni: fix kni Rx fifo producer 
> synchronization
> >
> >
> > External Email
> >
> > I agree with Gavin here. Stores to fifo->write and fifo->read can get 
> hoisted, resulting in access to invalid buffer array entries or overwriting 
> of buffer array entries.
> >
> > IMO, we should solve this using c11 atomics. This will also help remove 
> the use of ‘volatile’ from ‘rte_kni_fifo’ structure.
> >
> >
> >
> > If you want us to put together a patch with this idea, please let us 
> know.
> >
> >
> >
> > Thank you,
> >
> > Honnappa
> >
> >
> >
> > From: Gavin Hu
> > Sent: Tuesday, August 28, 2018 2:31 PM
> > To: Kokkilagadda, Kiran ; Ferruh Yigit 
> ; Jacob, Jerin 
> > Cc: dev@dpdk.org; Honnappa Nagarahalli ; 
> nd ; Ola Liljedahl ; Steve Capper 
> 
> > Subject: RE: [dpdk-dev] [PATCH v2] kni: fix kni Rx fifo producer 
> synchronization
> >
> >
> >
> > Assuming reader and writer may execute on different CPUs, this becomes 
> standard multithreaded programming.
> >
> > We are concerned about that 

Re: [dpdk-dev] [PATCH 02/10] qat: update code to use __rte_weak macro

2018-08-29 Thread Jozwiak, TomaszX



> -Original Message-
> From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Keith Wiles
> Sent: Friday, August 3, 2018 4:06 PM
> To: dev@dpdk.org
> Subject: [dpdk-dev] [PATCH 02/10] qat: update code to use __rte_weak
> macro
> 
> Signed-off-by: Keith Wiles 
Acked-by: tomaszx.jozw...@intel.com


Re: [dpdk-dev] [PATCH v2] net/mlx: add meson build support

2018-08-29 Thread Nélio Laranjeiro
Hi Bruce,

Thanks for your comments. I have addressed almost all of them in v3 by
doing what you suggested. I still have some comments, please see below.

On Tue, Aug 28, 2018 at 04:45:00PM +0100, Bruce Richardson wrote:
> Thanks for this, comments inline below.
> 
> /Bruce
> 
> On Mon, Aug 27, 2018 at 02:42:25PM +0200, Nelio Laranjeiro wrote:
> > Mellanox drivers remains un-compiled by default due to third party
> > libraries dependencies.  They can be enabled through:
> > - enable_driver_mlx{4,5}=true or
> > - enable_driver_mlx{4,5}_glue=true
> > depending on the needs.
> 
> The big reason why we wanted a new build system was to move away from this
> sort of static configuration. Instead, detect if the requirements as
> present and build the driver if you can.

Ok, I am keeping only the glue option for both drivers, as suggested at
the end of your answer.

> > To avoid modifying the whole sources and keep the compatibility with
> > current build systems (e.g. make), the mlx{4,5}_autoconf.h is still
> > generated by invoking DPDK scripts though meson's run_command() instead
> > of using has_types, has_members, ... commands.
> > 
> > Meson will try to find the required external libraries.  When they are
> > not installed system wide, they can be provided though CFLAGS, LDFLAGS
> > and LD_LIBRARY_PATH environment variables, example (considering
> > RDMA-Core is installed in /tmp/rdma-core):
> > 
> >  # CFLAGS=-I/tmp/rdma-core/build/include \
> >LDFLAGS=-L/tmp/rdma-core/build/lib \
> >LD_LIBRARY_PATH=/tmp/rdma-core/build/lib \
> >meson -Denable_driver_mlx4=true output
> > 
> >  # CFLAGS=-I/tmp/rdma-core/build/include \
> >LDFLAGS=-L/tmp/rdma-core/build/lib \
> >LD_LIBRARY_PATH=/tmp/rdma-core/build/lib \
> >ninja -C output install
> 
> Once the CFLAGS/LDFLAGS are passed to meson, they should not be needed for
> ninja. The LD_LIBRARY_PATH might be - I'm not sure about that one! :-)

CFLAGS/LDFLAGS are correctly evaluated and inserted in the build.ninja
file. LD_LIBRARY_PATH is still necessary for the run_command calls that
generate mlx*_autoconf.h.

>[...] 
> Rather than having your own separate debug option flag, why not set these
> based on the "buildtype" option e.g. if buildtype is set to "debug".
> 
> > +# To maintain the compatibility with the make build system
> > +# mlx4_autoconf.h file is still generated.
> > +r = run_command('sh', '../../../buildtools/auto-config-h.sh',
> > +'mlx4_autoconf.h',
> > +'HAVE_IBV_MLX4_WQE_LSO_SEG',
> > +'infiniband/mlx4dv.h',
> > +'type', 'struct mlx4_wqe_lso_seg')
> > +if r.returncode() != 0
> > +error('autoconfiguration fail')
> > +endif
> 
> Just to check that you are ok with this only being run at configure time?
> If any changes are made to the inputs, ninja won't pick them up. To have it
> tracked for input changes, "custom_target" should be used instead of
> run_command.

It does not seem possible to have several custom_target calls for the same
output file, as the output file is used as the target identifier in ninja.

This limitation is acceptable for now. Once meson is the default build
system, this autoconf step can be removed in favor of meson built-in
functions.

> > +endif
> > +# Build Glue Library
> > +if pmd_dlopen
> > +dlopen_name = 'mlx4_glue'
> > +dlopen_lib_name = driver_name_fmt.format(dlopen_name)
> > +dlopen_so_version = LIB_GLUE_VERSION
> > +dlopen_sources = files('mlx4_glue.c')
> > +dlopen_install_dir = [ eal_pmd_path + '-glue' ]
> > +shared_lib = shared_library(
> > +   dlopen_lib_name,
> > +   dlopen_sources,
> > +   include_directories: global_inc,
> > +   c_args: cflags,
> > +   link_args: [
> > +   '-Wl,-export-dynamic',
> > +   '-Wl,-h,@0@'.format(LIB_GLUE),
> > +   '-lmlx4',
> > +   '-libverbs',
> 
> While this works, the recommended approach is to save the return value from
> cc.find_library() above, and pass that as a dependency directly, rather
> than as a linker flag.

I tried it, but:

 drivers/net/mlx5/meson.build:216:8: ERROR:  Link_args arguments must be
 strings.

find_library returns a compiler object, and I did not find any way to use
its output directly in place of the link arguments.

Thanks,

-- 
Nélio Laranjeiro
6WIND


Re: [dpdk-dev] [PATCH 0/4] net/failsafe: support deferred queue start

2018-08-29 Thread Gaëtan Rivet
Hello Ian, Andrew,

Clean patchset.
Only nitpick would be on eth_dev_ops ordering, but this is secondary.

For the series:
Acked-by: Gaetan Rivet 

On Wed, Aug 29, 2018 at 08:16:02AM +0100, Andrew Rybchenko wrote:
> Ian Dolzhansky (4):
>   app/testpmd: add queue deferred start switch
>   net/failsafe: add checks for deferred queue setup
>   net/failsafe: add Rx queue start and stop functions
>   net/failsafe: add Tx queue start and stop functions
> 
>  app/test-pmd/cmdline.c |  91 ++
>  doc/guides/nics/features/failsafe.ini  |   1 +
>  doc/guides/rel_notes/release_18_11.rst |  13 ++
>  drivers/net/failsafe/failsafe_ether.c  |  88 +
>  drivers/net/failsafe/failsafe_ops.c| 167 -
>  5 files changed, 359 insertions(+), 1 deletion(-)
> 
> -- 
> 2.17.1
> 

-- 
Gaëtan Rivet
6WIND


[dpdk-dev] [PATCH v2] app/testpmd: add new command for show port info

2018-08-29 Thread Emma Finn
The existing testpmd command "show port info" is too verbose.
Added a new summary command to print brief information on ports.

console output:
testpmd> show port summary all
Number of available ports: 2
Port MAC Address   Name  Driver   Status Link
011:22:33:44:55:66 :07:00.0, net_i40e, up, 4 Mbps
166:55:44:33:22:11 :07:00.1, net_i40e, up, 4 Mbps

Signed-off-by: Emma Finn 

---

v2: dropped redundant information and added
a single header line. (Stephen Hemminger)
---
 app/test-pmd/cmdline.c  | 19 +++
 app/test-pmd/config.c   | 38 +
 app/test-pmd/testpmd.h  |  2 ++
 doc/guides/testpmd_app_ug/testpmd_funcs.rst |  4 ++-
 4 files changed, 57 insertions(+), 6 deletions(-)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index 589121d..bb15338 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -167,7 +167,7 @@ static void cmd_help_long_parsed(void *parsed_result,
"Display:\n"
"\n\n"
 
-   "show port 
(info|stats|xstats|fdir|stat_qmap|dcb_tc|cap) (port_id|all)\n"
+   "show port 
(info|stats|summary|xstats|fdir|stat_qmap|dcb_tc|cap) (port_id|all)\n"
"Display information for port_id, or all.\n\n"
 
"show port X rss reta (size) (mask0,mask1,...)\n"
@@ -7073,6 +7073,11 @@ static void cmd_showportall_parsed(void *parsed_result,
} else if (!strcmp(res->what, "info"))
RTE_ETH_FOREACH_DEV(i)
port_infos_display(i);
+   else if (!strcmp(res->what, "summary")) {
+   port_number_display();
+   RTE_ETH_FOREACH_DEV(i)
+   port_summary_display(i);
+   }
else if (!strcmp(res->what, "stats"))
RTE_ETH_FOREACH_DEV(i)
nic_stats_display(i);
@@ -7100,14 +7105,14 @@ cmdline_parse_token_string_t cmd_showportall_port =
TOKEN_STRING_INITIALIZER(struct cmd_showportall_result, port, "port");
 cmdline_parse_token_string_t cmd_showportall_what =
TOKEN_STRING_INITIALIZER(struct cmd_showportall_result, what,
-"info#stats#xstats#fdir#stat_qmap#dcb_tc#cap");
+
"info#summary#stats#xstats#fdir#stat_qmap#dcb_tc#cap");
 cmdline_parse_token_string_t cmd_showportall_all =
TOKEN_STRING_INITIALIZER(struct cmd_showportall_result, all, "all");
 cmdline_parse_inst_t cmd_showportall = {
.f = cmd_showportall_parsed,
.data = NULL,
.help_str = "show|clear port "
-   "info|stats|xstats|fdir|stat_qmap|dcb_tc|cap all",
+   "info|summary|stats|xstats|fdir|stat_qmap|dcb_tc|cap all",
.tokens = {
(void *)&cmd_showportall_show,
(void *)&cmd_showportall_port,
@@ -7137,6 +7142,10 @@ static void cmd_showport_parsed(void *parsed_result,
nic_xstats_clear(res->portnum);
} else if (!strcmp(res->what, "info"))
port_infos_display(res->portnum);
+   else if (!strcmp(res->what, "summary")) {
+   port_number_display();
+   port_summary_display(res->portnum);
+   }
else if (!strcmp(res->what, "stats"))
nic_stats_display(res->portnum);
else if (!strcmp(res->what, "xstats"))
@@ -7158,7 +7167,7 @@ cmdline_parse_token_string_t cmd_showport_port =
TOKEN_STRING_INITIALIZER(struct cmd_showport_result, port, "port");
 cmdline_parse_token_string_t cmd_showport_what =
TOKEN_STRING_INITIALIZER(struct cmd_showport_result, what,
-"info#stats#xstats#fdir#stat_qmap#dcb_tc#cap");
+
"info#summary#stats#xstats#fdir#stat_qmap#dcb_tc#cap");
 cmdline_parse_token_num_t cmd_showport_portnum =
TOKEN_NUM_INITIALIZER(struct cmd_showport_result, portnum, UINT16);
 
@@ -7166,7 +7175,7 @@ cmdline_parse_inst_t cmd_showport = {
.f = cmd_showport_parsed,
.data = NULL,
.help_str = "show|clear port "
-   "info|stats|xstats|fdir|stat_qmap|dcb_tc|cap "
+   "info|summary|stats|xstats|fdir|stat_qmap|dcb_tc|cap "
"",
.tokens = {
(void *)&cmd_showport_show,
diff --git a/app/test-pmd/config.c b/app/test-pmd/config.c
index 14ccd68..cf436ba 100644
--- a/app/test-pmd/config.c
+++ b/app/test-pmd/config.c
@@ -518,6 +518,44 @@ port_infos_display(portid_t port_id)
 }
 
 void
+port_number_display(void)
+{
+   uint16_t port_number;
+   port_number = rte_eth_dev_count();
+   printf("Number of available ports: %i\n", port_number);
+   printf("%s %s %10s %15s %8s %s\n", "Port", "MAC Address", "Name",
+   "Driver", "Status", "Link");
+}
+
+void
+port_summary_display(p

Re: [dpdk-dev] [PATCH] kni: dynamically allocate memory for each KNI

2018-08-29 Thread Igor Ryzhov
Hello Ferruh,

Thanks for the review, comments inline.

On Mon, Aug 27, 2018 at 8:06 PM, Ferruh Yigit 
wrote:

> On 8/2/2018 3:25 PM, Igor Ryzhov wrote:
> > Long time ago preallocation of memory for KNI was introduced in commit
> > 0c6bc8e. It was done because of lack of ability to free previously
> > allocated memzones, which led to memzone exhaustion. Currently memzones
> > can be freed and this patch uses this ability for dynamic KNI memory
> > allocation.
>
> Hi Igor,
>
> It is good to be able to allocate memory dynamically and get rid of the
> "max_kni_ifaces" and "kni_memzone_pool", thanks for the patch.
>
> Overall looks good, a few comments below.
>
> >
> > Signed-off-by: Igor Ryzhov 
> > ---
> >  lib/librte_kni/rte_kni.c | 392 ---
> >  lib/librte_kni/rte_kni.h |   6 +-
> >  test/test/test_kni.c |   6 -
> >  3 files changed, 128 insertions(+), 276 deletions(-)
> >
> > diff --git a/lib/librte_kni/rte_kni.c b/lib/librte_kni/rte_kni.c
> > index 8a8f6c1cc..028b44bfd 100644
> > --- a/lib/librte_kni/rte_kni.c
> > +++ b/lib/librte_kni/rte_kni.c
> > @@ -36,24 +36,33 @@
> >   * KNI context
> >   */
> >  struct rte_kni {
> > + const struct rte_memzone *mz;   /**< KNI context memzone */
>
> I was thinking remove the context memzone and use rte_zmalloc() to create
> kni
> objects but updated rte_kni_get() API seems relaying this.
> If you see any other way to get kni object from name in rte_kni_get(), I
> am for
> removing above *mz variable from rte_kni struct.
>

I had exactly the same thought but didn't find a way to keep the
rte_kni_get() API.
Maybe someone has any ideas?
Or maybe this API can be marked deprecated and deleted in the future?


>
> <...>
>
> > +static void
> > +kni_ctx_release_mz(struct rte_kni *ctx)
> > +{
> > + rte_memzone_free(ctx->m_tx_q);
> > + rte_memzone_free(ctx->m_rx_q);
> > + rte_memzone_free(ctx->m_alloc_q);
> > + rte_memzone_free(ctx->m_free_q);
> > + rte_memzone_free(ctx->m_req_q);
> > + rte_memzone_free(ctx->m_resp_q);
> > + rte_memzone_free(ctx->m_sync_addr);
>
>
> "ctx" sounds confusing to me, isn't this "rte_kni" object instance, why
> not just
> call it "kni" or if it is too generic "kni_obj" or similar? For other APIs
> as well.
>

"ctx" was already used in the code, so I didn't change it.
I also think that it's better to use "kni" – will change it in v2.


>
> And this is just a detail but about order of APIs would you mind having
> first
> reserve() one, later release() one?
>

Ok.


>
> <...>
>
> > -/* Shall be called before any allocation happens */
> > -void
> > -rte_kni_init(unsigned int max_kni_ifaces)
> > +static struct rte_kni *
> > +kni_ctx_reserve(const char *name)
> >  {
> > - uint32_t i;
> > - struct rte_kni_memzone_slot *it;
> > + struct rte_kni *ctx;
> >   const struct rte_memzone *mz;
> > -#define OBJNAMSIZ 32
> > - char obj_name[OBJNAMSIZ];
> >   char mz_name[RTE_MEMZONE_NAMESIZE];
> >
> > - /* Immediately return if KNI is already initialized */
> > - if (kni_memzone_pool.initialized) {
> > - RTE_LOG(WARNING, KNI, "Double call to rte_kni_init()");
> > - return;
> > - }
> > + snprintf(mz_name, RTE_MEMZONE_NAMESIZE, "kni_info_%s", name);
>
> Can you please convert memzone names, like "kni_info" to defines, for all
> of them?
>

Ok.


>
> <...>
>
> > @@ -81,8 +81,12 @@ struct rte_kni_conf {
> >   *
> >   * @param max_kni_ifaces
> >   *  The maximum number of KNI interfaces that can coexist concurrently
> > + *
> > + * @return
> > + *  - 0 indicates success.
> > + *  - negative value indicates failure.
> >   */
> > -void rte_kni_init(unsigned int max_kni_ifaces);
> > +int rte_kni_init(unsigned int max_kni_ifaces);
>
> This changes the API. Return type changes from "void" to "int". I agree
> "int"
> makes more sense since API can fail, but this changes the ABI/API.
>
> Since existing binaries doesn't check the return type at all there may be
> no
> issue from ABI point of view but from API point of view some apps may get
> return
> value not checked warnings, not sure though.
>
> And the need of the API is questionable at this stage, it may be possible
> to
> move rte_kni_alloc() where it already has "kni_fd" check.
>
> What do you think keep API signature same for now, but add a deprecation
> notice
> to remove the API. Next release (v19.02) remove rte_kni_init() completely?
>

As far as I know, warnings can only be emitted if the warn_unused_result
attribute is used, which is not the case here.
So I think that changing from void to int should not break anything. I can
change it back in v2 if I'm wrong.
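
A minimal sketch of the point about warn_unused_result, assuming GCC or Clang
with default warning settings. The names demo_kni_init(),
demo_kni_init_checked() and demo_legacy_caller() are hypothetical and only
stand in for a function whose return type changes from void to int.

```c
#include <assert.h>

/* Hypothetical names for illustration; this is not the real rte_kni API. */
static int demo_init_calls;

/* Returns 0 on success, negative on failure; no attribute attached. */
int
demo_kni_init(unsigned int max_ifaces)
{
	demo_init_calls++;
	return max_ifaces == 0 ? -1 : 0;
}

/* Same behaviour, but callers that drop the result get -Wunused-result. */
__attribute__((warn_unused_result)) int
demo_kni_init_checked(unsigned int max_ifaces)
{
	return demo_kni_init(max_ifaces);
}

/* Caller written against the old void signature: the result is ignored. */
void
demo_legacy_caller(void)
{
	int before = demo_init_calls;

	demo_kni_init(4); /* no diagnostic: the attribute is not used */
	/* demo_kni_init_checked(4);  <- this form would trigger the warning */
	assert(demo_init_calls == before + 1);
}
```

This only covers source-level warnings in applications that ignore the result.
Whether the signature change matters for ABI checks is a separate question
that the sketch does not answer.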

Regarding the API removal – I think it's better to keep that function, to
have a clearer API.
As we have rte_kni_close to close the KNI device, we should have a function to
open it.
Maybe it should be renamed to rte_kni_open :)


>
> <...>
>
> >  /**
> > diff --git a/test/test/test_kni.c b/test/test/test_kni.c
> > index 1b876719a..56c98

Re: [dpdk-dev] 16.11.8 (LTS) patches review and test

2018-08-29 Thread Luca Boccassi
On Mon, 2018-08-27 at 17:17 +0100, Luca Boccassi wrote:
> On Thu, 2018-08-23 at 09:55 +0100, Luca Boccassi wrote:
> > On Mon, 2018-08-13 at 19:21 +0100, luca.bocca...@gmail.com wrote:
> > > Hi all,
> > > 
> > > Here is a list of patches targeted for LTS release 16.11.8.
> > > Please
> > > help review and test. The planned date for the final release is
> > > August
> > > the 23rd.
> > > Before that, please shout if anyone has objections with these
> > > patches being applied.
> > > 
> > > Also for the companies committed to running regression tests,
> > > please run the tests and report any issue before the release
> > > date.
> > > 
> > > A release candidate tarball can be found at:
> > > 
> > > https://dpdk.org/browse/dpdk-stable/tag/?id=v16.11.8-rc1
> > > 
> > > These patches are located at branch 16.11 of dpdk-stable repo:
> > > https://dpdk.org/browse/dpdk-stable/
> > > 
> > > Thanks.
> > > 
> > > Luca Boccassi
> > 
> > Hi,
> > 
> > Regression tests from Intel have highlighted a possible issue with
> > the
> > changes (unidentified as of now), so while investigation is in
> > progress
> > we decided to postpone the release to Monday the 27th to be on the
> > safe
> > side.
> > Apologies for any issues this might cause.
> 
> Hi,
> 
> Unfortunately triaging is still in progress, so it's better to
> postpone
> again, to Wednesday the 29th of August.
> Apologies again for any issues due to this delay.

Hello all,

I've pushed an -rc2 with the following additional changes:

Luca Boccassi (1):
  Revert "net/i40e: fix packet count for PF"

Radu Nicolau (3):
  net/null: add MAC address setting fake operation
  test/virtual_pmd: add MAC address setting fake op
  test/bonding: assign non-zero MAC to null devices

Radu, I cherry-picked the following 3 patches that you got merged in
18.02 as they are necessary to fix bonding regression tests from Intel:

c5ac7748fd6bfd86b6fb4432b6792733cf32c94c
c23fc36284e26fca9b52641118ad76a4da99d7af
e8df563bac263e55b7dd9d45a00417aa92ef66cb

Qi, I have reverted the following patch that was backported to 16.11.4
as it breaks a Fortville regression test from Intel:

4bf705a7d74b0b4c1d82ad0821c43e32be15a5e5.

Marco, is there any chance you've got time today to re-run your tests?
These changes in rc2 have been blessed by Intel and AT&T, so if it
works for you as well I can then release later tonight.

A release candidate tarball can be found at:

https://dpdk.org/browse/dpdk-stable/tag/?id=v16.11.8-rc2

These patches are located at branch 16.11 of dpdk-stable repo:
https://dpdk.org/browse/dpdk-stable/

Thanks!

-- 
Kind regards,
Luca Boccassi


Re: [dpdk-dev] [PATCH v2] net/mlx: add meson build support

2018-08-29 Thread Luca Boccassi
On Tue, 2018-08-28 at 16:45 +0100, Bruce Richardson wrote:
> Thanks for this, comments inline below.
> 
> /Bruce
> 
> On Mon, Aug 27, 2018 at 02:42:25PM +0200, Nelio Laranjeiro wrote:
> > Mellanox drivers remains un-compiled by default due to third party
> > libraries dependencies.  They can be enabled through:
> > - enable_driver_mlx{4,5}=true or
> > - enable_driver_mlx{4,5}_glue=true
> > depending on the needs.
> 
> The big reason why we wanted a new build system was to move away from
> this
> sort of static configuration. Instead, detect if the requirements as
> present and build the driver if you can.
> 
> > 
> > To avoid modifying the whole sources and keep the compatibility
> > with
> > current build systems (e.g. make), the mlx{4,5}_autoconf.h is still
> > generated by invoking DPDK scripts though meson's run_command()
> > instead
> > of using has_types, has_members, ... commands.
> > 
> > Meson will try to find the required external libraries.  When they
> > are
> > not installed system wide, they can be provided though CFLAGS,
> > LDFLAGS
> > and LD_LIBRARY_PATH environment variables, example (considering
> > RDMA-Core is installed in /tmp/rdma-core):
> > 
> >  # CFLAGS=-I/tmp/rdma-core/build/include \
> >    LDFLAGS=-L/tmp/rdma-core/build/lib \
> >    LD_LIBRARY_PATH=/tmp/rdma-core/build/lib \
> >    meson -Denable_driver_mlx4=true output
> > 
> >  # CFLAGS=-I/tmp/rdma-core/build/include \
> >    LDFLAGS=-L/tmp/rdma-core/build/lib \
> >    LD_LIBRARY_PATH=/tmp/rdma-core/build/lib \
> >    ninja -C output install
> 
> Once the CFLAGS/LDFLAGS are passed to meson, they should not be
> needed for
> ninja. The LD_LIBRARY_PATH might be - I'm not sure about that one! :-
> )
> 
> > 
> > Signed-off-by: Nelio Laranjeiro 
> > 
> > ---
> > 
> > Changes in v2:
> > 
> > - dropped patch https://patches.dpdk.org/patch/43897/
> > - remove extra_{cflags,ldflags} as already honored by meson through
> > environment variables.
> > ---
> >  drivers/net/meson.build  |   2 +
> >  drivers/net/mlx4/meson.build |  94 ++
> >  drivers/net/mlx5/meson.build | 545
> > +++
> >  meson_options.txt|   8 +
> >  4 files changed, 649 insertions(+)
> >  create mode 100644 drivers/net/mlx4/meson.build
> >  create mode 100644 drivers/net/mlx5/meson.build
> > 
> > diff --git a/drivers/net/meson.build b/drivers/net/meson.build
> > index 9c28ed4da..c7a2d0e7d 100644
> > --- a/drivers/net/meson.build
> > +++ b/drivers/net/meson.build
> > @@ -18,6 +18,8 @@ drivers = ['af_packet',
> >     'ixgbe',
> >     'kni',
> >     'liquidio',
> > +   'mlx4',
> > +   'mlx5',
> >     'mvpp2',
> >     'netvsc',
> >     'nfp',
> > diff --git a/drivers/net/mlx4/meson.build
> > b/drivers/net/mlx4/meson.build
> > new file mode 100644
> > index 0..debaca5b6
> > --- /dev/null
> > +++ b/drivers/net/mlx4/meson.build
> > @@ -0,0 +1,94 @@
> > +# SPDX-License-Identifier: BSD-3-Clause
> > +# Copyright 2018 6WIND S.A.
> > +# Copyright 2018 Mellanox Technologies, Ltd
> > +
> > +# As there is no more configuration file to activate/configure the
> > PMD it will
> > +# use some variables here to configure it.
> > +pmd_dlopen = get_option('enable_driver_mlx4_glue')
> > +build = get_option('enable_driver_mlx4') or pmd_dlopen
> 
> As stated above, I believe this should be based upon whether you find
> the
> "mnl", "mlx4" and "ibverbs" libraries. If we start adding back in
> static
> options for every driver, then we'll be back to having a mass of
> config
> options like we had before.

BTW, slightly related to that: ibverbs doesn't ship pkg-config files at
the moment, which makes the detection slightly more awkward than it
could be, so I've sent a PR upstream to add that:

https://github.com/linux-rdma/rdma-core/pull/373

Hope this can be useful!

-- 
Kind regards,
Luca Boccassi


Re: [dpdk-dev] [PATCH v2] net/mlx: add meson build support

2018-08-29 Thread Bruce Richardson
On Wed, Aug 29, 2018 at 11:34:10AM +0200, Nélio Laranjeiro wrote:
> Hi Bruce,
> 
> Thanks for your comments I have address almost all of them in the v3 by
> doing what you suggest, I still have some comments, please see below,
> 

Thanks.

> On Tue, Aug 28, 2018 at 04:45:00PM +0100, Bruce Richardson wrote:
> > Thanks for this, comments inline below.
> > 
> > /Bruce
> > 
> > On Mon, Aug 27, 2018 at 02:42:25PM +0200, Nelio Laranjeiro wrote:
> > > Mellanox drivers remains un-compiled by default due to third party
> > > libraries dependencies.  They can be enabled through:
> > > - enable_driver_mlx{4,5}=true or
> > > - enable_driver_mlx{4,5}_glue=true
> > > depending on the needs.
> > 
> > The big reason why we wanted a new build system was to move away from this
> > sort of static configuration. Instead, detect if the requirements as
> > present and build the driver if you can.
> 
> Ok, I am letting only the glue option for both drivers as suggested at
> the end of your answer.
> 
> > > To avoid modifying the whole sources and keep the compatibility with
> > > current build systems (e.g. make), the mlx{4,5}_autoconf.h is still
> > > generated by invoking DPDK scripts though meson's run_command() instead
> > > of using has_types, has_members, ... commands.
> > > 
> > > Meson will try to find the required external libraries.  When they are
> > > not installed system wide, they can be provided though CFLAGS, LDFLAGS
> > > and LD_LIBRARY_PATH environment variables, example (considering
> > > RDMA-Core is installed in /tmp/rdma-core):
> > > 
> > >  # CFLAGS=-I/tmp/rdma-core/build/include \
> > >LDFLAGS=-L/tmp/rdma-core/build/lib \
> > >LD_LIBRARY_PATH=/tmp/rdma-core/build/lib \
> > >meson -Denable_driver_mlx4=true output
> > > 
> > >  # CFLAGS=-I/tmp/rdma-core/build/include \
> > >LDFLAGS=-L/tmp/rdma-core/build/lib \
> > >LD_LIBRARY_PATH=/tmp/rdma-core/build/lib \
> > >ninja -C output install
> > 
> > Once the CFLAGS/LDFLAGS are passed to meson, they should not be needed for
> > ninja. The LD_LIBRARY_PATH might be - I'm not sure about that one! :-)
> 
> CFLAGS/LDFLAGS are correctly evaluated and inserted in the build.ninja
> file, for the LD_LIBRARY_PATH, it is necessary for the run_command stuff
> generating the mlx*_autoconf.h
> 

Just realised there is another issue which you should address. The
mlx*_autoconf.h files are being written into a source folder rather than
into the destination folder. This could cause problems if we are enabling
mlx for cross-compile builds. Perhaps inside the auto-config-h.sh script
you can check for $MESON_BUILD_ROOT value, and use that (and possibly
$MESON_SUBDIR) to put the header file in the build directory.

> >[...] 
> > Rather than having your own separate debug option flag, why not set these
> > based on the "buildtype" option e.g. if buildtype is set to "debug".
> > 
> > > +# To maintain the compatibility with the make build system
> > > +# mlx4_autoconf.h file is still generated.
> > > +r = run_command('sh', '../../../buildtools/auto-config-h.sh',
> > > +'mlx4_autoconf.h',
> > > +'HAVE_IBV_MLX4_WQE_LSO_SEG',
> > > +'infiniband/mlx4dv.h',
> > > +'type', 'struct mlx4_wqe_lso_seg')
> > > +if r.returncode() != 0
> > > +error('autoconfiguration fail')
> > > +endif
> > 
> > Just to check that you are ok with this only being run at configure time?
> > If any changes are made to the inputs, ninja won't pick them up. To have it
> > tracked for input changes, "custom_target" should be used instead of
> > run_command.
> 
> It seems to not be possible to have several custom_target on the same
> output file has this last is used as the target identifier in ninja.
> 
> This limitation is acceptable for now, when meson will be the default
> build system, then such autoconf can be removed to use meson built-in
> functions.
> 
> > > +endif
> > > +# Build Glue Library
> > > +if pmd_dlopen
> > > +dlopen_name = 'mlx4_glue'
> > > +dlopen_lib_name = driver_name_fmt.format(dlopen_name)
> > > +dlopen_so_version = LIB_GLUE_VERSION
> > > +dlopen_sources = files('mlx4_glue.c')
> > > +dlopen_install_dir = [ eal_pmd_path + '-glue' ]
> > > +shared_lib = shared_library(
> > > +   dlopen_lib_name,
> > > +   dlopen_sources,
> > > +   include_directories: global_inc,
> > > +   c_args: cflags,
> > > +   link_args: [
> > > +   '-Wl,-export-dynamic',
> > > +   '-Wl,-h,@0@'.format(LIB_GLUE),
> > > +   '-lmlx4',
> > > +   '-libverbs',
> > 
> > While this works, the recommended approach is to save the return value from
> > cc.find_library() above, and pass that as a dependency directly, rather
> > than as a linker flag.
> 
> I tried it, but:
> 
>  drivers

Re: [dpdk-dev] [PATCH v2 01/10] kni: remove unused variables from struct kni_dev

2018-08-29 Thread Ferruh Yigit
On 6/29/2018 2:54 AM, Dan Gora wrote:
> Remove the unused fields 'status' and 'synchro' from the struct
> kni_dev.
> 
> Signed-off-by: Dan Gora 

Acked-by: Ferruh Yigit 


[dpdk-dev] [PATCH v3] app/testpmd: add new command for show port info

2018-08-29 Thread Emma Finn
The existing testpmd command "show port info" is too verbose.
Added a new summary command to print brief information on ports.

console output:
testpmd> show port summary all
Number of available ports: 2
Port MAC Address   Name  Driver   Status Link
011:22:33:44:55:66 :07:00.0, net_i40e, up, 4 Mbps
166:55:44:33:22:11 :07:00.1, net_i40e, up, 4 Mbps

Signed-off-by: Emma Finn 

---

v2: dropped redundant information and added
a single header line. (Stephen Hemminger)

v3: removed deprecated function and refactored code.
---
 app/test-pmd/cmdline.c  | 19 +++
 app/test-pmd/config.c   | 37 +
 app/test-pmd/testpmd.h  |  2 ++
 doc/guides/testpmd_app_ug/testpmd_funcs.rst |  4 +++-
 4 files changed, 56 insertions(+), 6 deletions(-)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index 589121d..e54e2e5 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -167,7 +167,7 @@ static void cmd_help_long_parsed(void *parsed_result,
"Display:\n"
"\n\n"
 
-   "show port 
(info|stats|xstats|fdir|stat_qmap|dcb_tc|cap) (port_id|all)\n"
+   "show port 
(info|stats|summary|xstats|fdir|stat_qmap|dcb_tc|cap) (port_id|all)\n"
"Display information for port_id, or all.\n\n"
 
"show port X rss reta (size) (mask0,mask1,...)\n"
@@ -7073,6 +7073,11 @@ static void cmd_showportall_parsed(void *parsed_result,
} else if (!strcmp(res->what, "info"))
RTE_ETH_FOREACH_DEV(i)
port_infos_display(i);
+   else if (!strcmp(res->what, "summary")) {
+   port_summary_header_display();
+   RTE_ETH_FOREACH_DEV(i)
+   port_summary_display(i);
+   }
else if (!strcmp(res->what, "stats"))
RTE_ETH_FOREACH_DEV(i)
nic_stats_display(i);
@@ -7100,14 +7105,14 @@ cmdline_parse_token_string_t cmd_showportall_port =
TOKEN_STRING_INITIALIZER(struct cmd_showportall_result, port, "port");
 cmdline_parse_token_string_t cmd_showportall_what =
TOKEN_STRING_INITIALIZER(struct cmd_showportall_result, what,
-"info#stats#xstats#fdir#stat_qmap#dcb_tc#cap");
+"info#summary#stats#xstats#fdir#stat_qmap#dcb_tc#cap");
 cmdline_parse_token_string_t cmd_showportall_all =
TOKEN_STRING_INITIALIZER(struct cmd_showportall_result, all, "all");
 cmdline_parse_inst_t cmd_showportall = {
.f = cmd_showportall_parsed,
.data = NULL,
.help_str = "show|clear port "
-   "info|stats|xstats|fdir|stat_qmap|dcb_tc|cap all",
+   "info|summary|stats|xstats|fdir|stat_qmap|dcb_tc|cap all",
.tokens = {
(void *)&cmd_showportall_show,
(void *)&cmd_showportall_port,
@@ -7137,6 +7142,10 @@ static void cmd_showport_parsed(void *parsed_result,
nic_xstats_clear(res->portnum);
} else if (!strcmp(res->what, "info"))
port_infos_display(res->portnum);
+   else if (!strcmp(res->what, "summary")) {
+   port_summary_header_display();
+   port_summary_display(res->portnum);
+   }
else if (!strcmp(res->what, "stats"))
nic_stats_display(res->portnum);
else if (!strcmp(res->what, "xstats"))
@@ -7158,7 +7167,7 @@ cmdline_parse_token_string_t cmd_showport_port =
TOKEN_STRING_INITIALIZER(struct cmd_showport_result, port, "port");
 cmdline_parse_token_string_t cmd_showport_what =
TOKEN_STRING_INITIALIZER(struct cmd_showport_result, what,
-"info#stats#xstats#fdir#stat_qmap#dcb_tc#cap");
+"info#summary#stats#xstats#fdir#stat_qmap#dcb_tc#cap");
 cmdline_parse_token_num_t cmd_showport_portnum =
TOKEN_NUM_INITIALIZER(struct cmd_showport_result, portnum, UINT16);
 
@@ -7166,7 +7175,7 @@ cmdline_parse_inst_t cmd_showport = {
.f = cmd_showport_parsed,
.data = NULL,
.help_str = "show|clear port "
-   "info|stats|xstats|fdir|stat_qmap|dcb_tc|cap "
+   "info|summary|stats|xstats|fdir|stat_qmap|dcb_tc|cap "
"",
.tokens = {
(void *)&cmd_showport_show,
diff --git a/app/test-pmd/config.c b/app/test-pmd/config.c
index 14ccd68..237efc3 100644
--- a/app/test-pmd/config.c
+++ b/app/test-pmd/config.c
@@ -518,6 +518,43 @@ port_infos_display(portid_t port_id)
 }
 
 void
+port_summary_header_display(void)
+{
+   uint16_t port_number;
+
+   port_number = rte_eth_dev_count_avail();
+   printf("Number of available ports: %i\n", port_number);
+   printf("%s %s %10s %15s %8s %s\n", "Port", "MAC Ad

[dpdk-dev] [PATCH 02/13] net/dpaa: fix jumbo buffer config

2018-08-29 Thread Hemant Agrawal
Avoid returning right after the jumbo buffer configuration in the dev config API.

Fixes: 9658ac3a4ef6 ("net/dpaa: set the correct frame size in device MTU")
Cc: sta...@dpdk.org

Signed-off-by: Hemant Agrawal 
---
 drivers/net/dpaa/dpaa_ethdev.c | 10 --
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/drivers/net/dpaa/dpaa_ethdev.c b/drivers/net/dpaa/dpaa_ethdev.c
index 009ef84..dd1bc90 100644
--- a/drivers/net/dpaa/dpaa_ethdev.c
+++ b/drivers/net/dpaa/dpaa_ethdev.c
@@ -196,11 +196,17 @@ dpaa_eth_dev_configure(struct rte_eth_dev *dev)
if (rx_offloads & DEV_RX_OFFLOAD_JUMBO_FRAME) {
if (dev->data->dev_conf.rxmode.max_rx_pkt_len <=
DPAA_MAX_RX_PKT_LEN) {
+   DPAA_PMD_DEBUG("enabling jumbo");
fman_if_set_maxfrm(dpaa_intf->fif,
dev->data->dev_conf.rxmode.max_rx_pkt_len);
-   return 0;
+   dev->data->mtu =
+   dev->data->dev_conf.rxmode.max_rx_pkt_len -
+   ETHER_HDR_LEN - ETHER_CRC_LEN - VLAN_TAG_SIZE;
} else {
-   return -1;
+   DPAA_PMD_ERR("enabling jumbo err conf max len=%d "
+   "supported is %d",
+   dev->data->dev_conf.rxmode.max_rx_pkt_len,
+   DPAA_MAX_RX_PKT_LEN);
}
}
return 0;
-- 
2.7.4
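
The MTU arithmetic in the hunk above can be checked in isolation. The following is only a sketch: the three length constants are assumed to be the usual Ethernet values (14-byte header, 4-byte CRC, one 4-byte VLAN tag), and the sketch_ names are hypothetical, not part of the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values: standard Ethernet header, CRC and a single VLAN tag. */
#define SKETCH_ETHER_HDR_LEN 14u
#define SKETCH_ETHER_CRC_LEN 4u
#define SKETCH_VLAN_TAG_SIZE 4u
#define SKETCH_L2_OVERHEAD \
	(SKETCH_ETHER_HDR_LEN + SKETCH_ETHER_CRC_LEN + SKETCH_VLAN_TAG_SIZE)

/* MTU derived from the configured maximum Rx frame length, as in the patch. */
static uint32_t sketch_frame_to_mtu(uint32_t max_rx_pkt_len)
{
	/* A frame shorter than the L2 overhead cannot carry any payload. */
	assert(max_rx_pkt_len >= SKETCH_L2_OVERHEAD);
	return max_rx_pkt_len - SKETCH_L2_OVERHEAD;
}

/* Inverse mapping, mirroring how dpaa_mtu_set() computes frame_size. */
static uint32_t sketch_mtu_to_frame(uint32_t mtu)
{
	return mtu + SKETCH_L2_OVERHEAD;
}
```

With these assumed constants a 1522-byte maximum frame corresponds to a 1500-byte MTU, so the value stored in dev->data->mtu stays consistent with what dpaa_mtu_set() would accept.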



[dpdk-dev] [PATCH 03/13] net/dpaa: implement scatter offload support

2018-08-29 Thread Hemant Agrawal
Signed-off-by: Hemant Agrawal 
---
 drivers/net/dpaa/dpaa_ethdev.c | 62 +++---
 drivers/net/dpaa/dpaa_ethdev.h |  3 +-
 drivers/net/dpaa/dpaa_rxtx.c   |  8 +++---
 drivers/net/dpaa/dpaa_rxtx.h   |  2 --
 4 files changed, 64 insertions(+), 11 deletions(-)

diff --git a/drivers/net/dpaa/dpaa_ethdev.c b/drivers/net/dpaa/dpaa_ethdev.c
index dd1bc90..a0e3f24 100644
--- a/drivers/net/dpaa/dpaa_ethdev.c
+++ b/drivers/net/dpaa/dpaa_ethdev.c
@@ -47,7 +47,8 @@
 
 /* Supported Rx offloads */
 static uint64_t dev_rx_offloads_sup =
-   DEV_RX_OFFLOAD_JUMBO_FRAME;
+   DEV_RX_OFFLOAD_JUMBO_FRAME |
+   DEV_RX_OFFLOAD_SCATTER;
 
 /* Rx offloads which cannot be disabled */
 static uint64_t dev_rx_offloads_nodis =
@@ -55,8 +56,7 @@ static uint64_t dev_rx_offloads_nodis =
DEV_RX_OFFLOAD_UDP_CKSUM |
DEV_RX_OFFLOAD_TCP_CKSUM |
DEV_RX_OFFLOAD_OUTER_IPV4_CKSUM |
-   DEV_RX_OFFLOAD_CRC_STRIP |
-   DEV_RX_OFFLOAD_SCATTER;
+   DEV_RX_OFFLOAD_CRC_STRIP;
 
 /* Supported Tx offloads */
 static uint64_t dev_tx_offloads_sup;
@@ -148,11 +148,30 @@ dpaa_mtu_set(struct rte_eth_dev *dev, uint16_t mtu)
struct dpaa_if *dpaa_intf = dev->data->dev_private;
uint32_t frame_size = mtu + ETHER_HDR_LEN + ETHER_CRC_LEN
+ VLAN_TAG_SIZE;
+   uint32_t buffsz = dev->data->min_rx_buf_size - RTE_PKTMBUF_HEADROOM;
 
PMD_INIT_FUNC_TRACE();
 
if (mtu < ETHER_MIN_MTU || frame_size > DPAA_MAX_RX_PKT_LEN)
return -EINVAL;
+   /*
+* Refuse mtu that requires the support of scattered packets
+* when this feature has not been enabled before.
+*/
+   if (dev->data->min_rx_buf_size &&
+   !dev->data->scattered_rx && frame_size > buffsz) {
+   DPAA_PMD_ERR("SG not enabled, will not fit in one buffer");
+   return -EINVAL;
+   }
+
+   /* check  *   >= max_frame */
+   if (dev->data->min_rx_buf_size && dev->data->scattered_rx &&
+   (frame_size > buffsz * DPAA_SGT_MAX_ENTRIES)) {
+   DPAA_PMD_ERR("Too big to fit for Max SG list %d",
+   buffsz * DPAA_SGT_MAX_ENTRIES);
+   return -EINVAL;
+   }
+
if (frame_size > ETHER_MAX_LEN)
dev->data->dev_conf.rxmode.offloads &=
DEV_RX_OFFLOAD_JUMBO_FRAME;
@@ -209,6 +228,13 @@ dpaa_eth_dev_configure(struct rte_eth_dev *dev)
DPAA_MAX_RX_PKT_LEN);
}
}
+
+   if (rx_offloads & DEV_RX_OFFLOAD_SCATTER) {
+   DPAA_PMD_DEBUG("enabling scatter mode");
+   fman_if_set_sg(dpaa_intf->fif, 1);
+   dev->data->scattered_rx = 1;
+   }
+
return 0;
 }
 
@@ -306,7 +332,6 @@ static void dpaa_eth_dev_info(struct rte_eth_dev *dev,
 
dev_info->max_rx_queues = dpaa_intf->nb_rx_queues;
dev_info->max_tx_queues = dpaa_intf->nb_tx_queues;
-   dev_info->min_rx_bufsize = DPAA_MIN_RX_BUF_SIZE;
dev_info->max_rx_pktlen = DPAA_MAX_RX_PKT_LEN;
dev_info->max_mac_addrs = DPAA_MAX_MAC_FILTER;
dev_info->max_hash_mac_addrs = 0;
@@ -520,6 +545,7 @@ int dpaa_eth_rx_queue_setup(struct rte_eth_dev *dev, uint16_t queue_idx,
struct qm_mcc_initfq opts = {0};
u32 flags = 0;
int ret;
+   u32 buffsz = rte_pktmbuf_data_room_size(mp) - RTE_PKTMBUF_HEADROOM;
 
PMD_INIT_FUNC_TRACE();
 
@@ -533,6 +559,28 @@ int dpaa_eth_rx_queue_setup(struct rte_eth_dev *dev, uint16_t queue_idx,
DPAA_PMD_INFO("Rx queue setup for queue index: %d fq_id (0x%x)",
queue_idx, rxq->fqid);
 
+   /* Max packet can fit in single buffer */
+   if (dev->data->dev_conf.rxmode.max_rx_pkt_len <= buffsz) {
+   ;
+   } else if (dev->data->dev_conf.rxmode.offloads &
+   DEV_RX_OFFLOAD_SCATTER) {
+   if (dev->data->dev_conf.rxmode.max_rx_pkt_len >
+   buffsz * DPAA_SGT_MAX_ENTRIES) {
+   DPAA_PMD_ERR("max RxPkt size %d too big to fit "
+   "MaxSGlist %d",
+   dev->data->dev_conf.rxmode.max_rx_pkt_len,
+   buffsz * DPAA_SGT_MAX_ENTRIES);
+   rte_errno = EOVERFLOW;
+   return -rte_errno;
+   }
+   } else {
+   DPAA_PMD_WARN("The requested maximum Rx packet size (%u) is"
+" larger than a single mbuf (%u) and scattered"
+" mode has not been requested",
+dev->data->dev_conf.rxmode.max_rx_pkt_len,
+buffsz - RTE_PKTMBUF_HEADROOM);
+   }
+
if (!dpaa_intf->bp_info || dpaa_intf->bp_info->mp != mp) {
struct fman_if_ic_
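
For readers following the Rx queue setup hunk, the three-way decision it makes can be sketched as a pure function. This is an illustration only: SKETCH_SGT_MAX_ENTRIES is a placeholder value standing in for DPAA_SGT_MAX_ENTRIES (the real value is not shown in this patch), and the enum and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Placeholder standing in for DPAA_SGT_MAX_ENTRIES; real value not shown here. */
#define SKETCH_SGT_MAX_ENTRIES 16u

enum sketch_rx_fit {
	SKETCH_FIT_SINGLE,        /* packet fits in one buffer */
	SKETCH_FIT_SCATTER,       /* fits in a scatter/gather list */
	SKETCH_FIT_TOO_BIG,       /* exceeds even the largest S/G list */
	SKETCH_FIT_NEEDS_SCATTER  /* too big for one buffer, scatter not requested */
};

/*
 * buffsz is the usable data room of one mbuf (data room minus headroom),
 * matching the local variable of the same name in the patch.
 */
static enum sketch_rx_fit
sketch_rx_fit(uint32_t max_rx_pkt_len, uint32_t buffsz, int scatter_enabled)
{
	/* 64-bit product so a large buffsz cannot overflow the comparison. */
	uint64_t sg_capacity = (uint64_t)buffsz * SKETCH_SGT_MAX_ENTRIES;

	if (max_rx_pkt_len <= buffsz)
		return SKETCH_FIT_SINGLE;
	if (!scatter_enabled)
		return SKETCH_FIT_NEEDS_SCATTER; /* the patch only warns here */
	if (max_rx_pkt_len > sg_capacity)
		return SKETCH_FIT_TOO_BIG;       /* the patch returns -EOVERFLOW */
	return SKETCH_FIT_SCATTER;
}
```

The same capacity comparison appears in the dpaa_mtu_set() hunk, where it rejects an MTU that would not fit in buffsz * DPAA_SGT_MAX_ENTRIES.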

[dpdk-dev] [PATCH 00/13] driver/net: NXP DPAA driver enhancements

2018-08-29 Thread Hemant Agrawal


Hemant Agrawal (9):
  net/dpaa: configure frame queue on MAC ID basis
  net/dpaa: fix jumbo buffer config
  net/dpaa: implement scatter offload support
  net/dpaa: minor debug log enhancements
  bus/dpaa: add interrupt based portal fd support
  net/dpaa: separate Rx function for LS1046
  net/dpaa: tune prefetch in Rx path
  bus/dpaa: add check for re-definition in compat
  mempool/dpaa: change the debug log level to DP

Nipun Gupta (2):
  bus/dpaa: avoid tag Set for eqcr in Tx path
  bus/dpaa: avoid using be conversions for contextb

Sachin Saxena (1):
  net/dpaa: set correct speed based on MAC type

Sunil Kumar Kori (1):
  net/dpaa: rearranging of atomic queue support code

 drivers/bus/dpaa/base/qbman/bman_driver.c |  17 ++--
 drivers/bus/dpaa/base/qbman/qman.c|  72 +
 drivers/bus/dpaa/base/qbman/qman_driver.c |   7 +-
 drivers/bus/dpaa/include/compat.h |  20 +++--
 drivers/bus/dpaa/include/fsl_qman.h   |  20 +
 drivers/bus/dpaa/include/fsl_usd.h|   6 ++
 drivers/bus/dpaa/rte_bus_dpaa_version.map |  17 +++-
 drivers/mempool/dpaa/dpaa_mempool.c   |   2 +-
 drivers/net/dpaa/dpaa_ethdev.c| 126 --
 drivers/net/dpaa/dpaa_ethdev.h|   5 +-
 drivers/net/dpaa/dpaa_rxtx.c  | 100 
 drivers/net/dpaa/dpaa_rxtx.h  |   5 +-
 12 files changed, 306 insertions(+), 91 deletions(-)

-- 
2.7.4



[dpdk-dev] [PATCH 01/13] net/dpaa: configure frame queue on MAC ID basis

2018-08-29 Thread Hemant Agrawal
Signed-off-by: Hemant Agrawal 
---
 drivers/net/dpaa/dpaa_ethdev.c | 25 ++---
 drivers/net/dpaa/dpaa_ethdev.h |  2 +-
 2 files changed, 15 insertions(+), 12 deletions(-)

diff --git a/drivers/net/dpaa/dpaa_ethdev.c b/drivers/net/dpaa/dpaa_ethdev.c
index 7a950ac..009ef84 100644
--- a/drivers/net/dpaa/dpaa_ethdev.c
+++ b/drivers/net/dpaa/dpaa_ethdev.c
@@ -1012,7 +1012,7 @@ static int dpaa_rx_queue_init(struct qman_fq *fq, struct qman_cgr *cgr_rx,
 {
struct qm_mcc_initfq opts = {0};
int ret;
-   u32 flags = 0;
+   u32 flags = QMAN_FQ_FLAG_NO_ENQUEUE;
struct qm_mcc_initcgr cgr_opts = {
.we_mask = QM_CGR_WE_CS_THRES |
QM_CGR_WE_CSTD_EN |
@@ -1025,15 +1025,18 @@ static int dpaa_rx_queue_init(struct qman_fq *fq, struct qman_cgr *cgr_rx,
 
PMD_INIT_FUNC_TRACE();
 
-   ret = qman_reserve_fqid(fqid);
-   if (ret) {
-   DPAA_PMD_ERR("reserve rx fqid 0x%x failed with ret: %d",
-fqid, ret);
-   return -EINVAL;
+   if (fqid) {
+   ret = qman_reserve_fqid(fqid);
+   if (ret) {
+   DPAA_PMD_ERR("reserve rx fqid 0x%x failed with ret: %d",
+fqid, ret);
+   return -EINVAL;
+   }
+   } else {
+   flags |= QMAN_FQ_FLAG_DYNAMIC_FQID;
}
-
DPAA_PMD_DEBUG("creating rx fq %p, fqid 0x%x", fq, fqid);
-   ret = qman_create_fq(fqid, QMAN_FQ_FLAG_NO_ENQUEUE, fq);
+   ret = qman_create_fq(fqid, flags, fq);
if (ret) {
DPAA_PMD_ERR("create rx fqid 0x%x failed with ret: %d",
fqid, ret);
@@ -1052,7 +1055,7 @@ static int dpaa_rx_queue_init(struct qman_fq *fq, struct qman_cgr *cgr_rx,
if (ret) {
DPAA_PMD_WARN(
"rx taildrop init fail on rx fqid 0x%x(ret=%d)",
-   fqid, ret);
+   fq->fqid, ret);
goto without_cgr;
}
opts.we_mask |= QM_INITFQ_WE_CGID;
@@ -1060,7 +1063,7 @@ static int dpaa_rx_queue_init(struct qman_fq *fq, struct qman_cgr *cgr_rx,
opts.fqd.fq_ctrl |= QM_FQCTRL_CGE;
}
 without_cgr:
-   ret = qman_init_fq(fq, flags, &opts);
+   ret = qman_init_fq(fq, 0, &opts);
if (ret)
DPAA_PMD_ERR("init rx fqid 0x%x failed with ret:%d", fqid, ret);
return ret;
@@ -1213,7 +1216,7 @@ dpaa_dev_init(struct rte_eth_dev *eth_dev)
if (default_q)
fqid = cfg->rx_def;
else
-   fqid = DPAA_PCD_FQID_START + dpaa_intf->ifid *
+   fqid = DPAA_PCD_FQID_START + dpaa_intf->fif->mac_idx *
DPAA_PCD_FQID_MULTIPLIER + loop;
 
if (dpaa_intf->cgr_rx)
diff --git a/drivers/net/dpaa/dpaa_ethdev.h b/drivers/net/dpaa/dpaa_ethdev.h
index c79b9f8..2c38c34 100644
--- a/drivers/net/dpaa/dpaa_ethdev.h
+++ b/drivers/net/dpaa/dpaa_ethdev.h
@@ -63,7 +63,7 @@
 #define DPAA_PCD_FQID_START0x400
 #define DPAA_PCD_FQID_MULTIPLIER   0x100
 #define DPAA_DEFAULT_NUM_PCD_QUEUES1
-#define DPAA_MAX_NUM_PCD_QUEUES32
+#define DPAA_MAX_NUM_PCD_QUEUES4
 
 #define DPAA_IF_TX_PRIORITY3
 #define DPAA_IF_RX_PRIORITY0
-- 
2.7.4
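
The per-MAC FQID layout used in dpaa_dev_init() can be sketched as below. The two macro values are taken from the dpaa_ethdev.h context lines in this patch; the function name is hypothetical and the range check is an added assumption, not something the driver does.

```c
#include <assert.h>
#include <stdint.h>

#define DPAA_PCD_FQID_START      0x400
#define DPAA_PCD_FQID_MULTIPLIER 0x100

/*
 * FQID of PCD queue 'queue' on the interface with MAC index 'mac_idx'.
 * Each MAC gets its own window of DPAA_PCD_FQID_MULTIPLIER IDs.
 */
static uint32_t sketch_pcd_fqid(uint32_t mac_idx, uint32_t queue)
{
	/* Assumed sanity check: a queue index must stay inside its window. */
	assert(queue < DPAA_PCD_FQID_MULTIPLIER);
	return DPAA_PCD_FQID_START + mac_idx * DPAA_PCD_FQID_MULTIPLIER + queue;
}
```

Keying the window on the MAC index rather than the interface ID presumably keeps the FQIDs stable regardless of probe order, and with DPAA_MAX_NUM_PCD_QUEUES at 4 the windows of different MACs cannot overlap.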



[dpdk-dev] [PATCH 05/13] net/dpaa: minor debug log enhancements

2018-08-29 Thread Hemant Agrawal
Signed-off-by: Hemant Agrawal 
---
 drivers/net/dpaa/dpaa_ethdev.c | 9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/net/dpaa/dpaa_ethdev.c b/drivers/net/dpaa/dpaa_ethdev.c
index 4e5cc0f..df72510 100644
--- a/drivers/net/dpaa/dpaa_ethdev.c
+++ b/drivers/net/dpaa/dpaa_ethdev.c
@@ -614,9 +614,9 @@ int dpaa_eth_rx_queue_setup(struct rte_eth_dev *dev, uint16_t queue_idx,
fman_if_set_bp(dpaa_intf->fif, mp->size,
   dpaa_intf->bp_info->bpid, bp_size);
dpaa_intf->valid = 1;
-   DPAA_PMD_INFO("if =%s - fd_offset = %d offset = %d",
-   dpaa_intf->name, fd_offset,
-   fman_if_get_fdoff(dpaa_intf->fif));
+   DPAA_PMD_DEBUG("if:%s fd_offset = %d offset = %d",
+   dpaa_intf->name, fd_offset,
+   fman_if_get_fdoff(dpaa_intf->fif));
}
DPAA_PMD_DEBUG("if:%s sg_on = %d, max_frm =%d", dpaa_intf->name,
fman_if_get_sg_enable(dpaa_intf->fif),
@@ -694,7 +694,8 @@ dpaa_eth_eventq_attach(const struct rte_eth_dev *dev,
struct qm_mcc_initfq opts = {0};
 
if (dpaa_push_mode_max_queue)
-   DPAA_PMD_WARN("PUSH mode already enabled for first %d queues.\n"
+   DPAA_PMD_WARN("PUSH mode q and EVENTDEV are not compatible\n"
+ "PUSH mode already enabled for first %d queues.\n"
  "To disable set DPAA_PUSH_QUEUES_NUMBER to 0\n",
  dpaa_push_mode_max_queue);
 
-- 
2.7.4



[dpdk-dev] [PATCH 04/13] net/dpaa: set correct speed based on MAC type

2018-08-29 Thread Hemant Agrawal
From: Sachin Saxena 

Fixes: 799db4568c76 ("net/dpaa: support device info and speed capability")
Cc: shreyansh.j...@nxp.com
Cc: sta...@dpdk.org

Signed-off-by: Sachin Saxena 
---
 drivers/net/dpaa/dpaa_ethdev.c | 11 +--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/drivers/net/dpaa/dpaa_ethdev.c b/drivers/net/dpaa/dpaa_ethdev.c
index a0e3f24..4e5cc0f 100644
--- a/drivers/net/dpaa/dpaa_ethdev.c
+++ b/drivers/net/dpaa/dpaa_ethdev.c
@@ -338,8 +338,15 @@ static void dpaa_eth_dev_info(struct rte_eth_dev *dev,
dev_info->max_vfs = 0;
dev_info->max_vmdq_pools = ETH_16_POOLS;
dev_info->flow_type_rss_offloads = DPAA_RSS_OFFLOAD_ALL;
-   dev_info->speed_capa = (ETH_LINK_SPEED_1G |
-   ETH_LINK_SPEED_10G);
+
+   if (dpaa_intf->fif->mac_type == fman_mac_1g)
+   dev_info->speed_capa = ETH_LINK_SPEED_1G;
+   else if (dpaa_intf->fif->mac_type == fman_mac_10g)
+   dev_info->speed_capa = (ETH_LINK_SPEED_1G | ETH_LINK_SPEED_10G);
+   else
+   DPAA_PMD_ERR("invalid link_speed: %s, %d",
+dpaa_intf->name, dpaa_intf->fif->mac_type);
+
dev_info->rx_offload_capa = dev_rx_offloads_sup |
dev_rx_offloads_nodis;
dev_info->tx_offload_capa = dev_tx_offloads_sup |
-- 
2.7.4



[dpdk-dev] [PATCH 07/13] bus/dpaa: avoid tag Set for eqcr in Tx path

2018-08-29 Thread Hemant Agrawal
From: Nipun Gupta 

Signed-off-by: Nipun Gupta 
---
 drivers/bus/dpaa/base/qbman/qman.c | 5 -
 1 file changed, 5 deletions(-)

diff --git a/drivers/bus/dpaa/base/qbman/qman.c b/drivers/bus/dpaa/base/qbman/qman.c
index 8730550..71da275 100644
--- a/drivers/bus/dpaa/base/qbman/qman.c
+++ b/drivers/bus/dpaa/base/qbman/qman.c
@@ -2238,11 +2238,6 @@ int qman_enqueue_multi(struct qman_fq *fq,
/* try to send as many frames as possible */
while (eqcr->available && frames_to_send--) {
eq->fqid = fq->fqid_le;
-#ifdef CONFIG_FSL_QMAN_FQ_LOOKUP
-   eq->tag = cpu_to_be32(fq->key);
-#else
-   eq->tag = cpu_to_be32((u32)(uintptr_t)fq);
-#endif
eq->fd.opaque_addr = fd->opaque_addr;
eq->fd.addr = cpu_to_be40(fd->addr);
eq->fd.status = cpu_to_be32(fd->status);
-- 
2.7.4



[dpdk-dev] [PATCH 06/13] bus/dpaa: add interrupt based portal fd support

2018-08-29 Thread Hemant Agrawal
Signed-off-by: Hemant Agrawal 
---
 drivers/bus/dpaa/base/qbman/bman_driver.c | 17 ++
 drivers/bus/dpaa/base/qbman/qman.c| 52 +++
 drivers/bus/dpaa/base/qbman/qman_driver.c |  7 -
 drivers/bus/dpaa/include/fsl_qman.h   | 20 
 drivers/bus/dpaa/include/fsl_usd.h|  6 
 drivers/bus/dpaa/rte_bus_dpaa_version.map | 17 +-
 6 files changed, 111 insertions(+), 8 deletions(-)

diff --git a/drivers/bus/dpaa/base/qbman/bman_driver.c b/drivers/bus/dpaa/base/qbman/bman_driver.c
index b14b590..750b756 100644
--- a/drivers/bus/dpaa/base/qbman/bman_driver.c
+++ b/drivers/bus/dpaa/base/qbman/bman_driver.c
@@ -23,7 +23,7 @@ static void *bman_ccsr_map;
 /* Portal driver */
 /*/
 
-static __thread int fd = -1;
+static __thread int bmfd = -1;
 static __thread struct bm_portal_config pcfg;
 static __thread struct dpaa_ioctl_portal_map map = {
.type = dpaa_portal_bman
@@ -70,14 +70,14 @@ static int fsl_bman_portal_init(uint32_t idx, int is_shared)
pcfg.index = map.index;
bman_depletion_fill(&pcfg.mask);
 
-   fd = open(BMAN_PORTAL_IRQ_PATH, O_RDONLY);
-   if (fd == -1) {
+   bmfd = open(BMAN_PORTAL_IRQ_PATH, O_RDONLY);
+   if (bmfd == -1) {
pr_err("BMan irq init failed");
process_portal_unmap(&map.addr);
return -EBUSY;
}
/* Use the IRQ FD as a unique IRQ number */
-   pcfg.irq = fd;
+   pcfg.irq = bmfd;
 
portal = bman_create_affine_portal(&pcfg);
if (!portal) {
@@ -90,7 +90,7 @@ static int fsl_bman_portal_init(uint32_t idx, int is_shared)
/* Set the IRQ number */
irq_map.type = dpaa_portal_bman;
irq_map.portal_cinh = map.addr.cinh;
-   process_portal_irq_map(fd, &irq_map);
+   process_portal_irq_map(bmfd, &irq_map);
return 0;
 }
 
@@ -99,7 +99,7 @@ static int fsl_bman_portal_finish(void)
__maybe_unused const struct bm_portal_config *cfg;
int ret;
 
-   process_portal_irq_unmap(fd);
+   process_portal_irq_unmap(bmfd);
 
cfg = bman_destroy_affine_portal();
DPAA_BUG_ON(cfg != &pcfg);
@@ -109,6 +109,11 @@ static int fsl_bman_portal_finish(void)
return ret;
 }
 
+int bman_thread_fd(void)
+{
+   return bmfd;
+}
+
 int bman_thread_init(void)
 {
/* Convert from contiguous/virtual cpu numbering to real cpu when
diff --git a/drivers/bus/dpaa/base/qbman/qman.c b/drivers/bus/dpaa/base/qbman/qman.c
index 7c17027..8730550 100644
--- a/drivers/bus/dpaa/base/qbman/qman.c
+++ b/drivers/bus/dpaa/base/qbman/qman.c
@@ -1040,6 +1040,50 @@ static inline unsigned int __poll_portal_fast(struct qman_portal *p,
return limit;
 }
 
+int qman_irqsource_add(u32 bits)
+{
+   struct qman_portal *p = get_affine_portal();
+
+   bits = bits & QM_PIRQ_VISIBLE;
+
+   /* Clear any previously remaining interrupt conditions in
+* QCSP_ISR. This prevents raising a false interrupt when
+* interrupt conditions are enabled in QCSP_IER.
+*/
+   qm_isr_status_clear(&p->p, bits);
+   dpaa_set_bits(bits, &p->irq_sources);
+   qm_isr_enable_write(&p->p, p->irq_sources);
+
+
+   return 0;
+}
+
+int qman_irqsource_remove(u32 bits)
+{
+   struct qman_portal *p = get_affine_portal();
+   u32 ier;
+
+   /* Our interrupt handler only processes+clears status register bits that
+* are in p->irq_sources. As we're trimming that mask, if one of them
+* were to assert in the status register just before we remove it from
+* the enable register, there would be an interrupt-storm when we
+* release the IRQ lock. So we wait for the enable register update to
+* take effect in h/w (by reading it back) and then clear all other bits
+* in the status register. Ie. we clear them from ISR once it's certain
+* IER won't allow them to reassert.
+*/
+
+   bits &= QM_PIRQ_VISIBLE;
+   dpaa_clear_bits(bits, &p->irq_sources);
+   qm_isr_enable_write(&p->p, p->irq_sources);
+   ier = qm_isr_enable_read(&p->p);
+   /* Using "~ier" (rather than "bits" or "~p->irq_sources") creates a
+* data-dependency, ie. to protect against re-ordering.
+*/
+   qm_isr_status_clear(&p->p, ~ier);
+   return 0;
+}
+
 u16 qman_affine_channel(int cpu)
 {
if (cpu < 0) {
@@ -1114,6 +1158,14 @@ unsigned int qman_portal_poll_rx(unsigned int poll_limit,
return rx_number;
 }
 
+void qman_clear_irq(void)
+{
+   struct qman_portal *p = get_affine_portal();
+   u32 clear = QM_DQAVAIL_MASK | (p->irq_sources &
+   ~(QM_PIRQ_CSCI | QM_PIRQ_CCSCI));
+   qm_isr_status_clear(&p->p, clear);
+}
+
 u32 qman_portal_dequeue(struct rte_event ev[], unsigned int poll_limit,
void **bufs)
 {
diff --git a/drivers/bus/dpaa/base/qbman/qman_driver.c b/drivers/bus/dpaa/base/qbman/qman_driver.c

[dpdk-dev] [PATCH 09/13] net/dpaa: rearranging of atomic queue support code

2018-08-29 Thread Hemant Agrawal
From: Sunil Kumar Kori 

Signed-off-by: Sunil Kumar Kori 
---
 drivers/net/dpaa/dpaa_rxtx.c | 25 +
 1 file changed, 13 insertions(+), 12 deletions(-)

diff --git a/drivers/net/dpaa/dpaa_rxtx.c b/drivers/net/dpaa/dpaa_rxtx.c
index 3a3a048..6698c97 100644
--- a/drivers/net/dpaa/dpaa_rxtx.c
+++ b/drivers/net/dpaa/dpaa_rxtx.c
@@ -870,6 +870,19 @@ dpaa_eth_queue_tx(void *q, struct rte_mbuf **bufs, uint16_t nb_bufs)
DPAA_TX_BURST_SIZE : nb_bufs;
for (loop = 0; loop < frames_to_send; loop++) {
mbuf = *(bufs++);
+   seqn = mbuf->seqn;
+   if (seqn != DPAA_INVALID_MBUF_SEQN) {
+   index = seqn - 1;
+   if (DPAA_PER_LCORE_DQRR_HELD & (1 << index)) {
+   flags[loop] =
+  ((index & QM_EQCR_DCA_IDXMASK) << 8);
+   flags[loop] |= QMAN_ENQUEUE_FLAG_DCA;
+   DPAA_PER_LCORE_DQRR_SIZE--;
+   DPAA_PER_LCORE_DQRR_HELD &=
+   ~(1 << index);
+   }
+   }
+
if (likely(RTE_MBUF_DIRECT(mbuf))) {
mp = mbuf->pool;
bp_info = DPAA_MEMPOOL_TO_POOL_INFO(mp);
@@ -916,18 +929,6 @@ dpaa_eth_queue_tx(void *q, struct rte_mbuf **bufs, uint16_t nb_bufs)
goto send_pkts;
}
}
-   seqn = mbuf->seqn;
-   if (seqn != DPAA_INVALID_MBUF_SEQN) {
-   index = seqn - 1;
-   if (DPAA_PER_LCORE_DQRR_HELD & (1 << index)) {
-   flags[loop] =
-  ((index & QM_EQCR_DCA_IDXMASK) << 8);
-   flags[loop] |= QMAN_ENQUEUE_FLAG_DCA;
-   DPAA_PER_LCORE_DQRR_SIZE--;
-   DPAA_PER_LCORE_DQRR_HELD &=
-   ~(1 << index);
-   }
-   }
}
 
 send_pkts:
-- 
2.7.4
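
The block being moved in this patch is a small piece of bitmask bookkeeping, which can be sketched on its own. Everything prefixed sketch_/SKETCH_ is hypothetical: the three constants are assumed stand-ins for DPAA_INVALID_MBUF_SEQN, QM_EQCR_DCA_IDXMASK and QMAN_ENQUEUE_FLAG_DCA, the struct replaces the per-lcore DQRR macros, and the index range check is an added assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed stand-ins; the real values live in the DPAA headers. */
#define SKETCH_INVALID_SEQN 0u
#define SKETCH_DCA_IDXMASK  0x0fu
#define SKETCH_ENQUEUE_DCA  0x8000u

/* Stand-in for DPAA_PER_LCORE_DQRR_HELD / DPAA_PER_LCORE_DQRR_SIZE. */
struct sketch_dqrr_state {
	uint16_t held; /* bit i set: DQRR entry i is still held */
	uint8_t size;  /* number of held entries */
};

/*
 * Compute the enqueue flags for one mbuf. If the mbuf carries a sequence
 * number that refers to a held DQRR entry, request DCA for that entry and
 * release it from the held set.
 */
static uint32_t sketch_tx_flags(struct sketch_dqrr_state *st, uint32_t seqn)
{
	uint32_t index, flags = 0;

	if (seqn == SKETCH_INVALID_SEQN)
		return 0;

	index = seqn - 1;
	if (index < 16 && (st->held & (1u << index))) {
		flags = ((index & SKETCH_DCA_IDXMASK) << 8) | SKETCH_ENQUEUE_DCA;
		st->size--;
		st->held &= (uint16_t)~(1u << index);
	}
	return flags;
}
```

Doing this at the top of the loop, as the patch does, means the flags are already computed before any of the early `goto send_pkts` exits further down.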



[dpdk-dev] [PATCH 08/13] bus/dpaa: avoid using be conversions for contextb

2018-08-29 Thread Hemant Agrawal
From: Nipun Gupta 

Signed-off-by: Nipun Gupta 
---
 drivers/bus/dpaa/base/qbman/qman.c | 15 +--
 1 file changed, 5 insertions(+), 10 deletions(-)

diff --git a/drivers/bus/dpaa/base/qbman/qman.c b/drivers/bus/dpaa/base/qbman/qman.c
index 71da275..dc64d08 100644
--- a/drivers/bus/dpaa/base/qbman/qman.c
+++ b/drivers/bus/dpaa/base/qbman/qman.c
@@ -852,11 +852,9 @@ static u32 __poll_portal_slow(struct qman_portal *p, u32 is)
case QM_MR_VERB_FQPN:
/* Parked */
 #ifdef CONFIG_FSL_QMAN_FQ_LOOKUP
-   fq = get_fq_table_entry(
-   be32_to_cpu(msg->fq.contextB));
+   fq = get_fq_table_entry(msg->fq.contextB);
 #else
-   fq = (void *)(uintptr_t)
-   be32_to_cpu(msg->fq.contextB);
+   fq = (void *)(uintptr_t)msg->fq.contextB;
 #endif
fq_state_change(p, fq, msg, verb);
if (fq->cb.fqs)
@@ -967,7 +965,6 @@ static inline unsigned int __poll_portal_fast(struct qman_portal *p,
*shadow = *dq;
dq = shadow;
shadow->fqid = be32_to_cpu(shadow->fqid);
-   shadow->contextB = be32_to_cpu(shadow->contextB);
shadow->seqnum = be16_to_cpu(shadow->seqnum);
hw_fd_to_cpu(&shadow->fd);
 #endif
@@ -1136,9 +1133,9 @@ unsigned int qman_portal_poll_rx(unsigned int poll_limit,
 
/* SDQCR: context_b points to the FQ */
 #ifdef CONFIG_FSL_QMAN_FQ_LOOKUP
-   fq = qman_fq_lookup_table[be32_to_cpu(dq[rx_number]->contextB)];
+   fq = qman_fq_lookup_table[dq[rx_number]->contextB];
 #else
-   fq = (void *)be32_to_cpu(dq[rx_number]->contextB);
+   fq = (void *)dq[rx_number]->contextB;
 #endif
if (fq->cb.dqrr_prepare)
fq->cb.dqrr_prepare(shadow[rx_number],
@@ -1195,7 +1192,6 @@ u32 qman_portal_dequeue(struct rte_event ev[], unsigned int poll_limit,
*shadow = *dq;
dq = shadow;
shadow->fqid = be32_to_cpu(shadow->fqid);
-   shadow->contextB = be32_to_cpu(shadow->contextB);
shadow->seqnum = be16_to_cpu(shadow->seqnum);
hw_fd_to_cpu(&shadow->fd);
 #endif
@@ -1260,7 +1256,6 @@ struct qm_dqrr_entry *qman_dequeue(struct qman_fq *fq)
*shadow = *dq;
dq = shadow;
shadow->fqid = be32_to_cpu(shadow->fqid);
-   shadow->contextB = be32_to_cpu(shadow->contextB);
shadow->seqnum = be16_to_cpu(shadow->seqnum);
hw_fd_to_cpu(&shadow->fd);
 #endif
@@ -1556,7 +1551,7 @@ int qman_init_fq(struct qman_fq *fq, u32 flags, struct qm_mcc_initfq *opts)
 
mcc->initfq.we_mask |= QM_INITFQ_WE_CONTEXTB;
 #ifdef CONFIG_FSL_QMAN_FQ_LOOKUP
-   mcc->initfq.fqd.context_b = fq->key;
+   mcc->initfq.fqd.context_b = cpu_to_be32(fq->key);
 #else
mcc->initfq.fqd.context_b = (u32)(uintptr_t)fq;
 #endif
-- 
2.7.4



[dpdk-dev] [PATCH 10/13] net/dpaa: separate Rx function for LS1046

2018-08-29 Thread Hemant Agrawal
This avoids per-packet SoC family checks in the datapath.

Signed-off-by: Hemant Agrawal 
---
 drivers/net/dpaa/dpaa_ethdev.c |  9 +--
 drivers/net/dpaa/dpaa_rxtx.c   | 60 +-
 drivers/net/dpaa/dpaa_rxtx.h   |  3 +++
 3 files changed, 58 insertions(+), 14 deletions(-)

diff --git a/drivers/net/dpaa/dpaa_ethdev.c b/drivers/net/dpaa/dpaa_ethdev.c
index df72510..76cd0f7 100644
--- a/drivers/net/dpaa/dpaa_ethdev.c
+++ b/drivers/net/dpaa/dpaa_ethdev.c
@@ -658,8 +658,13 @@ int dpaa_eth_rx_queue_setup(struct rte_eth_dev *dev, uint16_t queue_idx,
"ret:%d(%s)", rxq->fqid, ret, strerror(ret));
return ret;
}
-   rxq->cb.dqrr_dpdk_pull_cb = dpaa_rx_cb;
-   rxq->cb.dqrr_prepare = dpaa_rx_cb_prepare;
+   if (dpaa_svr_family == SVR_LS1043A_FAMILY) {
+   rxq->cb.dqrr_dpdk_pull_cb = dpaa_rx_cb_no_prefetch;
+   } else {
+   rxq->cb.dqrr_dpdk_pull_cb = dpaa_rx_cb;
+   rxq->cb.dqrr_prepare = dpaa_rx_cb_prepare;
+   }
+
rxq->is_static = true;
}
dev->data->rx_queues[queue_idx] = rxq;
diff --git a/drivers/net/dpaa/dpaa_rxtx.c b/drivers/net/dpaa/dpaa_rxtx.c
index 6698c97..2c57741 100644
--- a/drivers/net/dpaa/dpaa_rxtx.c
+++ b/drivers/net/dpaa/dpaa_rxtx.c
@@ -398,8 +398,9 @@ dpaa_eth_fd_to_mbuf(const struct qm_fd *fd, uint32_t ifid)
return mbuf;
 }
 
+/* Specific for LS1043 */
 void
-dpaa_rx_cb(struct qman_fq **fq, struct qm_dqrr_entry **dqrr,
+dpaa_rx_cb_no_prefetch(struct qman_fq **fq, struct qm_dqrr_entry **dqrr,
   void **bufs, int num_bufs)
 {
struct rte_mbuf *mbuf;
@@ -411,17 +412,13 @@ dpaa_rx_cb(struct qman_fq **fq, struct qm_dqrr_entry **dqrr,
uint32_t length;
uint8_t format;
 
-   if (dpaa_svr_family != SVR_LS1046A_FAMILY) {
-   bp_info = DPAA_BPID_TO_POOL_INFO(dqrr[0]->fd.bpid);
-   ptr = rte_dpaa_mem_ptov(qm_fd_addr(&dqrr[0]->fd));
-   rte_prefetch0((void *)((uint8_t *)ptr + DEFAULT_RX_ICEOF));
-   bufs[0] = (struct rte_mbuf *)((char *)ptr -
-   bp_info->meta_data_size);
-   }
+   bp_info = DPAA_BPID_TO_POOL_INFO(dqrr[0]->fd.bpid);
+   ptr = rte_dpaa_mem_ptov(qm_fd_addr(&dqrr[0]->fd));
+   rte_prefetch0((void *)((uint8_t *)ptr + DEFAULT_RX_ICEOF));
+   bufs[0] = (struct rte_mbuf *)((char *)ptr - bp_info->meta_data_size);
 
for (i = 0; i < num_bufs; i++) {
-   if (dpaa_svr_family != SVR_LS1046A_FAMILY &&
-   i < num_bufs - 1) {
+   if (i < num_bufs - 1) {
bp_info = DPAA_BPID_TO_POOL_INFO(dqrr[i + 1]->fd.bpid);
ptr = rte_dpaa_mem_ptov(qm_fd_addr(&dqrr[i + 1]->fd));
rte_prefetch0((void *)((uint8_t *)ptr +
@@ -458,6 +455,46 @@ dpaa_rx_cb(struct qman_fq **fq, struct qm_dqrr_entry **dqrr,
}
 }
 
+void
+dpaa_rx_cb(struct qman_fq **fq, struct qm_dqrr_entry **dqrr,
+  void **bufs, int num_bufs)
+{
+   struct rte_mbuf *mbuf;
+   const struct qm_fd *fd;
+   struct dpaa_if *dpaa_intf;
+   uint16_t offset, i;
+   uint32_t length;
+   uint8_t format;
+
+   for (i = 0; i < num_bufs; i++) {
+   fd = &dqrr[i]->fd;
+   dpaa_intf = fq[0]->dpaa_intf;
+
+   format = (fd->opaque & DPAA_FD_FORMAT_MASK) >>
+   DPAA_FD_FORMAT_SHIFT;
+   if (unlikely(format == qm_fd_sg)) {
+   bufs[i] = dpaa_eth_sg_to_mbuf(fd, dpaa_intf->ifid);
+   continue;
+   }
+
+   offset = (fd->opaque & DPAA_FD_OFFSET_MASK) >>
+   DPAA_FD_OFFSET_SHIFT;
+   length = fd->opaque & DPAA_FD_LENGTH_MASK;
+
+   mbuf = bufs[i];
+   mbuf->data_off = offset;
+   mbuf->data_len = length;
+   mbuf->pkt_len = length;
+   mbuf->port = dpaa_intf->ifid;
+
+   mbuf->nb_segs = 1;
+   mbuf->ol_flags = 0;
+   mbuf->next = NULL;
+   rte_mbuf_refcnt_set(mbuf, 1);
+   dpaa_eth_packet_info(mbuf, mbuf->buf_addr);
+   }
+}
+
 void dpaa_rx_cb_prepare(struct qm_dqrr_entry *dq, void **bufs)
 {
struct dpaa_bp_info *bp_info = DPAA_BPID_TO_POOL_INFO(dq->fd.bpid);
@@ -468,8 +505,7 @@ void dpaa_rx_cb_prepare(struct qm_dqrr_entry *dq, void **bufs)
 * So we prefetch the annoation beforehand, so that it is available
 * in cache when accessed.
 */
-   if (dpaa_svr_family == SVR_LS1046A_FAMILY)
-   rte_prefetch0((void *)((uint8_t *)ptr + DEFAULT_RX_ICEOF));
+   rte_prefetch0((void *)((uint8_t *)ptr + DEFAULT_RX_ICEOF));
 
*bufs = (struct rte_mbuf *)((char *)ptr - bp_info->meta_data_size);
 }
diff --git a/d

[dpdk-dev] [PATCH 11/13] net/dpaa: tune prefetch in Rx path

2018-08-29 Thread Hemant Agrawal
Signed-off-by: Hemant Agrawal 
---
 drivers/net/dpaa/dpaa_rxtx.c | 7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/drivers/net/dpaa/dpaa_rxtx.c b/drivers/net/dpaa/dpaa_rxtx.c
index 2c57741..c4471c2 100644
--- a/drivers/net/dpaa/dpaa_rxtx.c
+++ b/drivers/net/dpaa/dpaa_rxtx.c
@@ -370,10 +370,6 @@ dpaa_eth_fd_to_mbuf(const struct qm_fd *fd, uint32_t ifid)
if (unlikely(format == qm_fd_sg))
return dpaa_eth_sg_to_mbuf(fd, ifid);
 
-   ptr = DPAA_MEMPOOL_PTOV(bp_info, qm_fd_addr(fd));
-
-   rte_prefetch0((void *)((uint8_t *)ptr + DEFAULT_RX_ICEOF));
-
offset = (fd->opaque & DPAA_FD_OFFSET_MASK) >> DPAA_FD_OFFSET_SHIFT;
length = fd->opaque & DPAA_FD_LENGTH_MASK;
 
@@ -381,8 +377,11 @@ dpaa_eth_fd_to_mbuf(const struct qm_fd *fd, uint32_t ifid)
 
/* Ignoring case when format != qm_fd_contig */
dpaa_display_frame(fd);
+   ptr = DPAA_MEMPOOL_PTOV(bp_info, qm_fd_addr(fd));
 
mbuf = (struct rte_mbuf *)((char *)ptr - bp_info->meta_data_size);
+   /* Prefetch the Parse results and packet data to L1 */
+   rte_prefetch0((void *)((uint8_t *)ptr + DEFAULT_RX_ICEOF));
 
mbuf->data_off = offset;
mbuf->data_len = length;
-- 
2.7.4



[dpdk-dev] [PATCH 12/13] bus/dpaa: add check for re-definition in compat

2018-08-29 Thread Hemant Agrawal
Signed-off-by: Hemant Agrawal 
---
 drivers/bus/dpaa/include/compat.h | 20 ++--
 1 file changed, 14 insertions(+), 6 deletions(-)

diff --git a/drivers/bus/dpaa/include/compat.h b/drivers/bus/dpaa/include/compat.h
index 92241d2..4122657 100644
--- a/drivers/bus/dpaa/include/compat.h
+++ b/drivers/bus/dpaa/include/compat.h
@@ -57,8 +57,9 @@
 #ifndef __packed
 #define __packed   __rte_packed
 #endif
+#ifndef noinline
 #define noinline   __attribute__((noinline))
-
+#endif
 #define L1_CACHE_BYTES 64
 #define cacheline_aligned __attribute__((aligned(L1_CACHE_BYTES)))
 #define __stringify_1(x) #x
@@ -75,20 +76,25 @@
printf(fmt, ##args); \
fflush(stdout); \
} while (0)
-
+#ifndef pr_crit
 #define pr_crit(fmt, args...)   prflush("CRIT:" fmt, ##args)
+#endif
+#ifndef pr_err
 #define pr_err(fmt, args...)prflush("ERR:" fmt, ##args)
+#endif
+#ifndef pr_warn
 #define pr_warn(fmt, args...)   prflush("WARN:" fmt, ##args)
+#endif
+#ifndef pr_info
 #define pr_info(fmt, args...)   prflush(fmt, ##args)
-
-#ifdef RTE_LIBRTE_DPAA_DEBUG_BUS
-#ifdef pr_debug
-#undef pr_debug
 #endif
+#ifndef pr_debug
+#ifdef RTE_LIBRTE_DPAA_DEBUG_BUS
 #define pr_debug(fmt, args...) printf(fmt, ##args)
 #else
 #define pr_debug(fmt, args...) {}
 #endif
+#endif
 
 #define DPAA_BUG_ON(x) RTE_ASSERT(x)
 
@@ -256,7 +262,9 @@ __bswap_24(uint32_t x)
 #define be16_to_cpu(x) rte_be_to_cpu_16(x)
 
 #define cpu_to_be64(x) rte_cpu_to_be_64(x)
+#if !defined(cpu_to_be32)
 #define cpu_to_be32(x) rte_cpu_to_be_32(x)
+#endif
 #define cpu_to_be16(x) rte_cpu_to_be_16(x)
 
 #if RTE_BYTE_ORDER == RTE_LITTLE_ENDIAN
-- 
2.7.4
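
The effect of the added #ifndef guards can be seen in a reduced form. In this sketch the macro names are hypothetical and return a prefix string instead of printing, so the outcome is easy to inspect; the first definition plays the role of an application or another header that defined the macro before compat.h was included.

```c
#include <assert.h>

/* Pretend something included earlier already defined this macro. */
#define sketch_pr_err_prefix() "APP-ERR:"

/* compat.h-style guarded definitions: only fill in what is still missing. */
#ifndef sketch_pr_err_prefix
#define sketch_pr_err_prefix() "ERR:"
#endif
#ifndef sketch_pr_warn_prefix
#define sketch_pr_warn_prefix() "WARN:"
#endif

/* The earlier definition wins for pr_err; the fallback is used for pr_warn. */
static const char *sketch_err_prefix(void)
{
	return sketch_pr_err_prefix();
}

static const char *sketch_warn_prefix(void)
{
	return sketch_pr_warn_prefix();
}
```

Without the guards the second #define of an already defined macro with a different body would be diagnosed as a redefinition by the compiler, which is presumably the situation this patch is meant to avoid.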



[dpdk-dev] [PATCH 13/13] mempool/dpaa: change the debug log level to DP

2018-08-29 Thread Hemant Agrawal
Signed-off-by: Hemant Agrawal 
---
 drivers/mempool/dpaa/dpaa_mempool.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/mempool/dpaa/dpaa_mempool.c b/drivers/mempool/dpaa/dpaa_mempool.c
index 10c536b..1c12122 100644
--- a/drivers/mempool/dpaa/dpaa_mempool.c
+++ b/drivers/mempool/dpaa/dpaa_mempool.c
@@ -122,7 +122,7 @@ dpaa_buf_free(struct dpaa_bp_info *bp_info, uint64_t addr)
struct bm_buffer buf;
int ret;
 
-   DPAA_MEMPOOL_DEBUG("Free 0x%" PRIx64 " to bpid: %d",
+   DPAA_MEMPOOL_DPDEBUG("Free 0x%" PRIx64 " to bpid: %d",
   addr, bp_info->bpid);
 
bm_buffer_set64(&buf, addr);
-- 
2.7.4



Re: [dpdk-dev] 16.11.8 (LTS) patches review and test

2018-08-29 Thread Marco Varlese
Hi Luca,

On Wed, 2018-08-29 at 10:55 +0100, Luca Boccassi wrote:
> On Mon, 2018-08-27 at 17:17 +0100, Luca Boccassi wrote:
> > On Thu, 2018-08-23 at 09:55 +0100, Luca Boccassi wrote:
> > > On Mon, 2018-08-13 at 19:21 +0100, luca.bocca...@gmail.com wrote:
> > > > Hi all,
> > > > 
> > > > Here is a list of patches targeted for LTS release 16.11.8.
> > > > Please
> > > > help review and test. The planned date for the final release is
> > > > August
> > > > the 23rd.
> > > > Before that, please shout if anyone has objections with these
> > > > patches being applied.
> > > > 
> > > > Also for the companies committed to running regression tests,
> > > > please run the tests and report any issue before the release
> > > > date.
> > > > 
> > > > A release candidate tarball can be found at:
> > > > 
> > > > https://dpdk.org/browse/dpdk-stable/tag/?id=v16.11.8-rc1
> > > > 
> > > > These patches are located at branch 16.11 of dpdk-stable repo:
> > > > https://dpdk.org/browse/dpdk-stable/
> > > > 
> > > > Thanks.
> > > > 
> > > > Luca Boccassi
> > > 
> > > Hi,
> > > 
> > > Regression tests from Intel have highlighted a possible issue with
> > > the
> > > changes (unidentified as of now), so while investigation is in
> > > progress
> > > we decided to postpone the release to Monday the 27th to be on the
> > > safe
> > > side.
> > > Apologies for any issues this might cause.
> > 
> > Hi,
> > 
> > Unfortunately triaging is still in progress, so it's better to
> > postpone
> > again, to Wednesday the 29th of August.
> > Apologies again for any issues due to this delay.
> 
> Hello all,
> 
> I've pushed an -rc2 with the following additional changes:
> 
> Luca Boccassi (1):
>   Revert "net/i40e: fix packet count for PF"
> 
> Radu Nicolau (3):
>   net/null: add MAC address setting fake operation
>   test/virtual_pmd: add MAC address setting fake op
>   test/bonding: assign non-zero MAC to null devices
> 
> Radu, I cherry-picked the following 3 patches that you got merged in
> 18.02 as they are necessary to fix bonding regression tests from Intel:
> 
> c5ac7748fd6bfd86b6fb4432b6792733cf32c94c
> c23fc36284e26fca9b52641118ad76a4da99d7af
> e8df563bac263e55b7dd9d45a00417aa92ef66cb
> 
> Qi, I have reverted the following patch that was backported to 16.11.4
> as it breaks a Fortville regression test from Intel:
> 
> 4bf705a7d74b0b4c1d82ad0821c43e32be15a5e5.
> 
> Marco, is there any chance you've got time today to re-run your tests?
> These changes in rc2 have been blessed by Intel and AT&T, so if it
> works for you as well I can then release later tonight.
Sure; give me a few hours and I will let you know how it goes.
> 
> A release candidate tarball can be found at:
> 
> https://dpdk.org/browse/dpdk-stable/tag/?id=v16.11.8-rc2
> 
> These patches are located at branch 16.11 of dpdk-stable repo:
> https://dpdk.org/browse/dpdk-stable/
> 
> Thanks!
> 


Re: [dpdk-dev] [PATCH v2 02/10] kni: separate releasing netdev from freeing KNI interface

2018-08-29 Thread Ferruh Yigit
On 6/29/2018 2:55 AM, Dan Gora wrote:
> Currently the rte_kni kernel driver suffers from a problem where
> when the interface is released, it generates a callback to the DPDK
> application to change the interface state to Down.  However, after the
> DPDK application handles the callback and generates a response back to
> the kernel, the rte_kni driver cannot wake the thread which is asleep
> waiting for the response, because it is holding the kni_link_lock
> semaphore and it has already removed the 'struct kni_dev' from the
> list of interfaces to poll for responses.
> 
> This means that if the KNI interface is in the Up state when
> rte_kni_release() is called, it will always sleep for three seconds
> until kni_net_release gives up waiting for a response from the DPDK
> application.
> 
> To fix this, we must separate the step to release the kernel network
> interface from the steps to remove the KNI interface from the list
> of interfaces to poll.
> 
> When the kernel network interface is removed with unregister_netdev(),
> if the interface is up, it will generate a callback to mark the
> interface down, which calls kni_net_release().  kni_net_release() will
> block waiting for the DPDK application to call rte_kni_handle_request()
> to handle the callback, but it also needs the thread in the KNI driver
> (either the per-dev thread for multi-thread or the per-driver thread)
> to call kni_net_poll_resp() in order to wake the thread sleeping in
> kni_net_release (actually kni_net_process_request()).
> 
> So now, KNI interfaces should be removed as such:
> 
> 1) The user calls rte_kni_release().  This only unregisters the
> netdev in the kernel, but touches nothing else.  This allows all the
> threads to run which are necessary to handle the callback into the
> DPDK application to mark the interface down.
> 
> 2) The user stops the thread running rte_kni_handle_request().
> After rte_kni_release() has been called, there will be no more
> callbacks for that interface so it is not necessary.  It cannot be
> running at the same time that rte_kni_free() frees all of the FIFOs
> and DPDK memory for that KNI interface.
> 
> 3) The user calls rte_kni_free().  This performs the RTE_KNI_IOCTL_FREE
> ioctl which calls kni_ioctl_free().  This function removes the struct
> kni_dev from the list of interfaces to poll (and kills the per-dev
> kthread, if configured for multi-thread), then frees the memory in
> the FIFOs.
> 
> Signed-off-by: Dan Gora 


You are right, that problem exists.
However, I don't see a problem related to holding the kni_list_lock; the
problem is caused by the polling thread being terminated before the interface
is unregistered.

And there is a reason to terminate the polling thread first: it uses device
resources.

Separating the unregister and free steps looks good, but I am not sure this
should be exposed to the user through a new ioctl and API.
When the user is done with an interface, it calls rte_kni_release() to release
it. Does the user really need an rte_kni_free() API, or need to know the
difference between the two? Is there any action to take in userspace between
these two APIs? I think not.

What about keeping a single rte_kni_release() API and solving the issue
internally in KNI?

Previously it was doing:
- Stop threads (also there is another single/multi thread error [1])
- kni_dev_remove()
- unregister and free netdev() [2]
- kni_net_release_fifo_phy() [3]

Instead internally can we do:
a- Unregister kernel interfaces, rte_kni_unregister()?
b- stop threads
c- kni_net_release_fifo_phy
d- free netdev

The challenge I can see is that some time is required between a) and b) to let
the userspace app respond; we need a way to know the response was received
before stopping the thread.

Another thing is that there are two release paths, kni_release() and
kni_ioctl_release(); both should be fixed.



[1]
If multi-thread mode is enabled the threads have been stopped, but in
single-thread mode the thread has not been stopped; that is why you don't see
the 3 second delay in the default single-thread case. However, removing the
interface without stopping the polling thread is wrong.

[2]
Unregistering the netdev will trigger a userspace request, but the response
won't be read because the polling thread also polls the response queue, and
that thread is already stopped at this stage.

[3]
This is also wrong, as you pointed out in a later patch in your series:
kni_net_release_fifo_phy() moves packets from the rxq/alloc queue to the free
queue. The queues are still allocated, but the references kept in the kernel
may be invalid at this stage because of free_netdev().



Re: [dpdk-dev] [PATCH v2 10/10] kni: add API to set link status on kernel interface

2018-08-29 Thread Ferruh Yigit
On 6/29/2018 2:55 AM, Dan Gora wrote:
> Add a new API function to KNI, rte_kni_update_link() to allow DPDK
> applications to update the link state for the KNI network interfaces
> in the linux kernel.
> 
> Note that the default carrier state is set to off when the interface
> is opened.

Why set the carrier off when the interface is opened? I don't see any
difference in interface state with or without this call.

> 
> Signed-off-by: Dan Gora 

Overall looks good to me.



[dpdk-dev] [PATCH 1/7] linuxapp: build with _GNU_SOURCE defined by default

2018-08-29 Thread Anatoly Burakov
We use _GNU_SOURCE all over the place, but often times we miss
defining it, resulting in broken builds on musl. Rather than
fixing every library's and driver's and application's makefile,
fix it by simply defining _GNU_SOURCE by default for all
Linuxapp builds.

Signed-off-by: Anatoly Burakov 
---
 app/meson.build  |  9 -
 drivers/bus/pci/linux/Makefile   |  2 --
 drivers/meson.build  |  6 ++
 drivers/net/softnic/conn.c   |  1 -
 examples/meson.build |  6 ++
 lib/librte_eal/linuxapp/eal/Makefile | 16 
 lib/meson.build  |  6 ++
 mk/exec-env/linuxapp/rte.vars.mk |  2 ++
 test/test/meson.build|  5 +
 9 files changed, 33 insertions(+), 20 deletions(-)

diff --git a/app/meson.build b/app/meson.build
index 99e0b93ec..c9a52a22b 100644
--- a/app/meson.build
+++ b/app/meson.build
@@ -11,13 +11,20 @@ apps = ['pdump',
 # for BSD only
 lib_execinfo = cc.find_library('execinfo', required: false)
 
+default_cflags = machine_args
+
+# on Linux, specify -D_GNU_SOURCE unconditionally
+if host_machine.system() == 'linux'
+   default_cflags += '-D_GNU_SOURCE'
+endif
+
 foreach app:apps
build = true
name = app
allow_experimental_apis = false
sources = []
includes = []
-   cflags = machine_args
+   cflags = default_cflags
objs = [] # other object files to link against, used e.g. for
  # instruction-set optimized versions of code
 
diff --git a/drivers/bus/pci/linux/Makefile b/drivers/bus/pci/linux/Makefile
index 96ea1d540..90404468b 100644
--- a/drivers/bus/pci/linux/Makefile
+++ b/drivers/bus/pci/linux/Makefile
@@ -4,5 +4,3 @@
 SRCS += pci.c
 SRCS += pci_uio.c
 SRCS += pci_vfio.c
-
-CFLAGS += -D_GNU_SOURCE
diff --git a/drivers/meson.build b/drivers/meson.build
index f94e2fe67..7ff97ef4e 100644
--- a/drivers/meson.build
+++ b/drivers/meson.build
@@ -15,6 +15,12 @@ default_cflags = machine_args
 if cc.has_argument('-Wno-format-truncation')
default_cflags += '-Wno-format-truncation'
 endif
+
+# on Linux, specify -D_GNU_SOURCE unconditionally
+if host_machine.system() == 'linux'
+   default_cflags += '-D_GNU_SOURCE'
+endif
+
 foreach class:driver_classes
drivers = []
std_deps = []
diff --git a/drivers/net/softnic/conn.c b/drivers/net/softnic/conn.c
index 990cf40fc..8b6658088 100644
--- a/drivers/net/softnic/conn.c
+++ b/drivers/net/softnic/conn.c
@@ -8,7 +8,6 @@
 #include 
 #include 
 
-#define __USE_GNU
 #include 
 
 #include 
diff --git a/examples/meson.build b/examples/meson.build
index 4ee7a1114..70c22eb62 100644
--- a/examples/meson.build
+++ b/examples/meson.build
@@ -22,6 +22,12 @@ default_cflags = machine_args
 if cc.has_argument('-Wno-format-truncation')
default_cflags += '-Wno-format-truncation'
 endif
+
+# on Linux, specify -D_GNU_SOURCE unconditionally
+if host_machine.system() == 'linux'
+   default_cflags += '-D_GNU_SOURCE'
+endif
+
 foreach example: examples
name = example
build = true
diff --git a/lib/librte_eal/linuxapp/eal/Makefile 
b/lib/librte_eal/linuxapp/eal/Makefile
index fd92c75c2..bfee453bc 100644
--- a/lib/librte_eal/linuxapp/eal/Makefile
+++ b/lib/librte_eal/linuxapp/eal/Makefile
@@ -85,22 +85,6 @@ SRCS-y += rte_cycles.c
 
 CFLAGS_eal_common_cpuflags.o := $(CPUFLAGS_LIST)
 
-CFLAGS_eal.o := -D_GNU_SOURCE
-CFLAGS_eal_interrupts.o := -D_GNU_SOURCE
-CFLAGS_eal_vfio_mp_sync.o := -D_GNU_SOURCE
-CFLAGS_eal_timer.o := -D_GNU_SOURCE
-CFLAGS_eal_lcore.o := -D_GNU_SOURCE
-CFLAGS_eal_memalloc.o := -D_GNU_SOURCE
-CFLAGS_eal_thread.o := -D_GNU_SOURCE
-CFLAGS_eal_log.o := -D_GNU_SOURCE
-CFLAGS_eal_common_log.o := -D_GNU_SOURCE
-CFLAGS_eal_hugepage_info.o := -D_GNU_SOURCE
-CFLAGS_eal_common_whitelist.o := -D_GNU_SOURCE
-CFLAGS_eal_common_options.o := -D_GNU_SOURCE
-CFLAGS_eal_common_thread.o := -D_GNU_SOURCE
-CFLAGS_eal_common_lcore.o := -D_GNU_SOURCE
-CFLAGS_rte_cycles.o := -D_GNU_SOURCE
-
 # workaround for a gcc bug with noreturn attribute
 # http://gcc.gnu.org/bugzilla/show_bug.cgi?id=12603
 ifeq ($(CONFIG_RTE_TOOLCHAIN_GCC),y)
diff --git a/lib/meson.build b/lib/meson.build
index eb91f100b..4c1577571 100644
--- a/lib/meson.build
+++ b/lib/meson.build
@@ -30,6 +30,12 @@ default_cflags = machine_args
 if cc.has_argument('-Wno-format-truncation')
default_cflags += '-Wno-format-truncation'
 endif
+
+# on Linux, specify -D_GNU_SOURCE unconditionally
+if host_machine.system() == 'linux'
+   default_cflags += '-D_GNU_SOURCE'
+endif
+
 foreach l:libraries
build = true
name = l
diff --git a/mk/exec-env/linuxapp/rte.vars.mk b/mk/exec-env/linuxapp/rte.vars.mk
index 3129edc8c..91b778fcc 100644
--- a/mk/exec-env/linuxapp/rte.vars.mk
+++ b/mk/exec-env/linuxapp/rte.vars.mk
@@ -17,6 +17,8 @@ else
 EXECENV_CFLAGS  = -pthread
 endif
 
+EXECENV_CFLAGS += -D_GNU_SOURCE
+
 EXECENV_LDLIBS  =
 EXECENV_ASFLAGS =
 
diff --git a/test/test/mes
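
As a rough illustration of why the define matters (this is not part of the
patch): GNU extensions such as the CPU_* affinity macros are only exposed by
the C library headers when _GNU_SOURCE is defined before the first system
header is included. The sketch below guards the define so it also compiles
when -D_GNU_SOURCE is already passed on the command line, as this patch does
for Linux builds. The helper name allowed_cpu_count() is made up for the
example.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* must come before every system header */
#endif
#include <assert.h>
#include <sched.h>

/*
 * Hypothetical helper: count the CPUs the calling thread may run on.
 * Returns -1 if the affinity mask cannot be read.
 */
static int allowed_cpu_count(void)
{
	cpu_set_t set;
	int n;

	CPU_ZERO(&set);
	if (sched_getaffinity(0, sizeof(set), &set) != 0)
		return -1;
	n = CPU_COUNT(&set);
	assert(n > 0); /* a successful call reports at least one CPU */
	return n;
}
```

Without the define at the top (or on the command line), cpu_set_t and the
CPU_* macros may not be declared, which is the kind of build break the patch
is meant to avoid.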

[dpdk-dev] [PATCH 3/7] fbarray: improve musl compatibility

2018-08-29 Thread Anatoly Burakov
When built against musl, fcntl.h doesn't silently get included.
Fix by including it explicitly.

Bugzilla ID: 34

Signed-off-by: Anatoly Burakov 
---
 lib/librte_eal/common/eal_common_fbarray.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/lib/librte_eal/common/eal_common_fbarray.c 
b/lib/librte_eal/common/eal_common_fbarray.c
index 43caf3ced..6f0169a5c 100644
--- a/lib/librte_eal/common/eal_common_fbarray.c
+++ b/lib/librte_eal/common/eal_common_fbarray.c
@@ -2,6 +2,7 @@
  * Copyright(c) 2017-2018 Intel Corporation
  */
 
+#include 
 #include 
 #include 
 #include 
-- 
2.17.1
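
A minimal sketch of the underlying issue, not taken from the patch: open()
and the O_* flags are declared in fcntl.h, and code that relies on another
header pulling it in indirectly can stop compiling on a different C library.
Including it explicitly is enough. The helper name can_open_readonly() is an
assumption made for the example.

```c
#include <fcntl.h>  /* open() and O_RDONLY; do not rely on indirect inclusion */
#include <unistd.h> /* close() */

/* Hypothetical helper: return 1 if path can be opened read-only, else 0. */
static int can_open_readonly(const char *path)
{
	int fd = open(path, O_RDONLY);

	if (fd < 0)
		return 0;
	close(fd);
	return 1;
}
```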


[dpdk-dev] [PATCH 5/7] mem: improve musl compatibility

2018-08-29 Thread Anatoly Burakov
When built against musl, fcntl.h doesn't silently get included.
Fix by including it explicitly.

Bugzilla ID: 31

Signed-off-by: Anatoly Burakov 
---
 lib/librte_eal/common/eal_common_memory.c | 1 +
 lib/librte_eal/linuxapp/eal/eal_memory.c  | 1 +
 2 files changed, 2 insertions(+)

diff --git a/lib/librte_eal/common/eal_common_memory.c 
b/lib/librte_eal/common/eal_common_memory.c
index fbfb1b055..b061f7528 100644
--- a/lib/librte_eal/common/eal_common_memory.c
+++ b/lib/librte_eal/common/eal_common_memory.c
@@ -2,6 +2,7 @@
  * Copyright(c) 2010-2014 Intel Corporation
  */
 
+#include 
 #include 
 #include 
 #include 
diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c 
b/lib/librte_eal/linuxapp/eal/eal_memory.c
index dbf19499e..a62665257 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memory.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
@@ -5,6 +5,7 @@
 
 #define _FILE_OFFSET_BITS 64
 #include 
+#include 
 #include 
 #include 
 #include 
-- 
2.17.1


[dpdk-dev] [PATCH 0/7] Improve core EAL musl compatibility

2018-08-29 Thread Anatoly Burakov
This patchset fixes numerous issues with musl compatibility
in the core EAL libraries. It does not fix anything beyond
core EAL (so, PCI driver is still broken, so are a few other
drivers), but it's a good start.

Tested on container with Alpine Linux. Alpine dependencies:

build-base bsd-compat-headers libexecinfo-dev linux-headers numactl-dev

For numactl-dev, testing repository needs to be enabled:

echo "http://dl-cdn.alpinelinux.org/alpine/edge/testing" >> /etc/apk/repositories

If successful (using a very broad definition of "success"),
the build should fail somewhere in PCI bus driver in UIO.

Anatoly Burakov (7):
  linuxapp: build with _GNU_SOURCE defined by default
  pci/vfio: improve musl compatibility
  fbarray: improve musl compatibility
  eal/hugepage_info: improve musl compatibility
  mem: improve musl compatibility
  string_fns: improve musl compatibility
  eal: improve musl compatibility

 app/meson.build |  9 -
 drivers/bus/pci/linux/Makefile  |  2 --
 drivers/bus/pci/linux/pci_vfio.c|  8 
 drivers/meson.build |  6 ++
 drivers/net/softnic/conn.c  |  1 -
 examples/meson.build|  6 ++
 lib/librte_eal/common/eal_common_fbarray.c  |  1 +
 lib/librte_eal/common/eal_common_memory.c   |  1 +
 lib/librte_eal/common/include/rte_string_fns.h  |  1 +
 lib/librte_eal/linuxapp/eal/Makefile| 16 
 lib/librte_eal/linuxapp/eal/eal.c   |  5 +++--
 lib/librte_eal/linuxapp/eal/eal_hugepage_info.c |  1 +
 lib/librte_eal/linuxapp/eal/eal_memory.c|  1 +
 lib/librte_eal/linuxapp/eal/eal_thread.c|  5 +++--
 lib/meson.build |  6 ++
 mk/exec-env/linuxapp/rte.vars.mk|  2 ++
 test/test/meson.build   |  5 +
 17 files changed, 48 insertions(+), 28 deletions(-)

-- 
2.17.1


[dpdk-dev] [PATCH 6/7] string_fns: improve musl compatibility

2018-08-29 Thread Anatoly Burakov
Musl wraps various string functions such as strlcpy in order to
harden them. However, the fortify wrappers are included without
including the actual string functions being wrapped, which
throws missing definition compile errors. Fix by including
string.h in string functions header.

Signed-off-by: Anatoly Burakov 
---
 lib/librte_eal/common/include/rte_string_fns.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/lib/librte_eal/common/include/rte_string_fns.h 
b/lib/librte_eal/common/include/rte_string_fns.h
index 97597a148..ffeed2cd2 100644
--- a/lib/librte_eal/common/include/rte_string_fns.h
+++ b/lib/librte_eal/common/include/rte_string_fns.h
@@ -16,6 +16,7 @@ extern "C" {
 #endif
 
 #include 
+#include 
 
 /**
  * Takes string "string" parameter and splits it at character "delim"
-- 
2.17.1


[dpdk-dev] [PATCH 2/7] pci/vfio: improve musl compatibility

2018-08-29 Thread Anatoly Burakov
Musl already has PAGE_SIZE defined, and our define clashed with it.
Rename our define to SYS_PAGE_SIZE.

Bugzilla ID: 36

Signed-off-by: Anatoly Burakov 
---
 drivers/bus/pci/linux/pci_vfio.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/bus/pci/linux/pci_vfio.c b/drivers/bus/pci/linux/pci_vfio.c
index 686386d6a..88bcfb88b 100644
--- a/drivers/bus/pci/linux/pci_vfio.c
+++ b/drivers/bus/pci/linux/pci_vfio.c
@@ -35,8 +35,8 @@
 
 #ifdef VFIO_PRESENT
 
-#define PAGE_SIZE   (sysconf(_SC_PAGESIZE))
-#define PAGE_MASK   (~(PAGE_SIZE - 1))
+#define SYS_PAGE_SIZE   (sysconf(_SC_PAGESIZE))
+#define SYS_PAGE_MASK   (~(SYS_PAGE_SIZE - 1))
 
 static struct rte_tailq_elem rte_vfio_tailq = {
.name = "VFIO_RESOURCE_LIST",
@@ -344,8 +344,8 @@ pci_vfio_mmap_bar(int vfio_dev_fd, struct 
mapped_pci_resource *vfio_res,
 */
uint32_t table_start = msix_table->offset;
uint32_t table_end = table_start + msix_table->size;
-   table_end = (table_end + ~PAGE_MASK) & PAGE_MASK;
-   table_start &= PAGE_MASK;
+   table_end = (table_end + ~SYS_PAGE_MASK) & SYS_PAGE_MASK;
+   table_start &= SYS_PAGE_MASK;
 
if (table_start == 0 && table_end >= bar->size) {
/* Cannot map this BAR */
-- 
2.17.1
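
For readers unfamiliar with the mask arithmetic touched by this rename, here
is a small standalone sketch (not part of the patch) of the same rounding
expressions. It assumes the page size is a power of two; the function names
are made up for the example.

```c
#include <stdint.h>
#include <unistd.h>

/* Round addr down to the start of its page (page_size must be a power of 2). */
static uint64_t page_floor(uint64_t addr, uint64_t page_size)
{
	return addr & ~(page_size - 1);
}

/* Round addr up to the next page boundary; aligned values are unchanged. */
static uint64_t page_ceil(uint64_t addr, uint64_t page_size)
{
	uint64_t mask = ~(page_size - 1);

	/* same shape as "(table_end + ~MASK) & MASK" in pci_vfio.c */
	return (addr + ~mask) & mask;
}

/* Query the runtime page size, as the SYS_PAGE_SIZE macro does. */
static uint64_t sys_page_size(void)
{
	return (uint64_t)sysconf(_SC_PAGESIZE);
}
```

With a 4096-byte page, an offset of 0x1234 rounds down to 0x1000 and up to
0x2000, which is how the MSI-X table range is widened to whole pages.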


[dpdk-dev] [PATCH 4/7] eal/hugepage_info: improve musl compatibility

2018-08-29 Thread Anatoly Burakov
When built against musl, fcntl.h doesn't silently get included.
Fix by including it explicitly.

Bugzilla ID: 33

Signed-off-by: Anatoly Burakov 
---
 lib/librte_eal/linuxapp/eal/eal_hugepage_info.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c 
b/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c
index 3a7d4b222..0eab1cf71 100644
--- a/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c
+++ b/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c
@@ -6,6 +6,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
-- 
2.17.1


[dpdk-dev] [PATCH 7/7] eal: improve musl compatibility

2018-08-29 Thread Anatoly Burakov
Musl complains about pthread id being of wrong size. Fix it by
casting to 64-bit and printing 64-bit hex unconditionally.

Signed-off-by: Anatoly Burakov 
---
 lib/librte_eal/linuxapp/eal/eal.c| 5 +++--
 lib/librte_eal/linuxapp/eal/eal_thread.c | 5 +++--
 2 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal.c 
b/lib/librte_eal/linuxapp/eal/eal.c
index e59ac6577..abd61d346 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -3,6 +3,7 @@
  * Copyright(c) 2012-2014 6WIND S.A.
  */
 
+#include 
 #include 
 #include 
 #include 
@@ -979,8 +980,8 @@ rte_eal_init(int argc, char **argv)
 
ret = eal_thread_dump_affinity(cpuset, sizeof(cpuset));
 
-   RTE_LOG(DEBUG, EAL, "Master lcore %u is ready (tid=%x;cpuset=[%s%s])\n",
-   rte_config.master_lcore, (int)thread_id, cpuset,
+   RTE_LOG(DEBUG, EAL, "Master lcore %u is ready (tid=%" PRIx64" 
;cpuset=[%s%s])\n",
+   rte_config.master_lcore, (uint64_t)thread_id, cpuset,
ret == 0 ? "" : "...");
 
RTE_LCORE_FOREACH_SLAVE(i) {
diff --git a/lib/librte_eal/linuxapp/eal/eal_thread.c 
b/lib/librte_eal/linuxapp/eal/eal_thread.c
index b496fc711..c818375d9 100644
--- a/lib/librte_eal/linuxapp/eal/eal_thread.c
+++ b/lib/librte_eal/linuxapp/eal/eal_thread.c
@@ -3,6 +3,7 @@
  */
 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -121,8 +122,8 @@ eal_thread_loop(__attribute__((unused)) void *arg)
 
ret = eal_thread_dump_affinity(cpuset, sizeof(cpuset));
 
-   RTE_LOG(DEBUG, EAL, "lcore %u is ready (tid=%x;cpuset=[%s%s])\n",
-   lcore_id, (int)thread_id, cpuset, ret == 0 ? "" : "...");
+   RTE_LOG(DEBUG, EAL, "lcore %u is ready (tid=%" PRIx64 
";cpuset=[%s%s])\n",
+   lcore_id, (uint64_t)thread_id, cpuset, ret == 0 ? "" : "...");
 
/* read on our pipe to get commands */
while (1) {
-- 
2.17.1
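
A standalone sketch of the same idea, under stated assumptions: pthread_t is
an opaque type (an integer on glibc, a pointer on musl), so a fixed-width
cast plus PRIx64 gives one format string that works on both. The extra
uintptr_t step and the helper name format_tid() are additions for this
example, not something the patch does; the cast is only meaningful where
pthread_t is an integer or pointer type.

```c
#include <inttypes.h>
#include <pthread.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: write "tid=<hex>" into buf.
 * Going through uintptr_t avoids a pointer-to-integer size warning on
 * C libraries where pthread_t is a pointer.
 */
static int format_tid(char *buf, size_t len, pthread_t tid)
{
	uint64_t v = (uint64_t)(uintptr_t)tid;

	return snprintf(buf, len, "tid=%" PRIx64, v);
}
```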


Re: [dpdk-dev] [PATCH v2] net/mlx: add meson build support

2018-08-29 Thread Nélio Laranjeiro
On Wed, Aug 29, 2018 at 11:00:54AM +0100, Luca Boccassi wrote:
> On Tue, 2018-08-28 at 16:45 +0100, Bruce Richardson wrote:
> > Thanks for this, comments inline below.
> > 
> > /Bruce
> > 
> > On Mon, Aug 27, 2018 at 02:42:25PM +0200, Nelio Laranjeiro wrote:
> > > Mellanox drivers remain un-compiled by default due to third party
> > > libraries dependencies.  They can be enabled through:
> > > - enable_driver_mlx{4,5}=true or
> > > - enable_driver_mlx{4,5}_glue=true
> > > depending on the needs.
> > 
> > The big reason why we wanted a new build system was to move away from
> > this sort of static configuration. Instead, detect if the requirements
> > are present and build the driver if you can.
> > 
> > > 
> > > To avoid modifying the whole sources and keep the compatibility with
> > > current build systems (e.g. make), the mlx{4,5}_autoconf.h is still
> > > generated by invoking DPDK scripts through meson's run_command()
> > > instead of using has_types, has_members, ... commands.
> > > 
> > > Meson will try to find the required external libraries.  When they
> > > are not installed system wide, they can be provided through CFLAGS,
> > > LDFLAGS and LD_LIBRARY_PATH environment variables, example
> > > (considering RDMA-Core is installed in /tmp/rdma-core):
> > > 
> > >  # CFLAGS=-I/tmp/rdma-core/build/include \
> > >    LDFLAGS=-L/tmp/rdma-core/build/lib \
> > >    LD_LIBRARY_PATH=/tmp/rdma-core/build/lib \
> > >    meson -Denable_driver_mlx4=true output
> > > 
> > >  # CFLAGS=-I/tmp/rdma-core/build/include \
> > >    LDFLAGS=-L/tmp/rdma-core/build/lib \
> > >    LD_LIBRARY_PATH=/tmp/rdma-core/build/lib \
> > >    ninja -C output install
> > 
> > Once the CFLAGS/LDFLAGS are passed to meson, they should not be needed
> > for ninja. The LD_LIBRARY_PATH might be - I'm not sure about that one! :-)
> > 
> > > 
> > > Signed-off-by: Nelio Laranjeiro 
> > > 
> > > ---
> > > 
> > > Changes in v2:
> > > 
> > > - dropped patch https://patches.dpdk.org/patch/43897/
> > > - remove extra_{cflags,ldflags} as already honored by meson through
> > > environment variables.
> > > ---
> > >  drivers/net/meson.build  |   2 +
> > >  drivers/net/mlx4/meson.build |  94 ++
> > >  drivers/net/mlx5/meson.build | 545
> > > +++
> > >  meson_options.txt|   8 +
> > >  4 files changed, 649 insertions(+)
> > >  create mode 100644 drivers/net/mlx4/meson.build
> > >  create mode 100644 drivers/net/mlx5/meson.build
> > > 
> > > diff --git a/drivers/net/meson.build b/drivers/net/meson.build
> > > index 9c28ed4da..c7a2d0e7d 100644
> > > --- a/drivers/net/meson.build
> > > +++ b/drivers/net/meson.build
> > > @@ -18,6 +18,8 @@ drivers = ['af_packet',
> > >   'ixgbe',
> > >   'kni',
> > >   'liquidio',
> > > + 'mlx4',
> > > + 'mlx5',
> > >   'mvpp2',
> > >   'netvsc',
> > >   'nfp',
> > > diff --git a/drivers/net/mlx4/meson.build
> > > b/drivers/net/mlx4/meson.build
> > > new file mode 100644
> > > index 0..debaca5b6
> > > --- /dev/null
> > > +++ b/drivers/net/mlx4/meson.build
> > > @@ -0,0 +1,94 @@
> > > +# SPDX-License-Identifier: BSD-3-Clause
> > > +# Copyright 2018 6WIND S.A.
> > > +# Copyright 2018 Mellanox Technologies, Ltd
> > > +
> > > +# As there is no more configuration file to activate/configure the
> > > PMD it will
> > > +# use some variables here to configure it.
> > > +pmd_dlopen = get_option('enable_driver_mlx4_glue')
> > > +build = get_option('enable_driver_mlx4') or pmd_dlopen
> > 
> > As stated above, I believe this should be based upon whether you find
> > the "mnl", "mlx4" and "ibverbs" libraries. If we start adding back in
> > static options for every driver, then we'll be back to having a mass of
> > config options like we had before.
> 
> BTW, slightly related to that: ibverbs doesn't ship pkg-config files at
> the moment, which makes the detection slightly more awkward than it
> could be, so I've sent a PR upstream to add that:
> 
> https://github.com/linux-rdma/rdma-core/pull/373
> 
> Hope this can be useful!

Thanks Luca, I was also searching for it; you saved me some time. I hope
this can be backported down to RDMA-Core's stable version v15, which would
fully help.

-- 
Nélio Laranjeiro
6WIND


Re: [dpdk-dev] [RFC v2] ethdev: support metadata as flow rule criteria

2018-08-29 Thread Somnath Kotur
Hi Dekel,
Could you please show, with an example, what the corresponding
'flow create' command will look like in testpmd?
Also, I'm guessing you would need to change the cmdline_parser logic in
the testpmd application as well to recognize this new rte_flow_item?

Thanks
Som

On Wed, Aug 29, 2018 at 12:03 PM, Dekel Peled  wrote:

>
>
> > -Original Message-
> > From: Yongseok Koh
> > Sent: Tuesday, August 28, 2018 10:44 PM
> > To: Dekel Peled 
> > Cc: dev ; Shahaf Shuler ; Ori Kam
> > ; Andrew Rybchenko ;
> > Yigit, Ferruh ; Thomas Monjalon
> > ; Ananyev, Konstantin
> > ; Adrien Mazarguil
> > ; Olivier Matz ;
> > Alex Rosenbaum 
> > Subject: Re: [RFC v2] ethdev: support metadata as flow rule criteria
> >
> > > On Aug 26, 2018, at 7:09 AM, Dekel Peled  wrote:
> > >
> > > Current implementation of rte_flow allows match pattern of flow rule,
> > > based on packet data or header fields.
> > > This limits the application use of match patterns.
> > >
> > > For example, consider a vswitch application which controls a set of
> > > VMs, connected with virtio, in a fabric with overlay of VXLAN.
> > > Several VMs can have the same inner tuple, while the outer tuple is
> > > different and controlled by the vswitch (encap action).
> > > For the vswitch to be able to offload the rule to the NIC, it must use
> > > a unique match criteria, independent from the inner tuple, to perform
> > > the encap action.
> > >
> > > This RFC adds support for additional metadata to use as match pattern.
> > > The metadata is an opaque item, fully controlled by the application.
> > >
> > > The use of metadata is relevant for egress rules only.
> > > It can be set in the flow rule using the RTE_FLOW_ITEM_META.
> > >
> > > In order to avoid change in mbuf API, existing field mbuf.hash.fdir.hi
> > > will be used to carry the metadata item. This field is used only in
> > > ingress packets, so using it for egress metadata will not cause
> > > conflicts.
> > >
> > > Application should set the packet metadata in the mbuf dedicated field,
> > > and set the PKT_TX_METADATA flag in the mbuf->ol_flags.
> > > The NIC will use the packet metadata as match criteria for relevant
> > > flow rules.
> > >
> > > For example, to do an encap action depending on the VM id, the
> > > application needs to configure 'match on metadata' rte_flow rule with
> > > VM id as metadata, along with desired encap action.
> > > When preparing an egress data packet, application will set VM id data
> > > in mbuf dedicated field, and set PKT_TX_METADATA flag.
> > >
> > > PMD will send data packets to NIC, with VM id as metadata.
> > > Egress flow on NIC will match metadata as done with other criteria.
> > > Upon match on metadata (VM id) the appropriate encap action will be
> > > performed.
> > >
> > > This RFC introduces metadata item type for rte_flow
> > > RTE_FLOW_ITEM_META, along with corresponding struct
> > > rte_flow_item_meta and ol_flag PKT_TX_METADATA.
> > >
> > > Comments are welcome.
> > >
> > > Signed-off-by: Dekel Peled 
> > > ---
> > > v2: Use existing field in mbuf for metadata item, as suggested, instead
> > >of adding a new field.
> > >Metadata item size adjusted to 32 bits.
> > > ---
> > > doc/guides/prog_guide/rte_flow.rst | 21 +
> > > lib/librte_ethdev/rte_flow.c   |  1 +
> > > lib/librte_ethdev/rte_flow.h   | 25 +
> > > lib/librte_mbuf/rte_mbuf.h | 13 +
> > > 4 files changed, 60 insertions(+)
> > >
> > > diff --git a/doc/guides/prog_guide/rte_flow.rst
> > > b/doc/guides/prog_guide/rte_flow.rst
> > > index b305a72..560e45a 100644
> > > --- a/doc/guides/prog_guide/rte_flow.rst
> > > +++ b/doc/guides/prog_guide/rte_flow.rst
> > > @@ -1191,6 +1191,27 @@ Normally preceded by any of:
> > > - `Item: ICMP6_ND_NS`_
> > > - `Item: ICMP6_ND_OPT`_
> > >
> > > +Item: ``META``
> > > +^^^^^^^^^^^^^^
> > > +
> > > +Matches an application specific 32 bit metadata item.
> > > +
> > > +- Default ``mask`` matches any 32 bit value.
> > > +
> > > +.. _table_rte_flow_item_meta:
> > > +
> > > +.. table:: META
> > > +
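> > > +   +----------+----------+---------------------------+
> > > +   | Field    | Subfield | Value                     |
> > > +   +==========+==========+===========================+
> > > +   | ``spec`` | ``data`` | 32 bit metadata value     |
> > > +   +----------+----------+---------------------------+
> > > +   | ``last`` | ``data`` | upper range value         |
> > > +   +----------+----------+---------------------------+
> > > +   | ``mask`` | ``data`` | zeroed to match any value |
> > > +   +----------+----------+---------------------------+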
> > > +
> > > Actions
> > > ~~~~~~~
> > >
> > > diff --git a/lib/librte_ethdev/rte_flow.c
> > > b/lib/librte_ethdev/rte_flow.c index cff4b52..54e5ef8 100644
> > > --- a/lib/librte_ethdev/rte_flow.c
> > > +++ b/lib/librte_ethdev/rte_flow.c
> > > @@ -66,6 +66,7 @@ struct rte_flow_desc_data {
> > >  sizeof(struct rte_flo

[dpdk-dev] [PATCH] mem: fix undefined behavior in NUMA code

2018-08-29 Thread Anatoly Burakov
When NUMA-aware hugepages config option is set, we rely on
libnuma to tell the kernel to allocate hugepages on a specific
NUMA node. However, we allocate node mask before we check if
NUMA is available in the first place, which, according to
the manpage [1], causes undefined behaviour.

Fix by only using nodemask when we have NUMA available.

[1] https://linux.die.net/man/3/numa_alloc_onnode

Bugzilla ID: 20

Fixes: 1b72605d2416 ("mem: balanced allocation of hugepages")
Cc: i.maxim...@samsung.com
Cc: sta...@dpdk.org

Signed-off-by: Anatoly Burakov 
---
 lib/librte_eal/linuxapp/eal/eal_memory.c | 28 ++--
 1 file changed, 16 insertions(+), 12 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c 
b/lib/librte_eal/linuxapp/eal/eal_memory.c
index dbf19499e..4976eeacd 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memory.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
@@ -263,7 +263,7 @@ map_all_hugepages(struct hugepage_file *hugepg_tbl, struct 
hugepage_info *hpi,
int node_id = -1;
int essential_prev = 0;
int oldpolicy;
-   struct bitmask *oldmask = numa_allocate_nodemask();
+   struct bitmask *oldmask = NULL;
bool have_numa = true;
unsigned long maxnode = 0;
 
@@ -275,6 +275,7 @@ map_all_hugepages(struct hugepage_file *hugepg_tbl, struct 
hugepage_info *hpi,
 
if (have_numa) {
RTE_LOG(DEBUG, EAL, "Trying to obtain current memory 
policy.\n");
+   oldmask = numa_allocate_nodemask();
if (get_mempolicy(&oldpolicy, oldmask->maskp,
  oldmask->size + 1, 0, 0) < 0) {
RTE_LOG(ERR, EAL,
@@ -390,19 +391,22 @@ map_all_hugepages(struct hugepage_file *hugepg_tbl, struct hugepage_info *hpi,
 
 out:
 #ifdef RTE_EAL_NUMA_AWARE_HUGEPAGES
-   if (maxnode) {
-   RTE_LOG(DEBUG, EAL,
-   "Restoring previous memory policy: %d\n", oldpolicy);
-   if (oldpolicy == MPOL_DEFAULT) {
-   numa_set_localalloc();
-   } else if (set_mempolicy(oldpolicy, oldmask->maskp,
-oldmask->size + 1) < 0) {
-   RTE_LOG(ERR, EAL, "Failed to restore mempolicy: %s\n",
-   strerror(errno));
-   numa_set_localalloc();
+   if (have_numa) {
+   if (maxnode) {
+   RTE_LOG(DEBUG, EAL,
+   "Restoring previous memory policy: %d\n",
+   oldpolicy);
+   if (oldpolicy == MPOL_DEFAULT) {
+   numa_set_localalloc();
+   } else if (set_mempolicy(oldpolicy, oldmask->maskp,
+oldmask->size + 1) < 0) {
+   RTE_LOG(ERR, EAL, "Failed to restore mempolicy: %s\n",
+   strerror(errno));
+   numa_set_localalloc();
+   }
}
+   numa_free_cpumask(oldmask);
}
-   numa_free_cpumask(oldmask);
 #endif
return i;
 }
-- 
2.17.1


Re: [dpdk-dev] [PATCH v2] net/mlx: add meson build support

2018-08-29 Thread Luca Boccassi
On Wed, 2018-08-29 at 13:59 +0200, Nélio Laranjeiro wrote:
> On Wed, Aug 29, 2018 at 11:00:54AM +0100, Luca Boccassi wrote:
> > On Tue, 2018-08-28 at 16:45 +0100, Bruce Richardson wrote:
> > > Thanks for this, comments inline below.
> > > 
> > > /Bruce
> > > 
> > > On Mon, Aug 27, 2018 at 02:42:25PM +0200, Nelio Laranjeiro wrote:
> > > > Mellanox drivers remains un-compiled by default due to third
> > > > party
> > > > libraries dependencies.  They can be enabled through:
> > > > - enable_driver_mlx{4,5}=true or
> > > > - enable_driver_mlx{4,5}_glue=true
> > > > depending on the needs.
> > > 
> > > The big reason why we wanted a new build system was to move away
> > > from
> > > this
> > > sort of static configuration. Instead, detect if the requirements
> > > as
> > > present and build the driver if you can.
> > > 
> > > > 
> > > > To avoid modifying the whole sources and keep the compatibility
> > > > with
> > > > current build systems (e.g. make), the mlx{4,5}_autoconf.h is
> > > > still
> > > > generated by invoking DPDK scripts though meson's run_command()
> > > > instead
> > > > of using has_types, has_members, ... commands.
> > > > 
> > > > Meson will try to find the required external libraries.  When
> > > > they
> > > > are
> > > > not installed system wide, they can be provided though CFLAGS,
> > > > LDFLAGS
> > > > and LD_LIBRARY_PATH environment variables, example (considering
> > > > RDMA-Core is installed in /tmp/rdma-core):
> > > > 
> > > >  # CLFAGS=-I/tmp/rdma-core/build/include \
> > > >    LDFLAGS=-L/tmp/rdma-core/build/lib \
> > > >    LD_LIBRARY_PATH=/tmp/rdma-core/build/lib \
> > > >    meson -Denable_driver_mlx4=true output
> > > > 
> > > >  # CLFAGS=-I/tmp/rdma-core/build/include \
> > > >    LDFLAGS=-L/tmp/rdma-core/build/lib \
> > > >    LD_LIBRARY_PATH=/tmp/rdma-core/build/lib \
> > > >    ninja -C output install
> > > 
> > > Once the CFLAGS/LDFLAGS are passed to meson, they should not be
> > > needed for
> > > ninja. The LD_LIBRARY_PATH might be - I'm not sure about that
> > > one! :-
> > > )
> > > 
> > > > 
> > > > Signed-off-by: Nelio Laranjeiro 
> > > > 
> > > > ---
> > > > 
> > > > Changes in v2:
> > > > 
> > > > - dropped patch https://patches.dpdk.org/patch/43897/
> > > > - remove extra_{cflags,ldflags} as already honored by meson
> > > > through
> > > > environment variables.
> > > > ---
> > > >  drivers/net/meson.build  |   2 +
> > > >  drivers/net/mlx4/meson.build |  94 ++
> > > >  drivers/net/mlx5/meson.build | 545
> > > > +++
> > > >  meson_options.txt|   8 +
> > > >  4 files changed, 649 insertions(+)
> > > >  create mode 100644 drivers/net/mlx4/meson.build
> > > >  create mode 100644 drivers/net/mlx5/meson.build
> > > > 
> > > > diff --git a/drivers/net/meson.build b/drivers/net/meson.build
> > > > index 9c28ed4da..c7a2d0e7d 100644
> > > > --- a/drivers/net/meson.build
> > > > +++ b/drivers/net/meson.build
> > > > @@ -18,6 +18,8 @@ drivers = ['af_packet',
> > > >     'ixgbe',
> > > >     'kni',
> > > >     'liquidio',
> > > > +   'mlx4',
> > > > +   'mlx5',
> > > >     'mvpp2',
> > > >     'netvsc',
> > > >     'nfp',
> > > > diff --git a/drivers/net/mlx4/meson.build
> > > > b/drivers/net/mlx4/meson.build
> > > > new file mode 100644
> > > > index 0..debaca5b6
> > > > --- /dev/null
> > > > +++ b/drivers/net/mlx4/meson.build
> > > > @@ -0,0 +1,94 @@
> > > > +# SPDX-License-Identifier: BSD-3-Clause
> > > > +# Copyright 2018 6WIND S.A.
> > > > +# Copyright 2018 Mellanox Technologies, Ltd
> > > > +
> > > > +# As there is no more configuration file to activate/configure
> > > > the
> > > > PMD it will
> > > > +# use some variables here to configure it.
> > > > +pmd_dlopen = get_option('enable_driver_mlx4_glue')
> > > > +build = get_option('enable_driver_mlx4') or pmd_dlopen
> > > 
> > > As stated above, I believe this should be based upon whether you
> > > find
> > > the
> > > "mnl", "mlx4" and "ibverbs" libraries. If we start adding back in
> > > static
> > > options for every driver, then we'll be back to having a mass of
> > > config
> > > options like we had before.
> > 
> > BTW, slightly related to that: ibverbs doesn't ship pkg-config
> > files at
> > the moment which makes the detection slightly more awkward that it
> > could be, so I've sent a PR upstream to add that:
> > 
> > https://github.com/linux-rdma/rdma-core/pull/373
> > 
> > Hope this can be useful!
> 
> Thanks Luca, I was also searching for it, you save me some time, I
> hope
> this can be backported down to RDMA-Core's stable version v15 of
> RDMA-Core it would fully help.

With a quick glance at the v15 branch, the CMake files look similar
enough that it should be pretty straightforward to backport.

-- 
Kind regards,
Luca Boccassi


Re: [dpdk-dev] [PATCH 2/7] pci/vfio: improve musl compatibility

2018-08-29 Thread Bruce Richardson
On Wed, Aug 29, 2018 at 12:56:16PM +0100, Anatoly Burakov wrote:
> Musl already has PAGE_SIZE defined, and our define clashed with it.
> Rename our define to SYS_PAGE_SIZE.
> 
> Bugzilla ID: 36
> 
> Signed-off-by: Anatoly Burakov 
> ---

Would it not be easier to just do?

#ifndef PAGE_SIZE
#define PAGE_SIZE ...
#endif
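
Spelled out a bit further, a sketch of that guard. It assumes the libc
header that provides PAGE_SIZE (if there is one) is included before the
check, and the two align helpers are hypothetical names used only to show
the macros in use:

```c
#include <limits.h>	/* some C libraries define PAGE_SIZE here */
#include <stdint.h>
#include <unistd.h>

/* Only provide our own definitions when the C library has none. */
#ifndef PAGE_SIZE
#define PAGE_SIZE   (sysconf(_SC_PAGESIZE))
#endif
#ifndef PAGE_MASK
#define PAGE_MASK   (~(PAGE_SIZE - 1))
#endif

/* Hypothetical helper: round an offset down to a page boundary. */
static inline uint32_t
page_align_down(uint32_t off)
{
	return off & PAGE_MASK;
}

/* Hypothetical helper: round an offset up to a page boundary. */
static inline uint32_t
page_align_up(uint32_t off)
{
	return (off + ~PAGE_MASK) & PAGE_MASK;
}
```

The rest of the file could then keep using PAGE_SIZE/PAGE_MASK unchanged,
whichever definition ends up being picked.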
>  drivers/bus/pci/linux/pci_vfio.c | 8 
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/bus/pci/linux/pci_vfio.c 
> b/drivers/bus/pci/linux/pci_vfio.c
> index 686386d6a..88bcfb88b 100644
> --- a/drivers/bus/pci/linux/pci_vfio.c
> +++ b/drivers/bus/pci/linux/pci_vfio.c
> @@ -35,8 +35,8 @@
>  
>  #ifdef VFIO_PRESENT
>  
> -#define PAGE_SIZE   (sysconf(_SC_PAGESIZE))
> -#define PAGE_MASK   (~(PAGE_SIZE - 1))
> +#define SYS_PAGE_SIZE   (sysconf(_SC_PAGESIZE))
> +#define SYS_PAGE_MASK   (~(SYS_PAGE_SIZE - 1))
>  
>  static struct rte_tailq_elem rte_vfio_tailq = {
>   .name = "VFIO_RESOURCE_LIST",
> @@ -344,8 +344,8 @@ pci_vfio_mmap_bar(int vfio_dev_fd, struct 
> mapped_pci_resource *vfio_res,
>*/
>   uint32_t table_start = msix_table->offset;
>   uint32_t table_end = table_start + msix_table->size;
> - table_end = (table_end + ~PAGE_MASK) & PAGE_MASK;
> - table_start &= PAGE_MASK;
> + table_end = (table_end + ~SYS_PAGE_MASK) & SYS_PAGE_MASK;
> + table_start &= SYS_PAGE_MASK;
>  
>   if (table_start == 0 && table_end >= bar->size) {
>   /* Cannot map this BAR */
> -- 
> 2.17.1


Re: [dpdk-dev] [PATCH 7/7] eal: improve musl compatibility

2018-08-29 Thread Bruce Richardson
On Wed, Aug 29, 2018 at 12:56:21PM +0100, Anatoly Burakov wrote:
> Musl complains about pthread id being of wrong size. Fix it by
> casting to 64-bit and printing 64-bit hex unconditionally.
> 
> Signed-off-by: Anatoly Burakov 
> ---
Given that on linux pthread_t is a pointer type, will this not give other
warnings of casting from pointer to integer of a different type when
compiling 32-bit? For safety I suggest casting to long or uintptr_t
instead, to ensure we always get an int of the right size.
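
A minimal sketch of what that could look like, assuming pthread_t is either
a pointer or an integer type that converts to uintptr_t (not something
POSIX guarantees). format_thread_id() is a hypothetical helper, not an
existing DPDK function:

```c
#include <inttypes.h>
#include <pthread.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: format a thread id as hex without depending on the
 * width of pthread_t. uintptr_t has the size of a pointer on both 32-bit
 * and 64-bit builds, so the cast never changes the size of the value.
 * Assumption: pthread_t is a pointer or an integer type.
 */
static int
format_thread_id(char *buf, size_t len, pthread_t tid)
{
	return snprintf(buf, len, "0x%" PRIxPTR, (uintptr_t)tid);
}
```

PRIxPTR picks the matching printf length modifier, so no format warning
should show up on either word size.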

/Bruce


Re: [dpdk-dev] [PATCH 7/7] eal: improve musl compatibility

2018-08-29 Thread Bruce Richardson
On Wed, Aug 29, 2018 at 01:39:26PM +0100, Bruce Richardson wrote:
> On Wed, Aug 29, 2018 at 12:56:21PM +0100, Anatoly Burakov wrote:
> > Musl complains about pthread id being of wrong size. Fix it by
> > casting to 64-bit and printing 64-bit hex unconditionally.
> > 
> > Signed-off-by: Anatoly Burakov 
> > ---
> Given that on linux pthread_t is a pointer type, will this not give other
> warnings of casting from pointer to integer of a different type when

s/type/size/

> compiling 32-bit? For safety I suggest casting to long or uintptr_t
> instead, to ensure we always get an int of the right size.
> 
> /Bruce


Re: [dpdk-dev] [PATCH v2] net/mlx: add meson build support

2018-08-29 Thread Nélio Laranjeiro
On Wed, Aug 29, 2018 at 11:01:15AM +0100, Bruce Richardson wrote:
> On Wed, Aug 29, 2018 at 11:34:10AM +0200, Nélio Laranjeiro wrote:
> > Hi Bruce,
> > 
> > Thanks for your comments I have address almost all of them in the v3 by
> > doing what you suggest, I still have some comments, please see below,
> > 
> 
> Thanks.
> 
> > On Tue, Aug 28, 2018 at 04:45:00PM +0100, Bruce Richardson wrote:
> > > Thanks for this, comments inline below.
> > > 
> > > /Bruce
> > > 
> > > On Mon, Aug 27, 2018 at 02:42:25PM +0200, Nelio Laranjeiro wrote:
> > > > Mellanox drivers remains un-compiled by default due to third party
> > > > libraries dependencies.  They can be enabled through:
> > > > - enable_driver_mlx{4,5}=true or
> > > > - enable_driver_mlx{4,5}_glue=true
> > > > depending on the needs.
> > > 
> > > The big reason why we wanted a new build system was to move away from this
> > > sort of static configuration. Instead, detect if the requirements as
> > > present and build the driver if you can.
> > 
> > Ok, I am letting only the glue option for both drivers as suggested at
> > the end of your answer.
> > 
> > > > To avoid modifying the whole sources and keep the compatibility with
> > > > current build systems (e.g. make), the mlx{4,5}_autoconf.h is still
> > > > generated by invoking DPDK scripts though meson's run_command() instead
> > > > of using has_types, has_members, ... commands.
> > > > 
> > > > Meson will try to find the required external libraries.  When they are
> > > > not installed system wide, they can be provided though CFLAGS, LDFLAGS
> > > > and LD_LIBRARY_PATH environment variables, example (considering
> > > > RDMA-Core is installed in /tmp/rdma-core):
> > > > 
> > > >  # CLFAGS=-I/tmp/rdma-core/build/include \
> > > >LDFLAGS=-L/tmp/rdma-core/build/lib \
> > > >LD_LIBRARY_PATH=/tmp/rdma-core/build/lib \
> > > >meson -Denable_driver_mlx4=true output
> > > > 
> > > >  # CLFAGS=-I/tmp/rdma-core/build/include \
> > > >LDFLAGS=-L/tmp/rdma-core/build/lib \
> > > >LD_LIBRARY_PATH=/tmp/rdma-core/build/lib \
> > > >ninja -C output install
> > > 
> > > Once the CFLAGS/LDFLAGS are passed to meson, they should not be needed for
> > > ninja. The LD_LIBRARY_PATH might be - I'm not sure about that one! :-)
> > 
> > CFLAGS/LDFLAGS are correctly evaluated and inserted in the build.ninja
> > file, for the LD_LIBRARY_PATH, it is necessary for the run_command stuff
> > generating the mlx*_autoconf.h
> > 
> 
> Just realised there is another issue which you should address. The
> mlx*_autoconf.h files are being written into a source folder rather than
> into the destination folder. This could cause problems if we are enabling
> mlx for cross-compile builds. Perhaps inside the auto-config-h.sh script
> you can check for $MESON_BUILD_ROOT value, and use that (and possibly
> $MESON_SUBDIR) to put the header file in the build directory.

Indeed, I was also searching for a solution. I finally found one that does
not require modifying the shell script: meson.current_build_dir(), which
contains the same path as $MESON_BUILD_ROOT/$MESON_SUBDIR.

> > >[...] 
> > > Rather than having your own separate debug option flag, why not set these
> > > based on the "buildtype" option e.g. if buildtype is set to "debug".
> > > 
> > > > +# To maintain the compatibility with the make build system
> > > > +# mlx4_autoconf.h file is still generated.
> > > > +r = run_command('sh', '../../../buildtools/auto-config-h.sh',
> > > > +'mlx4_autoconf.h',
> > > > +'HAVE_IBV_MLX4_WQE_LSO_SEG',
> > > > +'infiniband/mlx4dv.h',
> > > > +'type', 'struct mlx4_wqe_lso_seg')
> > > > +if r.returncode() != 0
> > > > +error('autoconfiguration fail')
> > > > +endif
> > > 
> > > Just to check that you are ok with this only being run at configure time?
> > > If any changes are made to the inputs, ninja won't pick them up. To have 
> > > it
> > > tracked for input changes, "custom_target" should be used instead of
> > > run_command.
> > 
> > It seems to not be possible to have several custom_target on the same
> > output file has this last is used as the target identifier in ninja.
> > 
> > This limitation is acceptable for now, when meson will be the default
> > build system, then such autoconf can be removed to use meson built-in
> > functions.
> > 
> > > > +endif
> > > > +# Build Glue Library
> > > > +if pmd_dlopen
> > > > +dlopen_name = 'mlx4_glue'
> > > > +dlopen_lib_name = driver_name_fmt.format(dlopen_name)
> > > > +dlopen_so_version = LIB_GLUE_VERSION
> > > > +dlopen_sources = files('mlx4_glue.c')
> > > > +dlopen_install_dir = [ eal_pmd_path + '-glue' ]
> > > > +shared_lib = shared_library(
> > > > +   dlopen_lib_name,
> > > > +   dlopen_sources,
> > > > +   include_directories: global_inc,
> > > > +

Re: [dpdk-dev] [PATCH] mem: fix undefined behavior in NUMA code

2018-08-29 Thread Ilya Maximets
Hi.
Thanks for the fix.
Comments inline.

Best regards, Ilya Maximets.

On 29.08.2018 15:21, Anatoly Burakov wrote:
> When NUMA-aware hugepages config option is set, we rely on
> libnuma to tell the kernel to allocate hugepages on a specific
> NUMA node. However, we allocate node mask before we check if
> NUMA is available in the first place, which, according to
> the manpage [1], causes undefined behaviour.
> 
> Fix by only using nodemask when we have NUMA available.
> 
> [1] https://linux.die.net/man/3/numa_alloc_onnode
> 
> Bugzilla ID: 20
> 
> Fixes: 1b72605d2416 ("mem: balanced allocation of hugepages")
> Cc: i.maxim...@samsung.com
> Cc: sta...@dpdk.org
> 
> Signed-off-by: Anatoly Burakov 
> ---
>  lib/librte_eal/linuxapp/eal/eal_memory.c | 28 ++--
>  1 file changed, 16 insertions(+), 12 deletions(-)
> 
> diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c 
> b/lib/librte_eal/linuxapp/eal/eal_memory.c
> index dbf19499e..4976eeacd 100644
> --- a/lib/librte_eal/linuxapp/eal/eal_memory.c
> +++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
> @@ -263,7 +263,7 @@ map_all_hugepages(struct hugepage_file *hugepg_tbl, 
> struct hugepage_info *hpi,
>   int node_id = -1;
>   int essential_prev = 0;
>   int oldpolicy;
> - struct bitmask *oldmask = numa_allocate_nodemask();
> + struct bitmask *oldmask = NULL;
>   bool have_numa = true;
>   unsigned long maxnode = 0;
>  
> @@ -275,6 +275,7 @@ map_all_hugepages(struct hugepage_file *hugepg_tbl, 
> struct hugepage_info *hpi,
>  
>   if (have_numa) {
>   RTE_LOG(DEBUG, EAL, "Trying to obtain current memory 
> policy.\n");
> + oldmask = numa_allocate_nodemask();
>   if (get_mempolicy(&oldpolicy, oldmask->maskp,
> oldmask->size + 1, 0, 0) < 0) {
>   RTE_LOG(ERR, EAL,
> @@ -390,19 +391,22 @@ map_all_hugepages(struct hugepage_file *hugepg_tbl, 
> struct hugepage_info *hpi,
>  
>  out:
>  #ifdef RTE_EAL_NUMA_AWARE_HUGEPAGES
> - if (maxnode) {
> - RTE_LOG(DEBUG, EAL,
> - "Restoring previous memory policy: %d\n", oldpolicy);
> - if (oldpolicy == MPOL_DEFAULT) {
> - numa_set_localalloc();
> - } else if (set_mempolicy(oldpolicy, oldmask->maskp,
> -  oldmask->size + 1) < 0) {
> - RTE_LOG(ERR, EAL, "Failed to restore mempolicy: %s\n",
> - strerror(errno));
> - numa_set_localalloc();
> + if (have_numa) {
> + if (maxnode) {
> + RTE_LOG(DEBUG, EAL,
> + "Restoring previous memory policy: %d\n",
> + oldpolicy);
> + if (oldpolicy == MPOL_DEFAULT) {
> + numa_set_localalloc();
> + } else if (set_mempolicy(oldpolicy, oldmask->maskp,
> +  oldmask->size + 1) < 0) {
> + RTE_LOG(ERR, EAL, "Failed to restore mempolicy: 
> %s\n",
> + strerror(errno));
> + numa_set_localalloc();
> + }
>   }
> + numa_free_cpumask(oldmask);
>   }
> - numa_free_cpumask(oldmask);

The original intent was to avoid ugly nested 'if's as much as possible.
'maxnode' is only initialized in the NUMA case, so there is no need
to check for 'have_numa'. 'numa_free_cpumask' has 'free' semantics
and checks its argument, so it is safe to call it with NULL.
If you want to be fully compliant with the man page, you may use a less
invasive change like this:

---
diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c b/lib/librte_eal/linuxapp/eal/eal_memory.c
index dbf19499e..d0b9f3a2f 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memory.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
@@ -390,7 +390,7 @@ map_all_hugepages(struct hugepage_file *hugepg_tbl, struct hugepage_info *hpi,
 
 out:
 #ifdef RTE_EAL_NUMA_AWARE_HUGEPAGES
-   if (maxnode) {
+   if (have_numa && maxnode) {
RTE_LOG(DEBUG, EAL,
"Restoring previous memory policy: %d\n", oldpolicy);
if (oldpolicy == MPOL_DEFAULT) {
@@ -402,7 +402,8 @@ map_all_hugepages(struct hugepage_file *hugepg_tbl, struct hugepage_info *hpi,
numa_set_localalloc();
}
}
-   numa_free_cpumask(oldmask);
+   if (oldmask)
+   numa_free_cpumask(oldmask);
 #endif
return i;
 }
---

But still, checking both 'have_numa && maxnode', IMHO, is unnecessary.

As this change is cosmetic (the issue doesn't produce any real bug),
I'd like to avoid changing the functional code to something less readable.
It would also complicate the 'git blame' process.
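
To illustrate the 'free' semantics argument without pulling in libnuma, here
is a small stand-alone sketch. mask_alloc() and mask_free() are hypothetical
stand-ins for numa_allocate_nodemask() and numa_free_cpumask(); the NULL
check mirrors the behaviour described above and is not copied from libnuma:

```c
#include <stdbool.h>
#include <stdlib.h>

struct mask {
	unsigned long bits;
};

/* Hypothetical stand-in for numa_allocate_nodemask(). */
static struct mask *
mask_alloc(void)
{
	return calloc(1, sizeof(struct mask));
}

/* Hypothetical stand-in for numa_free_cpumask(): NULL is a no-op. */
static void
mask_free(struct mask *m)
{
	if (m == NULL)
		return;
	free(m);
}

/*
 * Allocate the mask only when NUMA is usable. The single cleanup path at
 * 'out' needs no extra nesting, because freeing NULL is harmless.
 * Returns true if a mask was allocated.
 */
static bool
map_pages_sketch(bool have_numa)
{
	struct mask *oldmask = NULL;
	bool allocated = false;

	if (have_numa) {
		oldmask = mask_alloc();
		if (oldmask == NULL)
			goto out;
		allocated = true;
	}
	/* ... the actual mapping work would happen here ... */
out:
	mask_free(oldmask);
	return allocated;
}
```

With this shape the cleanup is the same single call on both the NUMA and
the non-NUMA path.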

What do you think?

>  #endif
>   return i;
>  }
> 


Re: [dpdk-dev] 18.08 build error on ppc64el - bool as vector type

2018-08-29 Thread Adrien Mazarguil
On Wed, Aug 29, 2018 at 10:27:03AM +0200, Christian Ehrhardt wrote:
> On Tue, Aug 28, 2018 at 5:02 PM Adrien Mazarguil 
> wrote:
> 
> > On Tue, Aug 28, 2018 at 02:38:35PM +0200, Christian Ehrhardt wrote:

> > > --- a/lib/librte_eal/common/include/arch/ppc_64/rte_memcpy.h
> > > +++ b/lib/librte_eal/common/include/arch/ppc_64/rte_memcpy.h
> > > @@ -36,6 +36,14 @@
> > > #include 
> > > #include 
> > > /*To include altivec.h, GCC version must  >= 4.8 */
> > > +/*
> > > + * If built with std=c11 stdbool and altivec bool will conflict.
> > > + * The altivec bool type is not needed at the moment, to avoid the
> > conflict
> > > + * define __APPLE_ALTIVEC__ so that the conflict will not happen.
> > > + */
> > > +#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 201112L &&
> > > !defined(__APPLE_ALTIVEC__)
> > > +#define __APPLE_ALTIVEC__
> > > +#endif
> > > #include 
> > >
> > > #ifdef __cplusplus
> > >
> > > But it turned out we are not allowed to switch of other things as vector
> > > (and probably some more code than the type) is actually used:
> > > With your suggestion or mine above it will break on:
> > >
> > > x5.o -c /home/ubuntu/deb_dpdk/drivers/net/mlx5/mlx5.c
> > > In file included from
> > /home/ubuntu/deb_dpdk/drivers/net/mlx5/mlx5_prm.h:21,
> > > from
> > /home/ubuntu/deb_dpdk/drivers/net/mlx5/mlx5_rxtx.h:37,
> > > from /home/ubuntu/deb_dpdk/drivers/net/mlx5/mlx5.h:36,
> > > from /home/ubuntu/deb_dpdk/drivers/net/mlx5/mlx5.c:42:
> > > /home/ubuntu/deb_dpdk/debian/build/static-root/include/rte_vect.h:43:15:
> > error:
> > > expected ‘;’ before ‘signed’
> > > typedef vector signed int xmm_t;
> > >   ^~~
> > >   ;
> > > /home/ubuntu/deb_dpdk/debian/build/static-root/include/rte_vect.h:49:2:
> > error:
> > > expected specifier-qualifier-list before ‘xmm_t’
> > >  xmm_tx;
> > >  ^
> > >
> > > I have no much better suggestion for the ordering issue that you raised.
> > > To test what would happen I moved the stdbool include after all other
> > > includes in drivers/net/mlx5/mlx5_nl.c
> > > I also moved mlx5.h (which eventually brings in altivec) right at the
> > top.
> > > This works to build, but such a check is always subtle as one of the
> > other
> > > includes might have pulled in stdbool before altivec still.
> > > For a bit of confidence I picked said gcc call and ran it with -E.
> > > The output suggests altivec really was included before stdbool.
> >
> > How about making altivec.h users (rte_vect.h and rte_memcpy.h) rely on
> > "__vector" directly instead of the "vector" macro to make it transparent
> > for
> > others then?
> >
> > I think we can assume they have internal knowledge of this file in order to
> > deal with __APPLE_ALTIVEC__ anyway.
> >
> 
> While "pushing the internal knowledge out to users" sounds right at first.
> There are far too many IMHO, the change would be huge unclean and messy.
> 
> $ grep -Hrn altivec.h
> drivers/net/i40e/i40e_rxtx_vec_altivec.c:45:#include 
> examples/l3fwd/l3fwd_lpm.c:165:#include "l3fwd_lpm_altivec.h"
> examples/l3fwd/l3fwd_lpm_altivec.h:10:#include "l3fwd_altivec.h"
> MAINTAINERS:239:F: examples/l3fwd/*altivec.h
> lib/librte_acl/acl_run_altivec.c:34:#include "acl_run_altivec.h"
> lib/librte_eal/common/include/arch/ppc_64/rte_memcpy.h:49:/*To include
> altivec.h, GCC version must  >= 4.8 */
> lib/librte_eal/common/include/arch/ppc_64/rte_memcpy.h:50:#include <
> altivec.h>
> lib/librte_eal/common/include/arch/ppc_64/rte_vect.h:36:#include 
> 
> lib/librte_lpm/meson.build:9:headers += files('rte_lpm_altivec.h',
> 'rte_lpm_neon.h', 'rte_lpm_sse.h')
> lib/librte_lpm/Makefile:28:SYMLINK-$(CONFIG_RTE_LIBRTE_LPM)-include +=
> rte_lpm_altivec.h
> lib/librte_lpm/rte_lpm.h:461:#include "rte_lpm_altivec.h"

I'd still like to give it a try, given that only known users of AltiVec code may
rely on these vector/pixel/bool definitions. The scope should be quite small.

The root issue we need to address is that DPDK applications may
involuntarily pull in altivec.h by including something unrelated (rte_memcpy.h)
and get unwanted bool/vector/pixel macros polluting their namespace and
breaking things.

> > Also I would suggest not to make this workaround C11-only. I suspect the
> > same issue will be encountered with -std=c99 or -std=c90. Keep in mind DPDK
> > applications are free to specify their own CFLAGS.
> >
> 
> Yeah Independent to the other part of the discussion I think we can make it
> generally apply and not just C11.
> 
> The following "would work" in the code right now.
> --- a/lib/librte_eal/common/include/arch/ppc_64/rte_memcpy.h
> +++ b/lib/librte_eal/common/include/arch/ppc_64/rte_memcpy.h
> @@ -35,6 +35,21 @@
> 
> #include 
> #include 
> +/*
> + * if built with newer C standard like -std=c11 stdbool.h bool and altivec
> + * bool types will conflict. We have to force altivec users (rte_vect.h and
> + * rte_memcpy.h) rely on __vector implying internal altivec knowledge to
> the
> + *

Re: [dpdk-dev] 16.11.8 (LTS) patches review and test

2018-08-29 Thread Marco Varlese
Hi Luca & all,

I confirm tests carried out via test_pmd and Ovs-DPDK do not show any regression
for us.


Cheers,
Marco

On Wed, 2018-08-29 at 10:55 +0100, Luca Boccassi wrote:
> On Mon, 2018-08-27 at 17:17 +0100, Luca Boccassi wrote:
> > On Thu, 2018-08-23 at 09:55 +0100, Luca Boccassi wrote:
> > > On Mon, 2018-08-13 at 19:21 +0100, luca.bocca...@gmail.com wrote:
> > > > Hi all,
> > > > 
> > > > Here is a list of patches targeted for LTS release 16.11.8.
> > > > Please
> > > > help review and test. The planned date for the final release is
> > > > August
> > > > the 23rd.
> > > > Before that, please shout if anyone has objections with these
> > > > patches being applied.
> > > > 
> > > > Also for the companies committed to running regression tests,
> > > > please run the tests and report any issue before the release
> > > > date.
> > > > 
> > > > A release candidate tarball can be found at:
> > > > 
> > > > https://dpdk.org/browse/dpdk-stable/tag/?id=v16.11.8-rc1
> > > > 
> > > > These patches are located at branch 16.11 of dpdk-stable repo:
> > > > https://dpdk.org/browse/dpdk-stable/
> > > > 
> > > > Thanks.
> > > > 
> > > > Luca Boccassi
> > > 
> > > Hi,
> > > 
> > > Regression tests from Intel have highlighted a possible issue with
> > > the
> > > changes (unidentified as of now), so while investigation is in
> > > progress
> > > we decided to postpone the release to Monday the 27th to be on the
> > > safe
> > > side.
> > > Apologies for any issues this might cause.
> > 
> > Hi,
> > 
> > Unfortunately triaging is still in progress, so it's better to
> > postpone
> > again, to Wednesday the 29th of August.
> > Apologies again for any issues due to this delay.
> 
> Hello all,
> 
> I've pushed an -rc2 with the following additional changes:
> 
> Luca Boccassi (1):
>   Revert "net/i40e: fix packet count for PF"
> 
> Radu Nicolau (3):
>   net/null: add MAC address setting fake operation
>   test/virtual_pmd: add MAC address setting fake op
>   test/bonding: assign non-zero MAC to null devices
> 
> Radu, I cherry-picked the following 3 patches that you got merged in
> 18.02 as they are necessary to fix bonding regression tests from Intel:
> 
> c5ac7748fd6bfd86b6fb4432b6792733cf32c94c
> c23fc36284e26fca9b52641118ad76a4da99d7af
> e8df563bac263e55b7dd9d45a00417aa92ef66cb
> 
> Qi, I have reverted the following patch that was backported to 16.11.4
> as it breaks a Fortville regression test from Intel:
> 
> 4bf705a7d74b0b4c1d82ad0821c43e32be15a5e5.
> 
> Marco, is there any chance you've got time today to re-run your tests?
> These changes in rc2 have been blessed by Intel and AT&T, so if it
> works for you as well I can then release later tonight.
> 
> A release candidate tarball can be found at:
> 
> https://dpdk.org/browse/dpdk-stable/tag/?id=v16.11.8-rc2
> 
> These patches are located at branch 16.11 of dpdk-stable repo:
> https://dpdk.org/browse/dpdk-stable/
> 
> Thanks!
> 


Re: [dpdk-dev] [dpdk-stable] 16.11.8 (LTS) patches review and test

2018-08-29 Thread Luca Boccassi
Great, thank you very much.

I'll push the release later tonight.

On Wed, 2018-08-29 at 15:28 +0200, Marco Varlese wrote:
> Hi Luca & all,
> 
> I confirm tests carried out via test_pmd and Ovs-DPDK do not show any
> regression
> for us.
> 
> 
> Cheers,
> Marco
> 
> On Wed, 2018-08-29 at 10:55 +0100, Luca Boccassi wrote:
> > On Mon, 2018-08-27 at 17:17 +0100, Luca Boccassi wrote:
> > > On Thu, 2018-08-23 at 09:55 +0100, Luca Boccassi wrote:
> > > > On Mon, 2018-08-13 at 19:21 +0100, luca.bocca...@gmail.com
> > > > wrote:
> > > > > Hi all,
> > > > > 
> > > > > Here is a list of patches targeted for LTS release 16.11.8.
> > > > > Please
> > > > > help review and test. The planned date for the final release
> > > > > is
> > > > > August
> > > > > the 23rd.
> > > > > Before that, please shout if anyone has objections with these
> > > > > patches being applied.
> > > > > 
> > > > > Also for the companies committed to running regression tests,
> > > > > please run the tests and report any issue before the release
> > > > > date.
> > > > > 
> > > > > A release candidate tarball can be found at:
> > > > > 
> > > > > https://dpdk.org/browse/dpdk-stable/tag/?id=v16.11.8-rc1
> > > > > 
> > > > > These patches are located at branch 16.11 of dpdk-stable
> > > > > repo:
> > > > > https://dpdk.org/browse/dpdk-stable/
> > > > > 
> > > > > Thanks.
> > > > > 
> > > > > Luca Boccassi
> > > > 
> > > > Hi,
> > > > 
> > > > Regression tests from Intel have highlighted a possible issue
> > > > with
> > > > the
> > > > changes (unidentified as of now), so while investigation is in
> > > > progress
> > > > we decided to postpone the release to Monday the 27th to be on
> > > > the
> > > > safe
> > > > side.
> > > > Apologies for any issues this might cause.
> > > 
> > > Hi,
> > > 
> > > Unfortunately triaging is still in progress, so it's better to
> > > postpone
> > > again, to Wednesday the 29th of August.
> > > Apologies again for any issues due to this delay.
> > 
> > Hello all,
> > 
> > I've pushed an -rc2 with the following additional changes:
> > 
> > Luca Boccassi (1):
> >   Revert "net/i40e: fix packet count for PF"
> > 
> > Radu Nicolau (3):
> >   net/null: add MAC address setting fake operation
> >   test/virtual_pmd: add MAC address setting fake op
> >   test/bonding: assign non-zero MAC to null devices
> > 
> > Radu, I cherry-picked the following 3 patches that you got merged
> > in
> > 18.02 as they are necessary to fix bonding regression tests from
> > Intel:
> > 
> > c5ac7748fd6bfd86b6fb4432b6792733cf32c94c
> > c23fc36284e26fca9b52641118ad76a4da99d7af
> > e8df563bac263e55b7dd9d45a00417aa92ef66cb
> > 
> > Qi, I have reverted the following patch that was backported to
> > 16.11.4
> > as it breaks a Fortville regression test from Intel:
> > 
> > 4bf705a7d74b0b4c1d82ad0821c43e32be15a5e5.
> > 
> > Marco, is there any chance you've got time today to re-run your
> > tests?
> > These changes in rc2 have been blessed by Intel and AT&T, so if it
> > works for you as well I can then release later tonight.
> > 
> > A release candidate tarball can be found at:
> > 
> > https://dpdk.org/browse/dpdk-stable/tag/?id=v16.11.8-rc2
> > 
> > These patches are located at branch 16.11 of dpdk-stable repo:
> > https://dpdk.org/browse/dpdk-stable/
> > 
> > Thanks!
> > 

-- 
Kind regards,
Luca Boccassi


[dpdk-dev] [PATCH v3] net/mlx: add meson build support

2018-08-29 Thread Nelio Laranjeiro
Compile Mellanox drivers when their external dependencies are met.  A
glue version of the driver can still be requested by using the
-Denable_driver_mlx_glue=true option.

To avoid modifying the whole sources and keep the compatibility with
current build systems (e.g. make), the mlx{4,5}_autoconf.h is still
generated by invoking DPDK scripts through meson's run_command() instead
of using has_types, has_members, ... commands.

Meson will try to find the required external libraries.  When they are
not installed system-wide, they can be provided through the CFLAGS, LDFLAGS
and LD_LIBRARY_PATH environment variables, for example (considering
RDMA-Core is installed in /tmp/rdma-core):

 # CFLAGS=-I/tmp/rdma-core/build/include \
   LDFLAGS=-L/tmp/rdma-core/build/lib \
   LD_LIBRARY_PATH=/tmp/rdma-core/build/lib \
   meson output
 # LD_LIBRARY_PATH=/tmp/rdma-core/build/lib \
   ninja -C output install

Note: setting LD_LIBRARY_PATH before ninja is necessary when the meson
configuration has changed (e.g. meson configure has been called); in
that situation it is needed to invoke the autoconfiguration script.

Signed-off-by: Nelio Laranjeiro 

---

Changes in v3:

Sanitize the build files:
- remove enable_driver_mlx{4,5} options,
- test cflags capabilities before using them,
- remove old autoconfiguration file,
- use an array for autoconfiguration and put them in the build directory,
- use dependencies in shared_library for link arguments.

Changes in v2:

- dropped patch https://patches.dpdk.org/patch/43897/
- remove extra_{cflags,ldflags} as already honored by meson through
environment variables.
---
 drivers/net/meson.build  |   2 +
 drivers/net/mlx4/meson.build |  97 +++
 drivers/net/mlx5/meson.build | 232 +++
 meson_options.txt|   2 +
 4 files changed, 333 insertions(+)
 create mode 100644 drivers/net/mlx4/meson.build
 create mode 100644 drivers/net/mlx5/meson.build

diff --git a/drivers/net/meson.build b/drivers/net/meson.build
index 9c28ed4da..c7a2d0e7d 100644
--- a/drivers/net/meson.build
+++ b/drivers/net/meson.build
@@ -18,6 +18,8 @@ drivers = ['af_packet',
'ixgbe',
'kni',
'liquidio',
+   'mlx4',
+   'mlx5',
'mvpp2',
'netvsc',
'nfp',
diff --git a/drivers/net/mlx4/meson.build b/drivers/net/mlx4/meson.build
new file mode 100644
index 0..6b5460b91
--- /dev/null
+++ b/drivers/net/mlx4/meson.build
@@ -0,0 +1,97 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright 2018 6WIND S.A.
+# Copyright 2018 Mellanox Technologies, Ltd
+
+pmd_dlopen = get_option('enable_driver_mlx_glue')
+LIB_GLUE_BASE = 'librte_pmd_mlx4_glue.so'
+LIB_GLUE_VERSION = '18.02.0'
+LIB_GLUE = LIB_GLUE_BASE + '.' + LIB_GLUE_VERSION
+if pmd_dlopen
+dpdk_conf.set('RTE_LIBRTE_MLX4_DLOPEN_DEPS', 1)
+cflags += [
+'-DMLX4_GLUE="@0@"'.format(LIB_GLUE),
+'-DMLX4_GLUE_VERSION="@0@"'.format(LIB_GLUE_VERSION),
+]
+endif
+libs = [
+cc.find_library('mnl', required:false),
+cc.find_library('mlx4', required:false),
+cc.find_library('ibverbs', required:false),
+]
+build = true
+foreach lib:libs
+if not lib.found()
+build = false
+endif
+endforeach
+# Compile PMD
+if build
+allow_experimental_apis = true
+ext_deps += libs
+sources = files(
+   'mlx4.c',
+   'mlx4_ethdev.c',
+   'mlx4_flow.c',
+   'mlx4_intr.c',
+   'mlx4_mr.c',
+   'mlx4_rxq.c',
+   'mlx4_rxtx.c',
+   'mlx4_txq.c',
+   'mlx4_utils.c',
+)
+if not pmd_dlopen
+sources += files('mlx4_glue.c')
+endif
+cflags_options = [
+'-Wall',
+'-Wextra',
+'-std=c11',
+'-Wno-strict-prototypes',
+'-D_BSD_SOURCE',
+'-D_DEFAULT_SOURCE',
+'-D_XOPEN_SOURCE=600'
+]
+foreach option:cflags_options
+if cc.has_argument(option)
+cflags += option
+endif
+endforeach
+if get_option('buildtype').contains('debug')
+cflags += [ '-pedantic', '-UNDEBUG', '-DPEDANTIC' ]
+else
+cflags += [ '-DNDEBUG', '-UPEDANTIC' ]
+endif
+# To maintain the compatibility with the make build system
+# mlx4_autoconf.h file is still generated.
+run_command('rm', '-f', meson.current_build_dir() + '/mlx4_autoconf.h')
+r = run_command('sh', '../../../buildtools/auto-config-h.sh',
+meson.current_build_dir() + '/mlx4_autoconf.h',
+'HAVE_IBV_MLX4_WQE_LSO_SEG',
+'infiniband/mlx4dv.h',
+'type', 'struct mlx4_wqe_lso_seg')
+if r.returncode() != 0
+error('autoconfigur

Re: [dpdk-dev] [PATCH] vhost: fix crash if set vring num handling failed

2018-08-29 Thread Ilya Maximets
Any thoughts on this?

Best regards, Ilya Maximets.

On 17.08.2018 14:33, Ilya Maximets wrote:
> Allocation failures of the shadow used ring and batched copy array
> are not recoverable and lead to segmentation faults like
> this on the receive/transmit path:
> 
>   Program received signal SIGSEGV, Segmentation fault.
>   [Switching to Thread 0x7f913fecf0 (LWP 43625)]
>   in copy_desc_to_mbuf () at /lib/librte_vhost/virtio_net.c:760
>   760   batch_copy[vq->batch_copy_nb_elems].dst =
> 
> This is easily reproduced when memory is low or when there is a large
> number of vhost-user ports. Fix it by propagating the error to
> the upper layer, which ends in disconnection.
> 
> Fixes: f689586bc060 ("vhost: shadow used ring update")
> Cc: sta...@dpdk.org
> 
> Signed-off-by: Ilya Maximets 
> ---
>  lib/librte_vhost/vhost_user.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c
> index 9aa1ce118..4c7fd57fb 100644
> --- a/lib/librte_vhost/vhost_user.c
> +++ b/lib/librte_vhost/vhost_user.c
> @@ -1693,7 +1693,9 @@ vhost_user_msg_handler(int vid, int fd)
>   break;
>  
>   case VHOST_USER_SET_VRING_NUM:
> - vhost_user_set_vring_num(dev, &msg);
> + ret = vhost_user_set_vring_num(dev, &msg);
> + if (ret)
> + return -1;
>   break;
>   case VHOST_USER_SET_VRING_ADDR:
>   vhost_user_set_vring_addr(&dev, &msg);
> 
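
The pattern in the patch above is: make the handler report allocation failure and have the dispatcher stop on it. A minimal, self-contained sketch of that pattern follows. The names `struct queue`, `set_vring_num` and `handle_msg` are hypothetical stand-ins for illustration, not the librte_vhost API.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a virtqueue; not the librte_vhost structure. */
struct queue {
	uint32_t size;
	uint64_t *shadow;	/* per-descriptor scratch array */
};

/* (Re)allocate the per-queue array; returns 0 on success, -1 on failure. */
static int
set_vring_num(struct queue *q, uint32_t num)
{
	assert(q != NULL);
	free(q->shadow);
	q->shadow = NULL;
	q->size = 0;
	if (num == 0)
		return -1;
	q->shadow = calloc(num, sizeof(*q->shadow));
	if (q->shadow == NULL)
		return -1;
	q->size = num;
	return 0;
}

enum msg_type { MSG_SET_VRING_NUM, MSG_OTHER };

/*
 * Dispatch one message. A failed allocation is returned to the caller
 * instead of being ignored, so the data path never sees a NULL array.
 */
static int
handle_msg(struct queue *q, enum msg_type type, uint32_t arg)
{
	switch (type) {
	case MSG_SET_VRING_NUM:
		if (set_vring_num(q, arg) != 0)
			return -1;
		break;
	default:
		break;
	}
	return 0;
}
```

In this sketch the caller of `handle_msg` is expected to tear the connection down when it sees -1, which mirrors the "end up with disconnection" behaviour described in the commit message.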


Re: [dpdk-dev] [PATCH 7/7] eal: improve musl compatibility

2018-08-29 Thread Burakov, Anatoly

On 29-Aug-18 1:39 PM, Bruce Richardson wrote:

On Wed, Aug 29, 2018 at 12:56:21PM +0100, Anatoly Burakov wrote:

Musl complains about pthread id being of wrong size. Fix it by
casting to 64-bit and printing 64-bit hex unconditionally.

Signed-off-by: Anatoly Burakov 
---

Given that on linux pthread_t is a pointer type, will this not give other
warnings of casting from pointer to integer of a different type when
compiling 32-bit? For safety I suggest casting to long or uintptr_t
instead, to ensure we always get an int of the right size.
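
A minimal sketch of the uintptr_t variant, assuming pthread_t is either a pointer or an integer type (POSIX does not guarantee this). The helper name `format_thread_id` is hypothetical and not part of EAL.

```c
#include <assert.h>
#include <inttypes.h>
#include <pthread.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Format a thread id as hex. The cast goes through uintptr_t, which has
 * the same width as a pointer on both 32-bit and 64-bit builds, so no
 * pointer-to-integer size warning is expected.
 * Assumption: pthread_t is a pointer or an integer type.
 */
static int
format_thread_id(pthread_t tid, char *buf, size_t len)
{
	assert(buf != NULL);
	return snprintf(buf, len, "0x%" PRIxPTR, (uintptr_t)tid);
}
```

PRIxPTR is used so that the format specifier always matches uintptr_t, whatever its underlying type is.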

/Bruce



Sure, will fix.

--
Thanks,
Anatoly


Re: [dpdk-dev] [PATCH 2/7] pci/vfio: improve musl compatibility

2018-08-29 Thread Burakov, Anatoly

On 29-Aug-18 1:35 PM, Bruce Richardson wrote:

On Wed, Aug 29, 2018 at 12:56:16PM +0100, Anatoly Burakov wrote:

Musl already has PAGE_SIZE defined, and our define clashed with it.
Rename our define to SYS_PAGE_SIZE.

Bugzilla ID: 36

Signed-off-by: Anatoly Burakov 
---


Would it not be easier to just do?

#ifndef PAGE_SIZE
#define PAGE_SIZE ...
#endif
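
Spelled out, the guard could look like the sketch below. The sysconf()-based fallback value and the helper `page_size_bytes` are assumptions for illustration; the body elided above is not reproduced here.

```c
#include <assert.h>
#include <limits.h>	/* some C libraries define PAGE_SIZE here */
#include <unistd.h>

/* Only provide a definition when the C library did not. */
#ifndef PAGE_SIZE
#define PAGE_SIZE (sysconf(_SC_PAGESIZE))	/* assumed fallback */
#endif

/* Hypothetical helper: return the page size in bytes. */
static long
page_size_bytes(void)
{
	long sz = (long)PAGE_SIZE;

	assert(sz > 0);
	return sz;
}
```

One thing to keep in mind with this approach: where the library provides PAGE_SIZE it is a compile-time constant, while the fallback is a runtime call, so code using it should not rely on either property.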


Sure, that can work.

--
Thanks,
Anatoly


Re: [dpdk-dev] [dpdk-stable] [PATCH v4] net/bonding: per-slave intermediate rx ring

2018-08-29 Thread Chas Williams
On Tue, Aug 28, 2018 at 5:51 AM Matan Azrad  wrote:

>
>
> From: Chas Williams
> >On Mon, Aug 27, 2018 at 11:30 AM Matan Azrad 
> wrote:
> 
> >>>Because rings are generally quite efficient.
> >>
> >>But you are using a ring in addition to regular array management, it
> must hurt performance of the bonding PMD
> >>(means the bonding itself - not the slaves PMDs which are called from
> the bonding)
> >
> >It adds latency.
>
> And by that hurts the application performance because it takes more CPU
> time in the bonding PMD.
>

No, as I said before it takes _less_ CPU time in the bonding PMD
because we use a more optimal read from the slaves.


>
> >It increases performance because we spend less CPU time reading from the
> PMDs.
>
> So, it's a hack in the bonding PMD to improve some slaves' code performance
> while hurting the bonding code performance.
> Overall, the performance we gain for those slaves improves the application
> performance only when working with those slaves,
> but may hurt the application performance when working with other slaves.
>

What is your evidence that it hurts bonding performance?  Your
argument is purely theoretical.  I could easily argue that even
for non-vectorized PMDs there is a performance gain because we
spend less time switching between PMDs.  If you are going to read
from a PMD you should attempt to read as much as possible. It's
expensive to read the card's registers and perform the queue
manipulations.
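
To make the idea concrete, here is a rough sketch of "read a large burst from the slave, hand out smaller bursts to the application". It is not the code from the patch: the ring, `buffered_rx`, `SLAVE_BURST` and the `counter_rx` fake slave are all hypothetical, and a real implementation would use rte_ring and the ethdev rx burst call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RING_SIZE   64	/* power of two, so index masking works */
#define SLAVE_BURST 32	/* large read issued towards the slave */

struct pkt_ring {
	void *slots[RING_SIZE];
	unsigned int head;	/* next slot to dequeue */
	unsigned int tail;	/* next slot to enqueue */
};

static unsigned int
ring_count(const struct pkt_ring *r)
{
	return r->tail - r->head;	/* unsigned wrap-around is intended */
}

/* Enqueue up to n pointers; returns how many fit. */
static unsigned int
ring_enqueue_burst(struct pkt_ring *r, void *const *pkts, unsigned int n)
{
	unsigned int room = RING_SIZE - ring_count(r);
	unsigned int i;

	if (n > room)
		n = room;
	for (i = 0; i < n; i++)
		r->slots[(r->tail + i) & (RING_SIZE - 1)] = pkts[i];
	r->tail += n;
	return n;
}

/* Dequeue up to n pointers; returns how many were available. */
static unsigned int
ring_dequeue_burst(struct pkt_ring *r, void **pkts, unsigned int n)
{
	unsigned int avail = ring_count(r);
	unsigned int i;

	if (n > avail)
		n = avail;
	for (i = 0; i < n; i++)
		pkts[i] = r->slots[(r->head + i) & (RING_SIZE - 1)];
	r->head += n;
	return n;
}

typedef unsigned int (*slave_rx_fn)(void *ctx, void **pkts, unsigned int n);

/*
 * Serve the caller's (small) burst from the ring. The slave is only
 * touched when the ring cannot satisfy the request and has room for a
 * full SLAVE_BURST, so nothing read from the slave is ever dropped.
 */
static unsigned int
buffered_rx(struct pkt_ring *r, slave_rx_fn rx, void *ctx,
	    void **pkts, unsigned int n)
{
	void *tmp[SLAVE_BURST];
	unsigned int got;

	assert(r != NULL && rx != NULL);
	if (ring_count(r) < n && RING_SIZE - ring_count(r) >= SLAVE_BURST) {
		got = rx(ctx, tmp, SLAVE_BURST);
		ring_enqueue_burst(r, tmp, got);
	}
	return ring_dequeue_burst(r, pkts, n);
}

/* Fake slave for illustration: hands out increasing integer "packets". */
struct counter_src {
	uintptr_t next;
	unsigned int remaining;
};

static unsigned int
counter_rx(void *ctx, void **pkts, unsigned int n)
{
	struct counter_src *s = ctx;
	unsigned int i;

	if (n > s->remaining)
		n = s->remaining;
	for (i = 0; i < n; i++)
		pkts[i] = (void *)(++s->next);
	s->remaining -= n;
	return n;
}
```

With an application burst of 8, the first call reads 32 from the slave and the next three calls are served from the ring without touching the slave at all. The cost is the extra pointer copy in and out of the ring.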



>
> >  This means we have more CPU to use for
> >post processing (i.e. routing).
>
>
>
> >>>Bonding is in a middle ground between application and PMD.
> >>Yes.
> >>>What bonding is doing, may not improve all applications.
> >>Yes, but it can be solved using some bonding modes.
> >>> If using a ring to buffer the vectorized receive routines, improves
> your particular application,
> >>>that's great.
> >>It may be not great and even bad for some other PMDs which are not
> vectorized.
> >>
> >>> However, I don't think I can say that it would help all
> >>>applications.  As you point out, there is overhead associated with
> >>>a ring.
> >>Yes.
> >>>Bonding's receive burst isn't especially efficient (in mode 4).
> >>
> >>Why?
> >>
> >>It makes a copy of the slaves, has a fair bit of stack usage,
> >>needs to check the slave status, and needs to examine each
> >>packet to see if it is a slow protocol packet.  So each
> >>packet is essentially read twice.  The fast queue code for mode 4
> >>avoids some of this (and probably ignores checking collecting
> >>incorrectly).  If you find a slow protocol packet, you need to
> >>chop it out of the array with memmove due to overlap.
> >
> >Agree.
> >So to improve the bonding performance you need to optimize the above
> problems.
> >There is no connection to the ring.
> >
> >And as I have described numerous times, these problems
> >can't be easily fixed and preserve the existing API.
>
> Sometimes we need to work harder to see a gain for all.
> We should not apply a patch because it is easy and shows a gain for
> specific scenarios.
>
> >>> Bonding benefits from being able to read as much as possible (within
> limits of
> >>>course, large reads would blow out caches) from each slave.
> >>
> >>The slave PMDs can benefit in the same way.
> >>
> >>>It can't return all that data though because applications tend to use
> the
> >>>burst size that would be efficient for a typical PMD.
> >>
> >>What is the preferred burst size of the bonding? Maybe the application
> should use it when they are using bonding.
> >>
> >>The preferred burst size for bonding would be the sum of all the
> >>slaves ideal read size.  However, that's not likely to be simple
> >>since most applications decide early the size for the read/write
> >>burst operations.
> >>
> >>>An alternative might be to ask bonding applications to simply issue
> larger reads for
> >>>certain modes.  That's probably not as easy as it sounds given the
> >>>way that the burst length affects multiplexing.
> >>
> >>Can you explain it more?
> >>
> >>A longer burst size on one PMD will tend to favor that PMD
> >>over others.  It will fill your internal queues with more
> >>of its packets.
> >
> >Agree, it's about fairness.
> >
> >>
> >>>Another solution might be just alternatively poll the individual
> >>>slaves on each rx burst.  But that means you need to poll at a
> >>>faster rate.  Depending on your application, you might not be
> >>>able to do that.
> >
> >>Again, can you be more precise in the above explanation?
> >>
> >>If the application knows that there are two slaves backing
> >>a bonding interface, the application could just read twice
> >>from the bonding interface, knowing that the bonding
> >>interface is going to alternate between the slaves.  But
> >>this requires the application to know things about the bonding
> >>PMD, like the number of slaves.
> >
> >Why should the application poll twice?
> >Poll slave 0, then process its packets, poll slave 1 then proce

Re: [dpdk-dev] 18.08 build error on ppc64el - bool as vector type

2018-08-29 Thread Christian Ehrhardt
On Wed, Aug 29, 2018 at 3:16 PM Adrien Mazarguil 
wrote:

> On Wed, Aug 29, 2018 at 10:27:03AM +0200, Christian Ehrhardt wrote:
> > On Tue, Aug 28, 2018 at 5:02 PM Adrien Mazarguil <
> adrien.mazarg...@6wind.com>
> > wrote:
> >
> > > On Tue, Aug 28, 2018 at 02:38:35PM +0200, Christian Ehrhardt wrote:
> 
> > > > --- a/lib/librte_eal/common/include/arch/ppc_64/rte_memcpy.h
> > > > +++ b/lib/librte_eal/common/include/arch/ppc_64/rte_memcpy.h
> > > > @@ -36,6 +36,14 @@
> > > > #include 
> > > > #include 
> > > > /*To include altivec.h, GCC version must  >= 4.8 */
> > > > +/*
> > > > + * If built with std=c11 stdbool and altivec bool will conflict.
> > > > + * The altivec bool type is not needed at the moment, to avoid the
> > > conflict
> > > > + * define __APPLE_ALTIVEC__ so that the conflict will not happen.
> > > > + */
> > > > +#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 201112L &&
> > > > !defined(__APPLE_ALTIVEC__)
> > > > +#define __APPLE_ALTIVEC__
> > > > +#endif
> > > > #include 
> > > >
> > > > #ifdef __cplusplus
> > > >
> > > > But it turned out we are not allowed to switch off other things, as
> > > > vector (and probably some more code than the type) is actually used:
> > > > With your suggestion or mine above it will break on:
> > > >
> > > > x5.o -c /home/ubuntu/deb_dpdk/drivers/net/mlx5/mlx5.c
> > > > In file included from
> > > /home/ubuntu/deb_dpdk/drivers/net/mlx5/mlx5_prm.h:21,
> > > > from
> > > /home/ubuntu/deb_dpdk/drivers/net/mlx5/mlx5_rxtx.h:37,
> > > > from
> /home/ubuntu/deb_dpdk/drivers/net/mlx5/mlx5.h:36,
> > > > from
> /home/ubuntu/deb_dpdk/drivers/net/mlx5/mlx5.c:42:
> > > >
> /home/ubuntu/deb_dpdk/debian/build/static-root/include/rte_vect.h:43:15:
> > > error:
> > > > expected ‘;’ before ‘signed’
> > > > typedef vector signed int xmm_t;
> > > >   ^~~
> > > >   ;
> > > >
> /home/ubuntu/deb_dpdk/debian/build/static-root/include/rte_vect.h:49:2:
> > > error:
> > > > expected specifier-qualifier-list before ‘xmm_t’
> > > >  xmm_tx;
> > > >  ^
> > > >
> > > > I don't have a much better suggestion for the ordering issue that you
> > > > raised.
> > > > To test what would happen I moved the stdbool include after all other
> > > > includes in drivers/net/mlx5/mlx5_nl.c
> > > > I also moved mlx5.h (which eventually brings in altivec) right at the
> > > top.
> > > > This works to build, but such a check is always subtle as one of the
> > > other
> > > > includes might have pulled in stdbool before altivec still.
> > > > For a bit of confidence I picked said gcc call and ran it with -E.
> > > > The output suggests altivec really was included before stdbool.
> > >
> > > How about making altivec.h users (rte_vect.h and rte_memcpy.h) rely on
> > > "__vector" directly instead of the "vector" macro to make it
> transparent
> > > for
> > > others then?
> > >
> > > I think we can assume they have internal knowledge of this file in
> order to
> > > deal with __APPLE_ALTIVEC__ anyway.
> > >
> >
> > While "pushing the internal knowledge out to users" sounds right at
> > first, there are far too many users IMHO; the change would be huge,
> > unclean and messy.
> >
> > $ grep -Hrn altivec.h
> > drivers/net/i40e/i40e_rxtx_vec_altivec.c:45:#include 
> > examples/l3fwd/l3fwd_lpm.c:165:#include "l3fwd_lpm_altivec.h"
> > examples/l3fwd/l3fwd_lpm_altivec.h:10:#include "l3fwd_altivec.h"
> > MAINTAINERS:239:F: examples/l3fwd/*altivec.h
> > lib/librte_acl/acl_run_altivec.c:34:#include "acl_run_altivec.h"
> > lib/librte_eal/common/include/arch/ppc_64/rte_memcpy.h:49:/*To include
> > altivec.h, GCC version must  >= 4.8 */
> > lib/librte_eal/common/include/arch/ppc_64/rte_memcpy.h:50:#include <
> > altivec.h>
> > lib/librte_eal/common/include/arch/ppc_64/rte_vect.h:36:#include
> 
> >
> > lib/librte_lpm/meson.build:9:headers += files('rte_lpm_altivec.h',
> > 'rte_lpm_neon.h', 'rte_lpm_sse.h')
> > lib/librte_lpm/Makefile:28:SYMLINK-$(CONFIG_RTE_LIBRTE_LPM)-include +=
> > rte_lpm_altivec.h
> > lib/librte_lpm/rte_lpm.h:461:#include "rte_lpm_altivec.h"
>
> I'd still like to give it a try given only known users of AltiVec code may
> rely on these vector/pixel/bool definitions. Scope should be quite small.
>
> The root issue we need to address is that DPDK applications may
> involuntarily pull altivec.h by including something unrelated
> (rte_memcpy.h)
> and get unwanted bool/vector/pixel macros polluting their namespace and
> breaking things.
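
For reference, a small sketch of what "rely on __vector directly" could look like for the xmm_t typedef. This is an illustration under assumptions, not the eventual DPDK change: the non-AltiVec branch is only there so the snippet builds elsewhere (it uses the GCC/Clang vector extension), and `xmm_lane_count` is a hypothetical helper.

```c
#include <assert.h>

#if defined(__ALTIVEC__)
#include <altivec.h>
/*
 * Spell the type with the __vector keyword, so the typedef keeps working
 * even if the "vector" macro from altivec.h is not available in the
 * including translation unit.
 */
typedef __vector signed int xmm_t;
#else
/* Illustration-only fallback for non-AltiVec targets (GCC/Clang). */
typedef int xmm_t __attribute__((vector_size(16)));
#endif

/* Hypothetical helper: number of 32-bit lanes in xmm_t. */
static int
xmm_lane_count(void)
{
	assert(sizeof(xmm_t) == 16);
	return (int)(sizeof(xmm_t) / sizeof(int));
}
```

Headers written this way no longer care whether an application has the vector/bool/pixel macros visible, which is the namespace problem described above.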
>
> > > Also I would suggest not to make this workaround C11-only. I suspect
> the
> > > same issue will be encountered with -std=c99 or -std=c90. Keep in mind
> DPDK
> > > applications are free to specify their own CFLAGS.
> > >
> >
> > Yeah, independent of the other part of the discussion, I think we can
> > make it apply generally and not just to C11.
> >
> > The following "would work" in the code right now.
> > --- a/lib/librte_eal/common/include/arch/ppc_64/rte_memcpy.h
> > +++ b/lib/librte_eal/common/inclu

Re: [dpdk-dev] [PATCH v5] app/testpmd: add forwarding mode to simulate a noisy neighbour

2018-08-29 Thread Kevin Traynor
On 08/10/2018 03:49 PM, Jens Freimann wrote:
> This adds a new forwarding mode to testpmd to simulate
> more realistic behavior of a guest machine engaged in receiving
> and sending packets performing Virtual Network Function (VNF).
> 

Hi Jens, comments below,

thanks,
Kevin.

> The goal is to enable a simple way of measuring performance impact on
> cache and memory footprint utilization from various VNF co-located on
> the same host machine. For this it does:
> 
> * Buffer packets in a FIFO:
> 
> Create a fifo to buffer received packets. Once it overflows, put
> those packets into the actual tx queue. The fifo is created per tx
> queue and its size can be set with the --buffersize-before-sending
> commandline parameter.

--noisy-tx-sw-buffer-flushtime

> 
> A second commandline parameter is used to set a timeout in
> milliseconds after which the fifo is flushed.
> 
> --noisy-tx-sw-buffer-size [packet numbers]
> Keep the mbuf in a FIFO and forward the overflowing packets from the
> FIFO. This queue is per TX-queue (after all other packet processing).
> 
> --noisy-tx-sw-buffer-flushtime [delay]
> Flush the packet queue if no packets have been seen during
> [delay]. As long as packets are seen, the timer is reset.
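
One possible reading of the buffering and flush behaviour described above, as a toy sketch. It is not the code from noisy_vnf.c: packets are plain ints, and `struct sw_fifo`, `fifo_step` and `FIFO_CAP` are hypothetical names standing in for the per-TX-queue ring and the two command-line parameters.

```c
#include <assert.h>
#include <stdint.h>

#define FIFO_CAP 4	/* stand-in for --noisy-tx-sw-buffer-size */

struct sw_fifo {
	int slots[FIFO_CAP];
	unsigned int head;	/* index of the oldest element */
	unsigned int count;
	uint64_t last_rx_ms;	/* time packets were last seen */
	uint64_t flush_ms;	/* stand-in for --noisy-tx-sw-buffer-flushtime */
};

static int
fifo_pop(struct sw_fifo *f)
{
	int v;

	assert(f->count > 0);
	v = f->slots[f->head];
	f->head = (f->head + 1) % FIFO_CAP;
	f->count--;
	return v;
}

static void
fifo_push(struct sw_fifo *f, int v)
{
	assert(f->count < FIFO_CAP);
	f->slots[(f->head + f->count) % FIFO_CAP] = v;
	f->count++;
}

/*
 * Process one rx burst. Returns the number of packets written to out[],
 * i.e. the packets that should go to the real tx queue now.
 * out[] must have room for max(n_in, FIFO_CAP) entries.
 */
static unsigned int
fifo_step(struct sw_fifo *f, const int *in, unsigned int n_in,
	  uint64_t now_ms, int *out)
{
	unsigned int n_out = 0;
	unsigned int i;

	if (n_in > 0) {
		/* Buffer new packets; the oldest ones overflow to tx. */
		for (i = 0; i < n_in; i++) {
			if (f->count == FIFO_CAP)
				out[n_out++] = fifo_pop(f);
			fifo_push(f, in[i]);
		}
		f->last_rx_ms = now_ms;	/* packets seen: reset the timer */
	} else if (now_ms - f->last_rx_ms >= f->flush_ms) {
		/* Idle for the whole delay: flush everything. */
		while (f->count > 0)
			out[n_out++] = fifo_pop(f);
	}
	return n_out;
}
```

With a capacity of 4 and a burst of 6 packets, the two oldest packets are forwarded immediately and the remaining four stay buffered until either more traffic pushes them out or the flush delay expires.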
> 
> Add several options to simulate route lookups (memory reads) in tables
> that can be quite large, as well as route hit statistics update.
> These options simulate the whole stack traversal and
> will thrash the cache. Memory access is random.
> 
> * simulate route lookups:
> 
> Allocate a buffer and perform reads and writes on it as specified by
> commandline options:
> 
> --noisy-lkup-memory [size]
> Size of the VNF internal memory (MB), in which the random
> read/write will be done, allocated by rte_malloc (hugepages).
> 
> --noisy-lkup-num-writes [num]
> Number of random writes in memory to perform per packet,
> simulating hit-flags update. 64 bits per write,
> all writes in different cache lines.
> 
> --noisy-lkup-num-reads [num]
> Number of random reads in memory to perform per packet,
> simulating FIB/table lookups. 64 bits per read,
> all reads in different cache lines.
> 
> --noisy-lkup-num-reads-writes [num]
> Number of random reads and writes in memory per packet should
> be performed, simulating stats update. 64 bits per read-write, all
> reads and writes in different cache lines.
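
As a point of comparison for the review, a sketch of "one 64-bit access per randomly chosen cache line" as the description above states it. This is an illustration under assumptions, not the patch code: `CACHE_LINE` is a fixed 64 here (DPDK would use RTE_CACHE_LINE_SIZE), the random value is passed in rather than taken from rte_rand(), and the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CACHE_LINE 64	/* assumed line size for this sketch */

/* Map a random number to the byte offset of the start of a cache line. */
static size_t
line_offset(uint64_t rnd, size_t mem_bytes)
{
	size_t lines = mem_bytes / CACHE_LINE;

	assert(lines > 0);
	return (size_t)(rnd % lines) * CACHE_LINE;
}

/* Write 64 bits at the start of the selected cache line. */
static void
sim_write(char *mem, size_t mem_bytes, uint64_t rnd, uint64_t val)
{
	memcpy(mem + line_offset(rnd, mem_bytes), &val, sizeof(val));
}

/* Read 64 bits from the start of the selected cache line. */
static uint64_t
sim_read(const char *mem, size_t mem_bytes, uint64_t rnd)
{
	uint64_t val;

	memcpy(&val, mem + line_offset(rnd, mem_bytes), sizeof(val));
	return val;
}
```

Multiplying the line index by the line size is what makes two different indexes land in different cache lines, and memcpy keeps the 64-bit access well defined for a char buffer.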
> 
> Signed-off-by: Jens Freimann 
> ---
>  app/test-pmd/Makefile   |   1 +
>  app/test-pmd/meson.build|   1 +
>  app/test-pmd/noisy_vnf.c| 269 
> 
>  app/test-pmd/parameters.c   |  60 +++
>  app/test-pmd/testpmd.c  |  35 
>  app/test-pmd/testpmd.h  |  10 ++
>  doc/guides/testpmd_app_ug/run_app.rst   |  33 
>  doc/guides/testpmd_app_ug/testpmd_funcs.rst |   7 +-
>  8 files changed, 414 insertions(+), 2 deletions(-)
>  create mode 100644 app/test-pmd/noisy_vnf.c
> 
> diff --git a/app/test-pmd/Makefile b/app/test-pmd/Makefile
> index 2b4d604..e2581ca 100644
> --- a/app/test-pmd/Makefile
> +++ b/app/test-pmd/Makefile
> @@ -33,6 +33,7 @@ SRCS-y += rxonly.c
>  SRCS-y += txonly.c
>  SRCS-y += csumonly.c
>  SRCS-y += icmpecho.c
> +SRCS-y += noisy_vnf.c
>  SRCS-$(CONFIG_RTE_LIBRTE_IEEE1588) += ieee1588fwd.c
>  SRCS-$(CONFIG_RTE_LIBRTE_BPF) += bpf_cmd.c
>  
> diff --git a/app/test-pmd/meson.build b/app/test-pmd/meson.build
> index a0b3be0..9ef6ed9 100644
> --- a/app/test-pmd/meson.build
> +++ b/app/test-pmd/meson.build
> @@ -17,6 +17,7 @@ sources = files('cmdline.c',
>   'iofwd.c',
>   'macfwd.c',
>   'macswap.c',
> + 'noisy_vnf.c',
>   'parameters.c',
>   'rxonly.c',
>   'testpmd.c',
> diff --git a/app/test-pmd/noisy_vnf.c b/app/test-pmd/noisy_vnf.c
> new file mode 100644
> index 000..dcde7d0
> --- /dev/null
> +++ b/app/test-pmd/noisy_vnf.c
> @@ -0,0 +1,269 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2018 Red Hat Corp.
> + */
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#include 
> +#include 
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#include "testpmd.h"
> +
> +struct noisy_config {
> + struct rte_ring *f;
> + uint64_t prev_time;
> + char *vnf_mem;
> + bool do_buffering;
> + bool do_flush;
> + bool do_sim;
> +};
> +
> +struct noisy_config *noisy_cfg[RTE_MAX_ETHPORTS];
> +
> +static inline void
> +do_write(char *vnf_mem)
> +{
> + uint64_t i = rte_rand();
> + uint64_t w = rte_rand();
> +
> + vnf_mem[i % ((noisy_lkup_mem_sz * 1024 * 1024) /
> + RTE_CACHE_LINE_SIZE)] = w;
> +}
> +
> +static inline void
> +do_read(char *vnf_mem)
> +{
> + uint64_t i = rte_rand();
> + uint64_t r;
> +
> + r = vnf

Re: [dpdk-dev] [dpdk-stable] [PATCH v4] net/bonding: per-slave intermediate rx ring

2018-08-29 Thread Matan Azrad


From: Chas Williams
>On Tue, Aug 28, 2018 at 5:51 AM Matan Azrad  wrote:
>
>
>From: Chas Williams
>>On Mon, Aug 27, 2018 at 11:30 AM Matan Azrad 
>> wrote:
>
>>>>Because rings are generally quite efficient.
>>>
>>>But you are using a ring in addition to regular array management, it must 
>>>hurt performance of the bonding PMD
>>>(means the bonding itself - not the slaves PMDs which are called from the 
>>>bonding)
>>
>>It adds latency.
>
>And by that hurts the application performance because it takes more CPU time 
>in the bonding PMD.
>
>No, as I said before it takes _less_ CPU time in the bonding PMD
>because we use a more optimal read from the slaves.

Each packet pointer has to be copied two more times because of this patch, plus
some management (the ring overhead).
So in the bonding code you lose performance.

>
>>It increases performance because we spend less CPU time reading from the PMDs.
>
>So, it's a hack in the bonding PMD to improve some slaves' code performance
>while hurting the bonding code performance.
>Overall, the performance we gain for those slaves improves the application
>performance only when working with those slaves,
>but may hurt the application performance when working with other slaves.
>
>What is your evidence that it hurts bonding performance?  Your
>argument is purely theoretical.
Yes, we cannot test all the scenarios across the PMDs.

>  I could easily argue that even for non-vectorized PMDs there is a
>performance gain because we
>spend less time switching between PMDs.

But spend more time in the bonding part.

>If you are going to read from a PMD you should attempt to read as much as
>possible. It's
>expensive to read the card's registers and perform the queue
>manipulations.

You do it anyway.

The context change is expensive, but so are the extra copies per packet and
the ring management.

We have a tradeoff here that may play out differently in other scenarios.

>
>>  This means we have more CPU to use for
>>post processing (i.e. routing).
>
>
>
>>>>Bonding is in a middle ground between application and PMD.
>>>Yes.
>>>>What bonding is doing, may not improve all applications.
>>>Yes, but it can be solved using some bonding modes.
>>>> If using a ring to buffer the vectorized receive routines, improves your
>>>> particular application,
>>>>that's great.
>>>It may be not great and even bad for some other PMDs which are not
>>>vectorized.
>>>
>>>> However, I don't think I can say that it would help all
>>>>applications.  As you point out, there is overhead associated with
>>>>a ring.
>>>Yes.
>>>>Bonding's receive burst isn't especially efficient (in mode 4).
>>>
>>>Why?
>>>
>>>It makes a copy of the slaves, has a fair bit of stack usage, 
>>>needs to check the slave status, and needs to examine each
>>>packet to see if it is a slow protocol packet.  So each
>>>packet is essentially read twice.  The fast queue code for mode 4
>>>avoids some of this (and probably ignores checking collecting
>>>incorrectly).  If you find a slow protocol packet, you need to
>>>chop it out of the array with memmove due to overlap.
>>
>>Agree.
>>So to improve the bonding performance you need to optimize the above
>>problems.
>>There is no connection to the ring.
>>
>>And as I have described numerous times, these problems
>>can't be easily fixed and preserve the existing API.
>
>Sometimes we need to work harder to see a gain for all.
>We should not apply a patch because it is easy and shows a gain for specific
>scenarios.
>
>>>> Bonding benefits from being able to read as much as possible (within
>>>> limits of
>>>>course, large reads would blow out caches) from each slave.
>>>
>>>The slave PMDs can benefit in the same way.
>>>
>>>>It can't return all that data though because applications tend to use the
>>>>burst size that would be efficient for a typical PMD.
>>>
>>>What is the preferred burst size of the bonding? Maybe the application 
>>>should use it when they are using bonding.
>>>
>>>The preferred burst size for bonding would be the sum of all the
>>>slaves ideal read size.  However, that's not likely to be simple
>>>since most applications decide early the size for the read/write
>>>burst operations.
>>> 
>>>>An alternative might be to ask bonding applications to simply issue larger
>>>>reads for
>>>>certain modes.  That's probably not as easy as it sounds given the
>>>>way that the burst length affects multiplexing.
>>>
>>>Can you explain it more?
>>>
>>>A longer burst size on one PMD will tend to favor that PMD
>>>over others.  It will fill your internal queues with more 
>>>of its packets.
>>
>>Agree, it's about fairness.
>> 
>>>
>>>>Another solution might be just alternatively poll the individual
>>>>slaves on each rx burst.  But that means you need to poll at a
>>>>faster rate.  Depending on your application, you might not be
>>>>able to do that.
>>
>>>Again, can you be more precise in the above explanation?
>>>
>>>If th

Re: [dpdk-dev] [PATCH] mbuf: add IGMP packet type

2018-08-29 Thread Stephen Hemminger
On Mon, 27 Aug 2018 18:08:35 +0530
Jerin Jacob  wrote:

> Add support for IGMP packet type.
> 
> Signed-off-by: Jerin Jacob 

Could you add logic to recognize IGMP to the software packet type
identification in rte_net_get_ptype(), used by drivers that don't have
hardware support?

Also shouldn't this bit be part of RTE_PTYPE_L4_MASK?



Re: [dpdk-dev] [PATCH v3] app/testpmd: add new command for show port info

2018-08-29 Thread Stephen Hemminger
On Wed, 29 Aug 2018 11:47:29 +0100
Emma Finn  wrote:

> The existing testpmd command "show port info" is too verbose.
> Add a new summary command to print brief information on ports.
> 
> console output:
>   testpmd> show port summary all  
>   Number of available ports: 2
>   Port MAC Address       Name      Driver    Status Link
>   0    11:22:33:44:55:66 :07:00.0, net_i40e, up,    4 Mbps
>   1    66:55:44:33:22:11 :07:00.1, net_i40e, up,    4 Mbps
> 
> Signed-off-by: Emma Finn 
> 

Looks good, thanks for doing this.
Is there a good way to handle ports that are "owned" as in bonding/failsafe etc?

Reviewed-by: Stephen Hemminger 



Re: [dpdk-dev] [PATCH v2 10/10] kni: add API to set link status on kernel interface

2018-08-29 Thread Stephen Hemminger
On Thu, 28 Jun 2018 18:55:08 -0700
Dan Gora  wrote:

> Add a new API function to KNI, rte_kni_update_link() to allow DPDK
> applications to update the link state for the KNI network interfaces
> in the linux kernel.
> 
> Note that the default carrier state is set to off when the interface
> is opened.
> 
> Signed-off-by: Dan Gora 

Do you really need a special ioctl for this?
There is already ability to set link state via sysfs or netlink.


[dpdk-dev] [PATCH v3] net/virtio-user: check negotiated features before set

2018-08-29 Thread eric zhang
This patch checks the negotiated features to see whether offloading is
necessary before setting the tap device offload capabilities. It also checks
whether the kernel supports the TUNSETOFFLOAD operation.

Signed-off-by: eric zhang 

---
v3:
* make other offloading features depend on CSUM
* check IFF_VNET_HDR support when handling VHOST_GET_FEATURES

---
v2:
* don't return failure when failed to set offload to tap
* check if offloads available when handling VHOST_GET_FEATURES
---
 drivers/net/virtio/virtio_user/vhost_kernel.c | 18 +---
 drivers/net/virtio/virtio_user/vhost_kernel_tap.c | 56 +--
 drivers/net/virtio/virtio_user/vhost_kernel_tap.h |  2 +-
 3 files changed, 54 insertions(+), 22 deletions(-)

diff --git a/drivers/net/virtio/virtio_user/vhost_kernel.c b/drivers/net/virtio/virtio_user/vhost_kernel.c
index dd24b6b..3502001 100644
--- a/drivers/net/virtio/virtio_user/vhost_kernel.c
+++ b/drivers/net/virtio/virtio_user/vhost_kernel.c
@@ -189,8 +189,8 @@ struct vhost_memory_kernel {
 (1ULL << VIRTIO_NET_F_HOST_TSO6) | \
 (1ULL << VIRTIO_NET_F_CSUM))
 
-static int
-tap_supporte_mq(void)
+static unsigned int
+tap_support_features(void)
 {
int tapfd;
unsigned int tap_features;
@@ -209,7 +209,7 @@ struct vhost_memory_kernel {
}
 
close(tapfd);
-   return tap_features & IFF_MULTI_QUEUE;
+   return tap_features;
 }
 
 static int
@@ -223,6 +223,7 @@ struct vhost_memory_kernel {
struct vhost_memory_kernel *vm = NULL;
int vhostfd;
unsigned int queue_sel;
+   unsigned int features;
 
PMD_DRV_LOG(INFO, "%s", vhost_msg_strings[req]);
 
@@ -276,17 +277,20 @@ struct vhost_memory_kernel {
}
 
if (!ret && req_kernel == VHOST_GET_FEATURES) {
+   features = tap_support_features();
/* with tap as the backend, all these features are supported
 * but not claimed by vhost-net, so we add them back when
 * reporting to upper layer.
 */
-   *((uint64_t *)arg) |= VHOST_KERNEL_GUEST_OFFLOADS_MASK;
-   *((uint64_t *)arg) |= VHOST_KERNEL_HOST_OFFLOADS_MASK;
+   if (features & IFF_VNET_HDR) {
+   *((uint64_t *)arg) |= VHOST_KERNEL_GUEST_OFFLOADS_MASK;
+   *((uint64_t *)arg) |= VHOST_KERNEL_HOST_OFFLOADS_MASK;
+   }
 
/* vhost_kernel will not declare this feature, but it does
 * support multi-queue.
 */
-   if (tap_supporte_mq())
+   if (features & IFF_MULTI_QUEUE)
*(uint64_t *)arg |= (1ull << VIRTIO_NET_F_MQ);
}
 
@@ -381,7 +385,7 @@ struct vhost_memory_kernel {
hdr_size = sizeof(struct virtio_net_hdr);
 
tapfd = vhost_kernel_open_tap(&dev->ifname, hdr_size, req_mq,
-(char *)dev->mac_addr);
+(char *)dev->mac_addr, dev->features);
if (tapfd < 0) {
PMD_DRV_LOG(ERR, "fail to open tap for vhost kernel");
return -1;
diff --git a/drivers/net/virtio/virtio_user/vhost_kernel_tap.c b/drivers/net/virtio/virtio_user/vhost_kernel_tap.c
index d036428..e9ee774 100644
--- a/drivers/net/virtio/virtio_user/vhost_kernel_tap.c
+++ b/drivers/net/virtio/virtio_user/vhost_kernel_tap.c
@@ -45,21 +45,55 @@
 
 #include "vhost_kernel_tap.h"
 #include "../virtio_logs.h"
+#include "../virtio_pci.h"
+
+static int
+vhost_kernel_tap_set_offload(int fd, uint64_t features)
+{
+   unsigned int offload = 0;
+
+   if (features & (1ULL << VIRTIO_NET_F_GUEST_CSUM)) {
+   offload |= TUN_F_CSUM;
+   if (features & (1ULL << VIRTIO_NET_F_GUEST_TSO4))
+   offload |= TUN_F_TSO4;
+   if (features & (1ULL << VIRTIO_NET_F_GUEST_TSO6))
+   offload |= TUN_F_TSO6;
+   if (features & ((1ULL << VIRTIO_NET_F_GUEST_TSO4) |
+   (1ULL << VIRTIO_NET_F_GUEST_TSO6)) &&
+   (features & (1ULL << VIRTIO_NET_F_GUEST_ECN)))
+   offload |= TUN_F_TSO_ECN;
+   if (features & (1ULL << VIRTIO_NET_F_GUEST_UFO))
+   offload |= TUN_F_UFO;
+   }
+
+   if (offload != 0) {
+   /* Check if our kernel supports TUNSETOFFLOAD */
+   if (ioctl(fd, TUNSETOFFLOAD, 0) != 0 && errno == EINVAL) {
+   PMD_DRV_LOG(ERR, "Kernel doesn't support TUNSETOFFLOAD\n");
+   return -ENOTSUP;
+   }
+
+   if (ioctl(fd, TUNSETOFFLOAD, offload) != 0) {
+   offload &= ~TUN_F_UFO;
+   if (ioctl(fd, TUNSETOFFLOAD, offload) != 0) {
+   PMD_DRV_LOG(ERR, "TUNSETOFFLOAD ioctl() failed: %s\n",
+   strerror(errno));
+   return -1;
+  

[dpdk-dev] [PATCH] eal: force IOVA mode to physical

2018-08-29 Thread eric zhang
This patch adds a configuration option to force the IOVA mode to
physical address (PA). There exist virtual devices that are not
directly attached to the PCI bus, and therefore the auto detection
of the IOVA mode based on probing the PCI bus and IOMMU configuration
may not report the required addressing mode. Having the configuration
option permits the mode to be explicitly configured in this scenario.

Signed-off-by: eric zhang 
---
 lib/librte_eal/linuxapp/eal/eal.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index e0b5ae1..bee4aed 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -805,6 +805,7 @@ static void rte_eal_init_alert(const char *msg)
return -1;
}
 
+#ifndef RTE_EAL_IOVA_MODE_PA
/* autodetect the iova mapping mode (default is iova_pa) */
rte_eal_get_configuration()->iova_mode = rte_bus_get_iommu_class();
 
@@ -816,6 +817,12 @@ static void rte_eal_init_alert(const char *msg)
"Some devices want IOVA as VA but PA will be used because.. "
"KNI module inserted\n");
}
+#else
+   /* Force iova mapping mode to be physical address */
+   rte_eal_get_configuration()->iova_mode = RTE_IOVA_PA;
+   RTE_LOG(WARNING, EAL,
+   "Force the iova mapping mode to be physical address\n");
+#endif
 
if (internal_config.no_hugetlbfs == 0 &&
internal_config.process_type != RTE_PROC_SECONDARY &&
-- 
1.8.3.1



Re: [dpdk-dev] [PATCH] mbuf: add IGMP packet type

2018-08-29 Thread Jerin Jacob
-Original Message-
> Date: Wed, 29 Aug 2018 08:31:10 -0700
> From: Stephen Hemminger 
> To: Jerin Jacob 
> Cc: dev@dpdk.org, olivier.m...@6wind.com
> Subject: Re: [dpdk-dev]  [PATCH] mbuf: add IGMP packet type
> 
> External Email
> 
> On Mon, 27 Aug 2018 18:08:35 +0530
> Jerin Jacob  wrote:
> 
> > Add support for IGMP packet type.
> >
> > Signed-off-by: Jerin Jacob 
> 
> Could you add logic to recognize IGMP to the software packet type
> identification in rte_net_get_ptype(), used by drivers that don't have
> hardware support?

If everyone agrees then I can do it, though adding IGMP support will reduce
the performance of rte_net_get_ptype() and most NICs may not need it.

Opinions?

> 
> Also shouldn't this bit be part of RTE_PTYPE_L4_MASK?

The RTE_PTYPE_L4_MASK is 0x0f00 so it is part of it. Right?

> 


[dpdk-dev] [PATCH] test-meson-builds: add 32-bit compilation test

2018-08-29 Thread Bruce Richardson
Add in a cross-file to enable 32-bit compile tests as part
of the test-meson-builds script.

Signed-off-by: Bruce Richardson 
---
NOTE: For ease of use, it's recommended that meson 0.47 be used for
this testing. With earlier versions, it may be necessary to ensure that
the same development packages are installed for both 64-bit and 32-bit.
---
 config/x86/i686_sse4_linuxapp_gcc | 18 ++
 devtools/test-meson-builds.sh |  4 
 2 files changed, 22 insertions(+)
 create mode 100644 config/x86/i686_sse4_linuxapp_gcc

diff --git a/config/x86/i686_sse4_linuxapp_gcc b/config/x86/i686_sse4_linuxapp_gcc
new file mode 100644
index 0..6bca8e336
--- /dev/null
+++ b/config/x86/i686_sse4_linuxapp_gcc
@@ -0,0 +1,18 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2018 Intel Corporation
+
+[binaries]
+c = 'gcc'
+cpp = 'cpp'
+ar = 'ar'
+strip = 'strip'
+
+[properties]
+c_args = ['-m32']
+c_link_args = ['-m32']
+
+[host_machine]
+system = 'linux'
+cpu_family = 'x86'
+cpu = 'nehalem'
+endian = 'little'
diff --git a/devtools/test-meson-builds.sh b/devtools/test-meson-builds.sh
index 951c9067a..66723fe2b 100755
--- a/devtools/test-meson-builds.sh
+++ b/devtools/test-meson-builds.sh
@@ -44,6 +44,10 @@ done
 # test compilation with minimal x86 instruction set
 build build-x86-default -Dmachine=nehalem
 
+# test 32-bit x86 compilation
+# NOTE: meson >0.47 recommended for best results
+build build-i686 --cross-file config/x86/i686_sse4_linuxapp_gcc
+
 # enable cross compilation if gcc cross-compiler is found
 c=aarch64-linux-gnu-gcc
 if command -v $c >/dev/null 2>&1 ; then
-- 
2.17.1



[dpdk-dev] [PATCH] build: add configuration summary at end of config

2018-08-29 Thread Bruce Richardson
After running meson to configure a DPDK build, it can be useful to know
what was automatically enabled or disabled. Therefore, print out by way of
summary a categorised list of libraries and drivers to be built.

Signed-off-by: Bruce Richardson 
---
 drivers/meson.build |  5 +
 lib/meson.build |  3 +++
 meson.build | 31 +++
 3 files changed, 39 insertions(+)

diff --git a/drivers/meson.build b/drivers/meson.build
index f94e2fe67..b6ce974de 100644
--- a/drivers/meson.build
+++ b/drivers/meson.build
@@ -24,6 +24,7 @@ foreach class:driver_classes
 # version file for linking
 
subdir(class)
+   class_drivers = []
 
foreach drv:drivers
drv_path = join_paths(class, drv)
@@ -51,6 +52,8 @@ foreach class:driver_classes
subdir(drv_path)
 
if build
+   class_drivers += name
+
dpdk_conf.set(config_flag_fmt.format(name.to_upper()),1)
lib_name = driver_name_fmt.format(name)
 
@@ -141,4 +144,6 @@ foreach class:driver_classes
set_variable('static_@0@'.format(lib_name), static_dep)
endif # build
endforeach
+
+   set_variable(class + '_drivers', class_drivers)
 endforeach
diff --git a/lib/meson.build b/lib/meson.build
index 71f35d162..3acc67e6e 100644
--- a/lib/meson.build
+++ b/lib/meson.build
@@ -30,6 +30,8 @@ default_cflags = machine_args
 if cc.has_argument('-Wno-format-truncation')
default_cflags += '-Wno-format-truncation'
 endif
+
+enabled_libs = [] # used to print summary at the end
 foreach l:libraries
build = true
name = l
@@ -55,6 +57,7 @@ foreach l:libraries
subdir(dir_name)
 
if build
+   enabled_libs += name
dpdk_conf.set('RTE_LIBRTE_' + name.to_upper(), 1)
install_headers(headers)
 
diff --git a/meson.build b/meson.build
index 84af32ece..7332e75b5 100644
--- a/meson.build
+++ b/meson.build
@@ -73,3 +73,34 @@ pkg.generate(name: meson.project_name(),
subdirs: [get_option('include_subdir_arch'), '.'],
extra_cflags: ['-include', 'rte_config.h'] + machine_args
 )
+
+# final output, list all the libs and drivers to be built
+# this does not affect any part of the build, for information only.
+output_message = '\n=\nLibraries Enabled\n=\n'
+output_message += '\nlibs:\n\t'
+output_count = 0
+foreach lib:enabled_libs
+   output_message += lib + ', '
+   output_count += 1
+   if output_count == 8
+   output_message += '\n\t'
+   output_count = 0
+   endif
+endforeach
+message(output_message + '\n')
+
+output_message = '\n===\nDrivers Enabled\n===\n'
+foreach class:driver_classes
+   class_drivers = get_variable(class + '_drivers')
+   output_message += '\n' + class + ':\n\t'
+   output_count = 0
+   foreach drv:class_drivers
+   output_message += drv + ', '
+   output_count += 1
+   if output_count == 8
+   output_message += '\n\t'
+   output_count = 0
+   endif
+   endforeach
+endforeach
+message(output_message + '\n')
-- 
2.11.0



Re: [dpdk-dev] [PATCH v3] app/testpmd: add new command for show port info

2018-08-29 Thread Ferruh Yigit
On 8/29/2018 11:47 AM, Emma Finn wrote:
> existing testpmd command "show port info" is too verbose.
> Added a new summary command to print brief information on ports.
> 
> console output:
>   testpmd> show port summary all
>   Number of available ports: 2
>   Port MAC Address       Name      Driver    Status Link
>   0    11:22:33:44:55:66 :07:00.0, net_i40e, up,    4 Mbps
>   1    66:55:44:33:22:11 :07:00.1, net_i40e, up,    4 Mbps
> 
> Signed-off-by: Emma Finn 

When a header is used, I think there is no need for "," between fields.
Apart from this,

Reviewed-by: Ferruh Yigit 


Re: [dpdk-dev] [PATCH] mbuf: add IGMP packet type

2018-08-29 Thread Stephen Hemminger
On Wed, 29 Aug 2018 21:29:05 +0530
Jerin Jacob  wrote:

> -Original Message-
> > Date: Wed, 29 Aug 2018 08:31:10 -0700
> > From: Stephen Hemminger 
> > To: Jerin Jacob 
> > Cc: dev@dpdk.org, olivier.m...@6wind.com
> > Subject: Re: [dpdk-dev]  [PATCH] mbuf: add IGMP packet type
> > 
> > External Email
> > 
> > On Mon, 27 Aug 2018 18:08:35 +0530
> > Jerin Jacob  wrote:
> >   
> > > Add support for IGMP packet type.
> > >
> > > Signed-off-by: Jerin Jacob   
> > 
> > Could you add logic to recognize IGMP to the software packet type
> > identification rte_net_get_ptype used by drivers that don't have hardware
> > support.
> 
> If everyone agrees then I can do it, though adding IGMP support will reduce
> the performance of rte_net_get_ptype() and most NICs may not need it.
> 
> Opinions?
> 
> > 
> > Also shouldn't this bit be part of RTE_PTYPE_L4_MASK?  
> 
> The RTE_PTYPE_L4_MASK is 0x0f00 so it is part of it. Right?

Then you must add it to the software matcher since most drivers are
advertising L4_MASK



Re: [dpdk-dev] [PATCH] mbuf: add IGMP packet type

2018-08-29 Thread Stephen Hemminger
On Wed, 29 Aug 2018 21:29:05 +0530
Jerin Jacob  wrote:

> -Original Message-
> > Date: Wed, 29 Aug 2018 08:31:10 -0700
> > From: Stephen Hemminger 
> > To: Jerin Jacob 
> > Cc: dev@dpdk.org, olivier.m...@6wind.com
> > Subject: Re: [dpdk-dev]  [PATCH] mbuf: add IGMP packet type
> > 
> > External Email
> > 
> > On Mon, 27 Aug 2018 18:08:35 +0530
> > Jerin Jacob  wrote:
> >   
> > > Add support for IGMP packet type.
> > >
> > > Signed-off-by: Jerin Jacob   
> > 
> > Could you add logic to recognize IGMP to the software packet type
> > identification rte_net_get_ptype used by drivers that don't have hardware
> > support.
> 
> If everyone agrees then I can do it, though adding IGMP support will reduce
> the performance of rte_net_get_ptype() and most NICs may not need it.

Since IGMP is an IP protocol field and the code is already looking for
TCP and UDP, how could adding another else slow it down in any observable way?
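
To make the "one more else" argument concrete, here is a minimal sketch of a
software classifier that maps the IPv4 protocol field to an L4 packet type.
It is not the rte_net_get_ptype() implementation. All SKETCH_* names are
hypothetical, and the packet type values (including the one used for IGMP)
are assumptions chosen only so that they fit inside the 0x0f00 mask quoted
in this thread.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the RTE_PTYPE_L4_* values. The numbers are
 * assumptions for illustration; only the 0x0f00 mask comes from the thread. */
#define SKETCH_PTYPE_L4_MASK 0x00000f00u
#define SKETCH_PTYPE_L4_TCP  0x00000100u
#define SKETCH_PTYPE_L4_UDP  0x00000200u
#define SKETCH_PTYPE_L4_IGMP 0x00000700u /* assumed value */

/* IANA-assigned IPv4 protocol numbers. */
#define SKETCH_IPPROTO_IGMP 2
#define SKETCH_IPPROTO_TCP  6
#define SKETCH_IPPROTO_UDP  17

/* Map the IPv4 protocol field to an L4 packet type; 0 means "not recognized".
 * Supporting IGMP costs one extra case on a value that is already loaded. */
static inline uint32_t
sketch_l4_ptype(uint8_t ip_proto)
{
	switch (ip_proto) {
	case SKETCH_IPPROTO_TCP:
		return SKETCH_PTYPE_L4_TCP;
	case SKETCH_IPPROTO_UDP:
		return SKETCH_PTYPE_L4_UDP;
	case SKETCH_IPPROTO_IGMP:
		return SKETCH_PTYPE_L4_IGMP;
	default:
		return 0;
	}
}
```

With these assumed values, (sketch_l4_ptype(2) & SKETCH_PTYPE_L4_MASK) returns
the IGMP type unchanged, which is the sense in which the new bit is "part of"
the L4 mask. Whether the extra case is measurable in the real function would
still need a benchmark.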


Re: [dpdk-dev] [PATCH] mbuf: add IGMP packet type

2018-08-29 Thread Jerin Jacob
-Original Message-
> Date: Wed, 29 Aug 2018 09:34:36 -0700
> From: Stephen Hemminger 
> To: Jerin Jacob 
> Cc: dev@dpdk.org, olivier.m...@6wind.com
> Subject: Re: [dpdk-dev]  [PATCH] mbuf: add IGMP packet type
> 
> External Email
> 
> On Wed, 29 Aug 2018 21:29:05 +0530
> Jerin Jacob  wrote:
> 
> > -Original Message-
> > > Date: Wed, 29 Aug 2018 08:31:10 -0700
> > > From: Stephen Hemminger 
> > > To: Jerin Jacob 
> > > Cc: dev@dpdk.org, olivier.m...@6wind.com
> > > Subject: Re: [dpdk-dev]  [PATCH] mbuf: add IGMP packet type
> > >
> > > External Email
> > >
> > > On Mon, 27 Aug 2018 18:08:35 +0530
> > > Jerin Jacob  wrote:
> > >
> > > > Add support for IGMP packet type.
> > > >
> > > > Signed-off-by: Jerin Jacob 
> > >
> > > Could you add logic to recognize IGMP to the software packet type
> > > identification rte_net_get_ptype used by drivers that don't have
> > > hardware support.
> >
> > If everyone agrees then I can do it, though adding IGMP support will reduce
> > the performance of rte_net_get_ptype() and most NICs may not need it.
> >
> > Opinions?
> >
> > >
> > > Also shouldn't this bit be part of RTE_PTYPE_L4_MASK?
> >
> > The RTE_PTYPE_L4_MASK is 0x0f00 so it is part of it. Right?
> 
> Then you must add it to the software matcher since most drivers are 
> advertising L4_MASK

Which driver returns .dev_supported_ptypes_get ethdev ops with L4_MASK?

> 


Re: [dpdk-dev] [PATCH 7/7] eal: improve musl compatibility

2018-08-29 Thread Stephen Hemminger
On Wed, 29 Aug 2018 12:56:21 +0100
Anatoly Burakov  wrote:

> Musl complains about pthread id being of wrong size. Fix it by
> casting to 64-bit and printing 64-bit hex unconditionally.
> 
> Signed-off-by: Anatoly Burakov 

What is pthread_t on musl? On Linux it is unsigned long.


Re: [dpdk-dev] [PATCH 7/7] eal: improve musl compatibility

2018-08-29 Thread Stephen Hemminger
On Wed, 29 Aug 2018 15:09:47 +0100
"Burakov, Anatoly"  wrote:

> On 29-Aug-18 1:39 PM, Bruce Richardson wrote:
> > On Wed, Aug 29, 2018 at 12:56:21PM +0100, Anatoly Burakov wrote:  
> >> Musl complains about pthread id being of wrong size. Fix it by
> >> casting to 64-bit and printing 64-bit hex unconditionally.
> >>
> >> Signed-off-by: Anatoly Burakov 
> >> ---  
> > Given that on linux pthread_t is a pointer type, will this not give other
> > warnings of casting from pointer to integer of a different type when
> > compiling 32-bit? For safety I suggest casting to long or uintptr_t
> > instead, to ensure we always get an int of the right size.
> > 
> > /Bruce
> >   
> 
> Sure, will fix.
> 
> -- 
> Thanks,
> Anatoly

Maybe use gettid() to get thread id which is actually way more useful
than the pointer value. Of course, glibc doesn't want to provide a syscall
wrapper for this.
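
For reference, the two options discussed in this sub-thread could look
roughly like the sketch below. It is not the patch under review. The sketch_*
names are hypothetical. The cast assumes pthread_t is either an integer or a
pointer type, which POSIX does not guarantee, and the second helper is
Linux-only because it issues the gettid system call directly.

```c
#include <inttypes.h>
#include <pthread.h>
#include <stdio.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Format the calling thread's pthread_t as hex. Going through uintptr_t
 * gives an integer of pointer width, so the same format string works for
 * 32-bit and 64-bit builds (assumes pthread_t is an integer or a pointer). */
static inline int
sketch_format_pthread_id(char *buf, size_t len)
{
	return snprintf(buf, len, "0x%" PRIxPTR, (uintptr_t)pthread_self());
}

/* Linux-only: return the kernel thread id using the raw system call,
 * for C libraries that do not provide a gettid() wrapper. */
static inline long
sketch_gettid(void)
{
	return syscall(SYS_gettid);
}
```

The first helper keeps printing the pthread handle, while the second prints
the id that tools such as ps and top show, which is the value argued above to
be more useful in logs.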



Re: [dpdk-dev] IXGBE throughput loss with 4+ cores

2018-08-29 Thread Saber Rezvani




On 08/29/2018 01:39 AM, Wiles, Keith wrote:
>
>> On Aug 28, 2018, at 2:16 PM, Saber Rezvani  wrote:
>>
>> On 08/28/2018 11:39 PM, Wiles, Keith wrote:
>>> Which version of Pktgen? I just pushed a patch in 3.5.3 to fix a
>>> performance problem.
>>
>> I use Pktgen version 3.0.0. It is OK as long as I have one core (10 Gb/s),
>> but when I increase the number of cores (one core per queue) I lose some
>> performance (roughly 8.5 Gb/s for 8 cores). In my scenario Pktgen shows it
>> is generating at line rate, but receiving 8.5 Gb/s.
>> Is it because of Pktgen???
>
> Normally Pktgen can receive at line rate up to 10G with 64-byte frames,
> which means Pktgen should not be the problem. You can verify that by looping
> the cable from one port to another on the pktgen machine to create an
> external loopback. Then whatever traffic you send from one port you should
> be able to receive on the other, unless something is configured wrong.
>
> Please send me the command line for pktgen.
>
>
> In pktgen if you have this config -m “[1-4:5-8].0” then you have 4 cores
> sending traffic and 4 cores receiving packets.
>
> In this case the TX cores will be sending the packets on all 4 lcores to the
> same port. On the rx side you have 4 cores polling 4 rx queues. The rx
> queues are controlled by RSS, which means the 5-tuple hash of the RX traffic
> must divide the inbound packets across all 4 queues to make sure each core
> is doing the same amount of work. If you are sending only a single packet on
> the Tx cores then only one rx queue will be used.
>
> I hope that makes sense.
I think there is a misunderstanding of the problem. Indeed, the problem
is not Pktgen.
Here is my command --> ./app/app/x86_64-native-linuxapp-gcc/pktgen -c
ffc -n 4 -w 84:00.0 -w 84:00.1 --file-prefix pktgen_F2 --socket-mem
1000,2000,1000,1000 -- -T -P -m "[18-19:20-21].0, [22:23].1"


The problem is that when I run the symmetric_mp example with
$numberOfProcesses=8 cores, I get less throughput (roughly 8.4 Gb/s),
but when I run it with $numberOfProcesses=3 cores the throughput is 10G.

for i in `seq $numberOfProcesses`;
    do
        some calculation goes here.
        symmetric_mp -c $coremask -n 2 --proc-type=auto -w 0b:00.0
-w 0b:00.1 --file-prefix sm --socket-mem 4000,1000,1000,1000 -- -p 3
--num-procs=$numberOfProcesses --proc-id=$procid";

        .
    done

I am trying to find out what causes this loss!



On Aug 28, 2018, at 12:05 PM, Saber Rezvani  wrote:



On 08/28/2018 08:31 PM, Stephen Hemminger wrote:

On Tue, 28 Aug 2018 17:34:27 +0430
Saber Rezvani  wrote:


Hi,


I have run the multi_process/symmetric_mp example in the DPDK examples
directory. For one process its throughput is line rate, but as I increase
the number of cores I see a decrease in throughput. For example, if the
number of queues is set to 4 and each queue is assigned to a single core,
then the throughput will be about 9.4 Gb/s; with 8 queues, the throughput
will be 8.5 Gb/s.

I have read the following, but it was not convincing.

http://mails.dpdk.org/archives/dev/2015-October/024960.html


I am eagerly looking forward to hearing from you, all.


Best wishes,

Saber



Not completely surprising. If you have more cores than packet line rate
then the number of packets returned for each call to rx_burst will be less.
With large number of cores, most of the time will be spent doing reads of
PCI registers for no packets!

Indeed pktgen says it is generating traffic at line rate, but receiving less
than 10 Gb/s. So, in that case there should be something that causes the
reduction in throughput :(



Regards,
Keith





Regards,
Keith



Best regards,
Saber




[dpdk-dev] [dpdk-announce] DPDK 16.11.8 (LTS) released

2018-08-29 Thread Luca Boccassi
Hi all,

Here is a new stable release:
https://fast.dpdk.org/rel/dpdk-16.11.8.tar.xz

The git tree is at:
https://dpdk.org/browse/dpdk-stable/?h=16.11

Luca Boccassi

---
 MAINTAINERS|  10 +-
 app/test-pmd/testpmd.c |  15 +-
 app/test/test_cryptodev.c  |   2 +-
 app/test/test_eal_flags.c  |  33 ++--
 app/test/test_hash_multiwriter.c   |  50 +-
 app/test/test_link_bonding_rssconf.c   |   5 +
 app/test/test_pmd_ring.c   |   2 +
 app/test/virtual_pmd.c |   7 +-
 doc/guides/rel_notes/release_16_11.rst |  76 +
 doc/guides/testpmd_app_ug/testpmd_funcs.rst|   2 +-
 drivers/crypto/qat/qat_crypto.c|  10 +-
 drivers/net/bnx2x/bnx2x.c  |   4 +-
 drivers/net/bnxt/bnxt_ethdev.c |  18 +-
 drivers/net/bnxt/bnxt_hwrm.c   |  12 +-
 drivers/net/bnxt/bnxt_txr.c|  59 ++-
 drivers/net/bnxt/bnxt_txr.h|  10 ++
 drivers/net/bnxt/bnxt_vnic.c   |   5 +-
 drivers/net/bnxt/bnxt_vnic.h   |   6 +-
 drivers/net/bonding/rte_eth_bond_api.c |  14 +-
 drivers/net/bonding/rte_eth_bond_pmd.c |   9 +-
 drivers/net/cxgbe/base/t4_hw.c | 183 +
 drivers/net/ena/base/ena_plat_dpdk.h   |  32 ++--
 drivers/net/ena/ena_ethdev.c   |   4 +-
 drivers/net/enic/enic_main.c   |  30 ++--
 drivers/net/i40e/i40e_ethdev.c | 155 -
 drivers/net/i40e/i40e_ethdev_vf.c  |   1 -
 drivers/net/ixgbe/ixgbe_ethdev.h   |   5 +
 drivers/net/ixgbe/ixgbe_fdir.c |  31 +++-
 drivers/net/nfp/nfp_net.c  |  14 +-
 drivers/net/null/rte_eth_null.c|   7 +
 drivers/net/pcap/rte_eth_pcap.c|  86 --
 drivers/net/qede/base/ecore_int.c  |  14 +-
 drivers/net/qede/qede_ethdev.c |   8 +-
 drivers/net/thunderx/nicvf_ethdev.c|   5 +-
 drivers/net/thunderx/nicvf_rxtx.c  |  24 +--
 examples/exception_path/main.c |   3 +
 examples/ipsec-secgw/ipsec-secgw.c |   7 +-
 examples/l3fwd/l3fwd_em.c  |   1 -
 examples/l3fwd/l3fwd_lpm.c |   1 -
 examples/multi_process/Makefile|   1 +
 lib/librte_eal/common/include/rte_version.h|   2 +-
 lib/librte_eal/linuxapp/eal/eal_interrupts.c   |   2 +-
 lib/librte_eal/linuxapp/eal/eal_thread.c   |   4 +-
 .../linuxapp/kni/ethtool/igb/igb_ethtool.c |   7 +-
 lib/librte_eal/linuxapp/kni/ethtool/igb/kcompat.h  |   5 +
 lib/librte_ether/rte_ethdev.c  |  10 ++
 lib/librte_ether/rte_ethdev.h  |   5 +-
 lib/librte_hash/rte_cuckoo_hash.c  |  21 ++-
 lib/librte_hash/rte_cuckoo_hash_x86.h  |   3 +
 lib/librte_hash/rte_hash.h |  20 ++-
 lib/librte_kni/rte_kni.c   |   3 +
 lib/librte_mbuf/rte_mbuf_ptype.h   |   6 +-
 lib/librte_net/rte_ip.h|  26 +--
 lib/librte_sched/rte_bitmap.h  |   8 +-
 lib/librte_vhost/virtio_net.c  |   1 +
 mk/rte.sdkinstall.mk   |  36 ++--
 pkg/dpdk.spec  |   2 +-
 57 files changed, 805 insertions(+), 317 deletions(-)
Adrien Mazarguil (1):
  maintainers: update for Mellanox PMDs

Ajit Khaparde (6):
  net/bnxt: fix HW Tx checksum offload check
  net/bnxt: fix incorrect IO address handling in Tx
  net/bnxt: fix Rx ring count limitation
  net/bnxt: check access denied for HWRM commands
  net/bnxt: fix RETA size
  net/bnxt: fix close operation

Alejandro Lucero (1):
  net/nfp: fix field initialization in Tx descriptor

Anatoly Burakov (2):
  eal/linux: fix invalid syntax in interrupts
  test: fix EAL flags autotest on FreeBSD

Beilei Xing (1):
  net/i40e: fix shifts of 32-bit value

Bruce Richardson (2):
  examples/exception_path: fix out-of-bounds read
  mk: fix permissions when using make install

Chas Williams (1):
  net/bonding: do not clear active slave count

Damjan Marion (1):
  net/i40e: do not reset device info data

Dan Gora (1):
  kni: fix crash with null name

Daria Kolistratova (1):
  net/ena: fix SIGFPE with 0 Rx queue

Dariusz Stojaczyk (1):
  eal: fix return codes on thread naming failure

Drocula Lambda (1):
  kni: fix build on RHEL 7.5

Emma Kenny (1):
  examples/multi_process: build l2fwd_fork app

Ferruh Yigit (2):
  

Re: [dpdk-dev] IXGBE throughput loss with 4+ cores

2018-08-29 Thread Wiles, Keith


> On Aug 29, 2018, at 12:19 PM, Saber Rezvani  wrote:
> 
> 
> 
> On 08/29/2018 01:39 AM, Wiles, Keith wrote:
>> 
>>> On Aug 28, 2018, at 2:16 PM, Saber Rezvani  wrote:
>>> 
>>> 
>>> 
>>> On 08/28/2018 11:39 PM, Wiles, Keith wrote:
>>>> Which version of Pktgen? I just pushed a patch in 3.5.3 to fix a
>>>> performance problem.
>>> I use Pktgen version 3.0.0. It is OK as long as I have one core (10 Gb/s),
>>> but when I increase the number of cores (one core per queue) I lose some
>>> performance (roughly 8.5 Gb/s for 8 cores). In my scenario Pktgen shows it
>>> is generating at line rate, but receiving 8.5 Gb/s.
>>> Is it because of Pktgen???
>> Normally Pktgen can receive at line rate up to 10G with 64-byte frames,
>> which means Pktgen should not be the problem. You can verify that by looping
>> the cable from one port to another on the pktgen machine to create an
>> external loopback. Then whatever traffic you send from one port you should
>> be able to receive on the other, unless something is configured wrong.
>> 
>> Please send me the command line for pktgen.
>> 
>> 
>> In pktgen if you have this config -m “[1-4:5-8].0” then you have 4 cores
>> sending traffic and 4 cores receiving packets.
>> 
>> In this case the TX cores will be sending the packets on all 4 lcores to the
>> same port. On the rx side you have 4 cores polling 4 rx queues. The rx
>> queues are controlled by RSS, which means the 5-tuple hash of the RX traffic
>> must divide the inbound packets across all 4 queues to make sure each core
>> is doing the same amount of work. If you are sending only a single packet on
>> the Tx cores then only one rx queue will be used.
>> 
>> I hope that makes sense.
> I think there is a misunderstanding of the problem. Indeed, the problem is
> not Pktgen.
> Here is my command --> ./app/app/x86_64-native-linuxapp-gcc/pktgen -c ffc 
> -n 4 -w 84:00.0 -w 84:00.1 --file-prefix pktgen_F2 --socket-mem 
> 1000,2000,1000,1000 -- -T -P -m "[18-19:20-21].0, [22:23].1"
> 
> The problem is that when I run the symmetric_mp example with
> $numberOfProcesses=8 cores, I get less throughput (roughly 8.4 Gb/s),
> but when I run it with $numberOfProcesses=3 cores the throughput is 10G.
> for i in `seq $numberOfProcesses`;
> do
>  some calculation goes here.
>  symmetric_mp -c $coremask -n 2 --proc-type=auto -w 0b:00.0 -w 
> 0b:00.1 --file-prefix sm --socket-mem 4000,1000,1000,1000 -- -p 3 
> --num-procs=$numberOfProcesses --proc-id=$procid";
>  .
> done

Most NICs have a limited amount of memory on the NIC, and when you start to
segment that memory because you are using more queues it can affect
performance.

On one of the NICs, if you go over, say, 5 or 6 queues, the memory per queue
for Rx/Tx packets starts to become a bottleneck, as you do not have enough
memory in the Tx/Rx queues to hold enough packets. This can cause the NIC to
drop Rx packets because the host cannot pull the data from the NIC or the Rx
ring on the host fast enough. This seems to be the problem, as the amount of
time to process a packet on the host has not changed, only the amount of
buffer space in the NIC as you increase queues.

I am not sure this is your issue, but I figured I would state this point.

> 
> I am trying to find out what causes this loss!
> 
> 
> On Aug 28, 2018, at 12:05 PM, Saber Rezvani  wrote:
> 
> 
> 
> On 08/28/2018 08:31 PM, Stephen Hemminger wrote:
>> On Tue, 28 Aug 2018 17:34:27 +0430
>> Saber Rezvani  wrote:
>> 
>>> Hi,
>>> 
>>> 
>>> I have run the multi_process/symmetric_mp example in the DPDK examples
>>> directory. For one process its throughput is line rate, but as I increase
>>> the number of cores I see a decrease in throughput. For example, if the
>>> number of queues is set to 4 and each queue is assigned to a single core,
>>> then the throughput will be about 9.4 Gb/s; with 8 queues, the throughput
>>> will be 8.5 Gb/s.
>>> 
>>> I have read the following, but it was not convincing.
>>> 
>>> http://mails.dpdk.org/archives/dev/2015-October/024960.html
>>> 
>>> 
>>> I am eagerly looking forward to hearing from you, all.
>>> 
>>> 
>>> Best wishes,
>>> 
>>> Saber
>>> 
>>> 
>> Not completely surprising. If you have more cores than packet line rate
>> then the number of packets returned for each call to rx_burst will be 
>> less.
>> With large number of cores, most of the time will be spent doing reads of
>> PCI registers for no packets!
> Indeed pktgen says it is generating traffic at line rate, but receiving 
> less than 10 Gb/s. So, in that case there should be something that causes 
> the reduction in throughput :(
> 
> 
>>>> Regards,
>>>> Keith
>>>>
>>> 
>>> 
>> Regards,
>> Keith
>> 
> 
> Best regards,
> Saber

Regards,
Keith



Re: [dpdk-dev] [PATCH v2 10/10] kni: add API to set link status on kernel interface

2018-08-29 Thread Dan Gora
On Wed, Aug 29, 2018 at 12:54 PM, Stephen Hemminger
 wrote:
> On Thu, 28 Jun 2018 18:55:08 -0700
> Dan Gora  wrote:
>
>> Add a new API function to KNI, rte_kni_update_link() to allow DPDK
>> applications to update the link state for the KNI network interfaces
>> in the linux kernel.
>>
>> Note that the default carrier state is set to off when the interface
>> is opened.
>>
>> Signed-off-by: Dan Gora 
>
> Do you really need a special ioctl for this?
> There is already ability to set link state via sysfs or netlink.

I think yes.. AFAIK sysfs does not constitute a stable API; it's only
available for Linux (yes, I know KNI is linux-only currently, but
there's not really any technical reason why it can't work on BSD) and
there are already callbacks to change the MTU and MAC addresses which
could also be done via netlink.  IMHO having the kernel have an
accurate view of the link state is more important than the ability to
change the MAC address of the interface...

In our application we want the linux kernel/"normal" userspace to be
able to use the DPDK controlled interfaces like any other interface.
We need to be able to assign IP addresses to them, have them
participate in routing, etc.  Since they are controlled via our DPDK
application, there is no way for the kernel to know when the cable is
connected/removed since that information is only communicated to the
DPDK application.

The other option, which I toyed with but decided against, would be to
have a polling thread in the KNI module to call a callback into the
DPDK application to poll the link status.  However that would still
possibly leave a time period when the link is down, but the kernel
does not know about it.  I decided that it would probably be best to
just have a way for the DPDK application to inform the linux kernel
(via the KNI module) that the link was down.

It's important for the linux kernel to know about the link status if
the interface is going to be treated like any other.  Things like
assigning IP addresses and adding the interfaces to the routing table
happen automatically when the link is marked "up".  If the link is not
marked "up", or is "up" when it should be "down", then the kernel
cannot configure that interface correctly, or will use it when it
should not be.

thanks
dan


Re: [dpdk-dev] [PATCH v2 10/10] kni: add API to set link status on kernel interface

2018-08-29 Thread Dan Gora
On Wed, Aug 29, 2018 at 8:48 AM, Ferruh Yigit  wrote:
> On 6/29/2018 2:55 AM, Dan Gora wrote:
>> Add a new API function to KNI, rte_kni_update_link() to allow DPDK
>> applications to update the link state for the KNI network interfaces
>> in the linux kernel.
>>
>> Note that the default carrier state is set to off when the interface
>> is opened.
>
> Why set carrier off when interface opened?

A couple of reasons:

1) That's the way every other Ethernet driver in the linux kernel does
it that I've seen.

2) The DPDK application may not actually be ready for the interface to
be used when it is first created.  Things like NetworkManager, etc
will gladly go trying to assign IP addresses to those interfaces, add
them to the routing table, etc as soon as the interface is marked
"up".  By making the default be "down", this allows the application to
finish any initialization on the DPDK side of the interface before
allowing it to be used by the kernel.

> Although I don't see any difference
> in interface state with or without this call.

Previously in the 'ip addr' output, the 'state' would be 'UNKNOWN'
when the interface was created.  After this patch the 'state' in 'ip
addr' is 'DOWN'.

thanks
dan


Re: [dpdk-dev] [PATCH v2 10/10] kni: add API to set link status on kernel interface

2018-08-29 Thread Stephen Hemminger
On Wed, 29 Aug 2018 18:02:06 -0300
Dan Gora  wrote:

> On Wed, Aug 29, 2018 at 12:54 PM, Stephen Hemminger
>  wrote:
> > On Thu, 28 Jun 2018 18:55:08 -0700
> > Dan Gora  wrote:
> >  
> >> Add a new API function to KNI, rte_kni_update_link() to allow DPDK
> >> applications to update the link state for the KNI network interfaces
> >> in the linux kernel.
> >>
> >> Note that the default carrier state is set to off when the interface
> >> is opened.
> >>
> >> Signed-off-by: Dan Gora   
> >
> > Do you really need a special ioctl for this?
> > There is already ability to set link state via sysfs or netlink.  
> 
> I think yes.. AFAIK sysfs does not constitute a stable API; 

It is a stable API on Linux.

> it's only
> available for Linux (yes, I know KNI is linux-only currently, but
> there's not really any technical reason why it can't work on BSD) and
> there are already callbacks to change the MTU and MAC addresses which
> could also be done via netlink.  IMHO having the kernel have an
> accurate view of the link state is more important than the ability to
> change the MAC address of the interface...

The device model on BSD is significantly different than Linux.
Doing KNI on BSD is going to be a full rewrite of the driver anyway;
I wouldn't worry about the sysfs dependency.

The important part is that if KNI is ever going to be supportable
it needs to be upstream in Linux, not a bolt on out of tree driver.
Most Enterprise distributions will not support out of tree drivers
for good reasons.



Re: [dpdk-dev] [PATCH v2 10/10] kni: add API to set link status on kernel interface

2018-08-29 Thread Stephen Hemminger
On Wed, 29 Aug 2018 18:10:35 -0300
Dan Gora  wrote:

> On Wed, Aug 29, 2018 at 8:48 AM, Ferruh Yigit  wrote:
> > On 6/29/2018 2:55 AM, Dan Gora wrote:  
> >> Add a new API function to KNI, rte_kni_update_link() to allow DPDK
> >> applications to update the link state for the KNI network interfaces
> >> in the linux kernel.
> >>
> >> Note that the default carrier state is set to off when the interface
> >> is opened.  
> >
> > Why set carrier off when interface opened?  
> 
> A couple of reasons:
> 
> 1) That's the way every other Ethernet driver in the linux kernel does
> it that I've seen.
> 
> 2) The DPDK application may not actually be ready for the interface to
> be used when it is first created.  Things like NetworkManager, etc
> will gladly go trying to assign IP addresses to those interfaces, add
> them to the routing table, etc as soon as the interface is marked
> "up".  By making the default be "down", this allows the application to
> finish any initialization on the DPDK side of the interface before
> allowing it to be used by the kernel.
> 
> > Although I don't see any difference
> > in interface state with or without this call.  
> 
> Previously in the 'ip addr' output, the 'state' would be 'UNKNOWN'
> when the interface was created.  After this patch the 'state' in 'ip
> addr' is 'DOWN'.
> 
> thanks
> dan

There is also a better (richer) API for link status on linux via
the operstate functions. Those might better match the semantics of
a tunnellish interface like KNI.


Re: [dpdk-dev] [PATCH v2 10/10] kni: add API to set link status on kernel interface

2018-08-29 Thread Dan Gora
On Wed, Aug 29, 2018 at 7:00 PM, Stephen Hemminger
 wrote:
>> >> Add a new API function to KNI, rte_kni_update_link() to allow DPDK
>> >> applications to update the link state for the KNI network interfaces
>> >> in the linux kernel.
>> >>
>> >> Note that the default carrier state is set to off when the interface
>> >> is opened.
>> >>
>> >> Signed-off-by: Dan Gora 
>> >
>> > Do you really need a special ioctl for this?
>> > There is already ability to set link state via sysfs or netlink.
>>
>> I think yes.. AFAIK sysfs does not constitute a stable API;
>
> It is a stable API on Linux.

Ok, I didn't know this...

Still it seems better to me to be able to call
rte_kni_update_link(kni, link); than 'open("/sys/whatever/where ever
it may be this kernel version/link/"); write(fd, "1"); close(fd);' or
whatever...

But I guess if it is actually a stable API, we can hide all of that in
rte_kni_update_link() and just do away with the ioctl!

I'm actually kind of shocked that I'm the only one who has run into
this.. I would have thought that having an accurate link status would
have been important for whoever used KNI.

>
>> it's only
>> available for Linux (yes, I know KNI is linux-only currently, but
>> there's not really any technical reason why it can't work on BSD) and
>> there are already callbacks to change the MTU and MAC addresses which
>> could also be done via netlink.  IMHO having the kernel have an
>> accurate view of the link state is more important than the ability to
>> change the MAC address of the interface...
>
> The device model on BSD is significantly different than Linux.
> Doing KNI on BSD is going to be a full rewrite of the driver anyway;
> I wouldn't worry about the sysfs dependency.
>
> The important part is that if KNI is ever going to be supportable
> it needs to be upstream in Linux, not a bolt on out of tree driver.
> Most Enterprise distributions will not support out of tree drivers
> for good reasons.

Agreed there.. I was really torn between using KNI or the TAP
interface.  KNI seems cleaner, and at least at the time that I started
working on this, seemed like the way to interface to the kernel moving
forward.  The TAP interface stuff didn't seem like it was necessarily
going to be supported moving forward and the KNI was supposed to be
the "high performance" method to interface to the kernel.  But having
to build and install the rte_kni module on every system that we
install our software on is a major pain.

d


Re: [dpdk-dev] [PATCH v2 10/10] kni: add API to set link status on kernel interface

2018-08-29 Thread Dan Gora
On Wed, Aug 29, 2018 at 7:12 PM, Dan Gora  wrote:
> On Wed, Aug 29, 2018 at 7:00 PM, Stephen Hemminger
>  wrote:
>>> >> Add a new API function to KNI, rte_kni_update_link() to allow DPDK
>>> >> applications to update the link state for the KNI network interfaces
>>> >> in the linux kernel.
>>> >>
>>> >> Note that the default carrier state is set to off when the interface
>>> >> is opened.
>>> >>
>>> >> Signed-off-by: Dan Gora 
>>> >
>>> > Do you really need a special ioctl for this?
>>> > There is already ability to set link state via sysfs or netlink.
>>>
>>> I think yes.. AFAIK sysfs does not constitute a stable API;
>>
>> It is a stable API on Linux.
>

Actually this does not seem to be completely true...

From Documentation/admin-guide/sysfs-rules.rst:

Rules on how to access information in sysfs
===========================================

The kernel-exported sysfs exports internal kernel implementation details
and depends on internal kernel structures and layout. It is agreed upon
by the kernel developers that the Linux kernel does not provide a stable
internal API. Therefore, there are aspects of the sysfs interface that
may not be stable across kernel releases.



- devices are only "devices"
There is no such thing like class-, bus-, physical devices,
interfaces, and such that you can rely on in userspace. Everything is
just simply a "device". Class-, bus-, physical, ... types are just
kernel implementation details which should not be expected by
applications that look for devices in sysfs.

The properties of a device are:

- devpath (``/devices/pci0000:00/0000:00:1d.1/usb2/2-2/2-2:1.0``)


- kernel name (``sda``, ``tty``, ``0000:00:1f.2``, ...)


- subsystem (``block``, ``tty``, ``pci``, ...)


- driver (``tg3``, ``ata_piix``, ``uhci_hcd``)


- attributes


Everything else is just a kernel driver-core implementation detail
that should not be assumed to be stable across kernel releases.

