[dpdk-dev] [PATCH v3] lib/librte_timer: fix corruption with reset

2020-07-10 Thread Sarosh Arif
If a timer callback uses rte_timer_reset_sync() or
rte_timer_stop_sync() to reset/stop some other timer that is also
about to expire, the application goes into an infinite loop. This
happens because rte_timer_reset_sync()/rte_timer_stop_sync() loop
until the timer is reset/stopped, and there is a check inside
timer_set_config_state() which prevents a running timer from being
reset/stopped by anything other than its own timer_cb. Therefore
timer_set_config_state() returns -1, so rte_timer_reset() returns -1
and rte_timer_reset_sync() loops forever.

The solution to this problem is to return -1 from
rte_timer_reset_sync()/rte_timer_stop_sync() when the user tries to
reset/stop some other running timer from a callback function.
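
The guard described above can be modelled in isolation. The following is a minimal sketch, not the librte_timer code: struct timer_model, running_on_lcore[] and reset_sync_model() are hypothetical stand-ins for struct rte_timer, the per-lcore running_tim pointer and the *_sync() functions. It only illustrates why an early -1 replaces the endless retry when the caller is not the running timer's own callback.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the librte_timer types. */
enum timer_state { TIMER_STOP, TIMER_PENDING, TIMER_RUNNING };

struct timer_model {
	enum timer_state state;
	unsigned int owner;	/* lcore that is running the timer callback */
};

#define MODEL_MAX_LCORE 4

/* Timer whose callback is currently executing on each lcore, or NULL. */
static struct timer_model *running_on_lcore[MODEL_MAX_LCORE];

/*
 * Mirrors the shape of the check added by the patch: a RUNNING timer may
 * only be reconfigured from its own callback. Any other caller would never
 * succeed in the retry loop, so it gets -1 immediately.
 */
static int
reset_sync_model(struct timer_model *tim, unsigned int lcore)
{
	if (lcore >= MODEL_MAX_LCORE)
		return -1;

	if (tim->state == TIMER_RUNNING &&
	    (tim->owner != lcore || tim != running_on_lcore[lcore]))
		return -1;

	/* The real function retries rte_timer_reset() here until it succeeds. */
	tim->state = TIMER_PENDING;
	return 0;
}
```

Since the *_sync() functions now return int, callers that previously ignored the void result would be expected to check for a negative value and retry later, for example from outside the callback.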

Bugzilla ID: 491
Fixes: 20d159f20543 ("timer: fix corruption with reset")
Cc: h.mikit...@gmail.com
Signed-off-by: Sarosh Arif 
---
v2: remove line continuations
v3: separate code and declarations
---
 lib/librte_timer/rte_timer.c | 26 --
 lib/librte_timer/rte_timer.h |  4 ++--
 2 files changed, 26 insertions(+), 4 deletions(-)

diff --git a/lib/librte_timer/rte_timer.c b/lib/librte_timer/rte_timer.c
index 6d19ce469..0cd3e2c86 100644
--- a/lib/librte_timer/rte_timer.c
+++ b/lib/librte_timer/rte_timer.c
@@ -576,14 +576,24 @@ rte_timer_alt_reset(uint32_t timer_data_id, struct rte_timer *tim,
 }
 
 /* loop until rte_timer_reset() succeed */
-void
+int
 rte_timer_reset_sync(struct rte_timer *tim, uint64_t ticks,
 enum rte_timer_type type, unsigned tim_lcore,
 rte_timer_cb_t fct, void *arg)
 {
+   struct rte_timer_data *timer_data;
+   TIMER_DATA_VALID_GET_OR_ERR_RET(default_data_id, timer_data, -EINVAL);
+
+   if (tim->status.state == RTE_TIMER_RUNNING &&
+   (tim->status.owner != (uint16_t)tim_lcore ||
+   tim != timer_data->priv_timer[tim_lcore].running_tim))
+   return -1;
+
while (rte_timer_reset(tim, ticks, type, tim_lcore,
   fct, arg) != 0)
rte_pause();
+
+   return 0;
 }
 
 static int
@@ -642,11 +652,23 @@ rte_timer_alt_stop(uint32_t timer_data_id, struct rte_timer *tim)
 }
 
 /* loop until rte_timer_stop() succeed */
-void
+int
 rte_timer_stop_sync(struct rte_timer *tim)
 {
+   struct rte_timer_data *timer_data;
+   unsigned int lcore_id = rte_lcore_id();
+
+   TIMER_DATA_VALID_GET_OR_ERR_RET(default_data_id, timer_data, -EINVAL);
+
+   if (tim->status.state == RTE_TIMER_RUNNING &&
+   (tim->status.owner != (uint16_t)lcore_id ||
+   tim != timer_data->priv_timer[lcore_id].running_tim))
+   return -1;
+
while (rte_timer_stop(tim) != 0)
rte_pause();
+
+   return 0;
 }
 
 /* Test the PENDING status of the timer handle tim */
diff --git a/lib/librte_timer/rte_timer.h b/lib/librte_timer/rte_timer.h
index c6b3d450d..392ca423d 100644
--- a/lib/librte_timer/rte_timer.h
+++ b/lib/librte_timer/rte_timer.h
@@ -275,7 +275,7 @@ int rte_timer_reset(struct rte_timer *tim, uint64_t ticks,
  * @param arg
  *   The user argument of the callback function.
  */
-void
+int
 rte_timer_reset_sync(struct rte_timer *tim, uint64_t ticks,
 enum rte_timer_type type, unsigned tim_lcore,
 rte_timer_cb_t fct, void *arg);
@@ -314,7 +314,7 @@ int rte_timer_stop(struct rte_timer *tim);
  * @param tim
  *   The timer handle.
  */
-void rte_timer_stop_sync(struct rte_timer *tim);
+int rte_timer_stop_sync(struct rte_timer *tim);
 
 /**
  * Test if a timer is pending.
-- 
2.17.1



[dpdk-dev] [PATCH v7 01/25] ethdev: allow unknown link speed

2020-07-10 Thread Ivan Dyukov
From: Thomas Monjalon 

When querying the link information, the link status is
the main, mandatory piece of information.
Other boolean values are supposed to be accurate:
- duplex mode (half/full)
- negotiation (auto/fixed)

This API update is making explicit that the link speed information
is optional.
The value ETH_SPEED_NUM_NONE (0) was already part of the API.
The value ETH_SPEED_NUM_UNKNOWN (infinite) is added to cover
two different cases:
- speed is not known by the driver
- device is virtual
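
As a rough illustration of what the two special values mean for an application, the sketch below formats a link speed while treating ETH_SPEED_NUM_NONE and ETH_SPEED_NUM_UNKNOWN separately. The constants are redefined locally from the values given in this patch so that it builds without DPDK headers, and describe_speed() is a hypothetical helper, not part of the ethdev API.

```c
#include <stdint.h>
#include <stdio.h>

/* Values as stated in the patch; local copies so no DPDK headers are needed. */
#define SPEED_NUM_NONE    0
#define SPEED_NUM_UNKNOWN UINT32_MAX

/*
 * Hypothetical helper: write a human-readable speed into buf.
 * Returns the snprintf() result (length that was or would be written).
 */
static int
describe_speed(uint32_t speed_mbps, char *buf, size_t len)
{
	if (speed_mbps == SPEED_NUM_NONE)
		return snprintf(buf, len, "no speed");
	if (speed_mbps == SPEED_NUM_UNKNOWN)
		return snprintf(buf, len, "unknown speed");
	return snprintf(buf, len, "%u Mbps", (unsigned int)speed_mbps);
}
```

Without such a check, code that prints link_speed directly with %u would show UINT32_MAX as a literal number of Mbps for an unknown speed.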

Suggested-by: Morten Brørup 
Suggested-by: Benoit Ganne 
Signed-off-by: Thomas Monjalon 
Reviewed-by: Ferruh Yigit 
---
 lib/librte_ethdev/rte_ethdev.h | 20 ++--
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/lib/librte_ethdev/rte_ethdev.h b/lib/librte_ethdev/rte_ethdev.h
index a49242bcd..2090af501 100644
--- a/lib/librte_ethdev/rte_ethdev.h
+++ b/lib/librte_ethdev/rte_ethdev.h
@@ -303,6 +303,7 @@ struct rte_eth_stats {
 #define ETH_SPEED_NUM_56G  56000 /**<  56 Gbps */
 #define ETH_SPEED_NUM_100G 100000 /**< 100 Gbps */
 #define ETH_SPEED_NUM_200G 200000 /**< 200 Gbps */
+#define ETH_SPEED_NUM_UNKNOWN UINT32_MAX /**< Unknown */
 
 /**
  * A structure used to retrieve link-level information of an Ethernet port.
@@ -2262,15 +2263,16 @@ int rte_eth_allmulticast_disable(uint16_t port_id);
 int rte_eth_allmulticast_get(uint16_t port_id);
 
 /**
- * Retrieve the status (ON/OFF), the speed (in Mbps) and the mode (HALF-DUPLEX
- * or FULL-DUPLEX) of the physical link of an Ethernet device. It might need
- * to wait up to 9 seconds in it.
+ * Retrieve the link status (up/down), the duplex mode (half/full),
+ * the negotiation (auto/fixed), and if available, the speed (Mbps).
+ *
+ * It might need to wait up to 9 seconds.
+ * @see rte_eth_link_get_nowait.
  *
  * @param port_id
  *   The port identifier of the Ethernet device.
  * @param link
- *   A pointer to an *rte_eth_link* structure to be filled with
- *   the status, the speed and the mode of the Ethernet device link.
+ *   Link information written back.
  * @return
  *   - (0) if successful.
  *   - (-ENOTSUP) if the function is not supported in PMD driver.
@@ -2279,15 +2281,13 @@ int rte_eth_allmulticast_get(uint16_t port_id);
 int rte_eth_link_get(uint16_t port_id, struct rte_eth_link *link);
 
 /**
- * Retrieve the status (ON/OFF), the speed (in Mbps) and the mode (HALF-DUPLEX
- * or FULL-DUPLEX) of the physical link of an Ethernet device. It is a no-wait
- * version of rte_eth_link_get().
+ * Retrieve the link status (up/down), the duplex mode (half/full),
+ * the negotiation (auto/fixed), and if available, the speed (Mbps).
  *
  * @param port_id
  *   The port identifier of the Ethernet device.
  * @param link
- *   A pointer to an *rte_eth_link* structure to be filled with
- *   the status, the speed and the mode of the Ethernet device link.
+ *   Link information written back.
  * @return
  *   - (0) if successful.
  *   - (-ENOTSUP) if the function is not supported in PMD driver.
-- 
2.17.1



[dpdk-dev] [PATCH v7 0/25] ethdev: allow unknown link speed

2020-07-10 Thread Ivan Dyukov
MAINTAINERS  |   1 +
 app/proc-info/main.c |   9 ++
 app/test-pipeline/init.c |  11 ---
 app/test-pmd/config.c|  20 -
 app/test-pmd/testpmd.c   |   9 +-
 app/test/Makefile|   3 ++
 app/test/meson.build |   2 ++
 app/test/test_ethdev_link.c  | 278 

 app/test/test_pmd_perf.c |  17 +--
 doc/guides/sample_app_ug/link_status_intr.rst|  10 +++
 drivers/net/i40e/i40e_ethdev.c   |   5 +++-
 drivers/net/i40e/i40e_ethdev_vf.c|  10 +++
 drivers/net/ice/ice_ethdev.c |   5 +++-
 drivers/net/ixgbe/ixgbe_ethdev.c |   6 +---
 examples/bbdev_app/main.c|   8 ++---
 examples/ioat/ioatfwd.c  |  13 
 examples/ip_fragmentation/main.c |  13 
 examples/ip_pipeline/cli.c   |  12 
 examples/ip_reassembly/main.c|  12 +++-
 examples/ipsec-secgw/ipsec-secgw.c   |  12 +++-
 examples/ipv4_multicast/main.c   |  12 +++-
 examples/kni/main.c  |  26 ++--
 examples/l2fwd-crypto/main.c |  12 +++-
 examples/l2fwd-event/main.c  |  12 +++-
 examples/l2fwd-jobstats/main.c   |  12 +++-
 examples/l2fwd-keepalive/main.c  |  12 +++-
 examples/l2fwd/main.c|  12 +++-
 examples/l3fwd-acl/main.c|  12 +++-
 examples/l3fwd-graph/main.c  |  14 +++--
 examples/l3fwd-power/main.c  |  13 +++-
 examples/l3fwd/main.c|  12 +++-
 examples/link_status_interrupt/main.c|  30 
---
 examples/multi_process/client_server_mp/mp_server/init.c |  14 -
 examples/multi_process/symmetric_mp/main.c   |  12 +++-
 examples/ntb/ntb_fwd.c   |  10 +++
 examples/performance-thread/l3fwd-thread/main.c  |  12 +++-
 examples/qos_sched/init.c|  10 ++-
 examples/server_node_efd/server/init.c   |  15 --
 examples/vm_power_manager/main.c |  14 -
 lib/librte_ethdev/rte_ethdev.c   | 169 

 lib/librte_ethdev/rte_ethdev.h   |  74 
+++---
 lib/librte_ethdev/rte_ethdev_version.map |   4 +++
 42 files changed, 687 insertions(+), 282 deletions(-)

v7 changes:
* fix meson build
* change _strf function: it no longer fails on unknown specifiers
like %d; it just copies them to the target string.
* remove invalid_fmt unit test.
* add unknown specifier test.
* fix codestyle

v6 changes:
* fix spelling in comments according to checkpatch warning

v5 changes:
* rename rte_eth_link_format to rte_eth_link_strf
* add '\n' to default strings
* update remaining examples. patch with subj 'examples: new link status print 
format' contains examples which have no maintainers.
TBD:
update remaining nic drivers with 'unknown' speed.  It should be provided in 
separate patchset.

v4 changes:
* refactor rte_eth_link_format using strlcat func instead of snprintf
* added new checks to unit tests
* a few minor fixes according to review comments
TBD:
update examples in 'example' folder with new status printing mechanism
update remaining nic drivers with 'unknown' speed

v3 changes:
* remove rte_eth_link_prepare_text function
* add rte_eth_link_format and rte_eth_link_printf functions
* added unit tests for rte_eth_link_format function
TBD:
update examples in 'example' folder with new status printing mechanism
update remaining nic drivers with 'unknown' speed

v2 changes:
* add a function which formats link status as a textual representation
* update drivers for Intel nics with 'unknown' speed
TBD:
update examples in 'example' folder with new status printing mechanism
update remaining nic drivers with 'unknown' speed

v1 changes:
This is the initial patchset which introduces UNKNOWN speed to dpdk
applications. Also it contains changes related t

[dpdk-dev] [PATCH v7 02/25] ethdev: add a link status text representation

2020-07-10 Thread Ivan Dyukov
This commit adds a function which takes a link status structure
and formats it as a text representation.
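
To make the idea concrete, here is a minimal sketch of such a formatter. It is not the rte_eth_link_strf() implementation from this patch: struct link_sketch and link_strf_sketch() are hypothetical, only the %S and %M specifiers seen elsewhere in this series are handled, and the "Up"/"Down"/"unknown" strings are assumptions. It does follow the v7 behaviour described in the cover letter, where an unrecognised specifier such as %d is copied to the output unchanged.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, reduced stand-in for struct rte_eth_link. */
struct link_sketch {
	uint32_t speed_mbps;	/* UINT32_MAX models an unknown speed */
	int up;
};

/*
 * Expand %S (status) and %M (speed in Mbps) from fmt into str.
 * Anything else, including unknown specifiers, is copied through as is.
 * Returns the number of characters written, or -1 if the result does not fit.
 */
static int
link_strf_sketch(char *str, size_t len, const char *fmt,
		 const struct link_sketch *link)
{
	char item[16];
	size_t used = 0;
	size_t n;

	if (str == NULL || len == 0 || fmt == NULL || link == NULL)
		return -1;
	str[0] = '\0';

	while (*fmt != '\0') {
		if (fmt[0] == '%' && fmt[1] == 'S') {
			snprintf(item, sizeof(item), "%s",
				 link->up ? "Up" : "Down");
			fmt += 2;
		} else if (fmt[0] == '%' && fmt[1] == 'M') {
			if (link->speed_mbps == UINT32_MAX)
				snprintf(item, sizeof(item), "unknown");
			else
				snprintf(item, sizeof(item), "%u",
					 (unsigned int)link->speed_mbps);
			fmt += 2;
		} else {
			/* Plain character or unrecognised specifier. */
			item[0] = *fmt++;
			item[1] = '\0';
		}

		n = strlen(item);
		if (used + n >= len)
			return -1;
		memcpy(str + used, item, n + 1);
		used += n;
	}
	return (int)used;
}
```

According to the v4 notes the real function builds its output with strlcat; memcpy is used here only to keep the sketch free of non-standard dependencies.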

Signed-off-by: Ivan Dyukov 
---
 MAINTAINERS  |   1 +
 app/test/Makefile|   3 +
 app/test/meson.build |   2 +
 app/test/test_ethdev_link.c  | 299 +++
 lib/librte_ethdev/rte_ethdev.c   | 174 +
 lib/librte_ethdev/rte_ethdev.h   |  54 
 lib/librte_ethdev/rte_ethdev_version.map |   4 +
 7 files changed, 537 insertions(+)
 create mode 100644 app/test/test_ethdev_link.c

diff --git a/MAINTAINERS b/MAINTAINERS
index 5e706cd7e..f4fb31ea2 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -393,6 +393,7 @@ T: git://dpdk.org/next/dpdk-next-net
 F: lib/librte_ethdev/
 F: devtools/test-null.sh
 F: doc/guides/prog_guide/switch_representation.rst
+F: app/test/test_ethdev*
 
 Flow API
 M: Ori Kam 
diff --git a/app/test/Makefile b/app/test/Makefile
index e5440774b..9f43b8c3c 100644
--- a/app/test/Makefile
+++ b/app/test/Makefile
@@ -251,6 +251,9 @@ SRCS-$(CONFIG_RTE_LIBRTE_SECURITY) += test_security.c
 
 SRCS-$(CONFIG_RTE_LIBRTE_IPSEC) += test_ipsec.c test_ipsec_perf.c
 SRCS-$(CONFIG_RTE_LIBRTE_IPSEC) += test_ipsec_sad.c
+
+SRCS-$(CONFIG_RTE_LIBRTE_ETHER) += test_ethdev_link.c
+
 ifeq ($(CONFIG_RTE_LIBRTE_IPSEC),y)
 LDLIBS += -lrte_ipsec
 endif
diff --git a/app/test/meson.build b/app/test/meson.build
index 56591db4e..1e6acf701 100644
--- a/app/test/meson.build
+++ b/app/test/meson.build
@@ -39,6 +39,7 @@ test_sources = files('commands.c',
'test_efd.c',
'test_efd_perf.c',
'test_errno.c',
+   'test_ethdev_link.c',
'test_event_crypto_adapter.c',
'test_event_eth_rx_adapter.c',
'test_event_ring.c',
@@ -199,6 +200,7 @@ fast_tests = [
 ['eal_flags_misc_autotest', false],
 ['eal_fs_autotest', true],
 ['errno_autotest', true],
+['ethdev_link_status', true],
 ['event_ring_autotest', true],
 ['fib_autotest', true],
 ['fib6_autotest', true],
diff --git a/app/test/test_ethdev_link.c b/app/test/test_ethdev_link.c
new file mode 100644
index 0..8a2f133c0
--- /dev/null
+++ b/app/test/test_ethdev_link.c
@@ -0,0 +1,299 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright (c) 2020 Samsung Electronics Co., Ltd All Rights Reserved
+ */
+
+#include 
+#include 
+
+#include 
+#include "test.h"
+
+
+static int32_t
+test_link_status_up_default(void)
+{
+   int ret = 0;
+   struct rte_eth_link link_status = {
+   .link_speed = ETH_SPEED_NUM_2_5G,
+   .link_status = ETH_LINK_UP,
+   .link_autoneg = ETH_LINK_AUTONEG,
+   .link_duplex = ETH_LINK_FULL_DUPLEX
+   };
+   char text[128];
+   ret = rte_eth_link_strf(text, 128, NULL, &link_status);
+   RTE_TEST_ASSERT(ret > 0, "Failed to format default string\n");
+   printf("Default link up #1: %s\n", text);
+   TEST_ASSERT_BUFFERS_ARE_EQUAL("Link up at 2.5 Gbit/s FDX Autoneg\n",
+   text, strlen(text), "Invalid default link status string");
+
+   link_status.link_duplex = ETH_LINK_HALF_DUPLEX;
+   link_status.link_autoneg = ETH_LINK_FIXED;
+   link_status.link_speed = ETH_SPEED_NUM_10M;
+   ret = rte_eth_link_strf(text, 128, NULL, &link_status);
+   printf("Default link up #2: %s\n", text);
+   RTE_TEST_ASSERT(ret > 0, "Failed to format default string\n");
+   TEST_ASSERT_BUFFERS_ARE_EQUAL("Link up at 10 Mbit/s HDX Fixed\n",
+   text, strlen(text), "Invalid default link status "
+   "string with HDX");
+
+   link_status.link_speed = ETH_SPEED_NUM_UNKNOWN;
+   ret = rte_eth_link_strf(text, 128, NULL, &link_status);
+   printf("Default link up #3: %s\n", text);
+   RTE_TEST_ASSERT(ret > 0, "Failed to format default string\n");
+   TEST_ASSERT_BUFFERS_ARE_EQUAL("Link up at Unknown speed HDX Fixed\n",
+   text, strlen(text), "Invalid default link status "
+   "string with HDX");
+   return TEST_SUCCESS;
+}
+
+static int32_t
+test_link_status_down_default(void)
+{
+   int ret = 0;
+   struct rte_eth_link link_status = {
+   .link_speed = ETH_SPEED_NUM_2_5G,
+   .link_status = ETH_LINK_DOWN,
+   .link_autoneg = ETH_LINK_AUTONEG,
+   .link_duplex = ETH_LINK_FULL_DUPLEX
+   };
+   char text[128];
+   ret = rte_eth_link_strf(text, 128, NULL, &link_status);
+   RTE_TEST_ASSERT(ret > 0, "Failed to format default string\n");
+   TEST_ASSERT_BUFFERS_ARE_EQUAL("Link down\n",
+   text, strlen(text), "Invalid default link status string");
+
+   return TEST_SUCCESS;
+}
+
+static int32_t
+test_link_status_string_overflow(void)
+{
+   int ret = 0;
+   struct rte_eth_link link_status = {
+   .link_speed = ETH_SPEED_NUM_2_5G,
+   .link_status = ETH_LINK_UP,
+   

[dpdk-dev] [PATCH v7 03/25] app: UNKNOWN link speed print format

2020-07-10 Thread Ivan Dyukov
Use the rte_eth_link_strf and rte_eth_link_printf functions to print
link status in the applications under app/.

Signed-off-by: Ivan Dyukov 
---
 app/proc-info/main.c |  9 +++--
 app/test-pipeline/init.c | 11 +--
 app/test-pmd/config.c| 20 
 app/test-pmd/testpmd.c   |  9 +
 app/test/test_pmd_perf.c | 17 +++--
 5 files changed, 28 insertions(+), 38 deletions(-)

diff --git a/app/proc-info/main.c b/app/proc-info/main.c
index abeca4aab..4a4c572c3 100644
--- a/app/proc-info/main.c
+++ b/app/proc-info/main.c
@@ -685,12 +685,9 @@ show_port(void)
printf("Link get failed (port %u): %s\n",
   i, rte_strerror(-ret));
} else {
-   printf("\t  -- link speed %d duplex %d,"
-   " auto neg %d status %d\n",
-   link.link_speed,
-   link.link_duplex,
-   link.link_autoneg,
-   link.link_status);
+   rte_eth_link_printf("\t  -- link speed: %M, duplex: %D,"
+   " auto neg: %A, status: %S\n",
+   &link);
}
printf("\t  -- promiscuous (%d)\n",
rte_eth_promiscuous_get(i));
diff --git a/app/test-pipeline/init.c b/app/test-pipeline/init.c
index 67d54ae05..b59064672 100644
--- a/app/test-pipeline/init.c
+++ b/app/test-pipeline/init.c
@@ -155,7 +155,7 @@ static void
 app_ports_check_link(void)
 {
uint32_t all_ports_up, i;
-
+   char link_status_text[50];
all_ports_up = 1;
 
for (i = 0; i < app.n_ports; i++) {
@@ -173,12 +173,11 @@ app_ports_check_link(void)
all_ports_up = 0;
continue;
}
-
-   RTE_LOG(INFO, USER1, "Port %u (%u Gbps) %s\n",
+   rte_eth_link_strf(link_status_text, 50, "(%G Gbps) %S",
+ &link);
+   RTE_LOG(INFO, USER1, "Port %u %s\n",
port,
-   link.link_speed / 1000,
-   link.link_status ? "UP" : "DOWN");
-
+   link_status_text);
if (link.link_status == ETH_LINK_DOWN)
all_ports_up = 0;
}
diff --git a/app/test-pmd/config.c b/app/test-pmd/config.c
index a7112c998..cb2795a94 100644
--- a/app/test-pmd/config.c
+++ b/app/test-pmd/config.c
@@ -604,10 +604,9 @@ port_infos_display(portid_t port_id)
} else
printf("\nmemory allocation on the socket: %u",port->socket_id);
 
-   printf("\nLink status: %s\n", (link.link_status) ? ("up") : ("down"));
-   printf("Link speed: %u Mbps\n", (unsigned) link.link_speed);
-   printf("Link duplex: %s\n", (link.link_duplex == ETH_LINK_FULL_DUPLEX) ?
-  ("full-duplex") : ("half-duplex"));
+   rte_eth_link_printf("\nLink status: %S\n"
+   "Link speed: %M Mbps\n"
+   "Link duplex: %D\n", &link);
 
if (!rte_eth_dev_get_mtu(port_id, &mtu))
printf("MTU: %u\n", mtu);
@@ -730,6 +729,8 @@ port_summary_display(portid_t port_id)
struct rte_eth_link link;
struct rte_eth_dev_info dev_info;
char name[RTE_ETH_NAME_MAX_LEN];
+   char status_text[6];
+   char speed_text[12];
int ret;
 
if (port_id_is_invalid(port_id, ENABLED_WARN)) {
@@ -750,12 +751,14 @@ port_summary_display(portid_t port_id)
if (ret != 0)
return;
 
-   printf("%-4d %02X:%02X:%02X:%02X:%02X:%02X %-12s %-14s %-8s %uMbps\n",
+   rte_eth_link_strf(status_text, 6, "%S", &link);
+   rte_eth_link_strf(speed_text, 12, "%M", &link);
+   printf("%-4d %02X:%02X:%02X:%02X:%02X:%02X %-12s %-14s %-8s %sMbps\n",
port_id, mac_addr.addr_bytes[0], mac_addr.addr_bytes[1],
mac_addr.addr_bytes[2], mac_addr.addr_bytes[3],
mac_addr.addr_bytes[4], mac_addr.addr_bytes[5], name,
-   dev_info.driver_name, (link.link_status) ? ("up") : ("down"),
-   (unsigned int) link.link_speed);
+   dev_info.driver_name, status_text,
+   speed_text);
 }
 
 void
@@ -3899,7 +3902,8 @@ set_queue_rate_limit(portid_t port_id, uint16_t queue_idx, uint16_t rate)
ret = eth_link_get_nowait_print_err(port_id, &link);
if (ret < 0)
return 1;
-   if (rate > link.link_speed) {
+   if (link.link_speed != ETH_SPEED_NUM_UNKNOWN &&
+   rate > link.link_speed) {
printf("Invalid rate value:%u bigger than link speed: %u\n",
rate, link.link_speed);
return 1;
diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index 4989d22ca..a1b9c1c1c 100644
--- a/app/test-pmd/testpmd.c
+++ 

[dpdk-dev] [PATCH v7 04/25] doc: update sample app with unknown speed

2020-07-10 Thread Ivan Dyukov
Use the rte_eth_link_strf function in the callback example of the
link status interrupt sample application guide.

Signed-off-by: Ivan Dyukov 
---
 doc/guides/sample_app_ug/link_status_intr.rst | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/doc/guides/sample_app_ug/link_status_intr.rst b/doc/guides/sample_app_ug/link_status_intr.rst
index 04c40f285..596782b9d 100644
--- a/doc/guides/sample_app_ug/link_status_intr.rst
+++ b/doc/guides/sample_app_ug/link_status_intr.rst
@@ -158,6 +158,7 @@ An example callback function that has been written as indicated below.
 {
 struct rte_eth_link link;
 int ret;
+char link_status[200];
 
 RTE_SET_USED(param);
 
@@ -169,11 +170,10 @@ An example callback function that has been written as indicated below.
 if (ret < 0) {
 printf("Failed to get port %d link status: %s\n\n",
port_id, rte_strerror(-ret));
-} else if (link.link_status) {
-printf("Port %d Link Up - speed %u Mbps - %s\n\n", port_id, (unsigned)link.link_speed,
-  (link.link_duplex == ETH_LINK_FULL_DUPLEX) ? ("full-duplex") : ("half-duplex"));
-} else
-printf("Port %d Link Down\n\n", port_id);
+} else {
+rte_eth_link_strf(link_status, 200, NULL, &link);
+printf("Port %d %s\n\n", port_id, link_status);
+}
 }
 
 This function is called when a link status interrupt is present for the right port.
-- 
2.17.1



[dpdk-dev] [PATCH v7 07/25] net/ice: return unknown speed in status

2020-07-10 Thread Ivan Dyukov
rte_ethdev now declares a new ETH_SPEED_NUM_UNKNOWN speed, which
can be used when no speed information is available and the
link is up. ETH_SPEED_NUM_NONE should be returned if the link is down.
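
The rule stated in this message can be sketched as a small mapping function. This is only an illustration of the rule, not the driver code in the diff below: report_link_speed() is a hypothetical helper, the constants are local copies of the ethdev values, and treating a hardware speed of 0 as "not recognised" is an assumption made for the example.

```c
#include <stdint.h>

/* Local copies of the ethdev values so the sketch builds without DPDK. */
#define SPEED_NUM_NONE    0
#define SPEED_NUM_UNKNOWN UINT32_MAX

/*
 * Hypothetical helper: choose the speed a driver would report.
 * hw_speed_mbps == 0 models "hardware gave no recognisable speed".
 */
static uint32_t
report_link_speed(int link_up, uint32_t hw_speed_mbps)
{
	if (!link_up)
		return SPEED_NUM_NONE;		/* link down: no speed */
	if (hw_speed_mbps == 0)
		return SPEED_NUM_UNKNOWN;	/* link up, speed not known */
	return hw_speed_mbps;
}
```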

Signed-off-by: Ivan Dyukov 
Reviewed-by: Ferruh Yigit 
---
 drivers/net/ice/ice_ethdev.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ice/ice_ethdev.c b/drivers/net/ice/ice_ethdev.c
index b51fa2f17..76f797de0 100644
--- a/drivers/net/ice/ice_ethdev.c
+++ b/drivers/net/ice/ice_ethdev.c
@@ -3135,8 +3135,11 @@ ice_link_update(struct rte_eth_dev *dev, int wait_to_complete)
link.link_speed = ETH_SPEED_NUM_100G;
break;
case ICE_AQ_LINK_SPEED_UNKNOWN:
-   default:
PMD_DRV_LOG(ERR, "Unknown link speed");
+   link.link_speed = ETH_SPEED_NUM_UNKNOWN;
+   break;
+   default:
+   PMD_DRV_LOG(ERR, "None link speed");
link.link_speed = ETH_SPEED_NUM_NONE;
break;
}
-- 
2.17.1



[dpdk-dev] [PATCH v7 05/25] net/ixgbe: return unknown speed in status

2020-07-10 Thread Ivan Dyukov
rte_ethdev now declares a new ETH_SPEED_NUM_UNKNOWN speed, which
can be used when no speed information is available.

Signed-off-by: Ivan Dyukov 
Reviewed-by: Wei Zhao 
---
 drivers/net/ixgbe/ixgbe_ethdev.c | 6 +-
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c b/drivers/net/ixgbe/ixgbe_ethdev.c
index 248f21d14..34a171116 100644
--- a/drivers/net/ixgbe/ixgbe_ethdev.c
+++ b/drivers/net/ixgbe/ixgbe_ethdev.c
@@ -4300,11 +4300,7 @@ ixgbe_dev_link_update_share(struct rte_eth_dev *dev,
switch (link_speed) {
default:
case IXGBE_LINK_SPEED_UNKNOWN:
-   if (hw->device_id == IXGBE_DEV_ID_X550EM_A_1G_T ||
-   hw->device_id == IXGBE_DEV_ID_X550EM_A_1G_T_L)
-   link.link_speed = ETH_SPEED_NUM_10M;
-   else
-   link.link_speed = ETH_SPEED_NUM_100M;
+   link.link_speed = ETH_SPEED_NUM_UNKNOWN;
break;
 
case IXGBE_LINK_SPEED_100_FULL:
-- 
2.17.1



[dpdk-dev] [PATCH v7 08/25] examples: new link status print format

2020-07-10 Thread Ivan Dyukov
Add usage of rte_eth_link_strf function to example
applications:
* ipv4_multicast
* l2fwd-jobstats
* l2fwd-keepalive
* l3fwd
* link_status_interrupt

Signed-off-by: Ivan Dyukov 
---
 examples/ipv4_multicast/main.c| 12 ---
 examples/l2fwd-jobstats/main.c| 12 ---
 examples/l2fwd-keepalive/main.c   | 12 ---
 examples/l3fwd/main.c | 12 ---
 examples/link_status_interrupt/main.c | 30 +++
 5 files changed, 28 insertions(+), 50 deletions(-)

diff --git a/examples/ipv4_multicast/main.c b/examples/ipv4_multicast/main.c
index 7e255c35a..0d4957658 100644
--- a/examples/ipv4_multicast/main.c
+++ b/examples/ipv4_multicast/main.c
@@ -572,6 +572,7 @@ check_all_ports_link_status(uint32_t port_mask)
uint8_t count, all_ports_up, print_flag = 0;
struct rte_eth_link link;
int ret;
+   char link_status_text[60];
 
printf("\nChecking link status");
fflush(stdout);
@@ -591,14 +592,9 @@ check_all_ports_link_status(uint32_t port_mask)
}
/* print link status if flag set */
if (print_flag == 1) {
-   if (link.link_status)
-   printf(
-   "Port%d Link Up. Speed %u Mbps - %s\n",
-   portid, link.link_speed,
-   (link.link_duplex == ETH_LINK_FULL_DUPLEX) ?
-   ("full-duplex") : ("half-duplex"));
-   else
-   printf("Port %d Link Down\n", portid);
+   rte_eth_link_strf(link_status_text, 60, NULL,
+ &link);
+   printf("Port %d %s", portid, link_status_text);
continue;
}
/* clear all_ports_up flag if any link down */
diff --git a/examples/l2fwd-jobstats/main.c b/examples/l2fwd-jobstats/main.c
index 47a3b0976..740f1c80f 100644
--- a/examples/l2fwd-jobstats/main.c
+++ b/examples/l2fwd-jobstats/main.c
@@ -689,6 +689,7 @@ check_all_ports_link_status(uint32_t port_mask)
uint8_t count, all_ports_up, print_flag = 0;
struct rte_eth_link link;
int ret;
+   char link_status_text[60];
 
printf("\nChecking link status");
fflush(stdout);
@@ -708,14 +709,9 @@ check_all_ports_link_status(uint32_t port_mask)
}
/* print link status if flag set */
if (print_flag == 1) {
-   if (link.link_status)
-   printf(
-   "Port%d Link Up. Speed %u Mbps - %s\n",
-   portid, link.link_speed,
-   (link.link_duplex == ETH_LINK_FULL_DUPLEX) ?
-   ("full-duplex") : ("half-duplex"));
-   else
-   printf("Port %d Link Down\n", portid);
+   rte_eth_link_strf(link_status_text, 60, NULL,
+ &link);
+   printf("Port %d %s", portid, link_status_text);
continue;
}
/* clear all_ports_up flag if any link down */
diff --git a/examples/l2fwd-keepalive/main.c b/examples/l2fwd-keepalive/main.c
index b2742633b..d8be0a727 100644
--- a/examples/l2fwd-keepalive/main.c
+++ b/examples/l2fwd-keepalive/main.c
@@ -453,6 +453,7 @@ check_all_ports_link_status(uint32_t port_mask)
uint8_t count, all_ports_up, print_flag = 0;
struct rte_eth_link link;
int ret;
+   char link_status_text[60];
 
printf("\nChecking link status");
fflush(stdout);
@@ -472,14 +473,9 @@ check_all_ports_link_status(uint32_t port_mask)
}
/* print link status if flag set */
if (print_flag == 1) {
-   if (link.link_status)
-   printf(
-   "Port%d Link Up. Speed %u Mbps - %s\n",
-   portid, link.link_speed,
-   (link.link_duplex == ETH_LINK_FULL_DUPLEX) ?
-   ("full-duplex") : ("half-duplex"));
-   else
-   printf("Port %d Link Down\n", portid);
+   rte_eth_link_strf(link_status_text, 60, NULL,
+ &link);
+   

[dpdk-dev] [PATCH v7 06/25] net/i40e: return unknown speed in status

2020-07-10 Thread Ivan Dyukov
rte_ethdev now declares a new ETH_SPEED_NUM_UNKNOWN speed, which
can be used when no speed information is available and the
link is up. ETH_SPEED_NUM_NONE should be returned if the link is down.

Signed-off-by: Ivan Dyukov 
Acked-by: Jeff Guo 
---
 drivers/net/i40e/i40e_ethdev.c|  5 -
 drivers/net/i40e/i40e_ethdev_vf.c | 10 +-
 2 files changed, 9 insertions(+), 6 deletions(-)

diff --git a/drivers/net/i40e/i40e_ethdev.c b/drivers/net/i40e/i40e_ethdev.c
index 472ce2a9e..f718356b5 100644
--- a/drivers/net/i40e/i40e_ethdev.c
+++ b/drivers/net/i40e/i40e_ethdev.c
@@ -2891,7 +2891,10 @@ update_link_aq(struct i40e_hw *hw, struct rte_eth_link *link,
link->link_speed = ETH_SPEED_NUM_40G;
break;
default:
-   link->link_speed = ETH_SPEED_NUM_NONE;
+   if (link->link_status)
+   link->link_speed = ETH_SPEED_NUM_UNKNOWN;
+   else
+   link->link_speed = ETH_SPEED_NUM_NONE;
break;
}
 }
diff --git a/drivers/net/i40e/i40e_ethdev_vf.c b/drivers/net/i40e/i40e_ethdev_vf.c
index eca716a6a..cf931bf9c 100644
--- a/drivers/net/i40e/i40e_ethdev_vf.c
+++ b/drivers/net/i40e/i40e_ethdev_vf.c
@@ -2163,15 +2163,15 @@ i40evf_dev_link_update(struct rte_eth_dev *dev,
new_link.link_speed = ETH_SPEED_NUM_40G;
break;
default:
-   new_link.link_speed = ETH_SPEED_NUM_NONE;
+   if (vf->link_up)
+   new_link.link_speed = ETH_SPEED_NUM_UNKNOWN;
+   else
+   new_link.link_speed = ETH_SPEED_NUM_NONE;
break;
}
/* full duplex only */
new_link.link_duplex = ETH_LINK_FULL_DUPLEX;
-   new_link.link_status = vf->link_up &&
-   new_link.link_speed != ETH_SPEED_NUM_NONE
-   ? ETH_LINK_UP
-   : ETH_LINK_DOWN;
+   new_link.link_status = vf->link_up ? ETH_LINK_UP : ETH_LINK_DOWN;
new_link.link_autoneg =
!(dev->data->dev_conf.link_speeds & ETH_LINK_SPEED_FIXED);
 
-- 
2.17.1



[dpdk-dev] [PATCH v7 10/25] examples/ioat: new link status print format

2020-07-10 Thread Ivan Dyukov
Add usage of rte_eth_link_strf function to example
applications

Signed-off-by: Ivan Dyukov 
---
 examples/ioat/ioatfwd.c | 13 +
 1 file changed, 5 insertions(+), 8 deletions(-)

diff --git a/examples/ioat/ioatfwd.c b/examples/ioat/ioatfwd.c
index b66ee73bc..8bf80c262 100644
--- a/examples/ioat/ioatfwd.c
+++ b/examples/ioat/ioatfwd.c
@@ -700,6 +700,7 @@ check_link_status(uint32_t port_mask)
uint16_t portid;
struct rte_eth_link link;
int ret, link_status = 0;
+   char link_status_text[60];
 
printf("\nChecking link status\n");
RTE_ETH_FOREACH_DEV(portid) {
@@ -715,15 +716,11 @@ check_link_status(uint32_t port_mask)
}
 
/* Print link status */
-   if (link.link_status) {
-   printf(
-   "Port %d Link Up. Speed %u Mbps - %s\n",
-   portid, link.link_speed,
-   (link.link_duplex == ETH_LINK_FULL_DUPLEX) ?
-   ("full-duplex") : ("half-duplex"));
+   rte_eth_link_strf(link_status_text, 60, NULL, &link);
+   printf("Port %d %s", portid, link_status_text);
+
+   if (link.link_status)
link_status = 1;
-   } else
-   printf("Port %d Link Down\n", portid);
}
return link_status;
 }
-- 
2.17.1



[dpdk-dev] [PATCH v7 12/25] examples/ip_pipeline: new link status print format

2020-07-10 Thread Ivan Dyukov
Add usage of rte_eth_link_strf function to example
applications

Signed-off-by: Ivan Dyukov 
---
 examples/ip_pipeline/cli.c | 12 +++-
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/examples/ip_pipeline/cli.c b/examples/ip_pipeline/cli.c
index d79699e2e..ca461ea0c 100644
--- a/examples/ip_pipeline/cli.c
+++ b/examples/ip_pipeline/cli.c
@@ -249,7 +249,8 @@ print_link_info(struct link *link, char *out, size_t out_size)
struct rte_eth_link eth_link;
uint16_t mtu;
int ret;
-
+   char link_speed_text[16];
+   char link_status_text[10];
memset(&stats, 0, sizeof(stats));
rte_eth_stats_get(link->port_id, &stats);
 
@@ -268,18 +269,19 @@ print_link_info(struct link *link, char *out, size_t out_size)
}
 
rte_eth_dev_get_mtu(link->port_id, &mtu);
-
+   rte_eth_link_strf(link_speed_text, 16, "%M", &eth_link);
+   rte_eth_link_strf(link_status_text, 10, "%S", &eth_link);
snprintf(out, out_size,
"\n"
"%s: flags=<%s> mtu %u\n"
"\tether %02X:%02X:%02X:%02X:%02X:%02X rxqueues %u txqueues %u\n"
-   "\tport# %u  speed %u Mbps\n"
+   "\tport# %u  speed %s Mbps\n"
"\tRX packets %" PRIu64"  bytes %" PRIu64"\n"
"\tRX errors %" PRIu64"  missed %" PRIu64"  no-mbuf %" PRIu64"\n"
"\tTX packets %" PRIu64"  bytes %" PRIu64"\n"
"\tTX errors %" PRIu64"\n",
link->name,
-   eth_link.link_status == 0 ? "DOWN" : "UP",
+   link_status_text,
mtu,
mac_addr.addr_bytes[0], mac_addr.addr_bytes[1],
mac_addr.addr_bytes[2], mac_addr.addr_bytes[3],
@@ -287,7 +289,7 @@ print_link_info(struct link *link, char *out, size_t out_size)
link->n_rxq,
link->n_txq,
link->port_id,
-   eth_link.link_speed,
+   link_speed_text,
stats.ipackets,
stats.ibytes,
stats.ierrors,
-- 
2.17.1



[dpdk-dev] [PATCH v7 11/25] examples/ip_*: new link status print format

2020-07-10 Thread Ivan Dyukov
Add usage of rte_eth_link_strf function to example
applications:
* ip_fragmentation
* ip_reassembly
* l3fwd-acl

Signed-off-by: Ivan Dyukov 
---
 examples/ip_fragmentation/main.c | 13 +
 examples/ip_reassembly/main.c| 12 
 examples/l3fwd-acl/main.c| 12 
 3 files changed, 13 insertions(+), 24 deletions(-)

diff --git a/examples/ip_fragmentation/main.c b/examples/ip_fragmentation/main.c
index 4afb97109..18a6df77e 100644
--- a/examples/ip_fragmentation/main.c
+++ b/examples/ip_fragmentation/main.c
@@ -593,6 +593,7 @@ check_all_ports_link_status(uint32_t port_mask)
uint8_t count, all_ports_up, print_flag = 0;
struct rte_eth_link link;
int ret;
+   char link_status_text[60];
 
printf("\nChecking link status");
fflush(stdout);
@@ -612,14 +613,10 @@ check_all_ports_link_status(uint32_t port_mask)
}
/* print link status if flag set */
if (print_flag == 1) {
-   if (link.link_status)
-   printf(
-   "Port%d Link Up .Speed %u Mbps - %s\n",
-   portid, link.link_speed,
-   (link.link_duplex == ETH_LINK_FULL_DUPLEX) ?
-   ("full-duplex") : ("half-duplex"));
-   else
-   printf("Port %d Link Down\n", portid);
+   rte_eth_link_strf(link_status_text, 60, NULL,
+   &link);
+   printf("Port %d %s", portid,
+  link_status_text);
continue;
}
/* clear all_ports_up flag if any link down */
diff --git a/examples/ip_reassembly/main.c b/examples/ip_reassembly/main.c
index 494d7ee77..910c89ae3 100644
--- a/examples/ip_reassembly/main.c
+++ b/examples/ip_reassembly/main.c
@@ -712,6 +712,7 @@ check_all_ports_link_status(uint32_t port_mask)
uint8_t count, all_ports_up, print_flag = 0;
struct rte_eth_link link;
int ret;
+   char link_status_text[60];
 
printf("\nChecking link status");
fflush(stdout);
@@ -731,14 +732,9 @@ check_all_ports_link_status(uint32_t port_mask)
}
/* print link status if flag set */
if (print_flag == 1) {
-   if (link.link_status)
-   printf(
-   "Port%d Link Up. Speed %u Mbps - %s\n",
-   portid, link.link_speed,
-   (link.link_duplex == ETH_LINK_FULL_DUPLEX) ?
-   ("full-duplex") : ("half-duplex"));
-   else
-   printf("Port %d Link Down\n", portid);
+   rte_eth_link_strf(link_status_text, 60, NULL,
+   &link);
+   printf("Port %d %s", portid, link_status_text);
continue;
}
/* clear all_ports_up flag if any link down */
diff --git a/examples/l3fwd-acl/main.c b/examples/l3fwd-acl/main.c
index f22fca732..ddfec9487 100644
--- a/examples/l3fwd-acl/main.c
+++ b/examples/l3fwd-acl/main.c
@@ -1815,6 +1815,7 @@ check_all_ports_link_status(uint32_t port_mask)
uint8_t count, all_ports_up, print_flag = 0;
struct rte_eth_link link;
int ret;
+   char link_status_text[60];
 
printf("\nChecking link status");
fflush(stdout);
@@ -1834,14 +1835,9 @@ check_all_ports_link_status(uint32_t port_mask)
}
/* print link status if flag set */
if (print_flag == 1) {
-   if (link.link_status)
-   printf(
-   "Port%d Link Up. Speed %u Mbps %s\n",
-   portid, link.link_speed,
-   (link.link_duplex == ETH_LINK_FULL_DUPLEX) ?
-   ("full-duplex") : ("half-duplex"));
-   else
-   printf("Port %d Link Down\n", portid);
+   rte_eth_link_strf(link_status_text, 60, NULL,
+   &link);
+   printf("Port %d %s", portid, link_status_text);
continue;
}
/* clear all
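
The refactoring pattern applied across these examples can be sketched in
isolation. The block below is only an illustration under stated
assumptions, not the rte_eth_link_strf implementation from this series:
struct demo_link and demo_link_strf() are hypothetical stand-ins for
struct rte_eth_link and the proposed helper, and the output wording is
assumed.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the struct rte_eth_link fields used above. */
struct demo_link {
	uint32_t speed_mbps;  /* link speed in Mbps */
	unsigned full_duplex; /* non-zero: full duplex */
	unsigned up;          /* non-zero: link is up */
};

/*
 * Format the link status into buf (at most len bytes, NUL-terminated
 * when len > 0). Returns the snprintf() result, i.e. the length the
 * full text needs, so a caller can detect truncation.
 */
static int
demo_link_strf(char *buf, size_t len, const struct demo_link *link)
{
	assert(link != NULL);

	if (!link->up)
		return snprintf(buf, len, "Link Down\n");

	return snprintf(buf, len, "Link Up - speed %u Mbps - %s\n",
			(unsigned)link->speed_mbps,
			link->full_duplex ? "full-duplex" : "half-duplex");
}
```

A call site would pass sizeof(link_status_text) rather than repeating the
literal 60, so the buffer size is stated in one place only.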

[dpdk-dev] [PATCH v7 13/25] examples/ipsec-secgw: new link status print format

2020-07-10 Thread Ivan Dyukov
Add usage of rte_eth_link_strf function to example
applications

Signed-off-by: Ivan Dyukov 
---
 examples/ipsec-secgw/ipsec-secgw.c | 12 
 1 file changed, 4 insertions(+), 8 deletions(-)

diff --git a/examples/ipsec-secgw/ipsec-secgw.c b/examples/ipsec-secgw/ipsec-secgw.c
index f777ce2af..551838229 100644
--- a/examples/ipsec-secgw/ipsec-secgw.c
+++ b/examples/ipsec-secgw/ipsec-secgw.c
@@ -1775,6 +1775,7 @@ check_all_ports_link_status(uint32_t port_mask)
uint8_t count, all_ports_up, print_flag = 0;
struct rte_eth_link link;
int ret;
+   char link_status_text[60];
 
printf("\nChecking link status");
fflush(stdout);
@@ -1794,14 +1795,9 @@ check_all_ports_link_status(uint32_t port_mask)
}
/* print link status if flag set */
if (print_flag == 1) {
-   if (link.link_status)
-   printf(
-   "Port%d Link Up - speed %u Mbps -%s\n",
-   portid, link.link_speed,
-   (link.link_duplex == ETH_LINK_FULL_DUPLEX) ?
-   ("full-duplex") : ("half-duplex"));
-   else
-   printf("Port %d Link Down\n", portid);
+   rte_eth_link_strf(link_status_text, 60, NULL,
+   &link);
+   printf("Port %d %s", portid, link_status_text);
continue;
}
/* clear all_ports_up flag if any link down */
-- 
2.17.1



[dpdk-dev] [PATCH v7 09/25] examples/bbdev_app: new link status print format

2020-07-10 Thread Ivan Dyukov
Add usage of rte_eth_link_strf function to example
applications

Signed-off-by: Ivan Dyukov 
---
 examples/bbdev_app/main.c | 8 +++-
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/examples/bbdev_app/main.c b/examples/bbdev_app/main.c
index 68a46050c..44e6952e6 100644
--- a/examples/bbdev_app/main.c
+++ b/examples/bbdev_app/main.c
@@ -313,6 +313,7 @@ check_port_link_status(uint16_t port_id)
uint8_t count;
struct rte_eth_link link;
int link_get_err = -EINVAL;
+   char link_status_text[60];
 
printf("\nChecking link status.");
fflush(stdout);
@@ -323,11 +324,8 @@ check_port_link_status(uint16_t port_id)
link_get_err = rte_eth_link_get_nowait(port_id, &link);
 
if (link_get_err >= 0 && link.link_status) {
-   const char *dp = (link.link_duplex ==
-   ETH_LINK_FULL_DUPLEX) ?
-   "full-duplex" : "half-duplex";
-   printf("\nPort %u Link Up - speed %u Mbps - %s\n",
-   port_id, link.link_speed, dp);
+   rte_eth_link_strf(link_status_text, 60, NULL, &link);
+   printf("\nPort %u %s", port_id, link_status_text);
return 0;
}
printf(".");
-- 
2.17.1



[dpdk-dev] [PATCH v7 14/25] examples/kni: new link status print format

2020-07-10 Thread Ivan Dyukov
Add usage of rte_eth_link_strf function to example
applications

Signed-off-by: Ivan Dyukov 
---
 examples/kni/main.c | 26 +-
 1 file changed, 9 insertions(+), 17 deletions(-)

diff --git a/examples/kni/main.c b/examples/kni/main.c
index f5d12a5b8..8ad7fb532 100644
--- a/examples/kni/main.c
+++ b/examples/kni/main.c
@@ -661,6 +661,7 @@ check_all_ports_link_status(uint32_t port_mask)
uint8_t count, all_ports_up, print_flag = 0;
struct rte_eth_link link;
int ret;
+   char link_status_text[60];
 
printf("\nChecking link status\n");
fflush(stdout);
@@ -680,14 +681,9 @@ check_all_ports_link_status(uint32_t port_mask)
}
/* print link status if flag set */
if (print_flag == 1) {
-   if (link.link_status)
-   printf(
-   "Port%d Link Up - speed %uMbps - %s\n",
-   portid, link.link_speed,
-   (link.link_duplex == ETH_LINK_FULL_DUPLEX) ?
-   ("full-duplex") : ("half-duplex"));
-   else
-   printf("Port %d Link Down\n", portid);
+   rte_eth_link_strf(link_status_text, 60, NULL,
+   &link);
+   printf("Port %d %s", portid, link_status_text);
continue;
}
/* clear all_ports_up flag if any link down */
@@ -717,19 +713,15 @@ check_all_ports_link_status(uint32_t port_mask)
 static void
 log_link_state(struct rte_kni *kni, int prev, struct rte_eth_link *link)
 {
+   char link_status_text[60];
if (kni == NULL || link == NULL)
return;
 
-   if (prev == ETH_LINK_DOWN && link->link_status == ETH_LINK_UP) {
-   RTE_LOG(INFO, APP, "%s NIC Link is Up %d Mbps %s %s.\n",
+   rte_eth_link_strf(link_status_text, 60, NULL, link);
+   if (prev != link->link_status)
+   RTE_LOG(INFO, APP, "%s NIC %s",
rte_kni_get_name(kni),
-   link->link_speed,
-   link->link_autoneg ?  "(AutoNeg)" : "(Fixed)",
-   link->link_duplex ?  "Full Duplex" : "Half Duplex");
-   } else if (prev == ETH_LINK_UP && link->link_status == ETH_LINK_DOWN) {
-   RTE_LOG(INFO, APP, "%s NIC Link is Down.\n",
-   rte_kni_get_name(kni));
-   }
+   link_status_text);
 }
 
 /*
-- 
2.17.1



[dpdk-dev] [PATCH v7 15/25] examples/l2fwd-crypt: new link status print format

2020-07-10 Thread Ivan Dyukov
Add usage of rte_eth_link_strf function to example
applications

Signed-off-by: Ivan Dyukov 
---
 examples/l2fwd-crypto/main.c | 12 
 1 file changed, 4 insertions(+), 8 deletions(-)

diff --git a/examples/l2fwd-crypto/main.c b/examples/l2fwd-crypto/main.c
index 827da9b3e..7648ea027 100644
--- a/examples/l2fwd-crypto/main.c
+++ b/examples/l2fwd-crypto/main.c
@@ -1734,6 +1734,7 @@ check_all_ports_link_status(uint32_t port_mask)
uint8_t count, all_ports_up, print_flag = 0;
struct rte_eth_link link;
int ret;
+   char link_status_text[60];
 
printf("\nChecking link status");
fflush(stdout);
@@ -1753,14 +1754,9 @@ check_all_ports_link_status(uint32_t port_mask)
}
/* print link status if flag set */
if (print_flag == 1) {
-   if (link.link_status)
-   printf(
-   "Port%d Link Up. Speed %u Mbps - %s\n",
-   portid, link.link_speed,
-   (link.link_duplex == ETH_LINK_FULL_DUPLEX) ?
-   ("full-duplex") : ("half-duplex"));
-   else
-   printf("Port %d Link Down\n", portid);
+   rte_eth_link_strf(link_status_text, 60, NULL,
+   &link);
+   printf("Port %d %s", portid, link_status_text);
continue;
}
/* clear all_ports_up flag if any link down */
-- 
2.17.1



[dpdk-dev] [PATCH v7 17/25] examples/l2fwd: new link status print format

2020-07-10 Thread Ivan Dyukov
Add usage of rte_eth_link_strf function to example
applications

Signed-off-by: Ivan Dyukov 
---
 examples/l2fwd/main.c | 12 
 1 file changed, 4 insertions(+), 8 deletions(-)

diff --git a/examples/l2fwd/main.c b/examples/l2fwd/main.c
index e04c601b5..9d5f7307e 100644
--- a/examples/l2fwd/main.c
+++ b/examples/l2fwd/main.c
@@ -571,6 +571,7 @@ check_all_ports_link_status(uint32_t port_mask)
uint8_t count, all_ports_up, print_flag = 0;
struct rte_eth_link link;
int ret;
+   char link_status_text[60];
 
printf("\nChecking link status");
fflush(stdout);
@@ -594,14 +595,9 @@ check_all_ports_link_status(uint32_t port_mask)
}
/* print link status if flag set */
if (print_flag == 1) {
-   if (link.link_status)
-   printf(
-   "Port%d Link Up. Speed %u Mbps - %s\n",
-   portid, link.link_speed,
-   (link.link_duplex == ETH_LINK_FULL_DUPLEX) ?
-   ("full-duplex") : ("half-duplex"));
-   else
-   printf("Port %d Link Down\n", portid);
+   rte_eth_link_strf(link_status_text, 60, NULL,
+   &link);
+   printf("Port %d %s", portid, link_status_text);
continue;
}
/* clear all_ports_up flag if any link down */
-- 
2.17.1



[dpdk-dev] [PATCH v7 16/25] examples/l2fwd-event: new link status print format

2020-07-10 Thread Ivan Dyukov
Add usage of rte_eth_link_strf function to example
applications

Signed-off-by: Ivan Dyukov 
---
 examples/l2fwd-event/main.c | 12 
 1 file changed, 4 insertions(+), 8 deletions(-)

diff --git a/examples/l2fwd-event/main.c b/examples/l2fwd-event/main.c
index 4fe500333..3e6d1c311 100644
--- a/examples/l2fwd-event/main.c
+++ b/examples/l2fwd-event/main.c
@@ -366,6 +366,7 @@ check_all_ports_link_status(struct l2fwd_resources *rsrc,
uint8_t count, all_ports_up, print_flag = 0;
struct rte_eth_link link;
int ret;
+   char link_status_text[60];
 
printf("\nChecking link status...");
fflush(stdout);
@@ -389,14 +390,9 @@ check_all_ports_link_status(struct l2fwd_resources *rsrc,
}
/* print link status if flag set */
if (print_flag == 1) {
-   if (link.link_status)
-   printf(
-   "Port%d Link Up. Speed %u Mbps - %s\n",
-   port_id, link.link_speed,
-   (link.link_duplex == ETH_LINK_FULL_DUPLEX) ?
-   ("full-duplex") : ("half-duplex"));
-   else
-   printf("Port %d Link Down\n", port_id);
+   rte_eth_link_strf(link_status_text, 60, NULL,
+   &link);
+   printf("Port %d %s", port_id, link_status_text);
continue;
}
/* clear all_ports_up flag if any link down */
-- 
2.17.1



[dpdk-dev] [PATCH v7 23/25] examples/qos_sched: new link status print format

2020-07-10 Thread Ivan Dyukov
Add usage of rte_eth_link_strf function to example
applications

Signed-off-by: Ivan Dyukov 
---
 examples/qos_sched/init.c | 10 ++
 1 file changed, 2 insertions(+), 8 deletions(-)

diff --git a/examples/qos_sched/init.c b/examples/qos_sched/init.c
index 9626c15b8..4bb975fc9 100644
--- a/examples/qos_sched/init.c
+++ b/examples/qos_sched/init.c
@@ -160,14 +160,8 @@ app_init_port(uint16_t portid, struct rte_mempool *mp)
 "rte_eth_link_get: err=%d, port=%u: %s\n",
 ret, portid, rte_strerror(-ret));
 
-   if (link.link_status) {
-   printf(" Link Up - speed %u Mbps - %s\n",
-   (uint32_t) link.link_speed,
-   (link.link_duplex == ETH_LINK_FULL_DUPLEX) ?
-   ("full-duplex") : ("half-duplex"));
-   } else {
-   printf(" Link Down\n");
-   }
+   rte_eth_link_printf(NULL, &link);
+
ret = rte_eth_promiscuous_enable(portid);
if (ret != 0)
rte_exit(EXIT_FAILURE,
-- 
2.17.1



[dpdk-dev] [PATCH v7 18/25] examples/l3fwd-graph: new link status print format

2020-07-10 Thread Ivan Dyukov
Add usage of rte_eth_link_strf function to example
applications

Signed-off-by: Ivan Dyukov 
---
 examples/l3fwd-graph/main.c | 14 --
 1 file changed, 4 insertions(+), 10 deletions(-)

diff --git a/examples/l3fwd-graph/main.c b/examples/l3fwd-graph/main.c
index c70270c4d..cd8e3aad1 100644
--- a/examples/l3fwd-graph/main.c
+++ b/examples/l3fwd-graph/main.c
@@ -599,6 +599,7 @@ check_all_ports_link_status(uint32_t port_mask)
struct rte_eth_link link;
uint16_t portid;
int ret;
+   char link_status_text[60];
 
printf("\nChecking link status");
fflush(stdout);
@@ -623,16 +624,9 @@ check_all_ports_link_status(uint32_t port_mask)
}
/* Print link status if flag set */
if (print_flag == 1) {
-   if (link.link_status)
-   printf("Port%d Link Up. Speed %u Mbps "
-  "-%s\n",
-  portid, link.link_speed,
-  (link.link_duplex ==
-   ETH_LINK_FULL_DUPLEX)
-  ? ("full-duplex")
-  : ("half-duplex\n"));
-   else
-   printf("Port %d Link Down\n", portid);
+   rte_eth_link_strf(link_status_text, 60, NULL,
+   &link);
+   printf("Port %d %s", portid, link_status_text);
continue;
}
/* Clear all_ports_up flag if any link down */
-- 
2.17.1



[dpdk-dev] [PATCH v7 22/25] example/performance*: new link status print format

2020-07-10 Thread Ivan Dyukov
Add usage of rte_eth_link_strf function to example
applications

Signed-off-by: Ivan Dyukov 
---
 examples/performance-thread/l3fwd-thread/main.c | 12 
 1 file changed, 4 insertions(+), 8 deletions(-)

diff --git a/examples/performance-thread/l3fwd-thread/main.c b/examples/performance-thread/l3fwd-thread/main.c
index 84c1d7b3a..bd10014a0 100644
--- a/examples/performance-thread/l3fwd-thread/main.c
+++ b/examples/performance-thread/l3fwd-thread/main.c
@@ -3433,6 +3433,7 @@ check_all_ports_link_status(uint32_t port_mask)
uint8_t count, all_ports_up, print_flag = 0;
struct rte_eth_link link;
int ret;
+   char link_status_text[60];
 
printf("\nChecking link status");
fflush(stdout);
@@ -3452,14 +3453,9 @@ check_all_ports_link_status(uint32_t port_mask)
}
/* print link status if flag set */
if (print_flag == 1) {
-   if (link.link_status)
-   printf(
-   "Port%d Link Up. Speed %u Mbps - %s\n",
-   portid, link.link_speed,
-   (link.link_duplex == ETH_LINK_FULL_DUPLEX) ?
-   ("full-duplex") : ("half-duplex"));
-   else
-   printf("Port %d Link Down\n", portid);
+   rte_eth_link_strf(link_status_text, 60, NULL,
+   &link);
+   printf("Port %d %s", portid, link_status_text);
continue;
}
/* clear all_ports_up flag if any link down */
-- 
2.17.1



[dpdk-dev] [PATCH v7 19/25] examples/l3fwd-power: new link status print format

2020-07-10 Thread Ivan Dyukov
Add usage of rte_eth_link_strf function to example
applications

Signed-off-by: Ivan Dyukov 
---
 examples/l3fwd-power/main.c | 13 -
 1 file changed, 4 insertions(+), 9 deletions(-)

diff --git a/examples/l3fwd-power/main.c b/examples/l3fwd-power/main.c
index 9db94ce04..ba6bab4a5 100644
--- a/examples/l3fwd-power/main.c
+++ b/examples/l3fwd-power/main.c
@@ -1945,6 +1945,7 @@ check_all_ports_link_status(uint32_t port_mask)
uint16_t portid;
struct rte_eth_link link;
int ret;
+   char link_status_text[60];
 
printf("\nChecking link status");
fflush(stdout);
@@ -1964,15 +1965,9 @@ check_all_ports_link_status(uint32_t port_mask)
}
/* print link status if flag set */
if (print_flag == 1) {
-   if (link.link_status)
-   printf("Port %d Link Up - speed %u "
-   "Mbps - %s\n", (uint8_t)portid,
-   (unsigned)link.link_speed,
-   (link.link_duplex == ETH_LINK_FULL_DUPLEX) ?
-   ("full-duplex") : ("half-duplex"));
-   else
-   printf("Port %d Link Down\n",
-   (uint8_t)portid);
+   rte_eth_link_strf(link_status_text, 60, NULL,
+   &link);
+   printf("Port %d %s", portid, link_status_text);
continue;
}
/* clear all_ports_up flag if any link down */
-- 
2.17.1



[dpdk-dev] [PATCH v7 20/25] examples/multi_proc*: new link status print format

2020-07-10 Thread Ivan Dyukov
Add usage of rte_eth_link_strf function to example
applications

Signed-off-by: Ivan Dyukov 
---
 .../client_server_mp/mp_server/init.c  | 14 +-
 examples/multi_process/symmetric_mp/main.c | 12 
 2 files changed, 9 insertions(+), 17 deletions(-)

diff --git a/examples/multi_process/client_server_mp/mp_server/init.c b/examples/multi_process/client_server_mp/mp_server/init.c
index c2ec07ac6..3ca9bcae3 100644
--- a/examples/multi_process/client_server_mp/mp_server/init.c
+++ b/examples/multi_process/client_server_mp/mp_server/init.c
@@ -185,6 +185,7 @@ check_all_ports_link_status(uint16_t port_num, uint32_t port_mask)
uint8_t count, all_ports_up, print_flag = 0;
struct rte_eth_link link;
int ret;
+   char link_status_text[60];
 
printf("\nChecking link status");
fflush(stdout);
@@ -204,15 +205,10 @@ check_all_ports_link_status(uint16_t port_num, uint32_t port_mask)
}
/* print link status if flag set */
if (print_flag == 1) {
-   if (link.link_status)
-   printf("Port %d Link Up - speed %u "
-   "Mbps - %s\n", ports->id[portid],
-   (unsigned)link.link_speed,
-   (link.link_duplex == ETH_LINK_FULL_DUPLEX) ?
-   ("full-duplex") : ("half-duplex"));
-   else
-   printf("Port %d Link Down\n",
-   (uint8_t)ports->id[portid]);
+   rte_eth_link_strf(link_status_text, 60, NULL,
+   &link);
+   printf("Port %d %s", (uint8_t)ports->id[portid],
+  link_status_text);
continue;
}
/* clear all_ports_up flag if any link down */
diff --git a/examples/multi_process/symmetric_mp/main.c b/examples/multi_process/symmetric_mp/main.c
index 9a16e198c..0480874f8 100644
--- a/examples/multi_process/symmetric_mp/main.c
+++ b/examples/multi_process/symmetric_mp/main.c
@@ -365,6 +365,7 @@ check_all_ports_link_status(uint16_t port_num, uint32_t port_mask)
uint8_t count, all_ports_up, print_flag = 0;
struct rte_eth_link link;
int ret;
+   char link_status_text[60];
 
printf("\nChecking link status");
fflush(stdout);
@@ -384,14 +385,9 @@ check_all_ports_link_status(uint16_t port_num, uint32_t port_mask)
}
/* print link status if flag set */
if (print_flag == 1) {
-   if (link.link_status)
-   printf(
-   "Port%d Link Up. Speed %u Mbps - %s\n",
-   portid, link.link_speed,
-   (link.link_duplex == ETH_LINK_FULL_DUPLEX) ?
-   ("full-duplex") : ("half-duplex"));
-   else
-   printf("Port %d Link Down\n", portid);
+   rte_eth_link_strf(link_status_text, 60, NULL,
+   &link);
+   printf("Port %d %s", portid, link_status_text);
continue;
}
/* clear all_ports_up flag if any link down */
-- 
2.17.1



[dpdk-dev] [PATCH v7 25/25] examples/vm_power_*: new link status print format

2020-07-10 Thread Ivan Dyukov
Add usage of rte_eth_link_strf function to example
applications

Signed-off-by: Ivan Dyukov 
---
 examples/vm_power_manager/main.c | 14 +-
 1 file changed, 5 insertions(+), 9 deletions(-)

diff --git a/examples/vm_power_manager/main.c b/examples/vm_power_manager/main.c
index 273bfec29..05aec1aad 100644
--- a/examples/vm_power_manager/main.c
+++ b/examples/vm_power_manager/main.c
@@ -244,6 +244,7 @@ check_all_ports_link_status(uint32_t port_mask)
uint16_t portid, count, all_ports_up, print_flag = 0;
struct rte_eth_link link;
int ret;
+   char link_status_text[60];
 
printf("\nChecking link status");
fflush(stdout);
@@ -267,15 +268,10 @@ check_all_ports_link_status(uint32_t port_mask)
}
/* print link status if flag set */
if (print_flag == 1) {
-   if (link.link_status)
-   printf("Port %d Link Up - speed %u "
-   "Mbps - %s\n", (uint16_t)portid,
-   (unsigned int)link.link_speed,
-   (link.link_duplex == ETH_LINK_FULL_DUPLEX) ?
-   ("full-duplex") : ("half-duplex"));
-   else
-   printf("Port %d Link Down\n",
-   (uint16_t)portid);
+   rte_eth_link_strf(link_status_text, 60, NULL,
+   &link);
+   printf("Port %d %s", portid,
+  link_status_text);
continue;
}
   /* clear all_ports_up flag if any link down */
-- 
2.17.1



[dpdk-dev] [PATCH v7 24/25] examples/server_nod*: new link status print format

2020-07-10 Thread Ivan Dyukov
Add usage of rte_eth_link_strf function to example
applications

Signed-off-by: Ivan Dyukov 
---
 examples/server_node_efd/server/init.c | 15 +--
 1 file changed, 5 insertions(+), 10 deletions(-)

diff --git a/examples/server_node_efd/server/init.c b/examples/server_node_efd/server/init.c
index 378a74fa5..00224850e 100644
--- a/examples/server_node_efd/server/init.c
+++ b/examples/server_node_efd/server/init.c
@@ -247,6 +247,7 @@ check_all_ports_link_status(uint16_t port_num, uint32_t port_mask)
uint16_t portid;
struct rte_eth_link link;
int ret;
+   char link_status_text[60];
 
printf("\nChecking link status");
fflush(stdout);
@@ -266,16 +267,10 @@ check_all_ports_link_status(uint16_t port_num, uint32_t port_mask)
}
/* print link status if flag set */
if (print_flag == 1) {
-   if (link.link_status)
-   printf(
-   "Port%d Link Up. Speed %u Mbps - %s\n",
-   info->id[portid],
-   link.link_speed,
-   (link.link_duplex == ETH_LINK_FULL_DUPLEX) ?
-   ("full-duplex") : ("half-duplex"));
-   else
-   printf("Port %d Link Down\n",
-   info->id[portid]);
+   rte_eth_link_strf(link_status_text, 60, NULL,
+   &link);
+   printf("Port %d %s", info->id[portid],
+  link_status_text);
continue;
}
/* clear all_ports_up flag if any link down */
-- 
2.17.1



[dpdk-dev] [PATCH v7 21/25] examples/ntb: new link status print format

2020-07-10 Thread Ivan Dyukov
Add usage of rte_eth_link_strf function to example
applications

Signed-off-by: Ivan Dyukov 
---
 examples/ntb/ntb_fwd.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/examples/ntb/ntb_fwd.c b/examples/ntb/ntb_fwd.c
index eba8ebf9f..84fe374c4 100644
--- a/examples/ntb/ntb_fwd.c
+++ b/examples/ntb/ntb_fwd.c
@@ -729,6 +729,7 @@ start_pkt_fwd(void)
struct rte_eth_link eth_link;
uint32_t lcore_id;
int ret, i;
+   char link_status_text[60];
 
ret = ntb_fwd_config_setup();
if (ret < 0) {
@@ -747,11 +748,10 @@ start_pkt_fwd(void)
return;
}
if (eth_link.link_status) {
-   printf("Eth%u Link Up. Speed %u Mbps - %s\n",
-   eth_port_id, eth_link.link_speed,
-   (eth_link.link_duplex ==
-ETH_LINK_FULL_DUPLEX) ?
-   ("full-duplex") : ("half-duplex"));
+   rte_eth_link_strf(link_status_text, 60, NULL,
+   &eth_link);
+   printf("Eth%u %s", eth_port_id,
+  link_status_text);
break;
}
}
-- 
2.17.1



Re: [dpdk-dev] [PATCH v3] eal: use c11 atomic built-ins for interrupt status

2020-07-10 Thread Dodji Seketeli
David Marchand  writes:

[...]

>> --- a/devtools/libabigail.abignore
>> +++ b/devtools/libabigail.abignore
>> @@ -48,6 +48,10 @@
>>  changed_enumerators = RTE_CRYPTO_AEAD_LIST_END
>>  [suppress_variable]
>>  name = rte_crypto_aead_algorithm_strings
>> +; Ignore updates of epoll event
>> +[suppress_type]
>> +type_kind = struct
>> +name = rte_epoll_event
>
> In general, ignoring all changes on a structure is risky.
> But the risk is acceptable as long as we remember this for the rest of
> the 20.08 release (and we will start from scratch for 20.11).

Right, I thought about this too when I saw that change.  If that struct
is inherently *not* part of the logically exposed ABI, the risk is
really minimal as well.  In that case, maybe a comment saying so in the
.abignore file could be useful for future reference.

[...]

Cheers,

-- 
Dodji



[dpdk-dev] [Bug 505] [dpdk-20.08] meson build 32-bits failed on ubuntu20.04

2020-07-10 Thread bugzilla
https://bugs.dpdk.org/show_bug.cgi?id=505

Bug ID: 505
   Summary: [dpdk-20.08] meson build 32-bits failed on ubuntu20.04
   Product: DPDK
   Version: 20.08
  Hardware: x86
OS: Linux
Status: UNCONFIRMED
  Severity: major
  Priority: High
 Component: meson
  Assignee: dev@dpdk.org
  Reporter: xuemingx.zh...@intel.com
  Target Milestone: 20.08

•DPDK version: 
commit 2e40fdc2d305e6864c8840a0985018edc94562d5 
Author: Karra Satwik 
Date:   Sat Jun 13 03:40:20 2020 +0530  
net/cxgbe: always enable HASH filter support

•OS:
Linux ub2004-i686 5.4.0-26-generic #30-Ubuntu SMP Mon Apr 20 16:58:30 UTC 2020
x86_64 x86_64 x86_64 GNU/Linux
•Compiler:
gcc (Ubuntu 9.3.0-10ubuntu2) 9.3.0 
clang version 10.0.0-4ubuntu1

Test Setup:
export CFLAGS="-m32"
export PKG_CONFIG_LIBDIR="/usr/lib/pkgconfig"
./devtools/test-meson-builds.sh


Show the output from the previous commands:
ninja -C ./build-clang-static
ninja: Entering directory `./build-clang-static'
[36/1953] Compiling C object
'lib/76b5a35@@rte_eal@sta/librte_eal_common_eal_common_trace.c.o'.
FAILED: lib/76b5a35@@rte_eal@sta/librte_eal_common_eal_common_trace.c.o
clang -Ilib/76b5a35@@rte_eal@sta -Ilib -I../lib -I. -I../ -Iconfig -I../config
-Ilib/librte_eal/include -I../lib/librte_eal/include
-Ilib/librte_eal/linux/include -I../lib/librte_eal/linux/include
-Ilib/librte_eal/x86/include -I../lib/librte_eal/x86/include
-Ilib/librte_eal/common -I../lib/librte_eal/common -Ilib/librte_eal
-I../lib/librte_eal -Ilib/librte_kvargs -I../lib/librte_kvargs
-Ilib/librte_telemetry/../librte_metrics
-I../lib/librte_telemetry/../librte_metrics -Ilib/librte_telemetry
-I../lib/librte_telemetry -Xclang -fcolor-diagnostics -pipe
-D_FILE_OFFSET_BITS=64 -Wall -Winvalid-pch -Werror -O2 -g -include rte_config.h
-Wextra -Wcast-qual -Wdeprecated -Wformat-nonliteral -Wformat-security
-Wmissing-declarations -Wmissing-prototypes -Wnested-externs
-Wold-style-definition -Wpointer-arith -Wsign-compare -Wstrict-prototypes
-Wundef -Wwrite-strings -Wno-address-of-packed-member
-Wno-missing-field-initializers -Wno-pointer-to-int-cast -D_GNU_SOURCE -m32
-fPIC -march=native -DALLOW_EXPERIMENTAL_API -DALLOW_INTERNAL_API
-DRTE_LIBEAL_USE_GETENTROPY -DHAVE_GETOPT_H -DHAVE_GETOPT -DHAVE_GETOPT_LONG
-MD -MQ 'lib/76b5a35@@rte_eal@sta/librte_eal_common_eal_common_trace.c.o' -MF
'lib/76b5a35@@rte_eal@sta/librte_eal_common_eal_common_trace.c.o.d' -o
'lib/76b5a35@@rte_eal@sta/librte_eal_common_eal_common_trace.c.o' -c
../lib/librte_eal/common/eal_common_trace.c
../lib/librte_eal/common/eal_common_trace.c:160:8: error: misaligned atomic
operation may incur significant performance penalty
[-Werror,-Watomic-alignment]
val = __atomic_load_n(trace, __ATOMIC_ACQUIRE);
  ^
1 error generated.
[41/1953] Compiling C object
'lib/76b5a35@@rte_eal@sta/librte_eal_common_eal_common_options.c.o'.

-- 
You are receiving this mail because:
You are the assignee for the bug.

Re: [dpdk-dev] [PATCH] devtools: give some hints for ABI errors

2020-07-10 Thread Kinsella, Ray



On 08/07/2020 11:22, David Marchand wrote:
> abidiff can provide some more information about the ABI difference it
> detected.
> In all cases, a discussion on the mailing list must happen but we can give
> some hints to know if this is a problem with the script calling abidiff,
> a potential ABI breakage or an unambiguous ABI breakage.
> 
> Signed-off-by: David Marchand 
> ---
>  devtools/check-abi.sh | 16 ++--
>  1 file changed, 14 insertions(+), 2 deletions(-)
> 
> diff --git a/devtools/check-abi.sh b/devtools/check-abi.sh
> index e17fedbd9f..521e2cce7c 100755
> --- a/devtools/check-abi.sh
> +++ b/devtools/check-abi.sh
> @@ -50,10 +50,22 @@ for dump in $(find $refdir -name "*.dump"); do
>   error=1
>   continue
>   fi
> - if ! abidiff $ABIDIFF_OPTIONS $dump $dump2; then
> + abidiff $ABIDIFF_OPTIONS $dump $dump2 || {
> + abiret=$?
>   echo "Error: ABI issue reported for 'abidiff $ABIDIFF_OPTIONS $dump $dump2'"
>   error=1
> - fi
> + echo
> + if [ $(($abiret & 3)) != 0 ]; then
> + echo "ABIDIFF_ERROR|ABIDIFF_USAGE_ERROR, please report this to dev@dpdk.org."
> + fi
> + if [ $(($abiret & 4)) != 0 ]; then
> + echo "ABIDIFF_ABI_CHANGE, this change requires a review (abidiff flagged this as a potential issue)."
> + fi
> + if [ $(($abiret & 8)) != 0 ]; then
> + echo "ABIDIFF_ABI_INCOMPATIBLE_CHANGE, this change breaks the ABI."
> + fi
> + echo
> + }
>  done
>  
>  [ -z "$error" ] || [ -n "$warnonly" ]
> 

Acked-by: Ray Kinsella 
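
For reference, the bit tests in the quoted patch can be restated as a
small C sketch. The masks 3, 4 and 8 are taken from the patch; splitting
the mask 3 into ABIDIFF_ERROR = 1 and ABIDIFF_USAGE_ERROR = 2 follows the
libabigail documentation as I understand it and should be read as an
assumption. The DEMO_ names and demo_abidiff_hints() are hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Exit-status bits; 4 and 8 appear in the patch, 1 and 2 are assumed. */
enum {
	DEMO_ABIDIFF_ERROR                   = 1,
	DEMO_ABIDIFF_USAGE_ERROR             = 2,
	DEMO_ABIDIFF_ABI_CHANGE              = 4,
	DEMO_ABIDIFF_ABI_INCOMPATIBLE_CHANGE = 8,
};

/*
 * Print one hint per category set in an abidiff exit status, mirroring
 * the three tests of the shell patch. Returns the number of hints
 * printed (0 when the status reports no issue).
 */
static int
demo_abidiff_hints(int status, FILE *out)
{
	int hints = 0;

	assert(out != NULL);

	/* Tool or usage failure: the script itself is likely at fault. */
	if (status & (DEMO_ABIDIFF_ERROR | DEMO_ABIDIFF_USAGE_ERROR)) {
		fprintf(out, "tool or usage error, report to the list\n");
		hints++;
	}
	/* Potential ABI issue that needs a review. */
	if (status & DEMO_ABIDIFF_ABI_CHANGE) {
		fprintf(out, "ABI change, review required\n");
		hints++;
	}
	/* Unambiguous ABI breakage. */
	if (status & DEMO_ABIDIFF_ABI_INCOMPATIBLE_CHANGE) {
		fprintf(out, "incompatible ABI change\n");
		hints++;
	}
	return hints;
}
```

The categories are independent bits, so a status of 12 would yield both
the review hint and the breakage hint, which is why the patch uses three
separate tests instead of an if/else chain.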


[dpdk-dev] Weird 2 KB MBUF data room requirement

2020-07-10 Thread Morten Brørup
Dear Ethernet PMD developers,

According to rte_mbuf_core.h, RTE_MBUF_DEFAULT_DATAROOM is 2048 bytes because 
some NICs need at least 2 KB buffer to receive standard Ethernet frames without 
splitting them into multiple segments.

This is a serious waste of memory, considering that standard Ethernet frames 
are max 1518 bytes.

How widespread is this limitation... is it common or a rare exception?

Where is it documented which NICs suffer from this limitation?

Do any Intel NICs suffer from this limitation?


NB: We are targeting an MBUF total size (incl. memzone element overhead) of 
2^N, and this limitation would increase our MBUF total size to 4 KB.


Med venlig hilsen / kind regards
- Morten Brørup
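
The size arithmetic behind the NB above can be sketched as follows. The
three constants are assumptions for illustration only: 128 bytes for
struct rte_mbuf and 128 bytes of headroom are common defaults, and the
64-byte per-element overhead is a hypothetical placeholder; the real
values depend on the build configuration and the mempool in use.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_MBUF_STRUCT_SIZE 128u /* assumed sizeof(struct rte_mbuf) */
#define DEMO_HEADROOM         128u /* assumed packet headroom */
#define DEMO_ELT_OVERHEAD      64u /* hypothetical per-element overhead */

/* Smallest power of two that is >= v. */
static uint32_t
demo_next_pow2(uint32_t v)
{
	uint32_t p = 1;

	assert(v <= 0x80000000u); /* larger values would overflow p */
	while (p < v)
		p <<= 1;
	return p;
}

/* Total per-mbuf footprint, rounded up to the 2^N target size. */
static uint32_t
demo_mbuf_footprint(uint32_t data_room)
{
	return demo_next_pow2(DEMO_MBUF_STRUCT_SIZE + DEMO_HEADROOM +
			      data_room + DEMO_ELT_OVERHEAD);
}
```

Under these assumptions a 1518-byte data room gives 1838 bytes, which
rounds up to 2 KB, while a 2048-byte data room gives 2368 bytes, which
rounds up to 4 KB.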



[dpdk-dev] [PATCH v5 1/2] rte_flow: add eCPRI key fields to flow API

2020-07-10 Thread Bing Zhao
Add a new item "rte_flow_item_ecpri" in order to match the eCPRI header.

eCPRI is a packet-based protocol used in the fronthaul interface of
5G networks. The header format definition can be found in the
specification via the link below:
https://www.gigalight.com/downloads/standards/ecpri-specification.pdf

An eCPRI message can be carried over the Ethernet layer (.1Q also
supported) or over the UDP layer. The message header format is the
same in both variants.

Signed-off-by: Bing Zhao 
Acked-by: Ori Kam 
---
 doc/guides/prog_guide/rte_flow.rst |   8 ++
 doc/guides/rel_notes/release_20_08.rst |   5 +
 lib/librte_ethdev/rte_flow.c   |   1 +
 lib/librte_ethdev/rte_flow.h   |  31 ++
 lib/librte_net/Makefile|   1 +
 lib/librte_net/meson.build |   3 +-
 lib/librte_net/rte_ecpri.h | 182 +
 lib/librte_net/rte_ether.h |   1 +
 8 files changed, 231 insertions(+), 1 deletion(-)
 create mode 100644 lib/librte_net/rte_ecpri.h

diff --git a/doc/guides/prog_guide/rte_flow.rst b/doc/guides/prog_guide/rte_flow.rst
index d5dd18c..669d519 100644
--- a/doc/guides/prog_guide/rte_flow.rst
+++ b/doc/guides/prog_guide/rte_flow.rst
@@ -1362,6 +1362,14 @@ Matches a PFCP Header.
 - ``seid``: session endpoint identifier.
 - Default ``mask`` matches s_field and seid.
 
+Item: ``ECPRI``
+^^^^^^^^^^^^^^^
+
+Matches an eCPRI header.
+
+- ``hdr``: eCPRI header definition (``rte_ecpri.h``).
+- Default ``mask`` matches message type of common header only.
+
 Actions
 ~~~
 
diff --git a/doc/guides/rel_notes/release_20_08.rst b/doc/guides/rel_notes/release_20_08.rst
index 988474c..19feb68 100644
--- a/doc/guides/rel_notes/release_20_08.rst
+++ b/doc/guides/rel_notes/release_20_08.rst
@@ -184,6 +184,11 @@ New Features
   which are used to access packet data in a safe manner. Currently JIT support
   for these instructions is implemented for x86 only.
 
+* **Added eCPRI protocol support in rte_flow.**
+
+  The ``ECPRI`` item has been added to support eCPRI packet offloading for
+  5G networks.
+
 * **Added flow performance test application.**
 
   Added new application to test ``rte_flow`` performance, including:
diff --git a/lib/librte_ethdev/rte_flow.c b/lib/librte_ethdev/rte_flow.c
index 1685be5..f8fdd68 100644
--- a/lib/librte_ethdev/rte_flow.c
+++ b/lib/librte_ethdev/rte_flow.c
@@ -95,6 +95,7 @@ struct rte_flow_desc_data {
MK_FLOW_ITEM(HIGIG2, sizeof(struct rte_flow_item_higig2_hdr)),
MK_FLOW_ITEM(L2TPV3OIP, sizeof(struct rte_flow_item_l2tpv3oip)),
MK_FLOW_ITEM(PFCP, sizeof(struct rte_flow_item_pfcp)),
+   MK_FLOW_ITEM(ECPRI, sizeof(struct rte_flow_item_ecpri)),
 };
 
 /** Generate flow_action[] entry. */
diff --git a/lib/librte_ethdev/rte_flow.h b/lib/librte_ethdev/rte_flow.h
index b0e4199..8a90226 100644
--- a/lib/librte_ethdev/rte_flow.h
+++ b/lib/librte_ethdev/rte_flow.h
@@ -28,6 +28,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 
@@ -527,6 +528,15 @@ enum rte_flow_item_type {
 */
RTE_FLOW_ITEM_TYPE_PFCP,
 
+   /**
+* Matches eCPRI Header.
+*
+* Configure flow for eCPRI over ETH or UDP packets.
+*
+* See struct rte_flow_item_ecpri.
+*/
+   RTE_FLOW_ITEM_TYPE_ECPRI,
+
 };
 
 /**
@@ -1547,6 +1557,27 @@ struct rte_flow_item_pfcp {
 #endif
 
 /**
+ * @warning
+ * @b EXPERIMENTAL: this structure may change without prior notice
+ *
+ * RTE_FLOW_ITEM_TYPE_ECPRI
+ *
+ * Match eCPRI Header
+ */
+struct rte_flow_item_ecpri {
+   struct rte_ecpri_msg_hdr hdr;
+};
+
+/** Default mask for RTE_FLOW_ITEM_TYPE_ECPRI. */
+#ifndef __cplusplus
+static const struct rte_flow_item_ecpri rte_flow_item_ecpri_mask = {
+   .hdr = {
+   .dw0 = 0x0,
+   },
+};
+#endif
+
+/**
  * Matching pattern item definition.
  *
  * A pattern is formed by stacking items starting from the lowest protocol
diff --git a/lib/librte_net/Makefile b/lib/librte_net/Makefile
index aa1d6fe..9830e77 100644
--- a/lib/librte_net/Makefile
+++ b/lib/librte_net/Makefile
@@ -20,5 +20,6 @@ SYMLINK-$(CONFIG_RTE_LIBRTE_NET)-include += rte_sctp.h rte_icmp.h rte_arp.h
 SYMLINK-$(CONFIG_RTE_LIBRTE_NET)-include += rte_ether.h rte_gre.h rte_net.h
 SYMLINK-$(CONFIG_RTE_LIBRTE_NET)-include += rte_net_crc.h rte_mpls.h rte_higig.h
 SYMLINK-$(CONFIG_RTE_LIBRTE_NET)-include += rte_gtp.h rte_vxlan.h
+SYMLINK-$(CONFIG_RTE_LIBRTE_NET)-include += rte_ecpri.h
 
 include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/lib/librte_net/meson.build b/lib/librte_net/meson.build
index f799349..24ed825 100644
--- a/lib/librte_net/meson.build
+++ b/lib/librte_net/meson.build
@@ -15,7 +15,8 @@ headers = files('rte_ip.h',
'rte_net.h',
'rte_net_crc.h',
'rte_mpls.h',
-   'rte_higig.h')
+   'rte_higig.h',
+   'rte_ecpri.h')
 
 sources = files('rte_arp.c', 'rte_ether.c', 'rte_net.c', 'rte_net_crc.c')
 deps += ['mbuf']
diff --git a/lib/librte_net/rte_ecpri.h b/lib/li

[dpdk-dev] [PATCH v5 2/2] app/testpmd: add eCPRI in flow creation patterns

2020-07-10 Thread Bing Zhao
In order to verify offloading of eCPRI protocol via flow rules, the
command line of flow creation should support the parsing of the eCPRI
pattern.

Based on the specification, each eCPRI message has a common header
and a payload. The payload format varies with the type field of the
common header. Fixed strings are used instead of integers to make
the CLI easier to auto-complete.

Examples of testpmd flow commands that match the eCPRI item:
  1. flow create 0 ... pattern eth / ecpri / end actions ...
     This matches all eCPRI messages.
  2. flow create 0 ... pattern eth / ecpri common type rtc_ctrl / end actions ...
     This matches all eCPRI messages of type #2, "Real-Time Control Data".
  3. flow create 0 ... pattern eth / ecpri common type iq_data pc_id is [U16Int] / end actions ...
     This matches eCPRI messages of type #0, "IQ Data", whose physical
     channel ID 'pc_id' has a specific value. Since the sequence ID
     changes from message to message, there is no need to match that
     field in the flow.
Currently, only types #0, #2 and #5 are supported.

Since eCPRI can be carried directly over Ethernet (optionally after
an 802.1Q tag) or over UDP, it is the PMD's responsibility to check
whether eCPRI is supported and over which protocol stack. The eCPRI
header uses network byte order, the same as other headers.
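
The byte-order requirement can be illustrated with a minimal sketch,
assuming the 4-byte common header layout from the eCPRI specification
(revision, reserved bits, C bit, message type, payload size). The
function and enum names below are hypothetical and are not taken from
rte_ecpri.h or testpmd.

```c
#include <assert.h>
#include <stdint.h>
#include <arpa/inet.h> /* htonl(), POSIX */

/* Message types named in the commit message above (hypothetical names). */
enum ecpri_msg_type {
	ECPRI_TYPE_IQ_DATA = 0,
	ECPRI_TYPE_RTC_CTRL = 2,
	ECPRI_TYPE_DLY_MSR = 5,
};

/*
 * Pack the common header in host order, then convert to network order.
 * Assumed layout: revision[31:28] reserved[27:25] C[24] type[23:16]
 * size[15:0].
 */
static inline uint32_t
ecpri_common_dw0(uint8_t revision, uint8_t c_bit, uint8_t type, uint16_t size)
{
	uint32_t host;

	assert(revision <= 0xf && c_bit <= 1);
	host = ((uint32_t)revision << 28) | ((uint32_t)c_bit << 24) |
	       ((uint32_t)type << 16) | size;
	return htonl(host);
}

/* Mask selecting only the message type field, also in network order. */
static inline uint32_t
ecpri_type_mask_dw0(void)
{
	return htonl(UINT32_C(0x00ff0000));
}
```

A spec/mask pair built this way would correspond to example 2 above
(match on the type field only), under the layout assumption stated in
the comment.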

Signed-off-by: Bing Zhao 
Acked-by: Ori Kam 
---
 app/test-pmd/cmdline_flow.c | 143 
 1 file changed, 143 insertions(+)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 4e2006c..801581e 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -230,6 +230,15 @@ enum index {
ITEM_PFCP,
ITEM_PFCP_S_FIELD,
ITEM_PFCP_SEID,
+   ITEM_ECPRI,
+   ITEM_ECPRI_COMMON,
+   ITEM_ECPRI_COMMON_TYPE,
+   ITEM_ECPRI_COMMON_TYPE_IQ_DATA,
+   ITEM_ECPRI_COMMON_TYPE_RTC_CTRL,
+   ITEM_ECPRI_COMMON_TYPE_DLY_MSR,
+   ITEM_ECPRI_MSG_IQ_DATA_PCID,
+   ITEM_ECPRI_MSG_RTC_CTRL_RTCID,
+   ITEM_ECPRI_MSG_DLY_MSR_MSRID,
 
/* Validate/create actions. */
ACTIONS,
@@ -791,6 +800,7 @@ struct parse_action_priv {
ITEM_ESP,
ITEM_AH,
ITEM_PFCP,
+   ITEM_ECPRI,
END_SET,
ZERO,
 };
@@ -1101,6 +,24 @@ struct parse_action_priv {
ZERO,
 };
 
+static const enum index item_ecpri[] = {
+   ITEM_ECPRI_COMMON,
+   ITEM_NEXT,
+   ZERO,
+};
+
+static const enum index item_ecpri_common[] = {
+   ITEM_ECPRI_COMMON_TYPE,
+   ZERO,
+};
+
+static const enum index item_ecpri_common_type[] = {
+   ITEM_ECPRI_COMMON_TYPE_IQ_DATA,
+   ITEM_ECPRI_COMMON_TYPE_RTC_CTRL,
+   ITEM_ECPRI_COMMON_TYPE_DLY_MSR,
+   ZERO,
+};
+
 static const enum index next_action[] = {
ACTION_END,
ACTION_VOID,
@@ -1409,6 +1437,9 @@ static int parse_vc_spec(struct context *, const struct token *,
 const char *, unsigned int, void *, unsigned int);
 static int parse_vc_conf(struct context *, const struct token *,
 const char *, unsigned int, void *, unsigned int);
+static int parse_vc_item_ecpri_type(struct context *, const struct token *,
+   const char *, unsigned int,
+   void *, unsigned int);
 static int parse_vc_action_rss(struct context *, const struct token *,
   const char *, unsigned int, void *,
   unsigned int);
@@ -2802,6 +2833,66 @@ static int comp_set_raw_index(struct context *, const struct token *,
.next = NEXT(item_pfcp, NEXT_ENTRY(UNSIGNED), item_param),
.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_pfcp, seid)),
},
+   [ITEM_ECPRI] = {
+   .name = "ecpri",
+   .help = "match eCPRI header",
+   .priv = PRIV_ITEM(ECPRI, sizeof(struct rte_flow_item_ecpri)),
+   .next = NEXT(item_ecpri),
+   .call = parse_vc,
+   },
+   [ITEM_ECPRI_COMMON] = {
+   .name = "common",
+   .help = "eCPRI common header",
+   .next = NEXT(item_ecpri_common),
+   },
+   [ITEM_ECPRI_COMMON_TYPE] = {
+   .name = "type",
+   .help = "type of common header",
+   .next = NEXT(item_ecpri_common_type),
+   .args = ARGS(ARG_ENTRY_HTON(struct rte_flow_item_ecpri)),
+   },
+   [ITEM_ECPRI_COMMON_TYPE_IQ_DATA] = {
+   .name = "iq_data",
+   .help = "Type #0: IQ Data",
+   .next = NEXT(NEXT_ENTRY(ITEM_ECPRI_MSG_IQ_DATA_PCID,
+   ITEM_NEXT)),
+   .call = parse_vc_item_ecpri_type,
+   },
+   [ITEM_ECPRI_MSG_IQ_DATA_PCID] = {
+   .name = "pc_id",
+   .help = "Physical Ch

[dpdk-dev] [PATCH v5 0/2] rte_flow: introduce eCPRI item for rte_flow

2020-07-10 Thread Bing Zhao
This patch set contains two commits.
1. header definition of the ethdev API
2. testpmd support for the eCPRI flow item

---
v2: Add dw0 for the eCPRI common header to switch the endianness, and
use a fixed big-endian u32 value for rte_flow_item_ecpri_mask,
because in C the initializer of a global variable must be a
constant expression.
v3: Add commit for testpmd support.
v4: update release notes part.
v5: fix type#6 define, add event indication macros, and comments for
revisions.
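
The v2 note about constant expressions can be illustrated with a small
sketch. It assumes a compile-time byte swap built from shifts and
masks; the macro and struct names are hypothetical and are not the
ones used in the patch. A function call such as htonl() is not a
constant expression in standard C (even if some compilers fold it), so
it cannot portably initialize a static object.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Compile-time byte swap: only casts, shifts and masks, so the result is
 * an integer constant expression and is legal in a static initializer.
 */
#define BSWAP32_CONST(x) \
	((((uint32_t)(x) & 0x000000ffU) << 24) | \
	 (((uint32_t)(x) & 0x0000ff00U) << 8) | \
	 (((uint32_t)(x) & 0x00ff0000U) >> 8) | \
	 (((uint32_t)(x) & 0xff000000U) >> 24))

/* Assumption: a compiler without __BYTE_ORDER__ targets little-endian. */
#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
#define CPU_TO_BE32_CONST(x) ((uint32_t)(x))
#else
#define CPU_TO_BE32_CONST(x) BSWAP32_CONST(x)
#endif

/* Proves at compile time that the macro is a constant expression. */
static_assert(BSWAP32_CONST(0x11223344) == 0x44332211U,
	      "byte swap must be a constant expression");

/* Hypothetical stand-in for a flow item that holds one big-endian word. */
struct hypothetical_ecpri_item {
	uint32_t dw0;
};

/* Static initializer holding a big-endian mask for bits 23:16. */
static const struct hypothetical_ecpri_item type_only_mask = {
	.dw0 = CPU_TO_BE32_CONST(0x00ff0000),
};
```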
---

Bing Zhao (2):
  rte_flow: add eCPRI key fields to flow API
  app/testpmd: add eCPRI in flow creation patterns

 app/test-pmd/cmdline_flow.c| 143 ++
 doc/guides/prog_guide/rte_flow.rst |   8 ++
 doc/guides/rel_notes/release_20_08.rst |   5 +
 lib/librte_ethdev/rte_flow.c   |   1 +
 lib/librte_ethdev/rte_flow.h   |  31 ++
 lib/librte_net/Makefile|   1 +
 lib/librte_net/meson.build |   3 +-
 lib/librte_net/rte_ecpri.h | 182 +
 lib/librte_net/rte_ether.h |   1 +
 9 files changed, 374 insertions(+), 1 deletion(-)
 create mode 100644 lib/librte_net/rte_ecpri.h

-- 
1.8.3.1



Re: [dpdk-dev] 18.11.9 (LTS) patches review and test

2020-07-10 Thread Kevin Traynor
On 30/06/2020 11:02, Ali Alnubani wrote:
>> -Original Message-
>> From: Kevin Traynor 
>> Sent: Tuesday, June 30, 2020 12:53 PM
>> To: Ali Alnubani ; sta...@dpdk.org
>> Cc: dev@dpdk.org; Abhishek Marathe ;
>> Akhil Goyal ; benjamin.wal...@intel.com; David
>> Christensen ; Hemant Agrawal
>> ; Ian Stokes ; Jerin
>> Jacob ; John McNamara ;
>> Ju-Hyoung Lee ; Luca Boccassi ;
>> Pei Zhang ; pingx...@intel.com;
>> qian.q...@intel.com; Raslan Darawsheh ; Thomas
>> Monjalon ; yuan.p...@intel.com;
>> zhaoyan.c...@intel.com
>> Subject: Re: 18.11.9 (LTS) patches review and test
>>
>> On 30/06/2020 09:54, Ali Alnubani wrote:
>>> Hi,
>>>
>>
>> Hi Ali,
>>
>> Thanks for testing.
>>
 -Original Message-
 From: Kevin Traynor 
 Sent: Friday, June 26, 2020 3:53 PM
 To: sta...@dpdk.org
 Cc: dev@dpdk.org; Abhishek Marathe
>> ;
 Akhil Goyal ; Ali Alnubani
 ; benjamin.wal...@intel.com; David Christensen
 ; Hemant Agrawal
>> ;
 Ian Stokes ; Jerin Jacob ;
 John McNamara ; Ju-Hyoung Lee
 ; Kevin Traynor ; Luca
 Boccassi ; Pei Zhang ;
 pingx...@intel.com; qian.q...@intel.com; Raslan Darawsheh
 ; Thomas Monjalon ;
 yuan.p...@intel.com; zhaoyan.c...@intel.com
 Subject: 18.11.9 (LTS) patches review and test

 Hi all,

 Here is a list of patches targeted for LTS release 18.11.9.

 The planned date for the final release is 3rd July.

 Please help with testing and validation of your use cases and report
 any issues/results with reply-all to this mail. For the final release
 the fixes and reported validations will be added to the release notes.

>>>
>>> We ran the following tests on Mellanox hardware for this version:
>>> - Basic functionality:
>>>   Send and receive multiple types of traffic.
>>> - testpmd xstats counter tests.
>>> - testpmd timestamp tests.
>>> - Changing/checking link status through testpmd.
>>> - RTE flow and flow_director tests:
>>>   Items: eth / vlan / ipv4 / ipv6 / tcp / udp / gre
>>>   Actions: drop / queue / rss / mark / flag
>>> - Some RSS tests.
>>> - VLAN stripping and insertion tests.
>>> - Checksum and TSO tests.
>>> - ptype tests.
>>> - l3fwd-power example application tests.
>>> - Multi-process example applications tests.
>>>
>>> Testing matrix:
>>> - NIC: ConnectX-4 Lx / OS: RHEL7.4 / Driver:
>>> MLNX_OFED_LINUX-5.0-2.1.8.0 / Firmware: 14.27.1016
>>> - NIC: ConnectX-5 / OS: RHEL7.4 / Driver: MLNX_OFED_LINUX-5.0-2.1.8.0
>>> / Firmware: 16.27.2008
>>>
>>> We found 2 issues:
>>> - Failure to restart ports with kernels newer than 5.6 (call to mmap failed on UAR for txq).
>>> - Applications fail to start with CONFIG_RTE_LIBRTE_MLX5_DEBUG enabled (mlx5_ifindex: Assertion `priv->if_index' failed).
>>

To wrap up this thread: the following series were applied for the above
issues, and Ali re-ran validation.

http://inbox.dpdk.org/stable/1594105971-14738-1-git-send-email-viachesl...@mellanox.com/

http://inbox.dpdk.org/stable/1593700670-25730-1-git-send-email-viachesl...@mellanox.com/


>> Are these regressions? i.e. does 18.11.8 fail in the same way in the same
>> testbed
> 
> The first issue reproduces in previous releases as well, but only with 
> kernels newer than 5.6.
> The second one seems to have started reproducing in v18.11.7, but I only 
> noticed it while testing this release.
> 
>>
>>> We are discussing these issues internally, and we see no other regressions
>> blocking the release.
>>>
>>
>> ok, please let me know the outcome and if any patches need to be
>> added/removed to resolve these issues.
> 
> Will do.
> 
>>
>> thanks,
>> Kevin.
>>
>>> Regards,
>>> Ali
>>>
> 
> Thanks,
> Ali
> 



[dpdk-dev] [Bug 506] i40e: Fix for rte_eth_dev_get_module_eeprom()

2020-07-10 Thread bugzilla
https://bugs.dpdk.org/show_bug.cgi?id=506

Bug ID: 506
   Summary: i40e: Fix for rte_eth_dev_get_module_eeprom()
   Product: DPDK
   Version: 20.05
  Hardware: All
OS: All
Status: UNCONFIRMED
  Severity: normal
  Priority: Normal
 Component: ethdev
  Assignee: dev@dpdk.org
  Reporter: frederic.coiff...@6cure.com
  Target Milestone: ---

In DPDK 20.05 (and maybe previous versions),
rte_eth_dev_get_module_eeprom() returns 512 bytes but all bytes are equal to
0x04.

By contrast, using the official Intel i40e 2.11.29 driver with ethtool -m
works fine.

Comparing the two sources, we found a small typo in
i40e_get_module_eeprom():

- i40e-2.11.29:

status = i40e_aq_get_phy_register(hw,
I40E_AQ_PHY_REG_ACCESS_EXTERNAL_MODULE,
addr, true, offset, &value, NULL);

- DPDK 20.05:

status = i40e_aq_get_phy_register(hw,
I40E_AQ_PHY_REG_ACCESS_EXTERNAL_MODULE,
addr, offset, 1, &value, NULL);

By swapping offset and 1 in the DPDK source code, the
rte_eth_dev_get_module_eeprom() function works fine.
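
The effect of the swapped arguments can be reproduced with a small
mock. This is only a sketch: mock_get_phy_register() and the EEPROM
contents are hypothetical stand-ins, keeping just the parameter order
(page_change, reg_addr, value) of the two calls quoted above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MOCK_EEPROM_LEN 8

/* Illustrative data; register 1 holds 0x04 to mirror the report. */
static const uint8_t mock_eeprom[MOCK_EEPROM_LEN] = {
	0x03, 0x04, 0x07, 0x10, 0x00, 0x00, 0x00, 0x40,
};

/* Hypothetical register read: (page_change, reg_addr, value). */
static inline int
mock_get_phy_register(bool page_change, uint32_t reg_addr, uint8_t *value)
{
	(void)page_change;
	if (reg_addr >= MOCK_EEPROM_LEN)
		return -1;
	*value = mock_eeprom[reg_addr];
	return 0;
}

/* Swapped arguments: offset lands in page_change, reg_addr is always 1. */
static inline void
read_eeprom_swapped(uint8_t *out, size_t len)
{
	assert(out != NULL);
	for (uint32_t offset = 0; offset < len; offset++)
		mock_get_phy_register(offset, 1, &out[offset]);
}

/* Arguments in the order the signature expects. */
static inline void
read_eeprom_fixed(uint8_t *out, size_t len)
{
	assert(out != NULL);
	for (uint32_t offset = 0; offset < len; offset++)
		mock_get_phy_register(true, offset, &out[offset]);
}
```

Both integer arguments convert silently to the parameter types, so the
compiler gives no diagnostic; the swapped variant simply reads
register 1 on every iteration, which would explain a buffer filled
with one repeated byte.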

-- 
You are receiving this mail because:
You are the assignee for the bug.

[dpdk-dev] DPDK hugepage memory fragmentation

2020-07-10 Thread Kamaraj P
Hello All,

We are trying to run a DPDK-based application in container mode.
When we start and stop our container application multiple times, DPDK
initialization fails.
This is because the hugepage memory becomes fragmented and DPDK cannot find
a contiguous region of memory to initialize the buffers during
DPDK init.

As part of the cleanup of the container, we call rte_eal_cleanup() to
clean up the memory used by our application. However, after several
iterations we still see memory allocation failures due to fragmentation.

We also tried passing "--huge-unlink" as an argument to rte_eal_init(),
and it did not help.

Could you please suggest whether there is an option or any existing patch
available to clean up the memory and avoid fragmentation issues in the
future?

Please advise.

Thanks.
Kamaraj


[dpdk-dev] [PATCH] eal/linux: truncate thread name

2020-07-10 Thread David Marchand
pthread_setname_np refuses names larger than 16 bytes (\0 included).
Rather than return an error, truncate the name to this limit in the
rte_thread_setname helper.

Caught with ixgbe, which creates a control thread named
"ixgbe-link-handler":

Configuring Port 0 (socket 0)
EAL: Cannot set name for ctrl thread
...
EAL: Cannot set name for ctrl thread

Port 0: link state change event
...
EAL: Cannot set name for ctrl thread

Port 0: link state change event

Note: before this change, the thread would keep its original name, which
meant in my test for the ixgbe handler either "dpdk-testpmd" or
"eal-intr-thread".
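
The truncation described above can be sketched in portable C. This is
an illustration only: it uses snprintf() where the patch uses
strlcpy(), and truncate_thread_name() is a hypothetical helper, not
part of EAL.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Linux limit for thread names, including the terminating NUL. */
#define THREAD_NAME_MAX 16

/* Copy at most 15 characters of name into dst and NUL-terminate it. */
static inline void
truncate_thread_name(char dst[THREAD_NAME_MAX], const char *name)
{
	assert(name != NULL);
	/* snprintf() always terminates and never writes past the size. */
	snprintf(dst, THREAD_NAME_MAX, "%s", name);
}
```

The resulting buffer fits the 16-byte limit and could then be passed to
pthread_setname_np(); for "ixgbe-link-handler" the kept prefix is
"ixgbe-link-hand".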

Signed-off-by: David Marchand 
---
 lib/librte_eal/linux/eal_thread.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/lib/librte_eal/linux/eal_thread.c b/lib/librte_eal/linux/eal_thread.c
index 48a2c1124b..068de25595 100644
--- a/lib/librte_eal/linux/eal_thread.c
+++ b/lib/librte_eal/linux/eal_thread.c
@@ -153,7 +153,10 @@ int rte_thread_setname(pthread_t id, const char *name)
int ret = ENOSYS;
 #if defined(__GLIBC__) && defined(__GLIBC_PREREQ)
 #if __GLIBC_PREREQ(2, 12)
-   ret = pthread_setname_np(id, name);
+   char truncated[16];
+
+   strlcpy(truncated, name, sizeof(truncated));
+   ret = pthread_setname_np(id, truncated);
 #endif
 #endif
RTE_SET_USED(id);
-- 
2.23.0



[dpdk-dev] [PATCH v1 03/16] net/mlx5: fix UAR lock sharing for multiport devices

2020-07-10 Thread Viacheslav Ovsiienko
The master and representors might be created over multiport
Infiniband devices, and the UAR resources allocated for sibling
ports might belong to the same underlying Infiniband device.
The hardware requires that a write to the UAR be performed as
an atomic 64-bit write; on 32-bit systems this is done as two
sequential writes protected by a lock. Because the same UAR may
be shared between sibling devices, the locks must be moved to
the shared context.

Fixes: f048f3d479a6 ("net/mlx5: switch to the shared IB device context")
Cc: sta...@dpdk.org

Signed-off-by: Viacheslav Ovsiienko 
---
 drivers/net/mlx5/linux/mlx5_os.c |  6 --
 drivers/net/mlx5/mlx5.c  |  6 ++
 drivers/net/mlx5/mlx5.h  | 10 +-
 drivers/net/mlx5/mlx5_rxq.c  |  2 +-
 drivers/net/mlx5/mlx5_txq.c  |  2 +-
 5 files changed, 13 insertions(+), 13 deletions(-)

diff --git a/drivers/net/mlx5/linux/mlx5_os.c b/drivers/net/mlx5/linux/mlx5_os.c
index daccd1c..7abb85d 100644
--- a/drivers/net/mlx5/linux/mlx5_os.c
+++ b/drivers/net/mlx5/linux/mlx5_os.c
@@ -630,12 +630,6 @@
priv->mtu = RTE_ETHER_MTU;
priv->mp_id.port_id = port_id;
strlcpy(priv->mp_id.name, MLX5_MP_NAME, RTE_MP_MAX_NAME_LEN);
-#ifndef RTE_ARCH_64
-   /* Initialize UAR access locks for 32bit implementations. */
-   rte_spinlock_init(&priv->uar_lock_cq);
-   for (i = 0; i < MLX5_UAR_PAGE_NUM_MAX; i++)
-   rte_spinlock_init(&priv->uar_lock[i]);
-#endif
/* Some internal functions rely on Netlink sockets, open them now. */
priv->nl_socket_rdma = mlx5_nl_init(NETLINK_RDMA);
priv->nl_socket_route = mlx5_nl_init(NETLINK_ROUTE);
diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 13242a5..2efbc03 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -717,6 +717,12 @@ struct mlx5_dev_ctx_shared *
err = ENOMEM;
goto error;
}
+#ifndef RTE_ARCH_64
+   /* Initialize UAR access locks for 32bit implementations. */
+   rte_spinlock_init(&sh->uar_lock_cq);
+   for (i = 0; i < MLX5_UAR_PAGE_NUM_MAX; i++)
+   rte_spinlock_init(&sh->uar_lock[i]);
+#endif
/*
 * Once the device is added to the list of memory event
 * callback, its global MR cache table cannot be expanded
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index 84cd3e1..d01d7f3 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -559,6 +559,11 @@ struct mlx5_dev_ctx_shared {
void *fdb_domain; /* FDB Direct Rules name space handle. */
void *rx_domain; /* RX Direct Rules name space handle. */
void *tx_domain; /* TX Direct Rules name space handle. */
+#ifndef RTE_ARCH_64
+   rte_spinlock_t uar_lock_cq; /* CQs share a common distinct UAR */
+   rte_spinlock_t uar_lock[MLX5_UAR_PAGE_NUM_MAX];
+   /* UAR same-page access control required in 32bit implementations. */
+#endif
struct mlx5_hlist *flow_tbls;
/* Direct Rules tables for FDB, NIC TX+RX */
void *esw_drop_action; /* Pointer to DR E-Switch drop action. */
@@ -673,11 +678,6 @@ struct mlx5_priv {
uint8_t mtr_color_reg; /* Meter color match REG_C. */
struct mlx5_mtr_profiles flow_meter_profiles; /* MTR profile list. */
struct mlx5_flow_meters flow_meters; /* MTR list. */
-#ifndef RTE_ARCH_64
-   rte_spinlock_t uar_lock_cq; /* CQs share a common distinct UAR */
-   rte_spinlock_t uar_lock[MLX5_UAR_PAGE_NUM_MAX];
-   /* UAR same-page access control required in 32bit implementations. */
-#endif
uint8_t skip_default_rss_reta; /* Skip configuration of default reta. */
uint8_t fdb_def_rule; /* Whether fdb jump to table 1 is configured. */
struct mlx5_mp_id mp_id; /* ID of a multi-process process */
diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c
index b436f06..2681322 100644
--- a/drivers/net/mlx5/mlx5_rxq.c
+++ b/drivers/net/mlx5/mlx5_rxq.c
@@ -1997,7 +1997,7 @@ struct mlx5_rxq_ctrl *
tmpl->rxq.elts =
(struct rte_mbuf *(*)[1 << tmpl->rxq.elts_n])(tmpl + 1);
 #ifndef RTE_ARCH_64
-   tmpl->rxq.uar_lock_cq = &priv->uar_lock_cq;
+   tmpl->rxq.uar_lock_cq = &priv->sh->uar_lock_cq;
 #endif
tmpl->rxq.idx = idx;
rte_atomic32_inc(&tmpl->refcnt);
diff --git a/drivers/net/mlx5/mlx5_txq.c b/drivers/net/mlx5/mlx5_txq.c
index 35b3ade..e1fa24e 100644
--- a/drivers/net/mlx5/mlx5_txq.c
+++ b/drivers/net/mlx5/mlx5_txq.c
@@ -355,7 +355,7 @@
/* Assign an UAR lock according to UAR page number */
lock_idx = (txq_ctrl->uar_mmap_offset / page_size) &
   MLX5_UAR_PAGE_NUM_MASK;
-   txq_ctrl->txq.uar_lock = &priv->uar_lock[lock_idx];
+   txq_ctrl->txq.uar_lock = &priv->sh->uar_lock[lock_idx];
 #endif
 }
 
-- 
1.8.3.1



[dpdk-dev] [PATCH v1 01/16] common/mlx5: update common part to support packet pacing

2020-07-10 Thread Viacheslav Ovsiienko
This patch prepares the common part of the mlx5 PMDs to
support packet send scheduling on mbuf timestamps:

  - the DevX routine to query the packet pacing HCA capabilities
  - packet pacing Send Queue attributes support
  - the hardware related definitions

Signed-off-by: Viacheslav Ovsiienko 
---

RFC:  http://patches.dpdk.org/patch/71078/
mbuf: http://patches.dpdk.org/patch/73643/

 drivers/common/mlx5/Makefile  | 20 ++
 drivers/common/mlx5/linux/meson.build |  8 
 drivers/common/mlx5/linux/mlx5_glue.c | 31 +++-
 drivers/common/mlx5/linux/mlx5_glue.h |  5 +++
 drivers/common/mlx5/mlx5_devx_cmds.c  | 19 +-
 drivers/common/mlx5/mlx5_devx_cmds.h  | 10 +
 drivers/common/mlx5/mlx5_prm.h| 69 ---
 7 files changed, 154 insertions(+), 8 deletions(-)

diff --git a/drivers/common/mlx5/Makefile b/drivers/common/mlx5/Makefile
index f6c762b..de03a40 100644
--- a/drivers/common/mlx5/Makefile
+++ b/drivers/common/mlx5/Makefile
@@ -172,6 +172,11 @@ mlx5_autoconf.h.new: $(RTE_SDK)/buildtools/auto-config-h.sh
func mlx5dv_devx_qp_query \
$(AUTOCONF_OUTPUT)
$Q sh -- '$<' '$@' \
+   HAVE_MLX5DV_PP_ALLOC \
+   infiniband/mlx5dv.h \
+   func mlx5dv_pp_alloc \
+   $(AUTOCONF_OUTPUT)
+   $Q sh -- '$<' '$@' \
HAVE_MLX5DV_DR_ACTION_DEST_DEVX_TIR \
infiniband/mlx5dv.h \
func mlx5dv_dr_action_create_dest_devx_tir \
@@ -207,6 +212,21 @@ mlx5_autoconf.h.new: $(RTE_SDK)/buildtools/auto-config-h.sh
func mlx5dv_dr_domain_set_reclaim_device_memory \
$(AUTOCONF_OUTPUT)
$Q sh -- '$<' '$@' \
+   HAVE_MLX5_OPCODE_ENHANCED_MPSW \
+   infiniband/mlx5dv.h \
+   enum MLX5_OPCODE_ENHANCED_MPSW \
+   $(AUTOCONF_OUTPUT)
+   $Q sh -- '$<' '$@' \
+   HAVE_MLX5_OPCODE_SEND_EN \
+   infiniband/mlx5dv.h \
+   enum MLX5_OPCODE_SEND_EN \
+   $(AUTOCONF_OUTPUT)
+   $Q sh -- '$<' '$@' \
+   HAVE_MLX5_OPCODE_WAIT \
+   infiniband/mlx5dv.h \
+   enum MLX5_OPCODE_WAIT \
+   $(AUTOCONF_OUTPUT)
+   $Q sh -- '$<' '$@' \
HAVE_ETHTOOL_LINK_MODE_25G \
/usr/include/linux/ethtool.h \
enum ETHTOOL_LINK_MODE_25000baseCR_Full_BIT \
diff --git a/drivers/common/mlx5/linux/meson.build b/drivers/common/mlx5/linux/meson.build
index 2294213..6116b5e 100644
--- a/drivers/common/mlx5/linux/meson.build
+++ b/drivers/common/mlx5/linux/meson.build
@@ -101,6 +101,8 @@ has_sym_args = [
'mlx5dv_devx_obj_query_async' ],
[ 'HAVE_IBV_DEVX_QP', 'infiniband/mlx5dv.h',
'mlx5dv_devx_qp_query' ],
+   [ 'HAVE_MLX5DV_PP_ALLOC', 'infiniband/mlx5dv.h',
+   'mlx5dv_pp_alloc' ],
[ 'HAVE_MLX5DV_DR_ACTION_DEST_DEVX_TIR', 'infiniband/mlx5dv.h',
'mlx5dv_dr_action_create_dest_devx_tir' ],
[ 'HAVE_IBV_DEVX_EVENT', 'infiniband/mlx5dv.h',
@@ -116,6 +118,12 @@ has_sym_args = [
[ 'HAVE_MLX5DV_DR_VLAN', 'infiniband/mlx5dv.h',
'mlx5dv_dr_action_create_push_vlan' ],
[ 'HAVE_IBV_VAR', 'infiniband/mlx5dv.h', 'mlx5dv_alloc_var' ],
+   [ 'HAVE_MLX5_OPCODE_ENHANCED_MPSW', 'infiniband/mlx5dv.h',
+   'MLX5_OPCODE_ENHANCED_MPSW' ],
+   [ 'HAVE_MLX5_OPCODE_SEND_EN', 'infiniband/mlx5dv.h',
+   'MLX5_OPCODE_SEND_EN' ],
+   [ 'HAVE_MLX5_OPCODE_WAIT', 'infiniband/mlx5dv.h',
+   'MLX5_OPCODE_WAIT' ],
[ 'HAVE_SUPPORTED_4baseKR4_Full', 'linux/ethtool.h',
'SUPPORTED_4baseKR4_Full' ],
[ 'HAVE_SUPPORTED_4baseCR4_Full', 'linux/ethtool.h',
diff --git a/drivers/common/mlx5/linux/mlx5_glue.c b/drivers/common/mlx5/linux/mlx5_glue.c
index 395519d..b61a28b 100644
--- a/drivers/common/mlx5/linux/mlx5_glue.c
+++ b/drivers/common/mlx5/linux/mlx5_glue.c
@@ -1195,7 +1195,6 @@
 #endif
 }
 
-
 static void
 mlx5_glue_dr_reclaim_domain_memory(void *domain, uint32_t enable)
 {
@@ -1207,6 +1206,34 @@
 #endif
 }
 
+static struct mlx5dv_pp *
+mlx5_glue_dv_alloc_pp(struct ibv_context *context,
+ size_t pp_context_sz,
+ const void *pp_context,
+ uint32_t flags)
+{
+#ifdef HAVE_MLX5DV_PP_ALLOC
+   return mlx5dv_pp_alloc(context, pp_context_sz, pp_context, flags);
+#else
+   RTE_SET_USED(context);
+   RTE_SET_USED(pp_context_sz);
+   RTE_SET_USED(pp_context);
+   RTE_SET_USED(flags);
+   errno = ENOTSUP;
+   return NULL;
+#endif
+}
+
+static void
+mlx5_glue_dv_free_pp(struct mlx5dv_pp *pp)
+{
+#ifdef HAVE_MLX5DV_PP_ALLOC
+   return mlx5dv_pp_free(pp);
+#else
+   RTE_SET_USED(pp);
+#endif
+}
+
 __rte_cache_aligned
 const struct mlx5_glue *mlx5_glue = &(const struct mlx5_glue) {
.version = MLX5_GLUE_VERSION,
@@ -1319,4 +1346,6 @@
.devx_free

[dpdk-dev] [PATCH v1 04/16] net/mlx5: introduce shared UAR resource

2020-07-10 Thread Viacheslav Ovsiienko
This is a preparation step before moving Tx queue creation
to the DevX approach. Some features require a shared UAR
for Tx queues and scheduling completion queues; this patch
manages the shared UAR.

Signed-off-by: Viacheslav Ovsiienko 
---
 drivers/net/mlx5/mlx5.c | 14 ++
 drivers/net/mlx5/mlx5.h |  1 +
 2 files changed, 15 insertions(+)

diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 2efbc03..612d38c 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -709,6 +709,12 @@ struct mlx5_dev_ctx_shared *
err = ENOMEM;
goto error;
}
+   sh->tx_uar = mlx5_glue->devx_alloc_uar(sh->ctx, 0);
+   if (!sh->tx_uar) {
+   DRV_LOG(ERR, "Failed to allocate DevX UAR.");
+   err = ENOMEM;
+   goto error;
+   }
}
sh->flow_id_pool = mlx5_flow_id_pool_alloc
((1 << HAIRPIN_FLOW_ID_BITS) - 1);
@@ -767,6 +773,10 @@ struct mlx5_dev_ctx_shared *
mlx5_l3t_destroy(sh->cnt_id_tbl);
sh->cnt_id_tbl = NULL;
}
+   if (sh->tx_uar) {
+   mlx5_glue->devx_free_uar(sh->tx_uar);
+   sh->tx_uar = NULL;
+   }
if (sh->tis)
claim_zero(mlx5_devx_cmd_destroy(sh->tis));
if (sh->td)
@@ -832,6 +842,10 @@ struct mlx5_dev_ctx_shared *
mlx5_l3t_destroy(sh->cnt_id_tbl);
sh->cnt_id_tbl = NULL;
}
+   if (sh->tx_uar) {
+   mlx5_glue->devx_free_uar(sh->tx_uar);
+   sh->tx_uar = NULL;
+   }
if (sh->pd)
claim_zero(mlx5_glue->dealloc_pd(sh->pd));
if (sh->tis)
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index d01d7f3..799b8e3 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -586,6 +586,7 @@ struct mlx5_dev_ctx_shared {
struct mlx5_devx_obj *tis; /* TIS object. */
struct mlx5_devx_obj *td; /* Transport domain. */
struct mlx5_flow_id_pool *flow_id_pool; /* Flow ID pool. */
+   struct mlx5dv_devx_uar *tx_uar; /* Tx/packet pacing shared UAR. */
struct mlx5_dev_shared_port port[]; /* per device port data array. */
 };
 
-- 
1.8.3.1



[dpdk-dev] [PATCH v1 02/16] net/mlx5: introduce send scheduling devargs

2020-07-10 Thread Viacheslav Ovsiienko
This patch introduces the new devargs:

tx_pp - enables accurate packet send scheduling on mbuf timestamps
  in the PMD. On device start, if the "rte_dynflag_timestamp"
  dynamic flag is registered and a non-zero value is specified
  for this devarg, the driver initializes all the internal
  infrastructure needed to provide packet scheduling. The parameter
  value specifies the scheduling granularity in nanoseconds.

tx_skew - adjusts the send packet scheduling on timestamps and
  represents the average delay between the start of transmit
  descriptor processing by the hardware and the appearance of the
  actual packet data on the wire. The value should be provided in
  nanoseconds and is valid only if the tx_pp parameter is
  specified. The default value is zero.
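
How the two devargs relate can be sketched as a small validation
routine. This is a hypothetical illustration, not the driver's parser:
the 500 to 1000000 ns range and the negative-value test mode are taken
from the documentation added by this patch, while rejecting tx_skew
without tx_pp is an assumption made here for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

#define TX_PP_MIN_NS 500
#define TX_PP_MAX_NS 1000000

/* Hypothetical parsed form of the tx_pp / tx_skew devargs. */
struct tx_sched_conf {
	bool enabled;            /* scheduling on timestamps is active */
	bool test_mode;          /* negative tx_pp was given */
	uint32_t granularity_ns; /* absolute value of tx_pp */
	int32_t skew_ns;         /* tx_skew, meaningful only if enabled */
};

/* Return 0 on success, -1 if the combination of values is rejected. */
static inline int
tx_sched_parse(int tx_pp, int tx_skew, struct tx_sched_conf *conf)
{
	long gran = labs((long)tx_pp);

	assert(conf != NULL);
	*conf = (struct tx_sched_conf){ 0 };
	if (tx_pp == 0)
		return tx_skew == 0 ? 0 : -1; /* skew needs tx_pp (assumed) */
	if (gran < TX_PP_MIN_NS || gran > TX_PP_MAX_NS)
		return -1;
	conf->enabled = true;
	conf->test_mode = tx_pp < 0;
	conf->granularity_ns = (uint32_t)gran;
	conf->skew_ns = tx_skew;
	return 0;
}
```

With this sketch, tx_pp=500 and tx_skew=20 yield an enabled
configuration with 500 ns granularity, and tx_pp=-1000 selects the test
mode with 1000 ns granularity.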

Signed-off-by: Viacheslav Ovsiienko 
---
 doc/guides/nics/mlx5.rst | 37 ++
 drivers/net/mlx5/linux/mlx5_os.c | 57 
 drivers/net/mlx5/mlx5.c  | 39 ---
 drivers/net/mlx5/mlx5.h  |  2 ++
 4 files changed, 132 insertions(+), 3 deletions(-)

diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst
index b51aa67..6b06d16 100644
--- a/doc/guides/nics/mlx5.rst
+++ b/doc/guides/nics/mlx5.rst
@@ -241,6 +241,24 @@ Limitations
   reduce the requested Tx size or adjust data inline settings with
   ``txq_inline_max`` and ``txq_inline_mpw`` devargs keys.
 
+- To provide the packet send scheduling on mbuf timestamps the ``tx_pp``
+  parameter should be specified, RTE_MBUF_DYNFIELD_TIMESTAMP_NAME and
+  RTE_MBUF_DYNFLAG_TIMESTAMP_NAME should be registered by application.
+  When PMD sees the RTE_MBUF_DYNFLAG_TIMESTAMP_NAME set on the packet
+  being sent it tries to synchronize the time of packet appearing on
+  the wire with the specified packet timestamp. If the specified one
+  is in the past it should be ignored, if one is in the distant future
+  it should be capped with some reasonable value (in range of seconds).
+  These specific cases ("too late" and "distant future") can be optionally
+  reported via device xstats to assist applications to detect the
+  time-related problems.
+
+  No packet reordering according to timestamps is performed,
+  neither within a packet burst nor between packets; it is entirely
+  the application's responsibility to generate packets and their timestamps
+  in the desired order. It is enough to put the timestamp in the first
+  packet of the burst to schedule the entire burst.
+
 - E-Switch decapsulation Flow:
 
   - can be applied to PF port only.
@@ -700,6 +718,25 @@ Driver options
   variable "MLX5_SHUT_UP_BF" value is used. If there is no "MLX5_SHUT_UP_BF",
   the default ``tx_db_nc`` value is zero for ARM64 hosts and one for others.
 
+- ``tx_pp`` parameter [int]
+
+  If a nonzero value is specified, the driver creates all necessary internal
+  objects to provide accurate packet send scheduling on mbuf timestamps.
+  A positive value specifies the scheduling granularity in nanoseconds;
+  packet sends will be accurate up to that granularity. The allowed range is
+  from 500 to 1 million nanoseconds. A negative value specifies the granularity
+  by its absolute value and engages a special test mode to check the schedule rate.
+  By default (if ``tx_pp`` is not specified) the send scheduling on timestamps
+  feature is disabled.
+
+- ``tx_skew`` parameter [int]
+
+  The parameter adjusts the send packet scheduling on timestamps and represents
+  the average delay between beginning of the transmitting descriptor processing
+  by the hardware and appearance of actual packet data on the wire. The value
+  should be provided in nanoseconds and is valid only if ``tx_pp`` parameter is
+  specified. The default value is zero.
+
 - ``tx_vec_en`` parameter [int]
 
   A nonzero value enables Tx vector on ConnectX-5, ConnectX-6, ConnectX-6 Dx
diff --git a/drivers/net/mlx5/linux/mlx5_os.c b/drivers/net/mlx5/linux/mlx5_os.c
index 2dc57b2..daccd1c 100644
--- a/drivers/net/mlx5/linux/mlx5_os.c
+++ b/drivers/net/mlx5/linux/mlx5_os.c
@@ -879,6 +879,63 @@
}
 #endif
}
+   if (config.tx_pp) {
+   DRV_LOG(DEBUG, "Timestamp counter frequency %u kHz",
+   config.hca_attr.dev_freq_khz);
+   DRV_LOG(DEBUG, "Packet pacing is %ssupported",
+   config.hca_attr.qos.packet_pacing ? "" : "not ");
+   DRV_LOG(DEBUG, "Cross channel ops are %ssupported",
+   config.hca_attr.cross_channel ? "" : "not ");
+   DRV_LOG(DEBUG, "WQE index ignore is %ssupported",
+   config.hca_attr.wqe_index_ignore ? "" : "not ");
+   DRV_LOG(DEBUG, "Non-wire SQ feature is %ssupported",
+   config.hca_attr.non_wire_sq ? "" : "not ");
+   DRV_LOG(DEBUG, "Static WQE SQ feature is %ssupported (%d)",
+   config.hca_attr.log_max_static_sq_wq ? 

[dpdk-dev] [PATCH v1 07/16] net/mlx5: create Tx queues with DevX

2020-07-10 Thread Viacheslav Ovsiienko
To provide packet send scheduling on mbuf timestamps, the Tx
queue must be attached to the same UAR as the Clock Queue.
A UAR is a special hardware resource mapped to host memory
that provides doorbell registers; assigning a UAR to a queue
being created is possible via the DevX API only.

Signed-off-by: Viacheslav Ovsiienko 
---
 drivers/net/mlx5/mlx5_rxtx.c| 108 ++-
 drivers/net/mlx5/mlx5_rxtx.h|  14 ++
 drivers/net/mlx5/mlx5_trigger.c |   6 +-
 drivers/net/mlx5/mlx5_txq.c | 299 +++-
 4 files changed, 386 insertions(+), 41 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
index e4106bf..c456d20 100644
--- a/drivers/net/mlx5/mlx5_rxtx.c
+++ b/drivers/net/mlx5/mlx5_rxtx.c
@@ -944,43 +944,79 @@ enum mlx5_txcmp_code {
struct mlx5_txq_data *txq = (*priv->txqs)[sm->queue_id];
struct mlx5_txq_ctrl *txq_ctrl =
container_of(txq, struct mlx5_txq_ctrl, txq);
-   struct ibv_qp_attr mod = {
-   .qp_state = IBV_QPS_RESET,
-   .port_num = (uint8_t)priv->dev_port,
-   };
-   struct ibv_qp *qp = txq_ctrl->obj->qp;
 
-   ret = mlx5_glue->modify_qp(qp, &mod, IBV_QP_STATE);
-   if (ret) {
-   DRV_LOG(ERR, "Cannot change the Tx QP state to RESET "
-   "%s", strerror(errno));
-   rte_errno = errno;
-   return ret;
-   }
-   mod.qp_state = IBV_QPS_INIT;
-   ret = mlx5_glue->modify_qp(qp, &mod,
-  (IBV_QP_STATE | IBV_QP_PORT));
-   if (ret) {
-   DRV_LOG(ERR, "Cannot change Tx QP state to INIT %s",
-   strerror(errno));
-   rte_errno = errno;
-   return ret;
-   }
-   mod.qp_state = IBV_QPS_RTR;
-   ret = mlx5_glue->modify_qp(qp, &mod, IBV_QP_STATE);
-   if (ret) {
-   DRV_LOG(ERR, "Cannot change Tx QP state to RTR %s",
-   strerror(errno));
-   rte_errno = errno;
-   return ret;
-   }
-   mod.qp_state = IBV_QPS_RTS;
-   ret = mlx5_glue->modify_qp(qp, &mod, IBV_QP_STATE);
-   if (ret) {
-   DRV_LOG(ERR, "Cannot change Tx QP state to RTS %s",
-   strerror(errno));
-   rte_errno = errno;
-   return ret;
+   if (txq_ctrl->obj->type == MLX5_TXQ_OBJ_TYPE_DEVX_SQ) {
+   struct mlx5_devx_modify_sq_attr msq_attr = { 0 };
+
+   /* Change queue state to reset. */
+   msq_attr.sq_state = MLX5_SQC_STATE_ERR;
+   msq_attr.state = MLX5_SQC_STATE_RST;
+   ret = mlx5_devx_cmd_modify_sq(txq_ctrl->obj->sq_devx,
+ &msq_attr);
+   if (ret) {
+   DRV_LOG(ERR, "Cannot change the "
+   "Tx QP state to RESET %s",
+   strerror(errno));
+   rte_errno = errno;
+   return ret;
+   }
+   /* Change queue state to ready. */
+   msq_attr.sq_state = MLX5_SQC_STATE_RST;
+   msq_attr.state = MLX5_SQC_STATE_RDY;
+   ret = mlx5_devx_cmd_modify_sq(txq_ctrl->obj->sq_devx,
+ &msq_attr);
+   if (ret) {
+   DRV_LOG(ERR, "Cannot change the "
+   "Tx QP state to READY %s",
+   strerror(errno));
+   rte_errno = errno;
+   return ret;
+   }
+   } else {
+   struct ibv_qp_attr mod = {
+   .qp_state = IBV_QPS_RESET,
+   .port_num = (uint8_t)priv->dev_port,
+   };
+   struct ibv_qp *qp = txq_ctrl->obj->qp;
+
+   MLX5_ASSERT
+   (txq_ctrl->obj->type == MLX5_TXQ_OBJ_TYPE_IBV);
+
+   ret = mlx5_glue->modify_qp(qp, &mod, IBV_QP_STATE);
+   if (ret) {
+   DRV_LOG(ERR, "Cannot change the "
+   "Tx QP state to RESET %s",
+   strerror(errno));
+   rte_errno = errno;
+  

[dpdk-dev] [PATCH v1 06/16] net/mlx5: create rearm queue for packet pacing

2020-07-10 Thread Viacheslav Ovsiienko
A dedicated Rearm Queue is needed to fire work requests to
the Clock Queue in real time. The Clock Queue should never stop,
otherwise clock synchronization might be broken and packet
send scheduling would fail. The Rearm Queue uses cross-channel
SEND_EN/WAIT operations to provide the requests to the
Clock Queue in a robust way.
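
As a rough illustration of how each SEND_EN request in the Rearm Queue
enables the next portion of the Clock Queue, the index arithmetic used
when the WQE pairs are filled can be sketched as below. This is a
standalone sketch, not driver code: the 16-bit index width and the
function name rearm_send_en_index() are assumptions made for the
example; the driver takes the width from MLX5_WQ_INDEX_WIDTH.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values for illustration; the driver takes them from its headers. */
#define WQ_INDEX_WIDTH 16
#define TXPP_REARM ((1UL << WQ_INDEX_WIDTH) / 4)

/*
 * Hypothetical helper: Clock Queue WQE index enabled by the SEND_EN
 * request stored at (even) position i of the Rearm Queue.
 */
static uint32_t
rearm_send_en_index(uint32_t i)
{
	/* Requests are built in SEND_EN/WAIT pairs, so i is always even. */
	assert((i & 1) == 0);
	return (uint32_t)((i * TXPP_REARM / 2 + TXPP_REARM) &
			  ((1UL << WQ_INDEX_WIDTH) - 1));
}
```

Under these assumptions every pair advances the enabled index by a
quarter of the index range, and the value wraps back to zero on the
fourth pair.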

Signed-off-by: Viacheslav Ovsiienko 
---
 drivers/net/mlx5/mlx5.h  |   1 +
 drivers/net/mlx5/mlx5_defs.h |   5 +-
 drivers/net/mlx5/mlx5_txpp.c | 203 ++-
 3 files changed, 205 insertions(+), 4 deletions(-)

diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index be28d80..a1956cc 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -567,6 +567,7 @@ struct mlx5_dev_txpp {
struct rte_intr_handle intr_handle; /* Periodic interrupt. */
struct mlx5dv_devx_event_channel *echan; /* Event Channel. */
struct mlx5_txpp_wq clock_queue; /* Clock Queue. */
+   struct mlx5_txpp_wq rearm_queue; /* Clock Queue. */
 };
 
 /*
diff --git a/drivers/net/mlx5/mlx5_defs.h b/drivers/net/mlx5/mlx5_defs.h
index fff11af..35f02cb 100644
--- a/drivers/net/mlx5/mlx5_defs.h
+++ b/drivers/net/mlx5/mlx5_defs.h
@@ -173,11 +173,14 @@
 
 /* Tx accurate scheduling on timestamps parameters. */
 #define MLX5_TXPP_CLKQ_SIZE 1
+#define MLX5_TXPP_REARM((1UL << MLX5_WQ_INDEX_WIDTH) / 4)
+#define MLX5_TXPP_REARM_SQ_SIZE (((1UL << MLX5_CQ_INDEX_WIDTH) / \
+ MLX5_TXPP_REARM) * 2)
+#define MLX5_TXPP_REARM_CQ_SIZE (MLX5_TXPP_REARM_SQ_SIZE / 2)
 /* The minimal size test packet to put into one WQE, padded by HW. */
 #define MLX5_TXPP_TEST_PKT_SIZE(sizeof(struct rte_ether_hdr) + \
 sizeof(struct rte_ipv4_hdr))
 
-
 /* Size of the simple hash table for metadata register table. */
 #define MLX5_FLOW_MREG_HTABLE_SZ 4096
 #define MLX5_FLOW_MREG_HNAME "MARK_COPY_TABLE"
diff --git a/drivers/net/mlx5/mlx5_txpp.c b/drivers/net/mlx5/mlx5_txpp.c
index 7f8a6c4..34ac493 100644
--- a/drivers/net/mlx5/mlx5_txpp.c
+++ b/drivers/net/mlx5/mlx5_txpp.c
@@ -9,6 +9,7 @@
 
 #include "mlx5.h"
 #include "mlx5_rxtx.h"
+#include "mlx5_common_os.h"
 
 /* Destroy Event Queue Notification Channel. */
 static void
@@ -48,10 +49,8 @@
 }
 
 static void
-mlx5_txpp_destroy_clock_queue(struct mlx5_dev_ctx_shared *sh)
+mlx5_txpp_destroy_send_queue(struct mlx5_txpp_wq *wq)
 {
-   struct mlx5_txpp_wq *wq = &sh->txpp.clock_queue;
-
if (wq->sq)
claim_zero(mlx5_devx_cmd_destroy(wq->sq));
if (wq->sq_umem)
@@ -68,6 +67,199 @@
 }
 
 static void
+mlx5_txpp_destroy_rearm_queue(struct mlx5_dev_ctx_shared *sh)
+{
+   struct mlx5_txpp_wq *wq = &sh->txpp.rearm_queue;
+
+   mlx5_txpp_destroy_send_queue(wq);
+}
+
+static void
+mlx5_txpp_destroy_clock_queue(struct mlx5_dev_ctx_shared *sh)
+{
+   struct mlx5_txpp_wq *wq = &sh->txpp.clock_queue;
+
+   mlx5_txpp_destroy_send_queue(wq);
+}
+
+static void
+mlx5_txpp_fill_cqe_rearm_queue(struct mlx5_dev_ctx_shared *sh)
+{
+   struct mlx5_txpp_wq *wq = &sh->txpp.rearm_queue;
+   struct mlx5_cqe *cqe = (struct mlx5_cqe *)(uintptr_t)wq->cqes;
+   uint32_t i;
+
+   for (i = 0; i < MLX5_TXPP_REARM_CQ_SIZE; i++) {
+   cqe->op_own = (MLX5_CQE_INVALID << 4) | MLX5_CQE_OWNER_MASK;
+   ++cqe;
+   }
+}
+
+static void
+mlx5_txpp_fill_wqe_rearm_queue(struct mlx5_dev_ctx_shared *sh)
+{
+   struct mlx5_txpp_wq *wq = &sh->txpp.rearm_queue;
+   struct mlx5_wqe *wqe = (struct mlx5_wqe *)(uintptr_t)wq->wqes;
+   uint32_t i;
+
+   for (i = 0; i < wq->sq_size; i += 2) {
+   struct mlx5_wqe_cseg *cs;
+   struct mlx5_wqe_qseg *qs;
+   uint32_t index;
+
+   /* Build SEND_EN request with slave WQE index. */
+   cs = &wqe[i + 0].cseg;
+   cs->opcode = RTE_BE32(MLX5_OPCODE_SEND_EN | 0);
+   cs->sq_ds = rte_cpu_to_be_32((wq->sq->id << 8) | 2);
+   cs->flags = RTE_BE32(MLX5_COMP_ALWAYS <<
+MLX5_COMP_MODE_OFFSET);
+   cs->misc = RTE_BE32(0);
+   qs = RTE_PTR_ADD(cs, sizeof(struct mlx5_wqe_cseg));
+   index = (i * MLX5_TXPP_REARM / 2 + MLX5_TXPP_REARM) &
+   ((1 << MLX5_WQ_INDEX_WIDTH) - 1);
+   qs->max_index = rte_cpu_to_be_32(index);
+   qs->qpn_cqn = rte_cpu_to_be_32(sh->txpp.clock_queue.sq->id);
+   /* Build WAIT request with slave CQE index. */
+   cs = &wqe[i + 1].cseg;
+   cs->opcode = RTE_BE32(MLX5_OPCODE_WAIT | 0);
+   cs->sq_ds = rte_cpu_to_be_32((wq->sq->id << 8) | 2);
+   cs->flags = RTE_BE32(MLX5_COMP_ONLY_ERR <<
+MLX5_COMP_MODE_OFFSET);
+   cs->misc = RTE_BE32(0);
+   qs = RTE_PTR_ADD(cs, sizeof(struct mlx5_wqe_cseg));
+   

[dpdk-dev] [PATCH v1 05/16] net/mlx5: create clock queue for packet pacing

2020-07-10 Thread Viacheslav Ovsiienko
This patch creates the Clock Queue, which provides the
reference completions used to schedule packet sends from
other transmitting queues.

Signed-off-by: Viacheslav Ovsiienko 
---
 drivers/net/mlx5/Makefile|   1 +
 drivers/net/mlx5/linux/mlx5_os.c |   3 +
 drivers/net/mlx5/meson.build |   1 +
 drivers/net/mlx5/mlx5.c  |   2 +
 drivers/net/mlx5/mlx5.h  |  47 +
 drivers/net/mlx5/mlx5_defs.h |   7 +
 drivers/net/mlx5/mlx5_trigger.c  |  16 +-
 drivers/net/mlx5/mlx5_txpp.c | 446 +++
 8 files changed, 518 insertions(+), 5 deletions(-)
 create mode 100644 drivers/net/mlx5/mlx5_txpp.c

diff --git a/drivers/net/mlx5/Makefile b/drivers/net/mlx5/Makefile
index a458402..9eaac6b 100644
--- a/drivers/net/mlx5/Makefile
+++ b/drivers/net/mlx5/Makefile
@@ -11,6 +11,7 @@ LIB = librte_pmd_mlx5.a
 SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5.c
 SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_rxq.c
 SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_txq.c
+SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_txpp.c
 SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_rxtx.c
 ifneq ($(filter y,$(CONFIG_RTE_ARCH_X86_64) \
$(CONFIG_RTE_ARCH_PPC_64) \
diff --git a/drivers/net/mlx5/linux/mlx5_os.c b/drivers/net/mlx5/linux/mlx5_os.c
index 7abb85d..ff93095 100644
--- a/drivers/net/mlx5/linux/mlx5_os.c
+++ b/drivers/net/mlx5/linux/mlx5_os.c
@@ -1863,6 +1863,9 @@
 {
int dbmap_env;
int err = 0;
+
+   sh->numa_node = spawn->pci_dev->device.numa_node;
+   pthread_mutex_init(&sh->txpp.mutex, NULL);
/*
 * Configure environment variable "MLX5_BF_SHUT_UP"
 * before the device creation. The rdma_core library
diff --git a/drivers/net/mlx5/meson.build b/drivers/net/mlx5/meson.build
index e95ce02..c06b153 100644
--- a/drivers/net/mlx5/meson.build
+++ b/drivers/net/mlx5/meson.build
@@ -26,6 +26,7 @@ sources = files(
'mlx5_stats.c',
'mlx5_trigger.c',
'mlx5_txq.c',
+   'mlx5_txpp.c',
'mlx5_vlan.c',
'mlx5_utils.c',
 )
diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 612d38c..ee721fd 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -767,6 +767,7 @@ struct mlx5_dev_ctx_shared *
pthread_mutex_unlock(&mlx5_dev_ctx_list_mutex);
return sh;
 error:
+   pthread_mutex_destroy(&sh->txpp.mutex);
pthread_mutex_unlock(&mlx5_dev_ctx_list_mutex);
MLX5_ASSERT(sh);
if (sh->cnt_id_tbl) {
@@ -856,6 +857,7 @@ struct mlx5_dev_ctx_shared *
claim_zero(mlx5_glue->close_device(sh->ctx));
if (sh->flow_id_pool)
mlx5_flow_id_pool_release(sh->flow_id_pool);
+   pthread_mutex_destroy(&sh->txpp.mutex);
rte_free(sh);
 exit:
pthread_mutex_unlock(&mlx5_dev_ctx_list_mutex);
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index 799b8e3..be28d80 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -531,6 +531,44 @@ struct mlx5_flow_id_pool {
uint32_t max_id; /**< Maximum id can be allocated from the pool. */
 };
 
+/* Tx pacing queue structure - for Clock and Rearm queues. */
+struct mlx5_txpp_wq {
+   /* Completion Queue related data.*/
+   struct mlx5_devx_obj *cq;
+   struct mlx5dv_devx_umem *cq_umem;
+   union {
+   volatile void *cq_buf;
+   volatile struct mlx5_cqe *cqes;
+   };
+   volatile uint32_t *cq_dbrec;
+   uint32_t cq_ci:24;
+   uint32_t arm_sn:2;
+   /* Send Queue related data.*/
+   struct mlx5_devx_obj *sq;
+   struct mlx5dv_devx_umem *sq_umem;
+   union {
+   volatile void *sq_buf;
+   volatile struct mlx5_wqe *wqes;
+   };
+   uint16_t sq_size; /* Number of WQEs in the queue. */
+   uint16_t sq_ci; /* Next WQE to execute. */
+   volatile uint32_t *sq_dbrec;
+};
+
+/* Tx packet pacing structure. */
+struct mlx5_dev_txpp {
+   pthread_mutex_t mutex; /* Pacing create/destroy mutex. */
+   uint32_t refcnt; /* Pacing reference counter. */
+   uint32_t freq; /* Timestamp frequency, Hz. */
+   uint32_t tick; /* Completion tick duration in nanoseconds. */
+   uint32_t test; /* Packet pacing test mode. */
+   int32_t skew; /* Scheduling skew. */
+   uint32_t eqn; /* Event Queue number. */
+   struct rte_intr_handle intr_handle; /* Periodic interrupt. */
+   struct mlx5dv_devx_event_channel *echan; /* Event Channel. */
+   struct mlx5_txpp_wq clock_queue; /* Clock Queue. */
+};
+
 /*
  * Shared Infiniband device context for Master/Representors
  * which belong to same IB device with multiple IB ports.
@@ -547,9 +585,12 @@ struct mlx5_dev_ctx_shared {
char ibdev_name[DEV_SYSFS_NAME_MAX]; /* SYSFS dev name. */
char ibdev_path[DEV_SYSFS_PATH_MAX]; /* SYSFS dev path for secondary */
struct mlx5_dev_attr device_attr; /* Device properties. */
+   int numa_node; /* Numa no

[dpdk-dev] [PATCH v1 08/16] net/mlx5: allocate packet pacing context

2020-07-10 Thread Viacheslav Ovsiienko
This patch allocates the Packet Pacing context from the kernel,
configures it according to the requested send scheduling
granularity, and assigns it to the Clock Queue.
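
The rate arithmetic behind this configuration can be sketched in
isolation as follows. It mirrors the computation in
mlx5_txpp_alloc_pp_index() from the diff below, but the helper names
are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

#define NS_PER_S 1000000000ULL

/* Completions (WQEs) per second for a given tick duration in ns. */
static uint64_t
pacing_wqe_rate(uint32_t tick_ns)
{
	assert(tick_ns != 0);
	return NS_PER_S / tick_ns;
}

/* Nonzero if the tick divides one second exactly (no precision warning). */
static int
pacing_rate_is_exact(uint32_t tick_ns)
{
	return pacing_wqe_rate(tick_ns) * tick_ns == NS_PER_S;
}

/* Test mode: convert packets per second into kilobits per second. */
static uint64_t
pacing_data_rate_kbps(uint32_t tick_ns, uint32_t pkt_len)
{
	return pacing_wqe_rate(tick_ns) * pkt_len / (1000UL / CHAR_BIT);
}
```

For example, a 1000 ns tick gives one million completions per second;
with a 64-byte test packet that corresponds to 512000 kilobits per
second.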

Signed-off-by: Viacheslav Ovsiienko 
---
 drivers/net/mlx5/mlx5.h  |  2 ++
 drivers/net/mlx5/mlx5_txpp.c | 71 
 2 files changed, 73 insertions(+)

diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index a1956cc..c1eafed 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -568,6 +568,8 @@ struct mlx5_dev_txpp {
struct mlx5dv_devx_event_channel *echan; /* Event Channel. */
struct mlx5_txpp_wq clock_queue; /* Clock Queue. */
struct mlx5_txpp_wq rearm_queue; /* Clock Queue. */
+   struct mlx5dv_pp *pp; /* Packet pacing context. */
+   uint16_t pp_id; /* Packet pacing context index. */
 };
 
 /*
diff --git a/drivers/net/mlx5/mlx5_txpp.c b/drivers/net/mlx5/mlx5_txpp.c
index 34ac493..ebc24ba 100644
--- a/drivers/net/mlx5/mlx5_txpp.c
+++ b/drivers/net/mlx5/mlx5_txpp.c
@@ -6,6 +6,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "mlx5.h"
 #include "mlx5_rxtx.h"
@@ -49,6 +50,69 @@
 }
 
 static void
+mlx5_txpp_free_pp_index(struct mlx5_dev_ctx_shared *sh)
+{
+   if (sh->txpp.pp) {
+   mlx5_glue->dv_free_pp(sh->txpp.pp);
+   sh->txpp.pp = NULL;
+   sh->txpp.pp_id = 0;
+   }
+}
+
+/* Allocate Packet Pacing index from kernel via mlx5dv call. */
+static int
+mlx5_txpp_alloc_pp_index(struct mlx5_dev_ctx_shared *sh)
+{
+#ifdef HAVE_MLX5DV_PP_ALLOC
+   uint32_t pp[MLX5_ST_SZ_DW(set_pp_rate_limit_context)];
+   uint64_t rate;
+
+   MLX5_ASSERT(!sh->txpp.pp);
+   memset(&pp, 0, sizeof(pp));
+   rate = NS_PER_S / sh->txpp.tick;
+   if (rate * sh->txpp.tick != NS_PER_S)
+   DRV_LOG(WARNING, "Packet pacing frequency is not precise.");
+   if (sh->txpp.test) {
+   uint32_t len;
+
+   len = RTE_MAX(MLX5_TXPP_TEST_PKT_SIZE,
+ (size_t)RTE_ETHER_MIN_LEN);
+   MLX5_SET(set_pp_rate_limit_context, &pp,
+burst_upper_bound, len);
+   MLX5_SET(set_pp_rate_limit_context, &pp,
+typical_packet_size, len);
+   /* Convert packets per second into kilobits. */
+   rate = (rate * len) / (1000ul / CHAR_BIT);
+   DRV_LOG(INFO, "Packet pacing rate set to %" PRIu64, rate);
+   }
+   MLX5_SET(set_pp_rate_limit_context, &pp, rate_limit, rate);
+   MLX5_SET(set_pp_rate_limit_context, &pp, rate_mode,
+sh->txpp.test ? MLX5_DATA_RATE : MLX5_WQE_RATE);
+   sh->txpp.pp = mlx5_glue->dv_alloc_pp
+   (sh->ctx, sizeof(pp), &pp,
+MLX5DV_PP_ALLOC_FLAGS_DEDICATED_INDEX);
+   if (sh->txpp.pp == NULL) {
+   DRV_LOG(ERR, "Failed to allocate packet pacing index.");
+   rte_errno = errno;
+   return -errno;
+   }
+   if (!sh->txpp.pp->index) {
+   DRV_LOG(ERR, "Zero packet pacing index allocated.");
+   mlx5_txpp_free_pp_index(sh);
+   rte_errno = ENOTSUP;
+   return -ENOTSUP;
+   }
+   sh->txpp.pp_id = sh->txpp.pp->index;
+   return 0;
+#else
+   RTE_SET_USED(sh);
+   DRV_LOG(ERR, "Allocating pacing index is not supported.");
+   rte_errno = ENOTSUP;
+   return -ENOTSUP;
+#endif
+}
+
+static void
 mlx5_txpp_destroy_send_queue(struct mlx5_txpp_wq *wq)
 {
if (wq->sq)
@@ -457,6 +521,7 @@
}
sq_attr.state = MLX5_SQC_STATE_RST;
sq_attr.cqn = wq->cq->id;
+   sq_attr.packet_pacing_rate_limit_index = sh->txpp.pp_id;
sq_attr.wq_attr.cd_slave = 1;
sq_attr.wq_attr.uar_page = sh->tx_uar->page_id;
sq_attr.wq_attr.wq_type = MLX5_WQ_TYPE_CYCLIC;
@@ -503,6 +568,7 @@
  * - Clock CQ/SQ
  * - Rearm CQ/SQ
  * - attaches rearm interrupt handler
+ * - starts Clock Queue
  *
  * Returns 0 on success, negative otherwise
  */
@@ -520,6 +586,9 @@
ret = mlx5_txpp_create_eqn(sh);
if (ret)
goto exit;
+   ret = mlx5_txpp_alloc_pp_index(sh);
+   if (ret)
+   goto exit;
ret = mlx5_txpp_create_clock_queue(sh);
if (ret)
goto exit;
@@ -530,6 +599,7 @@
if (ret) {
mlx5_txpp_destroy_rearm_queue(sh);
mlx5_txpp_destroy_clock_queue(sh);
+   mlx5_txpp_free_pp_index(sh);
mlx5_txpp_destroy_eqn(sh);
sh->txpp.tick = 0;
sh->txpp.test = 0;
@@ -550,6 +620,7 @@
 {
mlx5_txpp_destroy_rearm_queue(sh);
mlx5_txpp_destroy_clock_queue(sh);
+   mlx5_txpp_free_pp_index(sh);
mlx5_txpp_destroy_eqn(sh);
sh->txpp.tick = 0;
sh->txpp.test = 0;
-- 
1.8.3.1



[dpdk-dev] [PATCH v1 10/16] net/mlx5: prepare Tx queue structures to support timestamp

2020-07-10 Thread Viacheslav Ovsiienko
Fields supporting send scheduling on the dynamic timestamp
field are introduced and initialized on device start.
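
The mask selection performed on device start can be sketched without
the DPDK lookups as shown below. The helper name ts_flag_mask() is
hypothetical; nbit and off stand for the results of
rte_mbuf_dynflag_lookup() and rte_mbuf_dynfield_lookup(), and refcnt
for the packet pacing reference counter.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: returns the ol_flags mask that marks a packet as
 * carrying a timestamp, or 0 if scheduling must stay disabled.
 */
static uint64_t
ts_flag_mask(int nbit, int off, uint32_t refcnt)
{
	/* A flag bit must fit into the 64-bit ol_flags field. */
	assert(nbit < 64);
	/* Same condition as in the patch: both lookups succeeded, pacing is on. */
	if (nbit > 0 && off >= 0 && refcnt != 0)
		return 1ULL << nbit;
	return 0;
}
```

A zero mask means the datapath check on ol_flags never matches, so no
WAIT requests are generated for that queue.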

Signed-off-by: Viacheslav Ovsiienko 
---
 drivers/net/mlx5/mlx5_rxtx.h|  4 
 drivers/net/mlx5/mlx5_trigger.c |  2 ++
 drivers/net/mlx5/mlx5_txq.c | 32 
 3 files changed, 38 insertions(+)

diff --git a/drivers/net/mlx5/mlx5_rxtx.h b/drivers/net/mlx5/mlx5_rxtx.h
index 8a8d2b5..974a847 100644
--- a/drivers/net/mlx5/mlx5_rxtx.h
+++ b/drivers/net/mlx5/mlx5_rxtx.h
@@ -313,6 +313,9 @@ struct mlx5_txq_data {
volatile uint32_t *cq_db; /* Completion queue doorbell. */
uint16_t port_id; /* Port ID of device. */
uint16_t idx; /* Queue index. */
+   uint64_t ts_mask; /* Timestamp flag dynamic mask. */
+   int32_t ts_offset; /* Timestamp field dynamic offset. */
+   struct mlx5_dev_ctx_shared *sh; /* Shared context. */
struct mlx5_txq_stats stats; /* TX queue counters. */
 #ifndef RTE_ARCH_64
rte_spinlock_t *uar_lock;
@@ -468,6 +471,7 @@ struct mlx5_txq_ctrl *mlx5_txq_hairpin_new
 void txq_alloc_elts(struct mlx5_txq_ctrl *txq_ctrl);
 void txq_free_elts(struct mlx5_txq_ctrl *txq_ctrl);
 uint64_t mlx5_get_tx_port_offloads(struct rte_eth_dev *dev);
+void mlx5_txq_dynf_timestamp_set(struct rte_eth_dev *dev);
 
 /* mlx5_rxtx.c */
 
diff --git a/drivers/net/mlx5/mlx5_trigger.c b/drivers/net/mlx5/mlx5_trigger.c
index 449dd95..b713974 100644
--- a/drivers/net/mlx5/mlx5_trigger.c
+++ b/drivers/net/mlx5/mlx5_trigger.c
@@ -331,6 +331,8 @@
}
/* Set a mask and offset of dynamic metadata flows into Rx queues*/
mlx5_flow_rxq_dynf_metadata_set(dev);
+   /* Set a mask and offset of scheduling on timestamp into Tx queues*/
+   mlx5_txq_dynf_timestamp_set(dev);
/*
 * In non-cached mode, it only needs to start the default mreg copy
 * action and no flow created by application exists anymore.
diff --git a/drivers/net/mlx5/mlx5_txq.c b/drivers/net/mlx5/mlx5_txq.c
index a6f7e1c..d3b2863 100644
--- a/drivers/net/mlx5/mlx5_txq.c
+++ b/drivers/net/mlx5/mlx5_txq.c
@@ -1778,3 +1778,35 @@ struct mlx5_txq_ctrl *
}
return ret;
 }
+
+/**
+ * Set the Tx queue dynamic timestamp (mask and offset)
+ *
+ * @param[in] dev
+ *   Pointer to the Ethernet device structure.
+ */
+void
+mlx5_txq_dynf_timestamp_set(struct rte_eth_dev *dev)
+{
+   struct mlx5_priv *priv = dev->data->dev_private;
+   struct mlx5_dev_ctx_shared *sh = priv->sh;
+   struct mlx5_txq_data *data;
+   int off, nbit;
+   unsigned int i;
+   uint64_t mask = 0;
+
+   nbit = rte_mbuf_dynflag_lookup
+   (RTE_MBUF_DYNFLAG_TX_TIMESTAMP_NAME, NULL);
+   off = rte_mbuf_dynfield_lookup
+   (RTE_MBUF_DYNFIELD_TIMESTAMP_NAME, NULL);
+   if (nbit > 0 && off >= 0 && sh->txpp.refcnt)
+   mask = 1ULL << nbit;
+   for (i = 0; i != priv->txqs_n; ++i) {
+   data = (*priv->txqs)[i];
+   if (!data)
+   continue;
+   data->sh = sh;
+   data->ts_mask = mask;
+   data->ts_offset = off;
+   }
+}
-- 
1.8.3.1



[dpdk-dev] [PATCH v1 09/16] net/mlx5: introduce clock queue service routine

2020-07-10 Thread Viacheslav Ovsiienko
The service routine is invoked periodically on Rearm Queue
completion interrupts, typically once every few milliseconds
(1-16), to track clock jitter and wander in a robust fashion.
It performs the following:

- fetches the completed CQEs for Rearm Queue
- restarts Rearm Queue on errors
- pushes new requests to Rearm Queue to make it
  continuously running and pushing cross-channel requests
  to Clock Queue
- reads and caches the Clock Queue CQE to be used in datapath
- gathers statistics to estimate clock jitter and wander
- gathers Clock Queue errors statistics

Signed-off-by: Viacheslav Ovsiienko 
---
 drivers/net/mlx5/mlx5.h  |  15 ++
 drivers/net/mlx5/mlx5_defs.h |   1 +
 drivers/net/mlx5/mlx5_rxtx.h |  20 +++
 drivers/net/mlx5/mlx5_txpp.c | 318 +++
 4 files changed, 354 insertions(+)

diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index c1eafed..52b38cc 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -555,6 +555,12 @@ struct mlx5_txpp_wq {
volatile uint32_t *sq_dbrec;
 };
 
+/* Tx packet pacing internal timestamp. */
+struct mlx5_txpp_ts {
+   rte_atomic64_t ci_ts;
+   rte_atomic64_t ts;
+};
+
 /* Tx packet pacing structure. */
 struct mlx5_dev_txpp {
pthread_mutex_t mutex; /* Pacing create/destroy mutex. */
@@ -570,6 +576,15 @@ struct mlx5_dev_txpp {
struct mlx5_txpp_wq rearm_queue; /* Clock Queue. */
struct mlx5dv_pp *pp; /* Packet pacing context. */
uint16_t pp_id; /* Packet pacing context index. */
+   uint16_t ts_n; /* Number of captured timestamps. */
+   uint16_t ts_p; /* Pointer to statistics timestamp. */
+   struct mlx5_txpp_ts *tsa; /* Timestamps sliding window stats. */
+   struct mlx5_txpp_ts ts; /* Cached completion id/timestamp. */
+   uint32_t sync_lost:1; /* ci/timestamp synchronization lost. */
+   /* Statistics counters. */
+   rte_atomic32_t err_miss_int; /* Missed service interrupt. */
+   rte_atomic32_t err_rearm_queue; /* Rearm Queue errors. */
+   rte_atomic32_t err_clock_queue; /* Clock Queue errors. */
 };
 
 /*
diff --git a/drivers/net/mlx5/mlx5_defs.h b/drivers/net/mlx5/mlx5_defs.h
index 35f02cb..b640d4a 100644
--- a/drivers/net/mlx5/mlx5_defs.h
+++ b/drivers/net/mlx5/mlx5_defs.h
@@ -172,6 +172,7 @@
 #define MLX5_TXDB_HEURISTIC 2
 
 /* Tx accurate scheduling on timestamps parameters. */
+#define MLX5_TXPP_WAIT_INIT_TS 1000ul /* How long to wait timestamp. */
 #define MLX5_TXPP_CLKQ_SIZE 1
 #define MLX5_TXPP_REARM((1UL << MLX5_WQ_INDEX_WIDTH) / 4)
 #define MLX5_TXPP_REARM_SQ_SIZE (((1UL << MLX5_CQ_INDEX_WIDTH) / \
diff --git a/drivers/net/mlx5/mlx5_rxtx.h b/drivers/net/mlx5/mlx5_rxtx.h
index 1b797da..8a8d2b5 100644
--- a/drivers/net/mlx5/mlx5_rxtx.h
+++ b/drivers/net/mlx5/mlx5_rxtx.h
@@ -30,6 +30,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -695,4 +696,23 @@ int mlx5_dma_unmap(struct rte_pci_device *pdev, void 
*addr, uint64_t iova,
mlx5_tx_dbrec_cond_wmb(txq, wqe, 1);
 }
 
+/**
+ * Convert timestamp from HW format to linear counter
+ * from Packet Pacing Clock Queue CQE timestamp format.
+ *
+ * @param sh
+ *   Pointer to the device shared context. Might be needed
+ *   to convert according current device configuration.
+ * @param ts
+ *   Timestamp from CQE to convert.
+ * @return
+ *   UTC in nanoseconds
+ */
+static __rte_always_inline uint64_t
+mlx5_txpp_convert_rx_ts(struct mlx5_dev_ctx_shared *sh, uint64_t ts)
+{
+   RTE_SET_USED(sh);
+   return (ts & UINT32_MAX) + (ts >> 32) * NS_PER_S;
+}
+
 #endif /* RTE_PMD_MLX5_RXTX_H_ */
diff --git a/drivers/net/mlx5/mlx5_txpp.c b/drivers/net/mlx5/mlx5_txpp.c
index ebc24ba..3736f7a 100644
--- a/drivers/net/mlx5/mlx5_txpp.c
+++ b/drivers/net/mlx5/mlx5_txpp.c
@@ -1,6 +1,9 @@
 /* SPDX-License-Identifier: BSD-3-Clause
  * Copyright 2020 Mellanox Technologies, Ltd
  */
+#include 
+#include 
+
 #include 
 #include 
 #include 
@@ -144,6 +147,33 @@
struct mlx5_txpp_wq *wq = &sh->txpp.clock_queue;
 
mlx5_txpp_destroy_send_queue(wq);
+   if (sh->txpp.tsa) {
+   rte_free(sh->txpp.tsa);
+   sh->txpp.tsa = NULL;
+   }
+}
+
+static void
+mlx5_txpp_doorbell_rearm_queue(struct mlx5_dev_ctx_shared *sh, uint16_t ci)
+{
+   struct mlx5_txpp_wq *wq = &sh->txpp.rearm_queue;
+   union {
+   uint32_t w32[2];
+   uint64_t w64;
+   } cs;
+
+   wq->sq_ci = ci + 1;
+   cs.w32[0] = rte_cpu_to_be_32(rte_be_to_cpu_32
+  (wq->wqes[ci & (wq->sq_size - 1)].ctrl[0]) | (ci - 1) << 8);
+   cs.w32[1] = wq->wqes[ci & (wq->sq_size - 1)].ctrl[1];
+   /* Update SQ doorbell record with new SQ ci. */
+   rte_compiler_barrier();
+   *wq->sq_dbrec = rte_cpu_to_be_32(wq->sq_ci);
+   /* Make sure the doorbell record is updated. */
+   rte_wmb();
+   /* Write to doorbell register to start processing. */
+   __mlx5_uar_write64_re

[dpdk-dev] [PATCH v1 11/16] net/mlx5: convert timestamp to completion index

2020-07-10 Thread Viacheslav Ovsiienko
The application provides timestamps in the Tx mbuf as clock values,
and the hardware performs scheduling on a Clock Queue completion
index match. This patch introduces the inline routine that converts
a timestamp to a completion index.
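
The conversion can be sketched in a simplified, single-threaded form as
follows. This is an illustration only: the structure and function names
are hypothetical, the 24-bit index width is an assumption, and the
completion index is kept as a plain integer here, whereas the patch
packs it into the upper bits of a 64-bit atomic and re-reads the pair
until both halves are consistent.

```c
#include <assert.h>
#include <stdint.h>

#define CQ_INDEX_WIDTH 24 /* assumed width of the completion index */

/* Hypothetical snapshot of the cached Clock Queue state. */
struct clock_snapshot {
	uint64_t ci; /* completion index of the cached CQE */
	uint64_t ts; /* timestamp of that completion, ns */
};

/* Returns the completion index to wait for, or -1 on conversion error. */
static int32_t
ts_to_completion_index(const struct clock_snapshot *snap, uint64_t mts,
		       int32_t skew, uint32_t tick)
{
	uint64_t delta;

	assert(tick != 0);
	/* Skew correction, positive value to send earlier. */
	delta = mts - (uint64_t)(int64_t)skew - snap->ts;
	if (delta >= UINT64_MAX / 2)
		return -1; /* Wrapped below zero: timestamp is in the past. */
	/* Convert the delta to completions, rounding up. */
	delta = (delta + tick - 1) / tick;
	if (delta >= (1u << CQ_INDEX_WIDTH) / 2 - 1)
		return -1; /* Too far in the future for the index range. */
	return (int32_t)((snap->ci + delta) & ((1u << CQ_INDEX_WIDTH) - 1));
}
```

With a cached completion 10 at 1000 ns and a 100 ns tick, a timestamp
of 1500 ns maps to completion 15, while 1501 ns rounds up to
completion 16.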

Signed-off-by: Viacheslav Ovsiienko 
---
 drivers/net/mlx5/mlx5.h  |  2 ++
 drivers/net/mlx5/mlx5_rxtx.h | 55 
 drivers/net/mlx5/mlx5_txpp.c |  5 
 3 files changed, 62 insertions(+)

diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index 52b38cc..a9a60fb 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -585,6 +585,8 @@ struct mlx5_dev_txpp {
rte_atomic32_t err_miss_int; /* Missed service interrupt. */
rte_atomic32_t err_rearm_queue; /* Rearm Queue errors. */
rte_atomic32_t err_clock_queue; /* Clock Queue errors. */
+   rte_atomic32_t err_ts_past; /* Timestamp in the past. */
+   rte_atomic32_t err_ts_future; /* Timestamp in the distant future. */
 };
 
 /*
diff --git a/drivers/net/mlx5/mlx5_rxtx.h b/drivers/net/mlx5/mlx5_rxtx.h
index 974a847..d082cd7 100644
--- a/drivers/net/mlx5/mlx5_rxtx.h
+++ b/drivers/net/mlx5/mlx5_rxtx.h
@@ -719,4 +719,59 @@ int mlx5_dma_unmap(struct rte_pci_device *pdev, void 
*addr, uint64_t iova,
return (ts & UINT32_MAX) + (ts >> 32) * NS_PER_S;
 }
 
+/**
+ * Convert timestamp from mbuf format to linear counter
+ * of Clock Queue completions (24 bits)
+ *
+ * @param sh
+ *   Pointer to the device shared context to fetch Tx
+ *   packet pacing timestamp and parameters.
+ * @param ts
+ *   Timestamp from mbuf to convert.
+ * @return
+ *   positive or zero value - completion ID to wait
+ *   negative value - conversion error
+ */
+static __rte_always_inline int32_t
+mlx5_txpp_convert_tx_ts(struct mlx5_dev_ctx_shared *sh, uint64_t mts)
+{
+   uint64_t ts, ci;
+   uint32_t tick;
+
+   do {
+   /*
+* Read atomically two uint64_t fields and compare lsb bits.
+* If there is no match - the timestamp was updated in
+* the service thread, data should be re-read.
+*/
+   rte_compiler_barrier();
+   ci = rte_atomic64_read(&sh->txpp.ts.ci_ts);
+   ts = rte_atomic64_read(&sh->txpp.ts.ts);
+   rte_compiler_barrier();
+   if (!((ts ^ ci) << (64 - MLX5_CQ_INDEX_WIDTH)))
+   break;
+   } while (true);
+   /* Perform the skew correction, positive value to send earlier. */
+   mts -= sh->txpp.skew;
+   mts -= ts;
+   if (unlikely(mts >= UINT64_MAX / 2)) {
+   /* We have negative integer, mts is in the past. */
+   rte_atomic32_inc(&sh->txpp.err_ts_past);
+   return -1;
+   }
+   tick = sh->txpp.tick;
+   MLX5_ASSERT(tick);
+   /* Convert delta to completions, round up. */
+   mts = (mts + tick - 1) / tick;
+   if (unlikely(mts >= (1 << MLX5_CQ_INDEX_WIDTH) / 2 - 1)) {
+   /* mts is too far in the future. */
+   rte_atomic32_inc(&sh->txpp.err_ts_future);
+   return -1;
+   }
+   mts <<= 64 - MLX5_CQ_INDEX_WIDTH;
+   ci += mts;
+   ci >>= 64 - MLX5_CQ_INDEX_WIDTH;
+   return ci;
+}
+
 #endif /* RTE_PMD_MLX5_RXTX_H_ */
diff --git a/drivers/net/mlx5/mlx5_txpp.c b/drivers/net/mlx5/mlx5_txpp.c
index 3736f7a..93dbeb2 100644
--- a/drivers/net/mlx5/mlx5_txpp.c
+++ b/drivers/net/mlx5/mlx5_txpp.c
@@ -840,6 +840,11 @@
int flags;
int ret;
 
+   rte_atomic32_set(&sh->txpp.err_miss_int, 0);
+   rte_atomic32_set(&sh->txpp.err_rearm_queue, 0);
+   rte_atomic32_set(&sh->txpp.err_clock_queue, 0);
+   rte_atomic32_set(&sh->txpp.err_ts_past, 0);
+   rte_atomic32_set(&sh->txpp.err_ts_future, 0);
/* Attach interrupt handler to process Rearm Queue completions. */
flags = fcntl(sh->txpp.echan->fd, F_GETFL);
ret = fcntl(sh->txpp.echan->fd, F_SETFL, flags | O_NONBLOCK);
-- 
1.8.3.1



[dpdk-dev] [PATCH v1 14/16] net/mlx5: add read device clock support

2020-07-10 Thread Viacheslav Ovsiienko
If the send scheduling feature is engaged, the Clock Queue is
created, and it reliably reports the current device clock counter
value. The device clock counter can then be read directly from the
Clock Queue CQE.
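
The CQE timestamp is converted to a linear nanosecond counter by
mlx5_txpp_convert_rx_ts(), introduced earlier in this series. A
standalone sketch of that arithmetic is given below; the function name
cqe_ts_to_ns() is hypothetical, and the sanity check on the lower half
is an assumption added for the example (the driver routine does not
perform it).

```c
#include <assert.h>
#include <stdint.h>

#define NS_PER_S 1000000000ULL

/*
 * Hypothetical helper: the upper 32 bits are treated as seconds and the
 * lower 32 bits as nanoseconds within the second.
 */
static uint64_t
cqe_ts_to_ns(uint64_t ts)
{
	/* Assumption for this sketch only: the nanoseconds part is in range. */
	assert((ts & UINT32_MAX) < NS_PER_S);
	return (ts & UINT32_MAX) + (ts >> 32) * NS_PER_S;
}
```

For instance, a raw value with 2 in the upper half and 500 in the lower
half converts to 2000000500 ns.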

Signed-off-by: Viacheslav Ovsiienko 
---
 drivers/net/mlx5/linux/mlx5_os.c |  4 ++-
 drivers/net/mlx5/mlx5.h  |  1 +
 drivers/net/mlx5/mlx5_txpp.c | 55 
 3 files changed, 59 insertions(+), 1 deletion(-)

diff --git a/drivers/net/mlx5/linux/mlx5_os.c b/drivers/net/mlx5/linux/mlx5_os.c
index ff93095..c2326a5 100644
--- a/drivers/net/mlx5/linux/mlx5_os.c
+++ b/drivers/net/mlx5/linux/mlx5_os.c
@@ -2342,7 +2342,7 @@
.xstats_get_names = mlx5_xstats_get_names,
.fw_version_get = mlx5_fw_version_get,
.dev_infos_get = mlx5_dev_infos_get,
-   .read_clock = mlx5_read_clock,
+   .read_clock = mlx5_txpp_read_clock,
.dev_supported_ptypes_get = mlx5_dev_supported_ptypes_get,
.vlan_filter_set = mlx5_vlan_filter_set,
.rx_queue_setup = mlx5_rx_queue_setup,
@@ -2391,6 +2391,7 @@
.xstats_get_names = mlx5_xstats_get_names,
.fw_version_get = mlx5_fw_version_get,
.dev_infos_get = mlx5_dev_infos_get,
+   .read_clock = mlx5_txpp_read_clock,
.rx_descriptor_status = mlx5_rx_descriptor_status,
.tx_descriptor_status = mlx5_tx_descriptor_status,
.rxq_info_get = mlx5_rxq_info_get,
@@ -2421,6 +2422,7 @@
.xstats_get_names = mlx5_xstats_get_names,
.fw_version_get = mlx5_fw_version_get,
.dev_infos_get = mlx5_dev_infos_get,
+   .read_clock = mlx5_txpp_read_clock,
.dev_supported_ptypes_get = mlx5_dev_supported_ptypes_get,
.vlan_filter_set = mlx5_vlan_filter_set,
.rx_queue_setup = mlx5_rx_queue_setup,
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index a9a60fb..31cd37f 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -1010,5 +1010,6 @@ void mlx5_os_set_reg_mr_cb(mlx5_reg_mr_t *reg_mr_cb,
 
 int mlx5_txpp_start(struct rte_eth_dev *dev);
 void mlx5_txpp_stop(struct rte_eth_dev *dev);
+int mlx5_txpp_read_clock(struct rte_eth_dev *dev, uint64_t *timestamp);
 
 #endif /* RTE_PMD_MLX5_H_ */
diff --git a/drivers/net/mlx5/mlx5_txpp.c b/drivers/net/mlx5/mlx5_txpp.c
index 93dbeb2..202e6b3 100644
--- a/drivers/net/mlx5/mlx5_txpp.c
+++ b/drivers/net/mlx5/mlx5_txpp.c
@@ -1035,3 +1035,58 @@
MLX5_ASSERT(!ret);
RTE_SET_USED(ret);
 }
+
+/*
+ * Read the current clock counter of an Ethernet device
+ *
+ * This returns the current raw clock value of an Ethernet device. It is
+ * a raw amount of ticks, with no given time reference.
+ * The value returned here is from the same clock as the one
+ * filling timestamp field of Rx/Tx packets when using hardware timestamp
+ * offload. Therefore it can be used to compute a precise conversion of
+ * the device clock to the real time.
+ *
+ * @param dev
+ *   Pointer to Ethernet device structure.
+ * @param clock
+ *   Pointer to the uint64_t that holds the raw clock value.
+ *
+ * @return
+ *   - 0: Success.
+ *   - -ENOTSUP: The function is not supported in this mode. Requires
+ * packet pacing module configured and started (tx_pp devarg)
+ */
+int
+mlx5_txpp_read_clock(struct rte_eth_dev *dev, uint64_t *timestamp)
+{
+   struct mlx5_priv *priv = dev->data->dev_private;
+   struct mlx5_dev_ctx_shared *sh = priv->sh;
+   int ret;
+
+   if (sh->txpp.refcnt) {
+   struct mlx5_txpp_wq *wq = &sh->txpp.clock_queue;
+   struct mlx5_cqe *cqe = (struct mlx5_cqe *)(uintptr_t)wq->cqes;
+   union {
+   rte_int128_t u128;
+   struct mlx5_cqe_ts cts;
+   } to;
+   uint64_t ts;
+
+   mlx5_atomic_read_cqe((rte_int128_t *)&cqe->timestamp, &to.u128);
+   if (to.cts.op_own >> 4) {
+   DRV_LOG(DEBUG, "Clock Queue error sync lost.");
+   rte_atomic32_inc(&sh->txpp.err_clock_queue);
+   sh->txpp.sync_lost = 1;
+   return -EIO;
+   }
+   ts = rte_be_to_cpu_64(to.cts.timestamp);
+   ts = mlx5_txpp_convert_rx_ts(sh, ts);
+   *timestamp = ts;
+   return 0;
+   }
+   /* Not supported in isolated mode - kernel does not see the CQEs. */
+   if (priv->isolated || rte_eal_process_type() != RTE_PROC_PRIMARY)
+   return -ENOTSUP;
+   ret = mlx5_read_clock(dev, timestamp);
+   return ret;
+}
-- 
1.8.3.1



[dpdk-dev] [PATCH v1 13/16] net/mlx5: add scheduling support to send routine template

2020-07-10 Thread Viacheslav Ovsiienko
This patch adds send scheduling on timestamps to the tx_burst
routine template. The feature is controlled by a static configuration
flag; the actual routines supporting the new feature are generated
from this updated template.

Signed-off-by: Viacheslav Ovsiienko 
---
 drivers/net/mlx5/mlx5_rxtx.c | 162 ++-
 1 file changed, 161 insertions(+), 1 deletion(-)

diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
index 1339744..cdf5cc9 100644
--- a/drivers/net/mlx5/mlx5_rxtx.c
+++ b/drivers/net/mlx5/mlx5_rxtx.c
@@ -2404,6 +2404,37 @@ enum mlx5_txcmp_code {
 }
 
 /**
+ * Build the Synchronize Queue Segment with specified completion index.
+ *
+ * @param txq
+ *   Pointer to TX queue structure.
+ * @param loc
+ *   Pointer to burst routine local context.
+ * @param wqe
+ *   Pointer to WQE to fill with built Control Segment.
+ * @param wci
+ *   Completion index in Clock Queue to wait.
+ * @param olx
+ *   Configured Tx offloads mask. It is fully defined at
+ *   compile time and may be used for optimization.
+ */
+static __rte_always_inline void
+mlx5_tx_wseg_init(struct mlx5_txq_data *restrict txq,
+ struct mlx5_txq_local *restrict loc __rte_unused,
+ struct mlx5_wqe *restrict wqe,
+ unsigned int wci,
+ unsigned int olx __rte_unused)
+{
+   struct mlx5_wqe_qseg *qs;
+
+   qs = RTE_PTR_ADD(wqe, MLX5_WSEG_SIZE);
+   qs->max_index = rte_cpu_to_be_32(wci);
+   qs->qpn_cqn = rte_cpu_to_be_32(txq->sh->txpp.clock_queue.cq->id);
+   qs->reserved0 = RTE_BE32(0);
+   qs->reserved1 = RTE_BE32(0);
+}
+
+/**
  * Build the Ethernet Segment without inlined data.
  * Supports Software Parser, Checksums and VLAN
  * insertion Tx offload features.
@@ -3241,6 +3272,59 @@ enum mlx5_txcmp_code {
 }
 
 /**
+ * The routine checks the timestamp flag in the current packet
+ * and pushes a WAIT WQE into the queue if scheduling is required.
+ *
+ * @param txq
+ *   Pointer to TX queue structure.
+ * @param loc
+ *   Pointer to burst routine local context.
+ * @param olx
+ *   Configured Tx offloads mask. It is fully defined at
+ *   compile time and may be used for optimization.
+ *
+ * @return
+ *   MLX5_TXCMP_CODE_EXIT - sending is done or impossible.
+ *   MLX5_TXCMP_CODE_SINGLE - continue processing with the packet.
+ *   MLX5_TXCMP_CODE_MULTI - the WAIT inserted, continue processing.
+ * Local context variables partially updated.
+ */
+static __rte_always_inline enum mlx5_txcmp_code
+mlx5_tx_schedule_send(struct mlx5_txq_data *restrict txq,
+ struct mlx5_txq_local *restrict loc,
+ unsigned int olx)
+{
+   if (MLX5_TXOFF_CONFIG(TXPP) &&
+   loc->mbuf->ol_flags & txq->ts_mask) {
+   struct mlx5_wqe *wqe;
+   uint64_t ts;
+   int32_t wci;
+
+   /*
+* Estimate the required space quickly and roughly.
+* We would like to ensure the packet can be pushed
+* to the queue and we won't get the orphan WAIT WQE.
+*/
+   if (loc->wqe_free <= MLX5_WQE_SIZE_MAX / MLX5_WQE_SIZE ||
+   loc->elts_free < NB_SEGS(loc->mbuf))
+   return MLX5_TXCMP_CODE_EXIT;
+   /* Convert the timestamp into completion to wait. */
+   ts = *RTE_MBUF_DYNFIELD(loc->mbuf, txq->ts_offset, uint64_t *);
+   wci = mlx5_txpp_convert_tx_ts(txq->sh, ts);
+   if (unlikely(wci < 0))
+   return MLX5_TXCMP_CODE_SINGLE;
+   /* Build the WAIT WQE with specified completion. */
+   wqe = txq->wqes + (txq->wqe_ci & txq->wqe_m);
+   mlx5_tx_cseg_init(txq, loc, wqe, 2, MLX5_OPCODE_WAIT, olx);
+   mlx5_tx_wseg_init(txq, loc, wqe, wci, olx);
+   ++txq->wqe_ci;
+   --loc->wqe_free;
+   return MLX5_TXCMP_CODE_MULTI;
+   }
+   return MLX5_TXCMP_CODE_SINGLE;
+}
+
+/**
  * Tx one packet function for multi-segment TSO. Supports all
  * types of Tx offloads, uses MLX5_OPCODE_TSO to build WQEs,
  * sends one packet per WQE.
@@ -3269,6 +3353,16 @@ enum mlx5_txcmp_code {
struct mlx5_wqe *restrict wqe;
unsigned int ds, dlen, inlen, ntcp, vlan = 0;
 
+   if (MLX5_TXOFF_CONFIG(TXPP)) {
+   enum mlx5_txcmp_code wret;
+
+   /* Generate WAIT for scheduling if requested. */
+   wret = mlx5_tx_schedule_send(txq, loc, olx);
+   if (wret == MLX5_TXCMP_CODE_EXIT)
+   return MLX5_TXCMP_CODE_EXIT;
+   if (wret == MLX5_TXCMP_CODE_ERROR)
+   return MLX5_TXCMP_CODE_ERROR;
+   }
/*
 * Calculate data length to be inlined to estimate
 * the required space in WQE ring buffer.
@@ -3360,6 +3454,16 @@ enum mlx5_txcmp_code {
unsigned int ds, nseg;
 
MLX5_ASSERT(NB_SE

[dpdk-dev] [PATCH v1 16/16] common/mlx5: add register access DevX routine

2020-07-10 Thread Viacheslav Ovsiienko
The DevX routine to read/write NIC registers via DevX API is added.
This is the preparation step to check timestamp modes and units
and gather the extended statistics.

Signed-off-by: Viacheslav Ovsiienko 
---
 drivers/common/mlx5/mlx5_devx_cmds.c| 57 +
 drivers/common/mlx5/mlx5_devx_cmds.h|  4 ++
 drivers/common/mlx5/mlx5_prm.h  | 25 +++
 drivers/common/mlx5/rte_common_mlx5_version.map |  1 +
 4 files changed, 87 insertions(+)

diff --git a/drivers/common/mlx5/mlx5_devx_cmds.c b/drivers/common/mlx5/mlx5_devx_cmds.c
index 093636c..5b99e11 100644
--- a/drivers/common/mlx5/mlx5_devx_cmds.c
+++ b/drivers/common/mlx5/mlx5_devx_cmds.c
@@ -12,6 +12,63 @@
 
 
 /**
+ * Perform access to the registers. Reads data from and writes data to
+ * the specified register.
+ *
+ * @param[in] ctx
+ *   Context returned from mlx5 open_device() glue function.
+ * @param[in] reg_id
+ *   Register identifier according to the PRM.
+ * @param[in] arg
+ *   Register access auxiliary parameter according to the PRM.
+ * @param[inout] value
+ *   Pointer to the value to be written to the register or
+ *   to the buffer where the read data is to be stored.
+ * @param[in] write
+ *   Non-zero value means write to the register should be performed,
+ *   otherwise read access will be performed.
+ *
+ * @return
+ *   0 on success, a negative value otherwise.
+ */
+int
+mlx5_devx_cmd_register_access(void *ctx, uint16_t reg_id,
+ uint32_t arg, uint32_t *value,
+ uint32_t write)
+{
+   uint32_t in[MLX5_ST_SZ_DW(access_register_in)]   = {0};
+   uint32_t out[MLX5_ST_SZ_DW(access_register_out)] = {0};
+   int status, rc;
+
+   MLX5_SET(access_register_in, in, opcode, MLX5_CMD_OP_ACCESS_REGISTER);
+   MLX5_SET(access_register_in, in, op_mod, write ?
+   MLX5_ACCESS_REGISTER_IN_OP_MOD_WRITE :
+   MLX5_ACCESS_REGISTER_IN_OP_MOD_READ);
+   MLX5_SET(access_register_in, in, register_id, reg_id);
+   MLX5_SET(access_register_in, in, argument, arg);
+   if (write && value)
+   MLX5_SET(access_register_in, in, register_data, *value);
+   rc = mlx5_glue->devx_general_cmd(ctx, in, sizeof(in), out, sizeof(out));
+   if (rc)
+   goto error;
+   status = MLX5_GET(access_register_out, out, status);
+   if (status) {
+   int syndrome = MLX5_GET(access_register_out, out, syndrome);
+
+   DRV_LOG(DEBUG, "Failed to access NIC register 0x%X, "
+  "status %x, syndrome = %x",
+  reg_id, status, syndrome);
+   return -1;
+   }
+   if (value && !write)
+   *value = MLX5_GET(access_register_out, out, register_data);
+   return 0;
+error:
+   rc = (rc > 0) ? -rc : rc;
+   return rc;
+}
+
+/**
  * Allocate flow counters via devx interface.
  *
  * @param[in] ctx
diff --git a/drivers/common/mlx5/mlx5_devx_cmds.h b/drivers/common/mlx5/mlx5_devx_cmds.h
index c79b349..119479d 100644
--- a/drivers/common/mlx5/mlx5_devx_cmds.h
+++ b/drivers/common/mlx5/mlx5_devx_cmds.h
@@ -383,6 +383,10 @@ int mlx5_devx_cmd_modify_qp_state(struct mlx5_devx_obj *qp,
 int mlx5_devx_cmd_modify_rqt(struct mlx5_devx_obj *rqt,
 struct mlx5_devx_rqt_attr *rqt_attr);
 
+__rte_internal
+int mlx5_devx_cmd_register_access(void *ctx, uint16_t reg_id,
+ uint32_t arg, uint32_t *value,
+ uint32_t write);
 /**
  * Create virtio queue counters object DevX API.
  *
diff --git a/drivers/common/mlx5/mlx5_prm.h b/drivers/common/mlx5/mlx5_prm.h
index 8705b42..6575edc 100644
--- a/drivers/common/mlx5/mlx5_prm.h
+++ b/drivers/common/mlx5/mlx5_prm.h
@@ -776,6 +776,7 @@ enum {
MLX5_CMD_OP_SUSPEND_QP = 0x50F,
MLX5_CMD_OP_RESUME_QP = 0x510,
MLX5_CMD_OP_QUERY_NIC_VPORT_CONTEXT = 0x754,
+   MLX5_CMD_OP_ACCESS_REGISTER = 0x805,
MLX5_CMD_OP_ALLOC_TRANSPORT_DOMAIN = 0x816,
MLX5_CMD_OP_CREATE_TIR = 0x900,
MLX5_CMD_OP_CREATE_SQ = 0X904,
@@ -2545,6 +2546,30 @@ struct mlx5_ifc_set_pp_rate_limit_context_bits {
u8 reserved_at_60[0x120];
 };
 
+struct mlx5_ifc_access_register_out_bits {
+   u8 status[0x8];
+   u8 reserved_at_8[0x18];
+   u8 syndrome[0x20];
+   u8 reserved_at_40[0x40];
+   u8 register_data[0][0x20];
+};
+
+enum {
+   MLX5_ACCESS_REGISTER_IN_OP_MOD_WRITE  = 0x0,
+   MLX5_ACCESS_REGISTER_IN_OP_MOD_READ   = 0x1,
+};
+
+struct mlx5_ifc_access_register_in_bits {
+   u8 opcode[0x10];
+   u8 reserved_at_10[0x10];
+   u8 reserved_at_20[0x10];
+   u8 op_mod[0x10];
+   u8 reserved_at_40[0x10];
+   u8 register_id[0x10];
+   u8 argument[0x20];
+   u8 register_data[0][0x20];
+};
+
 /* CQE format mask. */
 #define MLX5E_CQE_FORMAT_MAS

[dpdk-dev] [PATCH v1 12/16] net/mlx5: prepare Tx datapath to support scheduling

2020-07-10 Thread Viacheslav Ovsiienko
The new static control flag is introduced to control
routine generating from template, enabling the scheduling
on timestamps.

Signed-off-by: Viacheslav Ovsiienko 
---
 drivers/net/mlx5/mlx5_rxtx.c | 72 ++--
 drivers/net/mlx5/mlx5_txq.c  |  2 ++
 2 files changed, 72 insertions(+), 2 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
index c456d20..1339744 100644
--- a/drivers/net/mlx5/mlx5_rxtx.c
+++ b/drivers/net/mlx5/mlx5_rxtx.c
@@ -66,6 +66,7 @@ enum mlx5_txcmp_code {
 #define MLX5_TXOFF_CONFIG_METADATA (1u << 6) /* Flow metadata. */
 #define MLX5_TXOFF_CONFIG_EMPW (1u << 8) /* Enhanced MPW supported.*/
 #define MLX5_TXOFF_CONFIG_MPW (1u << 9) /* Legacy MPW supported.*/
+#define MLX5_TXOFF_CONFIG_TXPP (1u << 10) /* Scheduling on timestamp.*/
 
 /* The most common offloads groups. */
 #define MLX5_TXOFF_CONFIG_NONE 0
@@ -5268,6 +5269,32 @@ enum mlx5_txcmp_code {
MLX5_TXOFF_CONFIG_INLINE | MLX5_TXOFF_CONFIG_VLAN |
MLX5_TXOFF_CONFIG_METADATA)
 
+/* Generate routines with timestamp scheduling. */
+MLX5_TXOFF_DECL(full_ts_nompw,
+   MLX5_TXOFF_CONFIG_FULL | MLX5_TXOFF_CONFIG_TXPP)
+
+MLX5_TXOFF_DECL(full_ts,
+   MLX5_TXOFF_CONFIG_FULL | MLX5_TXOFF_CONFIG_TXPP |
+   MLX5_TXOFF_CONFIG_EMPW)
+
+MLX5_TXOFF_DECL(none_ts,
+   MLX5_TXOFF_CONFIG_NONE | MLX5_TXOFF_CONFIG_TXPP |
+   MLX5_TXOFF_CONFIG_EMPW)
+
+MLX5_TXOFF_DECL(mdi_ts,
+   MLX5_TXOFF_CONFIG_INLINE | MLX5_TXOFF_CONFIG_METADATA |
+   MLX5_TXOFF_CONFIG_TXPP | MLX5_TXOFF_CONFIG_EMPW)
+
+MLX5_TXOFF_DECL(mti_ts,
+   MLX5_TXOFF_CONFIG_MULTI | MLX5_TXOFF_CONFIG_TSO |
+   MLX5_TXOFF_CONFIG_INLINE | MLX5_TXOFF_CONFIG_METADATA |
+   MLX5_TXOFF_CONFIG_TXPP | MLX5_TXOFF_CONFIG_EMPW)
+
+MLX5_TXOFF_DECL(mtiv_ts,
+   MLX5_TXOFF_CONFIG_MULTI | MLX5_TXOFF_CONFIG_TSO |
+   MLX5_TXOFF_CONFIG_INLINE | MLX5_TXOFF_CONFIG_VLAN |
+   MLX5_TXOFF_CONFIG_METADATA | MLX5_TXOFF_CONFIG_TXPP |
+   MLX5_TXOFF_CONFIG_EMPW)
 /*
  * Generate routines with Legacy Multi-Packet Write support.
  * This mode is supported by ConnectX-4 Lx only and imposes
@@ -5372,6 +5399,32 @@ enum mlx5_txcmp_code {
MLX5_TXOFF_CONFIG_INLINE | MLX5_TXOFF_CONFIG_VLAN |
MLX5_TXOFF_CONFIG_METADATA | MLX5_TXOFF_CONFIG_EMPW)
 
+MLX5_TXOFF_INFO(full_ts_nompw,
+   MLX5_TXOFF_CONFIG_FULL | MLX5_TXOFF_CONFIG_TXPP)
+
+MLX5_TXOFF_INFO(full_ts,
+   MLX5_TXOFF_CONFIG_FULL | MLX5_TXOFF_CONFIG_TXPP |
+   MLX5_TXOFF_CONFIG_EMPW)
+
+MLX5_TXOFF_INFO(none_ts,
+   MLX5_TXOFF_CONFIG_NONE | MLX5_TXOFF_CONFIG_TXPP |
+   MLX5_TXOFF_CONFIG_EMPW)
+
+MLX5_TXOFF_INFO(mdi_ts,
+   MLX5_TXOFF_CONFIG_INLINE | MLX5_TXOFF_CONFIG_METADATA |
+   MLX5_TXOFF_CONFIG_TXPP | MLX5_TXOFF_CONFIG_EMPW)
+
+MLX5_TXOFF_INFO(mti_ts,
+   MLX5_TXOFF_CONFIG_MULTI | MLX5_TXOFF_CONFIG_TSO |
+   MLX5_TXOFF_CONFIG_INLINE | MLX5_TXOFF_CONFIG_METADATA |
+   MLX5_TXOFF_CONFIG_TXPP | MLX5_TXOFF_CONFIG_EMPW)
+
+MLX5_TXOFF_INFO(mtiv_ts,
+   MLX5_TXOFF_CONFIG_MULTI | MLX5_TXOFF_CONFIG_TSO |
+   MLX5_TXOFF_CONFIG_INLINE | MLX5_TXOFF_CONFIG_VLAN |
+   MLX5_TXOFF_CONFIG_METADATA | MLX5_TXOFF_CONFIG_TXPP |
+   MLX5_TXOFF_CONFIG_EMPW)
+
 MLX5_TXOFF_INFO(full,
MLX5_TXOFF_CONFIG_MULTI | MLX5_TXOFF_CONFIG_TSO |
MLX5_TXOFF_CONFIG_SWP | MLX5_TXOFF_CONFIG_CSUM |
@@ -5518,6 +5571,14 @@ enum mlx5_txcmp_code {
/* We should support VLAN insertion. */
olx |= MLX5_TXOFF_CONFIG_VLAN;
}
+   if (tx_offloads & DEV_TX_OFFLOAD_SEND_ON_TIMESTAMP &&
+   rte_mbuf_dynflag_lookup
+   (RTE_MBUF_DYNFLAG_TX_TIMESTAMP_NAME, NULL) > 0 &&
+   rte_mbuf_dynfield_lookup
+   (RTE_MBUF_DYNFIELD_TIMESTAMP_NAME, NULL) > 0) {
+   /* Offload configured, dynamic entities registered. */
+   olx |= MLX5_TXOFF_CONFIG_TXPP;
+   }
if (priv->txqs_n && (*priv->txqs)[0]) {
struct mlx5_txq_data *txd = (*priv->txqs)[0];
 
@@ -5587,6 +5648,9 @@ enum mlx5_txcmp_code {
if ((olx ^ tmp) & MLX5_TXOFF_CONFIG_INLINE)
/* Do not enable inlining if not configured. */
continue;
+   if ((olx ^ tmp) & MLX5_TXOFF_CONFIG_TXPP)
+   /* Do not enable scheduling if not configured. */
+   continue;
/*
 * Some routine meets the requirements.
 * Check whether it has minimal amount
@@ -5631,6 +5695,8 @@ enum mlx5_txcmp_code {
DRV_LOG(DEBUG, "\tVLANI (VLAN insertion)");
if (txoff_func[m].olx & MLX5_TXOFF_CONFIG_METADATA)
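
A note on the routine-selection check in the hunk above: the expression `(olx ^ tmp) & FLAG` is non-zero exactly when the requested mask and the candidate routine's mask disagree on FLAG, so such candidates are skipped. Below is a minimal Python sketch of that test, assuming only the flag values visible in this patch; the helper name `flag_mismatch` is hypothetical and is not part of the driver.

```python
# Flag values copied from the #defines shown in the patch.
MLX5_TXOFF_CONFIG_METADATA = 1 << 6
MLX5_TXOFF_CONFIG_EMPW = 1 << 8
MLX5_TXOFF_CONFIG_TXPP = 1 << 10


def flag_mismatch(olx, tmp, flag):
    """Return True when olx and tmp disagree on the given flag bit.

    Hypothetical helper mirroring the C expression (olx ^ tmp) & flag.
    """
    return bool((olx ^ tmp) & flag)


# Requested offloads: enhanced MPW plus scheduling on timestamps.
requested = MLX5_TXOFF_CONFIG_EMPW | MLX5_TXOFF_CONFIG_TXPP
# Two illustrative candidate routine masks.
with_ts = (MLX5_TXOFF_CONFIG_METADATA | MLX5_TXOFF_CONFIG_EMPW |
           MLX5_TXOFF_CONFIG_TXPP)
without_ts = MLX5_TXOFF_CONFIG_EMPW
```

Because XOR is symmetric, the same check rejects a scheduling routine when scheduling was not requested, as well as the reverse.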

[dpdk-dev] [PATCH v1 15/16] net/mlx5: provide the send scheduling error statistics

2020-07-10 Thread Viacheslav Ovsiienko
The mlx5 PMD exposes the following newly introduced
extended statistics counters to report the errors
of packet send scheduling on timestamps:

  - txpp_err_miss_int - rearm queue interrupt
was not handled in time and service routine might miss
the completions

  - txpp_err_rearm_queue - reports errors in rearm queue
  - txpp_err_clock_queue - reports errors in clock queue

  - txpp_err_ts_past - timestamps in the packet being sent
were found in the past, timestamps were ignored

  - txpp_err_ts_future - timestamps in the packet being sent
were found in the too distant future (beyond HW/clock queue
capabilities to schedule, typically it is about 16M of
tx_pp devarg periods)

  - txpp_jitter - estimated jitter in device clocks between
8K completions of Clock Queue.

  - txpp_wander - estimated wander in device clocks between
16M completions of Clock Queue.

  - txpp_sync_lost - error flag, the Clock Queue completions
synchronization is lost, accurate packet scheduling can
not be handled, timestamps are being ignored, the restart
of all ports using scheduling must be performed.

Signed-off-by: Viacheslav Ovsiienko 
---
 drivers/net/mlx5/mlx5.h   |   7 ++
 drivers/net/mlx5/mlx5_stats.c |   7 +-
 drivers/net/mlx5/mlx5_txpp.c  | 219 ++
 3 files changed, 231 insertions(+), 2 deletions(-)

diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index 31cd37f..5c82a25 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -1011,5 +1011,12 @@ void mlx5_os_set_reg_mr_cb(mlx5_reg_mr_t *reg_mr_cb,
 int mlx5_txpp_start(struct rte_eth_dev *dev);
 void mlx5_txpp_stop(struct rte_eth_dev *dev);
 int mlx5_txpp_read_clock(struct rte_eth_dev *dev, uint64_t *timestamp);
+int mlx5_txpp_xstats_get(struct rte_eth_dev *dev,
+struct rte_eth_xstat *stats,
+unsigned int n, unsigned int n_used);
+int mlx5_txpp_xstats_reset(struct rte_eth_dev *dev);
+int mlx5_txpp_xstats_get_names(struct rte_eth_dev *dev,
+  struct rte_eth_xstat_name *xstats_names,
+  unsigned int n, unsigned int n_used);
 
 #endif /* RTE_PMD_MLX5_H_ */
diff --git a/drivers/net/mlx5/mlx5_stats.c b/drivers/net/mlx5/mlx5_stats.c
index a9b33ee..e30542e 100644
--- a/drivers/net/mlx5/mlx5_stats.c
+++ b/drivers/net/mlx5/mlx5_stats.c
@@ -75,6 +75,7 @@
}
}
}
+   mlx5_stats_n = mlx5_txpp_xstats_get(dev, stats, n, mlx5_stats_n);
return mlx5_stats_n;
 }
 
@@ -237,7 +238,7 @@
xstats_ctrl->base[i] = counters[i];
xstats_ctrl->hw_stats[i] = 0;
}
-
+   mlx5_txpp_xstats_reset(dev);
return 0;
 }
 
@@ -255,7 +256,7 @@
  *   Number of xstats names.
  */
 int
-mlx5_xstats_get_names(struct rte_eth_dev *dev __rte_unused,
+mlx5_xstats_get_names(struct rte_eth_dev *dev,
  struct rte_eth_xstat_name *xstats_names, unsigned int n)
 {
unsigned int i;
@@ -271,5 +272,7 @@
xstats_names[i].name[RTE_ETH_XSTATS_NAME_SIZE - 1] = 0;
}
}
+   mlx5_xstats_n = mlx5_txpp_xstats_get_names(dev, xstats_names,
+  n, mlx5_xstats_n);
return mlx5_xstats_n;
 }
diff --git a/drivers/net/mlx5/mlx5_txpp.c b/drivers/net/mlx5/mlx5_txpp.c
index 202e6b3..cbd0683 100644
--- a/drivers/net/mlx5/mlx5_txpp.c
+++ b/drivers/net/mlx5/mlx5_txpp.c
@@ -15,6 +15,17 @@
 #include "mlx5_rxtx.h"
 #include "mlx5_common_os.h"
 
+static const char * const mlx5_txpp_stat_names[] = {
+   "txpp_err_miss_int", /* Missed service interrupt. */
+   "txpp_err_rearm_queue", /* Rearm Queue errors. */
+   "txpp_err_clock_queue", /* Clock Queue errors. */
+   "txpp_err_ts_past", /* Timestamp in the past. */
+   "txpp_err_ts_future", /* Timestamp in the distant future. */
+   "txpp_jitter", /* Timestamp jitter (one Clock Queue completion). */
+   "txpp_wander", /* Timestamp jitter (half of Clock Queue completions). */
+   "txpp_sync_lost", /* Scheduling synchronization lost. */
+};
+
 /* Destroy Event Queue Notification Channel. */
 static void
 mlx5_txpp_destroy_eqn(struct mlx5_dev_ctx_shared *sh)
@@ -1090,3 +1101,211 @@
ret = mlx5_read_clock(dev, timestamp);
return ret;
 }
+
+/**
+ * DPDK callback to clear device extended statistics.
+ *
+ * @param dev
+ *   Pointer to Ethernet device structure.
+ *
+ * @return
+ *   0 on success and stats is reset, negative errno value otherwise and
+ *   rte_errno is set.
+ */
+int mlx5_txpp_xstats_reset(struct rte_eth_dev *dev)
+{
+   struct mlx5_priv *priv = dev->data->dev_private;
+   struct mlx5_dev_ctx_shared *sh = priv->sh;
+
+   rte_atomic32_set(&sh->txpp.err_miss_int, 0);
+   rte_atomic32_set(&sh->txpp.err_rearm_queue, 0);
+   rte_atomic32_set(&sh->txpp.err_clock_queue, 0);
+

Re: [dpdk-dev] [PATCH] bus/pci: fix mmap PCI resource

2020-07-10 Thread David Marchand
On Wed, Jul 8, 2020 at 11:26 AM  wrote:
>
> From: Alvin Zhang 
>
> When mapping a PCI BAR containing an MSI-X table, some devices do not
> need to actually map this BAR or only need to map part of them, which
> may cause the mapping to fail. Now some checks are added and a non-NULL
> initial value is set to the variable to avoid this situation.
>
> Fixes: 2fd3567e5425 ("pci: use OS generic memory mapping functions")
> Cc: tal...@mellanox.com
>
> Signed-off-by: Alvin Zhang 
> ---
>  drivers/bus/pci/linux/pci_vfio.c | 12 +++-
>  1 file changed, 11 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/bus/pci/linux/pci_vfio.c b/drivers/bus/pci/linux/pci_vfio.c
> index fdeb9a8..9143bfc 100644
> --- a/drivers/bus/pci/linux/pci_vfio.c
> +++ b/drivers/bus/pci/linux/pci_vfio.c
> @@ -547,6 +547,14 @@
> bar_index,
> memreg[0].offset, memreg[0].size,
> memreg[1].offset, memreg[1].size);
> +
> +   if (memreg[0].size == 0 && memreg[1].size == 0) {
> +   /* No need to map this BAR */
> +   RTE_LOG(DEBUG, EAL, "Skipping BAR%d\n", bar_index);
> +   bar->size = 0;
> +   bar->addr = 0;
> +   return 0;
> +   }

We already have a check on bar size == 0.
Why would we have this condition?
Broken hw?


> } else {
> memreg[0].offset = bar->offset;
> memreg[0].size = bar->size;
> @@ -556,7 +564,9 @@
> bar_addr = mmap(bar->addr, bar->size, 0, MAP_PRIVATE |
> MAP_ANONYMOUS | additional_flags, -1, 0);
> if (bar_addr != MAP_FAILED) {
> -   void *map_addr = NULL;
> +   /* Set non NULL initial value for in case of no PCI mapping */
> +   void *map_addr = bar_addr;
> +

It took me some time to understand this code...
Anyway, we have a regression in the librte_pci.
This is where the fix should be.

We can cleanup this code later.

> if (memreg[0].size) {
> /* actual map of first part */
> map_addr = pci_map_resource(bar_addr, vfio_dev_fd,
> --
> 1.8.3.1
>


Thanks.

-- 
David Marchand



Re: [dpdk-dev] [PATCH] bus/pci: fix mmap PCI resource

2020-07-10 Thread Thomas Monjalon
10/07/2020 11:54, David Marchand:
> On Wed, Jul 8, 2020 at 11:26 AM  wrote:
> > From: Alvin Zhang 
> >
> > When mapping a PCI BAR containing an MSI-X table, some devices do not
> > need to actually map this BAR or only need to map part of them, which
> > may cause the mapping to fail. Now some checks are added and a non-NULL
> > initial value is set to the variable to avoid this situation.

Note: this regression would not have happened if we had some CI tests
for simple device probing.
Please let's invest more in CI.


> > Fixes: 2fd3567e5425 ("pci: use OS generic memory mapping functions")
> > Cc: tal...@mellanox.com

No he was not Cc in the thread. Same for Anatoly.
Adding more people in Cc...

> > Signed-off-by: Alvin Zhang 
> > ---
> > --- a/drivers/bus/pci/linux/pci_vfio.c
> > +++ b/drivers/bus/pci/linux/pci_vfio.c
> > @@ -547,6 +547,14 @@
> > bar_index,
> > memreg[0].offset, memreg[0].size,
> > memreg[1].offset, memreg[1].size);
> > +
> > +   if (memreg[0].size == 0 && memreg[1].size == 0) {
> > +   /* No need to map this BAR */
> > +   RTE_LOG(DEBUG, EAL, "Skipping BAR%d\n", bar_index);
> > +   bar->size = 0;
> > +   bar->addr = 0;
> > +   return 0;
> > +   }
> 
> We already have a check on bar size == 0.
> Why would we have this condition?
> Broken hw?
> 
> 
> > } else {
> > memreg[0].offset = bar->offset;
> > memreg[0].size = bar->size;
> > @@ -556,7 +564,9 @@
> > bar_addr = mmap(bar->addr, bar->size, 0, MAP_PRIVATE |
> > MAP_ANONYMOUS | additional_flags, -1, 0);
> > if (bar_addr != MAP_FAILED) {
> > -   void *map_addr = NULL;
> > +   /* Set non NULL initial value for in case of no PCI mapping */
> > +   void *map_addr = bar_addr;
> > +
> 
> It took me some time to understand this code...
> Anyway, we have a regression in the librte_pci.
> This is where the fix should be.

Yes, I am going to send a fix.

> We can cleanup this code later.

Yes please, this function isn't understandable and lacks comments.
Anatoly please?




[dpdk-dev] [PATCH 0/9] python2 deprecation notice

2020-07-10 Thread Louise Kilheeney
This patchset adds deprecation notices to python scripts,
warning of the removal of python2 support from the DPDK 20.11 release.

Louise Kilheeney (9):
  usertools/cpu_layout: add python2 deprecation notice
  usertools/dpdk-telemetry-client: python2 deprecation notice
  usertools/dpdk-devbind: add python2 deprecation notice
  devtools/update_version_map: add python2 deprecation notice
  app/test-cmdline: add python2 deprecation notice
  app/test: add python2 deprecation notice
  usertools/dpdk-pmdinfo: add python2 deprecation notice
  app/test-bbdev: python3 compatibility changes
  app/test-bbdev: add python2 deprecation notice

 app/test-bbdev/test-bbdev.py   | 9 +++--
 app/test-cmdline/cmdline_test.py   | 3 +++
 app/test/autotest.py   | 4 
 devtools/update_version_map_abi.py | 4 
 usertools/cpu_layout.py| 4 
 usertools/dpdk-devbind.py  | 4 
 usertools/dpdk-pmdinfo.py  | 4 +++-
 usertools/dpdk-telemetry-client.py | 4 
 8 files changed, 33 insertions(+), 3 deletions(-)

-- 
2.17.1
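
For readers of the series, here is a minimal sketch of the check these patches add, written as a testable function. The function name `warn_if_python2` and its parameters are hypothetical; the patches themselves place the check inline in each script. Note that `print(..., file=sys.stderr)` only parses under Python 2 when `print_function` is imported from `__future__`, which is assumed here.

```python
from __future__ import print_function  # lets Python 2 parse print(..., file=...)

import sys


def warn_if_python2(version_info=None, stream=None):
    """Print the deprecation notice when running under Python 2.

    Hypothetical helper: returns True if the warning was emitted.
    """
    if version_info is None:
        version_info = sys.version_info
    if stream is None:
        stream = sys.stderr
    # Index 0 is the major version, for sys.version_info and plain tuples.
    if version_info[0] < 3:
        print("WARNING: Python 2 is deprecated for use in DPDK, and will not "
              "work in future releases.", file=stream)
        print("Please use Python 3 instead", file=stream)
        return True
    return False
```

Passing the version and the output stream as parameters is only a convenience for exercising both branches; a script would simply call `warn_if_python2()` once at start-up.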



[dpdk-dev] [PATCH 7/9] usertools/dpdk-pmdinfo: add python2 deprecation notice

2020-07-10 Thread Louise Kilheeney
Cc: Neil Horman 

Signed-off-by: Louise Kilheeney 
---
 usertools/dpdk-pmdinfo.py | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/usertools/dpdk-pmdinfo.py b/usertools/dpdk-pmdinfo.py
index 12f20735e..f9ed75517 100755
--- a/usertools/dpdk-pmdinfo.py
+++ b/usertools/dpdk-pmdinfo.py
@@ -28,7 +28,9 @@
 pcidb = None
 
 # ===
-
+if sys.version_info.major < 3:
+    print("WARNING: Python 2 is deprecated for use in DPDK, and will not work in future releases.", file=sys.stderr)
+    print("Please use Python 3 instead", file=sys.stderr)
 
 class Vendor:
 """
-- 
2.17.1



[dpdk-dev] [PATCH 3/9] usertools/dpdk-devbind: add python2 deprecation notice

2020-07-10 Thread Louise Kilheeney
add python2 deprecation notice

Signed-off-by: Louise Kilheeney 
---
 usertools/dpdk-devbind.py | 4 
 1 file changed, 4 insertions(+)

diff --git a/usertools/dpdk-devbind.py b/usertools/dpdk-devbind.py
index dc008823f..86b6b53c4 100755
--- a/usertools/dpdk-devbind.py
+++ b/usertools/dpdk-devbind.py
@@ -10,6 +10,10 @@
 import subprocess
 from os.path import exists, abspath, dirname, basename
 
+if sys.version_info.major < 3:
+    print("WARNING: Python 2 is deprecated for use in DPDK, and will not work in future releases.", file=sys.stderr)
+    print("Please use Python 3 instead", file=sys.stderr)
+
 # The PCI base class for all devices
 network_class = {'Class': '02', 'Vendor': None, 'Device': None,
 'SVendor': None, 'SDevice': None}
-- 
2.17.1



[dpdk-dev] [PATCH 2/9] usertools/dpdk-telemetry-client: python2 deprecation notice

2020-07-10 Thread Louise Kilheeney
add python2 deprecation notice

Cc: Kevin Laatz 

Signed-off-by: Louise Kilheeney 
---
 usertools/dpdk-telemetry-client.py | 4 
 1 file changed, 4 insertions(+)

diff --git a/usertools/dpdk-telemetry-client.py b/usertools/dpdk-telemetry-client.py
index 35edb7cd2..98d28fa89 100755
--- a/usertools/dpdk-telemetry-client.py
+++ b/usertools/dpdk-telemetry-client.py
@@ -23,6 +23,10 @@
 except NameError:
 raw_input = input  # Python 3
 
+if sys.version_info.major < 3:
+    print("WARNING: Python 2 is deprecated for use in DPDK, and will not work in future releases.", file=sys.stderr)
+    print("Please use Python 3 instead", file=sys.stderr)
+
 class Socket:
 
 def __init__(self):
-- 
2.17.1



[dpdk-dev] [PATCH 1/9] usertools/cpu_layout: add python2 deprecation notice

2020-07-10 Thread Louise Kilheeney
add python2 deprecation notice

Signed-off-by: Louise Kilheeney 
---
 usertools/cpu_layout.py | 4 
 1 file changed, 4 insertions(+)

diff --git a/usertools/cpu_layout.py b/usertools/cpu_layout.py
index 6f129b1db..5423c7965 100755
--- a/usertools/cpu_layout.py
+++ b/usertools/cpu_layout.py
@@ -10,6 +10,10 @@
 except NameError:
 xrange = range # Python 3
 
+if sys.version_info.major < 3:
+    print("WARNING: Python 2 is deprecated for use in DPDK, and will not work in future releases.", file=sys.stderr)
+    print("Please use Python 3 instead", file=sys.stderr)
+
 sockets = []
 cores = []
 core_map = {}
-- 
2.17.1



[dpdk-dev] [PATCH 5/9] app/test-cmdline: add python2 deprecation notice

2020-07-10 Thread Louise Kilheeney
Cc: Olivier Matz 

Signed-off-by: Louise Kilheeney 
---
 app/test-cmdline/cmdline_test.py | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/app/test-cmdline/cmdline_test.py b/app/test-cmdline/cmdline_test.py
index 3a8fac426..954428e2b 100755
--- a/app/test-cmdline/cmdline_test.py
+++ b/app/test-cmdline/cmdline_test.py
@@ -19,6 +19,9 @@ def runTest(child, test):
 return 0
 child.expect(test["Result"], 1)
 
+if sys.version_info.major < 3:
+    print("WARNING: Python 2 is deprecated for use in DPDK, and will not work in future releases.", file=sys.stderr)
+    print("Please use Python 3 instead", file=sys.stderr)
 
 #
 # history test is a special case
-- 
2.17.1



[dpdk-dev] [PATCH 6/9] app/test: add python2 deprecation notice

2020-07-10 Thread Louise Kilheeney
add python2 deprecation notice

Signed-off-by: Louise Kilheeney 
---
 app/test/autotest.py | 4 
 1 file changed, 4 insertions(+)

diff --git a/app/test/autotest.py b/app/test/autotest.py
index b42f48879..cf7584ccd 100644
--- a/app/test/autotest.py
+++ b/app/test/autotest.py
@@ -17,6 +17,10 @@ def usage():
 usage()
 sys.exit(1)
 
+if sys.version_info.major < 3:
+    print("WARNING: Python 2 is deprecated for use in DPDK, and will not work in future releases.", file=sys.stderr)
+    print("Please use Python 3 instead", file=sys.stderr)
+
 target = sys.argv[2]
 
 test_whitelist = None
-- 
2.17.1



[dpdk-dev] [PATCH 4/9] devtools/update_version_map: add python2 deprecation notice

2020-07-10 Thread Louise Kilheeney
Cc: Neil Horman 
Cc: Ray Kinsella 

Signed-off-by: Louise Kilheeney 
---
 devtools/update_version_map_abi.py | 4 
 1 file changed, 4 insertions(+)

diff --git a/devtools/update_version_map_abi.py b/devtools/update_version_map_abi.py
index e2104e61e..80a61641e 100755
--- a/devtools/update_version_map_abi.py
+++ b/devtools/update_version_map_abi.py
@@ -160,6 +160,10 @@ def __generate_internal_abi(f_out, lines):
 print("};", file=f_out)
 
 def __main():
+    if sys.version_info.major < 3:
+        print("WARNING: Python 2 is deprecated for use in DPDK, and will not work in future releases.", file=sys.stderr)
+        print("Please use Python 3 instead", file=sys.stderr)
+
 arg_parser = argparse.ArgumentParser(
 description='Merge versions in linker version script.')
 
-- 
2.17.1



[dpdk-dev] [PATCH 9/9] app/test-bbdev: add python2 deprecation notice

2020-07-10 Thread Louise Kilheeney
Cc: Nicolas Chautru 

Signed-off-by: Louise Kilheeney 
---
 app/test-bbdev/test-bbdev.py | 4 
 1 file changed, 4 insertions(+)

diff --git a/app/test-bbdev/test-bbdev.py b/app/test-bbdev/test-bbdev.py
index e127fb2eb..5ae2dc6c4 100755
--- a/app/test-bbdev/test-bbdev.py
+++ b/app/test-bbdev/test-bbdev.py
@@ -16,6 +16,10 @@ def kill(process):
 print("ERROR: Test app timed out")
 process.kill()
 
+if sys.version_info.major < 3:
+    print("WARNING: Python 2 is deprecated for use in DPDK, and will not work in future releases.", file=sys.stderr)
+    print("Please use Python 3 instead", file=sys.stderr)
+
 if "RTE_SDK" in os.environ:
 dpdk_path = os.environ["RTE_SDK"]
 else:
-- 
2.17.1



[dpdk-dev] [PATCH 8/9] app/test-bbdev: python3 compatibility changes

2020-07-10 Thread Louise Kilheeney
Use the print function, as required for python3
compatibility.

Cc: Nicolas Chautru 

Signed-off-by: Louise Kilheeney 
---
 app/test-bbdev/test-bbdev.py | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/app/test-bbdev/test-bbdev.py b/app/test-bbdev/test-bbdev.py
index 0194be046..e127fb2eb 100755
--- a/app/test-bbdev/test-bbdev.py
+++ b/app/test-bbdev/test-bbdev.py
@@ -3,6 +3,7 @@
 # SPDX-License-Identifier: BSD-3-Clause
 # Copyright(c) 2017 Intel Corporation
 
+from __future__ import print_function
 import sys
 import os
 import argparse
@@ -12,7 +13,7 @@
 from threading import Timer
 
 def kill(process):
-print "ERROR: Test app timed out"
+print("ERROR: Test app timed out")
 process.kill()
 
 if "RTE_SDK" in os.environ:
@@ -66,7 +67,7 @@ def kill(process):
 args = parser.parse_args()
 
 if not os.path.exists(args.testapp_path):
-print "No such file: " + args.testapp_path
+print("No such file: " + args.testapp_path)
 sys.exit(1)
 
 params = [args.testapp_path]
-- 
2.17.1



Re: [dpdk-dev] Weird 2 KB MBUF data room requirement

2020-07-10 Thread Bruce Richardson
On Fri, Jul 10, 2020 at 10:21:40AM +0200, Morten Brørup wrote:
> Dear Ethernet PMD developers,
> 
> According to rte_mbuf_core.h, RTE_MBUF_DEFAULT_DATAROOM is 2048 bytes because 
> some NICs need at least 2 KB buffer to receive standard Ethernet frames 
> without splitting them into multiple segments.
> 
> This is a serious waste of memory, considering that standard Ethernet frames 
> are max 1518 bytes.
> 
> How wide spread is this limitation... is it common or a rare exception?
> 
> Where is it documented which NICs suffer from this limitation?
> 
> Do any Intel NICs suffer from this limitation?
> 
> 
> NB: We are targeting an MBUF total size (incl. memzone element overhead) of 
> 2^N, and this limitation would increase our MBUF total size to 4 KB.
> 
> 
> Med venlig hilsen / kind regards
> - Morten Brørup
> 

AFAIK: the NICs supported by the ixgbe driver only allow the size to be
specified in KB granularity.

However, it may be safe to have a driver modification whereby anything over
1600 bytes is considered as 2KB if jumbo frame support is disabled. I don't
think anyone has actually looked into doing so though, or if there are
other hidden gotchas about attempting to do so.

/Bruce
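
The arithmetic behind this exchange can be sketched as follows. The 1 KB granularity comes from the reply above; the 128-byte headroom and 128-byte `struct rte_mbuf` size are assumed defaults used for illustration only, and the memzone element overhead mentioned in the question is ignored. The helper names are hypothetical.

```python
def round_up(value, granularity):
    """Round value up to the next multiple of granularity."""
    return -(-value // granularity) * granularity


def next_pow2(n):
    """Smallest power of two that is >= n (for n >= 1)."""
    return 1 << (n - 1).bit_length()


MAX_FRAME = 1518    # standard Ethernet frame, from the question
HEADROOM = 128      # assumed packet headroom
MBUF_STRUCT = 128   # assumed size of the mbuf header structure

# A NIC that only accepts buffer sizes in 1 KB steps needs 2 KB per frame.
data_room = round_up(MAX_FRAME, 1024)

# Total mbuf size rounded to 2^N, with and without the 1 KB granularity.
total_rounded = next_pow2(MBUF_STRUCT + HEADROOM + data_room)
total_exact = next_pow2(MBUF_STRUCT + HEADROOM + MAX_FRAME)

print(data_room, total_rounded, total_exact)
```

Under these assumptions the 2 KB data room pushes the power-of-two total from 2 KB to 4 KB, which is the doubling described in the question.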


Re: [dpdk-dev] DPDK hugepage memory fragmentation

2020-07-10 Thread Bruce Richardson
On Fri, Jul 10, 2020 at 02:52:16PM +0530, Kamaraj P wrote:
> Hello All,
> 
> We are trying to run a DPDK based application in container mode.
> When we do multiple start/stop of our container application, the DPDK
> initialization seems to be failing.
> This is because the hugepage memory is fragmented and DPDK is not able to
> find a contiguous allocation of memory to initialize the buffer in the
> dpdk init.
> 
> As part of the cleanup of the container, we do call rte_eal_cleanup() to
> cleanup the memory w.r.t our application. However after iterations we still
> see the memory allocation failure due to the fragmentation issue.
> 
> We also tried to set the "--huge-unlink" as an argument before when we
> called the rte_eal_init() and it did not help.
> 
> Could you please suggest if there is an option or any existing patches
> available to clean up the memory to avoid fragmentation issues in the
> future.
> 
> Please advise.
> 
What version of DPDK are you using, and what kernel driver for NIC
interfacing are you using?
DPDK versions since 18.05 should be more forgiving of fragmented memory,
especially if using the vfio-pci kernel driver.

Regards,
/Bruce


Re: [dpdk-dev] [PATCH 0/9] python2 deprecation notice

2020-07-10 Thread Bruce Richardson
On Fri, Jul 10, 2020 at 11:10:46AM +0100, Louise Kilheeney wrote:
> This patchset adds deprecation notices to python scripts,
> warning of the removal of python2 support from the DPDK 20.11 release.
> 
> Louise Kilheeney (9):
>   usertools/cpu_layout: add python2 deprecation notice
>   usertools/dpdk-telemetry-client: python2 deprecation notice
>   usertools/dpdk-devbind: add python2 deprecation notice
>   devtools/update_version_map: add python2 deprecation notice
>   app/test-cmdline: add python2 deprecation notice
>   app/test: add python2 deprecation notice
>   usertools/dpdk-pmdinfo: add python2 deprecation notice
>   app/test-bbdev: python3 compatibility changes
>   app/test-bbdev: add python2 deprecation notice
> 
>  app/test-bbdev/test-bbdev.py   | 9 +++--
>  app/test-cmdline/cmdline_test.py   | 3 +++
>  app/test/autotest.py   | 4 
>  devtools/update_version_map_abi.py | 4 
>  usertools/cpu_layout.py| 4 
>  usertools/dpdk-devbind.py  | 4 
>  usertools/dpdk-pmdinfo.py  | 4 +++-
>  usertools/dpdk-telemetry-client.py | 4 
>  8 files changed, 33 insertions(+), 3 deletions(-)
> 
> -- 
Thanks for setting us up for Python 2 support removal.

Series-acked-by: Bruce Richardson 


Re: [dpdk-dev] [PATCH] devtools: give some hints for ABI errors

2020-07-10 Thread Neil Horman
On Wed, Jul 08, 2020 at 12:22:12PM +0200, David Marchand wrote:
> abidiff can provide some more information about the ABI difference it
> detected.
> In all cases, a discussion on the mailing list must happen but we can give
> some hints to know if this is a problem with the script calling abidiff,
> a potential ABI breakage or an unambiguous ABI breakage.
> 
> Signed-off-by: David Marchand 
> ---
>  devtools/check-abi.sh | 16 ++--
>  1 file changed, 14 insertions(+), 2 deletions(-)
> 
> diff --git a/devtools/check-abi.sh b/devtools/check-abi.sh
> index e17fedbd9f..521e2cce7c 100755
> --- a/devtools/check-abi.sh
> +++ b/devtools/check-abi.sh
> @@ -50,10 +50,22 @@ for dump in $(find $refdir -name "*.dump"); do
>   error=1
>   continue
>   fi
> - if ! abidiff $ABIDIFF_OPTIONS $dump $dump2; then
> + abidiff $ABIDIFF_OPTIONS $dump $dump2 || {
> + abiret=$?
>   echo "Error: ABI issue reported for 'abidiff $ABIDIFF_OPTIONS $dump $dump2'"
>   error=1
> - fi
> + echo
> + if [ $(($abiret & 3)) != 0 ]; then
> + echo "ABIDIFF_ERROR|ABIDIFF_USAGE_ERROR, please report 
> this to dev@dpdk.org."
> + fi
> + if [ $(($abiret & 4)) != 0 ]; then
> + echo "ABIDIFF_ABI_CHANGE, this change requires a review 
> (abidiff flagged this as a potential issue)."
> + fi
> + if [ $(($abiret & 8)) != 0 ]; then
> + echo "ABIDIFF_ABI_INCOMPATIBLE_CHANGE, this change 
> breaks the ABI."
> + fi
> + echo
> + }
>  done
>  
>  [ -z "$error" ] || [ -n "$warnonly" ]
> -- 
> 2.23.0
> 
> 
this looks pretty reasonable to me, sure.
Acked-by: Neil Horman 
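
For reference, the exit-status bits tested by the hunk above can be decoded
as in this minimal Python sketch. The bit values (1, 2, 4, 8) are the ones
the patch tests; the function name abidiff_hints and the shortened message
strings are illustrative only and are not part of check-abi.sh.

```python
# Exit-status bits of abidiff, as tested in the patch above.
ABIDIFF_ERROR = 1
ABIDIFF_USAGE_ERROR = 2
ABIDIFF_ABI_CHANGE = 4
ABIDIFF_ABI_INCOMPATIBLE_CHANGE = 8


def abidiff_hints(status):
    """Return a list of hints for an abidiff exit status (hypothetical helper)."""
    hints = []
    if status & (ABIDIFF_ERROR | ABIDIFF_USAGE_ERROR):
        hints.append("tool or usage error, report to the mailing list")
    if status & ABIDIFF_ABI_CHANGE:
        hints.append("ABI change, requires a review")
    if status & ABIDIFF_ABI_INCOMPATIBLE_CHANGE:
        hints.append("incompatible ABI change, breaks the ABI")
    return hints


if __name__ == "__main__":
    # 12 = ABI change (4) together with incompatible change (8).
    for hint in abidiff_hints(12):
        print(hint)
```

Because the status is a bit field, several hints can be reported for one
run, which is why the script uses independent "if" tests rather than a case
statement.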


Re: [dpdk-dev] [PATCH 4/9] devtools/update_version_map: add python2 deprecation notice

2020-07-10 Thread Neil Horman
On Fri, Jul 10, 2020 at 11:10:50AM +0100, Louise Kilheeney wrote:
> Cc: Neil Horman 
> Cc: Ray Kinsella 
> 
> Signed-off-by: Louise Kilheeney 
> ---
>  devtools/update_version_map_abi.py | 4 
>  1 file changed, 4 insertions(+)
> 
> diff --git a/devtools/update_version_map_abi.py 
> b/devtools/update_version_map_abi.py
> index e2104e61e..80a61641e 100755
> --- a/devtools/update_version_map_abi.py
> +++ b/devtools/update_version_map_abi.py
> @@ -160,6 +160,10 @@ def __generate_internal_abi(f_out, lines):
>  print("};", file=f_out)
>  
>  def __main():
> +if sys.version_info.major < 3:
> +print("WARNING: Python 2 is deprecated for use in DPDK, and will not 
> work in future releases.", file=sys.stderr)
> +print("Please use Python 3 instead", file=sys.stderr)
> +
>  arg_parser = argparse.ArgumentParser(
>  description='Merge versions in linker version script.')
>  
> -- 
> 2.17.1
> 
> 
Acked-by: Neil Horman


Re: [dpdk-dev] [PATCH 7/9] usertools/dpdk-pmdinfo: add python2 deprecation notice

2020-07-10 Thread Neil Horman
On Fri, Jul 10, 2020 at 11:10:53AM +0100, Louise Kilheeney wrote:
> Cc: Neil Horman 
> 
> Signed-off-by: Louise Kilheeney 
> ---
>  usertools/dpdk-pmdinfo.py | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/usertools/dpdk-pmdinfo.py b/usertools/dpdk-pmdinfo.py
> index 12f20735e..f9ed75517 100755
> --- a/usertools/dpdk-pmdinfo.py
> +++ b/usertools/dpdk-pmdinfo.py
> @@ -28,7 +28,9 @@
>  pcidb = None
>  
>  # ===
> -
> +if sys.version_info.major < 3:
> +print("WARNING: Python 2 is deprecated for use in DPDK, and will not 
> work in future releases.", file=sys.stderr)
> +print("Please use Python 3 instead", file=sys.stderr)
>  
>  class Vendor:
>  """
> -- 
> 2.17.1
> 
> 
Acked-by: Neil Horman 


[dpdk-dev] [PATCH] net: fix checksum on big endian CPUs

2020-07-10 Thread Hongzhi Guo
With current code, the checksum of odd-length buffers is wrong on
big endian CPUs: the last byte is not properly summed to the
accumulator.

Fix this by left-shifting the remaining byte by 8. For instance,
if the last byte is 0x42, we should add 0x4200 to the accumulator
on big endian CPUs.

This change is similar to what is suggested in Errata 3133 of
RFC 1071.

Fixes: 6006818cfb26("net: new checksum functions")
Cc: sta...@dpdk.org

Signed-off-by: Hongzhi Guo 
---
v2:
* Explain the logic in the commit log
* Fixed commit title
---
---
 lib/librte_net/rte_ip.h | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/lib/librte_net/rte_ip.h b/lib/librte_net/rte_ip.h
index 292f63fd7..4fb0e314a 100644
--- a/lib/librte_net/rte_ip.h
+++ b/lib/librte_net/rte_ip.h
@@ -139,8 +139,11 @@ __rte_raw_cksum(const void *buf, size_t len, uint32_t sum)
}
 
/* if length is in odd bytes */
-   if (len == 1)
-   sum += *((const uint8_t *)u16_buf);
+   if (len == 1) {
+   uint16_t left = 0;
+   *(uint8_t *)&left = *(const uint8_t *)u16_buf;
+   sum += left;
+   }
 
return sum;
 }
-- 
2.21.0.windows.1




[dpdk-dev] [PATCH] pci: keep API compatibility with mmap values

2020-07-10 Thread Thomas Monjalon
The function pci_map_resource() returns MAP_FAILED in case of error.
When replacing the call to mmap() by rte_mem_map(),
the error code became NULL, breaking the API.
This function is probably not used outside of DPDK,
but it is still a problem for two reasons:
- the deprecation process was not followed
- the Linux function pci_vfio_mmap_bar() is broken for i40e

The error code is reverted to the Unix value MAP_FAILED.
Windows needs to define this special value (-1 as in Unix).
After proper deprecation process, the API could be changed again
if really needed.

Because of the switch from mmap() to rte_mem_map(),
another part of the API was changed: "int additional_flags"
are defined as "additional flags for the mapping range"
without mentioning it was directly used in mmap().
Currently it is directly used in rte_mem_map(),
that's why the values rte_map_flags must be mapped (sic) on the mmap ones
in case of Unix OS.

These are side effects of a badly defined API using Unix values.

Bugzilla ID: 503
Fixes: 2fd3567e5425 ("pci: use OS generic memory mapping functions")
Cc: tal...@mellanox.com

Reported-by: David Marchand 
Signed-off-by: Thomas Monjalon 
---
 drivers/bus/pci/bsd/pci.c | 2 +-
 drivers/bus/pci/linux/pci_uio.c   | 2 +-
 drivers/bus/pci/linux/pci_vfio.c  | 4 ++--
 drivers/bus/pci/pci_common_uio.c  | 2 +-
 lib/librte_eal/include/rte_eal_paging.h   | 8 
 lib/librte_eal/windows/include/sys/mman.h | 9 +
 lib/librte_pci/rte_pci.c  | 1 +
 lib/librte_pci/rte_pci.h  | 2 +-
 8 files changed, 24 insertions(+), 6 deletions(-)
 create mode 100644 lib/librte_eal/windows/include/sys/mman.h

diff --git a/drivers/bus/pci/bsd/pci.c b/drivers/bus/pci/bsd/pci.c
index 8bc473eb9a..6ec27b4b5b 100644
--- a/drivers/bus/pci/bsd/pci.c
+++ b/drivers/bus/pci/bsd/pci.c
@@ -192,7 +192,7 @@ pci_uio_map_resource_by_index(struct rte_pci_device *dev, 
int res_idx,
mapaddr = pci_map_resource(NULL, fd, (off_t)offset,
(size_t)dev->mem_resource[res_idx].len, 0);
close(fd);
-   if (mapaddr == NULL)
+   if (mapaddr == MAP_FAILED)
goto error;
 
maps[map_idx].phaddr = dev->mem_resource[res_idx].phys_addr;
diff --git a/drivers/bus/pci/linux/pci_uio.c b/drivers/bus/pci/linux/pci_uio.c
index b622001539..097dc19225 100644
--- a/drivers/bus/pci/linux/pci_uio.c
+++ b/drivers/bus/pci/linux/pci_uio.c
@@ -345,7 +345,7 @@ pci_uio_map_resource_by_index(struct rte_pci_device *dev, 
int res_idx,
mapaddr = pci_map_resource(pci_map_addr, fd, 0,
(size_t)dev->mem_resource[res_idx].len, 0);
close(fd);
-   if (mapaddr == NULL)
+   if (mapaddr == MAP_FAILED)
goto error;
 
pci_map_addr = RTE_PTR_ADD(mapaddr,
diff --git a/drivers/bus/pci/linux/pci_vfio.c b/drivers/bus/pci/linux/pci_vfio.c
index fdeb9a8caf..07e072e13f 100644
--- a/drivers/bus/pci/linux/pci_vfio.c
+++ b/drivers/bus/pci/linux/pci_vfio.c
@@ -566,7 +566,7 @@ pci_vfio_mmap_bar(int vfio_dev_fd, struct 
mapped_pci_resource *vfio_res,
}
 
/* if there's a second part, try to map it */
-   if (map_addr != NULL
+   if (map_addr != MAP_FAILED
&& memreg[1].offset && memreg[1].size) {
void *second_addr = RTE_PTR_ADD(bar_addr,
(uintptr_t)(memreg[1].offset -
@@ -578,7 +578,7 @@ pci_vfio_mmap_bar(int vfio_dev_fd, struct 
mapped_pci_resource *vfio_res,
RTE_MAP_FORCE_ADDRESS);
}
 
-   if (map_addr == NULL) {
+   if (map_addr == NULL || map_addr == MAP_FAILED) {
munmap(bar_addr, bar->size);
bar_addr = MAP_FAILED;
RTE_LOG(ERR, EAL, "Failed to map pci BAR%d\n",
diff --git a/drivers/bus/pci/pci_common_uio.c b/drivers/bus/pci/pci_common_uio.c
index 793dfd0a7c..f4dca9da91 100644
--- a/drivers/bus/pci/pci_common_uio.c
+++ b/drivers/bus/pci/pci_common_uio.c
@@ -58,7 +58,7 @@ pci_uio_map_secondary(struct rte_pci_device *dev)
"Cannot mmap device resource file %s to 
address: %p\n",
uio_res->maps[i].path,
uio_res->maps[i].addr);
-   if (mapaddr != NULL) {
+   if (mapaddr != MAP_FAILED) {
/* unmap addrs correctly mapped */
for (j = 0; j < i; j++)
pci_unmap_resource(
diff --git a/lib/librte_eal/include/rte_eal_paging.h 
b/lib/librte_eal/include/rte_eal_paging.h
index ed98e70e9e..680a7f2505 100644
--- a/lib/librte_eal/include/rte_eal_paging.h
+++ b/lib/librte_eal/include/rt

[dpdk-dev] [dpdk-announce] DPDK 18.11.9 (LTS) released

2020-07-10 Thread Kevin Traynor
Hi all,

Here is a new LTS release:
https://fast.dpdk.org/rel/dpdk-18.11.9.tar.xz

The git tree is at:
https://dpdk.org/browse/dpdk-stable/?h=18.11

It has about 200 bugfixes since the previous release.

Thanks to the authors who helped with backports and to
the following who helped with validation:

Intel, Red Hat, Mellanox, OVS project and Microsoft.

Kevin.

---
 app/test-crypto-perf/main.c|   3 +-
 app/test-eventdev/test_pipeline_common.c   |  10 +-
 app/test-pmd/Makefile  |   6 +
 app/test-pmd/cmdline.c |   8 +-
 app/test-pmd/config.c  |  26 +-
 app/test-pmd/csumonly.c|  12 +-
 app/test-pmd/meson.build   |   5 +
 app/test-pmd/parameters.c  |   2 +-
 app/test-pmd/testpmd.c |   4 +-
 config/meson.build |   4 +
 devtools/check-symbol-change.sh|  10 +-
 devtools/checkpatches.sh   |   8 +
 doc/api/doxy-api-index.md  |   2 +-
 doc/api/doxy-api.conf.in   |   1 +
 doc/guides/conf.py |  22 +-
 doc/guides/contributing/documentation.rst  |  12 +-
 doc/guides/contributing/patches.rst|  20 +-
 doc/guides/contributing/stable.rst |   6 +-
 doc/guides/cryptodevs/aesni_gcm.rst|  14 +
 doc/guides/cryptodevs/aesni_mb.rst |  14 +
 doc/guides/eventdevs/index.rst |   2 +-
 doc/guides/freebsd_gsg/install_from_ports.rst  |   2 +-
 doc/guides/linux_gsg/eal_args.include.rst  |   2 +-
 doc/guides/linux_gsg/nic_perf_intel_platform.rst   |   2 +-
 doc/guides/nics/enic.rst   |   2 +-
 doc/guides/nics/fail_safe.rst  |   2 +-
 doc/guides/nics/features/avf.ini   |   1 -
 doc/guides/nics/features/avf_vec.ini   |   1 -
 doc/guides/nics/features/i40e.ini  |   1 -
 doc/guides/nics/features/igb.ini   |   1 +
 doc/guides/nics/features/ixgbe.ini |   1 +
 doc/guides/nics/i40e.rst   |   9 +
 doc/guides/prog_guide/cryptodev_lib.rst|   2 +-
 doc/guides/rel_notes/release_18_11.rst | 361 +
 .../sample_app_ug/l2_forward_real_virtual.rst  |   9 -
 doc/guides/sample_app_ug/link_status_intr.rst  |   7 -
 doc/guides/sample_app_ug/multi_process.rst |   2 +-
 doc/guides/testpmd_app_ug/testpmd_funcs.rst|   2 +-
 drivers/Makefile   |   2 +-
 drivers/bus/fslmc/qbman/qbman_debug.c  |   9 +-
 drivers/bus/ifpga/ifpga_bus.c  |   1 +
 drivers/bus/pci/linux/pci.c|   5 +
 drivers/bus/pci/linux/pci_vfio.c   |  37 ++
 drivers/bus/pci/pci_common.c   |   6 +-
 drivers/bus/pci/pci_common_uio.c   |   1 +
 drivers/bus/pci/private.h  |  10 -
 drivers/bus/vmbus/linux/vmbus_uio.c|   2 +-
 drivers/bus/vmbus/vmbus_common.c   |   2 +-
 drivers/common/cpt/cpt_pmd_logs.h  |   2 +-
 drivers/compress/octeontx/otx_zip_pmd.c|   2 +-
 drivers/compress/zlib/zlib_pmd.c   |   2 +
 drivers/compress/zlib/zlib_pmd_private.h   |   2 +-
 drivers/crypto/aesni_gcm/aesni_gcm_pmd.c   |   2 +
 drivers/crypto/aesni_gcm/aesni_gcm_pmd_private.h   |   2 +-
 drivers/crypto/aesni_mb/rte_aesni_mb_pmd.c |   2 +
 drivers/crypto/aesni_mb/rte_aesni_mb_pmd_private.h |   2 +-
 drivers/crypto/caam_jr/Makefile|   7 +
 drivers/crypto/caam_jr/caam_jr.c   |  23 +-
 drivers/crypto/caam_jr/caam_jr_hw_specific.h   |   2 +-
 drivers/crypto/caam_jr/caam_jr_pvt.h   |   9 +-
 drivers/crypto/caam_jr/caam_jr_uio.c   |  34 +-
 drivers/crypto/caam_jr/meson.build |   5 +
 drivers/crypto/ccp/ccp_dev.c   |   2 +-
 drivers/crypto/dpaa2_sec/Makefile  |   7 +
 drivers/crypto/dpaa2_sec/meson.build   |   5 +
 drivers/crypto/dpaa_sec/Makefile   |   7 +
 drivers/crypto/dpaa_sec/meson.build|   5 +
 drivers/crypto/kasumi/rte_kasumi_pmd.c |   1 +
 drivers/crypto/kasumi/rte_kasumi_pmd_private.h |   4 +-
 drivers/crypto/mvsam/rte_mrvl_pmd.c|   1 +
 drivers/crypto/mvsam/rte_mrvl_pmd_private.h|   2 +-
 drivers/crypto/octeontx/otx_cryptodev.c|   4 +
 drivers/crypto/octeontx/otx_cryptodev.h|   2 +-
 drivers/crypto/openssl/rte_openssl_pmd.c   |  24 +
 drivers/crypto/openssl/rte_openssl_pmd_private.h   |   2 +-
 drivers/crypto/qat/qat_sym_session.c   |   9 +-

Re: [dpdk-dev] [PATCH] net: fix checksum on big endian CPUs

2020-07-10 Thread Morten Brørup
> From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Hongzhi Guo
> Sent: Friday, July 10, 2020 1:43 PM
> 
> With current code, the checksum of odd-length buffers is wrong on
> big endian CPUs: the last byte is not properly summed to the
> accumulator.
> 
> Fix this by left-shifting the remaining byte by 8. For instance,
> if the last byte is 0x42, we should add 0x4200 to the accumulator
> on big endian CPUs.
> 
> This change is similar to what is suggested in Errata 3133 of
> RFC 1071.
> 
> Fixes: 6006818cfb26("net: new checksum functions")
> Cc: sta...@dpdk.org
> 
> Signed-off-by: Hongzhi Guo 
> ---
> v2:
> * Explain the logic in the commit log
> * Fixed commit title
> ---
> ---
>  lib/librte_net/rte_ip.h | 7 +--
>  1 file changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/lib/librte_net/rte_ip.h b/lib/librte_net/rte_ip.h
> index 292f63fd7..4fb0e314a 100644
> --- a/lib/librte_net/rte_ip.h
> +++ b/lib/librte_net/rte_ip.h
> @@ -139,8 +139,11 @@ __rte_raw_cksum(const void *buf, size_t len,
> uint32_t sum)
>   }
> 
>   /* if length is in odd bytes */
> - if (len == 1)
> - sum += *((const uint8_t *)u16_buf);
> + if (len == 1) {
> + uint16_t left = 0;
> + *(uint8_t *)&left = *(const uint8_t *)u16_buf;
> + sum += left;
> + }
> 
>   return sum;
>  }
> --
> 2.21.0.windows.1
> 

This is correct for both big and little endian CPUs.

Reviewed-by: Morten Brørup 
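
A small model of why the patched code yields the same wire bytes on both
byte orders. This is a sketch in Python, not the DPDK code: raw_cksum and
cksum_wire are hypothetical helpers that emulate native uint16_t loads and
stores with int.from_bytes()/to_bytes(), and the "fixed" argument selects
between the old and the patched handling of the trailing byte.

```python
def raw_cksum(buf, byteorder, fixed):
    """Model the 16-bit word sum on a CPU with the given byte order."""
    total = 0
    even_len = len(buf) & ~1
    for i in range(0, even_len, 2):
        # Emulates a native uint16_t load from the buffer.
        total += int.from_bytes(buf[i:i + 2], byteorder)
    if len(buf) & 1:
        last = buf[-1]
        if fixed:
            # Patched code: uint16_t left = 0; *(uint8_t *)&left = last;
            total += int.from_bytes(bytes([last, 0]), byteorder)
        else:
            # Old code: sum += last;
            total += last
    return total


def cksum_wire(buf, byteorder, fixed):
    """Fold, complement and store the checksum; return the bytes in memory."""
    total = raw_cksum(buf, byteorder, fixed)
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    csum = ~total & 0xFFFF
    # Emulates a native uint16_t store into the packet.
    return csum.to_bytes(2, byteorder)


if __name__ == "__main__":
    data = b"\x01\x02\x42"  # odd length, last byte 0x42
    for order in ("little", "big"):
        for fixed in (False, True):
            print(order, "fixed" if fixed else "old",
                  cksum_wire(data, order, fixed).hex())
```

In this model the old code already produces the expected bytes on little
endian, since a lone byte stored at the lowest address of a uint16_t has the
value of that byte there; only the big endian result changes with the patch.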



Re: [dpdk-dev] [PATCH v10 0/3] RCU integration with LPM library

2020-07-10 Thread David Marchand
On Fri, Jul 10, 2020 at 4:22 AM Ruifeng Wang  wrote:
>
> This patchset integrates RCU QSBR support with LPM library.
>
> Resource reclamation implementation was split from the original
> series, and is already part of the RCU library. Rework the series
> to base LPM integration on the RCU reclamation APIs.
>
> New API rte_lpm_rcu_qsbr_add is introduced for applications to
> register an RCU variable that the LPM library will use. This gives
> the user a handle to enable the RCU support integrated in the LPM library.
>
> Functional tests and performance tests are added to cover the
> integration with RCU.

Series applied.

A comment though.

I am surprised to see the defer queue is still exposed out of lpm.

+int rte_lpm_rcu_qsbr_add(struct rte_lpm *lpm, struct rte_lpm_rcu_config *cfg,
+struct rte_rcu_qsbr_dq **dq);

If this is intended, we will need unit tests for this parameter as I
could see none.
Else, it can be removed.

Please send a followup patch for rc2.
Thanks.


-- 
David Marchand



Re: [dpdk-dev] [PATCH] bus/pci: fix mmap PCI resource

2020-07-10 Thread Thomas Monjalon
10/07/2020 12:07, Thomas Monjalon:
> 10/07/2020 11:54, David Marchand:
> > On Wed, Jul 8, 2020 at 11:26 AM  wrote:
> > > From: Alvin Zhang 
> > >
> > > When mapping a PCI BAR containing an MSI-X table, some devices do not
> > > need to actually map this BAR or only need to map part of them, which
> > > may cause the mapping to fail. Now some checks are added and a non-NULL
> > > initial value is set to the variable to avoid this situation.
[...]
> > > --- a/drivers/bus/pci/linux/pci_vfio.c
> > > +++ b/drivers/bus/pci/linux/pci_vfio.c
> > > @@ -547,6 +547,14 @@
> > > bar_index,
> > > memreg[0].offset, memreg[0].size,
> > > memreg[1].offset, memreg[1].size);
> > > +
> > > +   if (memreg[0].size == 0 && memreg[1].size == 0) {
> > > +   /* No need to map this BAR */
> > > +   RTE_LOG(DEBUG, EAL, "Skipping BAR%d\n", 
> > > bar_index);
> > > +   bar->size = 0;
> > > +   bar->addr = 0;
> > > +   return 0;
> > > +   }
> > 
> > We already have a check on bar size == 0.
> > Why would we have this condition?
> > Broken hw?
> > 
> > 
> > > } else {
> > > memreg[0].offset = bar->offset;
> > > memreg[0].size = bar->size;
> > > @@ -556,7 +564,9 @@
> > > bar_addr = mmap(bar->addr, bar->size, 0, MAP_PRIVATE |
> > > MAP_ANONYMOUS | additional_flags, -1, 0);
> > > if (bar_addr != MAP_FAILED) {
> > > -   void *map_addr = NULL;
> > > +   /* Set non NULL initial value for in case of no PCI 
> > > mapping */
> > > +   void *map_addr = bar_addr;
> > > +
> > 
> > It took me some time to understand this code...
> > Anyway, we have a regression in the librte_pci.
> > This is where the fix should be.
> 
> Yes, I am going to send a fix.

Patch sent: https://patches.dpdk.org/patch/73741/

This patch is marked as rejected, but please follow-up on cleanup.

> > We can cleanup this code later.
> 
> Yes please, this function is hard to understand and lacks comments.
> Anatoly please?





Re: [dpdk-dev] [PATCH v6 1/2] mbuf: introduce accurate packet Tx scheduling

2020-07-10 Thread Slava Ovsiienko
Hi, Ferruh


Thanks a lot for the review.

> -Original Message-
> From: Ferruh Yigit 
> Sent: Friday, July 10, 2020 2:47
> To: Slava Ovsiienko ; dev@dpdk.org
> Cc: Matan Azrad ; Raslan Darawsheh
> ; olivier.m...@6wind.com;
> bernard.iremon...@intel.com; tho...@monjalon.com; Andrew Rybchenko
> 
> Subject: Re: [dpdk-dev] [PATCH v6 1/2] mbuf: introduce accurate packet Tx
> scheduling
> 
> On 7/9/2020 1:36 PM, Viacheslav Ovsiienko wrote:
> > There is the requirement on some networks for precise traffic timing
> > management. The ability to send (and, generally speaking, receive) the
> > packets at the very precisely specified moment of time provides the
> > opportunity to support the connections with Time Division Multiplexing
> > using the contemporary general purpose NIC without involving an
> > auxiliary hardware. For example, the supporting of O-RAN Fronthaul
> > interface is one of the promising features for potentially usage of
> > the precise time management for the egress packets.
> 
> Is this a HW support, or is the scheduling planned to be done in the driver?
Yes, mlx5 PMD feature v1 is sent: http://patches.dpdk.org/patch/73714/

> 
> >
> > The main objective of this RFC is to specify the way how applications
> 
> It is no more RFC.
Oops, miscopy. Thanks.

> 
> > can provide the moment of time at what the packet transmission must be
> > started and to describe in preliminary the supporting this feature
> > from
> > mlx5 PMD side.
> 
> I was about the ask this, will there be a PMD counterpart implementation of
> the feature? It would be better to have it as part of this set.
> What is the plan for the PMD implementation?
Please, see above.
> 
> >
> > The new dynamic timestamp field is proposed, it provides some timing
> > information, the units and time references (initial phase) are not
> > explicitly defined but are maintained always the same for a given port.
> > Some devices allow to query rte_eth_read_clock() that will return the
> > current device timestamp. The dynamic timestamp flag tells whether the
> > field contains actual timestamp value. For the packets being sent this
> > value can be used by PMD to schedule packet sending.
> >
> > The device clock is opaque entity, the units and frequency are vendor
> > specific and might depend on hardware capabilities and configurations.
> > It might (or not) be synchronized with real time via PTP, might (or
> > not) be synchronous with CPU clock (for example if NIC and CPU share
> > the same clock source there might be no any drift between the NIC and
> > CPU clocks), etc.
> >
> > After PKT_RX_TIMESTAMP flag and fixed timestamp field deprecation and
> > obsoleting, these dynamic flag and field will be used to manage the
> > timestamps on receiving datapath as well. Having the dedicated flags
> > for Rx/Tx timestamps allows applications not to perform explicit flags
> > reset on forwarding and not to promote received timestamps to the
> > transmitting datapath by default. The static PKT_RX_TIMESTAMP is
> > considered as candidate to become the dynamic flag.
> 
> Is there a deprecation notice for 'PKT_RX_TIMESTAMP'? Is this decided?
No, we are going to discuss that, the Rx timestamp is a good candidate to be
moved out from the first mbuf cacheline to the dynamic field.
There are good chances we will deprecate fixed Rx timestamp flag/field,
that's why we'd prefer not to rely on ones anymore.

> 
> >
> > When PMD sees the "rte_dynfield_timestamp" set on the packet being
> > sent it tries to synchronize the time of packet appearing on the wire
> > with the specified packet timestamp. If the specified one is in the
> > past it should be ignored, if one is in the distant future it should
> > be capped with some reasonable value (in range of seconds). These
> > specific cases ("too late" and "distant future") can be optionally
> > reported via device xstats to assist applications to detect the
> > time-related problems.
> >
> > No packet reordering according to timestamps is performed, neither
> > within a packet burst nor between packets; it is entirely the
> > application's responsibility to generate packets and their timestamps
> > in the desired order. A timestamp can be put in only the first packet
> > of a burst to schedule the entire burst.
> >
> > PMD reports the ability to synchronize packet sending on timestamp
> > with new offload flag:
> >
> > This is palliative and is going to be replaced with new eth_dev API
> > about reporting/managing the supported dynamic flags and its related
> > features. This API would break ABI compatibility and can't be
> > introduced at the moment, so is postponed to 20.11.
> 
> Good to hear that there will be a generic API to get supported dynamic flags.
> I was concerned about adding 'DEV_TX_OFFLOAD_SEND_ON_TIMESTAMP'
> flag, since not sure if there will be any other PMD that will want to use it.
> The trouble is it is hard to remove a public macro after it is introduced, in 
> this
> release I t

Re: [dpdk-dev] [PATCH] net: fix checksum on big endian CPUs

2020-07-10 Thread Olivier Matz
On Fri, Jul 10, 2020 at 02:20:08PM +0200, Morten Brørup wrote:
> > From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Hongzhi Guo
> > Sent: Friday, July 10, 2020 1:43 PM
> > 
> > With current code, the checksum of odd-length buffers is wrong on
> > big endian CPUs: the last byte is not properly summed to the
> > accumulator.
> > 
> > Fix this by left-shifting the remaining byte by 8. For instance,
> > if the last byte is 0x42, we should add 0x4200 to the accumulator
> > on big endian CPUs.
> > 
> > This change is similar to what is suggested in Errata 3133 of
> > RFC 1071.
> > 
> > Fixes: 6006818cfb26("net: new checksum functions")
> > Cc: sta...@dpdk.org
> > 
> > Signed-off-by: Hongzhi Guo 
> > ---
> > v2:
> > * Explain the logic in the commit log
> > * Fixed commit title
> > ---
> > ---
> >  lib/librte_net/rte_ip.h | 7 +--
> >  1 file changed, 5 insertions(+), 2 deletions(-)
> > 
> > diff --git a/lib/librte_net/rte_ip.h b/lib/librte_net/rte_ip.h
> > index 292f63fd7..4fb0e314a 100644
> > --- a/lib/librte_net/rte_ip.h
> > +++ b/lib/librte_net/rte_ip.h
> > @@ -139,8 +139,11 @@ __rte_raw_cksum(const void *buf, size_t len,
> > uint32_t sum)
> > }
> > 
> > /* if length is in odd bytes */
> > -   if (len == 1)
> > -   sum += *((const uint8_t *)u16_buf);
> > +   if (len == 1) {
> > +   uint16_t left = 0;
> > +   *(uint8_t *)&left = *(const uint8_t *)u16_buf;
> > +   sum += left;
> > +   }
> > 
> > return sum;
> >  }
> > --
> > 2.21.0.windows.1
> > 
> 
> This is correct for both big and little endian CPUs.
> 
> Reviewed-by: Morten Brørup 

Acked-by: Olivier Matz 

Thanks!



[dpdk-dev] [PATCH v7 2/2] app/testpmd: add send scheduling test capability

2020-07-10 Thread Viacheslav Ovsiienko
This commit adds testpmd capability to provide timestamps on the packets
being sent in the txonly mode. This includes:

 - SEND_ON_TIMESTAMP support
   new device Tx offload capability support added, example:

 testpmd> port config 0 tx_offload send_on_timestamp on

 - set txtimes, registers field and flag, example:

 testpmd> set txtimes 100,0

   This command enables the packet send scheduling on timestamps if
   the first parameter is not zero, generic format:

 testpmd> set txtimes (inter),(intra)

   where:

 inter - is the delay between the bursts in the device clock units.
 If "intra" (next parameter) is zero, this is the time between the
 beginnings of the first packets in the neighbour bursts, if "intra"
 is not zero, "inter" specifies the time between the beginning of the
 first packet of the current burst and the beginning of the last packet
 of the previous burst. If the "inter" parameter is zero, the send scheduling
 on timestamps is disabled (default).

 intra - is the delay between the packets within the burst specified
 in the device clock units. The number of packets in the burst is
 defined by the regular burst setting. If the "intra" parameter is zero, no
 timestamps are provided in the packets except the first one in
 the burst.

 As a result, the bursts of packets will be transmitted with a
 specific delay between the packets within the burst and a specific
 delay between the bursts. rte_eth_read_clock() is expected to be
 used to get the current device clock value and provide
 the reference for the timestamps. If the device does not support
 rte_eth_read_clock(), no send scheduling is provided
 on the device.

 - show txtimes, displays the timing settings
 - txonly burst time pattern
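
The inter/intra semantics described above can be sketched as follows. This
is a Python model of the description only, not the txonly.c implementation;
burst_timestamps and its arguments (start, burst_size, n_bursts) are
hypothetical names, and None stands for a packet that carries no timestamp.

```python
def burst_timestamps(start, inter, intra, burst_size, n_bursts):
    """Return per-burst lists of timestamps in device clock units."""
    if inter == 0:
        return []  # scheduling on timestamps is disabled
    bursts = []
    first = start
    for _ in range(n_bursts):
        if intra == 0:
            # Only the first packet of the burst is timestamped.
            stamps = [first] + [None] * (burst_size - 1)
            last = first
        else:
            # Packets inside the burst are spaced by "intra".
            stamps = [first + i * intra for i in range(burst_size)]
            last = stamps[-1]
        bursts.append(stamps)
        # With intra == 0 this is first-to-first spacing; otherwise it is
        # measured from the last packet of the previous burst.
        first = last + inter
    return bursts


if __name__ == "__main__":
    print(burst_timestamps(1000, 100, 0, 4, 2))
    print(burst_timestamps(1000, 100, 10, 3, 2))
```

In a real run the start value would come from the device clock, for example
a value read with rte_eth_read_clock() plus some margin.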

Signed-off-by: Viacheslav Ovsiienko 
---
 app/test-pmd/cmdline.c  | 63 --
 app/test-pmd/config.c   | 61 +
 app/test-pmd/testpmd.c  |  6 +++
 app/test-pmd/testpmd.h  |  4 ++
 app/test-pmd/txonly.c   | 83 +++--
 doc/guides/testpmd_app_ug/testpmd_funcs.rst | 38 -
 6 files changed, 246 insertions(+), 9 deletions(-)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index 39ad938..def0709 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -294,6 +294,10 @@ static void cmd_help_long_parsed(void *parsed_result,
" Right now only applicable for CSUM and TXONLY"
" modes\n\n"
 
+   "set txtimes (x, y)\n"
+   "Set the scheduling on timestamps"
+   " timings for the TXOMLY mode\n\n"
+
"set corelist (x[,y]*)\n"
"Set the list of forwarding cores.\n\n"
 
@@ -3930,6 +3934,52 @@ struct cmd_set_txsplit_result {
},
 };
 
+/* *** SET TIMES FOR TXONLY PACKETS SCHEDULING ON TIMESTAMPS *** */
+
+struct cmd_set_txtimes_result {
+   cmdline_fixed_string_t cmd_keyword;
+   cmdline_fixed_string_t txtimes;
+   cmdline_fixed_string_t tx_times;
+};
+
+static void
+cmd_set_txtimes_parsed(void *parsed_result,
+  __rte_unused struct cmdline *cl,
+  __rte_unused void *data)
+{
+   struct cmd_set_txtimes_result *res;
+   unsigned int tx_times[2] = {0, 0};
+   unsigned int n_times;
+
+   res = parsed_result;
+   n_times = parse_item_list(res->tx_times, "tx times",
+ 2, tx_times, 0);
+   if (n_times == 2)
+   set_tx_pkt_times(tx_times);
+}
+
+cmdline_parse_token_string_t cmd_set_txtimes_keyword =
+   TOKEN_STRING_INITIALIZER(struct cmd_set_txtimes_result,
+cmd_keyword, "set");
+cmdline_parse_token_string_t cmd_set_txtimes_name =
+   TOKEN_STRING_INITIALIZER(struct cmd_set_txtimes_result,
+txtimes, "txtimes");
+cmdline_parse_token_string_t cmd_set_txtimes_value =
+   TOKEN_STRING_INITIALIZER(struct cmd_set_txtimes_result,
+tx_times, NULL);
+
+cmdline_parse_inst_t cmd_set_txtimes = {
+   .f = cmd_set_txtimes_parsed,
+   .data = NULL,
+   .help_str = "set txtimes ,",
+   .tokens = {
+   (void *)&cmd_set_txtimes_keyword,
+   (void *)&cmd_set_txtimes_name,
+   (void *)&cmd_set_txtimes_value,
+   NULL,
+   },
+};
+
 /* *** ADD/REMOVE ALL VLAN IDENTIFIERS TO/FROM A PORT VLAN RX FILTER *** */
 struct cmd_rx_vlan_filter_all_result {
cmdline_fixed_string_t rx_vlan;
@@ -7418,6 +7468,8 @@ static void cmd_showcfg_parsed(void *parsed_result,
pkt_fwd_config_display(&cur_fwd_config);
else if (!strcmp(res->what, "txpkts"))
show_tx_pkt_segments();
+   else if (!strcm

[dpdk-dev] [PATCH v7 1/2] mbuf: introduce accurate packet Tx scheduling

2020-07-10 Thread Viacheslav Ovsiienko
There is the requirement on some networks for precise traffic timing
management. The ability to send (and, generally speaking, receive)
the packets at the very precisely specified moment of time provides
the opportunity to support the connections with Time Division
Multiplexing using the contemporary general purpose NIC without involving
an auxiliary hardware. For example, the supporting of O-RAN Fronthaul
interface is one of the promising features for potentially usage of the
precise time management for the egress packets.

The main objective of this patchset is to specify how applications
can provide the moment of time at which the packet transmission must be
started and to give a preliminary description of the support for this
feature on the mlx5 PMD side [1].

The new dynamic timestamp field is proposed, it provides some timing
information, the units and time references (initial phase) are not
explicitly defined but are maintained always the same for a given port.
Some devices allow to query rte_eth_read_clock() that will return
the current device timestamp. The dynamic timestamp flag tells whether
the field contains actual timestamp value. For the packets being sent
this value can be used by PMD to schedule packet sending.

The device clock is opaque entity, the units and frequency are
vendor specific and might depend on hardware capabilities and
configurations. It might (or not) be synchronized with real time
via PTP, might (or not) be synchronous with CPU clock (for example
if NIC and CPU share the same clock source there might be no
any drift between the NIC and CPU clocks), etc.

After PKT_RX_TIMESTAMP flag and fixed timestamp field supposed
deprecation and obsoleting, these dynamic flag and field might be
used to manage the timestamps on receiving datapath as well. Having
the dedicated flags for Rx/Tx timestamps allows applications not
to perform explicit flags reset on forwarding and not to promote
received timestamps to the transmitting datapath by default.
The static PKT_RX_TIMESTAMP is considered as candidate to become
the dynamic flag and this move should be discussed.

When PMD sees the "rte_dynfield_timestamp" set on the packet being sent
it tries to synchronize the time of packet appearing on the wire with
the specified packet timestamp. If the specified one is in the past it
should be ignored, if one is in the distant future it should be capped
with some reasonable value (in range of seconds). These specific cases
("too late" and "distant future") can be optionally reported via
device xstats to assist applications to detect the time-related
problems.
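
The handling of "too late" and "distant future" timestamps described above
can be modelled as in the sketch below. It is an illustration in Python
under stated assumptions, not PMD code: effective_tx_time, max_ahead and the
two reason strings are hypothetical, and the actual cap value and xstats
names are left to each driver.

```python
def effective_tx_time(now, timestamp, max_ahead):
    """Return (send_time, reason) for a requested Tx timestamp.

    send_time is None when the packet should be sent without scheduling;
    reason names the special case hit, if any.
    """
    if timestamp <= now:
        # Timestamp is in the past: ignore it and send right away.
        return None, "too_late"
    if timestamp - now > max_ahead:
        # Timestamp is too far ahead: cap the wait.
        return now + max_ahead, "too_distant"
    return timestamp, None


if __name__ == "__main__":
    now = 1_000_000
    for ts in (999_000, 1_000_500, 9_000_000):
        print(ts, effective_tx_time(now, ts, max_ahead=2_000_000))
```

The reason value stands in for the optional xstats counters mentioned in the
text, which let an application notice that its timestamps are off.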

No packet reordering according to timestamps is performed, neither
within a packet burst nor between packets; it is entirely the
application's responsibility to generate packets and their timestamps
in the desired order. A timestamp can be put in only the first packet
of a burst to schedule the entire burst.

PMD reports the ability to synchronize packet sending on timestamp
with new offload flag:

DEV_TX_OFFLOAD_SEND_ON_TIMESTAMP

This is palliative and might be replaced with new eth_dev API
about reporting/managing the supported dynamic flags and its related
features. This API would break ABI compatibility and can't be introduced
at the moment, so is postponed to 20.11.

For testing purposes it is proposed to update testpmd "txonly"
forwarding mode routine. With this update testpmd application generates
the packets and sets the dynamic timestamps according to specified time
pattern if it sees the "rte_dynfield_timestamp" is registered.

The new testpmd command is proposed to configure sending pattern:

set tx_times ,

 - the delay between the packets within the burst
  specified in the device clock units. The number
  of packets in the burst is defined by txburst parameter

 - the delay between the bursts in the device clock units

As a result, the bursts of packets will be transmitted with specific
delays between the packets within the burst and a specific delay between
the bursts. rte_eth_read_clock() is expected to be used to get the
current device clock value and provide the reference for the timestamps.

[1] http://patches.dpdk.org/patch/73714/

Signed-off-by: Viacheslav Ovsiienko 

---
  v1->v4:
 - dedicated dynamic Tx timestamp flag instead of shared with Rx
  v4->v5:
 - elaborated commit message
 - more words about device clocks added,
 - note about dedicated Rx/Tx timestamp flags added
  v5->v6:
 - release notes are updated
  v6->v7:
 - commit message is updated
 - testpmd checks the supported offloads before registering
   dynamic timestamp flag/field
---
 doc/guides/rel_notes/release_20_08.rst |  7 +++
 lib/librte_ethdev/rte_ethdev.c |  1 +
 lib/librte_ethdev/rte_ethdev.h |  4 
 lib/librte_mbuf/rte_mbuf_dyn.h | 31 +++
 4 files changed, 43 insertions(+)

diff --git a/doc/guides/rel_notes/release_20_08.rst 
b/doc/guides/rel_notes/release_2

Re: [dpdk-dev] [PATCH] net: fix unneeded replacement of 0 by ffff for TCP checksum

2020-07-10 Thread Olivier Matz
On Fri, Jul 10, 2020 at 02:55:51PM +0800, Hongzhi Guo wrote:
> Per RFC768:
> If the computed checksum is zero, it is transmitted as all ones.
> An all zero transmitted checksum value means that the transmitter
> generated no checksum.
> 
> RFC793 for TCP has no such special treatment for the checksum of zero.
> 
> Fixes: 6006818cfb26 ("net: new checksum functions")
> Cc: sta...@dpdk.org
> 
> Signed-off-by: Hongzhi Guo 
> ---
> v2:
> * Fixed commit title
> * Fixed the API comment
> ---
> ---
>  lib/librte_net/rte_ip.h | 18 +++---
>  1 file changed, 15 insertions(+), 3 deletions(-)
> 
> diff --git a/lib/librte_net/rte_ip.h b/lib/librte_net/rte_ip.h
> index 292f63fd7..d03c77120 100644
> --- a/lib/librte_net/rte_ip.h
> +++ b/lib/librte_net/rte_ip.h
> @@ -325,7 +325,7 @@ rte_ipv4_phdr_cksum(const struct rte_ipv4_hdr *ipv4_hdr, 
> uint64_t ol_flags)
>   *   The pointer to the beginning of the L4 header.
>   * @return
>   *   The complemented checksum to set in the IP packet
> - *   or 0 on error
> + *   or 0 if the IP length is invalid in the header.
>   */
>  static inline uint16_t
>  rte_ipv4_udptcp_cksum(const struct rte_ipv4_hdr *ipv4_hdr, const void 
> *l4_hdr)
> @@ -344,7 +344,13 @@ rte_ipv4_udptcp_cksum(const struct rte_ipv4_hdr 
> *ipv4_hdr, const void *l4_hdr)
>  
>   cksum = ((cksum & 0xffff0000) >> 16) + (cksum & 0xffff);
>   cksum = (~cksum) & 0xffff;
> - if (cksum == 0)
> + /*
> +  *Per RFC768:
> +  *If the computed checksum is zero for udp,
> +  *it is transmitted as all ones.
> +  *(the equivalent in one's complement arithmetic).
> +  */

There should be a space after the '*', and maybe it could be on
fewer lines.

Thomas, maybe you can do it when applying?


> + if (cksum == 0 && ipv4_hdr->next_proto_id == IPPROTO_UDP)
>   cksum = 0xffff;
>  
>   return (uint16_t)cksum;
> @@ -438,7 +444,13 @@ rte_ipv6_udptcp_cksum(const struct rte_ipv6_hdr 
> *ipv6_hdr, const void *l4_hdr)
>  
>   cksum = ((cksum & 0xffff0000) >> 16) + (cksum & 0xffff);
>   cksum = (~cksum) & 0xffff;
> - if (cksum == 0)
> + /*
> +  *Per RFC768:
> +  *If the computed checksum is zero for udp,
> +  *it is transmitted as all ones.
> +  *(the equivalent in one's complement arithmetic).
> +  */

Same here

> + if (cksum == 0 && ipv6_hdr->proto == IPPROTO_UDP)
>   cksum = 0xffff;
>  
>   return (uint16_t)cksum;
> -- 
> 2.21.0.windows.1
> 
> 

Acked-by: Olivier Matz 
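
For readers following the thread, the UDP-only special case can be sketched in isolation. This is a minimal illustration, not the rte_ip.h code: cksum_finalize() and its parameters are hypothetical names, and the caller is assumed to have already summed the pseudo-header and L4 data into a 32-bit value.

```c
#include <stdint.h>
#include <netinet/in.h> /* IPPROTO_UDP, IPPROTO_TCP */

/* Hypothetical helper: finish an L4 checksum from a 32-bit running sum. */
static inline uint16_t
cksum_finalize(uint32_t sum, uint8_t proto)
{
	/* Fold the carries into the low 16 bits; two passes cover any input. */
	sum = (sum >> 16) + (sum & 0xffff);
	sum = (sum >> 16) + (sum & 0xffff);
	sum = (~sum) & 0xffff;

	/* RFC 768: a computed UDP checksum of zero is sent as all ones. */
	if (sum == 0 && proto == IPPROTO_UDP)
		sum = 0xffff;

	/* For TCP, zero is an ordinary checksum value and is kept as is. */
	return (uint16_t)sum;
}
```

With a running sum of 0xffff the complemented result is zero, so this sketch returns 0xffff for UDP and 0 for TCP.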


Re: [dpdk-dev] [PATCH v6 2/2] app/testpmd: add send scheduling test capability

2020-07-10 Thread Slava Ovsiienko
Hi Ferruh,

Thanks a lot for the comments, addressed all of them.

With best regards, Slava

> -Original Message-
> From: Ferruh Yigit 
> Sent: Friday, July 10, 2020 2:58
> To: Slava Ovsiienko ; dev@dpdk.org
> Cc: Matan Azrad ; Raslan Darawsheh
> ; olivier.m...@6wind.com;
> bernard.iremon...@intel.com; tho...@monjalon.com
> Subject: Re: [dpdk-dev] [PATCH v6 2/2] app/testpmd: add send scheduling
> test capability
> 
> On 7/9/2020 1:36 PM, Viacheslav Ovsiienko wrote:
> > This commit adds testpmd capability to provide timestamps on the
> > packets being sent in the txonly mode. This includes:
> >
> >  - SEND_ON_TIMESTAMP support
> >new device Tx offload capability support added, example:
> >
> >  testpmd> port config 0 tx_offload send_on_timestamp on
> >
> >  - set txtimes, registers field and flag, example:
> >
> >  testpmd> set txtimes 100,0
> >
> >This command enables the packet send scheduling on timestamps if
> >the first parameter is not zero, generic format:
> >
> >  testpmd> set txtimes (inter),(intra)
> >
> >where:
> >
> >  inter - is the delay between the bursts in the device clock units.
> >  intra - is the delay between the packets within the burst specified
> >  in the device clock units
> >
> >  As a result, the bursts of packets will be transmitted with a
> >  specific delay between the packets within the burst and a specific
> >  delay between the bursts. rte_eth_read_clock() is supposed to be
> >  engaged to get the current device clock value and provide
> >  the reference for the timestamps.
> >
> >  - show txtimes, displays the timing settings
> >  - txonly burst time pattern
> >
> > Signed-off-by: Viacheslav Ovsiienko 
> 
> <...>
> 
> > +cmdline_parse_inst_t cmd_set_txtimes = {
> > +   .f = cmd_set_txtimes_parsed,
> > +   .data = NULL,
> > +   .help_str = "set txtimes ,",
> > +   .tokens = {
> > +   (void *)&cmd_set_txtimes_keyword,
> > +   (void *)&cmd_set_txtimes_name,
> > +   (void *)&cmd_set_txtimes_value,
> > +   NULL,
> > +   },
> > +};
> 
> Can you please update 'cmd_help_long_parsed()' with command updates?
> 
> <...>
> 
> >  void
> > +show_tx_pkt_times(void)
> > +{
> > +   printf("Interburst gap: %u\n", tx_pkt_times[0]);
> > +   printf("Intraburst gap: %u\n", tx_pkt_times[1]);
> > +}
> > +
> > +void
> > +set_tx_pkt_times(unsigned int *tx_times)
> > +{
> > +   int offset;
> > +   int flag;
> > +
> > +   static const struct rte_mbuf_dynfield desc_offs = {
> > +   .name = RTE_MBUF_DYNFIELD_TIMESTAMP_NAME,
> > +   .size = sizeof(uint64_t),
> > +   .align = __alignof__(uint64_t),
> > +   };
> > +   static const struct rte_mbuf_dynflag desc_flag = {
> > +   .name = RTE_MBUF_DYNFLAG_TX_TIMESTAMP_NAME,
> > +   };
> > +
> > +   offset = rte_mbuf_dynfield_register(&desc_offs);
> > +   if (offset < 0 && rte_errno != EEXIST)
> > +   printf("Dynamic timestamp field registration error: %d",
> > +  rte_errno);
> > +   flag = rte_mbuf_dynflag_register(&desc_flag);
> > +   if (flag < 0 && rte_errno != EEXIST)
> > +   printf("Dynamic timestamp flag registration error: %d",
> > +  rte_errno);
> 
> You are not checking at all if the device supports
> 'DEV_TX_OFFLOAD_SEND_ON_TIMESTAMP' offload or if it is configured or
> not, but blindly registering the dynamic fields.
> 'DEV_TX_OFFLOAD_SEND_ON_TIMESTAMP' seems not really used, as
> mentioned in prev patch I would be OK to drop the flag.
> 
> > +   tx_pkt_times[0] = tx_times[0];
> > +   tx_pkt_times[1] = tx_times[1];
> 
> I think it is better to rename 'tx_pkt_times[0]' -> 'tx_pkt_times_inter',
> 'tx_pkt_times[1]' --> 'tx_pkt_times_intra' to increase code readability.
> 
> <...>
> 
> > --- a/app/test-pmd/txonly.c
> > +++ b/app/test-pmd/txonly.c
> > @@ -53,6 +53,11 @@
> >  static struct rte_ipv4_hdr pkt_ip_hdr; /**< IP header of transmitted packets. */
> >  RTE_DEFINE_PER_LCORE(uint8_t, _ip_var); /**< IP address variation */
> >  static struct rte_udp_hdr pkt_udp_hdr; /**< UDP header of tx packets. */
> > +RTE_DEFINE_PER_LCORE(uint64_t, ts_qskew); /**< Timestamp offset per queue */
> > +static uint64_t ts_mask; /**< Timestamp dynamic flag mask */
> > +static int32_t ts_off; /**< Timestamp dynamic field offset */
> > +static bool ts_enable; /**< Timestamp enable */
> > +static uint64_t ts_init[RTE_MAX_ETHPORTS];
> 
> What do you think about renaming the 'ts_' prefix to the longer 'timestamp_'
> prefix, or would the variable names be too long? When you are out of this
> patch context and reading the code, 'ts_init' is not that expressive.
> 
> <...>
> 
> > @@ -213,6 +219,50 @@
> > copy_buf_to_pkt(&pkt_udp_hdr, sizeof(pkt_udp_hdr), pkt,
> > sizeof(struct rte_ether_hdr) +
> > sizeof(struct rte_ipv4_hdr));
> > +   if (unlikely(ts_enable)) {
> > +   uint64_t skew = RTE_PER_LCORE(ts_qskew);
> > +   struct {
> > +   rte_be32_t signa

Re: [dpdk-dev] [PATCH] eal/linux: truncate thread name

2020-07-10 Thread Thomas Monjalon
10/07/2020 11:45, David Marchand:
> pthread_setname_np refuses names larger than 16 bytes (\0 included).
> Rather than return an error, truncate the name to this limit in the
> rte_thread_setname helper.
[...]
> --- a/lib/librte_eal/linux/eal_thread.c
> +++ b/lib/librte_eal/linux/eal_thread.c
> @@ -153,7 +153,10 @@ int rte_thread_setname(pthread_t id, const char *name)
>   int ret = ENOSYS;
>  #if defined(__GLIBC__) && defined(__GLIBC_PREREQ)
>  #if __GLIBC_PREREQ(2, 12)
> - ret = pthread_setname_np(id, name);
> + char truncated[16];

That's a pity POSIX is not defining a constant for this limit.

> +
> + strlcpy(truncated, name, sizeof(truncated));
> + ret = pthread_setname_np(id, truncated);
>  #endif
>  #endif

Acked-by: Thomas Monjalon 
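
The truncation step on its own can be sketched as below. This is an illustration under stated assumptions, not the EAL code: thread_name_truncate() and THREAD_NAME_MAX are hypothetical names, the 16-byte limit is the Linux one mentioned in the commit message, and snprintf() stands in for strlcpy() so the sketch builds without extra libraries.

```c
#include <stddef.h>
#include <stdio.h>

/* Linux limit for thread names, including the terminating '\0'. */
#define THREAD_NAME_MAX 16

/*
 * Hypothetical helper: copy name into dst, truncating it if needed.
 * Returns the number of characters stored, not counting the '\0'.
 */
static size_t
thread_name_truncate(char dst[THREAD_NAME_MAX], const char *name)
{
	/* snprintf() always NUL-terminates within the given size. */
	int n = snprintf(dst, THREAD_NAME_MAX, "%s", name);

	if (n < 0) {
		dst[0] = '\0';
		return 0;
	}
	return (size_t)n < THREAD_NAME_MAX ? (size_t)n : THREAD_NAME_MAX - 1;
}
```

The truncated buffer is what would then be handed to pthread_setname_np(), so an over-long name is shortened rather than rejected.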




Re: [dpdk-dev] [PATCH] eal/linux: truncate thread name

2020-07-10 Thread David Marchand
On Fri, Jul 10, 2020 at 2:41 PM Thomas Monjalon  wrote:
>
> 10/07/2020 11:45, David Marchand:
> > pthread_setname_np refuses names larger than 16 bytes (\0 included).
> > Rather than return an error, truncate the name to this limit in the
> > rte_thread_setname helper.
> [...]
> > --- a/lib/librte_eal/linux/eal_thread.c
> > +++ b/lib/librte_eal/linux/eal_thread.c
> > @@ -153,7 +153,10 @@ int rte_thread_setname(pthread_t id, const char *name)
> >   int ret = ENOSYS;
> >  #if defined(__GLIBC__) && defined(__GLIBC_PREREQ)
> >  #if __GLIBC_PREREQ(2, 12)
> > - ret = pthread_setname_np(id, name);
> > + char truncated[16];
>
> That's a pity POSIX is not defining a constant for this limit.

pthread_setname "_np" :-)


-- 
David Marchand



Re: [dpdk-dev] [dpdk-ci] [PATCH] bus/pci: fix mmap PCI resource

2020-07-10 Thread Lincoln Lavoie
On Fri, Jul 10, 2020 at 6:08 AM Thomas Monjalon  wrote:

> 10/07/2020 11:54, David Marchand:
> > On Wed, Jul 8, 2020 at 11:26 AM  wrote:
> > > From: Alvin Zhang 
> > >
> > > When mapping a PCI BAR containing an MSI-X table, some devices do not
> > > need to actually map this BAR or only need to map part of them, which
> > > may cause the mapping to fail. Now some checks are added and a non-NULL
> > > initial value is set to the variable to avoid this situation.
>
> Note: this regression would not have happened if we had some CI tests
> for simple device probing.
> Please let's invest more in CI.
>

Are you referring to adding tests to specifically check these conditions,
or would this have been caught just from the continued expansion of testing
on real hardware / NICs, or both? It seems like the issue is caused by a
combination of hardware behaviors and "broken code".  My point is, without
having some of those behaviors in the CI, we might still not have caught
this issue, even with probing checks.  Of course, more checks are always a
good thing.

>
> > > Fixes: 2fd3567e5425 ("pci: use OS generic memory mapping functions")
> > > Cc: tal...@mellanox.com
>
> No, he was not Cc'ed in the thread. Same for Anatoly.
> Adding more people in Cc...
>
> > > Signed-off-by: Alvin Zhang 
> > > ---
> > > --- a/drivers/bus/pci/linux/pci_vfio.c
> > > +++ b/drivers/bus/pci/linux/pci_vfio.c
> > > @@ -547,6 +547,14 @@
> > > bar_index,
> > > memreg[0].offset, memreg[0].size,
> > > memreg[1].offset, memreg[1].size);
> > > +
> > > +   if (memreg[0].size == 0 && memreg[1].size == 0) {
> > > +   /* No need to map this BAR */
> > > +                       RTE_LOG(DEBUG, EAL, "Skipping BAR%d\n", bar_index);
> > > +   bar->size = 0;
> > > +   bar->addr = 0;
> > > +   return 0;
> > > +   }
> >
> > We already have a check on bar size == 0.
> > Why would we have this condition?
> > Broken hw?
> >
> >
> > > } else {
> > > memreg[0].offset = bar->offset;
> > > memreg[0].size = bar->size;
> > > @@ -556,7 +564,9 @@
> > > bar_addr = mmap(bar->addr, bar->size, 0, MAP_PRIVATE |
> > > MAP_ANONYMOUS | additional_flags, -1, 0);
> > > if (bar_addr != MAP_FAILED) {
> > > -   void *map_addr = NULL;
> > > +   /* Set non NULL initial value for in case of no PCI mapping */
> > > +   void *map_addr = bar_addr;
> > > +
> >
> > It took me some time to understand this code...
> > Anyway, we have a regression in the librte_pci.
> > This is where the fix should be.
>
> Yes, I am going to send a fix.
>
> > We can cleanup this code later.
>
> Yes please, this function isn't understandable and lacks comments.
> Anatoly please?
>
>
>

-- 
*Lincoln Lavoie*
Senior Engineer, Broadband Technologies
21 Madbury Rd., Ste. 100, Durham, NH 03824
lylav...@iol.unh.edu
https://www.iol.unh.edu
+1-603-674-2755 (m)



Re: [dpdk-dev] [PATCH] mempool: return ENOMEM if initial alloc size can not be satisfied

2020-07-10 Thread 王志克
Thanks, you are right.

Will send new patch.

Br,
Zhike Wang 
JDCloud, Product Development, IaaS   

Mobile/+86 13466719566
E- mail/wangzh...@jd.com
Address/5F Building A,North-Star Century Center,8 Beichen West Street,Chaoyang 
District Beijing
Https://JDCloud.com




-Original Message-
From: Andrew Rybchenko [mailto:arybche...@solarflare.com] 
Sent: Friday, July 03, 2020 5:23 PM
To: 王志克; dev@dpdk.org
Cc: olivier.m...@6wind.com
Subject: Re: [dpdk-dev] [PATCH] mempool: return ENOMEM if initial alloc size 
can not be satisfied

On 7/3/20 11:41 AM, Zhike Wang wrote:
> Signed-off-by: Zhike Wang 
> ---
>  lib/librte_mempool/rte_mempool.c | 6 ++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/lib/librte_mempool/rte_mempool.c 
> b/lib/librte_mempool/rte_mempool.c
> index 0bde995..b24feb6 100644
> --- a/lib/librte_mempool/rte_mempool.c
> +++ b/lib/librte_mempool/rte_mempool.c
> @@ -622,6 +622,12 @@ struct pagesz_walk_arg {
>   goto fail;
>   }
>  
> + if (max_alloc_size < min_chunk_size) {
> + rte_errno = ENOMEM;
> + ret = -rte_errno;
> + goto fail;
> + }
> +
>   /* if we're trying to reserve contiguous memory, add appropriate
>* memzone flag.
>*/
> 

As far as I can see there is really a bug in the nearby code, but
the fix suggested here goes in the wrong direction.

The check is already present below in do-while loop condition,
but it is wrong that max_alloc_size is divided by 2 in the
case of successful allocation as well.
If allocation is successful on the first attempt, typically
there is no problem since we allocated everything required and
we'll terminate the loop (if memory chunk is really sufficient
to populate required number of mempool elements).

However, if the first attempt fails, we try to allocate half
of mem_size, and if that succeeds, we'll have one more iteration of
the for-loop to allocate memory for the remaining elements and
should not try the next time with a quarter of the mem_size.
mem_size will be recalculated, but max_alloc_size will limit
the allocation attempt size.

So, I think it is required to add "mz != NULL ||" to the
if condition in the do-while loop. It will guarantee that
max_alloc_size is reduced if and only if mz == NULL, and
if it becomes smaller than min_chunk_size, the if condition
after the do-while loop will return an error.
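
The behaviour argued for above can be modelled with a toy allocator. This is only a sketch of the retry logic, not the rte_mempool.c code: toy_reserve(), largest_free and reserve_chunk() are hypothetical stand-ins, and a real reservation would return a memzone rather than a size.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for the allocator: a request succeeds only if it fits
 * in the largest free contiguous area. Purely illustrative. */
static size_t largest_free;

static int
toy_reserve(size_t size)
{
	return size <= largest_free;
}

/*
 * Try to reserve up to mem_size bytes. *max_alloc_size is halved only
 * after a failed attempt, never after a successful one.
 * Returns the size reserved, or 0 if no chunk of at least
 * min_chunk_size could be obtained.
 */
static size_t
reserve_chunk(size_t mem_size, size_t *max_alloc_size, size_t min_chunk_size)
{
	size_t try_size;
	int ok;

	do {
		try_size = mem_size < *max_alloc_size ?
			mem_size : *max_alloc_size;
		ok = toy_reserve(try_size);
		if (ok)
			break; /* success: leave max_alloc_size untouched */
		*max_alloc_size = try_size / 2;
	} while (*max_alloc_size >= min_chunk_size);

	return ok ? try_size : 0;
}
```

In this model, a failed 1000-byte attempt followed by a successful 500-byte one leaves the limit at 500, so the next iteration of the outer loop can again ask for 500 bytes instead of 250.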


Re: [dpdk-dev] [PATCH v2] mempool/ring: add support for new ring sync modes

2020-07-10 Thread Olivier Matz
Hi Konstantin,

On Thu, Jul 09, 2020 at 05:55:30PM +, Ananyev, Konstantin wrote:
> Hi Olivier,
>  
> > Hi Konstantin,
> > 
> > On Mon, Jun 29, 2020 at 05:10:24PM +0100, Konstantin Ananyev wrote:
> > > v2:
> > >  - update Release Notes (as per comments)
> > >
> > > Two new sync modes were introduced into rte_ring:
> > > relaxed tail sync (RTS) and head/tail sync (HTS).
> > > This change provides user with ability to select these
> > > modes for ring based mempool via mempool ops API.
> > >
> > > Signed-off-by: Konstantin Ananyev 
> > > Acked-by: Gage Eads 
> > > ---
> > >  doc/guides/rel_notes/release_20_08.rst  |  6 ++
> > >  drivers/mempool/ring/rte_mempool_ring.c | 97 ++---
> > >  2 files changed, 94 insertions(+), 9 deletions(-)
> > >
> > > diff --git a/doc/guides/rel_notes/release_20_08.rst 
> > > b/doc/guides/rel_notes/release_20_08.rst
> > > index eaaf11c37..7bdcf3aac 100644
> > > --- a/doc/guides/rel_notes/release_20_08.rst
> > > +++ b/doc/guides/rel_notes/release_20_08.rst
> > > @@ -84,6 +84,12 @@ New Features
> > >* Dump ``rte_flow`` memory consumption.
> > >* Measure packet per second forwarding.
> > >
> > > +* **Added support for new sync modes into mempool ring driver.**
> > > +
> > > +  Added ability to select new ring synchronisation modes:
> > > +  ``relaxed tail sync (ring_mt_rts)`` and ``head/tail sync 
> > > (ring_mt_hts)``
> > > +  via mempool ops API.
> > > +
> > >
> > >  Removed Items
> > >  -
> > > diff --git a/drivers/mempool/ring/rte_mempool_ring.c 
> > > b/drivers/mempool/ring/rte_mempool_ring.c
> > > index bc123fc52..15ec7dee7 100644
> > > --- a/drivers/mempool/ring/rte_mempool_ring.c
> > > +++ b/drivers/mempool/ring/rte_mempool_ring.c
> > > @@ -25,6 +25,22 @@ common_ring_sp_enqueue(struct rte_mempool *mp, void * 
> > > const *obj_table,
> > >   obj_table, n, NULL) == 0 ? -ENOBUFS : 0;
> > >  }
> > >
> > > +static int
> > > +rts_ring_mp_enqueue(struct rte_mempool *mp, void * const *obj_table,
> > > + unsigned int n)
> > > +{
> > > + return rte_ring_mp_rts_enqueue_bulk(mp->pool_data,
> > > + obj_table, n, NULL) == 0 ? -ENOBUFS : 0;
> > > +}
> > > +
> > > +static int
> > > +hts_ring_mp_enqueue(struct rte_mempool *mp, void * const *obj_table,
> > > + unsigned int n)
> > > +{
> > > + return rte_ring_mp_hts_enqueue_bulk(mp->pool_data,
> > > + obj_table, n, NULL) == 0 ? -ENOBUFS : 0;
> > > +}
> > > +
> > >  static int
> > >  common_ring_mc_dequeue(struct rte_mempool *mp, void **obj_table, 
> > > unsigned n)
> > >  {
> > > @@ -39,17 +55,30 @@ common_ring_sc_dequeue(struct rte_mempool *mp, void 
> > > **obj_table, unsigned n)
> > >   obj_table, n, NULL) == 0 ? -ENOBUFS : 0;
> > >  }
> > >
> > > +static int
> > > +rts_ring_mc_dequeue(struct rte_mempool *mp, void **obj_table, unsigned 
> > > int n)
> > > +{
> > > + return rte_ring_mc_rts_dequeue_bulk(mp->pool_data,
> > > + obj_table, n, NULL) == 0 ? -ENOBUFS : 0;
> > > +}
> > > +
> > > +static int
> > > +hts_ring_mc_dequeue(struct rte_mempool *mp, void **obj_table, unsigned 
> > > int n)
> > > +{
> > > + return rte_ring_mc_hts_dequeue_bulk(mp->pool_data,
> > > + obj_table, n, NULL) == 0 ? -ENOBUFS : 0;
> > > +}
> > > +
> > >  static unsigned
> > >  common_ring_get_count(const struct rte_mempool *mp)
> > >  {
> > >   return rte_ring_count(mp->pool_data);
> > >  }
> > >
> > > -
> > >  static int
> > > -common_ring_alloc(struct rte_mempool *mp)
> > > +ring_alloc(struct rte_mempool *mp, uint32_t rg_flags)
> > >  {
> > > - int rg_flags = 0, ret;
> > > + int ret;
> > >   char rg_name[RTE_RING_NAMESIZE];
> > >   struct rte_ring *r;
> > >
> > > @@ -60,12 +89,6 @@ common_ring_alloc(struct rte_mempool *mp)
> > >   return -rte_errno;
> > >   }
> > >
> > > - /* ring flags */
> > > - if (mp->flags & MEMPOOL_F_SP_PUT)
> > > - rg_flags |= RING_F_SP_ENQ;
> > > - if (mp->flags & MEMPOOL_F_SC_GET)
> > > - rg_flags |= RING_F_SC_DEQ;
> > > -
> > >   /*
> > >* Allocate the ring that will be used to store objects.
> > >* Ring functions will return appropriate errors if we are
> > > @@ -82,6 +105,40 @@ common_ring_alloc(struct rte_mempool *mp)
> > >   return 0;
> > >  }
> > >
> > > +static int
> > > +common_ring_alloc(struct rte_mempool *mp)
> > > +{
> > > + uint32_t rg_flags;
> > > +
> > > + rg_flags = 0;
> > 
> > Maybe it could go on the same line
> > 
> > > +
> > > + /* ring flags */
> > 
> > Not sure we need to keep this comment
> > 
> > > + if (mp->flags & MEMPOOL_F_SP_PUT)
> > > + rg_flags |= RING_F_SP_ENQ;
> > > + if (mp->flags & MEMPOOL_F_SC_GET)
> > > + rg_flags |= RING_F_SC_DEQ;
> > > +
> > > + return ring_alloc(mp, rg_flags);
> > > +}
> > > +
> > > +static int
> > > +rts_ring_alloc(struct rte_mempool *mp)
> > > +{
> > > + if ((mp->flags & (MEMPOOL_F_SP_PUT | MEMPOOL_F_SC_GET)) != 0)
> > > + return -EINVAL;
> > 
> > Why do we need this? It is a problem to allow

Re: [dpdk-dev] Weird 2 KB MBUF data room requirement

2020-07-10 Thread Olivier Matz
Hi,

On Fri, Jul 10, 2020 at 11:26:09AM +0100, Bruce Richardson wrote:
> On Fri, Jul 10, 2020 at 10:21:40AM +0200, Morten Brørup wrote:
> > Dear Ethernet PMD developers,
> > 
> > According to rte_mbuf_core.h, RTE_MBUF_DEFAULT_DATAROOM is 2048 bytes 
> > because some NICs need at least 2 KB buffer to receive standard Ethernet 
> > frames without splitting them into multiple segments.
> > 
> > This is a serious waste of memory, considering that standard Ethernet 
> > frames are max 1518 bytes.
> > 
> > How wide spread is this limitation... is it common or a rare exception?
> > 
> > Where is it documented which NICs suffer from this limitation?
> > 
> > Do any Intel NICs suffer from this limitation?
> > 
> > 
> > NB: We are targeting an MBUF total size (incl. memzone element overhead) of 
> > 2^N, and this limitation would increase our MBUF total size to 4 KB.
> > 
> > 
> > Med venlig hilsen / kind regards
> > - Morten Brørup
> > 
> 
> AFAIK: the NICs supported by the ixgbe driver only allow the size to be
> specified in KB granularity.
> 
> However, it may be safe to have a driver modification whereby anything over
> 1600 bytes is considered as 2KB if jumbo frame support is disabled. I don't
> think anyone has actually looked into doing so though, or if there are
> other hidden gotchas about attempting to do so.

If I remember correctly, the niantic NICs (and probably some others) can
have their rx size configured with 512, 1024, 2048, ...
This is the size that should be available from the given data pointer,
i.e. it does not include the headroom.

I suppose that if we configure the NIC with 2K but give it less than 2K, the NIC
may write past the end of the buffer when receiving a large packet.
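
As a small worked example of why a 1518-byte frame ends up needing a 2 KB buffer, the rounding can be sketched as below. It assumes the 1 KB granularity Bruce mentions for ixgbe; RX_BUF_GRANULARITY and rx_buf_size_for_frame() are hypothetical names, not driver API.

```c
#include <stdint.h>

/* Assumed step in which the NIC accepts an Rx buffer size (1 KB). */
#define RX_BUF_GRANULARITY 1024u

/* Smallest configurable buffer size able to hold frame_len bytes. */
static inline uint32_t
rx_buf_size_for_frame(uint32_t frame_len)
{
	return (frame_len + RX_BUF_GRANULARITY - 1) /
		RX_BUF_GRANULARITY * RX_BUF_GRANULARITY;
}
```

Under that assumption a 1518-byte frame rounds up to 2048 bytes, while a frame of exactly 1024 bytes would still fit in a 1 KB buffer.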


Olivier


Re: [dpdk-dev] [PATCH v7 02/25] ethdev: add a link status text representation

2020-07-10 Thread Yigit, Ferruh
On 7/10/2020 8:02 AM, Ivan Dyukov wrote:
> This commit adds a function which takes a link status structure
> and formats it as a text representation.
> 
> Signed-off-by: Ivan Dyukov 

<...>

> +static int
> +rte_eth_link_strf_parser(char *str, size_t len, const char *const fmt,
> +const struct rte_eth_link *link)
> +{
> + size_t offset = 0;
> + const char *fmt_cur = fmt;
> + char *str_cur = str;
> + double gbits = (double)link->link_speed / 1000.;
> + static const char autoneg_str[]   = "Autoneg";
> + static const char fixed_str[] = "Fixed";
> + static const char fdx_str[]   = "FDX";
> + static const char hdx_str[]   = "HDX";
> + static const char unknown_str[]   = "Unknown";
> + static const char up_str[]= "Up";
> + static const char down_str[]  = "Down";
> + char gbits_str[20];
> + char mbits_str[20];
> +
> + /* preformat complex formatting to easily concatenate it further */
> + snprintf(mbits_str, sizeof(mbits_str), "%u", link->link_speed);
> + snprintf(gbits_str, sizeof(gbits_str), "%.1f", gbits);
> + /* init str before formatting */
> + str[0] = 0;
> + while (*fmt_cur) {
> + /* check str bounds */
> + if (offset > (len - 1)) {
> + str[len - 1] = '\0';
> + return -1;
> + }
> + if (*fmt_cur == '%') {
> + /* set null terminator to current position,
> +  * it's required for strlcat
> +  */
> + *str_cur = '\0';
> + switch (*++fmt_cur) {
> + /* Speed in Mbits/s */
> + case 'M':
> + if (link->link_speed ==
> + ETH_SPEED_NUM_UNKNOWN)
> + offset = strlcat(str, unknown_str,
> +  len);
> + else
> + offset = strlcat(str, mbits_str, len);
> + break;
> + /* Speed in Gbits/s */
> + case 'G':
> + if (link->link_speed ==
> + ETH_SPEED_NUM_UNKNOWN)
> + offset = strlcat(str, unknown_str,
> +  len);
> + else
> + offset = strlcat(str, gbits_str, len);
> + break;
> + /* Link status */
> + case 'S':
> + offset = strlcat(str, link->link_status ?
> + up_str : down_str, len);
> + break;
> + /* Link autoneg */
> + case 'A':
> + offset = strlcat(str, link->link_autoneg ?
> + autoneg_str : fixed_str, len);
> + break;
> + /* Link duplex */
> + case 'D':
> + offset = strlcat(str, link->link_duplex ?
> + fdx_str : hdx_str, len);
> + break;
> + /* ignore unknown specifier */
> + default:
> + *str_cur = '%';
> + offset++;
> + fmt_cur--;
> + break;

What do you think about ignoring unknown specifiers and continuing to
process the string, instead of breaking? Just keep the unknown specifier
as it is in the output string.

> +
> + }
> + if (offset > (len - 1))
> + return -1;

For me "offset >= len" is simpler than "offset > (len - 1)", also it prevents 
any possible error when "len == 0" ('len' is unsigned) although I can see you 
have a check for it.
Anyway both do the same thing, up to you.

> +
> + str_cur = str + offset;
> + } else {
> + *str_cur++ = *fmt_cur;
> + offset++;

Why keep both offset and the pointer ('str_cur')? Just offset should be 
enough: "str[offset++] = *fmt_cur;". I think this simplifies things a little bit.
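
To make the suggestions above concrete, here is a reduced sketch that tracks only an offset and checks bounds without computing "len - 1". It is not the proposed rte_eth_link_strf() implementation: link_status_format() is a hypothetical name, only the %S specifier is handled, and unknown specifiers are copied to the output unchanged.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical, simplified expander: "%S" becomes "Up" or "Down",
 * everything else (including unknown specifiers) is copied verbatim.
 * Returns the length written, or -1 if the output did not fit.
 */
static int
link_status_format(char *str, size_t len, const char *fmt, int link_up)
{
	size_t offset = 0;

	if (len == 0)
		return -1;

	for (; *fmt != '\0'; fmt++) {
		const char *sub = NULL;

		if (fmt[0] == '%' && fmt[1] == 'S') {
			sub = link_up ? "Up" : "Down";
			fmt++; /* consume the specifier character */
		}

		if (sub != NULL) {
			size_t n = strlen(sub);

			/* keep one byte free for the terminator */
			if (offset + n >= len)
				goto overflow;
			memcpy(str + offset, sub, n);
			offset += n;
		} else {
			if (offset + 1 >= len)
				goto overflow;
			str[offset++] = *fmt;
		}
	}
	str[offset] = '\0';
	return (int)offset;

overflow:
	/* offset is always below len here, so this write is in bounds */
	str[offset] = '\0';
	return -1;
}
```

Because every write is preceded by an "offset + n >= len" test, the offset never reaches len and the terminator always has room, including on the error path.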

> + }
> + fmt_cur++;
> + }
> + *str_cur = '\0';
> + return offset;
> +}
> +
> +int
> +rte_eth_link_printf(const char *const fmt,
> + const struct rte_eth_link *link)
> +{
> + char text[200];
> + int ret;
> +
> + ret = rte_eth_link_strf(text, 200, fmt, link);
> + if (ret > 0)
> + printf("%s", text);
> + return ret;
> +}
> +
> +int
> +rte_eth_link_strf(char *str, size_t len, const char *const f

[dpdk-dev] [PATCH v2] app/test-eventdev: Fix pipeline atq

2020-07-10 Thread Apeksha Gupta
An if-check is required to verify the all-types queue capability.

Fixes: 6bf570a9911 ("app/eventdev: add pipeline atq test")
Cc: sta...@dpdk.org

Signed-off-by: Apeksha Gupta 
---
 app/test-eventdev/test_pipeline_atq.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/app/test-eventdev/test_pipeline_atq.c 
b/app/test-eventdev/test_pipeline_atq.c
index 8e8686c145..0872b25b53 100644
--- a/app/test-eventdev/test_pipeline_atq.c
+++ b/app/test-eventdev/test_pipeline_atq.c
@@ -495,6 +495,8 @@ pipeline_atq_capability_check(struct evt_options *opt)
evt_nr_active_lcores(opt->wlcores),
dev_info.max_event_ports);
}
+   if (!evt_has_all_types_queue(opt->dev_id))
+   return false;
 
return true;
 }
-- 
2.17.1



Re: [dpdk-dev] [PATCH] net: fix unneeded replacement of 0 by ffff for TCP checksum

2020-07-10 Thread Morten Brørup
> From: Olivier Matz [mailto:olivier.m...@6wind.com]
> Sent: Friday, July 10, 2020 2:41 PM
> 
> On Fri, Jul 10, 2020 at 02:55:51PM +0800, Hongzhi Guo wrote:
> > Per RFC768:
> > If the computed checksum is zero, it is transmitted as all ones.
> > An all zero transmitted checksum value means that the transmitter
> > generated no checksum.
> >
> > RFC793 for TCP has no such special treatment for the checksum of
> zero.
> >
> > Fixes: 6006818cfb26 ("net: new checksum functions")
> > Cc: sta...@dpdk.org
> >
> > Signed-off-by: Hongzhi Guo 
> > ---
> > v2:
> > * Fixed commit title
> > * Fixed the API comment
> > ---
> > ---
> >  lib/librte_net/rte_ip.h | 18 +++---
> >  1 file changed, 15 insertions(+), 3 deletions(-)
> >
> > diff --git a/lib/librte_net/rte_ip.h b/lib/librte_net/rte_ip.h
> > index 292f63fd7..d03c77120 100644
> > --- a/lib/librte_net/rte_ip.h
> > +++ b/lib/librte_net/rte_ip.h
> > @@ -325,7 +325,7 @@ rte_ipv4_phdr_cksum(const struct rte_ipv4_hdr
> *ipv4_hdr, uint64_t ol_flags)
> >   *   The pointer to the beginning of the L4 header.
> >   * @return
> >   *   The complemented checksum to set in the IP packet
> > - *   or 0 on error
> > + *   or 0 if the IP length is invalid in the header.
> >   */
> >  static inline uint16_t
> >  rte_ipv4_udptcp_cksum(const struct rte_ipv4_hdr *ipv4_hdr, const
> void *l4_hdr)

0 is a valid return value, so I suggest omitting it from the return value 
description:

  * @return
- *   The complemented checksum to set in the IP packet
- *   or 0 on error
+ *   The complemented checksum to set in the IP packet.

The comparison "if (l3_len < sizeof(struct rte_ipv4_hdr))" is only there to 
protect against invalid input; it prevents l4_len from becoming negative.

For the same reason, unlikely() should be added to this comparison.

Otherwise,

Acked-by: Morten Brørup 



Re: [dpdk-dev] [PATCH] net: fix unneeded replacement of 0 by ffff for TCP checksum

2020-07-10 Thread Olivier Matz
On Fri, Jul 10, 2020 at 03:10:34PM +0200, Morten Brørup wrote:
> > From: Olivier Matz [mailto:olivier.m...@6wind.com]
> > Sent: Friday, July 10, 2020 2:41 PM
> > 
> > On Fri, Jul 10, 2020 at 02:55:51PM +0800, Hongzhi Guo wrote:
> > > Per RFC768:
> > > If the computed checksum is zero, it is transmitted as all ones.
> > > An all zero transmitted checksum value means that the transmitter
> > > generated no checksum.
> > >
> > > RFC793 for TCP has no such special treatment for the checksum of
> > zero.
> > >
> > > Fixes: 6006818cfb26 ("net: new checksum functions")
> > > Cc: sta...@dpdk.org
> > >
> > > Signed-off-by: Hongzhi Guo 
> > > ---
> > > v2:
> > > * Fixed commit tile
> > > * Fixed the API comment
> > > ---
> > > ---
> > >  lib/librte_net/rte_ip.h | 18 +++---
> > >  1 file changed, 15 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/lib/librte_net/rte_ip.h b/lib/librte_net/rte_ip.h
> > > index 292f63fd7..d03c77120 100644
> > > --- a/lib/librte_net/rte_ip.h
> > > +++ b/lib/librte_net/rte_ip.h
> > > @@ -325,7 +325,7 @@ rte_ipv4_phdr_cksum(const struct rte_ipv4_hdr
> > *ipv4_hdr, uint64_t ol_flags)
> > >   *   The pointer to the beginning of the L4 header.
> > >   * @return
> > >   *   The complemented checksum to set in the IP packet
> > > - *   or 0 on error
> > > + *   or 0 if the IP length is invalid in the header.
> > >   */
> > >  static inline uint16_t
> > >  rte_ipv4_udptcp_cksum(const struct rte_ipv4_hdr *ipv4_hdr, const
> > void *l4_hdr)
> 
> 0 is a valid return value, so I suggest omitting it from the return value 
> description:
> 
>   * @return
> - *   The complemented checksum to set in the IP packet
> - *   or 0 on error
> + *   The complemented checksum to set in the IP packet.
> 
> The comparison "if (l3_len < sizeof(struct rte_ipv4_hdr))" is only there to 
> protect against invalid input; it prevents l4_len from becoming negative.

I don't get why "0 if the IP length is invalid in the header" should
be removed from the comment: 0 is both a valid return value and
the value returned on invalid packet.

> For the same reason, unlikely() should be added to this comparison.

Maybe yes, but that's another story I think.

> Otherwise,
> 
> Acked-by: Morten Brørup 
> 


Re: [dpdk-dev] [PATCH v7 21/25] examples/ntb: new link status print format

2020-07-10 Thread Yigit, Ferruh
> -Original Message-
> From: Ivan Dyukov 
> Sent: Friday, July 10, 2020 8:02 AM
> To: dev@dpdk.org; i.dyu...@samsung.com; v.kurams...@samsung.com;
> tho...@monjalon.net; david.march...@redhat.com; Yigit, Ferruh
> ; arybche...@solarflare.com; Zhao1, Wei
> ; Guo, Jia ; Xing, Beilei
> ; Yang, Qiming ; Lu,
> Wenzhuo ; m...@smartsharesystems.com;
> step...@networkplumber.org; Chautru, Nicolas
> ; Richardson, Bruce
> ; Ananyev, Konstantin
> ; Dumitrescu, Cristian
> ; Nicolau, Radu ;
> akhil.go...@nxp.com; Doherty, Declan ;
> sk...@marvell.com; pbhagavat...@marvell.com; jer...@marvell.com;
> kirankum...@marvell.com; Hunt, David ; Burakov,
> Anatoly ; Li, Xiaoyun ;
> Wu, Jingjing ; Mcnamara, John
> ; Singh, Jasvinder
> ; Marohn, Byron ;
> Wang, Yipeng1 
> Subject: [PATCH v7 21/25] examples/ntb: new link status print format
> 
> Add usage of rte_eth_link_strf function to example applications
> 
> Signed-off-by: Ivan Dyukov 
> ---
>  examples/ntb/ntb_fwd.c | 10 +-
>  1 file changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/examples/ntb/ntb_fwd.c b/examples/ntb/ntb_fwd.c
> index eba8ebf9f..84fe374c4 100644
> --- a/examples/ntb/ntb_fwd.c
> +++ b/examples/ntb/ntb_fwd.c
> @@ -729,6 +729,7 @@ start_pkt_fwd(void)
>   struct rte_eth_link eth_link;
>   uint32_t lcore_id;
>   int ret, i;
> + char link_status_text[60];
> 
>   ret = ntb_fwd_config_setup();
>   if (ret < 0) {
> @@ -747,11 +748,10 @@ start_pkt_fwd(void)
>   return;
>   }
>   if (eth_link.link_status) {
> - printf("Eth%u Link Up. Speed %u Mbps - %s\n",
> - eth_port_id, eth_link.link_speed,
> - (eth_link.link_duplex ==
> -  ETH_LINK_FULL_DUPLEX) ?
> - ("full-duplex") : ("half-duplex"));
> + rte_eth_link_strf(link_status_text, 60, NULL,
> + &link);

s/link/eth_link/

.../examples/ntb/ntb_fwd.c:752:11: error: passing argument 4 of 
‘rte_eth_link_strf’ from incompatible pointer type 
[-Werror=incompatible-pointer-types]
  752 |   &link);
  |   ^
  |   |
  |   int (*)(const char *, const char *)

> + printf("Eth%u %s", eth_port_id,
> +link_status_text);
>   break;
>   }
>   }
> --
> 2.17.1



Re: [dpdk-dev] [PATCH] net: fix unneeded replacement of 0 by ffff for TCP checksum

2020-07-10 Thread Morten Brørup
> From: Olivier Matz [mailto:olivier.m...@6wind.com]
> Sent: Friday, July 10, 2020 3:16 PM
> 
> On Fri, Jul 10, 2020 at 03:10:34PM +0200, Morten Brørup wrote:
> > > From: Olivier Matz [mailto:olivier.m...@6wind.com]
> > > Sent: Friday, July 10, 2020 2:41 PM
> > >
> > > On Fri, Jul 10, 2020 at 02:55:51PM +0800, Hongzhi Guo wrote:
> > > > Per RFC768:
> > > > If the computed checksum is zero, it is transmitted as all ones.
> > > > An all zero transmitted checksum value means that the transmitter
> > > > generated no checksum.
> > > >
> > > > RFC793 for TCP has no such special treatment for the checksum of
> > > zero.
> > > >
> > > > Fixes: 6006818cfb26 ("net: new checksum functions")
> > > > Cc: sta...@dpdk.org
> > > >
> > > > Signed-off-by: Hongzhi Guo 
> > > > ---
> > > > v2:
> > > > * Fixed commit title
> > > > * Fixed the API comment
> > > > ---
> > > > ---
> > > >  lib/librte_net/rte_ip.h | 18 +++---
> > > >  1 file changed, 15 insertions(+), 3 deletions(-)
> > > >
> > > > diff --git a/lib/librte_net/rte_ip.h b/lib/librte_net/rte_ip.h
> > > > index 292f63fd7..d03c77120 100644
> > > > --- a/lib/librte_net/rte_ip.h
> > > > +++ b/lib/librte_net/rte_ip.h
> > > > @@ -325,7 +325,7 @@ rte_ipv4_phdr_cksum(const struct rte_ipv4_hdr
> > > *ipv4_hdr, uint64_t ol_flags)
> > > >   *   The pointer to the beginning of the L4 header.
> > > >   * @return
> > > >   *   The complemented checksum to set in the IP packet
> > > > - *   or 0 on error
> > > > + *   or 0 if the IP length is invalid in the header.
> > > >   */
> > > >  static inline uint16_t
> > > >  rte_ipv4_udptcp_cksum(const struct rte_ipv4_hdr *ipv4_hdr, const
> > > void *l4_hdr)
> >
> > 0 is a valid return value, so I suggest omitting it from the return
> value description:
> >
> >   * @return
> > - *   The complemented checksum to set in the IP packet
> > - *   or 0 on error
> > + *   The complemented checksum to set in the IP packet.
> >
> > The comparison "if (l3_len < sizeof(struct rte_ipv4_hdr))" is only
> there to protect against invalid input; it prevents l4_len from
> becoming negative.
> 
> I don't get why "0 if the IP length is invalid in the header" should
> be removed from the comment: 0 is both a valid return value and
> the value returned on invalid packet.

To avoid confusion. We do not want people to add error handling for a return 
value of 0.

0 is not a special value or an error, so it does not deserve explicit 
mentioning.

If we want to mention the return value for garbage input, we should not use the 
wording "or 0", because this suggests that 0 is not a normal return value.
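
To make the point concrete, here is a minimal sketch in plain C. It is not the
rte_ip.h code: the helper names are made up for illustration, and the
pseudo-header is left out for brevity. It only shows that a one's-complement
checksum can legitimately come out as 0, and that only UDP (RFC768) replaces
that 0 with 0xffff on the wire.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: one's-complement checksum over big-endian 16-bit words. */
static uint16_t
ones_complement_cksum(const uint8_t *buf, size_t len)
{
	uint32_t sum = 0;
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		sum += ((uint32_t)buf[i] << 8) | buf[i + 1];
	if (len & 1)
		sum += (uint32_t)buf[len - 1] << 8; /* pad the odd trailing byte */

	/* fold the carries back into the low 16 bits */
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);

	return (uint16_t)~sum;
}

/* TCP: a computed 0 is a normal value and is sent as is. */
uint16_t
tcp_wire_cksum(const uint8_t *buf, size_t len)
{
	return ones_complement_cksum(buf, len);
}

/* UDP: 0 on the wire means "no checksum", so a computed 0 is sent as 0xffff. */
uint16_t
udp_wire_cksum(const uint8_t *buf, size_t len)
{
	uint16_t cksum = ones_complement_cksum(buf, len);

	return cksum == 0 ? 0xffff : cksum;
}
```

With the two-byte input { 0xff, 0xff } the computed checksum is 0, which is a
perfectly valid TCP result. A caller that treats 0 as an error would reject it.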

> 
> > For the same reason, unlikely() should be added to this comparison.
> 
> Maybe yes, but that's another story I think.

Agree. I was just mentioning it so it can be done when modifying the function 
anyway.

> 
> > Otherwise,
> >
> > Acked-by: Morten Brørup 
> >



Re: [dpdk-dev] [PATCH] pci: keep API compatibility with mmap values

2020-07-10 Thread David Marchand
On Fri, Jul 10, 2020 at 1:53 PM Thomas Monjalon  wrote:
>
> The function pci_map_resource() returns MAP_FAILED in case of error.
> When replacing the call to mmap() by rte_mem_map(),
> the error code became NULL, breaking the API.
> This function is probably not used outside of DPDK,
> but it is still a problem for two reasons:
> - the deprecation process was not followed
> - the Linux function pci_vfio_mmap_bar() is broken for i40e
>
> The error code is reverted to the Unix value MAP_FAILED.
> Windows needs to define this special value (-1 as in Unix).
> After proper deprecation process, the API could be changed again
> if really needed.
>
> Because of the switch from mmap() to rte_mem_map(),
> another part of the API was changed: "int additional_flags"
> are defined as "additional flags for the mapping range"
> without mentioning it was directly used in mmap().
> Currently it is directly used in rte_mem_map(),
> that's why the values rte_map_flags must be mapped (sic) on the mmap ones
> in case of Unix OS.
>
> These are side effects of a badly defined API using Unix values.
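
As a minimal illustration of the NULL versus MAP_FAILED distinction described
above (a sketch using plain mmap(), not the DPDK code; the helper names are
hypothetical): a failed mmap() returns MAP_FAILED, which is not NULL, so a NULL
check silently misses the error.

```c
#include <stdbool.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Hypothetical helper: request a file-backed mapping on an invalid fd,
 * which makes mmap() fail (EBADF) and return MAP_FAILED.
 */
void *
map_invalid_fd(void)
{
	return mmap(NULL, 4096, PROT_READ, MAP_SHARED, -1, 0);
}

/* The check used after the switch to rte_mem_map(): misses an mmap() failure. */
bool
failed_by_null_check(const void *addr)
{
	return addr == NULL;
}

/* The Unix convention this patch restores for pci_map_resource(). */
bool
failed_by_map_failed_check(const void *addr)
{
	return addr == MAP_FAILED;
}
```

For the address returned by map_invalid_fd(), only the MAP_FAILED comparison
reports the failure.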
>
> Bugzilla ID: 503
> Fixes: 2fd3567e5425 ("pci: use OS generic memory mapping functions")
> Cc: tal...@mellanox.com
>
> Reported-by: David Marchand 
> Signed-off-by: Thomas Monjalon 
> ---
>  drivers/bus/pci/bsd/pci.c | 2 +-
>  drivers/bus/pci/linux/pci_uio.c   | 2 +-
>  drivers/bus/pci/linux/pci_vfio.c  | 4 ++--
>  drivers/bus/pci/pci_common_uio.c  | 2 +-
>  lib/librte_eal/include/rte_eal_paging.h   | 8 
>  lib/librte_eal/windows/include/sys/mman.h | 9 +
>  lib/librte_pci/rte_pci.c  | 1 +
>  lib/librte_pci/rte_pci.h  | 2 +-
>  8 files changed, 24 insertions(+), 6 deletions(-)
>  create mode 100644 lib/librte_eal/windows/include/sys/mman.h
>
> diff --git a/drivers/bus/pci/bsd/pci.c b/drivers/bus/pci/bsd/pci.c
> index 8bc473eb9a..6ec27b4b5b 100644
> --- a/drivers/bus/pci/bsd/pci.c
> +++ b/drivers/bus/pci/bsd/pci.c
> @@ -192,7 +192,7 @@ pci_uio_map_resource_by_index(struct rte_pci_device *dev, 
> int res_idx,
> mapaddr = pci_map_resource(NULL, fd, (off_t)offset,
> (size_t)dev->mem_resource[res_idx].len, 0);
> close(fd);
> -   if (mapaddr == NULL)
> +   if (mapaddr == MAP_FAILED)
> goto error;
>
> maps[map_idx].phaddr = dev->mem_resource[res_idx].phys_addr;
> diff --git a/drivers/bus/pci/linux/pci_uio.c b/drivers/bus/pci/linux/pci_uio.c
> index b622001539..097dc19225 100644
> --- a/drivers/bus/pci/linux/pci_uio.c
> +++ b/drivers/bus/pci/linux/pci_uio.c
> @@ -345,7 +345,7 @@ pci_uio_map_resource_by_index(struct rte_pci_device *dev, 
> int res_idx,
> mapaddr = pci_map_resource(pci_map_addr, fd, 0,
> (size_t)dev->mem_resource[res_idx].len, 0);
> close(fd);
> -   if (mapaddr == NULL)
> +   if (mapaddr == MAP_FAILED)
> goto error;
>
> pci_map_addr = RTE_PTR_ADD(mapaddr,
> diff --git a/drivers/bus/pci/linux/pci_vfio.c 
> b/drivers/bus/pci/linux/pci_vfio.c
> index fdeb9a8caf..07e072e13f 100644
> --- a/drivers/bus/pci/linux/pci_vfio.c
> +++ b/drivers/bus/pci/linux/pci_vfio.c
> @@ -566,7 +566,7 @@ pci_vfio_mmap_bar(int vfio_dev_fd, struct 
> mapped_pci_resource *vfio_res,
> }
>
> /* if there's a second part, try to map it */
> -   if (map_addr != NULL
> +   if (map_addr != MAP_FAILED
> && memreg[1].offset && memreg[1].size) {
> void *second_addr = RTE_PTR_ADD(bar_addr,
> (uintptr_t)(memreg[1].offset -
> @@ -578,7 +578,7 @@ pci_vfio_mmap_bar(int vfio_dev_fd, struct 
> mapped_pci_resource *vfio_res,
> 
> RTE_MAP_FORCE_ADDRESS);
> }
>
> -   if (map_addr == NULL) {
> +   if (map_addr == NULL || map_addr == MAP_FAILED) {
> munmap(bar_addr, bar->size);
> bar_addr = MAP_FAILED;
> RTE_LOG(ERR, EAL, "Failed to map pci BAR%d\n",
> diff --git a/drivers/bus/pci/pci_common_uio.c 
> b/drivers/bus/pci/pci_common_uio.c
> index 793dfd0a7c..f4dca9da91 100644
> --- a/drivers/bus/pci/pci_common_uio.c
> +++ b/drivers/bus/pci/pci_common_uio.c
> @@ -58,7 +58,7 @@ pci_uio_map_secondary(struct rte_pci_device *dev)
> "Cannot mmap device resource file %s 
> to address: %p\n",
> uio_res->maps[i].path,
> uio_res->maps[i].addr);
> -   if (mapaddr != NULL) {
> +   if (mapaddr != MAP_FAILED) {
> /* unmap addrs correctly mapped */
> for (j = 0; j < i; j++)
>   

Re: [dpdk-dev] [PATCH 0/9] python2 deprecation notice

2020-07-10 Thread Robin Jarry
Hi Louise,

2020-07-10, Louise Kilheeney:
> This patchset adds deprecation notices to python scripts,
> warning of the removal of python2 support from the DPDK 20.11 release.

While warning users that Python 2 support will be dropped in 20.11 is
good, it seems that the shebangs in a lot of these scripts still refer
to "python".

dpdk$ git describe 
v20.05-623-geff30b59cc2e
dpdk$ git grep '#.*!.*python\>'
app/test-bbdev/test-bbdev.py:1:#!/usr/bin/env python
app/test-cmdline/cmdline_test.py:1:#!/usr/bin/env python
app/test/autotest.py:1:#!/usr/bin/env python
buildtools/map_to_win.py:1:#!/usr/bin/env python
config/arm/armv8_machine.py:1:#!/usr/bin/python
devtools/update_version_map_abi.py:1:#!/usr/bin/env python
usertools/cpu_layout.py:1:#!/usr/bin/env python
usertools/dpdk-devbind.py:1:#! /usr/bin/env python
usertools/dpdk-pmdinfo.py:1:#!/usr/bin/env python
usertools/dpdk-telemetry-client.py:1:#! /usr/bin/env python

On many distros, "python" still points (as of today) to python2. Your
series will cause warnings that cannot be avoided.

Also, on some distros, "python" does not exist at all (RHEL 8 and CentOS
8 for example); only "python2" or "python3" are available.

I wonder if it would not be better to find a way to make these shebangs
"dynamic" somehow. It is not trivial and I don't see any other solution
than plain modification of the shebangs at build time.

However, there is no way (to my knowledge) to specify which version of
python is "selected" during the build.

Does anyone have a proper solution?

-- 
Robin


Re: [dpdk-dev] [PATCH v3 3/3] lib/vhost: restrict pointer aliasing for packed vpmd

2020-07-10 Thread Adrian Moreno



On 7/10/20 4:38 AM, Joyce Kong wrote:
> Restrict pointer aliasing to allow the compiler to vectorize loop
> more aggressively.
> 
> With this patch, a 9.6% improvement is observed in throughput for
> the packed virtio-net PVP case, and a 2.8% improvement in throughput
> for the packed virtio-user PVP case. All performance data are measured
> on ThunderX-2 platform under 0.001% acceptable packet loss with 1 core
> on both vhost and virtio side.
> 
> Signed-off-by: Joyce Kong 
> Reviewed-by: Phil Yang 
> ---
>  drivers/net/virtio/virtio_rxtx_simple_neon.c |  5 +++--
>  lib/librte_vhost/virtio_net.c| 14 +++---
>  2 files changed, 10 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/net/virtio/virtio_rxtx_simple_neon.c 
> b/drivers/net/virtio/virtio_rxtx_simple_neon.c
> index a9b649814..02520fda8 100644
> --- a/drivers/net/virtio/virtio_rxtx_simple_neon.c
> +++ b/drivers/net/virtio/virtio_rxtx_simple_neon.c
> @@ -36,8 +36,9 @@
>   * - nb_pkts < RTE_VIRTIO_DESC_PER_LOOP, just return no packet
>   */
>  uint16_t
> -virtio_recv_pkts_vec(void *rx_queue, struct rte_mbuf
> - **__rte_restrict rx_pkts, uint16_t nb_pkts)
> +virtio_recv_pkts_vec(void *rx_queue,
> + struct rte_mbuf **__rte_restrict rx_pkts,
> + uint16_t nb_pkts)
>  {
>   struct virtnet_rx *rxvq = rx_queue;
>   struct virtqueue *vq = rxvq->vq;
> diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost/virtio_net.c
> index 236498f71..1d0be3dd4 100644
> --- a/lib/librte_vhost/virtio_net.c
> +++ b/lib/librte_vhost/virtio_net.c
> @@ -1353,8 +1353,8 @@ virtio_dev_rx_single_packed(struct virtio_net *dev,
>  
>  static __rte_noinline uint32_t
>  virtio_dev_rx_packed(struct virtio_net *dev,
> -  struct vhost_virtqueue *vq,
> -  struct rte_mbuf **pkts,
> +  struct vhost_virtqueue *__rte_restrict vq,
> +  struct rte_mbuf **__rte_restrict pkts,
>uint32_t count)
>  {
>   uint32_t pkt_idx = 0;
> @@ -1439,7 +1439,7 @@ virtio_dev_rx(struct virtio_net *dev, uint16_t queue_id,
>  
>  uint16_t
>  rte_vhost_enqueue_burst(int vid, uint16_t queue_id,
> - struct rte_mbuf **pkts, uint16_t count)
> + struct rte_mbuf **__rte_restrict pkts, uint16_t count)
>  {
>   struct virtio_net *dev = get_device(vid);
>  
> @@ -2671,9 +2671,9 @@ free_zmbuf(struct vhost_virtqueue *vq)
>  
>  static __rte_noinline uint16_t
>  virtio_dev_tx_packed_zmbuf(struct virtio_net *dev,
> -struct vhost_virtqueue *vq,
> +struct vhost_virtqueue *__rte_restrict vq,
>  struct rte_mempool *mbuf_pool,
> -struct rte_mbuf **pkts,
> +struct rte_mbuf **__rte_restrict pkts,
>  uint32_t count)
>  {
>   uint32_t pkt_idx = 0;
> @@ -2707,9 +2707,9 @@ virtio_dev_tx_packed_zmbuf(struct virtio_net *dev,
>  
>  static __rte_noinline uint16_t
>  virtio_dev_tx_packed(struct virtio_net *dev,
> -  struct vhost_virtqueue *vq,
> +  struct vhost_virtqueue *__rte_restrict vq,
>struct rte_mempool *mbuf_pool,
> -  struct rte_mbuf **pkts,
> +  struct rte_mbuf **__rte_restrict pkts,
>uint32_t count)
>  {
>   uint32_t pkt_idx = 0;
> 

The vhost part looks good to me.
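
For readers less familiar with the qualifier, the idea behind the change can be
sketched as follows. This is a standalone illustration, not the vhost code: the
function names are made up, and it assumes __rte_restrict behaves like the C99
restrict qualifier used here. Whether a given compiler actually vectorizes the
second loop more aggressively depends on the compiler and flags.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Without restrict the compiler must assume dst and src may overlap,
 * so a store to dst[i] could change a later src[j].
 */
void
add_offset_plain(uint32_t *dst, const uint32_t *src, size_t n, uint32_t off)
{
	size_t i;

	for (i = 0; i < n; i++)
		dst[i] = src[i] + off;
}

/*
 * With restrict the caller promises that dst and src do not alias,
 * which leaves the compiler free to load and store several elements at once.
 */
void
add_offset_restrict(uint32_t *restrict dst, const uint32_t *restrict src,
		size_t n, uint32_t off)
{
	size_t i;

	for (i = 0; i < n; i++)
		dst[i] = src[i] + off;
}
```

Both versions compute the same result for non-overlapping buffers. Passing
overlapping buffers to the restrict version is undefined behaviour, so the
promise has to hold at every call site.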

Acked-by: Adrián Moreno 

-- 
Adrián Moreno



Re: [dpdk-dev] [PATCH] net: fix unneeded replacement of 0 by ffff for TCP checksum

2020-07-10 Thread Olivier Matz
On Fri, Jul 10, 2020 at 03:29:36PM +0200, Morten Brørup wrote:
> > From: Olivier Matz [mailto:olivier.m...@6wind.com]
> > Sent: Friday, July 10, 2020 3:16 PM
> > 
> > On Fri, Jul 10, 2020 at 03:10:34PM +0200, Morten Brørup wrote:
> > > > From: Olivier Matz [mailto:olivier.m...@6wind.com]
> > > > Sent: Friday, July 10, 2020 2:41 PM
> > > >
> > > > On Fri, Jul 10, 2020 at 02:55:51PM +0800, Hongzhi Guo wrote:
> > > > > Per RFC768:
> > > > > If the computed checksum is zero, it is transmitted as all ones.
> > > > > An all zero transmitted checksum value means that the transmitter
> > > > > generated no checksum.
> > > > >
> > > > > RFC793 for TCP has no such special treatment for the checksum of
> > > > zero.
> > > > >
> > > > > Fixes: 6006818cfb26 ("net: new checksum functions")
> > > > > Cc: sta...@dpdk.org
> > > > >
> > > > > Signed-off-by: Hongzhi Guo 
> > > > > ---
> > > > > v2:
> > > > > * Fixed commit title
> > > > > * Fixed the API comment
> > > > > ---
> > > > > ---
> > > > >  lib/librte_net/rte_ip.h | 18 +++---
> > > > >  1 file changed, 15 insertions(+), 3 deletions(-)
> > > > >
> > > > > diff --git a/lib/librte_net/rte_ip.h b/lib/librte_net/rte_ip.h
> > > > > index 292f63fd7..d03c77120 100644
> > > > > --- a/lib/librte_net/rte_ip.h
> > > > > +++ b/lib/librte_net/rte_ip.h
> > > > > @@ -325,7 +325,7 @@ rte_ipv4_phdr_cksum(const struct rte_ipv4_hdr
> > > > *ipv4_hdr, uint64_t ol_flags)
> > > > >   *   The pointer to the beginning of the L4 header.
> > > > >   * @return
> > > > >   *   The complemented checksum to set in the IP packet
> > > > > - *   or 0 on error
> > > > > + *   or 0 if the IP length is invalid in the header.
> > > > >   */
> > > > >  static inline uint16_t
> > > > >  rte_ipv4_udptcp_cksum(const struct rte_ipv4_hdr *ipv4_hdr, const
> > > > void *l4_hdr)
> > >
> > > 0 is a valid return value, so I suggest omitting it from the return
> > value description:
> > >
> > >   * @return
> > > - *   The complemented checksum to set in the IP packet
> > > - *   or 0 on error
> > > + *   The complemented checksum to set in the IP packet.
> > >
> > > The comparison "if (l3_len < sizeof(struct rte_ipv4_hdr))" is only
> > there to protect against invalid input; it prevents l4_len from
> > becoming negative.
> > 
> > I don't get why "0 if the IP length is invalid in the header" should
> > be removed from the comment: 0 is both a valid return value and
> > the value returned on invalid packet.
> 
> To avoid confusion. We do not want people to add error handling for a return 
> value of 0.
> 
> 0 is not a special value or an error, so it does not deserve explicit 
> mentioning.
> 
> If we want to mention the return value for garbage input, we should not use 
> the wording "or 0", because this suggests that 0 is not a normal return value.

Ok, got it.

So maybe this?

 The complemented checksum to set in the IP packet. If
 the IP length is invalid in the header, it returns 0.
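
A rough sketch of how that wording sits on top of the length guard discussed
above. This is a hypothetical helper, not rte_ipv4_udptcp_cksum(): the names
and the hint_unlikely() macro are stand-ins, and it returns the L4 length
rather than a checksum to keep the example short.

```c
#include <stdint.h>

/* Local stand-in for a branch hint: GCC/Clang builtin, plain test elsewhere. */
#if defined(__GNUC__)
#define hint_unlikely(x) __builtin_expect(!!(x), 0)
#else
#define hint_unlikely(x) (x)
#endif

#define IPV4_HDR_MIN_LEN 20u /* IPv4 header without options */

/**
 * Hypothetical helper, for illustration only.
 *
 * @param ip_total_len
 *   The total length taken from the IPv4 header, in host byte order.
 * @return
 *   The L4 length. If the IP length is invalid in the header,
 *   it returns 0.
 */
static inline uint16_t
l4_len_from_ip_len(uint16_t ip_total_len)
{
	/* guard against garbage input so the subtraction cannot wrap around */
	if (hint_unlikely(ip_total_len < IPV4_HDR_MIN_LEN))
		return 0;

	return (uint16_t)(ip_total_len - IPV4_HDR_MIN_LEN);
}
```

Here a total length of 20 (valid, empty payload) and a total length of 10
(invalid) both give 0, which is why the comment describes 0 as a possible
result rather than as an error code.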


> 
> > 
> > > For the same reason, unlikely() should be added to this comparison.
> > 
> > Maybe yes, but that's another story I think.
> 
> Agree. I was just mentioning it so it can be done when modifying the function 
> anyway.
> 
> > 
> > > Otherwise,
> > >
> > > Acked-by: Morten Brørup 
> > >
> 

