RE: [PATCH v2 0/3] net/ice: simplified to 3 layer Tx scheduler

2024-01-05 Thread Wu, Wenjun1
> -----Original Message-----
> From: Zhang, Qi Z 
> Sent: Friday, January 5, 2024 10:11 PM
> To: Yang, Qiming ; Wu, Wenjun1
> 
> Cc: dev@dpdk.org; Zhang, Qi Z 
> Subject: [PATCH v2 0/3] net/ice: simplified to 3 layer Tx scheduler
> 
> Remove dummy layers, code refactor, complete document
> 
> Qi Zhang (3):
>   net/ice: hide port and TC layer in Tx sched tree
>   net/ice: refactor tm config data structure
>   doc: update ice document for qos
> 
> v2:
> - fix typos.
> 
>  doc/guides/nics/ice.rst  |  19 +++
>  drivers/net/ice/ice_ethdev.h |  12 +-
>  drivers/net/ice/ice_tm.c | 285 +++
>  3 files changed, 112 insertions(+), 204 deletions(-)
> 
> --
> 2.31.1

Acked-by: Wenjun Wu 


[PATCH v2 0/2] net/cpfl: support flow offloading for P4

2024-01-05 Thread wenjing . qiao
From: Wenjing Qiao 

Enable TDI flow engine which can program hardware offloading rules
for a P4 programmable network controller.

Wenjing Qiao (2):
  net/cpfl: parse flow offloading hint from P4 context file
  net/cpfl: add TDI to flow engine

v2:
- fix typos.
- parse vsi id for key.

 doc/guides/nics/cpfl.rst|   10 +
 doc/guides/nics/features/cpfl.ini   |1 +
 drivers/net/cpfl/cpfl_ethdev.h  |   17 +-
 drivers/net/cpfl/cpfl_flow.c|   13 +-
 drivers/net/cpfl/cpfl_flow.h|1 +
 drivers/net/cpfl/cpfl_flow_engine_fxp.c |   21 +-
 drivers/net/cpfl/cpfl_flow_parser.c |   68 +-
 drivers/net/cpfl/cpfl_flow_parser.h |2 +-
 drivers/net/cpfl/cpfl_fxp_rule.h|   12 +
 drivers/net/cpfl/cpfl_tdi.c | 1282 +
 drivers/net/cpfl/cpfl_tdi.h |  123 ++
 drivers/net/cpfl/cpfl_tdi_parser.c  | 1721 +++
 drivers/net/cpfl/cpfl_tdi_parser.h  |  294 
 drivers/net/cpfl/meson.build|2 +
 14 files changed, 3533 insertions(+), 34 deletions(-)
 create mode 100644 drivers/net/cpfl/cpfl_tdi.c
 create mode 100644 drivers/net/cpfl/cpfl_tdi.h
 create mode 100644 drivers/net/cpfl/cpfl_tdi_parser.c
 create mode 100644 drivers/net/cpfl/cpfl_tdi_parser.h

-- 
2.34.1



[PATCH v2 1/2] net/cpfl: parse flow offloading hint from P4 context file

2024-01-05 Thread wenjing . qiao
From: Wenjing Qiao 

To support a P4-programmed network controller, reuse the devargs
"flow_parser" to specify the path of a P4 context JSON configuration
file. The cpfl PMD uses the JSON configuration file to translate
rte_flow tokens into a low-level hardware representation.

Note: the P4 context JSON file is generated by the P4 compiler
and is intended to work exclusively with a specific P4 pipeline
configuration, which must be compiled and programmed into the hardware.

Signed-off-by: Wenjing Qiao 
---
 drivers/net/cpfl/cpfl_ethdev.h  |9 +-
 drivers/net/cpfl/cpfl_flow.c|   10 +-
 drivers/net/cpfl/cpfl_flow_engine_fxp.c |9 +-
 drivers/net/cpfl/cpfl_flow_parser.c |   60 +-
 drivers/net/cpfl/cpfl_flow_parser.h |2 +-
 drivers/net/cpfl/cpfl_tdi_parser.c  | 1721 +++
 drivers/net/cpfl/cpfl_tdi_parser.h  |  294 
 drivers/net/cpfl/meson.build|1 +
 8 files changed, 2084 insertions(+), 22 deletions(-)
 create mode 100644 drivers/net/cpfl/cpfl_tdi_parser.c
 create mode 100644 drivers/net/cpfl/cpfl_tdi_parser.h

diff --git a/drivers/net/cpfl/cpfl_ethdev.h b/drivers/net/cpfl/cpfl_ethdev.h
index 457db6d6be..e580f80f2f 100644
--- a/drivers/net/cpfl/cpfl_ethdev.h
+++ b/drivers/net/cpfl/cpfl_ethdev.h
@@ -185,6 +185,12 @@ struct cpfl_repr {
bool func_up; /* If the represented function is up */
 };
 
+struct cpfl_flow_parser {
+   struct cpfl_flow_js_parser *fixed_parser;
+   struct cpfl_tdi_program *p4_parser;
+   bool is_p4_parser;
+};
+
 struct cpfl_metadata_chunk {
int type;
uint8_t data[CPFL_META_CHUNK_LENGTH];
@@ -218,8 +224,7 @@ struct cpfl_adapter_ext {
 
rte_spinlock_t repr_lock;
struct rte_hash *repr_allowlist_hash;
-
-   struct cpfl_flow_js_parser *flow_parser;
+   struct cpfl_flow_parser flow_parser;
struct rte_bitmap *mod_bm;
void *mod_bm_mem;
 
diff --git a/drivers/net/cpfl/cpfl_flow.c b/drivers/net/cpfl/cpfl_flow.c
index 3ba6c0f0e7..1c4131da2c 100644
--- a/drivers/net/cpfl/cpfl_flow.c
+++ b/drivers/net/cpfl/cpfl_flow.c
@@ -6,6 +6,7 @@
 
 #include "cpfl_flow.h"
 #include "cpfl_flow_parser.h"
+#include "cpfl_tdi_parser.h"
 
 TAILQ_HEAD(cpfl_flow_engine_list, cpfl_flow_engine);
 
@@ -331,9 +332,14 @@ cpfl_flow_init(struct cpfl_adapter_ext *ad, struct 
cpfl_devargs *devargs)
 void
 cpfl_flow_uninit(struct cpfl_adapter_ext *ad)
 {
-   if (ad->flow_parser == NULL)
+   if (ad->flow_parser.fixed_parser == NULL && ad->flow_parser.p4_parser 
== NULL)
return;
 
-   cpfl_parser_destroy(ad->flow_parser);
+   if (ad->flow_parser.fixed_parser)
+   cpfl_parser_destroy(ad->flow_parser.fixed_parser);
+
+   if (ad->flow_parser.p4_parser)
+   cpfl_tdi_program_destroy(ad->flow_parser.p4_parser);
+
cpfl_flow_engine_uninit(ad);
 }
diff --git a/drivers/net/cpfl/cpfl_flow_engine_fxp.c 
b/drivers/net/cpfl/cpfl_flow_engine_fxp.c
index 8a4e1419b4..f269ff97e1 100644
--- a/drivers/net/cpfl/cpfl_flow_engine_fxp.c
+++ b/drivers/net/cpfl/cpfl_flow_engine_fxp.c
@@ -503,20 +503,25 @@ cpfl_fxp_parse_pattern_action(struct rte_eth_dev *dev,
struct cpfl_rule_info_meta *rim;
int ret;
 
+   if (adapter->flow_parser.is_p4_parser)
+   return -EINVAL;
+
ret = cpfl_fxp_get_metadata_port(itf, actions);
if (!ret) {
PMD_DRV_LOG(ERR, "Fail to save metadata.");
return -EINVAL;
}
 
-   ret = cpfl_flow_parse_items(itf, adapter->flow_parser, pattern, attr, 
&pr_action);
+   ret = cpfl_flow_parse_items(itf, adapter->flow_parser.fixed_parser, 
pattern, attr,
+   &pr_action);
if (ret) {
PMD_DRV_LOG(ERR, "No Match pattern support.");
return -EINVAL;
}
 
if (cpfl_is_mod_action(actions)) {
-   ret = cpfl_flow_parse_actions(adapter->flow_parser, actions, 
mr_action);
+   ret = cpfl_flow_parse_actions(adapter->flow_parser.fixed_parser,
+ actions, mr_action);
if (ret) {
PMD_DRV_LOG(ERR, "action parse fails.");
return -EINVAL;
diff --git a/drivers/net/cpfl/cpfl_flow_parser.c 
b/drivers/net/cpfl/cpfl_flow_parser.c
index a8f0488f21..e7f8a8a6cc 100644
--- a/drivers/net/cpfl/cpfl_flow_parser.c
+++ b/drivers/net/cpfl/cpfl_flow_parser.c
@@ -5,6 +5,7 @@
 #include 
 
 #include "cpfl_flow_parser.h"
+#include "cpfl_tdi_parser.h"
 
 static enum rte_flow_item_type
 cpfl_get_item_type_by_str(const char *type)
@@ -938,36 +939,65 @@ cpfl_parser_init(json_t *ob_root, struct 
cpfl_flow_js_parser *parser)
return 0;
 }
 
+static int
+cpfl_check_is_p4_mode(json_t *ob_root)
+{
+   return json_object_get(ob_root, "patterns") ? false : true;
+}
+
 int
-cpfl_parser_create(struct cpfl_flow_js_parser **flow_parser, const char 
*filename)
+cpfl_parser_crea

[PATCH v2 2/2] net/cpfl: add TDI to flow engine

2024-01-05 Thread wenjing . qiao
From: Wenjing Qiao 

Add TDI implementation to a flow engine.

Signed-off-by: Wenjing Qiao 
---
 doc/guides/nics/cpfl.rst|   10 +
 doc/guides/nics/features/cpfl.ini   |1 +
 drivers/net/cpfl/cpfl_ethdev.h  |8 +
 drivers/net/cpfl/cpfl_flow.c|5 +-
 drivers/net/cpfl/cpfl_flow.h|1 +
 drivers/net/cpfl/cpfl_flow_engine_fxp.c |   12 -
 drivers/net/cpfl/cpfl_flow_parser.c |8 +
 drivers/net/cpfl/cpfl_fxp_rule.h|   12 +
 drivers/net/cpfl/cpfl_tdi.c | 1282 +++
 drivers/net/cpfl/cpfl_tdi.h |  123 +++
 drivers/net/cpfl/meson.build|1 +
 11 files changed, 1450 insertions(+), 13 deletions(-)
 create mode 100644 drivers/net/cpfl/cpfl_tdi.c
 create mode 100644 drivers/net/cpfl/cpfl_tdi.h

diff --git a/doc/guides/nics/cpfl.rst b/doc/guides/nics/cpfl.rst
index 9b7a99c894..591bd496e6 100644
--- a/doc/guides/nics/cpfl.rst
+++ b/doc/guides/nics/cpfl.rst
@@ -213,6 +213,16 @@ low level hardware resources.
   flow create X ingress group M pattern eth dst is 00:01:00:00:03:14 / 
ipv4 src is 192.168.0.1 \
   dst is 192.168.0.2 / tcp / end actions port_representor port_id Y / end
 
+#. Create one flow for TDI engine to forward ETH-IPV4-TCP from I/O port to a 
local(CPF's) vport. Flow should
+   be created on vport X. Group M should be table id. Prog name N should be 
action id. Prog arguments
+   port_representor Y means forward packet to local vport Y::
+
+   .. code-block:: console
+
+  flow create X ingress group M pattern prog key is 0x00 / prog key is 
0x00010314 / prog key
+  is 0x001122334455 / prog key is 0xC0A80001 / prog key is 0xC0A80002 / 
prog key is 0x1451 / prog key
+  is 0x157C / end actions prog name N arguments port_representor Y  end / 
end
+
 #. Send a matched packet, and it should be displayed on PMD::
 
.. code-block:: console
diff --git a/doc/guides/nics/features/cpfl.ini 
b/doc/guides/nics/features/cpfl.ini
index 4eadaca6e7..85b8011a54 100644
--- a/doc/guides/nics/features/cpfl.ini
+++ b/doc/guides/nics/features/cpfl.ini
@@ -33,6 +33,7 @@ tcp  = Y
 udp  = Y
 vlan = Y
 vxlan= Y
+flex = Y
 
 [rte_flow actions]
 count= Y
diff --git a/drivers/net/cpfl/cpfl_ethdev.h b/drivers/net/cpfl/cpfl_ethdev.h
index e580f80f2f..7dfa4a0183 100644
--- a/drivers/net/cpfl/cpfl_ethdev.h
+++ b/drivers/net/cpfl/cpfl_ethdev.h
@@ -185,10 +185,18 @@ struct cpfl_repr {
bool func_up; /* If the represented function is up */
 };
 
+struct cpfl_tdi_table_node;
+TAILQ_HEAD(cpfl_tdi_table_list, cpfl_tdi_table_node);
+
+struct cpfl_tdi_action_node;
+TAILQ_HEAD(cpfl_tdi_action_list, cpfl_tdi_action_node);
+
 struct cpfl_flow_parser {
struct cpfl_flow_js_parser *fixed_parser;
struct cpfl_tdi_program *p4_parser;
bool is_p4_parser;
+   struct cpfl_tdi_table_list tdi_table_list;
+   struct cpfl_tdi_action_list tdi_action_list;
 };
 
 struct cpfl_metadata_chunk {
diff --git a/drivers/net/cpfl/cpfl_flow.c b/drivers/net/cpfl/cpfl_flow.c
index 1c4131da2c..15c7cc6d8b 100644
--- a/drivers/net/cpfl/cpfl_flow.c
+++ b/drivers/net/cpfl/cpfl_flow.c
@@ -6,6 +6,7 @@
 
 #include "cpfl_flow.h"
 #include "cpfl_flow_parser.h"
+#include "cpfl_tdi.h"
 #include "cpfl_tdi_parser.h"
 
 TAILQ_HEAD(cpfl_flow_engine_list, cpfl_flow_engine);
@@ -338,8 +339,10 @@ cpfl_flow_uninit(struct cpfl_adapter_ext *ad)
if (ad->flow_parser.fixed_parser)
cpfl_parser_destroy(ad->flow_parser.fixed_parser);
 
-   if (ad->flow_parser.p4_parser)
+   if (ad->flow_parser.p4_parser) {
+   cpfl_tdi_free_table_list(&ad->flow_parser);
cpfl_tdi_program_destroy(ad->flow_parser.p4_parser);
+   }
 
cpfl_flow_engine_uninit(ad);
 }
diff --git a/drivers/net/cpfl/cpfl_flow.h b/drivers/net/cpfl/cpfl_flow.h
index 1bde847763..1de9c25b17 100644
--- a/drivers/net/cpfl/cpfl_flow.h
+++ b/drivers/net/cpfl/cpfl_flow.h
@@ -15,6 +15,7 @@ extern const struct rte_flow_ops cpfl_flow_ops;
 enum cpfl_flow_engine_type {
CPFL_FLOW_ENGINE_NONE = 0,
CPFL_FLOW_ENGINE_FXP,
+   CPFL_FLOW_ENGINE_TDI,
 };
 
 typedef int (*engine_init_t)(struct cpfl_adapter_ext *ad);
diff --git a/drivers/net/cpfl/cpfl_flow_engine_fxp.c 
b/drivers/net/cpfl/cpfl_flow_engine_fxp.c
index f269ff97e1..6a5e7ed770 100644
--- a/drivers/net/cpfl/cpfl_flow_engine_fxp.c
+++ b/drivers/net/cpfl/cpfl_flow_engine_fxp.c
@@ -27,23 +27,11 @@
 #include "cpfl_fxp_rule.h"
 #include "cpfl_flow_parser.h"
 
-#define CPFL_COOKIE_DEF0x1000
-#define CPFL_MOD_COOKIE_DEF0x1237561
 #define CPFL_PREC_DEF  1
 #define CPFL_PREC_SET  5
 #define CPFL_TYPE_ID   3
 #define CPFL_OFFSET0x0a
-#define CPFL_HOST_ID_DEF   0
 #define CPFL_PF_NUM_DEF0
-#define CPFL_PORT_NUM_DEF  0
-#define CPFL_RESP_REQ_DEF  2
-#define CP

RE: [PATCH v2 2/3] net/ice: refactor tm config data structure

2024-01-05 Thread Zhang, Qi Z



> -----Original Message-----
> From: Zhang, Qi Z 
> Sent: Friday, January 5, 2024 10:11 PM
> To: Yang, Qiming ; Wu, Wenjun1
> 
> Cc: dev@dpdk.org; Zhang, Qi Z 
> Subject: [PATCH v2 2/3] net/ice: refactor tm config data structure
> 
> Simplified struct ice_tm_conf by removing per level node list.
> 
> Signed-off-by: Qi Zhang 
> ---
>  drivers/net/ice/ice_ethdev.h |   5 +-
>  drivers/net/ice/ice_tm.c | 210 +++
>  2 files changed, 88 insertions(+), 127 deletions(-)
> 
> diff --git a/drivers/net/ice/ice_ethdev.h b/drivers/net/ice/ice_ethdev.h index
> ae22c29ffc..008a7a23b9 100644
> --- a/drivers/net/ice/ice_ethdev.h
> +++ b/drivers/net/ice/ice_ethdev.h
> @@ -472,6 +472,7 @@ struct ice_tm_node {
>   uint32_t id;
>   uint32_t priority;
>   uint32_t weight;
> + uint32_t level;
>   uint32_t reference_count;
>   struct ice_tm_node *parent;
>   struct ice_tm_node **children;
> @@ -492,10 +493,6 @@ enum ice_tm_node_type {  struct ice_tm_conf {
>   struct ice_shaper_profile_list shaper_profile_list;
>   struct ice_tm_node *root; /* root node - port */
> - struct ice_tm_node_list qgroup_list; /* node list for all the queue
> groups */
> - struct ice_tm_node_list queue_list; /* node list for all the queues */
> - uint32_t nb_qgroup_node;
> - uint32_t nb_queue_node;
>   bool committed;
>   bool clear_on_fail;
>  };
> diff --git a/drivers/net/ice/ice_tm.c b/drivers/net/ice/ice_tm.c index
> 7ae68c683b..7c662f8a85 100644
> --- a/drivers/net/ice/ice_tm.c
> +++ b/drivers/net/ice/ice_tm.c
> @@ -43,66 +43,30 @@ ice_tm_conf_init(struct rte_eth_dev *dev)
>   /* initialize node configuration */
>   TAILQ_INIT(&pf->tm_conf.shaper_profile_list);
>   pf->tm_conf.root = NULL;
> - TAILQ_INIT(&pf->tm_conf.qgroup_list);
> - TAILQ_INIT(&pf->tm_conf.queue_list);
> - pf->tm_conf.nb_qgroup_node = 0;
> - pf->tm_conf.nb_queue_node = 0;
>   pf->tm_conf.committed = false;
>   pf->tm_conf.clear_on_fail = false;
>  }
> 
> -void
> -ice_tm_conf_uninit(struct rte_eth_dev *dev)
> +static void free_node(struct ice_tm_node *root)
>  {
> - struct ice_pf *pf = ICE_DEV_PRIVATE_TO_PF(dev->data-
> >dev_private);
> - struct ice_tm_node *tm_node;
> + uint32_t i;
> 
> - /* clear node configuration */
> - while ((tm_node = TAILQ_FIRST(&pf->tm_conf.queue_list))) {
> - TAILQ_REMOVE(&pf->tm_conf.queue_list, tm_node, node);
> - rte_free(tm_node);
> - }
> - pf->tm_conf.nb_queue_node = 0;
> - while ((tm_node = TAILQ_FIRST(&pf->tm_conf.qgroup_list))) {
> - TAILQ_REMOVE(&pf->tm_conf.qgroup_list, tm_node, node);
> - rte_free(tm_node);
> - }
> - pf->tm_conf.nb_qgroup_node = 0;
> - if (pf->tm_conf.root) {
> - rte_free(pf->tm_conf.root);
> - pf->tm_conf.root = NULL;
> - }
> + if (root == NULL)
> + return;
> +
> + for (i = 0; i < root->reference_count; i++)
> + free_node(root->children[i]);
> +
> + rte_free(root);


The memory of the pointer array for the children should also be freed,
before root itself is released:

rte_free(root->children)

As the patch has already been acked, I will squash the fix in when merging the patch.
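
The review comment above can be illustrated with a minimal, standalone
sketch. It is not the driver code: struct tm_node is a hypothetical
stand-in for struct ice_tm_node, libc calloc()/free() replace the DPDK
allocator, and the nodes_freed counter exists only so the behaviour can
be checked. The point is the order: children first, then the pointer
array, then the node.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct ice_tm_node; only the fields needed here. */
struct tm_node {
	uint32_t reference_count;   /* number of valid entries in children[] */
	struct tm_node **children;  /* separately allocated pointer array */
};

/* Counts node structures released; for demonstration only. */
static unsigned int nodes_freed;

/* Allocate a node with room for max_children child pointers. */
static struct tm_node *
node_alloc(uint32_t max_children)
{
	struct tm_node *n = calloc(1, sizeof(*n));

	assert(n != NULL);
	if (max_children > 0) {
		n->children = calloc(max_children, sizeof(*n->children));
		assert(n->children != NULL);
	}
	return n;
}

/* Attach child to parent; the caller guarantees there is capacity left. */
static void
node_add_child(struct tm_node *parent, struct tm_node *child)
{
	parent->children[parent->reference_count++] = child;
}

/* Free a subtree: children first, then the pointer array, then the node. */
static void
free_node(struct tm_node *root)
{
	uint32_t i;

	if (root == NULL)
		return;

	for (i = 0; i < root->reference_count; i++)
		free_node(root->children[i]);

	free(root->children);   /* free(NULL) is a no-op for leaf nodes */
	free(root);
	nodes_freed++;
}
```

Freeing root->children after root would read freed memory, which is why
the array is released first.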




Re: [EXT] [PATCH v6 05/20] net/dpaa2: used dedicated logtype not PMD

2024-01-05 Thread Jun Yang
What is the log level of DPAA2_PMD_INFO? I expect this information to be
printed by default.

From: Stephen Hemminger 
Sent: Saturday, December 23, 2023 01:18
To: dev@dpdk.org 
Cc: Stephen Hemminger ; Hemant Agrawal ; Sachin Saxena ; Jun Yang 
Subject: [EXT] [PATCH v6 05/20] net/dpaa2: used dedicated logtype not PMD


The driver has a logtype, but was not being used in one place.

Fixes: f023d059769f ("net/dpaa2: support recycle loopback port")
Fixes: 72ec7a678e70 ("net/dpaa2: add soft parser driver")

Signed-off-by: Stephen Hemminger 
---
 drivers/net/dpaa2/dpaa2_ethdev.c  | 2 +-
 drivers/net/dpaa2/dpaa2_sparser.c | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/net/dpaa2/dpaa2_ethdev.c b/drivers/net/dpaa2/dpaa2_ethdev.c
index 8e610b6bba30..91846fcd2f23 100644
--- a/drivers/net/dpaa2/dpaa2_ethdev.c
+++ b/drivers/net/dpaa2/dpaa2_ethdev.c
@@ -2851,7 +2851,7 @@ dpaa2_dev_init(struct rte_eth_dev *eth_dev)
return ret;
}
}
-   RTE_LOG(INFO, PMD, "%s: netdev created, connected to %s\n",
+   DPAA2_PMD_INFO("%s: netdev created, connected to %s",
eth_dev->data->name, dpaa2_dev->ep_name);

return 0;
diff --git a/drivers/net/dpaa2/dpaa2_sparser.c 
b/drivers/net/dpaa2/dpaa2_sparser.c
index 63463c4fbfd6..36a14526a5c5 100644
--- a/drivers/net/dpaa2/dpaa2_sparser.c
+++ b/drivers/net/dpaa2/dpaa2_sparser.c
@@ -181,7 +181,7 @@ int dpaa2_eth_load_wriop_soft_parser(struct dpaa2_dev_priv 
*priv,

priv->ss_iova = (uint64_t)(DPAA2_VADDR_TO_IOVA(addr));
priv->ss_offset += sp_param.size;
-   RTE_LOG(INFO, PMD, "Soft parser loaded for dpni@%d\n", priv->hw_id);
+   DPAA2_PMD_INFO("Soft parser loaded for dpni@%d", priv->hw_id);

rte_free(addr);
return 0;
@@ -234,6 +234,6 @@ int dpaa2_eth_enable_wriop_soft_parser(struct 
dpaa2_dev_priv *priv,
}

rte_free(param_addr);
-   RTE_LOG(INFO, PMD, "Soft parser enabled for dpni@%d\n", priv->hw_id);
+   DPAA2_PMD_INFO("Soft parser enabled for dpni@%d", priv->hw_id);
return 0;
 }
--
2.43.0



[PATCH v2 1/2] telemetry: correct json empty dictionaries

2024-01-05 Thread Jonathan Erb
Fix to allow telemetry to handle empty dictionaries correctly.

This patch resolves an issue where empty dictionaries are reported
by telemetry as '[]' rather than '{}'. Initializing the output
buffer based on the container type resolves the issue.

Signed-off-by: Jonathan Erb 
---
 .mailmap  | 2 +-
 lib/telemetry/telemetry.c | 6 +-
 2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/.mailmap b/.mailmap
index ab0742a382..a3302ba7a1 100644
--- a/.mailmap
+++ b/.mailmap
@@ -675,7 +675,7 @@ John Ousterhout 
 John Romein 
 John W. Linville 
 Jonas Pfefferle  
-Jonathan Erb  
+Jonathan Erb 
 Jonathan Tsai 
 Jon DeVree 
 Jon Loeliger 
diff --git a/lib/telemetry/telemetry.c b/lib/telemetry/telemetry.c
index 92982842a8..0788a32210 100644
--- a/lib/telemetry/telemetry.c
+++ b/lib/telemetry/telemetry.c
@@ -169,7 +169,11 @@ container_to_json(const struct rte_tel_data *d, char 
*out_buf, size_t buf_len)
d->type != TEL_ARRAY_INT && d->type != TEL_ARRAY_STRING)
return snprintf(out_buf, buf_len, "null");
 
-   used = rte_tel_json_empty_array(out_buf, buf_len, 0);
+   if (d->type == RTE_TEL_DICT)
+   used = rte_tel_json_empty_obj(out_buf, buf_len, 0);
+   else
+   used = rte_tel_json_empty_array(out_buf, buf_len, 0);
+
if (d->type == TEL_ARRAY_UINT)
for (i = 0; i < d->data_len; i++)
used = rte_tel_json_add_array_uint(out_buf,
-- 
2.34.1



[PATCH v2 2/2] telemetry: correct json empty dictionaries

2024-01-05 Thread Jonathan Erb
Fix use of incorrect enum name.

Signed-off-by: Jonathan Erb 
---
 lib/telemetry/telemetry.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/telemetry/telemetry.c b/lib/telemetry/telemetry.c
index 0788a32210..eef4ac7bb7 100644
--- a/lib/telemetry/telemetry.c
+++ b/lib/telemetry/telemetry.c
@@ -169,7 +169,7 @@ container_to_json(const struct rte_tel_data *d, char 
*out_buf, size_t buf_len)
d->type != TEL_ARRAY_INT && d->type != TEL_ARRAY_STRING)
return snprintf(out_buf, buf_len, "null");
 
-   if (d->type == RTE_TEL_DICT)
+   if (d->type == TEL_DICT)
used = rte_tel_json_empty_obj(out_buf, buf_len, 0);
else
used = rte_tel_json_empty_array(out_buf, buf_len, 0);
-- 
2.34.1
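
The idea behind this two-patch series, choosing the empty JSON text
from the container type, can be sketched in isolation. The enum and the
function name below are hypothetical and are not part of the telemetry
library; the sketch only assumes that an empty dictionary should
serialize as '{}' and any other empty container as '[]'.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical container kinds, standing in for the telemetry data types. */
enum container_kind { KIND_ARRAY, KIND_DICT };

/*
 * Write the JSON text for an empty container into buf.
 * Returns the number of characters written, as snprintf() does.
 */
static int
empty_container_json(enum container_kind kind, char *buf, size_t len)
{
	assert(buf != NULL && len > 0);
	return snprintf(buf, len, "%s", kind == KIND_DICT ? "{}" : "[]");
}
```

Initializing the buffer this way means a container that never receives
an element still yields valid JSON of the right shape.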



Re: [PATCH] dts: improve documentation

2024-01-05 Thread Luca Vizzarro

On 04/01/2024 10:52, Thomas Monjalon wrote:

  DTS needs to know which nodes to connect to and what hardware to use on those 
nodes.
-Once that's configured, DTS needs a DPDK tarball and it's ready to run.
+Once that's configured, DTS needs a DPDK tarball or a git ref ID and it's 
ready to run.


That's assuming DTS is compiling DPDK.
We may want to provide an already compiled DPDK to DTS.


Yes, that is correct. In its current state, though, DTS always compiles
DPDK from source, so it may be reasonable to leave it as it is until this
feature is implemented. Nonetheless, my change just informs the user
of the (already implemented) feature that uses `git archive` from the 
local repository to create a tarball. A sensible change would be to add
the explanation I have just given, but it is a technicality and it 
won't really make a difference to the user.



+   (dts-py3.10) $ ./main.py --help


Why adding this line?


Just running `./main.py` will just throw a confusing error to the user. 
I am in the process of sorting this out as it is misleading and not 
helpful. Specifying the line in this case just hints to the user on the 
origin of that help/usage document.



Should we remove the shell prefix referring to a specific Python version?


I have purposely left the prefix to indicate that we are in a Poetry 
shell environment, as that is a pre-requisite to run DTS. So more of an 
implicit reminder. The Python version specified is in line with the 
minimum requirement of DTS.



In general it is better to avoid long lines, and split after punctuation.
I think we should get into the habit of always going to the next line after
the end of a sentence.


I left the output of `--help` under a code block as it is originally 
printed in the console. Could surely amend it in the docs to be easier 
to read, but the user could as easily print it themselves in their own 
terminal in the comfort of their own environment.



-   [DTS_OUTPUT_DIR] Output directory where dts logs 
and results are
-   saved. (default: output)
+   [DTS_OUTPUT_DIR] Output directory where dts logs 
and results are saved.


dts -> DTS


As above. The output of `--help` only changed as a result of not being 
updated before in parallel with code changes. Consistently this is what 
the user would see right now. It may or may not be a good idea to update 
this whenever changed in the future.


Nonetheless, I am keen to update the code as part of this patch to 
resolve your comments.



Please don't add compilation configuration for now,
I would like to work on the schema first.
This is mostly imported from the old DTS and needs to be rethought.


While I understand the concern about wanting to rework the schema, which is 
a great point you make, it may be reasonable to provide something useful 
to close the existing documentation gap, and to update it incrementally from 
there. If there is no realistic timeline in place for a schema 
rework, it may just be better to have something rather than nothing. And 
it would certainly not be very useful to upstream partial documentation.


Thank you a lot for your review! You have made some good points which 
open up new potential tasks to add to the pipeline.


Best,
Luca


RE: [PATCH v4 1/2] net/mlx5/hws: add support for random number match

2024-01-05 Thread Dariusz Sosnowski
> -----Original Message-----
> From: Michael Baum 
> Sent: Monday, December 25, 2023 11:25
> To: dev@dpdk.org
> Cc: Matan Azrad ; Dariusz Sosnowski
> ; Raslan Darawsheh ; Slava
> Ovsiienko ; Ori Kam ;
> Suanming Mou ; Erez Shitrit 
> Subject: [PATCH v4 1/2] net/mlx5/hws: add support for random number
> match
> 
> From: Erez Shitrit 
> 
> The HW adds a random number per hash. This value can be used for
> statistical calculations over the packets. For example, by setting one
> bit in the mask of that field we will get half of the traffic in the
> flow, and so on with the rest of the mask.
> 
> Signed-off-by: Erez Shitrit 
Acked-by: Dariusz Sosnowski 

Best regards,
Dariusz Sosnowski
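
The sampling effect described in the quoted commit message can be
checked with a small standalone sketch. It assumes a 16-bit, uniformly
distributed random field and a match of the form (random & mask) ==
value; the field width and the function names are assumptions made for
illustration, not taken from the patch. With k bits set in the mask,
one value in 2^k matches.

```c
#include <assert.h>
#include <stdint.h>

/* Does a per-packet random value match under the given mask? */
static int
random_match(uint16_t random, uint16_t mask, uint16_t value)
{
	return (random & mask) == value;
}

/*
 * Count how many of the 65536 possible random values match.
 * value must not have bits set outside the mask.
 */
static uint32_t
count_matches(uint16_t mask, uint16_t value)
{
	uint32_t r, hits = 0;

	assert((value & ~mask) == 0);
	for (r = 0; r <= 0xFFFF; r++)
		hits += random_match((uint16_t)r, mask, value);
	return hits;
}
```

A mask with one bit set matches half of the value space, two bits a
quarter, and an empty mask matches everything.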


Minutes of Technical Board Meeting, 2023-December-13

2024-01-05 Thread Jerin Jacob Kollanukkaran
Minutes of Technical Board Meeting, 2023-December-13

Members Attending
-----------------
-Aaron
-Hemant
-Jerin (Chair)
-Kevin
-Konstantin
-Maxime
-Morten
-Stephen
-Thomas
 

NOTE: The technical board meets every second Wednesday at 
https://meet.jit.si/DPDK at 3 pm UTC.
Meetings are public, and DPDK community members are welcome to attend.
 
NOTE: Next meeting will be on Wednesday 2024-January-10 @3pm UTC, and will be 
chaired by Honnappa


1) 5G Webinar Summary

- The scheduled time for the event coincided with the OVS conference. The
Techboard requested that the Linux Foundation cross-reference other open-source
conferences listed at https://lwn.net/Calendar/ to prevent scheduling conflicts
in the future.

- The Techboard expressed eagerness to organize more webinars and encouraged 
volunteers to participate as speakers.
Notably, the recent event boasted an impressive attendance of over 100 
participants.

- Techboard is exploring the possibility of hosting a virtual hackathon to 
collaboratively address technical challenges.

2) Process for Qualification Criteria for External Library

TB continued to discuss the process for qualification criteria for external 
libraries, initiated from 
https://patches.dpdk.org/project/dpdk/patch/20230928054036.645183-1-jer...@marvell.com/

# Revised policy description based on TB meetings and mailing list discussions

a) Documentation:

- Must have adequate documentation for the steps to build it.
- Must have clear license documentation on distribution and usage aspects of 
the external library.

b) Free Availability:

- The library must be freely available to build in either source or binary form.
- It shall be downloadable from a direct link. There shall not be any 
requirement to explicitly login or sign a user agreement.

c) Usage License:

- Both permissive (e.g., BSD-3 or Apache) and non-permissive (e.g., GPLv3) 
licenses are acceptable.
- In the case of a permissive license, automatic inclusion in the build process 
is assumed.
For non-permissive licenses, an additional build config option is required.

d) Distribution License:

- No specific constraints beyond documentation.

e) Compiler Compatibility:

- The library must be able to compile with a DPDK supported compiler for the 
given execution environment.
For example, for Linux, the library must be able to compile with GCC and/or 
clang.

- Library may be limited to a specific OS.

f) Meson Build Integration:

- Libraries must offer a standard method like pkg-config for seamless 
integration with DPDK's build environment.

g) Code Readability:

- Optional dependencies should use stubs to minimize ifdef clutter, promoting 
improved code readability.
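
The stub approach in item g) might look like the following sketch. The
macro HAVE_LIBFOO, the header foo.h and the function foo_compress() are
all hypothetical names invented for illustration. When the library is
absent, a stub with the same signature reports "not supported", so the
calling code contains no #ifdef of its own.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#ifdef HAVE_LIBFOO
#include <foo.h>    /* hypothetical optional library */
#else
/* Stub with the same signature as the real function. */
static inline int
foo_compress(const void *src, size_t len)
{
	(void)src;
	(void)len;
	return -ENOTSUP;
}
#endif

/*
 * Caller code is identical whether or not the library is present.
 * Returns 0 if the buffer was compressed, 1 if it was passed through
 * unchanged because compression is unavailable, or a negative errno.
 */
static int
process_buffer(const void *buf, size_t len)
{
	int ret;

	assert(buf != NULL);
	ret = foo_compress(buf, len);
	if (ret == -ENOTSUP)
		return 1;
	return ret;
}
```

The single #ifdef lives next to the declaration instead of being
repeated at every call site.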

3) The meeting also addressed the timeframe for existing libraries to align 
with the new policy, with a general consensus to provide a minimum of 6 months.

4) Security Maintainers Updates

- Thomas expressed his desire to retire from security maintainer 
responsibilities.
- Aaron and Maxime volunteered to take on security maintainer roles.
- Cheng Jian, having left Intel, is no longer active in the security team.
- Stephen will consult with Luca to gauge his interest in joining as one of the 
security maintainers.


[PATCH] common/cnxk: update MACsec pkt ok count

2024-01-05 Thread Akhil Goyal
On the 103xx platform, the packet unchecked count
is the same as the packet ok count when validate frames is set in
the secy configuration. When validate frames is not set,
the unchecked count can also be treated as the ok count.

Signed-off-by: Akhil Goyal 
---
 drivers/common/cnxk/roc_mcs_stats.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/drivers/common/cnxk/roc_mcs_stats.c 
b/drivers/common/cnxk/roc_mcs_stats.c
index cac611959d..9e5d62c9e2 100644
--- a/drivers/common/cnxk/roc_mcs_stats.c
+++ b/drivers/common/cnxk/roc_mcs_stats.c
@@ -120,6 +120,11 @@ roc_mcs_sc_stats_get(struct roc_mcs *mcs, struct 
roc_mcs_stats_req *mcs_req,
if (roc_model_is_cn10kb_a0()) {
stats->octet_decrypt_cnt = rsp->octet_decrypt_cnt;
stats->octet_validate_cnt = rsp->octet_validate_cnt;
+   /*
+* If validate frame is enabled in secy configuration,
+* pkt unchecked count is same as pkt ok count.
+*/
+   stats->pkt_ok_cnt = rsp->pkt_unchecked_cnt;
} else {
stats->pkt_delay_cnt = rsp->pkt_delay_cnt;
stats->pkt_ok_cnt = rsp->pkt_ok_cnt;
-- 
2.25.1



Re: [dpdk-dev] [RFC] ethdev: support Tx queue free descriptor query

2024-01-05 Thread Jerin Jacob
On Fri, Jan 5, 2024 at 4:04 AM Thomas Monjalon  wrote:
>
> 19/12/2023 18:29, jer...@marvell.com:
> > --- a/doc/guides/nics/features/default.ini
> > +++ b/doc/guides/nics/features/default.ini
> > @@ -59,6 +59,7 @@ Packet type parsing  =
> >
> >  Timesync =
> >  Rx descriptor status =
> >  Tx descriptor status =
> > +Tx free descriptor query =
>
> I think we can drop "query" here.

How about "Tx queue free count" then?

>
>
> > +__rte_experimental
> > +static inline uint32_t
> > +rte_eth_tx_queue_free_desc_get(uint16_t port_id, uint16_t tx_queue_id)
>
> For consistency with rte_eth_rx_queue_count(),
> I propose the name rte_eth_tx_queue_free_count().

Makes sense. I will change it in the next version.


>
>
>


Re: [dpdk-dev] [RFC] ethdev: support Tx queue free descriptor query

2024-01-05 Thread Jerin Jacob
On Thu, Jan 4, 2024 at 11:59 PM Thomas Monjalon  wrote:
>
> 04/01/2024 15:21, Konstantin Ananyev:
> >
> > > > > Introduce a new API to retrieve the number of available free 
> > > > > descriptors
> > > > > in a Tx queue. Applications can leverage this API in the fast path to
> > > > > inspect the Tx queue occupancy and take appropriate actions based on 
> > > > > the
> > > > > available free descriptors.
> > > > >
> > > > > A notable use case could be implementing Random Early Discard (RED)
> > > > > in software based on Tx queue occupancy.
> > > > >
> > > > > Signed-off-by: Jerin Jacob 
> > > >
> > > > I think having an API to get the number of free descriptors per queue 
> > > > is a good idea. Why have it only for TX queues and not for RX
> > > queues as well?
> > >
> > > I see no harm in adding for Rx as well. I think, it is better to have
> > > separate API for each instead of adding argument as it is fast path
> > > API.
> > > If so, we could add a new API when there is any PMD implementation or
> > > need for this.
> >
> > I think for RX we already have similar one:
> > /** @internal Get number of used descriptors on a receive queue. */
> > typedef uint32_t (*eth_rx_queue_count_t)(void *rxq);
>
> rte_eth_rx_queue_count() gives the number of Rx used descriptors
> rte_eth_rx_descriptor_status() gives the status of one Rx descriptor
> rte_eth_tx_descriptor_status() gives the status of one Tx descriptor
>
> This patch is adding a function to get Tx available descriptors,
> rte_eth_tx_queue_free_desc_get().
> I can see a symmetry with rte_eth_rx_queue_count().
> For consistency I would rename it to rte_eth_tx_queue_free_count().
>
> Should we add rte_eth_tx_queue_count() and rte_eth_rx_queue_free_count()?

IMO, rte_eth_tx_queue_free_count() is enough, as
used count = total desc number (configured via nb_tx_desc with
rte_eth_tx_queue_setup()) - free count
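
The arithmetic in this reply, and the RED use case mentioned in the RFC
description, can be sketched without any DPDK dependency. The function
names and thresholds below are hypothetical. The drop curve is a
simplified linear ramp on instantaneous occupancy, whereas classic RED
works on an averaged queue length, so treat it as an illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Used descriptors = configured ring size minus the free count. */
static uint32_t
txq_used_count(uint32_t nb_tx_desc, uint32_t free_count)
{
	assert(free_count <= nb_tx_desc);
	return nb_tx_desc - free_count;
}

/*
 * Simplified RED: drop probability in percent for a given occupancy.
 * Below min_th nothing is dropped, at or above max_th everything is,
 * and in between the probability ramps linearly up to max_p_percent.
 */
static uint32_t
red_drop_percent(uint32_t used, uint32_t min_th, uint32_t max_th,
		 uint32_t max_p_percent)
{
	assert(min_th < max_th);
	if (used < min_th)
		return 0;
	if (used >= max_th)
		return 100;
	return (uint32_t)((uint64_t)max_p_percent * (used - min_th) /
			  (max_th - min_th));
}
```

An application would feed the free count reported for the queue into
txq_used_count() and compare a random draw against the returned
percentage before enqueuing a packet.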

>
>


[PATCH] app/test-crypto-perf: add throughput OOP decryption

2024-01-05 Thread Suanming Mou
During a throughput run, re-filling the test data would
impact the performance test result. So for now, decrypt
throughput testing is not supported, since the
test data is not filled.

But if the user requests OOP (out-of-place) mode, the test
data in the source mbuf is never modified, and if
the test data can be prepared outside the running loop,
the decryption test should be fine.

This commit adds support for out-of-place decryption
testing for throughput.

[1]:
http://mails.dpdk.org/archives/dev/2023-July/273328.html

Signed-off-by: Suanming Mou 
---
 app/test-crypto-perf/cperf_ops.c |  5 ++-
 app/test-crypto-perf/cperf_options_parsing.c |  8 +
 app/test-crypto-perf/cperf_test_throughput.c | 37 +---
 3 files changed, 44 insertions(+), 6 deletions(-)

diff --git a/app/test-crypto-perf/cperf_ops.c b/app/test-crypto-perf/cperf_ops.c
index 84945d1313..1d57b78c2b 100644
--- a/app/test-crypto-perf/cperf_ops.c
+++ b/app/test-crypto-perf/cperf_ops.c
@@ -608,7 +608,10 @@ cperf_set_ops_aead(struct rte_crypto_op **ops,
}
 
if ((options->test == CPERF_TEST_TYPE_VERIFY) ||
-   (options->test == CPERF_TEST_TYPE_LATENCY)) {
+   (options->test == CPERF_TEST_TYPE_LATENCY) ||
+   (options->test == CPERF_TEST_TYPE_THROUGHPUT &&
+(options->aead_op == RTE_CRYPTO_AEAD_OP_DECRYPT ||
+ options->cipher_op == RTE_CRYPTO_CIPHER_OP_DECRYPT))) {
for (i = 0; i < nb_ops; i++) {
uint8_t *iv_ptr = rte_crypto_op_ctod_offset(ops[i],
uint8_t *, iv_offset);
diff --git a/app/test-crypto-perf/cperf_options_parsing.c 
b/app/test-crypto-perf/cperf_options_parsing.c
index 75afedc7fd..6caca44371 100644
--- a/app/test-crypto-perf/cperf_options_parsing.c
+++ b/app/test-crypto-perf/cperf_options_parsing.c
@@ -1291,6 +1291,14 @@ cperf_options_check(struct cperf_options *options)
}
}
 
+   if (options->test == CPERF_TEST_TYPE_THROUGHPUT &&
+   (options->aead_op == RTE_CRYPTO_AEAD_OP_DECRYPT ||
+options->cipher_op == RTE_CRYPTO_CIPHER_OP_DECRYPT) &&
+   !options->out_of_place) {
+   RTE_LOG(ERR, USER1, "Only out-of-place is allowed in throughput decryption.\n");
+   return -EINVAL;
+   }
+
if (options->op_type == CPERF_CIPHER_ONLY ||
options->op_type == CPERF_CIPHER_THEN_AUTH ||
options->op_type == CPERF_AUTH_THEN_CIPHER) {
diff --git a/app/test-crypto-perf/cperf_test_throughput.c b/app/test-crypto-perf/cperf_test_throughput.c
index f8f8bd717f..eab25ec863 100644
--- a/app/test-crypto-perf/cperf_test_throughput.c
+++ b/app/test-crypto-perf/cperf_test_throughput.c
@@ -98,6 +98,29 @@ cperf_throughput_test_constructor(struct rte_mempool *sess_mp,
return NULL;
 }
 
+static void
+cperf_verify_init_ops(struct rte_mempool *mp __rte_unused,
+ void *opaque_arg,
+ void *obj,
+ __rte_unused unsigned int i)
+{
+   uint16_t iv_offset = sizeof(struct rte_crypto_op) +
+   sizeof(struct rte_crypto_sym_op);
+   uint32_t imix_idx = 0;
+   struct cperf_throughput_ctx *ctx = opaque_arg;
+   struct rte_crypto_op *op = obj;
+
+   (ctx->populate_ops)(&op, ctx->src_buf_offset,
+   ctx->dst_buf_offset,
+   1, ctx->sess, ctx->options,
+   ctx->test_vector, iv_offset, &imix_idx, NULL);
+
+   cperf_mbuf_set(op->sym->m_src,
+   ctx->options,
+   ctx->test_vector);
+
+}
+
 int
 cperf_throughput_test_runner(void *test_ctx)
 {
@@ -143,6 +166,9 @@ cperf_throughput_test_runner(void *test_ctx)
uint16_t iv_offset = sizeof(struct rte_crypto_op) +
sizeof(struct rte_crypto_sym_op);
 
+   if (ctx->options->out_of_place)
+   rte_mempool_obj_iter(ctx->pool, cperf_verify_init_ops, (void *)ctx);
+
while (test_burst_size <= ctx->options->max_burst_size) {
uint64_t ops_enqd = 0, ops_enqd_total = 0, ops_enqd_failed = 0;
uint64_t ops_deqd = 0, ops_deqd_total = 0, ops_deqd_failed = 0;
@@ -175,11 +201,12 @@ cperf_throughput_test_runner(void *test_ctx)
}
 
/* Setup crypto op, attach mbuf etc */
-   (ctx->populate_ops)(ops, ctx->src_buf_offset,
-   ctx->dst_buf_offset,
-   ops_needed, ctx->sess,
-   ctx->options, ctx->test_vector,
-   iv_offset, &imix_idx, &tsc_start);
+   if (!ctx->options->out_of_place)
+   (ctx->populate_ops)(ops, ctx->src_buf_offset,
+   ctx->dst_buf_offset,
+ 
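
For readers following the option check added in cperf_options_check(), here is a minimal standalone sketch of the same rule. The enum and struct names below are simplified, hypothetical stand-ins and not the real cperf types; only the condition itself mirrors the patch.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for the cperf option fields. */
enum test_type { TEST_THROUGHPUT, TEST_LATENCY, TEST_VERIFY };
enum crypto_dir { DIR_ENCRYPT, DIR_DECRYPT };

struct perf_options {
	enum test_type test;
	enum crypto_dir aead_op;
	enum crypto_dir cipher_op;
	bool out_of_place;
};

/*
 * Return 0 if the combination is allowed, -EINVAL otherwise.
 * Throughput decryption is only accepted out-of-place, because the
 * source buffers must keep their pre-filled data across bursts.
 */
int
check_throughput_decrypt(const struct perf_options *opts)
{
	bool decrypt = opts->aead_op == DIR_DECRYPT ||
		       opts->cipher_op == DIR_DECRYPT;

	if (opts->test == TEST_THROUGHPUT && decrypt && !opts->out_of_place)
		return -EINVAL;
	return 0;
}
```

Encryption, and the latency and verify tests, pass through this check unchanged; only the throughput plus decrypt plus in-place combination is rejected.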

Re: [dpdk-dev] [RFC] ethdev: support Tx queue free descriptor query

2024-01-05 Thread Thomas Monjalon
05/01/2024 10:54, Jerin Jacob:
> On Fri, Jan 5, 2024 at 4:04 AM Thomas Monjalon  wrote:
> >
> > 19/12/2023 18:29, jer...@marvell.com:
> > > --- a/doc/guides/nics/features/default.ini
> > > +++ b/doc/guides/nics/features/default.ini
> > > @@ -59,6 +59,7 @@ Packet type parsing  =
> > >
> > >  Timesync =
> > >  Rx descriptor status =
> > >  Tx descriptor status =
> > > +Tx free descriptor query =
> >
> > I think we can drop "query" here.
> 
> How about "Tx queue free count" then?

No strong opinion. What do others think?


> > > +__rte_experimental
> > > +static inline uint32_t
> > > +rte_eth_tx_queue_free_desc_get(uint16_t port_id, uint16_t tx_queue_id)
> >
> > For consistency with rte_eth_rx_queue_count(),
> > I propose the name rte_eth_tx_queue_free_count().
> 
> Make sense. I will change it in next version.





Re: [dpdk-dev] [RFC] ethdev: support Tx queue free descriptor query

2024-01-05 Thread Thomas Monjalon
05/01/2024 10:57, Jerin Jacob:
> On Thu, Jan 4, 2024 at 11:59 PM Thomas Monjalon  wrote:
> >
> > 04/01/2024 15:21, Konstantin Ananyev:
> > >
> > > > > > Introduce a new API to retrieve the number of available free 
> > > > > > descriptors
> > > > > > in a Tx queue. Applications can leverage this API in the fast path 
> > > > > > to
> > > > > > inspect the Tx queue occupancy and take appropriate actions based 
> > > > > > on the
> > > > > > available free descriptors.
> > > > > >
> > > > > > A notable use case could be implementing Random Early Discard (RED)
> > > > > > in software based on Tx queue occupancy.
> > > > > >
> > > > > > Signed-off-by: Jerin Jacob 
> > > > >
> > > > > I think having an API to get the number of free descriptors per queue 
> > > > > is a good idea. Why have it only for TX queues and not for RX
> > > > queues as well?
> > > >
> > > > I see no harm in adding for Rx as well. I think, it is better to have
> > > > separate API for each instead of adding argument as it is fast path
> > > > API.
> > > > If so, we could add a new API when there is any PMD implementation or
> > > > need for this.
> > >
> > > I think for RX we already have similar one:
> > > /** @internal Get number of used descriptors on a receive queue. */
> > > typedef uint32_t (*eth_rx_queue_count_t)(void *rxq);
> >
> > rte_eth_rx_queue_count() gives the number of Rx used descriptors
> > rte_eth_rx_descriptor_status() gives the status of one Rx descriptor
> > rte_eth_tx_descriptor_status() gives the status of one Tx descriptor
> >
> > This patch is adding a function to get Tx available descriptors,
> > rte_eth_tx_queue_free_desc_get().
> > I can see a symmetry with rte_eth_rx_queue_count().
> > For consistency I would rename it to rte_eth_tx_queue_free_count().
> >
> > Should we add rte_eth_tx_queue_count() and rte_eth_rx_queue_free_count()?
> 
> IMO, rte_eth_rx_queue_free_count() is enough as
> used count =  total desc number(configured via nb_tx_desc with
> rte_eth_tx_queue_setup())  - free count

I'm fine with that.
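
To make the arithmetic above concrete, here is a small sketch of deriving the used count from the ring size and a free count. The function name is hypothetical and takes plain integers; in an application the inputs would be the nb_tx_desc value passed at queue setup and the result of the proposed free-count query.

```c
#include <stdint.h>

/*
 * Derive the number of used Tx descriptors from the ring size given at
 * queue setup time and a free-descriptor count.
 * Hypothetical helper, not part of ethdev.
 */
uint32_t
tx_queue_used_count(uint16_t nb_tx_desc, uint32_t free_count)
{
	/* Clamp so that an inconsistent free count never underflows. */
	if (free_count > nb_tx_desc)
		return 0;
	return (uint32_t)nb_tx_desc - free_count;
}
```

Because this is a single subtraction on values the application already holds, it does not need its own fast-path driver callback.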





[PATCH v10] net/iavf: add diagnostic support in TX path

2024-01-05 Thread Mingjin Ye
Currently, the only way to enable diagnostics for the TX path is to
modify the application source code, which makes faults difficult to
diagnose.

This patch introduces the devarg option "mbuf_check"; its parameters
select which diagnostics are enabled.

Supported cases: mbuf, size, segment, offload.
 1. mbuf: check for a corrupted mbuf.
 2. size: check the min/max packet length against the hw spec.
 3. segment: check that the number of mbuf segments does not exceed
    the hw limit.
 4. offload: check for any unsupported offload flag.

parameter format: mbuf_check=[mbuf,,]
eg: dpdk-testpmd -a :81:01.0,mbuf_check=[mbuf,size] -- -i

Signed-off-by: Mingjin Ye 
---
v2: Remove call chain.
---
v3: Optimisation implementation.
---
v4: Fix Windows os compilation error.
---
v5: Split Patch.
---
v6: remove strict.
---
v9: Modify the description document.
---
v10: Modify vf rst document.
---
 doc/guides/nics/intel_vf.rst   | 11 
 drivers/net/iavf/iavf.h| 12 +
 drivers/net/iavf/iavf_ethdev.c | 75 ++
 drivers/net/iavf/iavf_rxtx.c   | 98 ++
 drivers/net/iavf/iavf_rxtx.h   |  2 +
 5 files changed, 198 insertions(+)

diff --git a/doc/guides/nics/intel_vf.rst b/doc/guides/nics/intel_vf.rst
index ce96c2e1f8..f62bb4233c 100644
--- a/doc/guides/nics/intel_vf.rst
+++ b/doc/guides/nics/intel_vf.rst
@@ -111,6 +111,17 @@ For more detail on SR-IOV, please refer to the following documents:
 by setting the ``devargs`` parameter like ``-a 18:01.0,no-poll-on-link-down=1``
 when IAVF is backed by an Intel\ |reg| E810 device or an Intel\ |reg| 700 Series Ethernet device.
 
+When IAVF is backed by an Intel\ |reg| E810 device or an Intel\ |reg| 700 series Ethernet devices.
+Set the ``devargs`` parameter ``mbuf_check`` to enable TX diagnostics. For example,
+``-a 18:01.0,mbuf_check=`` or ``-a 18:01.0,mbuf_check=[,...]``. Also,
+``xstats_get`` can be used to get the error counts, which are collected in ``tx_mbuf_error_packets``
+xstats. For example, ``testpmd> show port xstats all``. Supported cases:
+
+*   mbuf: Check for corrupted mbuf.
+*   size: Check min/max packet length according to hw spec.
+*   segment: Check number of mbuf segments not exceed hw limitation.
+*   offload: Check any unsupported offload flag.
+
 The PCIE host-interface of Intel Ethernet Switch FM1 Series VF 
infrastructure
 
^
 
diff --git a/drivers/net/iavf/iavf.h b/drivers/net/iavf/iavf.h
index ab24cb02c3..23c0496d54 100644
--- a/drivers/net/iavf/iavf.h
+++ b/drivers/net/iavf/iavf.h
@@ -114,9 +114,14 @@ struct iavf_ipsec_crypto_stats {
} ierrors;
 };
 
+struct iavf_mbuf_stats {
+   uint64_t tx_pkt_errors;
+};
+
 struct iavf_eth_xstats {
struct virtchnl_eth_stats eth_stats;
struct iavf_ipsec_crypto_stats ips_stats;
+   struct iavf_mbuf_stats mbuf_stats;
 };
 
 /* Structure that defines a VSI, associated with a adapter. */
@@ -310,6 +315,7 @@ struct iavf_devargs {
uint32_t watchdog_period;
int auto_reset;
int no_poll_on_link_down;
+   int mbuf_check;
 };
 
 struct iavf_security_ctx;
@@ -353,6 +359,11 @@ enum iavf_tx_burst_type {
IAVF_TX_AVX512_CTX_OFFLOAD,
 };
 
+#define IAVF_MBUF_CHECK_F_TX_MBUF    (1ULL << 0)
+#define IAVF_MBUF_CHECK_F_TX_SIZE    (1ULL << 1)
+#define IAVF_MBUF_CHECK_F_TX_SEGMENT (1ULL << 2)
+#define IAVF_MBUF_CHECK_F_TX_OFFLOAD (1ULL << 3)
+
 /* Structure to store private data for each VF instance. */
 struct iavf_adapter {
struct iavf_hw hw;
@@ -370,6 +381,7 @@ struct iavf_adapter {
bool no_poll;
enum iavf_rx_burst_type rx_burst_type;
enum iavf_tx_burst_type tx_burst_type;
+   uint64_t mc_flags; /* mbuf check flags. */
uint16_t fdir_ref_cnt;
struct iavf_devargs devargs;
 };
diff --git a/drivers/net/iavf/iavf_ethdev.c b/drivers/net/iavf/iavf_ethdev.c
index 1fb876e827..903a43d004 100644
--- a/drivers/net/iavf/iavf_ethdev.c
+++ b/drivers/net/iavf/iavf_ethdev.c
@@ -13,6 +13,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -39,6 +40,7 @@
 #define IAVF_RESET_WATCHDOG_ARG    "watchdog_period"
 #define IAVF_ENABLE_AUTO_RESET_ARG "auto_reset"
 #define IAVF_NO_POLL_ON_LINK_DOWN_ARG "no-poll-on-link-down"
+#define IAVF_MBUF_CHECK_ARG   "mbuf_check"
 uint64_t iavf_timestamp_dynflag;
 int iavf_timestamp_dynfield_offset = -1;
 int rte_pmd_iavf_tx_lldp_dynfield_offset = -1;
@@ -49,6 +51,7 @@ static const char * const iavf_valid_args[] = {
IAVF_RESET_WATCHDOG_ARG,
IAVF_ENABLE_AUTO_RESET_ARG,
IAVF_NO_POLL_ON_LINK_DOWN_ARG,
+   IAVF_MBUF_CHECK_ARG,
NULL
 };
 
@@ -175,6 +178,7 @@ static const struct rte_iavf_xstats_name_off rte_iavf_stats_strings[] = {
{"tx_broadcast_packets", _OFF_OF(eth_stats.tx_broadcast)},
{"tx_dropped_packets", _OFF_OF(eth_stats.tx_di
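
The quoted patch is cut off before the Tx checks themselves, so the following is only a sketch of how a flag mask like the one above could gate per-packet checks and feed an error counter. The struct names, the helper and all three limits are hypothetical and illustrative; real limits come from the hardware specification, and the driver works on rte_mbuf rather than this stand-in descriptor.

```c
#include <stdint.h>

#define CHECK_F_TX_SIZE    (1ULL << 1)
#define CHECK_F_TX_SEGMENT (1ULL << 2)

/* Illustrative limits only, not taken from any datasheet. */
#define EXAMPLE_MIN_PKT_LEN 64u
#define EXAMPLE_MAX_PKT_LEN 9728u
#define EXAMPLE_MAX_SEGS    8u

/* Hypothetical stand-in for the fields of a packet that get checked. */
struct tx_pkt_desc {
	uint32_t pkt_len;
	uint16_t nb_segs;
};

struct tx_diag {
	uint64_t flags;         /* which checks are enabled */
	uint64_t tx_pkt_errors; /* packets that failed an enabled check */
};

/* Return 1 if the packet passes all enabled checks, 0 otherwise. */
int
tx_diag_check(struct tx_diag *d, const struct tx_pkt_desc *p)
{
	int ok = 1;

	if ((d->flags & CHECK_F_TX_SIZE) &&
	    (p->pkt_len < EXAMPLE_MIN_PKT_LEN ||
	     p->pkt_len > EXAMPLE_MAX_PKT_LEN))
		ok = 0;
	if ((d->flags & CHECK_F_TX_SEGMENT) && p->nb_segs > EXAMPLE_MAX_SEGS)
		ok = 0;

	if (!ok)
		d->tx_pkt_errors++;
	return ok;
}
```

A check whose flag is not set costs one masked test and never rejects a packet, which is what lets the diagnostics stay off by default.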

[PATCH v4] net/i40e: add diagnostic support in TX path

2024-01-05 Thread Mingjin Ye
Currently, the only way to enable diagnostics for the TX path is to
modify the application source code, which makes faults difficult to
diagnose.

This patch introduces the devarg option "mbuf_check"; its parameters
select which diagnostics are enabled.

Supported cases: mbuf, size, segment, offload.
 1. mbuf: check for a corrupted mbuf.
 2. size: check the min/max packet length against the hw spec.
 3. segment: check that the number of mbuf segments does not exceed
    the hw limit.
 4. offload: check for any unsupported offload flag.

parameter format: mbuf_check=[mbuf,,]
eg: dpdk-testpmd -a :81:01.0,mbuf_check=[mbuf,size] -- -i

Signed-off-by: Mingjin Ye 
---
v2: remove strict.
---
v3: optimised.
---
v4: rebase.
---
 doc/guides/nics/i40e.rst   |  13 +++
 drivers/net/i40e/i40e_ethdev.c | 138 -
 drivers/net/i40e/i40e_ethdev.h |  28 ++
 drivers/net/i40e/i40e_rxtx.c   | 153 +++--
 drivers/net/i40e/i40e_rxtx.h   |   2 +
 5 files changed, 326 insertions(+), 8 deletions(-)

diff --git a/doc/guides/nics/i40e.rst b/doc/guides/nics/i40e.rst
index 15689ac958..bf1d1e5d60 100644
--- a/doc/guides/nics/i40e.rst
+++ b/doc/guides/nics/i40e.rst
@@ -275,6 +275,19 @@ Runtime Configuration
 
   -a 84:00.0,vf_msg_cfg=80@120:180
 
+- ``Support TX diagnostics`` (default ``not enabled``)
+
+  Set the ``devargs`` parameter ``mbuf_check`` to enable TX diagnostics. For example,
+  ``-a 18:01.0,mbuf_check=`` or ``-a 18:01.0,mbuf_check=[,...]``. Also,
+  ``xstats_get`` can be used to get the error counts, which are collected in
+  ``tx_mbuf_error_packets`` xstats. For example, ``testpmd> show port xstats all``.
+  Supported cases:
+
+  *   mbuf: Check for corrupted mbuf.
+  *   size: Check min/max packet length according to hw spec.
+  *   segment: Check number of mbuf segments not exceed hw limitation.
+  *   offload: Check any unsupported offload flag.
+
 Vector RX Pre-conditions
 
 For Vector RX it is assumed that the number of descriptor rings will be a power
diff --git a/drivers/net/i40e/i40e_ethdev.c b/drivers/net/i40e/i40e_ethdev.c
index 3ca226156b..f23f80fd16 100644
--- a/drivers/net/i40e/i40e_ethdev.c
+++ b/drivers/net/i40e/i40e_ethdev.c
@@ -48,6 +48,7 @@
 #define ETH_I40E_SUPPORT_MULTI_DRIVER  "support-multi-driver"
 #define ETH_I40E_QUEUE_NUM_PER_VF_ARG  "queue-num-per-vf"
 #define ETH_I40E_VF_MSG_CFG            "vf_msg_cfg"
+#define ETH_I40E_MBUF_CHECK_ARG   "mbuf_check"
 
 #define I40E_CLEAR_PXE_WAIT_MS 200
 #define I40E_VSI_TSR_QINQ_STRIP        0x4010
@@ -412,6 +413,7 @@ static const char *const valid_keys[] = {
ETH_I40E_SUPPORT_MULTI_DRIVER,
ETH_I40E_QUEUE_NUM_PER_VF_ARG,
ETH_I40E_VF_MSG_CFG,
+   ETH_I40E_MBUF_CHECK_ARG,
NULL};
 
 static const struct rte_pci_id pci_id_i40e_map[] = {
@@ -545,6 +547,14 @@ static const struct rte_i40e_xstats_name_off rte_i40e_stats_strings[] = {
 #define I40E_NB_ETH_XSTATS (sizeof(rte_i40e_stats_strings) / \
sizeof(rte_i40e_stats_strings[0]))
 
+static const struct rte_i40e_xstats_name_off i40e_mbuf_strings[] = {
+   {"tx_mbuf_error_packets", offsetof(struct i40e_mbuf_stats,
+   tx_pkt_errors)},
+};
+
+#define I40E_NB_MBUF_XSTATS (sizeof(i40e_mbuf_strings) / \
+   sizeof(i40e_mbuf_strings[0]))
+
 static const struct rte_i40e_xstats_name_off rte_i40e_hw_port_strings[] = {
{"tx_link_down_dropped", offsetof(struct i40e_hw_port_stats,
tx_dropped_link_down)},
@@ -1373,6 +1383,88 @@ read_vf_msg_config(__rte_unused const char *key,
return 0;
 }
 
+static int
+read_mbuf_check_config(__rte_unused const char *key, const char *value, void *args)
+{
+   char *cur;
+   char *tmp;
+   int str_len;
+   int valid_len;
+
+   int ret = 0;
+   uint64_t *mc_flags = args;
+   char *str2 = strdup(value);
+   if (str2 == NULL)
+   return -1;
+
+   str_len = strlen(str2);
+   if (str2[0] == '[' && str2[str_len - 1] == ']') {
+   if (str_len < 3) {
+   ret = -1;
+   goto mdd_end;
+   }
+   valid_len = str_len - 2;
+   memmove(str2, str2 + 1, valid_len);
+   memset(str2 + valid_len, '\0', 2);
+   }
+   cur = strtok_r(str2, ",", &tmp);
+   while (cur != NULL) {
+   if (!strcmp(cur, "mbuf"))
+   *mc_flags |= I40E_MBUF_CHECK_F_TX_MBUF;
+   else if (!strcmp(cur, "size"))
+   *mc_flags |= I40E_MBUF_CHECK_F_TX_SIZE;
+   else if (!strcmp(cur, "segment"))
+   *mc_flags |= I40E_MBUF_CHECK_F_TX_SEGMENT;
+   else if (!strcmp(cur, "offload"))
+   *mc_flags |= I40E_MBUF_CHECK_F_TX_OFFLOAD;
+   else
+   PMD_DRV_LOG(ERR, "Unsupported mdd check type: %s", cur);
+   cur = strtok_r(NULL, ",", 
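
The name/offset table used for the new xstat (i40e_mbuf_strings above) follows a common pattern: each entry pairs a counter name with the byte offset of a field in a stats struct. A minimal standalone sketch, with hypothetical struct and function names that are not the driver's own, could look like this:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stats block with a single counter. */
struct mbuf_stats {
	uint64_t tx_pkt_errors;
};

/* One table entry: counter name plus offset of the field in the struct. */
struct xstats_name_off {
	const char *name;
	size_t offset;
};

static const struct xstats_name_off mbuf_strings[] = {
	{"tx_mbuf_error_packets", offsetof(struct mbuf_stats, tx_pkt_errors)},
};

#define NB_MBUF_XSTATS (sizeof(mbuf_strings) / sizeof(mbuf_strings[0]))

/* Look up a counter by name; returns 0 on success, -1 if unknown. */
int
mbuf_xstat_get(const struct mbuf_stats *stats, const char *name,
	       uint64_t *value)
{
	size_t i;

	for (i = 0; i < NB_MBUF_XSTATS; i++) {
		if (strcmp(mbuf_strings[i].name, name) == 0) {
			/* Read the field through its recorded offset. */
			memcpy(value,
			       (const char *)stats + mbuf_strings[i].offset,
			       sizeof(*value));
			return 0;
		}
	}
	return -1;
}
```

Adding another counter then only needs a new struct field and one more table row; the lookup code stays the same.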

[PATCH v2] net/ice: add diagnostic support in TX path

2024-01-05 Thread Mingjin Ye
Currently, the only way to enable diagnostics for the TX path is to
modify the application source code, which makes faults difficult to
diagnose.

This patch introduces the devarg option "mbuf_check"; its parameters
select which diagnostics are enabled.

Supported cases: mbuf, size, segment, offload.
 1. mbuf: check for a corrupted mbuf.
 2. size: check the min/max packet length against the hw spec.
 3. segment: check that the number of mbuf segments does not exceed
    the hw limit.
 4. offload: check for any unsupported offload flag.

parameter format: mbuf_check=[mbuf,,]
eg: dpdk-testpmd -a :81:01.0,mbuf_check=[mbuf,size] -- -i

Signed-off-by: Mingjin Ye 
---
v2: rebase.
---
 doc/guides/nics/ice.rst  |  13 +++
 drivers/net/ice/ice_ethdev.c | 104 ++-
 drivers/net/ice/ice_ethdev.h |  24 ++
 drivers/net/ice/ice_rxtx.c   | 158 ---
 drivers/net/ice/ice_rxtx.h   |  20 +
 5 files changed, 308 insertions(+), 11 deletions(-)

diff --git a/doc/guides/nics/ice.rst b/doc/guides/nics/ice.rst
index bafb3ba022..d1aee811b3 100644
--- a/doc/guides/nics/ice.rst
+++ b/doc/guides/nics/ice.rst
@@ -257,6 +257,19 @@ Runtime Configuration
   As a trade-off, this configuration may cause the packet processing 
performance
   degradation due to the PCI bandwidth limitation.
 
+- ``Tx diagnostics`` (default ``not enabled``)
+
+  Set the ``devargs`` parameter ``mbuf_check`` to enable TX diagnostics. For example,
+  ``-a 18:01.0,mbuf_check=`` or ``-a 18:01.0,mbuf_check=[,...]``.
+  Also, ``xstats_get`` can be used to get the error counts, which are collected in
+  ``tx_mbuf_error_packets`` xstats. For example, ``testpmd> show port xstats all``.
+  Supported cases:
+
+  *   mbuf: Check for corrupted mbuf.
+  *   size: Check min/max packet length according to hw spec.
+  *   segment: Check number of mbuf segments not exceed hw limitation.
+  *   offload: Check any unsupported offload flag.
+
 Driver compilation and testing
 --
 
diff --git a/drivers/net/ice/ice_ethdev.c b/drivers/net/ice/ice_ethdev.c
index 72e13f95f8..254993b813 100644
--- a/drivers/net/ice/ice_ethdev.c
+++ b/drivers/net/ice/ice_ethdev.c
@@ -12,6 +12,7 @@
 #include 
 
 #include 
+#include 
 
 #include "eal_firmware.h"
 
@@ -34,6 +35,7 @@
 #define ICE_HW_DEBUG_MASK_ARG "hw_debug_mask"
 #define ICE_ONE_PPS_OUT_ARG   "pps_out"
 #define ICE_RX_LOW_LATENCY_ARG    "rx_low_latency"
+#define ICE_MBUF_CHECK_ARG   "mbuf_check"
 
 #define ICE_CYCLECOUNTER_MASK  0xULL
 
@@ -49,6 +51,7 @@ static const char * const ice_valid_args[] = {
ICE_ONE_PPS_OUT_ARG,
ICE_RX_LOW_LATENCY_ARG,
ICE_DEFAULT_MAC_DISABLE,
+   ICE_MBUF_CHECK_ARG,
NULL
 };
 
@@ -319,6 +322,14 @@ static const struct ice_xstats_name_off ice_stats_strings[] = {
 #define ICE_NB_ETH_XSTATS (sizeof(ice_stats_strings) / \
sizeof(ice_stats_strings[0]))
 
+static const struct ice_xstats_name_off ice_mbuf_strings[] = {
+   {"tx_mbuf_error_packets", offsetof(struct ice_mbuf_stats,
+   tx_pkt_errors)},
+};
+
+#define ICE_NB_MBUF_XSTATS (sizeof(ice_mbuf_strings) / \
+   sizeof(ice_mbuf_strings[0]))
+
 static const struct ice_xstats_name_off ice_hw_port_strings[] = {
{"tx_link_down_dropped", offsetof(struct ice_hw_port_stats,
tx_dropped_link_down)},
@@ -2061,6 +2072,50 @@ handle_pps_out_arg(__rte_unused const char *key, const char *value,
return 0;
 }
 
+static int
+ice_parse_mbuf_check(__rte_unused const char *key, const char *value, void *args)
+{
+   char *cur;
+   char *tmp;
+   int str_len;
+   int valid_len;
+
+   int ret = 0;
+   uint64_t *mc_flags = args;
+   char *str2 = strdup(value);
+   if (str2 == NULL)
+   return -1;
+
+   str_len = strlen(str2);
+   if (str2[0] == '[' && str2[str_len - 1] == ']') {
+   if (str_len < 3) {
+   ret = -1;
+   goto mdd_end;
+   }
+   valid_len = str_len - 2;
+   memmove(str2, str2 + 1, valid_len);
+   memset(str2 + valid_len, '\0', 2);
+   }
+   cur = strtok_r(str2, ",", &tmp);
+   while (cur != NULL) {
+   if (!strcmp(cur, "mbuf"))
+   *mc_flags |= ICE_MBUF_CHECK_F_TX_MBUF;
+   else if (!strcmp(cur, "size"))
+   *mc_flags |= ICE_MBUF_CHECK_F_TX_SIZE;
+   else if (!strcmp(cur, "segment"))
+   *mc_flags |= ICE_MBUF_CHECK_F_TX_SEGMENT;
+   else if (!strcmp(cur, "offload"))
+   *mc_flags |= ICE_MBUF_CHECK_F_TX_OFFLOAD;
+   else
+   PMD_DRV_LOG(ERR, "Unsupported mdd check type: %s", cur);
+   cur = strtok_r(NULL, ",", &tmp);
+   }
+
+mdd_end:
+   free(str2);
+   return ret;
+}
+
 static int ice_parse_devargs(struct rte_eth_dev
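
As a standalone illustration of the parsing step in ice_parse_mbuf_check(), the sketch below accepts the same value forms (a single case, or a bracketed comma-separated list) and builds a flag mask. It is not the driver code: the names are hypothetical, it avoids strdup()/strtok_r() so that it builds as plain ISO C, and, unlike the patch, which only logs an unsupported token, it returns an error for one. That last point is an assumption about desirable behaviour, not something the patch does.

```c
#include <stdint.h>
#include <string.h>

#define CHECK_F_TX_MBUF    (1ULL << 0)
#define CHECK_F_TX_SIZE    (1ULL << 1)
#define CHECK_F_TX_SEGMENT (1ULL << 2)
#define CHECK_F_TX_OFFLOAD (1ULL << 3)

/* Compare a token of length len against a NUL-terminated keyword. */
static int
tok_eq(const char *tok, size_t len, const char *kw)
{
	return strlen(kw) == len && strncmp(tok, kw, len) == 0;
}

/*
 * Parse "mbuf", "[mbuf,size]" and similar values into a flag mask.
 * Returns 0 on success, -1 on an empty list or an unknown token.
 * *flags is only updated on success.
 */
int
parse_mbuf_check(const char *value, uint64_t *flags)
{
	size_t len = strlen(value);
	const char *p = value;
	const char *end = value + len;
	uint64_t out = 0;

	/* Strip one pair of enclosing brackets, if present. */
	if (len >= 2 && value[0] == '[' && value[len - 1] == ']') {
		p++;
		end--;
	}
	if (p == end)
		return -1;

	while (p < end) {
		size_t n = 0;

		/* Measure the token up to the next comma or the end. */
		while (p + n < end && p[n] != ',')
			n++;

		if (n > 0) {
			if (tok_eq(p, n, "mbuf"))
				out |= CHECK_F_TX_MBUF;
			else if (tok_eq(p, n, "size"))
				out |= CHECK_F_TX_SIZE;
			else if (tok_eq(p, n, "segment"))
				out |= CHECK_F_TX_SEGMENT;
			else if (tok_eq(p, n, "offload"))
				out |= CHECK_F_TX_OFFLOAD;
			else
				return -1;
		}

		p += n;
		if (p < end)
			p++; /* step over the comma */
	}

	*flags |= out;
	return 0;
}
```

Empty tokens such as the one in "mbuf,,size" are skipped, which matches how strtok_r() treats consecutive delimiters.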

RE: [dpdk-dev] [RFC] ethdev: support Tx queue free descriptor query

2024-01-05 Thread Konstantin Ananyev


> -Original Message-
> From: Thomas Monjalon 
> Sent: Friday, January 5, 2024 10:04 AM
> To: Jerin Jacob 
> Cc: Dumitrescu, Cristian ; Konstantin Ananyev 
> ;
> jer...@marvell.com; dev@dpdk.org; Ferruh Yigit ; Andrew 
> Rybchenko ;
> ferruh.yi...@xilinx.com; ajit.khapa...@broadcom.com; abo...@pensando.io; 
> Xing, Beilei ; Richardson, Bruce
> ; ch...@att.com; chenbo@intel.com; Loftus, 
> Ciara ;
> dsinghra...@marvell.com; Czeck, Ed ; 
> evge...@amazon.com; gr...@u256.net; g.si...@nxp.com;
> zhouguoy...@huawei.com; Wang, Haiyue ; 
> hka...@marvell.com; heinrich.k...@corigine.com;
> hemant.agra...@nxp.com; hyon...@cisco.com; igo...@amazon.com; 
> irussk...@marvell.com; jgraj...@cisco.com; Singh, Jasvinder
> ; jianw...@trustnetic.com; 
> jiawe...@trustnetic.com; Wu, Jingjing ;
> johnd...@cisco.com; john.mil...@atomicrules.com; linvi...@tuxdriver.com; 
> Wiles, Keith ;
> kirankum...@marvell.com; ouli...@huawei.com; lir...@marvell.com; 
> lon...@microsoft.com; m...@semihalf.com;
> spin...@cesnet.cz; ma...@nvidia.com; Peters, Matt 
> ; maxime.coque...@redhat.com;
> m...@semihalf.com; humin (Q) ; pna...@marvell.com; 
> ndabilpu...@marvell.com; Yang, Qiming
> ; Zhang, Qi Z ; 
> rad...@marvell.com; rahul.lakkire...@chelsio.com;
> rm...@marvell.com; Xu, Rosen ; sachin.sax...@oss.nxp.com; 
> skotesh...@marvell.com;
> shsha...@marvell.com; shaib...@amazon.com; Siegel, Shepard 
> ; asoma...@amd.com;
> somnath.ko...@broadcom.com; sthem...@microsoft.com; Webster, Steven 
> ;
> sk...@marvell.com; mtetsu...@gmail.com; vbu...@marvell.com; 
> viachesl...@nvidia.com; Wang, Xiao W
> ; Wangxiaoyun (Cloud) ; 
> Zhuangyuzeng (Yisen)
> ; Wang, Yong ; Xuanziyang 
> (William) 
> Subject: Re: [dpdk-dev] [RFC] ethdev: support Tx queue free descriptor query
> 
> 05/01/2024 10:57, Jerin Jacob:
> > On Thu, Jan 4, 2024 at 11:59 PM Thomas Monjalon  wrote:
> > >
> > > 04/01/2024 15:21, Konstantin Ananyev:
> > > >
> > > > > > > Introduce a new API to retrieve the number of available free 
> > > > > > > descriptors
> > > > > > > in a Tx queue. Applications can leverage this API in the fast 
> > > > > > > path to
> > > > > > > inspect the Tx queue occupancy and take appropriate actions based 
> > > > > > > on the
> > > > > > > available free descriptors.
> > > > > > >
> > > > > > > A notable use case could be implementing Random Early Discard 
> > > > > > > (RED)
> > > > > > > in software based on Tx queue occupancy.
> > > > > > >
> > > > > > > Signed-off-by: Jerin Jacob 
> > > > > >
> > > > > > I think having an API to get the number of free descriptors per 
> > > > > > queue is a good idea. Why have it only for TX queues and not
> for RX
> > > > > queues as well?
> > > > >
> > > > > I see no harm in adding for Rx as well. I think, it is better to have
> > > > > separate API for each instead of adding argument as it is fast path
> > > > > API.
> > > > > If so, we could add a new API when there is any PMD implementation or
> > > > > need for this.
> > > >
> > > > I think for RX we already have similar one:
> > > > /** @internal Get number of used descriptors on a receive queue. */
> > > > typedef uint32_t (*eth_rx_queue_count_t)(void *rxq);
> > >
> > > rte_eth_rx_queue_count() gives the number of Rx used descriptors
> > > rte_eth_rx_descriptor_status() gives the status of one Rx descriptor
> > > rte_eth_tx_descriptor_status() gives the status of one Tx descriptor
> > >
> > > This patch is adding a function to get Tx available descriptors,
> > > rte_eth_tx_queue_free_desc_get().
> > > I can see a symmetry with rte_eth_rx_queue_count().
> > > For consistency I would rename it to rte_eth_tx_queue_free_count().
> > >
> > > Should we add rte_eth_tx_queue_count() and rte_eth_rx_queue_free_count()?
> >
> > IMO, rte_eth_rx_queue_free_count() is enough as
> > used count =  total desc number(configured via nb_tx_desc with
> > rte_eth_tx_queue_setup())  - free count
> 
> I'm fine with that.
> 

Yep, agree.
If we ever need rte_eth_rx_queue_free_count() and
rte_eth_tx_queue_used_count(), they could be implemented in the slow path
as Jerin outlined above; there is no need to waste entries in fp_ops
for that.
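
For the RED use case mentioned in the quoted RFC text, a rough sketch of how an application might turn Tx queue occupancy into a drop probability follows. This is a simplification under stated assumptions: classic RED works on a moving average of the queue length, while this version uses the instantaneous used count (ring size minus free count); the function name, thresholds and per-mille scale are all hypothetical.

```c
#include <stdint.h>

/*
 * Simplified RED: map queue occupancy to a drop probability expressed in
 * parts per 1000. Below min_th nothing is dropped, at or above max_th
 * everything is dropped, and in between the probability rises linearly
 * up to max_p_permille. Callers are expected to pass min_th < max_th.
 */
uint32_t
red_drop_permille(uint32_t used, uint32_t min_th, uint32_t max_th,
		  uint32_t max_p_permille)
{
	if (used < min_th)
		return 0;
	if (used >= max_th)
		return 1000;
	/* Here min_th <= used < max_th, so the divisor is non-zero. */
	return (uint32_t)((uint64_t)max_p_permille * (used - min_th) /
			  (max_th - min_th));
}
```

The application would compare the returned value with a random number in [0, 1000) before enqueueing each packet.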



[PATCH v1] net/axgbe: read and save the port property register

2024-01-05 Thread Venkat Kumar Ande
From: Venkat Kumar Ande 

Read and save the port property registers once during the device probe
and then use the saved values as they are needed.

Signed-off-by: Venkat Kumar Ande 
---
 drivers/net/axgbe/axgbe_ethdev.c   | 21 +
 drivers/net/axgbe/axgbe_ethdev.h   |  7 +++
 drivers/net/axgbe/axgbe_phy_impl.c | 68 --
 3 files changed, 48 insertions(+), 48 deletions(-)

diff --git a/drivers/net/axgbe/axgbe_ethdev.c b/drivers/net/axgbe/axgbe_ethdev.c
index f174d46143..3450374535 100644
--- a/drivers/net/axgbe/axgbe_ethdev.c
+++ b/drivers/net/axgbe/axgbe_ethdev.c
@@ -2342,23 +2342,28 @@ eth_axgbe_dev_init(struct rte_eth_dev *eth_dev)
pdata->arcache = AXGBE_DMA_OS_ARCACHE;
pdata->awcache = AXGBE_DMA_OS_AWCACHE;
 
+   /* Read the port property registers */
+   pdata->pp0 = XP_IOREAD(pdata, XP_PROP_0);
+   pdata->pp1 = XP_IOREAD(pdata, XP_PROP_1);
+   pdata->pp2 = XP_IOREAD(pdata, XP_PROP_2);
+   pdata->pp3 = XP_IOREAD(pdata, XP_PROP_3);
+   pdata->pp4 = XP_IOREAD(pdata, XP_PROP_4);
+
/* Set the maximum channels and queues */
-   reg = XP_IOREAD(pdata, XP_PROP_1);
-   pdata->tx_max_channel_count = XP_GET_BITS(reg, XP_PROP_1, MAX_TX_DMA);
-   pdata->rx_max_channel_count = XP_GET_BITS(reg, XP_PROP_1, MAX_RX_DMA);
-   pdata->tx_max_q_count = XP_GET_BITS(reg, XP_PROP_1, MAX_TX_QUEUES);
-   pdata->rx_max_q_count = XP_GET_BITS(reg, XP_PROP_1, MAX_RX_QUEUES);
+   pdata->tx_max_channel_count = XP_GET_BITS(pdata->pp1, XP_PROP_1, MAX_TX_DMA);
+   pdata->rx_max_channel_count = XP_GET_BITS(pdata->pp1, XP_PROP_1, MAX_RX_DMA);
+   pdata->tx_max_q_count = XP_GET_BITS(pdata->pp1, XP_PROP_1, MAX_TX_QUEUES);
+   pdata->rx_max_q_count = XP_GET_BITS(pdata->pp1, XP_PROP_1, MAX_RX_QUEUES);
 
/* Set the hardware channel and queue counts */
axgbe_set_counts(pdata);
 
/* Set the maximum fifo amounts */
-   reg = XP_IOREAD(pdata, XP_PROP_2);
-   pdata->tx_max_fifo_size = XP_GET_BITS(reg, XP_PROP_2, TX_FIFO_SIZE);
+   pdata->tx_max_fifo_size = XP_GET_BITS(pdata->pp2, XP_PROP_2, TX_FIFO_SIZE);
pdata->tx_max_fifo_size *= 16384;
pdata->tx_max_fifo_size = RTE_MIN(pdata->tx_max_fifo_size,
  pdata->vdata->tx_max_fifo_size);
-   pdata->rx_max_fifo_size = XP_GET_BITS(reg, XP_PROP_2, RX_FIFO_SIZE);
+   pdata->rx_max_fifo_size = XP_GET_BITS(pdata->pp2, XP_PROP_2, RX_FIFO_SIZE);
pdata->rx_max_fifo_size *= 16384;
pdata->rx_max_fifo_size = RTE_MIN(pdata->rx_max_fifo_size,
  pdata->vdata->rx_max_fifo_size);
diff --git a/drivers/net/axgbe/axgbe_ethdev.h b/drivers/net/axgbe/axgbe_ethdev.h
index 7f19321d88..df5d63c493 100644
--- a/drivers/net/axgbe/axgbe_ethdev.h
+++ b/drivers/net/axgbe/axgbe_ethdev.h
@@ -539,6 +539,13 @@ struct axgbe_port {
void *xprop_regs;   /* AXGBE property registers */
void *xi2c_regs;/* AXGBE I2C CSRs */
 
+   /* Port property registers */
+   unsigned int pp0;
+   unsigned int pp1;
+   unsigned int pp2;
+   unsigned int pp3;
+   unsigned int pp4;
+
bool cdr_track_early;
/* XPCS indirect addressing lock */
unsigned int xpcs_window_def_reg;
diff --git a/drivers/net/axgbe/axgbe_phy_impl.c b/drivers/net/axgbe/axgbe_phy_impl.c
index d97fbbfddd..44ff28517c 100644
--- a/drivers/net/axgbe/axgbe_phy_impl.c
+++ b/drivers/net/axgbe/axgbe_phy_impl.c
@@ -1709,40 +1709,35 @@ static int axgbe_phy_link_status(struct axgbe_port *pdata, int *an_restart)
 static void axgbe_phy_sfp_gpio_setup(struct axgbe_port *pdata)
 {
struct axgbe_phy_data *phy_data = pdata->phy_data;
-   unsigned int reg;
-
-   reg = XP_IOREAD(pdata, XP_PROP_3);
 
phy_data->sfp_gpio_address = AXGBE_GPIO_ADDRESS_PCA9555 +
-   XP_GET_BITS(reg, XP_PROP_3, GPIO_ADDR);
+   XP_GET_BITS(pdata->pp3, XP_PROP_3, GPIO_ADDR);
 
-   phy_data->sfp_gpio_mask = XP_GET_BITS(reg, XP_PROP_3, GPIO_MASK);
+   phy_data->sfp_gpio_mask = XP_GET_BITS(pdata->pp3, XP_PROP_3, GPIO_MASK);
 
-   phy_data->sfp_gpio_rx_los = XP_GET_BITS(reg, XP_PROP_3,
+   phy_data->sfp_gpio_rx_los = XP_GET_BITS(pdata->pp3, XP_PROP_3,
GPIO_RX_LOS);
-   phy_data->sfp_gpio_tx_fault = XP_GET_BITS(reg, XP_PROP_3,
+   phy_data->sfp_gpio_tx_fault = XP_GET_BITS(pdata->pp3, XP_PROP_3,
  GPIO_TX_FAULT);
-   phy_data->sfp_gpio_mod_absent = XP_GET_BITS(reg, XP_PROP_3,
+   phy_data->sfp_gpio_mod_absent = XP_GET_BITS(pdata->pp3, XP_PROP_3,
GPIO_MOD_ABS);
-   phy_data->sfp_gpio_rate_select = XP_GET_BITS(reg, XP_PROP_3,
+   phy_data->sfp_gpio_rate_select = XP_GET_BITS(pdata->pp3, XP_PROP_3,
 GPIO_RATE_SELECT
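
The pattern in this patch, reading the property registers once at probe and decoding fields from the saved copies afterwards, can be sketched in isolation as below. Everything here is hypothetical: the field layout, the struct names and the fake register file exist only to make the example self-contained, and they do not reflect the real XP_PROP_* layout.

```c
#include <stdint.h>

/* Hypothetical field layout of property register 1, illustration only. */
#define PROP1_MAX_TX_QUEUES_SHIFT 0
#define PROP1_MAX_TX_QUEUES_WIDTH 5
#define PROP1_MAX_RX_QUEUES_SHIFT 5
#define PROP1_MAX_RX_QUEUES_WIDTH 5

#define NB_PROP_REGS 5

#define GET_BITS(val, shift, width) \
	(((val) >> (shift)) & ((1u << (width)) - 1u))

/* Cached copies of the property registers, filled once at probe. */
struct port_props {
	uint32_t pp[NB_PROP_REGS];
};

typedef uint32_t (*reg_read_fn)(void *dev, unsigned int index);

/* Fake register file standing in for real MMIO; it counts the reads. */
struct fake_dev {
	uint32_t regs[NB_PROP_REGS];
	unsigned int reads;
};

uint32_t
fake_reg_read(void *dev, unsigned int index)
{
	struct fake_dev *fd = dev;

	fd->reads++;
	return fd->regs[index];
}

/* Read every property register exactly once and save the values. */
void
port_props_probe(struct port_props *props, reg_read_fn read_reg, void *dev)
{
	unsigned int i;

	for (i = 0; i < NB_PROP_REGS; i++)
		props->pp[i] = read_reg(dev, i);
}

/* Later users decode fields from the saved copy, with no device access. */
unsigned int
port_max_tx_queues(const struct port_props *props)
{
	return GET_BITS(props->pp[1], PROP1_MAX_TX_QUEUES_SHIFT,
			PROP1_MAX_TX_QUEUES_WIDTH);
}

unsigned int
port_max_rx_queues(const struct port_props *props)
{
	return GET_BITS(props->pp[1], PROP1_MAX_RX_QUEUES_SHIFT,
			PROP1_MAX_RX_QUEUES_WIDTH);
}
```

With the cached copy in place, helpers such as the SFP GPIO setup no longer need a local register variable or their own register read.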

[dpdk-dev] [v3] doc: define qualification criteria for external library

2024-01-05 Thread jerinj
From: Jerin Jacob 

Define qualification criteria for external libraries
based on techboard meeting minutes [1] and past
learnings from mailing list discussions.

[1]
http://mails.dpdk.org/archives/dev/2019-June/135847.html
https://mails.dpdk.org/archives/dev/2024-January/284849.html

Signed-off-by: Jerin Jacob 
---
 doc/guides/contributing/index.rst |  1 +
 .../contributing/library_dependency.rst   | 45 +++
 2 files changed, 46 insertions(+)
 create mode 100644 doc/guides/contributing/library_dependency.rst

v3:
- Updated the content based on TB discussion which is documented at
https://mails.dpdk.org/archives/dev/2024-January/284849.html

v2:
- Added "Meson build integration" and "Code readability" sections.

diff --git a/doc/guides/contributing/index.rst b/doc/guides/contributing/index.rst
index dcb9b1fbf0..e5a8c2b0a3 100644
--- a/doc/guides/contributing/index.rst
+++ b/doc/guides/contributing/index.rst
@@ -15,6 +15,7 @@ Contributor's Guidelines
 documentation
 unit_test
 new_library
+library_dependency
 patches
 vulnerability
 stable
diff --git a/doc/guides/contributing/library_dependency.rst b/doc/guides/contributing/library_dependency.rst
new file mode 100644
index 00..4242919475
--- /dev/null
+++ b/doc/guides/contributing/library_dependency.rst
@@ -0,0 +1,45 @@
+.. SPDX-License-Identifier: BSD-3-Clause
+   Copyright(c) 2024 Marvell.
+
+External Library dependency
+===========================
+
+This document defines the qualification criteria for external libraries that may be
+used as dependencies in DPDK drivers or libraries.
+
+#. **Documentation:**
+
+   - Must have adequate documentation for the steps to build it.
+   - Must have clear license documentation on distribution and usage aspects of external library.
+
+#. **Free availability:**
+
+   - The library must be freely available to build in either source or binary form.
+   - It shall be downloadable from a direct link. There shall not be any requirement to explicitly
+     login or sign a user agreement.
+
+#. **Usage License:**
+
+   - Both permissive (e.g., BSD-3 or Apache) and non-permissive (e.g., GPLv3) licenses are acceptable.
+   - In the case of a permissive license, automatic inclusion in the build process is assumed.
+     For non-permissive licenses, an additional build configuration option is required.
+
+#. **Distributions License:**
+
+   - No specific constraints beyond documentation.
+
+#. **Compiler compatibility:**
+
+   - The library must be able to compile with a DPDK supported compiler for the given execution
+     environment. For example, For Linux, the library must be able to compile with GCC and/or clang.
+   - Library may be limited to a specific OS.
+
+#. **Meson build integration:**
+
+   - The library must have standard method like ``pkg-config`` for seamless integration with
+     DPDK's build environment.
+
+#. **Code readability:**
+
+   - Optional dependencies should use stubs to minimize ``ifdef`` clutter, promoting improved
+     code readability.
-- 
2.43.0
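
As an illustration of the "Code readability" criterion, the sketch below shows one way stubs can keep ``ifdef`` out of calling code. HAVE_EXTLIB, extlib.h and extlib_compress() are hypothetical names for an optional external library, and the raw-copy fallback is just one possible policy; none of this comes from the patch.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * The only #ifdef lives here, around a thin wrapper. HAVE_EXTLIB and
 * extlib_compress() are hypothetical names for an optional dependency.
 */
#ifdef HAVE_EXTLIB
#include <extlib.h>

static inline int
codec_compress(const void *src, size_t len, void *dst, size_t cap)
{
	return extlib_compress(src, len, dst, cap);
}
#else
/* Stub used when the optional library is not available at build time. */
static inline int
codec_compress(const void *src, size_t len, void *dst, size_t cap)
{
	(void)src;
	(void)len;
	(void)dst;
	(void)cap;
	return -ENOTSUP;
}
#endif

/*
 * Calling code is free of #ifdef: it handles -ENOTSUP like any other
 * return value, here by falling back to an uncompressed copy.
 * Returns the number of bytes stored, or a negative errno value.
 */
int
store_block(const void *src, size_t len, void *dst, size_t cap)
{
	int ret = codec_compress(src, len, dst, cap);

	if (ret == -ENOTSUP) {
		if (len > cap)
			return -ENOSPC;
		memcpy(dst, src, len);
		return (int)len;
	}
	return ret;
}
```

The build system would define HAVE_EXTLIB only when the dependency is found, so the same source compiles in both configurations.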



Re: [dpdk-dev] [v3] doc: define qualification criteria for external library

2024-01-05 Thread Thomas Monjalon
05/01/2024 13:12, jer...@marvell.com:
> From: Jerin Jacob 
> 
> Define qualification criteria for external library
> based on a techboard meeting minutes [1] and past
> learnings from mailing list discussion.
> 
> [1]
> http://mails.dpdk.org/archives/dev/2019-June/135847.html
> https://mails.dpdk.org/archives/dev/2024-January/284849.html
[...]
> +#. **Documentation:**
> +
> +   - Must have adequate documentation for the steps to build it.
> +   - Must have clear license documentation on distribution and usage aspects 
> of external library.
> +
> +#. **Free availability:**
> +
> +   - The library must be freely available to build in either source or 
> binary form.
> +   - It shall be downloadable from a direct link. There shall not be any 
> requirement to explicitly
> + login or sign a user agreement.
> +
> +#. **Usage License:**
> +
> +   - Both permissive (e.g., BSD-3 or Apache) and non-permissive (e.g., 
> GPLv3) licenses are acceptable.
> +   - In the case of a permissive license, automatic inclusion in the build 
> process is assumed.
> + For non-permissive licenses, an additional build configuration option 
> is required.
> +
> +#. **Distributions License:**
> +
> +   - No specific constraints beyond documentation.
> +
> +#. **Compiler compatibility:**
> +
> +   - The library must be able to compile with a DPDK supported compiler for 
> the given execution
> + environment. For example, For Linux, the library must be able to 
> compile with GCC and/or clang.

Please go to next line when starting a sentence.
There is an extra uppercasing in "For Linux".

> +   - Library may be limited to a specific OS.
> +
> +#. **Meson build integration:**
> +
> +   - The library must have standard method like ``pkg-config`` for seamless 
> integration with
> + DPDK's build environment.
> +
> +#. **Code readability:**
> +
> +   - Optional dependencies should use stubs to minimize ``ifdef`` clutter, 
> promoting improved
> + code readability.


Acked-by: Thomas Monjalon 




[dpdk-dev] [v4] doc: define qualification criteria for external library

2024-01-05 Thread jerinj
From: Jerin Jacob 

Define qualification criteria for external libraries
based on techboard meeting minutes [1] and past
learnings from mailing list discussions.

[1]
http://mails.dpdk.org/archives/dev/2019-June/135847.html
https://mails.dpdk.org/archives/dev/2024-January/284849.html

Signed-off-by: Jerin Jacob 
Acked-by: Thomas Monjalon 
---
 doc/guides/contributing/index.rst |  1 +
 .../contributing/library_dependency.rst   | 46 +++
 2 files changed, 47 insertions(+)
 create mode 100644 doc/guides/contributing/library_dependency.rst

v4:
- Address Thomas's comments from 
https://patches.dpdk.org/project/dpdk/patch/20240105121215.3950532-1-jer...@marvell.com/

v3:
- Updated the content based on TB discussion which is documented at
https://mails.dpdk.org/archives/dev/2024-January/284849.html

v2:
- Added "Meson build integration" and "Code readability" sections.


diff --git a/doc/guides/contributing/index.rst b/doc/guides/contributing/index.rst
index dcb9b1fbf0..e5a8c2b0a3 100644
--- a/doc/guides/contributing/index.rst
+++ b/doc/guides/contributing/index.rst
@@ -15,6 +15,7 @@ Contributor's Guidelines
 documentation
 unit_test
 new_library
+library_dependency
 patches
 vulnerability
 stable
diff --git a/doc/guides/contributing/library_dependency.rst b/doc/guides/contributing/library_dependency.rst
new file mode 100644
index 00..367e380a89
--- /dev/null
+++ b/doc/guides/contributing/library_dependency.rst
@@ -0,0 +1,46 @@
+.. SPDX-License-Identifier: BSD-3-Clause
+   Copyright(c) 2024 Marvell.
+
+External Library dependency
+===========================
+
+This document defines the qualification criteria for external libraries that may be
+used as dependencies in DPDK drivers or libraries.
+
+#. **Documentation:**
+
+   - Must have adequate documentation for the steps to build it.
+   - Must have clear license documentation on distribution and usage aspects of the external library.
+
+#. **Free availability:**
+
+   - The library must be freely available to build in either source or binary form.
+   - It shall be downloadable from a direct link. There shall not be any requirement to explicitly
+     log in or sign a user agreement.
+
+#. **Usage License:**
+
+   - Both permissive (e.g., BSD-3 or Apache) and non-permissive (e.g., GPLv3) licenses are acceptable.
+   - In the case of a permissive license, automatic inclusion in the build process is assumed.
+     For non-permissive licenses, an additional build configuration option is required.
+
+#. **Distribution License:**
+
+   - No specific constraints beyond documentation.
+
+#. **Compiler compatibility:**
+
+   - The library must be able to compile with a DPDK-supported compiler for the given execution
+     environment.
+     For example, for Linux, the library must be able to compile with GCC and/or clang.
+   - The library may be limited to a specific OS.
+
+#. **Meson build integration:**
+
+   - The library must have a standard method like ``pkg-config`` for seamless integration with
+     DPDK's build environment.
+
+#. **Code readability:**
+
+   - Optional dependencies should use stubs to minimize ``ifdef`` clutter, promoting improved
+     code readability.
-- 
2.43.0
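
To illustrate the "Code readability" criterion in the patch above, here is a minimal sketch of the stub pattern. The macro `HAVE_LIBFOO`, the header `foo.h`, and the functions `foo_checksum()` and `driver_process()` are hypothetical names invented for this example. They are not part of DPDK or of any real library.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical optional dependency "libfoo". HAVE_LIBFOO is assumed to be
 * defined by the build system when the library is found (for example
 * through pkg-config).
 */
#ifdef HAVE_LIBFOO
#include <foo.h>	/* the real foo_checksum() prototype would come from here */
#else
/* Stub with the same signature, so callers need no #ifdef of their own. */
static inline int
foo_checksum(const uint8_t *buf, size_t len, uint32_t *out)
{
	(void)buf;
	(void)len;
	(void)out;
	return -ENOTSUP;
}
#endif

/* Caller code is identical whether or not libfoo is available. */
static int
driver_process(const uint8_t *buf, size_t len, uint32_t *csum)
{
	int ret;

	assert(buf != NULL && csum != NULL);

	ret = foo_checksum(buf, len, csum);
	if (ret == -ENOTSUP)
		*csum = 0;	/* feature unavailable: report "no checksum" */
	return ret;
}
```

The single `#ifdef` lives next to the dependency, and every call site stays free of conditional compilation.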



[PATCH v3 0/3] net/ice: simplified to 3 layer Tx scheduler

2024-01-05 Thread Qi Zhang
Remove dummy layers, refactor code, complete documentation.

v3:
- fix tm_node memory free.
- fix corruption when sibling node deletion is not in reverse order.

v2:
- fix typos.

Qi Zhang (3):
  net/ice: hide port and TC layer in Tx sched tree
  net/ice: refactor tm config data structure
  doc: update ice document for qos

 doc/guides/nics/ice.rst  |  19 +++
 drivers/net/ice/ice_ethdev.h |  12 +-
 drivers/net/ice/ice_tm.c | 313 +--
 3 files changed, 132 insertions(+), 212 deletions(-)

-- 
2.31.1



[PATCH v3 1/3] net/ice: hide port and TC layer in Tx sched tree

2024-01-05 Thread Qi Zhang
In the current 5-layer tree implementation, the port and TC layers
are not configurable, so it is not necessary to expose them to the application.

The patch hides the top 2 layers and represents the root of the tree at the
VSI layer. From the application's point of view, it is a 3-layer scheduler tree:

Port -> Queue Group -> Queue.

Signed-off-by: Qi Zhang 
Acked-by: Wenjun Wu 
---
 drivers/net/ice/ice_ethdev.h |  7 
 drivers/net/ice/ice_tm.c | 79 
 2 files changed, 7 insertions(+), 79 deletions(-)

diff --git a/drivers/net/ice/ice_ethdev.h b/drivers/net/ice/ice_ethdev.h
index fa4981ed14..ae22c29ffc 100644
--- a/drivers/net/ice/ice_ethdev.h
+++ b/drivers/net/ice/ice_ethdev.h
@@ -470,7 +470,6 @@ struct ice_tm_shaper_profile {
 struct ice_tm_node {
TAILQ_ENTRY(ice_tm_node) node;
uint32_t id;
-   uint32_t tc;
uint32_t priority;
uint32_t weight;
uint32_t reference_count;
@@ -484,8 +483,6 @@ struct ice_tm_node {
 /* node type of Traffic Manager */
 enum ice_tm_node_type {
ICE_TM_NODE_TYPE_PORT,
-   ICE_TM_NODE_TYPE_TC,
-   ICE_TM_NODE_TYPE_VSI,
ICE_TM_NODE_TYPE_QGROUP,
ICE_TM_NODE_TYPE_QUEUE,
ICE_TM_NODE_TYPE_MAX,
@@ -495,12 +492,8 @@ enum ice_tm_node_type {
 struct ice_tm_conf {
struct ice_shaper_profile_list shaper_profile_list;
struct ice_tm_node *root; /* root node - port */
-   struct ice_tm_node_list tc_list; /* node list for all the TCs */
-   struct ice_tm_node_list vsi_list; /* node list for all the VSIs */
struct ice_tm_node_list qgroup_list; /* node list for all the queue groups */
struct ice_tm_node_list queue_list; /* node list for all the queues */
-   uint32_t nb_tc_node;
-   uint32_t nb_vsi_node;
uint32_t nb_qgroup_node;
uint32_t nb_queue_node;
bool committed;
diff --git a/drivers/net/ice/ice_tm.c b/drivers/net/ice/ice_tm.c
index b570798f07..7ae68c683b 100644
--- a/drivers/net/ice/ice_tm.c
+++ b/drivers/net/ice/ice_tm.c
@@ -43,12 +43,8 @@ ice_tm_conf_init(struct rte_eth_dev *dev)
/* initialize node configuration */
TAILQ_INIT(&pf->tm_conf.shaper_profile_list);
pf->tm_conf.root = NULL;
-   TAILQ_INIT(&pf->tm_conf.tc_list);
-   TAILQ_INIT(&pf->tm_conf.vsi_list);
TAILQ_INIT(&pf->tm_conf.qgroup_list);
TAILQ_INIT(&pf->tm_conf.queue_list);
-   pf->tm_conf.nb_tc_node = 0;
-   pf->tm_conf.nb_vsi_node = 0;
pf->tm_conf.nb_qgroup_node = 0;
pf->tm_conf.nb_queue_node = 0;
pf->tm_conf.committed = false;
@@ -72,16 +68,6 @@ ice_tm_conf_uninit(struct rte_eth_dev *dev)
rte_free(tm_node);
}
pf->tm_conf.nb_qgroup_node = 0;
-   while ((tm_node = TAILQ_FIRST(&pf->tm_conf.vsi_list))) {
-   TAILQ_REMOVE(&pf->tm_conf.vsi_list, tm_node, node);
-   rte_free(tm_node);
-   }
-   pf->tm_conf.nb_vsi_node = 0;
-   while ((tm_node = TAILQ_FIRST(&pf->tm_conf.tc_list))) {
-   TAILQ_REMOVE(&pf->tm_conf.tc_list, tm_node, node);
-   rte_free(tm_node);
-   }
-   pf->tm_conf.nb_tc_node = 0;
if (pf->tm_conf.root) {
rte_free(pf->tm_conf.root);
pf->tm_conf.root = NULL;
@@ -93,8 +79,6 @@ ice_tm_node_search(struct rte_eth_dev *dev,
uint32_t node_id, enum ice_tm_node_type *node_type)
 {
struct ice_pf *pf = ICE_DEV_PRIVATE_TO_PF(dev->data->dev_private);
-   struct ice_tm_node_list *tc_list = &pf->tm_conf.tc_list;
-   struct ice_tm_node_list *vsi_list = &pf->tm_conf.vsi_list;
struct ice_tm_node_list *qgroup_list = &pf->tm_conf.qgroup_list;
struct ice_tm_node_list *queue_list = &pf->tm_conf.queue_list;
struct ice_tm_node *tm_node;
@@ -104,20 +88,6 @@ ice_tm_node_search(struct rte_eth_dev *dev,
return pf->tm_conf.root;
}
 
-   TAILQ_FOREACH(tm_node, tc_list, node) {
-   if (tm_node->id == node_id) {
-   *node_type = ICE_TM_NODE_TYPE_TC;
-   return tm_node;
-   }
-   }
-
-   TAILQ_FOREACH(tm_node, vsi_list, node) {
-   if (tm_node->id == node_id) {
-   *node_type = ICE_TM_NODE_TYPE_VSI;
-   return tm_node;
-   }
-   }
-
TAILQ_FOREACH(tm_node, qgroup_list, node) {
if (tm_node->id == node_id) {
*node_type = ICE_TM_NODE_TYPE_QGROUP;
@@ -371,6 +341,8 @@ ice_shaper_profile_del(struct rte_eth_dev *dev,
return 0;
 }
 
+#define MAX_QUEUE_PER_GROUP 8
+
 static int
 ice_tm_node_add(struct rte_eth_dev *dev, uint32_t node_id,
  uint32_t parent_node_id, uint32_t priority,
@@ -384,8 +356,6 @@ ice_tm_node_add(struct rte_eth_dev *dev, uint32_t node_id,
struct ice_tm_shaper_profile *shaper_profile = NULL;
struct ice_tm_node *tm_node;
struct ice_

[PATCH v3 2/3] net/ice: refactor tm config data structure

2024-01-05 Thread Qi Zhang
Simplify struct ice_tm_conf by removing the per-level node lists.

Signed-off-by: Qi Zhang 
---
 drivers/net/ice/ice_ethdev.h |   5 +-
 drivers/net/ice/ice_tm.c | 244 ---
 2 files changed, 111 insertions(+), 138 deletions(-)

diff --git a/drivers/net/ice/ice_ethdev.h b/drivers/net/ice/ice_ethdev.h
index ae22c29ffc..008a7a23b9 100644
--- a/drivers/net/ice/ice_ethdev.h
+++ b/drivers/net/ice/ice_ethdev.h
@@ -472,6 +472,7 @@ struct ice_tm_node {
uint32_t id;
uint32_t priority;
uint32_t weight;
+   uint32_t level;
uint32_t reference_count;
struct ice_tm_node *parent;
struct ice_tm_node **children;
@@ -492,10 +493,6 @@ enum ice_tm_node_type {
 struct ice_tm_conf {
struct ice_shaper_profile_list shaper_profile_list;
struct ice_tm_node *root; /* root node - port */
-   struct ice_tm_node_list qgroup_list; /* node list for all the queue groups */
-   struct ice_tm_node_list queue_list; /* node list for all the queues */
-   uint32_t nb_qgroup_node;
-   uint32_t nb_queue_node;
bool committed;
bool clear_on_fail;
 };
diff --git a/drivers/net/ice/ice_tm.c b/drivers/net/ice/ice_tm.c
index 7ae68c683b..c579662843 100644
--- a/drivers/net/ice/ice_tm.c
+++ b/drivers/net/ice/ice_tm.c
@@ -6,6 +6,9 @@
 #include "ice_ethdev.h"
 #include "ice_rxtx.h"
 
+#define MAX_CHILDREN_PER_SCHED_NODE 8
+#define MAX_CHILDREN_PER_TM_NODE   256
+
 static int ice_hierarchy_commit(struct rte_eth_dev *dev,
 int clear_on_fail,
 __rte_unused struct rte_tm_error *error);
@@ -43,66 +46,30 @@ ice_tm_conf_init(struct rte_eth_dev *dev)
/* initialize node configuration */
TAILQ_INIT(&pf->tm_conf.shaper_profile_list);
pf->tm_conf.root = NULL;
-   TAILQ_INIT(&pf->tm_conf.qgroup_list);
-   TAILQ_INIT(&pf->tm_conf.queue_list);
-   pf->tm_conf.nb_qgroup_node = 0;
-   pf->tm_conf.nb_queue_node = 0;
pf->tm_conf.committed = false;
pf->tm_conf.clear_on_fail = false;
 }
 
-void
-ice_tm_conf_uninit(struct rte_eth_dev *dev)
+static void free_node(struct ice_tm_node *root)
 {
-   struct ice_pf *pf = ICE_DEV_PRIVATE_TO_PF(dev->data->dev_private);
-   struct ice_tm_node *tm_node;
+   uint32_t i;
 
-   /* clear node configuration */
-   while ((tm_node = TAILQ_FIRST(&pf->tm_conf.queue_list))) {
-   TAILQ_REMOVE(&pf->tm_conf.queue_list, tm_node, node);
-   rte_free(tm_node);
-   }
-   pf->tm_conf.nb_queue_node = 0;
-   while ((tm_node = TAILQ_FIRST(&pf->tm_conf.qgroup_list))) {
-   TAILQ_REMOVE(&pf->tm_conf.qgroup_list, tm_node, node);
-   rte_free(tm_node);
-   }
-   pf->tm_conf.nb_qgroup_node = 0;
-   if (pf->tm_conf.root) {
-   rte_free(pf->tm_conf.root);
-   pf->tm_conf.root = NULL;
-   }
+   if (root == NULL)
+   return;
+
+   for (i = 0; i < root->reference_count; i++)
+   free_node(root->children[i]);
+
+   rte_free(root);
 }
 
-static inline struct ice_tm_node *
-ice_tm_node_search(struct rte_eth_dev *dev,
-   uint32_t node_id, enum ice_tm_node_type *node_type)
+void
+ice_tm_conf_uninit(struct rte_eth_dev *dev)
 {
struct ice_pf *pf = ICE_DEV_PRIVATE_TO_PF(dev->data->dev_private);
-   struct ice_tm_node_list *qgroup_list = &pf->tm_conf.qgroup_list;
-   struct ice_tm_node_list *queue_list = &pf->tm_conf.queue_list;
-   struct ice_tm_node *tm_node;
-
-   if (pf->tm_conf.root && pf->tm_conf.root->id == node_id) {
-   *node_type = ICE_TM_NODE_TYPE_PORT;
-   return pf->tm_conf.root;
-   }
 
-   TAILQ_FOREACH(tm_node, qgroup_list, node) {
-   if (tm_node->id == node_id) {
-   *node_type = ICE_TM_NODE_TYPE_QGROUP;
-   return tm_node;
-   }
-   }
-
-   TAILQ_FOREACH(tm_node, queue_list, node) {
-   if (tm_node->id == node_id) {
-   *node_type = ICE_TM_NODE_TYPE_QUEUE;
-   return tm_node;
-   }
-   }
-
-   return NULL;
+   free_node(pf->tm_conf.root);
+   pf->tm_conf.root = NULL;
 }
 
 static int
@@ -195,11 +162,29 @@ ice_node_param_check(struct ice_pf *pf, uint32_t node_id,
return 0;
 }
 
+static struct ice_tm_node *
+find_node(struct ice_tm_node *root, uint32_t id)
+{
+   uint32_t i;
+
+   if (root == NULL || root->id == id)
+   return root;
+
+   for (i = 0; i < root->reference_count; i++) {
+   struct ice_tm_node *node = find_node(root->children[i], id);
+
+   if (node)
+   return node;
+   }
+
+   return NULL;
+}
+
 static int
 ice_node_type_get(struct rte_eth_dev *dev, uint32_t node_id,
   int *is_leaf, struct rte_tm_error *error)
 {
-
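
The recursive lookup and teardown that patch 2/3 above introduces (`find_node()` and `free_node()`) can be tried outside the driver with a small standalone sketch. This is not the ice driver code. `struct tm_node`, `node_add()`, `node_find()`, `node_free()` and the fixed `MAX_CHILDREN` array are simplifications assumed for illustration, and `nb_children` plays the role that `reference_count` plays in the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_CHILDREN 8	/* hypothetical fan-out limit for this sketch */

struct tm_node {
	uint32_t id;
	uint32_t nb_children;	/* number of valid entries in children[] */
	struct tm_node *parent;
	struct tm_node *children[MAX_CHILDREN];
};

/* Allocate a node and attach it to parent (parent == NULL creates a root). */
static struct tm_node *
node_add(struct tm_node *parent, uint32_t id)
{
	struct tm_node *n;

	assert(parent == NULL || parent->nb_children <= MAX_CHILDREN);
	if (parent != NULL && parent->nb_children == MAX_CHILDREN)
		return NULL;

	n = calloc(1, sizeof(*n));
	if (n == NULL)
		return NULL;
	n->id = id;
	n->parent = parent;
	if (parent != NULL)
		parent->children[parent->nb_children++] = n;
	return n;
}

/* Depth-first search from root; returns NULL when the id is not present. */
static struct tm_node *
node_find(struct tm_node *root, uint32_t id)
{
	uint32_t i;

	if (root == NULL || root->id == id)
		return root;

	for (i = 0; i < root->nb_children; i++) {
		struct tm_node *n = node_find(root->children[i], id);

		if (n != NULL)
			return n;
	}
	return NULL;
}

/* Post-order free: children are released before their parent. */
static void
node_free(struct tm_node *root)
{
	uint32_t i;

	if (root == NULL)
		return;
	for (i = 0; i < root->nb_children; i++)
		node_free(root->children[i]);
	free(root);
}
```

With a single root pointer, one recursive walk replaces the per-level lists and counters, at the cost of a tree traversal for every lookup.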

[PATCH v3 3/3] doc: update ice document for qos

2024-01-05 Thread Qi Zhang
Add description for ice PMD's rte_tm capabilities.

Signed-off-by: Qi Zhang 
Acked-by: Wenjun Wu 
---
 doc/guides/nics/ice.rst | 19 +++
 1 file changed, 19 insertions(+)

diff --git a/doc/guides/nics/ice.rst b/doc/guides/nics/ice.rst
index bafb3ba022..3d381a266b 100644
--- a/doc/guides/nics/ice.rst
+++ b/doc/guides/nics/ice.rst
@@ -352,6 +352,25 @@ queue 3 using a raw pattern::
 
 Currently, raw pattern support is limited to the FDIR and Hash engines.
 
+Traffic Management Support
+~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The ice PMD supports the Traffic Management API (RTE_TM), allowing
+users to offload a 3-layer Tx scheduler on the E810 NIC:
+
+- ``Port Layer``
+
+  The root layer. It supports peak bandwidth configuration and up to 32 children.
+
+- ``Queue Group Layer``
+
+  The middle layer. It supports peak / committed bandwidth, weight, and priority
+  configuration, and up to 8 children.
+
+- ``Queue Layer``
+
+  The leaf layer. It supports peak / committed bandwidth, weight, and priority configuration.
+
 Additional Options
 ++++++++++++++++++
 
-- 
2.31.1



RE: [PATCH] net/ice: refine queue start stop

2024-01-05 Thread Zhang, Qi Z



> -Original Message-
> From: Wu, Wenjun1 
> Sent: Friday, January 5, 2024 2:03 PM
> To: Zhang, Qi Z ; Yang, Qiming
> 
> Cc: dev@dpdk.org
> Subject: RE: [PATCH] net/ice: refine queue start stop
> 
> > -Original Message-
> > From: Zhang, Qi Z 
> > Sent: Friday, January 5, 2024 9:37 PM
> > To: Yang, Qiming ; Wu, Wenjun1
> > 
> > Cc: dev@dpdk.org; Zhang, Qi Z 
> > Subject: [PATCH] net/ice: refine queue start stop
> >
> > Not necessary to return fail when starting or stopping a queue if the
> > queue was already at required state.
> >
> > Signed-off-by: Qi Zhang 
> > ---
> >  drivers/net/ice/ice_rxtx.c | 16 
> >  1 file changed, 16 insertions(+)
> >
> > diff --git a/drivers/net/ice/ice_rxtx.c b/drivers/net/ice/ice_rxtx.c
> > index 73e47ae92d..3286bb08fe 100644
> > --- a/drivers/net/ice/ice_rxtx.c
> > +++ b/drivers/net/ice/ice_rxtx.c
> > @@ -673,6 +673,10 @@ ice_rx_queue_start(struct rte_eth_dev *dev,
> > uint16_t rx_queue_id)
> > return -EINVAL;
> > }
> >
> > +   if (dev->data->rx_queue_state[rx_queue_id] ==
> > +   RTE_ETH_QUEUE_STATE_STARTED)
> > +   return 0;
> > +
> > if (dev->data->dev_conf.rxmode.offloads &
> > RTE_ETH_RX_OFFLOAD_TIMESTAMP)
> > rxq->ts_enable = true;
> > err = ice_program_hw_rx_queue(rxq);
> > @@ -717,6 +721,10 @@ ice_rx_queue_stop(struct rte_eth_dev *dev,
> > uint16_t rx_queue_id)
> > if (rx_queue_id < dev->data->nb_rx_queues) {
> > rxq = dev->data->rx_queues[rx_queue_id];
> >
> > +   if (dev->data->rx_queue_state[rx_queue_id] ==
> > +   RTE_ETH_QUEUE_STATE_STOPPED)
> > +   return 0;
> > +
> > err = ice_switch_rx_queue(hw, rxq->reg_idx, false);
> > if (err) {
> > PMD_DRV_LOG(ERR, "Failed to switch RX queue %u off",
> > @@ -758,6 +766,10 @@ ice_tx_queue_start(struct rte_eth_dev *dev, uint16_t
> > tx_queue_id)
> > return -EINVAL;
> > }
> >
> > +   if (dev->data->tx_queue_state[tx_queue_id] ==
> > +   RTE_ETH_QUEUE_STATE_STARTED)
> > +   return 0;
> > +
> > buf_len = ice_struct_size(txq_elem, txqs, 1);
> > txq_elem = ice_malloc(hw, buf_len);
> > if (!txq_elem)
> > @@ -1066,6 +1078,10 @@ ice_tx_queue_stop(struct rte_eth_dev *dev,
> > uint16_t tx_queue_id)
> > return -EINVAL;
> > }
> >
> > +   if (dev->data->tx_queue_state[tx_queue_id] ==
> > +   RTE_ETH_QUEUE_STATE_STOPPED)
> > +   return 0;
> > +
> > q_ids[0] = txq->reg_idx;
> > q_teids[0] = txq->q_teid;
> >
> > --
> > 2.31.1
> 
> Acked-by: Wenjun Wu 

Applied to dpdk-next-net-intel.

Thanks
Qi
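
For readers skimming the thread, the behaviour change in the applied patch can be sketched outside the driver as follows. This is a simplified model, not the ice code. `struct port`, `NB_QUEUES`, the state enum and the `hw_programs` counter are hypothetical names used only to make the early return observable.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define NB_QUEUES 4	/* hypothetical queue count for this sketch */

enum queue_state { QUEUE_STOPPED = 0, QUEUE_STARTED = 1 };

struct port {
	enum queue_state rx_state[NB_QUEUES];
	unsigned int hw_programs;	/* counts simulated hardware accesses */
};

static int
rx_queue_start(struct port *p, uint16_t qid)
{
	if (qid >= NB_QUEUES)
		return -EINVAL;

	/* Already in the requested state: succeed without touching hardware. */
	if (p->rx_state[qid] == QUEUE_STARTED)
		return 0;

	p->hw_programs++;	/* stands in for programming the queue */
	p->rx_state[qid] = QUEUE_STARTED;
	return 0;
}

static int
rx_queue_stop(struct port *p, uint16_t qid)
{
	if (qid >= NB_QUEUES)
		return -EINVAL;

	if (p->rx_state[qid] == QUEUE_STOPPED)
		return 0;

	p->hw_programs++;	/* stands in for switching the queue off */
	p->rx_state[qid] = QUEUE_STOPPED;
	return 0;
}
```

Calling start or stop twice in a row returns 0 both times, and the second call does no hardware work.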


[Bug 1342] net/i40e rejects packet without any Tx offload on Tx prepare

2024-01-05 Thread bugzilla
https://bugs.dpdk.org/show_bug.cgi?id=1342

Bug ID: 1342
   Summary: net/i40e rejects packet without any Tx offload on Tx
prepare
   Product: DPDK
   Version: 23.11
  Hardware: All
OS: All
Status: UNCONFIRMED
  Severity: normal
  Priority: Normal
 Component: ethdev
  Assignee: dev@dpdk.org
  Reporter: andrew.rybche...@oktetlabs.ru
  Target Milestone: ---

net/i40e rejects packet without any Tx offload on Tx prepare

Tx is configured to use no offloads, so the driver uses the simple Tx prepare
callback.

The mbuf has the following flags set: TX_L4_NO_CKSUM | TX_IPV4 | TX_OUTER_IPV4 |
TX_TUNNEL_VXLAN | RX_IP_CKSUM_UNKNOWN | RX_L4_CKSUM_UNKNOWN.
See logs:
https://ts-factory.io/bublik/v2/log/362398?focusId=368929&mode=treeAndinfoAndlog&experimental=true&lineNumber=1_70

These flags do not request any Tx offloads, just specify that it is a VXLAN
packet with inner and outer IPv4.

However, Tx prepare rejects it:
https://ts-factory.io/bublik/v2/log/362398?focusId=368929&mode=treeAndinfoAndlog&experimental=true&lineNumber=1_89

The logs above are the result of a test suite run at UNH IOL.

I guess the problem is TX_TUNNEL_VXLAN, and that adding
RTE_MBUF_F_TX_TUNNEL_MASK to I40E_TX_OFFLOAD_SIMPLE_SUP_MASK will solve the
problem.
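
The suspected mechanism can be sketched with a plain mask check. The bit positions below are hypothetical and chosen only for this example. They are not the real `RTE_MBUF_F_TX_*` values, and `simple_tx_prepare()` is a simplified stand-in, not the i40e function.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical flag layout for this sketch (not the DPDK definitions). */
#define F_TX_IPV4         (UINT64_C(1) << 0)
#define F_TX_OUTER_IPV4   (UINT64_C(1) << 1)
#define F_TX_IP_CKSUM     (UINT64_C(1) << 2)	/* a real offload request */
#define F_TX_TUNNEL_VXLAN (UINT64_C(1) << 8)	/* one value of the tunnel field */
#define F_TX_TUNNEL_MASK  (UINT64_C(0xF) << 8)	/* whole tunnel type field */

/* Supported mask without and with the tunnel type field. */
#define SIMPLE_SUP_MASK_OLD (F_TX_IPV4 | F_TX_OUTER_IPV4)
#define SIMPLE_SUP_MASK_NEW (SIMPLE_SUP_MASK_OLD | F_TX_TUNNEL_MASK)

/* Reject a packet if any flag outside the supported mask is set. */
static int
simple_tx_prepare(uint64_t ol_flags, uint64_t sup_mask)
{
	if (ol_flags & ~sup_mask)
		return -ENOTSUP;
	return 0;
}
```

With the old mask, a packet that only describes itself as VXLAN is rejected. With the tunnel field added to the mask it passes, while a genuine offload request such as `F_TX_IP_CKSUM` is still refused.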

-- 
You are receiving this mail because:
You are the assignee for the bug.

[Bug 1343] net/i40e does not drop packet with too many segments

2024-01-05 Thread bugzilla
https://bugs.dpdk.org/show_bug.cgi?id=1343

Bug ID: 1343
   Summary: net/i40e does not drop packet with too many segments
   Product: DPDK
   Version: 23.11
  Hardware: All
OS: All
Status: UNCONFIRMED
  Severity: normal
  Priority: Normal
 Component: ethdev
  Assignee: dev@dpdk.org
  Reporter: andrew.rybche...@oktetlabs.ru
  Target Milestone: ---

net/i40e does not drop packet with too many segments

The test case is a bit artificial, but still makes sense. If the application sends a
packet with too many segments to fit in the Tx ring (or in the free space of the Tx
ring), the packet is reported as sent anyway.

The packet is counted neither as sent in stats nor in output errors.

The following errors appear in logs:
i40e_dev_alarm_handler(): ICR0: malicious programming detected
i40e_handle_mdd_event(): Malicious Driver Detection event 0x02 on TX queue 1 PF
number 0x00 VF number 0x00
 device :01:00.0

This should be reproducible with testpmd, which allows sending a packet with many
segments. E.g. set up a Tx queue with 64 descriptors and try to send a packet with
65 segments.

IMHO the right behaviour in this case is to report the packet as transmitted in the Tx
burst return value, but drop it and count it in oerrors. (Otherwise Tx could get
stuck.)

Test logs (run at UNH IOL):
https://ts-factory.io/bublik/v2/log/362398?focusId=368819&mode=treeAndinfoAndlog&experimental=true&lineNumber=1_60

The error mentioned above appears in the logs with a delay (in the next test):
https://ts-factory.io/bublik/v2/log/362398?focusId=368820&mode=treeAndinfoAndlog&experimental=true&lineNumber=1_32
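
The behaviour proposed above (consume the packet, drop it, count it in oerrors) can be sketched with a toy Tx queue. This is not i40e code. `struct pkt`, `struct txq` and `tx_burst()` are hypothetical, and the choice to drop only packets that can never fit in the ring, while stopping the burst for packets that merely do not fit right now, is one possible reading of the report.

```c
#include <assert.h>
#include <stdint.h>

struct pkt {
	uint16_t nb_segs;	/* number of segments in the packet */
};

struct txq {
	uint16_t nb_desc;	/* ring size in descriptors */
	uint16_t nb_free;	/* descriptors currently free */
	uint64_t opackets;
	uint64_t oerrors;
};

/*
 * Returns the number of packets consumed from pkts[].
 * A packet with more segments than the whole ring is consumed and counted
 * in oerrors, so the caller does not retry it forever. A packet that does
 * not fit in the current free space stops the burst so it can be retried.
 */
static uint16_t
tx_burst(struct txq *q, const struct pkt *pkts, uint16_t nb_pkts)
{
	uint16_t i;

	for (i = 0; i < nb_pkts; i++) {
		if (pkts[i].nb_segs > q->nb_desc) {
			q->oerrors++;	/* a real driver would also free the mbuf */
			continue;
		}
		if (pkts[i].nb_segs > q->nb_free)
			break;

		q->nb_free = (uint16_t)(q->nb_free - pkts[i].nb_segs);
		q->opackets++;
	}
	return i;
}
```

With a 64-descriptor ring, a 65-segment packet followed by a 2-segment packet yields a return value of 2, one oerror and one opacket.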

-- 
You are receiving this mail because:
You are the assignee for the bug.

RE: [External] : Re: [PATCH] net/tap: Modified TAP BPF program as per the new Kernel-version upgrade requirements.

2024-01-05 Thread Madhuker Mythri
Hi Stephen,

The BPF helper man page does imply that, and direct SKB data pointer access did 
work up to the 5.4 kernel. However, from kernel 5.15 onwards the eBPF verifier 
throws an error when we use SKB data pointer access.
So I used these helper functions and was able to resolve the errors. The 
helper functions are safe to use and also protect against non-linear skb data 
buffer access.

So I think using the helper functions is a better and safer way to access the SKB 
data than direct pointer access.
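
As a rough user-space analogy of the difference being discussed (this is not kernel or eBPF code), a copy-based accessor does the bounds check in one place, whereas direct pointer access requires every call site to prove its own bounds. `struct fake_skb` and `load_bytes()` are hypothetical names, and the `-EFAULT` return value is an assumption of this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a linear packet buffer. */
struct fake_skb {
	const uint8_t *data;
	size_t len;
};

/*
 * Copy len bytes starting at offset into 'to'.
 * Returns 0 on success, or -EFAULT if the range is outside the buffer.
 */
static int
load_bytes(const struct fake_skb *skb, size_t offset, void *to, size_t len)
{
	assert(skb != NULL && to != NULL);

	/* Written this way to avoid overflow in offset + len. */
	if (offset > skb->len || len > skb->len - offset)
		return -EFAULT;

	memcpy(to, skb->data + offset, len);
	return 0;
}
```

The caller only checks one return value per read, which is the property the reply above relies on.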

Thanks,
Madhuker.

-Original Message-
From: Stephen Hemminger  
Sent: 05 January 2024 02:27
To: Madhuker Mythri 
Cc: ferruh.yi...@amd.com; dev@dpdk.org
Subject: [External] : Re: [PATCH] net/tap: Modified TAP BPF program as per the 
new Kernel-version upgrade requirements.

On Thu,  4 Jan 2024 22:57:56 +0530
madhuker.myt...@oracle.com wrote:

> 
> 
> RCA:  These errors started coming after from the Kernel-5.15 version, in 
> which lots of new BPF verification restrictions were added for safe execution 
> of byte-code on to the Kernel, due to which existing BPF program verification 
> does not pass.
> Here are the major BPF verifier restrictions observed:
> 1) Need to use new BPF maps structure.
> 2) Kernel SKB data pointer access not allowed.

I noticed you are now using bpf_skb_load_bytes(), but the bpf helper man page 
implies it is not needed.

 long bpf_skb_load_bytes(const void *skb, u32 offset, void *to,
   u32 len)

  Description
 This helper was provided as an easy way to load
 data from a packet. It can be used to load len
 bytes from offset from the packet associated to
 skb, into the buffer pointed by to.

 Since Linux 4.7, usage of this helper has mostly
 been replaced by "direct packet access", enabling
 packet data to be manipulated with skb->data and
 skb->data_end pointing respectively to the first
 byte of packet data and to the byte after the last
 byte of packet data. However, it remains useful if
 one wishes to read large quantities of data at once
 from a packet into the eBPF stack.

  Return 0 on success, or a negative error in case of


RE: [External] : Re: [PATCH] net/tap: Modified TAP BPF program as per the new Kernel-version upgrade requirements.

2024-01-05 Thread Madhuker Mythri
Hi Stephen,

As part of the hash calculation logic, the hash value goes beyond 32 bits, and 
the eBPF verifier therefore throws an error with the 32-bit hash variable.
So I had to change the hash variable to 64 bits to resolve the BPF verifier error.

In the code, rte_softrss_be() returns the hash variable, which is now a 64-bit 
value, so I changed the return type from 32-bit to 64-bit.
==
-static __u32  __attribute__((always_inline))
-rte_softrss_be(const __u32 *input_tuple, const uint8_t *rss_key,
-   __u8 input_len)
+static __u64  __attribute__((always_inline))
+rte_softrss_be(const __u32 *input_tuple, __u8 input_len)
 {
-   __u32 i, j, hash = 0;
-#pragma unroll
+   __u32 i, j;
+__u64  hash = 0;
+#pragma clang loop unroll(full)
for (j = 0; j < input_len; j++) {
-#pragma unroll
+#pragma clang loop unroll(full)
for (i = 0; i < 32; i++) {
if (input_tuple[j] & (1U << (31 - i))) {
hash ^= ((const __u32 *)def_rss_key)[j] << i |
-   (__u32)((uint64_t)
+   (__u32)((__u64)
(((const __u32 *)def_rss_key)[j + 1])
>> (32 - i));
}
@@ -119,137 +107,78 @@ rte_softrss_be(const __u32 *input_tuple, const uint8_t *rss_key,
return hash;
 }
==
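
For reference, here is a plain C sketch of the same bit-by-bit loop with a 32-bit accumulator. The function name `softrss_be32()` and the explicit `key` parameter are assumptions of this sketch (the quoted code uses a global `def_rss_key`). It only shows the arithmetic and says nothing about what the eBPF verifier accepts.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Toeplitz-style hash over input_len 32-bit words.
 * 'key' must hold at least input_len + 1 words.
 */
static uint32_t
softrss_be32(const uint32_t *input_tuple, const uint32_t *key,
	     uint8_t input_len)
{
	uint32_t hash = 0;
	uint32_t i, j;

	assert(input_tuple != NULL && key != NULL);

	for (j = 0; j < input_len; j++) {
		for (i = 0; i < 32; i++) {
			if (input_tuple[j] & (UINT32_C(1) << (31 - i))) {
				/*
				 * 32-bit window of the key starting i bits
				 * into word j. The 64-bit cast keeps the
				 * shift by 32 (when i == 0) well defined.
				 */
				hash ^= (key[j] << i) |
					(uint32_t)((uint64_t)key[j + 1] >> (32 - i));
			}
		}
	}
	return hash;
}
```

Each term XORed into the accumulator is a 32-bit value in this form, so the result always fits in 32 bits.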

Thanks,
Madhuker.

-Original Message-
From: Stephen Hemminger  
Sent: 05 January 2024 02:09
To: Madhuker Mythri 
Cc: ferruh.yi...@amd.com; dev@dpdk.org
Subject: [External] : Re: [PATCH] net/tap: Modified TAP BPF program as per the 
new Kernel-version upgrade requirements.

On Thu,  4 Jan 2024 22:57:56 +0530
madhuker.myt...@oracle.com wrote:

> -static __u32  __attribute__((always_inline))
> -rte_softrss_be(const __u32 *input_tuple, const uint8_t *rss_key,
> - __u8 input_len)
> +static __u64  __attribute__((always_inline))
> +rte_softrss_be(const __u32 *input_tuple, __u8 input_len)

Why the change to u64?
This is not part of the bug fix and not how RSS is defined.


RE: [External] : Re: [PATCH] net/tap: Modified TAP BPF program as per the new Kernel-version upgrade requirements.

2024-01-05 Thread Madhuker Mythri
Sure, will update accordingly.

Thanks,
Madhuker.

-Original Message-
From: Stephen Hemminger  
Sent: 05 January 2024 02:02
To: Madhuker Mythri 
Cc: ferruh.yi...@amd.com; dev@dpdk.org
Subject: [External] : Re: [PATCH] net/tap: Modified TAP BPF program as per the 
new Kernel-version upgrade requirements.

On Thu,  4 Jan 2024 22:57:56 +0530
madhuker.myt...@oracle.com wrote:

> From: Madhuker Mythri 
> 
> When multiple queues configured, internally RSS will be enabled and thus TAP 
> BPF RSS byte-code will be loaded on to the Kernel using BPF system calls.
> 
> Here, the problem is loading the existing BPF byte-code to the Kernel-5.15 
> and above versions throws errors, i.e: Kernel BPF verifier not accepted this 
> existing BPF byte-code and system calls return error code "-7" as follows:
> 
> rss_add_actions(): Failed to load BPF section l3_l4 (7): Argument list too 
> long
> 
> 
> RCA:  These errors started coming after from the Kernel-5.15 version, in 
> which lots of new BPF verification restrictions were added for safe execution 
> of byte-code on to the Kernel, due to which existing BPF program verification 
> does not pass.
> Here are the major BPF verifier restrictions observed:
> 1) Need to use new BPF maps structure.
> 2) Kernel SKB data pointer access not allowed.
> 3) Undefined loops were not allowed(which are bounded by a variable value).
> 4) unreachable instructions(like: undefined array access).
> 
> After addressing all these Kernel BPF verifier restrictions able to load the 
> BPF byte-code onto the Kernel successfully.
> 
> Note: This new BPF changes supports from Kernel:4.10 version.
> 
> Bugzilla Id: 1329
> 
> Signed-off-by: Madhuker Mythri 
> ---
>  drivers/net/tap/bpf/tap_bpf_program.c |  243 +-
>  drivers/net/tap/tap_bpf_api.c |4 +-
>  drivers/net/tap/tap_bpf_insns.h   | 3781 ++---
>  3 files changed, 2151 insertions(+), 1877 deletions(-)

Patch has trailing whitespace, git complains:
$ git am /tmp/bpf.mbox
Applying: net/tap: Modified TAP BPF program as per the new Kernel-version 
upgrade requirements.
/home/shemminger/DPDK/main/.git/worktrees/libbpf/rebase-apply/patch:98: 
trailing whitespace.
// queue match 
/home/shemminger/DPDK/main/.git/worktrees/libbpf/rebase-apply/patch:243: 
trailing whitespace.
/** Is IP fragmented **/ 
/home/shemminger/DPDK/main/.git/worktrees/libbpf/rebase-apply/patch:326: 
trailing whitespace.
/*  bpf_printk("> rss_l3_l4 hash=0x%x queue:1=%u\n", hash, queue); */ 
warning: 3 lines add whitespace errors.




RE: [External] : Re: [PATCH] net/tap: Modified TAP BPF program as per the new Kernel-version upgrade requirements.

2024-01-05 Thread Madhuker Mythri
With the bpf_skb_load_bytes_relative() helper function we can directly 
retrieve the network header fields, so there is no need to check for VLAN 
presence in the L2 header.
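
To make the VLAN question concrete, this is a user-space sketch of what "checking the L2 header for VLAN" means when the tags are carried in the packet data. It is not eBPF code and does not model tags that the kernel keeps in skb metadata. `l3_offset()` and `rd_be16()` are hypothetical helpers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ETH_HDR_LEN      14
#define VLAN_HDR_LEN     4
#define ETHERTYPE_8021Q  0x8100
#define ETHERTYPE_8021AD 0x88A8

/* Read a big-endian 16-bit value. */
static uint16_t
rd_be16(const uint8_t *p)
{
	return (uint16_t)((p[0] << 8) | p[1]);
}

/*
 * Return the offset of the network header in an Ethernet frame, skipping
 * any in-band 802.1Q / 802.1ad tags, and store the final EtherType in
 * *proto. Returns -1 if the frame is too short.
 */
static int
l3_offset(const uint8_t *frame, size_t len, uint16_t *proto)
{
	size_t off = ETH_HDR_LEN;
	uint16_t type;

	assert(frame != NULL && proto != NULL);

	if (len < ETH_HDR_LEN)
		return -1;

	type = rd_be16(frame + 12);
	while (type == ETHERTYPE_8021Q || type == ETHERTYPE_8021AD) {
		if (len < off + VLAN_HDR_LEN)
			return -1;
		/* Tag is TCI (2 bytes) followed by the next EtherType. */
		type = rd_be16(frame + off + 2);
		off += VLAN_HDR_LEN;
	}

	*proto = type;
	return (int)off;
}
```

An untagged IPv4 frame gives offset 14 and a single-tagged one gives 18. A helper that reads relative to the network header makes this walk unnecessary for the caller.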

Thanks,
Madhuker.

-Original Message-
From: Stephen Hemminger  
Sent: 05 January 2024 02:11
To: Madhuker Mythri 
Cc: ferruh.yi...@amd.com; dev@dpdk.org
Subject: [External] : Re: [PATCH] net/tap: Modified TAP BPF program as per the 
new Kernel-version upgrade requirements.

On Thu,  4 Jan 2024 22:57:56 +0530
madhuker.myt...@oracle.com wrote:

> - /* Get correct proto for 802.1ad */
> - if (skb->vlan_present && skb->vlan_proto == htons(ETH_P_8021AD)) {
> - if (data + ETH_ALEN * 2 + sizeof(struct vlan_hdr) +
> - sizeof(proto) > data_end)
> - return TC_ACT_OK;
> - proto = *(__u16 *)(data + ETH_ALEN * 2 +
> -sizeof(struct vlan_hdr));
> - off += sizeof(struct vlan_hdr);
> - }

Your version loses VLAN support?


Re: [PATCH] build: riscv is not a valid -march value

2024-01-05 Thread Stanisław Kardach
On Wed, Nov 22, 2023 at 5:41 PM David Marchand
 wrote:
>
> On Wed, Nov 22, 2023 at 5:17 PM Bruce Richardson
>  wrote:
> >
> > On Wed, Nov 22, 2023 at 05:02:56PM +0100, David Marchand wrote:
> > > On Tue, Nov 21, 2023 at 5:49 PM  wrote:
> > > >
> > > > From: Christian Ehrhardt 
> > > >
> > > > If building riscv natively with -Dplatform=generic config/meson.build
> > > > will select cpu_instruction_set=riscv.
> > > >
> > > > That was fine because config/riscv/meson.build did override it to valid
> > > > values later, but since b7676fcccab4 ("config: verify machine arch
> > > > flag") it will break the build as it tries to test -march=riscv which
> > > > is not a value value.
> > > >
> > > > The generic setting used in most cases is rv64gc, set this here
> > > > as well.
> > > >
> > > > Fixes: b7676fcccab4 ("config: verify machine arch flag")
> > > > Fixes: f22e705ebf12 ("eal/riscv: support RISC-V architecture")
> > > >
> > > > Signed-off-by: Christian Ehrhardt 
> > > > ---
> > > >  config/meson.build | 2 +-
> > > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > >
> > > > diff --git a/config/meson.build b/config/meson.build
> > > > index d732154731..a9ccd56deb 100644
> > > > --- a/config/meson.build
> > > > +++ b/config/meson.build
> > > > @@ -152,7 +152,7 @@ if cpu_instruction_set == 'generic'
> > > >  elif host_machine.cpu_family().startswith('ppc')
> > > >  cpu_instruction_set = 'power8'
> > > >  elif host_machine.cpu_family().startswith('riscv')
> > > > -cpu_instruction_set = 'riscv'
> > > > +cpu_instruction_set = 'rv64gc'
> > >
> > > Copying more people.
> > >
> > > This fix is probably the best, so close to the release.
> > >
> >
> > Agreed
>
> I took this patch as is, for now.
Sorry for reviving an old thread, I was on a rather long OoO, hence I
did not answer.
Thank you for taking this patch.
>
> >
> > >
> > > However, I think a more complete fix would be to set this here to generic.
> > > And do the march validation in config/riscv/meson.build in a similar
> > > fashion to ARM.
> > >
> > > Or maybe the validation added in b7676fcccab4 ("config: verify machine
> > > arch flag") should be moved after subdir(arch_subdir).
> > > Bruce, opinion?
> > >
> >
> > Probably the first of these two is best, to do the march validation in the
> > riscv-specific file. However, I've no strong opinions either way.
>
> Stanislaw, could you look at doing some enhancement on this topic?
> And, in any case, what we lack is a CI for RISC V.
It seems that there is not much traction for RISC-V DPDK yet. StarFive
seems to be focused more on the platform side of things and therefore
I don't have any server-grade HW to really run CI on.
>
>
> --
> David Marchand
>


--
Best Regards,
Stanisław Kardach


[PATCH] doc: fix test eventdev example commands

2024-01-05 Thread pbhagavatula
From: Pavan Nikhilesh 

Fix incorrect core masks in testeventdev example
commands.

Fixes: f6dda59153f1 ("doc: add order queue test in eventdev test guide")
Fixes: dd37027f2ba6 ("doc: add order all types queue test in eventdev test guide")
Fixes: 43bc2fef79cd ("doc: add perf queue test in eventdev test guide")
Fixes: b3d4e665ed3d ("doc: add perf all types queue test in eventdev test guide")
Fixes: b01974da9f25 ("app/eventdev: add ethernet device producer option")
Fixes: ba9de463abeb ("doc: add pipeline queue test in testeventdev guide")
Fixes: d1b46daf7484 ("doc: add pipeline atq test in testeventdev guide")
Fixes: d008f20bce23 ("app/eventdev: add event timer adapter as a producer")
Fixes: 2eaa37b86635 ("app/eventdev: add vector mode in pipeline test")
Cc: sta...@dpdk.org

Signed-off-by: Pavan Nikhilesh 
---
 doc/guides/tools/testeventdev.rst | 24 
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/doc/guides/tools/testeventdev.rst b/doc/guides/tools/testeventdev.rst
index fc36bfb30c..820ccf383e 100644
--- a/doc/guides/tools/testeventdev.rst
+++ b/doc/guides/tools/testeventdev.rst
@@ -308,7 +308,7 @@ Example command to run order queue test:
 
 .. code-block:: console
 
-   sudo /app/dpdk-test-eventdev --vdev=event_sw0 -- \
+   sudo /app/dpdk-test-eventdev -c 0x1f -s 0x10 --vdev=event_sw0 -- \
 --test=order_queue --plcores 1 --wlcores 2,3
 
 
@@ -371,7 +371,7 @@ Example command to run order ``all types queue`` test:
 
 .. code-block:: console
 
-   sudo /app/dpdk-test-eventdev --vdev=event_octeontx -- \
+   sudo /app/dpdk-test-eventdev -c 0x1f --vdev=event_octeontx -- \
 --test=order_atq --plcores 1 --wlcores 2,3
 
 
@@ -475,14 +475,14 @@ Example command to run perf queue test:
 
 .. code-block:: console
 
-   sudo /app/dpdk-test-eventdev -c 0xf -s 0x1 --vdev=event_sw0 -- \
+   sudo /app/dpdk-test-eventdev -c 0xf -s 0x2 --vdev=event_sw0 -- \
 --test=perf_queue --plcores=2 --wlcore=3 --stlist=p --nb_pkts=0
 
 Example command to run perf queue test with producer enqueuing a burst of events:
 
 .. code-block:: console
 
-   sudo /app/dpdk-test-eventdev -c 0xf -s 0x1 --vdev=event_sw0 -- \
+   sudo /app/dpdk-test-eventdev -c 0xf -s 0x2 --vdev=event_sw0 -- \
 --test=perf_queue --plcores=2 --wlcore=3 --stlist=p --nb_pkts=0 \
 --prod_enq_burst_sz=32
 
@@ -490,15 +490,15 @@ Example command to run perf queue test with ethernet ports:
 
 .. code-block:: console
 
-   sudo build/app/dpdk-test-eventdev --vdev=event_sw0 -- \
+   sudo build/app/dpdk-test-eventdev -c 0xf -s 0x2 --vdev=event_sw0 -- \
 --test=perf_queue --plcores=2 --wlcore=3 --stlist=p --prod_type_ethdev
 
 Example command to run perf queue test with event timer adapter:
 
 .. code-block:: console
 
-   sudo  /app/dpdk-test-eventdev --vdev="event_octeontx" -- \
---wlcores 4 --plcores 12 --test perf_queue --stlist=a \
+   sudo  /app/dpdk-test-eventdev -c 0xfff1 --vdev="event_octeontx" \
+-- --wlcores 4 --plcores 12 --test perf_queue --stlist=a \
 --prod_type_timerdev --fwd_latency
 
 PERF_ATQ Test
@@ -585,15 +585,15 @@ Example command to run perf ``all types queue`` test:
 
 .. code-block:: console
 
-   sudo /app/dpdk-test-eventdev --vdev=event_octeontx -- \
+   sudo /app/dpdk-test-eventdev -c 0xf --vdev=event_octeontx -- \
 --test=perf_atq --plcores=2 --wlcore=3 --stlist=p --nb_pkts=0
 
 Example command to run perf ``all types queue`` test with event timer adapter:
 
 .. code-block:: console
 
-   sudo  /app/dpdk-test-eventdev --vdev="event_octeontx" -- \
---wlcores 4 --plcores 12 --test perf_atq --verbose 20 \
+   sudo  /app/dpdk-test-eventdev -c 0xfff1 --vdev="event_octeontx" \
+-- --wlcores 4 --plcores 12 --test perf_atq --verbose 20 \
 --stlist=a --prod_type_timerdev --fwd_latency
 
 
@@ -817,13 +817,13 @@ Example command to run pipeline atq test:
 
 .. code-block:: console
 
-sudo /app/dpdk-test-eventdev -c 0xf -s 0x8 --vdev=event_sw0 -- \
+sudo /app/dpdk-test-eventdev -c 0xf --vdev="event_octeontx" -- \
 --test=pipeline_atq --wlcore=1 --prod_type_ethdev --stlist=a
 
 Example command to run pipeline atq test with vector events:
 
 .. code-block:: console
 
-sudo /app/dpdk-test-eventdev -c 0xf -s 0x8 --vdev=event_sw0 -- \
+sudo /app/dpdk-test-eventdev -c 0xf --vdev="event_octeontx" -- \
 --test=pipeline_atq --wlcore=1 --prod_type_ethdev --stlist=a \
 --enable_vector  --vector_size 512
-- 
2.25.1
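
For readers checking the corrected masks above: each bit of the `-c` and `-s` hex masks selects one lcore, so the masks must cover every lcore named in `--plcores`/`--wlcores` plus any service core. The sketch below is only an illustration and is not part of the patch; `lcore_mask()` is a hypothetical helper, not a DPDK API.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a DPDK API): build a core mask from a list
 * of lcore IDs, setting one bit per lcore. IDs of 64 or more are ignored.
 */
static uint64_t lcore_mask(const unsigned int *lcores, size_t n)
{
	uint64_t mask = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (lcores[i] < 64)
			mask |= UINT64_C(1) << lcores[i];
	}
	return mask;
}
```

With this helper, lcores 0 through 4 give 0x1f and lcore 4 alone gives 0x10, which matches the `-c 0x1f -s 0x10` pair in the order queue example; lcores 0 and 4 through 15 give the 0xfff1 used in the timer adapter examples.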



Re: [PATCH] doc: update default value for config parameter

2024-01-05 Thread Tyler Retzlaff
On Fri, Jan 05, 2024 at 10:44:17AM +0800, Simei Su wrote:
> Update documentation value to match default value in code base.
> 
> Signed-off-by: Simei Su 
> ---

Acked-by: Tyler Retzlaff 



Re: [External] : Re: [PATCH] net/tap: Modified TAP BPF program as per the new Kernel-version upgrade requirements.

2024-01-05 Thread Stephen Hemminger
On Fri, 5 Jan 2024 14:44:00 +
Madhuker Mythri  wrote:

> Hi Stephen,
> 
> The BPF helper man pages imply as much, and SKB data pointer access worked
> up to the 5.4 kernel. From kernel 5.15 onward, however, the eBPF verifier
> throws an error when we use SKB data pointer access.
> So I used these helper functions and was able to resolve the errors. These
> helper functions are safe to use and also protect against non-linear skb
> data buffer access.
> 
> So, I think using helper functions is a better and safer way to access the
> SKB data than pointer access.
> 
> Thanks,
> Madhuker.

Using the accessors may mean it won't work with older kernels, but that is not
a huge concern given how fragile this code is.


Re: [External] : Re: [PATCH] net/tap: Modified TAP BPF program as per the new Kernel-version upgrade requirements.

2024-01-05 Thread Stephen Hemminger
On Fri, 5 Jan 2024 14:58:22 +
Madhuker Mythri  wrote:

> Hi Stephen,
> 
> As part of the hash calculation logic, the hash value goes beyond 32 bits,
> and the eBPF verifier therefore throws an error with the 32-bit hash variable.
> So I need to change it to a 64-bit hash variable to resolve the BPF verifier
> error.
> 
> In the code, the rte_softrss_be() function returns the hash variable, which
> is a 64-bit value, so I changed the return type from 32-bit to 64-bit.

A simple cast should do the necessary truncation.
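
To illustrate the suggestion, here is a minimal sketch of truncating by cast. It is not taken from the patch; `hash_to_u32()` is a hypothetical name, and the 64-bit input stands in for whatever intermediate value the hash calculation produces.

```c
#include <stdint.h>

/*
 * Hypothetical helper: truncate a 64-bit intermediate hash to 32 bits.
 * Converting to an unsigned 32-bit type keeps the low 32 bits, i.e. the
 * value modulo 2^32, so no 64-bit return type is needed.
 */
static inline uint32_t hash_to_u32(uint64_t hash64)
{
	return (uint32_t)hash64;
}
```

For example, `hash_to_u32(0x1234567890abcdefULL)` yields 0x90abcdef.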


Re: [External] : Re: [PATCH] net/tap: Modified TAP BPF program as per the new Kernel-version upgrade requirements.

2024-01-05 Thread Stephen Hemminger
On Fri, 5 Jan 2024 15:11:22 +
Madhuker Mythri  wrote:

> By using the "bpf_skb_load_bytes_relative()" helper function we can
> directly retrieve the network header data fields, so we no longer need to
> check for L2-header VLAN presence.
> 
> Thanks,
> Madhuker.

No problem, the vlan code was already broken. The kernel offloads the vlan
header to the skb, but whoever wrote the original did not know that this is
what happens in modern kernels.


Re: [PATCH] net/tap: Modified TAP BPF program as per the new Kernel-version upgrade requirements.

2024-01-05 Thread Stephen Hemminger
On Thu,  4 Jan 2024 22:57:56 +0530
madhuker.myt...@oracle.com wrote:

> Madhuker Mythri 
> 
> When multiple queues are configured, RSS is enabled internally and the TAP
> BPF RSS byte-code is loaded into the kernel using BPF system calls.
> 
> The problem is that loading the existing BPF byte-code on kernel 5.15 and
> above fails: the kernel BPF verifier does not accept the existing BPF
> byte-code, and the system calls return error code "-7" as follows:
> 
> rss_add_actions(): Failed to load BPF section l3_l4 (7): Argument list too long
> 
> 
> RCA: These errors started appearing with kernel 5.15, which added many new
> BPF verification restrictions for safe execution of byte-code in the kernel;
> as a result, the existing BPF program no longer passes verification.
> Here are the major BPF verifier restrictions observed:
> 1) The new BPF maps structure must be used.
> 2) Kernel SKB data pointer access is not allowed.
> 3) Unbounded loops (those bounded by a variable value) are not allowed.
> 4) Unreachable instructions (e.g. undefined array access) are not allowed.
> 
> After addressing all these kernel BPF verifier restrictions, the BPF
> byte-code loads into the kernel successfully.
> 
> Note: These BPF changes are supported from kernel version 4.10 onward.
> 
> Bugzilla Id: 1329
> 
> Signed-off-by: Madhuker Mythri 
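
As an aside on restriction 3 in the quoted list, a common way to satisfy a verifier that rejects variable-bounded loops is to iterate up to a compile-time constant and exit early. The sketch below is plain C for illustration only and is not taken from the patch; `MAX_ITEMS` and `sum_bounded()` are hypothetical names.

```c
#include <stdint.h>

/* Hypothetical compile-time upper bound on the number of iterations. */
#define MAX_ITEMS 16

/*
 * Sum the first n entries of v, visiting at most MAX_ITEMS entries.
 * The loop bound is a constant; the variable n only triggers an early
 * exit, so the trip count is bounded regardless of the value of n.
 */
static uint32_t sum_bounded(const uint32_t *v, uint32_t n)
{
	uint32_t sum = 0;
	uint32_t i;

	for (i = 0; i < MAX_ITEMS; i++) {
		if (i >= n)
			break;
		sum += v[i];
	}
	return sum;
}
```

Whether a given kernel's verifier accepts a particular loop still depends on the kernel version and on how the compiler emits it, so this shows the pattern rather than a guarantee.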

I tried this version on Debian testing which has:
kernel 6.5.0-5-amd64
clang 16.0.6

If I build and run with the pre-compiled BPF, it loads the
example flow (see https://doc.dpdk.org/guides/nics/tap.html).

But if I recompile the bpf program by running make in the tap/bpf
directory, the resulting bpf instructions do not make
it past the verifier.

With a modified tap_bpf_api, I can get the log message:

testpmd> flow create 0 priority 4 ingress pattern eth dst is 0a:0b:0c:0d:0e:0f / ipv4 / tcp / end actions rss queues 0 1 2 3 end / end
rss_add_actions(): Failed to load BPF section l3_l4 (13): func#0 @0
0: R1=ctx(off=0,imm=0) R10=fp0
0: (bf) r6 = r1   ; R1=ctx(off=0,imm=0) R6_w=ctx(off=0,imm=0)
1: (18) r1 = 0x300; R1_w=768
3: (63) *(u32 *)(r10 -84) = r1; R1_w=768 R10=fp0 fp-88=
4: (bf) r2 = r10  ; R2_w=fp0 R10=fp0
5: (07) r2 += -84 ; R2_w=fp-84
6: (18) r1 = 0xfd ; R1_w=253
8: (85) call bpf_map_lookup_elem#1
R1 type=scalar expected=map_ptr
processed 7 insns (limit 100) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0

port_flow_complain(): Caught PMD error type 16 (specific action): cause: 0x7ffcef37e678, action not supported: Operation not supported



[PATCH v4 0/4] changes for 24.03

2024-01-05 Thread Hernan Vargas
v4: Targeting 24.03. Updated FPGA PMD based on review comments.
v3: Made changes requested during review.
v2: Targeting 23.11. Update in commits 1,2 based on review comments.
v1: Targeting 23.07 if possible. Add support for AGX100 (N6000) and corner case fixes.

Hernan Vargas (4):
  baseband/fpga_5gnr_fec: renaming for consistency
  baseband/fpga_5gnr_fec: add Vista Creek variant
  baseband/fpga_5gnr_fec: add AGX100 support
  baseband/fpga_5gnr_fec: cosmetic comment changes

 doc/guides/bbdevs/fpga_5gnr_fec.rst   |   76 +-
 drivers/baseband/fpga_5gnr_fec/agx100_pmd.h   |  273 ++
 .../baseband/fpga_5gnr_fec/fpga_5gnr_fec.h|  353 +--
 .../fpga_5gnr_fec/rte_fpga_5gnr_fec.c | 2270 -
 .../fpga_5gnr_fec/rte_pmd_fpga_5gnr_fec.h |   27 +-
 drivers/baseband/fpga_5gnr_fec/vc_5gnr_pmd.h  |  139 +
 6 files changed, 2204 insertions(+), 934 deletions(-)
 create mode 100644 drivers/baseband/fpga_5gnr_fec/agx100_pmd.h
 create mode 100644 drivers/baseband/fpga_5gnr_fec/vc_5gnr_pmd.h

-- 
2.37.1



[PATCH v4 1/4] baseband/fpga_5gnr_fec: renaming for consistency

2024-01-05 Thread Hernan Vargas
Rename generic functions and constants using the FPGA 5GNR prefix naming
to prepare for code reuse for new FPGA implementation variant.
No functional impact.

Signed-off-by: Hernan Vargas 
Reviewed-by: Maxime Coquelin 
---
 .../baseband/fpga_5gnr_fec/fpga_5gnr_fec.h| 117 +++--
 .../fpga_5gnr_fec/rte_fpga_5gnr_fec.c | 455 --
 .../fpga_5gnr_fec/rte_pmd_fpga_5gnr_fec.h |  17 +-
 3 files changed, 269 insertions(+), 320 deletions(-)

diff --git a/drivers/baseband/fpga_5gnr_fec/fpga_5gnr_fec.h b/drivers/baseband/fpga_5gnr_fec/fpga_5gnr_fec.h
index e3038112fabb..9300349a731b 100644
--- a/drivers/baseband/fpga_5gnr_fec/fpga_5gnr_fec.h
+++ b/drivers/baseband/fpga_5gnr_fec/fpga_5gnr_fec.h
@@ -31,26 +31,26 @@
 #define FPGA_5GNR_FEC_VF_DEVICE_ID (0x0D90)
 
 /* Align DMA descriptors to 256 bytes - cache-aligned */
-#define FPGA_RING_DESC_ENTRY_LENGTH (8)
+#define FPGA_5GNR_RING_DESC_ENTRY_LENGTH (8)
 /* Ring size is in 256 bits (32 bytes) units */
 #define FPGA_RING_DESC_LEN_UNIT_BYTES (32)
 /* Maximum size of queue */
-#define FPGA_RING_MAX_SIZE (1024)
+#define FPGA_5GNR_RING_MAX_SIZE (1024)
 
 #define FPGA_NUM_UL_QUEUES (32)
 #define FPGA_NUM_DL_QUEUES (32)
 #define FPGA_TOTAL_NUM_QUEUES (FPGA_NUM_UL_QUEUES + FPGA_NUM_DL_QUEUES)
 #define FPGA_NUM_INTR_VEC (FPGA_TOTAL_NUM_QUEUES - RTE_INTR_VEC_RXTX_OFFSET)
 
-#define FPGA_INVALID_HW_QUEUE_ID (0x)
+#define FPGA_5GNR_INVALID_HW_QUEUE_ID (0x)
 
-#define FPGA_QUEUE_FLUSH_TIMEOUT_US (1000)
-#define FPGA_HARQ_RDY_TIMEOUT (10)
-#define FPGA_TIMEOUT_CHECK_INTERVAL (5)
-#define FPGA_DDR_OVERFLOW (0x10)
+#define FPGA_5GNR_QUEUE_FLUSH_TIMEOUT_US (1000)
+#define FPGA_5GNR_HARQ_RDY_TIMEOUT (10)
+#define FPGA_5GNR_TIMEOUT_CHECK_INTERVAL (5)
+#define FPGA_5GNR_DDR_OVERFLOW (0x10)
 
-#define FPGA_5GNR_FEC_DDR_WR_DATA_LEN_IN_BYTES 8
-#define FPGA_5GNR_FEC_DDR_RD_DATA_LEN_IN_BYTES 8
+#define FPGA_5GNR_DDR_WR_DATA_LEN_IN_BYTES 8
+#define FPGA_5GNR_DDR_RD_DATA_LEN_IN_BYTES 8
 
 /* Constants from K0 computation from 3GPP 38.212 Table 5.4.2.1-2 */
 #define N_ZC_1 66 /* N = 66 Zc for BG 1 */
@@ -152,7 +152,7 @@ struct __rte_packed fpga_dma_enc_desc {
};
 
uint8_t sw_ctxt[FPGA_RING_DESC_LEN_UNIT_BYTES *
-   (FPGA_RING_DESC_ENTRY_LENGTH - 1)];
+   (FPGA_5GNR_RING_DESC_ENTRY_LENGTH - 1)];
};
 };
 
@@ -197,7 +197,7 @@ struct __rte_packed fpga_dma_dec_desc {
uint8_t cbs_in_op;
};
 
-   uint32_t sw_ctxt[8 * (FPGA_RING_DESC_ENTRY_LENGTH - 1)];
+   uint32_t sw_ctxt[8 * (FPGA_5GNR_RING_DESC_ENTRY_LENGTH - 1)];
};
 };
 
@@ -207,8 +207,8 @@ union fpga_dma_desc {
struct fpga_dma_dec_desc dec_req;
 };
 
-/* FPGA 5GNR FEC Ring Control Register */
-struct __rte_packed fpga_ring_ctrl_reg {
+/* FPGA 5GNR Ring Control Register. */
+struct __rte_packed fpga_5gnr_ring_ctrl_reg {
uint64_t ring_base_addr;
uint64_t ring_head_addr;
uint16_t ring_size:11;
@@ -226,38 +226,37 @@ struct __rte_packed fpga_ring_ctrl_reg {
uint16_t rsrvd3;
uint16_t head_point;
uint16_t rsrvd4;
-
 };
 
-/* Private data structure for each FPGA FEC device */
+/* Private data structure for each FPGA 5GNR device. */
 struct fpga_5gnr_fec_device {
-   /** Base address of MMIO registers (BAR0) */
+   /** Base address of MMIO registers (BAR0). */
void *mmio_base;
-   /** Base address of memory for sw rings */
+   /** Base address of memory for sw rings. */
void *sw_rings;
-   /** Physical address of sw_rings */
+   /** Physical address of sw_rings. */
rte_iova_t sw_rings_phys;
/** Number of bytes available for each queue in device. */
uint32_t sw_ring_size;
-   /** Max number of entries available for each queue in device */
+   /** Max number of entries available for each queue in device. */
uint32_t sw_ring_max_depth;
-   /** Base address of response tail pointer buffer */
+   /** Base address of response tail pointer buffer. */
uint32_t *tail_ptrs;
-   /** Physical address of tail pointers */
+   /** Physical address of tail pointers. */
rte_iova_t tail_ptr_phys;
-   /** Queues flush completion flag */
+   /** Queues flush completion flag. */
uint64_t *flush_queue_status;
-   /* Bitmap capturing which Queues are bound to the PF/VF */
+   /** Bitmap capturing which Queues are bound to the PF/VF. */
uint64_t q_bound_bit_map;
-   /* Bitmap capturing which Queues have already been assigned */
+   /** Bitmap capturing which Queues have already been assigned. */
uint64_t q_assigned_bit_map;
-   /** True if this is a PF FPGA FEC device */
+   /** True if this is a PF FPGA 5GNR device. */
bool pf_device;
 };
 
-/* Structure associated with each queue. */
-struct __rte_cache_aligned fpga_queue {

[PATCH v4 2/4] baseband/fpga_5gnr_fec: add Vista Creek variant

2024-01-05 Thread Hernan Vargas
Create a new file vc_5gnr_pmd.h to store structures and macros specific
to Vista Creek 5G FPGA implementation and rename functions specific to
the Vista Creek variant.

Signed-off-by: Hernan Vargas 
---
 .../baseband/fpga_5gnr_fec/fpga_5gnr_fec.h| 183 ++-
 .../fpga_5gnr_fec/rte_fpga_5gnr_fec.c | 475 +-
 drivers/baseband/fpga_5gnr_fec/vc_5gnr_pmd.h  | 140 ++
 3 files changed, 398 insertions(+), 400 deletions(-)
 create mode 100644 drivers/baseband/fpga_5gnr_fec/vc_5gnr_pmd.h

diff --git a/drivers/baseband/fpga_5gnr_fec/fpga_5gnr_fec.h b/drivers/baseband/fpga_5gnr_fec/fpga_5gnr_fec.h
index 9300349a731b..982e956dc819 100644
--- a/drivers/baseband/fpga_5gnr_fec/fpga_5gnr_fec.h
+++ b/drivers/baseband/fpga_5gnr_fec/fpga_5gnr_fec.h
@@ -8,6 +8,8 @@
 #include 
 #include 
 
+#include "vc_5gnr_pmd.h"
+
 /* Helper macro for logging */
 #define rte_bbdev_log(level, fmt, ...) \
rte_log(RTE_LOG_ ## level, fpga_5gnr_fec_logtype, fmt "\n", \
@@ -25,32 +27,20 @@
 #define FPGA_5GNR_FEC_PF_DRIVER_NAME intel_fpga_5gnr_fec_pf
 #define FPGA_5GNR_FEC_VF_DRIVER_NAME intel_fpga_5gnr_fec_vf
 
-/* FPGA 5GNR FEC PCI vendor & device IDs */
-#define FPGA_5GNR_FEC_VENDOR_ID (0x8086)
-#define FPGA_5GNR_FEC_PF_DEVICE_ID (0x0D8F)
-#define FPGA_5GNR_FEC_VF_DEVICE_ID (0x0D90)
-
-/* Align DMA descriptors to 256 bytes - cache-aligned */
-#define FPGA_5GNR_RING_DESC_ENTRY_LENGTH (8)
-/* Ring size is in 256 bits (32 bytes) units */
-#define FPGA_RING_DESC_LEN_UNIT_BYTES (32)
-/* Maximum size of queue */
-#define FPGA_5GNR_RING_MAX_SIZE (1024)
-
-#define FPGA_NUM_UL_QUEUES (32)
-#define FPGA_NUM_DL_QUEUES (32)
-#define FPGA_TOTAL_NUM_QUEUES (FPGA_NUM_UL_QUEUES + FPGA_NUM_DL_QUEUES)
-#define FPGA_NUM_INTR_VEC (FPGA_TOTAL_NUM_QUEUES - RTE_INTR_VEC_RXTX_OFFSET)
-
 #define FPGA_5GNR_INVALID_HW_QUEUE_ID (0x)
-
 #define FPGA_5GNR_QUEUE_FLUSH_TIMEOUT_US (1000)
 #define FPGA_5GNR_HARQ_RDY_TIMEOUT (10)
 #define FPGA_5GNR_TIMEOUT_CHECK_INTERVAL (5)
 #define FPGA_5GNR_DDR_OVERFLOW (0x10)
-
 #define FPGA_5GNR_DDR_WR_DATA_LEN_IN_BYTES 8
 #define FPGA_5GNR_DDR_RD_DATA_LEN_IN_BYTES 8
+/* Align DMA descriptors to 256 bytes - cache-aligned. */
+#define FPGA_5GNR_RING_DESC_ENTRY_LENGTH (8)
+/* Maximum size of queue. */
+#define FPGA_5GNR_RING_MAX_SIZE (1024)
+
+#define VC_5GNR_FPGA_VARIANT   0
+#define AGX100_FPGA_VARIANT1
 
 /* Constants from K0 computation from 3GPP 38.212 Table 5.4.2.1-2 */
 #define N_ZC_1 66 /* N = 66 Zc for BG 1 */
@@ -62,32 +52,7 @@
 #define K0_3_1 56 /* K0 fraction numerator for rv 3 and BG 1 */
 #define K0_3_2 43 /* K0 fraction numerator for rv 3 and BG 2 */
 
-/* FPGA 5GNR FEC Register mapping on BAR0 */
-enum {
-   FPGA_5GNR_FEC_VERSION_ID = 0x, /* len: 4B */
-   FPGA_5GNR_FEC_CONFIGURATION = 0x0004, /* len: 2B */
-   FPGA_5GNR_FEC_QUEUE_PF_VF_MAP_DONE = 0x0008, /* len: 1B */
-   FPGA_5GNR_FEC_LOAD_BALANCE_FACTOR = 0x000a, /* len: 2B */
-   FPGA_5GNR_FEC_RING_DESC_LEN = 0x000c, /* len: 2B */
-   FPGA_5GNR_FEC_VFQ_FLUSH_STATUS_LW = 0x0018, /* len: 4B */
-   FPGA_5GNR_FEC_VFQ_FLUSH_STATUS_HI = 0x001c, /* len: 4B */
-   FPGA_5GNR_FEC_QUEUE_MAP = 0x0040, /* len: 256B */
-   FPGA_5GNR_FEC_RING_CTRL_REGS = 0x0200, /* len: 2048B */
-   FPGA_5GNR_FEC_DDR4_WR_ADDR_REGS = 0x0A00, /* len: 4B */
-   FPGA_5GNR_FEC_DDR4_WR_DATA_REGS = 0x0A08, /* len: 8B */
-   FPGA_5GNR_FEC_DDR4_WR_DONE_REGS = 0x0A10, /* len: 1B */
-   FPGA_5GNR_FEC_DDR4_RD_ADDR_REGS = 0x0A18, /* len: 4B */
-   FPGA_5GNR_FEC_DDR4_RD_DONE_REGS = 0x0A20, /* len: 1B */
-   FPGA_5GNR_FEC_DDR4_RD_RDY_REGS = 0x0A28, /* len: 1B */
-   FPGA_5GNR_FEC_DDR4_RD_DATA_REGS = 0x0A30, /* len: 8B */
-   FPGA_5GNR_FEC_DDR4_ADDR_RDY_REGS = 0x0A38, /* len: 1B */
-   FPGA_5GNR_FEC_HARQ_BUF_SIZE_RDY_REGS = 0x0A40, /* len: 1B */
-   FPGA_5GNR_FEC_HARQ_BUF_SIZE_REGS = 0x0A48, /* len: 4B */
-   FPGA_5GNR_FEC_MUTEX = 0x0A60, /* len: 4B */
-   FPGA_5GNR_FEC_MUTEX_RESET = 0x0A68  /* len: 4B */
-};
-
-/* FPGA 5GNR FEC Ring Control Registers */
+/* FPGA 5GNR Ring Control Registers. */
 enum {
FPGA_5GNR_FEC_RING_HEAD_ADDR = 0x0008,
FPGA_5GNR_FEC_RING_SIZE = 0x0010,
@@ -98,113 +63,27 @@ enum {
FPGA_5GNR_FEC_RING_HEAD_POINT = 0x001C
 };
 
-/* FPGA 5GNR FEC DESCRIPTOR ERROR */
+/* VC 5GNR and AGX100 common register mapping on BAR0. */
 enum {
-   DESC_ERR_NO_ERR = 0x0,
-   DESC_ERR_K_P_OUT_OF_RANGE = 0x1,
-   DESC_ERR_Z_C_NOT_LEGAL = 0x2,
-   DESC_ERR_DESC_OFFSET_ERR = 0x3,
-   DESC_ERR_DESC_READ_FAIL = 0x8,
-   DESC_ERR_DESC_READ_TIMEOUT = 0x9,
-   DESC_ERR_DESC_READ_TLP_POISONED = 0xA,
-   DESC_ERR_HARQ_INPUT_LEN = 0xB,
-   DESC_ERR_CB_READ_FAIL = 0xC,
-   DESC_ERR_CB_READ_TIMEOUT = 0xD,
-   DESC_ERR_CB_READ_TLP_POISONED = 0xE,
-   DESC_ERR_HBSTORE_ERR = 0xF
-};
-
-
-/* FPGA 5GNR FEC DMA Encoding Request Descr

[PATCH v4 3/4] baseband/fpga_5gnr_fec: add AGX100 support

2024-01-05 Thread Hernan Vargas
Add support for new FPGA variant AGX100 (on Arrow Creek N6000).

Signed-off-by: Hernan Vargas 
---
 doc/guides/bbdevs/fpga_5gnr_fec.rst   |   76 +-
 drivers/baseband/fpga_5gnr_fec/agx100_pmd.h   |  273 
 .../baseband/fpga_5gnr_fec/fpga_5gnr_fec.h|   12 +-
 .../fpga_5gnr_fec/rte_fpga_5gnr_fec.c | 1230 +++--
 drivers/baseband/fpga_5gnr_fec/vc_5gnr_pmd.h  |1 -
 5 files changed, 1459 insertions(+), 133 deletions(-)
 create mode 100644 drivers/baseband/fpga_5gnr_fec/agx100_pmd.h

diff --git a/doc/guides/bbdevs/fpga_5gnr_fec.rst b/doc/guides/bbdevs/fpga_5gnr_fec.rst
index 956dd6bed560..1ae192a86b25 100644
--- a/doc/guides/bbdevs/fpga_5gnr_fec.rst
+++ b/doc/guides/bbdevs/fpga_5gnr_fec.rst
@@ -6,12 +6,13 @@ Intel(R) FPGA 5GNR FEC Poll Mode Driver
 
 The BBDEV FPGA 5GNR FEC poll mode driver (PMD) supports an FPGA implementation of a VRAN
 LDPC Encode / Decode 5GNR wireless acceleration function, using Intel's PCI-e and FPGA
-based Vista Creek device.
+based Vista Creek (N3000, referred to as VC_5GNR in the code) as well as Arrow Creek (N6000,
+referred to as AGX100 in the code).
 
 Features
 
 
-FPGA 5GNR FEC PMD supports the following features:
+FPGA 5GNR FEC PMD supports the following BBDEV capabilities:
 
 - LDPC Encode in the DL
 - LDPC Decode in the UL
@@ -67,10 +68,18 @@ Initialization
 
 When the device first powers up, its PCI Physical Functions (PF) can be listed through this command:
 
+Vista Creek (N3000)
+
 .. code-block:: console
 
   sudo lspci -vd8086:0d8f
 
+Arrow Creek (N6000)
+
+.. code-block:: console
+
+  sudo lspci -vd8086:5799
+
 The physical and virtual functions are compatible with Linux UIO drivers:
 ``vfio_pci`` and ``igb_uio``. However, in order to work the FPGA 5GNR FEC device firstly needs
 to be bound to one of these linux drivers through DPDK.
@@ -78,6 +87,7 @@ to be bound to one of these linux drivers through DPDK.
 For more details on how to bind the PF device and create VF devices, see
 :ref:`linux_gsg_binding_kernel`.
 
+
 Configure the VFs through PF
 
 
@@ -100,7 +110,6 @@ parameters defined in ``rte_fpga_5gnr_fec_conf`` structure:
   uint8_t dl_bandwidth;
   uint8_t ul_load_balance;
   uint8_t dl_load_balance;
-  uint16_t flr_time_out;
   };
 
 - ``pf_mode_en``: identifies whether only PF is to be used, or the VFs. PF and
@@ -111,12 +120,12 @@ parameters defined in ``rte_fpga_5gnr_fec_conf`` structure:
 
 - ``vf_*l_queues_number``: defines the hardware queue mapping for every VF.
 
-- ``*l_bandwidth``: in case of congestion on PCIe interface. The device
-  allocates different bandwidth to UL and DL. The weight is configured by this
-  setting. The unit of weight is 3 code blocks. For example, if the code block
-  cbps (code block per second) ratio between UL and DL is 12:1, then the
-  configuration value should be set to 36:3. The schedule algorithm is based
-  on code block regardless the length of each block.
+- ``*l_bandwidth``: Only used for the Vista Creek schedule algorithm in case of
+  congestion on PCIe interface. The device allocates different bandwidth to UL
+  and DL. The weight is configured by this setting. The unit of weight is 3 code
+  blocks. For example, if the code block cbps (code block per second) ratio between
+  UL and DL is 12:1, then the configuration value should be set to 36:3.
+  The schedule algorithm is based on code block regardless the length of each block.
 
 - ``*l_load_balance``: hardware queues are load-balanced in a round-robin
   fashion. Queues get filled first-in first-out until they reach a pre-defined
@@ -126,10 +135,6 @@ parameters defined in ``rte_fpga_5gnr_fec_conf`` structure:
   If all hardware queues exceeds the watermark, no code blocks will be
   streamed in from UL/DL code block FIFO.
 
-- ``flr_time_out``: specifies how many 16.384us to be FLR time out. The
-  time_out = flr_time_out x 16.384us. For instance, if you want to set 10ms for
-  the FLR time out then set this setting to 0x262=610.
-
 
 An example configuration code calling the function ``rte_fpga_5gnr_fec_configure()`` is shown
 below:
@@ -154,7 +159,7 @@ below:
   /* setup FPGA PF */
   ret = rte_fpga_5gnr_fec_configure(info->dev_name, &conf);
   TEST_ASSERT_SUCCESS(ret,
-  "Failed to configure 4G FPGA PF for bbdev %s",
+  "Failed to configure 5GNR FPGA PF for bbdev %s",
   info->dev_name);
 
 
@@ -164,8 +169,38 @@ Test Application
 BBDEV provides a test application, ``test-bbdev.py`` and range of test data for testing
 the functionality of the device, depending on the device's capabilities.
 
-For more details on how to use the test application,
-see :ref:`test_bbdev_application`.
+.. code-block:: console
+
+  "-p", "--testapp-path": specifies path to the bbdev test app.
+  "-e", "--eal-params" : EAL arguments which are passed to the test app.
+  "-t", "--timeout": Timeout in seconds (default=300).
+  "-c", "--test-cases" : Defines test cases t

[PATCH v4 4/4] baseband/fpga_5gnr_fec: cosmetic comment changes

2024-01-05 Thread Hernan Vargas
Cosmetic changes for comments.
No functional impact.

Signed-off-by: Hernan Vargas 
---
 drivers/baseband/fpga_5gnr_fec/agx100_pmd.h   |   4 +-
 .../baseband/fpga_5gnr_fec/fpga_5gnr_fec.h|  49 ++--
 .../fpga_5gnr_fec/rte_fpga_5gnr_fec.c | 248 +-
 .../fpga_5gnr_fec/rte_pmd_fpga_5gnr_fec.h |  16 +-
 4 files changed, 157 insertions(+), 160 deletions(-)

diff --git a/drivers/baseband/fpga_5gnr_fec/agx100_pmd.h b/drivers/baseband/fpga_5gnr_fec/agx100_pmd.h
index fb7085ec2d00..5e562376c966 100644
--- a/drivers/baseband/fpga_5gnr_fec/agx100_pmd.h
+++ b/drivers/baseband/fpga_5gnr_fec/agx100_pmd.h
@@ -95,7 +95,7 @@ struct __rte_packed agx100_dma_enc_desc {
c:10, /**< Total code block number in TB or CBG. */
rsrvd4:2,
num_null:10; /**< Number of null bits. */
-   uint32_t ea:21, /**< Value of E when worload is CB. */
+   uint32_t ea:21, /**< Value of E when workload is CB. */
rsrvd5:11;
uint32_t eb:21, /**< Only valid when workload is TB or CBGs. */
rsrvd6:11;
@@ -194,7 +194,7 @@ struct __rte_packed agx100_dma_dec_desc {
llr_pckg:1, /**< 0: 8-bit LLR 1: 6-bit LLR packed together. */
syndrome_check_mode:1, /**<0: full syndrome check 1: 4-layer syndome check.*/
num_null:10; /**< Number of null bits. */
-   uint32_t ea:21, /**< Value of E when worload is CB. */
+   uint32_t ea:21, /**< Value of E when workload is CB. */
rsrvd2:3,
eba:8; /**< Only valid when workload is TB or CBGs. */
uint32_t hbstore_offset_out:24, /**< HARQ buffer write address. */
diff --git a/drivers/baseband/fpga_5gnr_fec/fpga_5gnr_fec.h b/drivers/baseband/fpga_5gnr_fec/fpga_5gnr_fec.h
index 224684902569..6e97a3e9e2d4 100644
--- a/drivers/baseband/fpga_5gnr_fec/fpga_5gnr_fec.h
+++ b/drivers/baseband/fpga_5gnr_fec/fpga_5gnr_fec.h
@@ -11,7 +11,7 @@
 #include "agx100_pmd.h"
 #include "vc_5gnr_pmd.h"
 
-/* Helper macro for logging */
+/* Helper macro for logging. */
 #define rte_bbdev_log(level, fmt, ...) \
rte_log(RTE_LOG_ ## level, fpga_5gnr_fec_logtype, fmt "\n", \
##__VA_ARGS__)
@@ -24,7 +24,7 @@
 #define rte_bbdev_log_debug(fmt, ...)
 #endif
 
-/* FPGA 5GNR FEC driver names */
+/* FPGA 5GNR FEC driver names. */
 #define FPGA_5GNR_FEC_PF_DRIVER_NAME intel_fpga_5gnr_fec_pf
 #define FPGA_5GNR_FEC_VF_DRIVER_NAME intel_fpga_5gnr_fec_vf
 
@@ -43,15 +43,15 @@
 #define VC_5GNR_FPGA_VARIANT   0
 #define AGX100_FPGA_VARIANT1
 
-/* Constants from K0 computation from 3GPP 38.212 Table 5.4.2.1-2 */
-#define N_ZC_1 66 /* N = 66 Zc for BG 1 */
-#define N_ZC_2 50 /* N = 50 Zc for BG 2 */
-#define K0_1_1 17 /* K0 fraction numerator for rv 1 and BG 1 */
-#define K0_1_2 13 /* K0 fraction numerator for rv 1 and BG 2 */
-#define K0_2_1 33 /* K0 fraction numerator for rv 2 and BG 1 */
-#define K0_2_2 25 /* K0 fraction numerator for rv 2 and BG 2 */
-#define K0_3_1 56 /* K0 fraction numerator for rv 3 and BG 1 */
-#define K0_3_2 43 /* K0 fraction numerator for rv 3 and BG 2 */
+/* Constants from K0 computation from 3GPP 38.212 Table 5.4.2.1-2. */
+#define N_ZC_1 66 /**< N = 66 Zc for BG 1. */
+#define N_ZC_2 50 /**< N = 50 Zc for BG 2. */
+#define K0_1_1 17 /**< K0 fraction numerator for rv 1 and BG 1. */
+#define K0_1_2 13 /**< K0 fraction numerator for rv 1 and BG 2. */
+#define K0_2_1 33 /**< K0 fraction numerator for rv 2 and BG 1. */
+#define K0_2_2 25 /**< K0 fraction numerator for rv 2 and BG 2. */
+#define K0_3_1 56 /**< K0 fraction numerator for rv 3 and BG 1. */
+#define K0_3_2 43 /**< K0 fraction numerator for rv 3 and BG 2. */
 
 /* FPGA 5GNR Ring Control Registers. */
 enum {
@@ -93,7 +93,7 @@ struct __rte_packed fpga_5gnr_ring_ctrl_reg {
uint64_t ring_head_addr;
uint16_t ring_size:11;
uint16_t rsrvd0;
-   union { /* Miscellaneous register */
+   union { /* Miscellaneous register. */
uint8_t misc;
uint8_t max_ul_dec:5,
max_ul_dec_en:1,
@@ -140,26 +140,23 @@ struct fpga_5gnr_fec_device {
 
 /** Structure associated with each queue. */
 struct __rte_cache_aligned fpga_5gnr_queue {
-   struct fpga_5gnr_ring_ctrl_reg ring_ctrl_reg;  /**< Ring Control Register */
+   struct fpga_5gnr_ring_ctrl_reg ring_ctrl_reg;  /**< Ring Control Register. */
union {
/** Virtual address of VC 5GNR software ring. */
union vc_5gnr_dma_desc *vc_5gnr_ring_addr;
/** Virtual address of AGX100 software ring. */
union agx100_dma_desc *agx100_ring_addr;
};
-   uint64_t *ring_head_addr;  /* Virtual address of completion_head */
-   uint64_t shadow_completion_head; /* Shadow completion head value */
-   uint16_t head_free_desc;  /* Ring head */
-   uint16_t tail;  /* Ring tail */
-   /* Mask used to wrap enqueued descriptors on the sw ring */
-

Re: [PATCH v4 1/4] baseband/fpga_5gnr_fec: renaming for consistency

2024-01-05 Thread Stephen Hemminger
On Fri,  5 Jan 2024 13:15:16 -0800
Hernan Vargas  wrote:

> +#define FPGA_5GNR_QUEUE_FLUSH_TIMEOUT_US (1000)

Just my opinion, and it doesn't have to change, but these
variable names are getting quite long, which doesn't
improve readability.


[RFC 0/5] BPF infrastructure enhancements

2024-01-05 Thread Stephen Hemminger
While investigating the BPF program load in the TAP device,
I found a number of minor issues that should be addressed.

Stephen Hemminger (5):
  tap: move forward declaration of bpf_load
  tap: remove unnecessary bzero() calls in bpf api
  tap: remove unnecessary cast in call to bpf_load
  tap: get errors from kernel on bpf load failure
  tap: stop "vendoring" linux bpf header

 drivers/net/tap/bpf/bpf_extract.py |   1 -
 drivers/net/tap/tap_bpf.h  | 121 -
 drivers/net/tap/tap_bpf_api.c  |  73 +++--
 drivers/net/tap/tap_bpf_insns.h|   1 -
 drivers/net/tap/tap_flow.c |  16 ++--
 drivers/net/tap/tap_flow.h |   4 +-
 6 files changed, 60 insertions(+), 156 deletions(-)
 delete mode 100644 drivers/net/tap/tap_bpf.h

-- 
2.43.0



[RFC 1/5] tap: move forward declaration of bpf_load

2024-01-05 Thread Stephen Hemminger
The local function bpf_load forward declaration should
be in the one file using it.

Signed-off-by: Stephen Hemminger 
---
 drivers/net/tap/tap_bpf.h | 3 ---
 drivers/net/tap/tap_bpf_api.c | 3 +++
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/net/tap/tap_bpf.h b/drivers/net/tap/tap_bpf.h
index 0d38bc111fe0..aa5a733525e1 100644
--- a/drivers/net/tap/tap_bpf.h
+++ b/drivers/net/tap/tap_bpf.h
@@ -115,7 +115,4 @@ enum {
BPF_MAP_ID_SIMPLE,
 };
 
-static int bpf_load(enum bpf_prog_type type, const struct bpf_insn *insns,
-   size_t insns_cnt, const char *license);
-
 #endif /* __TAP_BPF_H__ */
diff --git a/drivers/net/tap/tap_bpf_api.c b/drivers/net/tap/tap_bpf_api.c
index 15283f8917ed..a6adec855dda 100644
--- a/drivers/net/tap/tap_bpf_api.c
+++ b/drivers/net/tap/tap_bpf_api.c
@@ -15,6 +15,9 @@
 #include 
 #include 
 
+static int bpf_load(enum bpf_prog_type type, const struct bpf_insn *insns,
+   size_t insns_cnt, const char *license);
+
 /**
  * Load BPF program (section cls_q) into the kernel and return a bpf fd
  *
-- 
2.43.0



[RFC 2/5] tap: remove unnecessary bzero() calls in bpf api

2024-01-05 Thread Stephen Hemminger
The structures are already fully initialized, so bzero() or memset()
is redundant.

Signed-off-by: Stephen Hemminger 
---
 drivers/net/tap/tap_bpf_api.c | 4 
 1 file changed, 4 deletions(-)

diff --git a/drivers/net/tap/tap_bpf_api.c b/drivers/net/tap/tap_bpf_api.c
index a6adec855dda..d176da0802eb 100644
--- a/drivers/net/tap/tap_bpf_api.c
+++ b/drivers/net/tap/tap_bpf_api.c
@@ -120,7 +120,6 @@ static int bpf_load(enum bpf_prog_type type,
 {
union bpf_attr attr = {};
 
-   bzero(&attr, sizeof(attr));
attr.prog_type = type;
attr.insn_cnt = (__u32)insns_cnt;
attr.insns = ptr_to_u64(insns);
@@ -153,7 +152,6 @@ int tap_flow_bpf_rss_map_create(unsigned int key_size,
 {
union bpf_attr attr = {};
 
-   bzero(&attr, sizeof(attr));
attr.map_type= BPF_MAP_TYPE_HASH;
attr.key_size= key_size;
attr.value_size  = value_size;
@@ -181,8 +179,6 @@ int tap_flow_bpf_update_rss_elem(int fd, void *key, void *value)
 {
union bpf_attr attr = {};
 
-   bzero(&attr, sizeof(attr));
-
attr.map_type = BPF_MAP_TYPE_HASH;
attr.map_fd = fd;
attr.key = ptr_to_u64(key);
-- 
2.43.0
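
The reasoning behind the patch above can be sketched in isolation: when an object is defined with an initializer, members that are not named in it are zero-initialized, so a following bzero() over the same object adds nothing for those members. The example is illustrative only; `struct demo_attr` and `demo_attr_init()` are hypothetical stand-ins and not the kernel's `union bpf_attr`.

```c
#include <stdint.h>

/* Hypothetical stand-in for an attribute structure passed to a syscall. */
struct demo_attr {
	uint32_t map_type;
	uint32_t key_size;
	uint32_t value_size;
	uint64_t key;
};

/*
 * Build an attribute with only key_size set. Members without an
 * explicit initializer (map_type, value_size, key) are zero-initialized
 * by the language, so no separate bzero()/memset() is needed for them.
 */
static struct demo_attr demo_attr_init(uint32_t key_size)
{
	struct demo_attr attr = { .key_size = key_size };

	return attr;
}
```

Calling `demo_attr_init(4)` returns a structure whose `key_size` is 4 and whose other members are all zero.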



[RFC 3/5] tap: remove unnecessary cast in call to bpf_load

2024-01-05 Thread Stephen Hemminger
The callers already have the correct data type.

Signed-off-by: Stephen Hemminger 
---
 drivers/net/tap/tap_bpf_api.c | 10 --
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/drivers/net/tap/tap_bpf_api.c b/drivers/net/tap/tap_bpf_api.c
index d176da0802eb..c754c167a764 100644
--- a/drivers/net/tap/tap_bpf_api.c
+++ b/drivers/net/tap/tap_bpf_api.c
@@ -32,9 +32,8 @@ int tap_flow_bpf_cls_q(__u32 queue_idx)
cls_q_insns[1].imm = queue_idx;
 
return bpf_load(BPF_PROG_TYPE_SCHED_CLS,
-   (struct bpf_insn *)cls_q_insns,
-   RTE_DIM(cls_q_insns),
-   "Dual BSD/GPL");
+   cls_q_insns, RTE_DIM(cls_q_insns),
+   "Dual BSD/GPL");
 }
 
 /**
@@ -55,9 +54,8 @@ int tap_flow_bpf_calc_l3_l4_hash(__u32 key_idx, int map_fd)
l3_l4_hash_insns[9].imm = map_fd;
 
return bpf_load(BPF_PROG_TYPE_SCHED_ACT,
-   (struct bpf_insn *)l3_l4_hash_insns,
-   RTE_DIM(l3_l4_hash_insns),
-   "Dual BSD/GPL");
+   l3_l4_hash_insns, RTE_DIM(l3_l4_hash_insns),
+   "Dual BSD/GPL");
 }
 
 /**
-- 
2.43.0
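
As a rough illustration of why the casts can go: an array of a structure type converts implicitly to a pointer to its first element, and adding const is also implicit. The names demo_insn, demo_prog and demo_count_imm() below are hypothetical and only mirror the shape of the bpf_load() call.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct bpf_insn. */
struct demo_insn {
	unsigned char code;
	int imm;
};

#define DEMO_DIM(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical instruction table, same shape as cls_q_insns[]. */
struct demo_insn demo_prog[] = {
	{ 0x18, 7 },
	{ 0x00, 0 },
	{ 0x95, 0 },
};

/* Count instructions with a non-zero immediate; takes a const pointer. */
size_t demo_count_imm(const struct demo_insn *insns, size_t cnt)
{
	size_t i, n = 0;

	for (i = 0; i < cnt; i++)
		if (insns[i].imm != 0)
			n++;
	return n;
}
```

The call demo_count_imm(demo_prog, DEMO_DIM(demo_prog)) compiles without a cast. Leaving the cast out also lets the compiler complain if the table ever changes to a different type.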



[RFC 4/5] tap: get errors from kernel on bpf load failure

2024-01-05 Thread Stephen Hemminger
The bpf load kernel API can provide some useful diagnostics
on failure.

Signed-off-by: Stephen Hemminger 
---
 drivers/net/tap/tap_bpf_api.c | 44 +++
 drivers/net/tap/tap_flow.c| 16 -
 drivers/net/tap/tap_flow.h|  4 ++--
 3 files changed, 46 insertions(+), 18 deletions(-)

diff --git a/drivers/net/tap/tap_bpf_api.c b/drivers/net/tap/tap_bpf_api.c
index c754c167a764..29223b7f0ea7 100644
--- a/drivers/net/tap/tap_bpf_api.c
+++ b/drivers/net/tap/tap_bpf_api.c
@@ -15,8 +15,10 @@
 #include 
 #include 
 
-static int bpf_load(enum bpf_prog_type type, const struct bpf_insn *insns,
-   size_t insns_cnt, const char *license);
+static int bpf_load(enum bpf_prog_type type,
+   const struct bpf_insn *insns, size_t insns_cnt,
+   char *log_buf, size_t log_size,
+   const char *license);
 
 /**
  * Load BPF program (section cls_q) into the kernel and return a bpf fd
@@ -24,15 +26,22 @@ static int bpf_load(enum bpf_prog_type type, const struct bpf_insn *insns,
  * @param queue_idx
  *   Queue index matching packet cb
  *
+ * @param log_buf
+ *   Buffer to place resulting error message (optional)
+ *
+ * @param log_size
+ *   Size of log_buf
+ *
  * @return
  *   -1 if the BPF program couldn't be loaded. An fd (int) otherwise.
  */
-int tap_flow_bpf_cls_q(__u32 queue_idx)
+int tap_flow_bpf_cls_q(__u32 queue_idx, char *log_buf, size_t log_size)
 {
cls_q_insns[1].imm = queue_idx;
 
return bpf_load(BPF_PROG_TYPE_SCHED_CLS,
cls_q_insns, RTE_DIM(cls_q_insns),
+   log_buf, log_size,
"Dual BSD/GPL");
 }
 
@@ -45,16 +54,23 @@ int tap_flow_bpf_cls_q(__u32 queue_idx)
  * @param[in] map_fd
  *   BPF RSS map file descriptor
  *
+ * @param log_buf
+ *   Buffer to place resulting error message (optional)
+ *
+ * @param log_size
+ *   Size of log_buf
+ *
  * @return
  *   -1 if the BPF program couldn't be loaded. An fd (int) otherwise.
  */
-int tap_flow_bpf_calc_l3_l4_hash(__u32 key_idx, int map_fd)
+int tap_flow_bpf_calc_l3_l4_hash(__u32 key_idx, int map_fd, char *log_buf, size_t log_size)
 {
l3_l4_hash_insns[4].imm = key_idx;
l3_l4_hash_insns[9].imm = map_fd;
 
return bpf_load(BPF_PROG_TYPE_SCHED_ACT,
l3_l4_hash_insns, RTE_DIM(l3_l4_hash_insns),
+   log_buf, log_size,
"Dual BSD/GPL");
 }
 
@@ -105,6 +121,12 @@ static inline int sys_bpf(enum bpf_cmd cmd, union bpf_attr *attr,
  * @param[in] insns_cnt
  *   Number of BPF instructions (size of array)
  *
+ * @param[out] log_buf
+ *   Space for log message
+ *
+ * @param[in] log_size
+ *   Number of characters available in log_buf
+ *
  * @param[in] license
  *   License string that must be acknowledged by the kernel
  *
@@ -112,9 +134,8 @@ static inline int sys_bpf(enum bpf_cmd cmd, union bpf_attr *attr,
  *   -1 if the BPF program couldn't be loaded, fd (file descriptor) otherwise
  */
 static int bpf_load(enum bpf_prog_type type,
- const struct bpf_insn *insns,
- size_t insns_cnt,
- const char *license)
+   const struct bpf_insn *insns, size_t insns_cnt,
+   char *log_buf, size_t log_size, const char *license)
 {
union bpf_attr attr = {};
 
@@ -122,9 +143,12 @@ static int bpf_load(enum bpf_prog_type type,
attr.insn_cnt = (__u32)insns_cnt;
attr.insns = ptr_to_u64(insns);
attr.license = ptr_to_u64(license);
-   attr.log_buf = ptr_to_u64(NULL);
-   attr.log_level = 0;
-   attr.kern_version = 0;
+
+   if (log_size > 0) {
+   attr.log_level = 2;
+   attr.log_buf = ptr_to_u64(log_buf);
+   attr.log_size = log_size;
+   }
 
return sys_bpf(BPF_PROG_LOAD, &attr, sizeof(attr));
 }
diff --git a/drivers/net/tap/tap_flow.c b/drivers/net/tap/tap_flow.c
index ed4d42f92f9f..897d71acbad1 100644
--- a/drivers/net/tap/tap_flow.c
+++ b/drivers/net/tap/tap_flow.c
@@ -18,6 +18,8 @@
 #include 
 #include 
 
+#define BPF_LOG_BUFSIZ 1024
+
 #ifndef HAVE_TC_FLOWER
 /*
  * For kernels < 4.2, this enum is not defined. Runtime checks will be made to
@@ -1885,11 +1887,13 @@ static int rss_enable(struct pmd_internals *pmd,
 * the correct queue.
 */
for (i = 0; i < pmd->dev->data->nb_rx_queues; i++) {
-   pmd->bpf_fd[i] = tap_flow_bpf_cls_q(i);
+   char log_buf[BPF_LOG_BUFSIZ];
+
+   pmd->bpf_fd[i] = tap_flow_bpf_cls_q(i, log_buf, sizeof(log_buf));
if (pmd->bpf_fd[i] < 0) {
TAP_LOG(ERR,
-   "Failed to load BPF section %s for queue %d",
-   SEC_NAME_CLS_Q, i);
+   "Failed to load BPF section %s for queue %u: %s",
+   SEC_NAME_CLS_Q, i, log_buf);
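
A minimal sketch of the idea in this patch, assuming a Linux build host with linux/bpf.h available. The helper names demo_prog_load_attr_init() and demo_prog_load() are hypothetical and are not the driver's functions. Unlike the hunk above, the sketch also checks log_buf for NULL and writes an empty string into the buffer before the call.

```c
#define _GNU_SOURCE
#include <linux/bpf.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Hypothetical helper: fill a BPF_PROG_LOAD request. The verifier log
 * is requested only when the caller supplies a buffer.
 */
void demo_prog_load_attr_init(union bpf_attr *attr, enum bpf_prog_type type,
			      const struct bpf_insn *insns, size_t insns_cnt,
			      char *log_buf, size_t log_size,
			      const char *license)
{
	memset(attr, 0, sizeof(*attr));
	attr->prog_type = type;
	attr->insn_cnt = (__u32)insns_cnt;
	attr->insns = (__u64)(uintptr_t)insns;
	attr->license = (__u64)(uintptr_t)license;

	if (log_buf != NULL && log_size > 0) {
		/* Keep the buffer printable even if the kernel writes nothing. */
		log_buf[0] = '\0';
		attr->log_level = 2;
		attr->log_buf = (__u64)(uintptr_t)log_buf;
		attr->log_size = (__u32)log_size;
	}
}

/* Hypothetical wrapper: returns a program fd, or -1 with errno set. */
int demo_prog_load(enum bpf_prog_type type,
		   const struct bpf_insn *insns, size_t insns_cnt,
		   char *log_buf, size_t log_size, const char *license)
{
	union bpf_attr attr;

	demo_prog_load_attr_init(&attr, type, insns, insns_cnt,
				 log_buf, log_size, license);
	return (int)syscall(__NR_bpf, BPF_PROG_LOAD, &attr, sizeof(attr));
}
```

The reason for the empty string: in the hunk above, log_buf in rss_enable() is declared without an initializer, so if the load fails before the verifier writes anything (for example on a permission error), the "%s" in TAP_LOG would appear to print whatever was on the stack.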

[RFC 5/5] tap: stop "vendoring" linux bpf header

2024-01-05 Thread Stephen Hemminger
The proper place to find BPF structures and functions is
linux/bpf.h. The original version was trying to work around the
case where the build environment had an old, pre-BPF version of
glibc but the target environment had BPF. This is not a
supportable build method, and not how the rest of DPDK works.

Having a private (and divergent) copy of the headers leads to
problems when the BPF definitions evolve.

Signed-off-by: Stephen Hemminger 
---
 drivers/net/tap/bpf/bpf_extract.py |   1 -
 drivers/net/tap/tap_bpf.h  | 118 -
 drivers/net/tap/tap_bpf_api.c  |  16 ++--
 drivers/net/tap/tap_bpf_insns.h|   1 -
 4 files changed, 9 insertions(+), 127 deletions(-)
 delete mode 100644 drivers/net/tap/tap_bpf.h

diff --git a/drivers/net/tap/bpf/bpf_extract.py b/drivers/net/tap/bpf/bpf_extract.py
index b630c42b809f..73c4dafe4eca 100644
--- a/drivers/net/tap/bpf/bpf_extract.py
+++ b/drivers/net/tap/bpf/bpf_extract.py
@@ -65,7 +65,6 @@ def write_header(out, source):
 print(f' * Auto-generated from {source}', file=out)
 print(" * This not the original source file. Do NOT edit it.", file=out)
 print(" */\n", file=out)
-print("#include ", file=out)
 
 
 def main():
diff --git a/drivers/net/tap/tap_bpf.h b/drivers/net/tap/tap_bpf.h
deleted file mode 100644
index aa5a733525e1..
--- a/drivers/net/tap/tap_bpf.h
+++ /dev/null
@@ -1,118 +0,0 @@
-/* SPDX-License-Identifier: BSD-3-Clause OR GPL-2.0
- * Copyright 2017 Mellanox Technologies, Ltd
- */
-
-#ifndef __TAP_BPF_H__
-#define __TAP_BPF_H__
-
-#include 
-
-/* Do not #include  since eBPF must compile on different
- * distros which may include partial definitions for eBPF (while the
- * kernel itself may support eBPF). Instead define here all that is needed
- */
-
-/* BPF_MAP_UPDATE_ELEM command flags */
-#defineBPF_ANY 0 /* create a new element or update an existing */
-
-/* BPF architecture instruction struct */
-struct bpf_insn {
-   __u8code;
-   __u8dst_reg:4;
-   __u8src_reg:4;
-   __s16   off;
-   __s32   imm; /* immediate value */
-};
-
-/* BPF program types */
-enum bpf_prog_type {
-   BPF_PROG_TYPE_UNSPEC,
-   BPF_PROG_TYPE_SOCKET_FILTER,
-   BPF_PROG_TYPE_KPROBE,
-   BPF_PROG_TYPE_SCHED_CLS,
-   BPF_PROG_TYPE_SCHED_ACT,
-};
-
-/* BPF commands types */
-enum bpf_cmd {
-   BPF_MAP_CREATE,
-   BPF_MAP_LOOKUP_ELEM,
-   BPF_MAP_UPDATE_ELEM,
-   BPF_MAP_DELETE_ELEM,
-   BPF_MAP_GET_NEXT_KEY,
-   BPF_PROG_LOAD,
-};
-
-/* BPF maps types */
-enum bpf_map_type {
-   BPF_MAP_TYPE_UNSPEC,
-   BPF_MAP_TYPE_HASH,
-};
-
-/* union of anonymous structs used with TAP BPF commands */
-union bpf_attr {
-   /* BPF_MAP_CREATE command */
-   struct {
-   __u32   map_type;
-   __u32   key_size;
-   __u32   value_size;
-   __u32   max_entries;
-   __u32   map_flags;
-   __u32   inner_map_fd;
-   };
-
-   /* BPF_MAP_UPDATE_ELEM, BPF_MAP_DELETE_ELEM commands */
-   struct {
-   __u32   map_fd;
-   __aligned_u64   key;
-   union {
-   __aligned_u64 value;
-   __aligned_u64 next_key;
-   };
-   __u64   flags;
-   };
-
-   /* BPF_PROG_LOAD command */
-   struct {
-   __u32   prog_type;
-   __u32   insn_cnt;
-   __aligned_u64   insns;
-   __aligned_u64   license;
-   __u32   log_level;
-   __u32   log_size;
-   __aligned_u64   log_buf;
-   __u32   kern_version;
-   __u32   prog_flags;
-   };
-} __rte_aligned(8);
-
-#ifndef __NR_bpf
-# if defined(__i386__)
-#  define __NR_bpf 357
-# elif defined(__x86_64__)
-#  define __NR_bpf 321
-# elif defined(__arm__)
-#  define __NR_bpf 386
-# elif defined(__aarch64__)
-#  define __NR_bpf 280
-# elif defined(__sparc__)
-#  define __NR_bpf 349
-# elif defined(__s390__)
-#  define __NR_bpf 351
-# elif defined(__powerpc__)
-#  define __NR_bpf 361
-# elif defined(__riscv)
-#  define __NR_bpf 280
-# elif defined(__loongarch__)
-#  define __NR_bpf 280
-# else
-#  error __NR_bpf not defined
-# endif
-#endif
-
-enum {
-   BPF_MAP_ID_KEY,
-   BPF_MAP_ID_SIMPLE,
-};
-
-#endif /* __TAP_BPF_H__ */
diff --git a/drivers/net/tap/tap_bpf_api.c b/drivers/net/tap/tap_bpf_api.c
index 29223b7f0ea7..54e469ebf5ff 100644
--- a/drivers/net/tap/tap_bpf_api.c
+++ b/drivers/net/tap/tap_bpf_api.c
@@ -2,17 +2,13 @@
  * Copyright 2017 Mellanox Technologies, Ltd
  */
 
-#include 
-#include 
 #include 
-#include 
+#include 
+#include 
 
-#include 
-#include 
 #include 
 #include 
-#include 
-#include 
+
 #include 
 
 static int bpf_load(enum bpf_prog_type type,
@@ -106,7 +102,13 @@ static inline __u64 ptr_to_u64(const void *ptr)
 static

RE: [External] : Re: [PATCH] net/tap: Modified TAP BPF program as per the new Kernel-version upgrade requirements.

2024-01-05 Thread Madhuker Mythri
Flow creation worked well for us on the Azure/Hyper-V platform using the
failsafe/tap PMD with kernel versions 5.15.0 and 5.4.

So, does the original/existing code work well on kernel 6.5?

Thanks,
Madhuker.

-Original Message-
From: Stephen Hemminger  
Sent: 06 January 2024 01:01
To: Madhuker Mythri 
Cc: ferruh.yi...@amd.com; dev@dpdk.org
Subject: [External] : Re: [PATCH] net/tap: Modified TAP BPF program as per the 
new Kernel-version upgrade requirements.

On Thu,  4 Jan 2024 22:57:56 +0530
madhuker.myt...@oracle.com wrote:

> From: Madhuker Mythri
> 
> When multiple queues are configured, RSS is enabled internally, so the TAP
> BPF RSS byte-code is loaded into the kernel using BPF system calls.
> 
> The problem is that loading the existing BPF byte-code on kernel 5.15 and
> later fails: the kernel BPF verifier rejects the existing byte-code and the
> system call returns error code "-7", as follows:
> 
> rss_add_actions(): Failed to load BPF section l3_l4 (7): Argument list too long
> 
> 
> RCA: These errors first appeared with kernel 5.15, which added many new
> BPF verifier restrictions for safe execution of byte-code in the kernel.
> As a result, the existing BPF program no longer passes verification.
> The major BPF verifier restrictions observed are:
> 1) The new BPF maps structure must be used.
> 2) Kernel SKB data pointer access is not allowed.
> 3) Loops without a fixed bound (bounded only by a variable value) are not
>    allowed.
> 4) Unreachable instructions (for example, undefined array access) are not
>    allowed.
> 
> After addressing all of these verifier restrictions, the BPF byte-code
> loads into the kernel successfully.
> 
> Note: These BPF changes are supported from kernel 4.10 onward.
> 
> Bugzilla Id: 1329
> 
> Signed-off-by: Madhuker Mythri 

I tried this version on Debian testing which has:
kernel 6.5.0-5-amd64
clang 16.0.6

If I build and run with the pre-compiled BPF, it loads the example flow
(see https://doc.dpdk.org/guides/nics/tap.html).

But if I recompile the BPF program by running make in the tap/bpf directory,
the resulting BPF instructions do not make it past the verifier.

With the modified tap_bpf_api, the log message is:

testpmd> flow create 0 priority 4 ingress pattern eth dst is 0a:0b:0c:0d:0e:0f / ipv4 / tcp / end actions rss queues 0 1 2 3 end / end
rss_add_actions(): Failed to load BPF section l3_l4 (13): func#0 @0
0: R1=ctx(off=0,imm=0) R10=fp0
0: (bf) r6 = r1   ; R1=ctx(off=0,imm=0) R6_w=ctx(off=0,imm=0)
1: (18) r1 = 0x300; R1_w=768
3: (63) *(u32 *)(r10 -84) = r1; R1_w=768 R10=fp0 fp-88=
4: (bf) r2 = r10  ; R2_w=fp0 R10=fp0
5: (07) r2 += -84 ; R2_w=fp-84
6: (18) r1 = 0xfd ; R1_w=253
8: (85) call bpf_map_lookup_elem#1
R1 type=scalar expected=map_ptr
processed 7 insns (limit 100) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0

port_flow_complain(): Caught PMD error type 16 (specific action): cause: 0x7ffcef37e678, action not supported: Operation not supported
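
For readers of the log above: the verifier reports "R1 type=scalar expected=map_ptr" when the register passed to bpf_map_lookup_elem holds a plain number. A 64-bit immediate load (opcode 0x18) is treated as a map reference only when its src_reg field is BPF_PSEUDO_MAP_FD; otherwise the value is a scalar. The sketch below shows that encoding, assuming linux/bpf.h is available. The helper demo_emit_ld_map_fd() is hypothetical, and whether a missing pseudo source register is what actually happened with the recompiled program in this thread is not established.

```c
#include <linux/bpf.h>

/*
 * Hypothetical helper: write the two-slot 64-bit immediate load that
 * places a map reference in dst_reg. The kernel replaces the fd with
 * the map pointer at load time because src_reg is BPF_PSEUDO_MAP_FD.
 */
void demo_emit_ld_map_fd(struct bpf_insn insn[2], __u8 dst_reg, int map_fd)
{
	insn[0] = (struct bpf_insn){
		.code = BPF_LD | BPF_DW | BPF_IMM,
		.dst_reg = dst_reg,
		.src_reg = BPF_PSEUDO_MAP_FD,
		.imm = map_fd,
	};
	/* Second slot carries the upper 32 bits of the immediate: zero for an fd. */
	insn[1] = (struct bpf_insn){ 0 };
}
```

With dst_reg set to BPF_REG_1 and map_fd set to 253, the first slot would disassemble like instruction 6 in the log, except that the verifier would then track R1 as a map pointer instead of the scalar 253.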



RE: [External] : Re: [PATCH] net/tap: Modified TAP BPF program as per the new Kernel-version upgrade requirements.

2024-01-05 Thread Madhuker Mythri
I tested these changes with kernel versions 5.4 and 5.15, and they work
well.
I found that the helper functions were introduced in kernel 4.7, so I think
this code should work from kernel 4.9, as mentioned in the TAP PMD
documentation: https://doc.dpdk.org/guides/nics/tap.html.

Yes, I understand that BPF program code is very sensitive and difficult to
debug. However, in my testing this works well (the RSS BPF instructions load)
on Azure/Hyper-V platforms with the failsafe/tap PMD.

Thanks,
Madhuker.

-Original Message-
From: Stephen Hemminger  
Sent: 05 January 2024 23:10
To: Madhuker Mythri 
Cc: ferruh.yi...@amd.com; dev@dpdk.org
Subject: Re: [External] : Re: [PATCH] net/tap: Modified TAP BPF program as per 
the new Kernel-version upgrade requirements.

On Fri, 5 Jan 2024 14:44:00 +
Madhuker Mythri  wrote:

> Hi Stephen,
> 
> The BPF helper man pages imply as much, and SKB data pointer access
> worked up to kernel 5.4. However, from kernel 5.15 onward, the
> eBPF verifier throws an error when we use SKB data pointer access.
> So, I used these helper functions and was able to resolve the errors. These
> helper functions are safe to use and also protect against non-linear skb
> data buffer access.
> 
> So, I think using helper functions is a better and safer way to access the
> SKB data than pointer access.
> 
> Thanks,
> Madhuker.

Using the accessors may mean it won't work with older kernels, but that is not 
a huge concern given how fragile this code is.