Add a new private API to create and destroy a parser for GENEVE TLV options.
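
As a rough illustration of the DW accounting an application may want to do
before calling the new API, the sketch below counts the DWs that an option
list would request and compares the total with the 7 DWs documented per
physical device. It is a minimal sketch, not driver code:
struct example_geneve_tlv is a hypothetical local mirror of the documented
fields of struct rte_pmd_mlx5_geneve_tlv, and every example_* name and all
option values are made up. It assumes that any data DW with a non-zero mask
consumes one DW, which is how the sample-creation loop in this patch behaves.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical local mirror of the fields documented for
 * struct rte_pmd_mlx5_geneve_tlv; NOT the driver's definition.
 */
struct example_geneve_tlv {
	uint16_t option_class;           /* class (big-endian in the real API) */
	uint8_t option_type;             /* type, identifies the option */
	uint8_t option_len;              /* data length, in DWs */
	uint8_t match_on_class_mode;     /* 2: class is matchable, costs one DW */
	uint8_t offset;                  /* first sampled DW */
	uint8_t sample_len;              /* number of DWs to sample */
	const uint32_t *match_data_mask; /* one mask per sampled DW */
};

/* DWs available per physical device, per the documentation in this patch. */
#define EXAMPLE_DW_BUDGET 7

/* Assumption: every data DW with a non-zero mask consumes one DW. */
static unsigned int
example_option_dws(const struct example_geneve_tlv *opt)
{
	unsigned int dws = (opt->match_on_class_mode == 2) ? 1 : 0;
	uint8_t i;

	/* Same sanity rule the patch validates: the sample must fit. */
	assert(opt->option_len >= opt->offset + opt->sample_len);
	for (i = 0; i < opt->sample_len; ++i)
		if (opt->match_data_mask[i] != 0)
			dws++;
	return dws;
}

/* Returns 1 when the whole list fits into the DW budget, 0 otherwise. */
static int
example_list_fits(const struct example_geneve_tlv list[], uint8_t nb_options)
{
	unsigned int total = 0;
	uint8_t i;

	for (i = 0; i < nb_options; ++i)
		total += example_option_dws(&list[i]);
	return total <= EXAMPLE_DW_BUDGET;
}

/* Two made-up options: class DW plus two data DWs, and one data DW. */
static const uint32_t example_mask_a[] = { 0xffffffff, 0, 0xffffffff };
static const uint32_t example_mask_b[] = { 0xffffffff };
static const struct example_geneve_tlv example_list[] = {
	{ .option_class = 0x0102, .option_type = 0x21, .option_len = 4,
	  .match_on_class_mode = 2, .offset = 0, .sample_len = 3,
	  .match_data_mask = example_mask_a },
	{ .option_class = 0x0103, .option_type = 0x22, .option_len = 1,
	  .match_on_class_mode = 1, .offset = 0, .sample_len = 1,
	  .match_data_mask = example_mask_b },
};
```

A list that passes such a check would then be handed, in the same order on
every port of the physical device, to rte_pmd_mlx5_create_geneve_tlv_parser().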

Signed-off-by: Michael Baum <>
Signed-off-by: Viacheslav Ovsiienko <>
Acked-by: Suanming Mou <>
 doc/guides/nics/mlx5.rst            | 122 ++++++
 doc/guides/platform/mlx5.rst        |   6 +-
 drivers/net/mlx5/        |   1 +
 drivers/net/mlx5/mlx5.c             |  30 +-
 drivers/net/mlx5/mlx5.h             |   8 +
 drivers/net/mlx5/mlx5_flow.c        |  30 ++
 drivers/net/mlx5/mlx5_flow.h        |  18 +
 drivers/net/mlx5/mlx5_flow_geneve.c | 627 ++++++++++++++++++++++++++++
 drivers/net/mlx5/rte_pmd_mlx5.h     | 102 +++++
 drivers/net/mlx5/        |   3 +
 10 files changed, 945 insertions(+), 2 deletions(-)
 create mode 100644 drivers/net/mlx5/mlx5_flow_geneve.c

diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst
index f8930cb902..e82f7034aa 100644
--- a/doc/guides/nics/mlx5.rst
+++ b/doc/guides/nics/mlx5.rst
@@ -2314,6 +2314,128 @@ and disables ``avail_thresh_triggered``.
    testpmd> mlx5 set port 1 host_shaper avail_thresh_triggered 0 rate 50
+.. _geneve_parser_api:
+GENEVE TLV options parser
+NVIDIA ConnectX and BlueField devices support configuring a flex parser for
+`GENEVE TLV options 
+Each physical device has 7 DWs for GENEVE TLV options.
+Partial option configuration is supported: a data mask provided at parser
+creation indicates which DWs are requested. Only masked data DWs
+can be matched later as item fields using the flow API.
+Matching of ``type`` field is supported for each configured option.
+However, for matching the ``class`` field, the option should be configured with
+``match_on_class_mode=2``. Matching on ``length`` field is not supported.
+When ``match_on_class_mode=2`` is requested, one extra DW is consumed for it.
+Parser API
+An API to create/destroy GENEVE TLV parser is added.
+Although the parser is created per physical device, this API is port oriented.
+Each port should call this API before using the GENEVE OPT item,
+and its configuration must use the same options list, in the same internal
+order, as configured by the first port.
+Calling this API for different ports under the same physical device doesn't
+consume more DWs: the first call creates the parser and the rest reuse the
+same configuration.
+``struct rte_pmd_mlx5_geneve_tlv`` is used for single option configuration:
+.. _table_rte_pmd_mlx5_geneve_tlv:
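+
+.. table:: GENEVE TLV
+
+   +-------------------------+-------------------------------------------------+
+   | Field                   | Value                                           |
+   +=========================+=================================================+
+   | ``option_class``        | class                                           |
+   +-------------------------+-------------------------------------------------+
+   | ``option_type``         | type                                            |
+   +-------------------------+-------------------------------------------------+
+   | ``option_len``          | data length in DW granularity                   |
+   +-------------------------+-------------------------------------------------+
+   | ``match_on_class_mode`` | indicator about class field role in this option |
+   +-------------------------+-------------------------------------------------+
+   | ``offset``              | offset of the first sample in DW granularity    |
+   +-------------------------+-------------------------------------------------+
+   | ``sample_len``          | number of DW to sample                          |
+   +-------------------------+-------------------------------------------------+
+   | ``match_data_mask``     | array of DWs which each bit marks if this bit   |
+   |                         | should be sampled                               |
+   +-------------------------+-------------------------------------------------+
+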
+Creates GENEVE TLV parser for the selected port.
+This function must be called before first use of GENEVE option.
+.. code-block:: c
+   void *
+   rte_pmd_mlx5_create_geneve_tlv_parser(uint16_t port_id,
+                                         const struct rte_pmd_mlx5_geneve_tlv tlv_list[],
+                                         uint8_t nb_options);
+The parser creation is done once for all GENEVE TLV options.
+For adding a new option, the existing parser should be destroyed first.
+- ``port_id``: port identifier of Ethernet device.
+- ``tlv_list``: list of GENEVE TLV options to create parser for them.
+- ``nb_options``: number of options in TLV list.
+Return values:
+- A valid handle in case of success, NULL otherwise (``rte_errno`` is also set);
+  the following errors are defined.
+- ``ENODEV``: there is no Ethernet device for this port id.
+- ``EINVAL``: invalid GENEVE TLV option requested.
+- ``ENOTSUP``: the port doesn't support GENEVE TLV parsing.
+- ``EEXIST``: this port already has GENEVE TLV parser or another port under the
+  same physical device has already prepared a different parser.
+- ``ENOMEM``: not enough memory to execute the function, or resource limitation
+  on the device.
+Destroy GENEVE TLV parser created by ``rte_pmd_mlx5_create_geneve_tlv_parser()``.
+This function must be called after last use of GENEVE option and before port
+closing.
+.. code-block:: c
+   int
+   rte_pmd_mlx5_destroy_geneve_tlv_parser(void *handle);
+Failure to destroy a parser handle may occur when one of the options is used by
+valid template table.
+- ``handle``: handle for the GENEVE TLV parser object to be destroyed.
+Return values:
+- 0 on success, a negative errno value otherwise and ``rte_errno`` is set.
+* Supported only in HW steering (``dv_flow_en`` = 2).
+* Supported only when ``FLEX_PARSER_PROFILE_ENABLE`` = 8.
+* Supported for FW version **xx.37.0142** and above.
 Testpmd driver specific commands
diff --git a/doc/guides/platform/mlx5.rst b/doc/guides/platform/mlx5.rst
index 400000e284..d16508d0da 100644
--- a/doc/guides/platform/mlx5.rst
+++ b/doc/guides/platform/mlx5.rst
@@ -536,10 +536,14 @@ Below are some firmware configurations listed.
-- enable Geneve TLV option flow matching::
+- enable Geneve TLV option flow matching in SW steering::
+- enable Geneve TLV option flow matching in HW steering::
 - enable GTP flow matching::
diff --git a/drivers/net/mlx5/ b/drivers/net/mlx5/
index 69771c63ab..d705fe21bb 100644
--- a/drivers/net/mlx5/
+++ b/drivers/net/mlx5/
@@ -46,6 +46,7 @@ sources = files(
 if is_linux
     sources += files(
+            'mlx5_flow_geneve.c',
diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index f9fc652136..5f8af31aea 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -1722,6 +1722,19 @@ mlx5_get_physical_device(struct mlx5_common_device *cdev)
        return phdev;
+struct mlx5_physical_device *
+mlx5_get_locked_physical_device(struct mlx5_priv *priv)
+{
+       pthread_mutex_lock(&mlx5_dev_ctx_list_mutex);
+       return priv->sh->phdev;
+}
+
+void
+mlx5_unlock_physical_device(void)
+{
+       pthread_mutex_unlock(&mlx5_dev_ctx_list_mutex);
+}
+
 static void
 mlx5_physical_device_destroy(struct mlx5_physical_device *phdev)
@@ -2278,6 +2291,7 @@ int
 mlx5_dev_close(struct rte_eth_dev *dev)
        struct mlx5_priv *priv = dev->data->dev_private;
+       struct mlx5_dev_ctx_shared *sh = priv->sh;
        unsigned int i;
        int ret;
@@ -2290,7 +2304,7 @@ mlx5_dev_close(struct rte_eth_dev *dev)
                return 0;
-       if (!priv->sh)
+       if (!sh)
                return 0;
        if (priv->shared_refcnt) {
                DRV_LOG(ERR, "port %u is shared host in use (%u)",
@@ -2298,6 +2312,15 @@ mlx5_dev_close(struct rte_eth_dev *dev)
                rte_errno = EBUSY;
                return -EBUSY;
+       /* Check if shared GENEVE options created on context being closed. */
+       ret = mlx5_geneve_tlv_options_check_busy(priv);
+       if (ret) {
+               DRV_LOG(ERR, "port %u maintains shared GENEVE TLV options",
+                       dev->data->port_id);
+               return ret;
+       }
        DRV_LOG(DEBUG, "port %u closing device \"%s\"",
                ((priv->sh->cdev->ctx != NULL) ?
@@ -2330,6 +2353,11 @@ mlx5_dev_close(struct rte_eth_dev *dev)
+       if (priv->tlv_options != NULL) {
+               /* Free the GENEVE TLV parser resource. */
+               claim_zero(mlx5_geneve_tlv_options_destroy(priv->tlv_options, sh->phdev));
+               priv->tlv_options = NULL;
+       }
        if (priv->rxq_privs != NULL) {
                /* XXX race condition if mlx5_rx_burst() is still running. */
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index 8bf7f86416..683029023e 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -1419,6 +1419,8 @@ struct mlx5_dev_registers {
+struct mlx5_geneve_tlv_options;
  * Physical device structure.
  * This device is created once per NIC to manage recourses shared by all ports
@@ -1428,6 +1430,7 @@ struct mlx5_physical_device {
        LIST_ENTRY(mlx5_physical_device) next;
        struct mlx5_dev_ctx_shared *sh; /* Created on sherd context. */
        uint64_t guid; /* System image guid, the uniq ID of physical device. */
+       struct mlx5_geneve_tlv_options *tlv_options;
        uint32_t refcnt;
@@ -1950,6 +1953,8 @@ struct mlx5_priv {
        /* Action template list. */
        LIST_HEAD(flow_hw_at, rte_flow_actions_template) flow_hw_at;
        struct mlx5dr_context *dr_ctx; /**< HW steering DR context. */
+       /* Pointer to the GENEVE TLV options. */
+       struct mlx5_geneve_tlv_options *tlv_options;
        /* HW steering queue polling mechanism job descriptor LIFO. */
        uint32_t hws_strict_queue:1;
        /**< Whether all operations strictly happen on the same HWS queue. */
@@ -2088,6 +2093,9 @@ void mlx5_flow_counter_mode_config(struct rte_eth_dev 
 int mlx5_flow_aso_age_mng_init(struct mlx5_dev_ctx_shared *sh);
 int mlx5_aso_flow_mtrs_mng_init(struct mlx5_dev_ctx_shared *sh);
 int mlx5_flow_aso_ct_mng_init(struct mlx5_dev_ctx_shared *sh);
+struct mlx5_physical_device *
+mlx5_get_locked_physical_device(struct mlx5_priv *priv);
+void mlx5_unlock_physical_device(void);
 /* mlx5_ethdev.c */
diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c
index acaf34ce52..5159e8e773 100644
--- a/drivers/net/mlx5/mlx5_flow.c
+++ b/drivers/net/mlx5/mlx5_flow.c
@@ -12524,3 +12524,33 @@ mlx5_flow_discover_ipv6_tc_support(struct rte_eth_dev 
        flow_list_destroy(dev, MLX5_FLOW_TYPE_GEN, flow_idx);
        return 0;
+void *
+rte_pmd_mlx5_create_geneve_tlv_parser(uint16_t port_id,
+                                     const struct rte_pmd_mlx5_geneve_tlv tlv_list[],
+                                     uint8_t nb_options)
+       return mlx5_geneve_tlv_parser_create(port_id, tlv_list, nb_options);
+       (void)port_id;
+       (void)tlv_list;
+       (void)nb_options;
+       DRV_LOG(ERR, "%s is not supported.", __func__);
+       rte_errno = ENOTSUP;
+       return NULL;
+rte_pmd_mlx5_destroy_geneve_tlv_parser(void *handle)
+       return mlx5_geneve_tlv_parser_destroy(handle);
+       (void)handle;
+       DRV_LOG(ERR, "%s is not supported.", __func__);
+       rte_errno = ENOTSUP;
+       return -rte_errno;
diff --git a/drivers/net/mlx5/mlx5_flow.h b/drivers/net/mlx5/mlx5_flow.h
index 6f720de14d..4bf9ed7e4d 100644
--- a/drivers/net/mlx5/mlx5_flow.h
+++ b/drivers/net/mlx5/mlx5_flow.h
@@ -1336,6 +1336,8 @@ struct mlx5_action_construct_data {
 /* Flow item template struct. */
 struct rte_flow_pattern_template {
        LIST_ENTRY(rte_flow_pattern_template) next;
@@ -1650,6 +1652,11 @@ struct mlx5_flow_split_info {
        uint64_t prefix_layers; /**< Prefix subflow layers. */
+struct mlx5_hl_data {
+       uint8_t dw_offset;
+       uint32_t dw_mask;
 struct flow_hw_port_info {
        uint32_t regc_mask;
        uint32_t regc_value;
@@ -1765,6 +1772,12 @@ flow_hw_get_reg_id_from_ctx(void *dr_ctx,
        return REG_NON;
+void *
+mlx5_geneve_tlv_parser_create(uint16_t port_id,
+                             const struct rte_pmd_mlx5_geneve_tlv tlv_list[],
+                             uint8_t nb_options);
+int mlx5_geneve_tlv_parser_destroy(void *handle);
 void flow_hw_set_port_info(struct rte_eth_dev *dev);
 void flow_hw_clear_port_info(struct rte_eth_dev *dev);
 int flow_hw_create_vport_action(struct rte_eth_dev *dev);
@@ -2810,6 +2823,11 @@ mlx5_get_tof(const struct rte_flow_item *items,
             enum mlx5_tof_rule_type *rule_type);
 flow_hw_resource_release(struct rte_eth_dev *dev);
+mlx5_geneve_tlv_options_destroy(struct mlx5_geneve_tlv_options *options,
+                               struct mlx5_physical_device *phdev);
+mlx5_geneve_tlv_options_check_busy(struct mlx5_priv *priv);
 flow_hw_rxq_flag_set(struct rte_eth_dev *dev, bool enable);
 int flow_dv_action_validate(struct rte_eth_dev *dev,
diff --git a/drivers/net/mlx5/mlx5_flow_geneve.c 
new file mode 100644
index 0000000000..f23fb31aa0
--- /dev/null
+++ b/drivers/net/mlx5/mlx5_flow_geneve.c
@@ -0,0 +1,627 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright (c) 2022 NVIDIA Corporation & Affiliates
+ */
+#include <rte_flow.h>
+#include <mlx5_malloc.h>
+#include <stdint.h>
+#include "generic/rte_byteorder.h"
+#include "mlx5.h"
+#include "mlx5_flow.h"
+#include "rte_pmd_mlx5.h"
+ * Single DW inside GENEVE TLV option.
+ */
+struct mlx5_geneve_tlv_resource {
+       struct mlx5_devx_obj *obj; /* FW object returned in parser creation. */
+       uint32_t modify_field; /* Modify field ID for this DW. */
+       uint8_t offset; /* Offset used in obj creation, from option start. */
+ * Single GENEVE TLV option context.
+ * May include some FW objects for different DWs in same option.
+ */
+struct mlx5_geneve_tlv_option {
+       uint8_t type;
+       uint16_t class;
+       uint8_t class_mode;
+       struct mlx5_hl_data match_data[MAX_GENEVE_OPTION_DATA_SIZE];
+       uint32_t match_data_size;
+       struct mlx5_hl_data hl_ok_bit;
+       struct mlx5_geneve_tlv_resource resources[MAX_GENEVE_OPTIONS_RESOURCES];
+       RTE_ATOMIC(uint32_t) refcnt;
+ * List of GENEVE TLV options.
+ */
+struct mlx5_geneve_tlv_options {
+       /* List of configured GENEVE TLV options. */
+       struct mlx5_geneve_tlv_option options[MAX_GENEVE_OPTIONS_RESOURCES];
+       /*
+        * Copy of list given in parser creation, use to compare with new
+        * configuration.
+        */
+       struct rte_pmd_mlx5_geneve_tlv spec[MAX_GENEVE_OPTIONS_RESOURCES];
+       rte_be32_t buffer[MAX_GENEVE_OPTION_TOTAL_DATA_SIZE];
+       uint8_t nb_options; /* Number entries in above lists. */
+       RTE_ATOMIC(uint32_t) refcnt;
+ * Create single GENEVE TLV option sample.
+ *
+ * @param ctx
+ *   Context returned from mlx5 open_device() glue function.
+ * @param attr
+ *   Pointer to GENEVE TLV option attributes structure.
+ * @param query_attr
+ *   Pointer to match sample info attributes structure.
+ * @param match_data
+ *   Pointer to header layout structure to update.
+ * @param resource
+ *   Pointer to single sample context to fill.
+ *
+ * @return
+ *   0 on success, a negative errno otherwise and rte_errno is set.
+ */
+static int
+mlx5_geneve_tlv_option_create_sample(void *ctx,
+                     struct mlx5_devx_geneve_tlv_option_attr *attr,
+                     struct mlx5_devx_match_sample_info_query_attr *query_attr,
+                     struct mlx5_hl_data *match_data,
+                     struct mlx5_geneve_tlv_resource *resource)
+       struct mlx5_devx_obj *obj;
+       int ret;
+       obj = mlx5_devx_cmd_create_geneve_tlv_option(ctx, attr);
+       if (obj == NULL)
+               return -rte_errno;
+       ret = mlx5_devx_cmd_query_geneve_tlv_option(ctx, obj, query_attr);
+       if (ret) {
+               claim_zero(mlx5_devx_cmd_destroy(obj));
+               return ret;
+       }
+       resource->obj = obj;
+       resource->offset = attr->sample_offset;
+       resource->modify_field = query_attr->modify_field_id;
+       match_data->dw_offset = query_attr->sample_dw_data;
+       match_data->dw_mask = 0xffffffff;
+       return 0;
+ * Destroy single GENEVE TLV option sample.
+ *
+ * @param resource
+ *   Pointer to single sample context to clean.
+ */
+static void
+mlx5_geneve_tlv_option_destroy_sample(struct mlx5_geneve_tlv_resource *resource)
+       claim_zero(mlx5_devx_cmd_destroy(resource->obj));
+       resource->obj = NULL;
+ * Create single GENEVE TLV option.
+ *
+ * @param ctx
+ *   Context returned from mlx5 open_device() glue function.
+ * @param spec
+ *   Pointer to user configuration.
+ * @param option
+ *   Pointer to single GENEVE TLV option to fill.
+ *
+ * @return
+ *   0 on success, a negative errno otherwise and rte_errno is set.
+ */
+static int
+mlx5_geneve_tlv_option_create(void *ctx, const struct rte_pmd_mlx5_geneve_tlv *spec,
+                             struct mlx5_geneve_tlv_option *option)
+       struct mlx5_devx_geneve_tlv_option_attr attr = {
+               .option_class = spec->option_class,
+               .option_type = spec->option_type,
+               .option_data_len = spec->option_len,
+               .option_class_ignore = spec->match_on_class_mode == 1 ? 0 : 1,
+               .offset_valid = 1,
+       };
+       struct mlx5_devx_match_sample_info_query_attr query_attr = {0};
+       struct mlx5_geneve_tlv_resource *resource;
+       uint8_t i, resource_id = 0;
+       int ret;
+       if (spec->match_on_class_mode == 2) {
+               /* Header is matchable, create sample for DW0. */
+               attr.sample_offset = 0;
+               resource = &option->resources[resource_id];
+               ret = mlx5_geneve_tlv_option_create_sample(ctx, &attr,
+                                                          &query_attr,
+                                                          &option->match_data[0],
+                                                          resource);
+               if (ret)
+                       return ret;
+               resource_id++;
+       }
+       /*
+        * Create FW object for each DW request by user.
+        * Starting from 1 since FW offset starts from header.
+        */
+       for (i = 1; i <= spec->sample_len; ++i) {
+               if (spec->match_data_mask[i - 1] == 0)
+                       continue;
+               /* offset of data + offset inside data = specific DW offset. */
+               attr.sample_offset = spec->offset + i;
+               resource = &option->resources[resource_id];
+               ret = mlx5_geneve_tlv_option_create_sample(ctx, &attr,
+                                                          &query_attr,
+                                                          &option->match_data[i],
+                                                          resource);
+               if (ret)
+                       goto error;
+               resource_id++;
+       }
+       /*
+        * Update the OK bit information according to last query.
+        * It should be same for each query under same option.
+        */
+       option->hl_ok_bit.dw_offset = query_attr.sample_dw_ok_bit;
+       option->hl_ok_bit.dw_mask = 1 << query_attr.sample_dw_ok_bit_offset;
+       option->match_data_size = spec->sample_len + 1;
+       option->type = spec->option_type;
+       option->class = spec->option_class;
+       option->class_mode = spec->match_on_class_mode;
+       rte_atomic_store_explicit(&option->refcnt, 0, rte_memory_order_relaxed);
+       return 0;
+error:
+       for (i = 0; i < resource_id; ++i) {
+               resource = &option->resources[i];
+               mlx5_geneve_tlv_option_destroy_sample(resource);
+       }
+       return ret;
+ * Destroy single GENEVE TLV option.
+ *
+ * @param option
+ *   Pointer to single GENEVE TLV option to destroy.
+ *
+ * @return
+ *   0 on success, a negative errno otherwise and rte_errno is set.
+ */
+static int
+mlx5_geneve_tlv_option_destroy(struct mlx5_geneve_tlv_option *option)
+       uint8_t i;
+       if (rte_atomic_load_explicit(&option->refcnt, rte_memory_order_relaxed)) {
+               DRV_LOG(ERR,
+                       "Option type %u class %u is still in used by %u 
+                       option->type, option->class, option->refcnt);
+               rte_errno = EBUSY;
+               return -rte_errno;
+       }
+       for (i = 0; option->resources[i].obj != NULL; ++i)
+               mlx5_geneve_tlv_option_destroy_sample(&option->resources[i]);
+       return 0;
+ * Copy the GENEVE TLV option user configuration for future comparing.
+ *
+ * @param dst
+ *   Pointer to internal user configuration copy.
+ * @param src
+ *   Pointer to user configuration.
+ * @param match_data_mask
+ *   Pointer to allocated data array.
+ */
+static void
+mlx5_geneve_tlv_option_copy(struct rte_pmd_mlx5_geneve_tlv *dst,
+                           const struct rte_pmd_mlx5_geneve_tlv *src,
+                           rte_be32_t *match_data_mask)
+       uint8_t i;
+       dst->option_type = src->option_type;
+       dst->option_class = src->option_class;
+       dst->option_len = src->option_len;
+       dst->offset = src->offset;
+       dst->match_on_class_mode = src->match_on_class_mode;
+       dst->sample_len = src->sample_len;
+       for (i = 0; i < dst->sample_len; ++i)
+               match_data_mask[i] = src->match_data_mask[i];
+       dst->match_data_mask = match_data_mask;
+ * Create list of GENEVE TLV options according to user configuration list.
+ *
+ * @param sh
+ *   Shared context the options are being created on.
+ * @param tlv_list
+ *   A list of GENEVE TLV options to create parser for them.
+ * @param nb_options
+ *   The number of options in TLV list.
+ *
+ * @return
+ *   A pointer to GENEVE TLV options parser structure on success,
+ *   NULL otherwise and rte_errno is set.
+ */
+static struct mlx5_geneve_tlv_options *
+mlx5_geneve_tlv_options_create(struct mlx5_dev_ctx_shared *sh,
+                              const struct rte_pmd_mlx5_geneve_tlv tlv_list[],
+                              uint8_t nb_options)
+       struct mlx5_geneve_tlv_options *options;
+       const struct rte_pmd_mlx5_geneve_tlv *spec;
+       rte_be32_t *data_mask;
+       uint8_t i, j;
+       int ret;
+       options = mlx5_malloc(MLX5_MEM_ZERO | MLX5_MEM_RTE,
+                             sizeof(struct mlx5_geneve_tlv_options),
+                             RTE_CACHE_LINE_SIZE, SOCKET_ID_ANY);
+       if (options == NULL) {
+               DRV_LOG(ERR,
+                       "Failed to allocate memory for GENEVE TLV options.");
+               rte_errno = ENOMEM;
+               return NULL;
+       }
+       for (i = 0; i < nb_options; ++i) {
+               spec = &tlv_list[i];
+               ret = mlx5_geneve_tlv_option_create(sh->cdev->ctx, spec,
+                                                   &options->options[i]);
+               if (ret < 0)
+                       goto error;
+               /* Copy the user list for comparing future configuration. */
+               data_mask = options->buffer + i * MAX_GENEVE_OPTION_DATA_SIZE;
+               mlx5_geneve_tlv_option_copy(&options->spec[i], spec, data_mask);
+       }
+       MLX5_ASSERT(sh->phdev->sh == NULL);
+       sh->phdev->sh = sh;
+       options->nb_options = nb_options;
+       options->refcnt = 1;
+       return options;
+error:
+       for (j = 0; j < i; ++j)
+               mlx5_geneve_tlv_option_destroy(&options->options[j]);
+       mlx5_free(options);
+       return NULL;
+ * Destroy GENEVE TLV options structure.
+ *
+ * @param options
+ *   Pointer to GENEVE TLV options structure to destroy.
+ * @param phdev
+ *   Pointer physical device options were created on.
+ *
+ * @return
+ *   0 on success, a negative errno otherwise and rte_errno is set.
+ */
+mlx5_geneve_tlv_options_destroy(struct mlx5_geneve_tlv_options *options,
+                               struct mlx5_physical_device *phdev)
+       uint8_t i;
+       int ret;
+       if (--options->refcnt)
+               return 0;
+       for (i = 0; i < options->nb_options; ++i) {
+               ret = mlx5_geneve_tlv_option_destroy(&options->options[i]);
+               if (ret < 0) {
+                       DRV_LOG(ERR,
+                               "Failed to destroy option %u, %u/%u is already 
+                               i, i, options->nb_options);
+                       return ret;
+               }
+       }
+       mlx5_free(options);
+       phdev->tlv_options = NULL;
+       phdev->sh = NULL;
+       return 0;
+ * Check if GENEVE TLV options are hosted on the current port
+ * and the port can be closed
+ *
+ * @param priv
+ *   Device private data.
+ *
+ * @return
+ *   0 on success, negative EBUSY otherwise and rte_errno is set.
+ */
+mlx5_geneve_tlv_options_check_busy(struct mlx5_priv *priv)
+       struct mlx5_physical_device *phdev = mlx5_get_locked_physical_device(priv);
+       struct mlx5_dev_ctx_shared *sh = priv->sh;
+       if (!phdev || phdev->sh != sh) {
+               mlx5_unlock_physical_device();
+               return 0;
+       }
+       if (!sh->phdev->tlv_options || sh->phdev->tlv_options->refcnt == 1) {
+               /* Mark port as being closed one */
+               sh->phdev->sh = NULL;
+               mlx5_unlock_physical_device();
+               return 0;
+       }
+       mlx5_unlock_physical_device();
+       rte_errno = EBUSY;
+       return -EBUSY;
+ * Validate GENEVE TLV option user request structure.
+ *
+ * @param attr
+ *   Pointer to HCA attribute structure.
+ * @param option
+ *   Pointer to user configuration.
+ *
+ * @return
+ *   0 on success, a negative errno otherwise and rte_errno is set.
+ */
+static int
+mlx5_geneve_tlv_option_validate(struct mlx5_hca_attr *attr,
+                               const struct rte_pmd_mlx5_geneve_tlv *option)
+       if (option->option_len > attr->max_geneve_tlv_option_data_len) {
+               DRV_LOG(ERR,
+                       "GENEVE TLV option length (%u) exceeds the limit (%u).",
+                       option->option_len,
+                       attr->max_geneve_tlv_option_data_len);
+               rte_errno = ENOTSUP;
+               return -rte_errno;
+       }
+       if (option->option_len < option->offset + option->sample_len) {
+               DRV_LOG(ERR,
+                       "GENEVE TLV option length is smaller than (offset + 
+               rte_errno = EINVAL;
+               return -rte_errno;
+       }
+       if (option->match_on_class_mode > 2) {
+               DRV_LOG(ERR,
+                       "GENEVE TLV option match_on_class_mode is invalid.");
+               rte_errno = EINVAL;
+               return -rte_errno;
+       }
+       return 0;
+ * Get the number of requested DWs in given GENEVE TLV option.
+ *
+ * @param option
+ *   Pointer to user configuration.
+ *
+ * @return
+ *   Number of requested DWs for given GENEVE TLV option.
+ */
+static uint8_t
+mlx5_geneve_tlv_option_get_nb_dws(const struct rte_pmd_mlx5_geneve_tlv *option)
+       uint8_t nb_dws = 0;
+       uint8_t i;
+       if (option->match_on_class_mode == 2)
+               nb_dws++;
+       for (i = 0; i < option->sample_len; ++i) {
+               if (option->match_data_mask[i] == 0xffffffff)
+                       nb_dws++;
+       }
+       return nb_dws;
+ * Compare GENEVE TLV option user request structure.
+ *
+ * @param option1
+ *   Pointer to first user configuration.
+ * @param option2
+ *   Pointer to second user configuration.
+ *
+ * @return
+ *   True if the options are equal, false otherwise.
+ */
+static bool
+mlx5_geneve_tlv_option_compare(const struct rte_pmd_mlx5_geneve_tlv *option1,
+                              const struct rte_pmd_mlx5_geneve_tlv *option2)
+       uint8_t i;
+       if (option1->option_type != option2->option_type ||
+           option1->option_class != option2->option_class ||
+           option1->option_len != option2->option_len ||
+           option1->offset != option2->offset ||
+           option1->match_on_class_mode != option2->match_on_class_mode ||
+           option1->sample_len != option2->sample_len)
+               return false;
+       for (i = 0; i < option1->sample_len; ++i) {
+               if (option1->match_data_mask[i] != option2->match_data_mask[i])
+                       return false;
+       }
+       return true;
+ * Check whether the given GENEVE TLV option list is equal to internal list.
+ * The lists are equal when they have same size and same options in the same
+ * order inside the list.
+ *
+ * @param options
+ *   Pointer to GENEVE TLV options structure.
+ * @param tlv_list
+ *   A list of GENEVE TLV options to compare.
+ * @param nb_options
+ *   The number of options in TLV list.
+ *
+ * @return
+ *   True if the lists are equal, false otherwise.
+ */
+static bool
+mlx5_is_same_geneve_tlv_options(const struct mlx5_geneve_tlv_options *options,
+                               const struct rte_pmd_mlx5_geneve_tlv tlv_list[],
+                               uint8_t nb_options)
+       const struct rte_pmd_mlx5_geneve_tlv *spec = options->spec;
+       uint8_t i;
+       if (options->nb_options != nb_options)
+               return false;
+       for (i = 0; i < nb_options; ++i) {
+               if (!mlx5_geneve_tlv_option_compare(&spec[i], &tlv_list[i]))
+                       return false;
+       }
+       return true;
+void *
+mlx5_geneve_tlv_parser_create(uint16_t port_id,
+                             const struct rte_pmd_mlx5_geneve_tlv tlv_list[],
+                             uint8_t nb_options)
+       struct mlx5_geneve_tlv_options *options = NULL;
+       struct mlx5_physical_device *phdev;
+       struct rte_eth_dev *dev;
+       struct mlx5_priv *priv;
+       struct mlx5_hca_attr *attr;
+       uint8_t total_dws = 0;
+       uint8_t i;
+       /*
+        * Validate the input before taking a lock and before any memory
+        * allocation.
+        */
+       if (!rte_eth_dev_is_valid_port(port_id)) {
+               DRV_LOG(ERR, "There is no Ethernet device for port %u.",
+                       port_id);
+               rte_errno = ENODEV;
+               return NULL;
+       }
+       dev = &rte_eth_devices[port_id];
+       priv = dev->data->dev_private;
+       if (priv->tlv_options) {
+               DRV_LOG(ERR, "Port %u already has GENEVE TLV parser.", port_id);
+               rte_errno = EEXIST;
+               return NULL;
+       }
+       if (priv->sh->config.dv_flow_en < 2) {
+               DRV_LOG(ERR,
+                       "GENEVE TLV parser is only supported for HW steering.");
+               rte_errno = ENOTSUP;
+               return NULL;
+       }
+       attr = &priv->sh->cdev->config.hca_attr;
+       MLX5_ASSERT(MAX_GENEVE_OPTIONS_RESOURCES <=
+                   attr->max_geneve_tlv_options);
+       if (!attr->geneve_tlv_option_offset || !attr->geneve_tlv_sample ||
+           !attr->query_match_sample_info || !attr->geneve_tlv_opt) {
+               DRV_LOG(ERR, "Not enough capabilities to support GENEVE TLV parser, maybe old FW version");
+               rte_errno = ENOTSUP;
+               return NULL;
+       }
+       if (nb_options > MAX_GENEVE_OPTIONS_RESOURCES) {
+               DRV_LOG(ERR,
+                       "GENEVE TLV option number (%u) exceeds the limit (%u).",
+                       nb_options, MAX_GENEVE_OPTIONS_RESOURCES);
+               rte_errno = EINVAL;
+               return NULL;
+       }
+       for (i = 0; i < nb_options; ++i) {
+               if (mlx5_geneve_tlv_option_validate(attr, &tlv_list[i]) < 0) {
+                       DRV_LOG(ERR, "GENEVE TLV option %u is invalid.", i);
+                       return NULL;
+               }
+               total_dws += mlx5_geneve_tlv_option_get_nb_dws(&tlv_list[i]);
+       }
+       if (total_dws > MAX_GENEVE_OPTIONS_RESOURCES) {
+               DRV_LOG(ERR,
+                       "Total requested DWs (%u) exceeds the limit (%u).",
+                       total_dws, MAX_GENEVE_OPTIONS_RESOURCES);
+               rte_errno = EINVAL;
+               return NULL;
+       }
+       /* Take lock for this physical device and manage the options. */
+       phdev = mlx5_get_locked_physical_device(priv);
+       options = priv->sh->phdev->tlv_options;
+       if (options) {
+               if (!mlx5_is_same_geneve_tlv_options(options, tlv_list,
+                                                    nb_options)) {
+                       mlx5_unlock_physical_device();
+                       DRV_LOG(ERR, "Another port has already prepared different GENEVE TLV parser.");
+                       rte_errno = EEXIST;
+                       return NULL;
+               }
+               if (phdev->sh == NULL) {
+                       mlx5_unlock_physical_device();
+                       DRV_LOG(ERR, "GENEVE TLV options are hosted on port being closed.");
+                       rte_errno = EBUSY;
+                       return NULL;
+               }
+               /* Use existing options. */
+               options->refcnt++;
+               goto exit;
+       }
+       /* Create GENEVE TLV options for this physical device. */
+       options = mlx5_geneve_tlv_options_create(priv->sh, tlv_list, nb_options);
+       if (!options) {
+               mlx5_unlock_physical_device();
+               return NULL;
+       }
+       phdev->tlv_options = options;
+exit:
+       mlx5_unlock_physical_device();
+       priv->tlv_options = options;
+       return priv;
+}
+
+int
+mlx5_geneve_tlv_parser_destroy(void *handle)
+{
+       struct mlx5_priv *priv = (struct mlx5_priv *)handle;
+       struct mlx5_physical_device *phdev;
+       int ret;
+       if (priv == NULL) {
+               DRV_LOG(ERR, "Handle input is invalid (NULL).");
+               rte_errno = EINVAL;
+               return -rte_errno;
+       }
+       if (priv->tlv_options == NULL) {
+               DRV_LOG(ERR, "This parser has already been released.");
+               rte_errno = ENOENT;
+               return -rte_errno;
+       }
+       /* Take lock for this physical device and manage the options. */
+       phdev = mlx5_get_locked_physical_device(priv);
+       /* Destroy the options */
+       ret = mlx5_geneve_tlv_options_destroy(phdev->tlv_options, phdev);
+       if (ret < 0) {
+               mlx5_unlock_physical_device();
+               return ret;
+       }
+       priv->tlv_options = NULL;
+       mlx5_unlock_physical_device();
+       return 0;
+#endif /* defined(HAVE_IBV_FLOW_DV_SUPPORT) || 
diff --git a/drivers/net/mlx5/rte_pmd_mlx5.h b/drivers/net/mlx5/rte_pmd_mlx5.h
index 654dd3cff3..004be0eea1 100644
--- a/drivers/net/mlx5/rte_pmd_mlx5.h
+++ b/drivers/net/mlx5/rte_pmd_mlx5.h
@@ -229,6 +229,108 @@ enum rte_pmd_mlx5_flow_engine_mode {
 int rte_pmd_mlx5_flow_engine_set_mode(enum rte_pmd_mlx5_flow_engine_mode mode, uint32_t flags);
+/**
+ * User configuration structure used to create a parser for a single GENEVE TLV
+ * option.
+ */
+struct rte_pmd_mlx5_geneve_tlv {
+       /**
+        * The class of the GENEVE TLV option.
+        * Relevant only when 'match_on_class_mode' is 1.
+        */
+       rte_be16_t option_class;
+       /**
+        * The type of the GENEVE TLV option.
+        * This field is the identifier of the option.
+        */
+       uint8_t option_type;
+       /**
+        * The length of the GENEVE TLV option data excluding the option header
+        * in DW granularity.
+        */
+       uint8_t option_len;
+       /**
+        * Indicates the role of the class field in this option:
+        *  0 - class is ignored.
+        *  1 - class is fixed (the class defines the option along with the
+        *      type).
+        *  2 - class matching per flow.
+        */
+       uint8_t match_on_class_mode;
+       /**
+        * The offset of the first sample in DW granularity.
+        * This offset is relative to the start of the option data.
+        * The 'match_data_mask' corresponds to the option data starting at
+        * this offset.
+        */
+       uint8_t offset;
+       /**
+        * The number of DWs to sample.
+        * This field describes the length of 'match_data_mask' in DW
+        * granularity.
+        */
+       uint8_t sample_len;
+       /**
+        * Array of DWs where each set bit marks a bit to be sampled.
+        * Each nonzero DW consumes one DW out of the maximum of 7 DWs in total.
+        */
+       rte_be32_t *match_data_mask;
+};
+
+/**
+ * Create a GENEVE TLV parser for the selected port.
+ * This function must be called before the first use of a GENEVE option.
+ *
+ * This API is port oriented, but the configuration is done once for all ports
+ * under the same physical device. Each port should call this API before using
+ * the GENEVE OPT item, and every port must pass the same options in the same
+ * order inside the list.
+ *
+ * Each physical device has 7 DWs for GENEVE TLV options. Each nonzero element
+ * in the 'match_data_mask' array consumes one DW, and choosing the matchable
+ * mode for the class consumes one additional DW.
+ * Calling this API for a second port under the same physical device doesn't
+ * consume more DWs; it reuses the same configuration.
+ *
+ * @param[in] port_id
+ *   The port identifier of the Ethernet device.
+ * @param[in] tlv_list
+ *   A list of GENEVE TLV options to create the parser for.
+ * @param[in] nb_options
+ *   The number of options in the TLV list.
+ *
+ * @return
+ *   A pointer to TLV handle on success, NULL otherwise and rte_errno is set.
+ *   Possible values for rte_errno:
+ *   - ENOMEM - not enough memory to create GENEVE TLV parser.
+ *   - EEXIST - this port already has a GENEVE TLV parser, or another port
+ *              under the same physical device has already prepared a
+ *              different parser.
+ *   - EINVAL - invalid GENEVE TLV requested.
+ *   - ENODEV - there is no Ethernet device for this port id.
+ *   - ENOTSUP - the port doesn't support GENEVE TLV parsing.
+ */
+void *
+rte_pmd_mlx5_create_geneve_tlv_parser(uint16_t port_id,
+                                     const struct rte_pmd_mlx5_geneve_tlv 
+                                     uint8_t nb_options);
+
+/**
+ * Destroy the GENEVE TLV parser for the selected port.
+ * This function must be called after the last use of a GENEVE option and
+ * before closing the port.
+ *
+ * @param[in] handle
+ *   Handle for the GENEVE TLV parser object to be destroyed.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ *   Possible values for rte_errno:
+ *   - EINVAL - invalid handle.
+ *   - ENOENT - there is no valid GENEVE TLV parser in this handle.
+ *   - EBUSY - one of the options is in use by a template table.
+ */
+int
+rte_pmd_mlx5_destroy_geneve_tlv_parser(void *handle);
 #ifdef __cplusplus
diff --git a/drivers/net/mlx5/ b/drivers/net/mlx5/
index 99f5ab754a..8fb0e07303 100644
--- a/drivers/net/mlx5/
+++ b/drivers/net/mlx5/
@@ -17,4 +17,7 @@ EXPERIMENTAL {
        # added in 23.03
+       # added in 24.03
+       rte_pmd_mlx5_create_geneve_tlv_parser;
+       rte_pmd_mlx5_destroy_geneve_tlv_parser;
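
For readers who want to check a configuration against the DW budget before calling the create API, the accounting rule in the header comment (one DW per nonzero 'match_data_mask' element, plus one when 'match_on_class_mode' is 2, at most 7 in total) can be sketched as below. This is a standalone illustration, not the driver code: 'struct tlv_opt', 'tlv_opt_dws', 'tlv_list_fits' and 'GENEVE_DW_BUDGET' are hypothetical names, and the struct mirrors only the fields of 'struct rte_pmd_mlx5_geneve_tlv' that affect the count.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical budget constant: the 7 DWs per physical device stated in the doc. */
#define GENEVE_DW_BUDGET 7u

/* Hypothetical stand-in holding only the fields that influence DW usage. */
struct tlv_opt {
	uint8_t match_on_class_mode;     /* 2 means class is matched per flow. */
	uint8_t sample_len;              /* Number of DWs in match_data_mask. */
	const uint32_t *match_data_mask; /* One mask DW per sampled data DW. */
};

/* Count the DWs a single option would consume under the documented rule. */
static unsigned int
tlv_opt_dws(const struct tlv_opt *opt)
{
	/* Matchable class mode costs one extra DW. */
	unsigned int dws = (opt->match_on_class_mode == 2) ? 1u : 0u;
	uint8_t i;

	for (i = 0; i < opt->sample_len; ++i) {
		/* Only nonzero mask DWs consume a DW. */
		if (opt->match_data_mask[i] != 0)
			dws++;
	}
	return dws;
}

/* Return 1 if the whole list fits in the budget, 0 otherwise. */
static int
tlv_list_fits(const struct tlv_opt *list, uint8_t nb, unsigned int *total)
{
	unsigned int sum = 0;
	uint8_t i;

	assert(list != NULL || nb == 0);
	for (i = 0; i < nb; ++i)
		sum += tlv_opt_dws(&list[i]);
	if (total != NULL)
		*total = sum;
	return sum <= GENEVE_DW_BUDGET;
}
```

An option with mask DWs {0xffffffff, 0, 0x0000ff00} and 'match_on_class_mode' 2 would count as 3 DWs under this rule: two nonzero masks plus one for the class. Since every port under the same physical device must pass the same list, the total is the same regardless of how many ports call the API.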
