Re: [dpdk-dev] [PATCH v8 0/5] Support power monitor in virtio/vhost PMD

2021-10-25 Thread Xia, Chenbo
Hi Ferruh,

> -----Original Message-----
> From: Li, Miao 
> Sent: Monday, October 25, 2021 10:47 PM
> To: dev@dpdk.org
> Cc: Xia, Chenbo ; maxime.coque...@redhat.com; Li, Miao
> 
> Subject: [PATCH v8 0/5] Support power monitor in virtio/vhost PMD
> 
> This patchset implements rte_power_monitor API in virtio and vhost PMD
> to reduce power consumption when no packets come in. This API can be
> called and tested in l3fwd-power after adding vhost and virtio support
> in l3fwd-power and ignoring the rx queue information check in
> queue_stopped().
> 
> v8:
> -rebase on latest repo
> -update the release note
> -modify some titles
> -update commit log
> -add the fixes and stable tags

The new version LGTM. Will you pick it up directly into next-net if you also
think it's good?

Thanks,
Chenbo

> --
> 2.25.1



[dpdk-dev] [Bug 834] eventdev/eth_rx: callback not invoked in vector timeout case

2021-10-25 Thread bugzilla
https://bugs.dpdk.org/show_bug.cgi?id=834

Bug ID: 834
   Summary: eventdev/eth_rx: callback not invoked in vector
timeout case
   Product: DPDK
   Version: 21.08
  Hardware: All
OS: All
Status: UNCONFIRMED
  Severity: major
  Priority: Normal
 Component: eventdev
  Assignee: dev@dpdk.org
  Reporter: s.v.naga.haris...@intel.com
  Target Milestone: ---

Hi all,

In the Rx adapter, the pending event vectors are checked for timeout in the
service function. In case of a timeout, the event is made ready by removing the
event vector from the pending vector list and updating the event buffer count
in the rxa_vector_expire function.

However, the callback function registered with the Rx adapter is not invoked
inside rxa_vector_expire for these timed-out vectors.

The expected behavior is that the callback function is invoked for all
packets successfully enqueued to the event buffer.
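
The expected behavior described above can be illustrated with a minimal,
self-contained sketch. It does not use the real Rx adapter structures or the
real callback signature: `struct adapter`, `struct pending_vector`,
`rx_cb_fn`, `count_pkts()` and `vector_expire()` are hypothetical names made
up for illustration. It only shows the idea that the timeout path should
notify the registered callback in the same way as the regular enqueue path.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback type: told how many packets were just buffered. */
typedef void (*rx_cb_fn)(uint16_t nb_pkts, void *arg);

struct pending_vector {
	uint16_t nb_elem;   /* packets aggregated so far */
	uint64_t deadline;  /* tick at which the vector times out */
	int queued;         /* non-zero once moved to the event buffer */
};

struct adapter {
	rx_cb_fn cb;        /* application callback, may be NULL */
	void *cb_arg;
	uint32_t buf_count; /* events waiting in the event buffer */
};

/* Example callback: accumulates the number of packets it was told about. */
static void
count_pkts(uint16_t nb_pkts, void *arg)
{
	*(uint32_t *)arg += nb_pkts;
}

/*
 * Expire a vector on timeout. Returns 1 if the vector was made ready,
 * 0 if it is empty, already queued, or not yet timed out.
 */
static int
vector_expire(struct adapter *a, struct pending_vector *v, uint64_t now)
{
	assert(a != NULL && v != NULL);

	if (v->queued || v->nb_elem == 0 || now < v->deadline)
		return 0;

	v->queued = 1;
	a->buf_count++;	/* one vector event becomes ready */

	/* This is the notification the report says is missing on timeout. */
	if (a->cb != NULL)
		a->cb(v->nb_elem, a->cb_arg);

	return 1;
}
```

With a vector holding 5 packets and a deadline of 100, calling
`vector_expire()` at tick 99 does nothing, while calling it at tick 100
queues the vector and reports 5 packets to the callback exactly once.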

-- 
You are receiving this mail because:
You are the assignee for the bug.

[dpdk-dev] [PATCH v2] net/ice: simplify the use of DCF device reset

2021-10-25 Thread dapengx . yu
From: Dapeng Yu 

After the DCF is reset by the PF, the DCF device un-initialization cannot
function normally since the resource is already invalidated. So it is
necessary to reset the DCF twice: the first reset re-initializes the DCF,
and only then can the second reset clean the filters successfully.

This patch checks the reset flag, which is set by the PF on DCF reset;
if the flag is set, the DCF reset is done twice automatically.

Fixes: 1a86f4dbdf42 ("net/ice: support DCF device reset")
Cc: sta...@dpdk.org

Signed-off-by: Dapeng Yu 
---
V2:
* Ignore the returned error of dev_uninit when DCF is reset by PF
---
 drivers/net/ice/ice_dcf_ethdev.c | 19 +++
 1 file changed, 19 insertions(+)

diff --git a/drivers/net/ice/ice_dcf_ethdev.c b/drivers/net/ice/ice_dcf_ethdev.c
index 7cb8066416..f51b7cbb8b 100644
--- a/drivers/net/ice/ice_dcf_ethdev.c
+++ b/drivers/net/ice/ice_dcf_ethdev.c
@@ -1031,8 +1031,27 @@ ice_dcf_tm_ops_get(struct rte_eth_dev *dev __rte_unused,
 static int
 ice_dcf_dev_reset(struct rte_eth_dev *dev)
 {
+   struct ice_dcf_adapter *ad = dev->data->dev_private;
+   struct iavf_hw *hw = &ad->real_hw.avf;
int ret;
 
+   if (!(IAVF_READ_REG(hw, IAVF_VF_ARQLEN1) &
+ IAVF_VF_ARQLEN1_ARQENABLE_MASK)) {
+   if (!ad->real_hw.resetting)
+   ad->real_hw.resetting = true;
+   PMD_DRV_LOG(ERR, "The DCF has been reset by PF");
+
+   /*
+* Do the extra dev uninit/init to make DCF get resource.
+* Then the next uninit/init can clean filters successfully.
+*/
+   ice_dcf_dev_uninit(dev);
+
+   ret = ice_dcf_dev_init(dev);
+   if (ret)
+   return ret;
+   }
+
ret = ice_dcf_dev_uninit(dev);
if (ret)
return ret;
-- 
2.27.0



Re: [dpdk-dev] [PATCH v1 08/14] vhost: improve IO vector logic

2021-10-25 Thread Hu, Jiayu
Hi Maxime,

> -----Original Message-----
> From: Maxime Coquelin 
> Sent: Monday, October 18, 2021 9:02 PM
> To: dev@dpdk.org; Xia, Chenbo ; Hu, Jiayu
> ; Wang, YuanX ; Ma,
> WenwuX ; Richardson, Bruce
> ; Mcnamara, John
> ; david.march...@redhat.com
> Cc: Maxime Coquelin 
> Subject: [PATCH v1 08/14] vhost: improve IO vector logic
> 
> IO vectors and their iterators arrays were part of the async metadata but not
> their indexes.
> 
> In order to make this more consistent, the patch adds the indexes to the
> async metadata. Doing that, we can avoid triggering DMA transfer within the
> loop, as IO vector index overflow is now prevented in the
> async_mbuf_to_desc() function.
> 
> Note that the previous detection mechanism was broken since the overflow
> already happened when detected, so OOB memory access would already
> have happened.
> 
> With these changes done, virtio_dev_rx_async_submit_split()
> and virtio_dev_rx_async_submit_packed() can be further simplified.
> 
> Signed-off-by: Maxime Coquelin 
> ---
>  lib/vhost/vhost.h  |   2 +
>  lib/vhost/virtio_net.c | 291 ++---
>  2 files changed, 131 insertions(+), 162 deletions(-)
> 
> diff --git a/lib/vhost/vhost.h b/lib/vhost/vhost.h index
> dae9a1ac2d..812d4c55a5 100644
> --- a/lib/vhost/vhost.h
> +++ b/lib/vhost/vhost.h
> @@ -134,6 +134,8 @@ struct vhost_async {
> 
>   struct rte_vhost_iov_iter iov_iter[VHOST_MAX_ASYNC_IT];
>   struct rte_vhost_iovec iovec[VHOST_MAX_ASYNC_VEC];
> + uint16_t iter_idx;
> + uint16_t iovec_idx;
> 
>   /* data transfer status */
>   struct async_inflight_info *pkts_info; diff --git 
> a/lib/vhost/virtio_net.c
> b/lib/vhost/virtio_net.c index ae7dded979..c80823a8de 100644
> --- a/lib/vhost/virtio_net.c
> +++ b/lib/vhost/virtio_net.c
> @@ -924,33 +924,86 @@ copy_mbuf_to_desc(struct virtio_net *dev, struct
> vhost_virtqueue *vq,
>   return error;
>  }
> 
> +static __rte_always_inline int
> +async_iter_initialize(struct vhost_async *async) {
> + struct rte_vhost_iov_iter *iter;
> +
> + if (unlikely(async->iovec_idx >= VHOST_MAX_ASYNC_VEC)) {
> + VHOST_LOG_DATA(ERR, "no more async iovec available\n");
> + return -1;
> + }
> +
> + iter = async->iov_iter + async->iter_idx;
> + iter->iov = async->iovec + async->iovec_idx;
> + iter->nr_segs = 0;
> +
> + return 0;
> +}
> +
> +static __rte_always_inline int
> +async_iter_add_iovec(struct vhost_async *async, void *src, void *dst,
> +size_t len) {
> + struct rte_vhost_iov_iter *iter;
> + struct rte_vhost_iovec *iovec;
> +
> + if (unlikely(async->iovec_idx >= VHOST_MAX_ASYNC_VEC)) {
> + VHOST_LOG_DATA(ERR, "no more async iovec available\n");
> + return -1;
> + }

For large packets, like 64KB in the iperf test, async_iter_add_iovec()
frequently reports the log above, as we run out of iovecs. I think it's better
to change the log level from ERR to DEBUG.

In addition, the iovec array is too small. For a burst of 32 and 64KB pkts,
it's easy to run out of iovecs, and when that happens we drop the pkts to be
enqueued, which hurts performance. Enlarging the array is one way to mitigate
the issue; another is to reallocate the iovec array once we run out of it.
What do you think?

Thanks,
Jiayu
> +
> + iter = async->iov_iter + async->iter_idx;
> + iovec = async->iovec + async->iovec_idx;
> +
> + iovec->src_addr = src;
> + iovec->dst_addr = dst;
> + iovec->len = len;
> +
> + iter->nr_segs++;
> + async->iovec_idx++;
> +
> + return 0;
> +}
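
Regarding the suggestion above to reallocate the iovec array once it runs out:
a minimal sketch of a growable pool, assuming plain `realloc()` rather than the
DPDK allocators. `struct iovec_entry`, `struct iovec_pool` and `pool_add()` are
hypothetical names and not part of the vhost library. One consequence worth
noting is that the quoted patch stores pointers into the array
(`iter->iov = async->iovec + async->iovec_idx`), and such pointers would become
stale after a reallocation, so a growable design would likely have to store
indexes instead.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct rte_vhost_iovec. */
struct iovec_entry {
	void *src;
	void *dst;
	size_t len;
};

struct iovec_pool {
	struct iovec_entry *vec;
	uint32_t used; /* entries handed out so far */
	uint32_t cap;  /* entries allocated */
};

/*
 * Append one entry, doubling the array when it is full.
 * Returns 0 on success, -1 if memory could not be obtained; in that case
 * the existing entries stay valid and the caller may drop or retry.
 */
static int
pool_add(struct iovec_pool *p, void *src, void *dst, size_t len)
{
	assert(p != NULL);

	if (p->used == p->cap) {
		uint32_t new_cap = p->cap ? p->cap * 2 : 64;
		struct iovec_entry *v;

		v = realloc(p->vec, (size_t)new_cap * sizeof(*v));
		if (v == NULL)
			return -1;
		p->vec = v;
		p->cap = new_cap;
	}

	p->vec[p->used].src = src;
	p->vec[p->used].dst = dst;
	p->vec[p->used].len = len;
	p->used++;

	return 0;
}
```

Growing on demand trades the dropped packets for an occasional allocation in
the data path, so whether it beats simply enlarging the static array would
need measurement.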


Re: [dpdk-dev] [v3] cryptodev: add telemetry endpoint for cryptodev capabilities

2021-10-25 Thread Akhil Goyal
> +#define CRYPTO_CAPS_SZ \
> + (RTE_ALIGN_CEIL(sizeof(struct rte_cryptodev_capabilities), \
> + sizeof(uint64_t)) /\
> +  sizeof(uint64_t))
> +
> +static int
> +crypto_caps_array(struct rte_tel_data *d,
> +   const struct rte_cryptodev_capabilities *capabilities)
> +{
> + const struct rte_cryptodev_capabilities *dev_caps;
> + union caps_u {
> + struct rte_cryptodev_capabilities dev_caps;
> + uint64_t val[CRYPTO_CAPS_SZ];
> + } caps;
> + unsigned int i = 0, j, n = 0;
> +
> + rte_tel_data_start_array(d, RTE_TEL_U64_VAL);
> +
> + while ((dev_caps = &capabilities[i++])->op !=
> +RTE_CRYPTO_OP_TYPE_UNDEFINED) {
> + memset(&caps, 0, sizeof(caps));
> + rte_memcpy(&caps.dev_caps, dev_caps,
> sizeof(capabilities[0]));
> + for (j = 0; j < CRYPTO_CAPS_SZ; j++)
> + rte_tel_data_add_array_u64(d, caps.val[j]);
> + ++n;
> + }
> +
> + return n;
> +}

We do not need two iterators, i and n; both serve the same purpose.
Also, the union is not required for caps.

static int
crypto_caps_array(struct rte_tel_data *d,
		  const struct rte_cryptodev_capabilities *capabilities)
{
	const struct rte_cryptodev_capabilities *dev_caps;
	uint64_t caps_val[CRYPTO_CAPS_SZ];
	unsigned int j, n = 0;

	rte_tel_data_start_array(d, RTE_TEL_U64_VAL);

	while ((dev_caps = &capabilities[n++])->op !=
			RTE_CRYPTO_OP_TYPE_UNDEFINED) {
		memset(&caps_val, 0, CRYPTO_CAPS_SZ * sizeof(uint64_t));
		rte_memcpy(caps_val, dev_caps, sizeof(capabilities[0]));
		for (j = 0; j < CRYPTO_CAPS_SZ; j++)
			rte_tel_data_add_array_u64(d, caps_val[j]);
	}

	return n;
}
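
One detail to watch when merging the two counters: with the post-increment
inside the loop condition, the counter is also incremented for the terminating
RTE_CRYPTO_OP_TYPE_UNDEFINED entry, so the returned value appears to be one
higher than the number of capabilities actually added. The standalone sketch
below shows the difference on a toy array. `struct cap`, `enum op_type` and
both function names are hypothetical stand-ins, not the cryptodev types.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct rte_cryptodev_capabilities. */
enum op_type { OP_UNDEFINED = 0, OP_SYMMETRIC, OP_ASYMMETRIC };

struct cap {
	enum op_type op;
};

/* Post-increment in the condition: the terminator is counted as well. */
static unsigned int
count_with_post_increment(const struct cap *caps)
{
	unsigned int n = 0;

	assert(caps != NULL);
	while (caps[n++].op != OP_UNDEFINED)
		; /* per-entry work would go here */

	return n;
}

/* Increment only after an entry has passed the check. */
static unsigned int
count_valid_entries(const struct cap *caps)
{
	unsigned int n = 0;

	assert(caps != NULL);
	for (; caps[n].op != OP_UNDEFINED; n++)
		; /* per-entry work would go here */

	return n;
}
```

For an array with two valid entries followed by the terminator, the first
variant returns 3 and the second returns 2. If the single-counter form is
kept, returning `n - 1` would give the same count as the original two-counter
version.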

> +
> +static int
> +cryptodev_handle_dev_caps(const char *cmd __rte_unused, const char
> *params,
> +   struct rte_tel_data *d)
> +{
> + struct rte_cryptodev_info dev_info;
> + struct rte_tel_data *crypto_caps;
> + int crypto_caps_n;
> + char *end_param;
> + int dev_id;
> +
> + if (!params || strlen(params) == 0 || !isdigit(*params))
> + return -EINVAL;
> +
> + dev_id = strtoul(params, &end_param, 0);
> + if (*end_param != '\0')
> + CDEV_LOG_ERR("Extra parameters passed to command,
> ignoring");
> + if (!rte_cryptodev_is_valid_dev(dev_id))
> + return -EINVAL;
> +
> + rte_tel_data_start_dict(d);
> + crypto_caps = rte_tel_data_alloc();
> + if (!crypto_caps)
> + return -ENOMEM;
> +
> + rte_cryptodev_info_get(dev_id, &dev_info);
> + crypto_caps_n = crypto_caps_array(crypto_caps,
> dev_info.capabilities);
> + rte_tel_data_add_dict_container(d, "crypto_caps", crypto_caps, 0);
> + rte_tel_data_add_dict_int(d, "crypto_caps_n", crypto_caps_n);
> +
> + return 0;
> +}
> +
>  RTE_INIT(cryptodev_init_telemetry)
>  {
>   rte_telemetry_register_cmd("/cryptodev/info",
> cryptodev_handle_dev_info,
> @@ -2517,4 +2579,7 @@ RTE_INIT(cryptodev_init_telemetry)
>   rte_telemetry_register_cmd("/cryptodev/stats",
>   cryptodev_handle_dev_stats,
>   "Returns the stats for a cryptodev. Parameters: int
> dev_id");
> + rte_telemetry_register_cmd("/cryptodev/caps",
> + cryptodev_handle_dev_caps,
> + "Returns the capabilities for a cryptodev. Parameters:
> int dev_id");
>  }
> --
> 2.25.1



Re: [dpdk-dev] [PATCH] net: fix pedantic build for L2TPv2 definitions

2021-10-25 Thread David Marchand
On Sun, Oct 24, 2021 at 3:12 PM Raslan Darawsheh  wrote:
> > Build is broken on RHEL7 following introduction of this new protocol.
> >
> > Fixes: 3a929df1f286 ("ethdev: support L2TPv2 and PPP procotol")
> >
> > Signed-off-by: David Marchand 
> Tested-by: Raslan Darawsheh 

Applied, thanks.


-- 
David Marchand



Re: [dpdk-dev] [RFC PATCH 0/1] Dataplane Workload Accelerator library

2021-10-25 Thread Mattias Rönnblom
On 2021-10-19 20:14, jer...@marvell.com wrote:
> From: Jerin Jacob 
>
>
> Dataplane Workload Accelerator library
> ======================================
>
> Definition of Dataplane Workload Accelerator
> --------------------------------------------
> Dataplane Workload Accelerator(DWA) typically contains a set of CPUs,
> Network controllers and programmable data acceleration engines for
> packet processing, cryptography, regex engines, baseband processing, etc.
> This allows DWA to offload  compute/packet processing/baseband/
> cryptography-related workload from the host CPU to save the cost and power.
> Also to enable scaling the workload by adding DWAs to the Host CPU as needed.
>
> Unlike other devices in DPDK, the DWA device is not fixed-function
> due to the fact that it has CPUs and programmable HW accelerators.


There are already several instances of DPDK devices with pure-software 
implementation. In this regard, a DPU/SmartNIC represents nothing new. 
What's new, it seems to me, is a much-increased need to 
configure/arrange the processing in complex manners, to avoid bouncing 
everything to the host CPU. Something like P4 or rte_flow-based hooks or 
some other kind of extension. The eventdev adapters solve the same 
problem (where on some systems packets go through the host CPU on their 
way to the event device, and others do not) - although on a *much* 
smaller scale.


"Not-fixed function" seems to call for more hot plug support in the 
device APIs. Such functionality could then be reused by anything that 
can be reconfigured dynamically (FPGAs, firmware-programmed 
accelerators, etc.), but which may not be able to serve as an RPC
endpoint, like a SmartNIC.


DWA could be some kind of DPDK-internal framework for managing certain
types of DPUs, but should it be exposed to the user application?


> This enables DWA personality/workload to be completely programmable.
> Typical examples of DWA offloads are Flow/Session management,
> Virtual switch, TLS offload, IPsec offload, l3fwd offload, etc.
> Motivation for the new library
> ------------------------------
> Even though a lot of semiconductor vendors offer different forms of DWA,
> such as DPU (often called Smart-NIC), GPU, IPU, XPU, etc.,
> due to the lack of standard APIs to "Define the workload" and
> "Communication between HOST and DWA", it is difficult for DPDK
> consumers to use them in a portable way across different DWA vendors
> and enable it in cloud environments.
>
>
> Contents of RFC
> ---------------
> This RFC attempts to define standard APIs for:
>
> 1) Definition of Profiles corresponding to well defined workloads, which
> includes a set of TLVs (Messages) as a request and response scheme to define
> the contract between host and DWA to offload a workload.
> (See lib/dwa/rte_dwa_profile_* header files)
> 2) Discovery of a DWA's capabilities (e.g. which specific workloads it can
> support) in a vendor independent fashion. (See rte_dwa_dev_disc_profiles())
> 3) Attaching a set of profiles to a DWA device (See rte_dwa_dev_attach())
> 4) A communication framework between Host and DWA (See rte_dwa_ctrl_op() for
> control plane and rte_dwa_port_host_* for user plane)
> 5) Virtualization of DWA hardware and firmware (Use standard DPDK device/bus
> model)
> 6) Enablement of administrative functions such as FW updates and
> resource partitioning in a DWA, i.e. items that are global in
> nature and applicable to all DWA devices under the DWA.
> (See rte_dwa_profile_admin.h)
>
> Also, this RFC defines the L3FWD profile to offload the L3FWD workload to DWA.
> This RFC defines an ethernet-style host port for Host to DWA communication.
> Different host port types may be required to cover the large spectrum of DWA
> types, as transports like PCIe DMA, Shared Memory, or Ethernet are
> fundamentally different, and optimal performance needs host port specific APIs.
>
> The framework does not force an abstraction of different transport interfaces
> into a single API. Instead, it decouples the TLVs from the transport interface
> and focuses on defining the TLVs, leaving vendors to specify the host ports
> specific to their DWA architecture.
>
>
> Roadmap
> -------
> 1) Address the comments for this RFC and enable the common code
> 2) SW drivers/infrastructure for `DWA` and `DWA device`
> as two separate DPDK processes over `memif` DPDK ethdev driver for
> L3FWD offload. This is to enable the framework without any special HW.
> 3) Example DWA device application for L3FWD profile.
> 4) Marvell DWA Device drivers.
> 5) Based on community interest new profile can be added in the future.
>
>
> DWA library framework
> ---------------------
>
> DWA components:
>
> [Diagram truncated in the archive: rte_dwa_port_host_*() carries user plane
> traffic as TLVs]

[dpdk-dev] [PATCH] net/ice: fix flow redirect failure

2021-10-25 Thread dapengx . yu
From: Dapeng Yu 

When the switch flow rules are redirected, if a rule is removed but not
added back successfully, the rule loses its metadata and cannot be added again.

This patch saves the flow rule's metadata, so that when the flow rule is added
again, the metadata can be used to make the addition succeed.

Fixes: 397b4b3c5095 ("net/ice: enable flow redirect on switch")
Cc: sta...@dpdk.org

Signed-off-by: Dapeng Yu 
---
 drivers/net/ice/ice_switch_filter.c | 161 ++--
 1 file changed, 128 insertions(+), 33 deletions(-)

diff --git a/drivers/net/ice/ice_switch_filter.c 
b/drivers/net/ice/ice_switch_filter.c
index 6b0c1bff1e..1c931787e3 100644
--- a/drivers/net/ice/ice_switch_filter.c
+++ b/drivers/net/ice/ice_switch_filter.c
@@ -180,6 +180,12 @@ struct sw_meta {
struct ice_adv_rule_info rule_info;
 };
 
+struct sw_rule_query_meta {
+   struct ice_rule_query_data *sw_query_data;
+   /* When redirect, if rule is removed but not added, save meta here */
+   struct sw_meta *sw_meta;
+};
+
 static struct ice_flow_parser ice_switch_dist_parser;
 static struct ice_flow_parser ice_switch_perm_parser;
 
@@ -359,7 +365,7 @@ ice_switch_create(struct ice_adapter *ad,
struct ice_pf *pf = &ad->pf;
struct ice_hw *hw = ICE_PF_TO_HW(pf);
struct ice_rule_query_data rule_added = {0};
-   struct ice_rule_query_data *filter_ptr;
+   struct sw_rule_query_meta *query_meta_ptr = NULL;
struct ice_adv_lkup_elem *list =
((struct sw_meta *)meta)->list;
uint16_t lkups_cnt =
@@ -381,18 +387,30 @@ ice_switch_create(struct ice_adapter *ad,
}
ret = ice_add_adv_rule(hw, list, lkups_cnt, rule_info, &rule_added);
if (!ret) {
-   filter_ptr = rte_zmalloc("ice_switch_filter",
-   sizeof(struct ice_rule_query_data), 0);
-   if (!filter_ptr) {
+   query_meta_ptr = rte_zmalloc("ice_switch_query_meta",
+   sizeof(struct sw_rule_query_meta), 0);
+   if (!query_meta_ptr) {
+   rte_flow_error_set(error, EINVAL,
+  RTE_FLOW_ERROR_TYPE_HANDLE, NULL,
+  "No memory for ice_switch_query_meta");
+   goto error;
+   }
+
+   query_meta_ptr->sw_query_data =
+   rte_zmalloc("ice_switch_query",
+   sizeof(struct ice_rule_query_data), 0);
+   if (!query_meta_ptr->sw_query_data) {
rte_flow_error_set(error, EINVAL,
   RTE_FLOW_ERROR_TYPE_HANDLE, NULL,
   "No memory for ice_switch_filter");
goto error;
}
-   flow->rule = filter_ptr;
-   rte_memcpy(filter_ptr,
-   &rule_added,
-   sizeof(struct ice_rule_query_data));
+
+   rte_memcpy(query_meta_ptr->sw_query_data,
+  &rule_added,
+  sizeof(struct ice_rule_query_data));
+
+   flow->rule = query_meta_ptr;
} else {
rte_flow_error_set(error, EINVAL,
RTE_FLOW_ERROR_TYPE_HANDLE, NULL,
@@ -407,10 +425,28 @@ ice_switch_create(struct ice_adapter *ad,
 error:
rte_free(list);
rte_free(meta);
+   rte_free(query_meta_ptr);
 
return -rte_errno;
 }
 
+static void
+ice_switch_filter_rule_free(struct rte_flow *flow)
+{
+   struct sw_rule_query_meta *query_meta_ptr =
+   (struct sw_rule_query_meta *)flow->rule;
+
+   if (query_meta_ptr) {
+   rte_free(query_meta_ptr->sw_query_data);
+
+   if (query_meta_ptr->sw_meta)
+   rte_free(query_meta_ptr->sw_meta->list);
+
+   rte_free(query_meta_ptr->sw_meta);
+   }
+   rte_free(query_meta_ptr);
+}
+
 static int
 ice_switch_destroy(struct ice_adapter *ad,
struct rte_flow *flow,
@@ -418,12 +454,10 @@ ice_switch_destroy(struct ice_adapter *ad,
 {
struct ice_hw *hw = &ad->hw;
int ret;
-   struct ice_rule_query_data *filter_ptr;
-
-   filter_ptr = (struct ice_rule_query_data *)
-   flow->rule;
+   struct sw_rule_query_meta *query_meta_ptr =
+   (struct sw_rule_query_meta *)flow->rule;
 
-   if (!filter_ptr) {
+   if (!query_meta_ptr || !query_meta_ptr->sw_query_data) {
rte_flow_error_set(error, EINVAL,
RTE_FLOW_ERROR_TYPE_HANDLE, NULL,
"no such flow"
@@ -431,7 +465,7 @@ ice_switch_destroy(struct ice_adapter *ad,
return -rte_errno;
}
 
-   ret = ice_rem_adv_rule_by_id(hw, filter_ptr);
+   ret = ice_rem_adv_rule_by_id(hw, query_meta_ptr->sw_query_data);
if (ret) {
rte_flow_error_set(error, EINVAL,
  

Re: [dpdk-dev] [PATCH v1] test: fix devargs test case memory leak

2021-10-25 Thread David Marchand
On Sat, Oct 23, 2021 at 2:40 PM David Marchand
 wrote:
> On Sat, Oct 23, 2021 at 2:18 PM Xueming Li  wrote:
> >
> > In layer argument test function, kvargs are parsed and checked without
> > free. This patch calls rte_kvargs_free() function to avoid memory leak.
> >
>
> Coverity issue: 373631
> > Fixes: a4975cd20dca ("test: add devargs test cases")
> >
> > Signed-off-by: Xueming Li 
> Reviewed-by: David Marchand 

Applied, thanks.


-- 
David Marchand



[dpdk-dev] [Bug 835] Previous patch introduced bugs in rte_ipv4_fragment_packet functions

2021-10-25 Thread bugzilla
https://bugs.dpdk.org/show_bug.cgi?id=835

Bug ID: 835
   Summary: Previous patch introduced bugs in
rte_ipv4_fragment_packet functions
   Product: DPDK
   Version: unspecified
  Hardware: All
OS: All
Status: UNCONFIRMED
  Severity: critical
  Priority: High
 Component: other
  Assignee: dev@dpdk.org
  Reporter: chcch...@163.com
CC: konstantin.anan...@intel.com
  Target Milestone: ---

Overview:
The patch 567473433b7e ("ip_frag: fix fragmenting IPv4 fragment") introduces
a bug and needs to be rolled back. This is because the offset and MF handling
added by the patch duplicates what the "flag_offset" variable already carries.

>> diff --git a/lib/ip_frag/rte_ipv4_fragmentation.c
>> b/lib/ip_frag/rte_ipv4_fragmentation.c
>> index 2e7739d..fead5a9 100644
>> --- a/lib/ip_frag/rte_ipv4_fragmentation.c
>> +++ b/lib/ip_frag/rte_ipv4_fragmentation.c
>> @@ -75,7 +75,7 @@ static inline void __free_fragments(struct rte_mbuf *mb[],
>> uint32_t num)
>>  uint32_t out_pkt_pos, in_seg_data_pos;
>>  uint32_t more_in_segs;
>>  uint16_t fragment_offset, flag_offset, frag_size, header_len;
>> -uint16_t frag_bytes_remaining;
>> +uint16_t frag_bytes_remaining, not_last_frag;
>> 
>>  /*
>>   * Formal parameter checking.
>> @@ -116,7 +116,9 @@ static inline void __free_fragments(struct rte_mbuf
>> *mb[], uint32_t num)
>>  in_seg = pkt_in;
>>  in_seg_data_pos = header_len;
>>  out_pkt_pos = 0;
>> -fragment_offset = 0;
>> +fragment_offset = (uint16_t)((flag_offset &
>> +RTE_IPV4_HDR_OFFSET_MASK) << RTE_IPV4_HDR_FO_SHIFT);
>> +not_last_frag = (uint16_t)(flag_offset & IPV4_HDR_MF_MASK);
>> 
>>  more_in_segs = 1;
>>  while (likely(more_in_segs)) {
>> @@ -186,7 +188,8 @@ static inline void __free_fragments(struct rte_mbuf
>> *mb[], uint32_t num)
>> 
>>  __fill_ipv4hdr_frag(out_hdr, in_hdr, header_len,
>>  (uint16_t)out_pkt->pkt_len,
>> -flag_offset, fragment_offset, more_in_segs);
>> +flag_offset, fragment_offset,
>> +not_last_frag || more_in_segs);
>> 
>>  fragment_offset = (uint16_t)(fragment_offset +
>>  out_pkt->pkt_len - header_len);

"flag_offset" or “fofs” contains all the information about fragment,so this
patch is no longer needed.

flag_offset = rte_cpu_to_be_16(in_hdr->fragment_offset);

static inline void __fill_ipv4hdr_frag(struct rte_ipv4_hdr *dst,
const struct rte_ipv4_hdr *src, uint16_t header_len,
uint16_t len, uint16_t fofs, uint16_t dofs, uint32_t mf)
{
rte_memcpy(dst, src, header_len);
fofs = (uint16_t)(fofs + (dofs >> RTE_IPV4_HDR_FO_SHIFT));
fofs = (uint16_t)(fofs | mf << RTE_IPV4_HDR_MF_SHIFT);
dst->fragment_offset = rte_cpu_to_be_16(fofs);
dst->total_length = rte_cpu_to_be_16(len);
dst->hdr_checksum = 0;
}

Steps to Reproduce:
1) Use a fragment that is not the last fragment to test the
rte_ipv4_fragment_packet function: MTU is 68, IP pkt_size is 104 (not including
the IP header length), MF is 1, fragment_offset is 13.
2) Run the test and check the fragmentation results: the number of fragments
and the fragment_offset values.

Actual Results:
fragment number: 3
fragment_offset: 0x201A 0x2020 0x2026

Expected Results:
fragment number: 3
fragment_offset: 0x200D 0x2013 0x2019
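
The two result sets above can be reproduced with a small standalone
calculation. This is only a sketch of the arithmetic, not the library code: it
assumes host-order values, a 20-byte IPv4 header without options (so each
fragment carries 68 - 20 = 48 payload bytes), and hypothetical names
(`fill_fofs()`, `fragment_offsets()`, the `double_count` switch). `fill_fofs()`
mirrors the arithmetic of `__fill_ipv4hdr_frag()` quoted above, and
`double_count` selects the extra offset and MF handling added by the patch in
question.

```c
#include <assert.h>
#include <stdint.h>

#define FO_SHIFT    3      /* fragment offsets are in 8-byte units */
#define MF_SHIFT    13     /* bit position of the "more fragments" flag */
#define OFFSET_MASK 0x1fff /* low 13 bits hold the fragment offset */

/* Same arithmetic as the quoted __fill_ipv4hdr_frag(), host byte order. */
static uint16_t
fill_fofs(uint16_t fofs, uint16_t dofs, uint32_t mf)
{
	fofs = (uint16_t)(fofs + (dofs >> FO_SHIFT));
	fofs = (uint16_t)(fofs | mf << MF_SHIFT);
	return fofs;
}

/*
 * Compute the fragment_offset field of every output fragment.
 * flag_offset:  flags + offset field of the input packet.
 * payload_len:  input payload size in bytes (header excluded).
 * frag_payload: payload bytes per output fragment, a multiple of 8.
 * double_count: non-zero to also seed the running byte offset and MF flag
 *               from flag_offset, as the patch under discussion does.
 * Returns the number of fragments written to out[].
 */
static int
fragment_offsets(uint16_t flag_offset, uint16_t payload_len,
		 uint16_t frag_payload, int double_count,
		 uint16_t *out, int max_out)
{
	uint16_t dofs = 0;
	uint32_t not_last = 0;
	int n = 0;

	assert(frag_payload > 0 && frag_payload % 8 == 0);

	if (double_count) {
		dofs = (uint16_t)((flag_offset & OFFSET_MASK) << FO_SHIFT);
		not_last = (flag_offset >> MF_SHIFT) & 1;
	}

	while (payload_len > 0 && n < max_out) {
		uint16_t chunk = payload_len < frag_payload ?
				 payload_len : frag_payload;
		uint32_t more = payload_len > chunk;

		out[n++] = fill_fofs(flag_offset, dofs, not_last || more);
		dofs = (uint16_t)(dofs + chunk);
		payload_len = (uint16_t)(payload_len - chunk);
	}

	return n;
}
```

With flag_offset = 0x200D (MF set, offset 13), a 104-byte payload and 48-byte
fragments, the variant without double counting yields 0x200D, 0x2013, 0x2019,
and the variant with double counting yields 0x201A, 0x2020, 0x2026, because
the input offset of 13 is added once through flag_offset and a second time
through the running byte offset.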

Build Date & Hardware:
Build 2021-10-25 on Linux OS 3.10.0

-- 
You are receiving this mail because:
You are the assignee for the bug.

[dpdk-dev] [PATCH] ip_frag: fix the buf of fragmenting IPv4 fragment

2021-10-25 Thread huichao cai
The patch ("ip_frag: fix fragmenting IPv4 fragment") introduces
a bug and needs to be rolled back. This is because the offset and MF handling
added by the patch duplicates what the "flag_offset" variable already carries.

Bugzilla ID: 835
Fixes: 567473433b7e ("ip_frag: fix fragmenting IPv4 fragment")
Cc: sta...@dpdk.org
Signed-off-by: huichao cai 
---
 lib/ip_frag/rte_ipv4_fragmentation.c | 9 +++--
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/lib/ip_frag/rte_ipv4_fragmentation.c 
b/lib/ip_frag/rte_ipv4_fragmentation.c
index fead5a9..2e7739d 100644
--- a/lib/ip_frag/rte_ipv4_fragmentation.c
+++ b/lib/ip_frag/rte_ipv4_fragmentation.c
@@ -75,7 +75,7 @@ static inline void __free_fragments(struct rte_mbuf *mb[], 
uint32_t num)
uint32_t out_pkt_pos, in_seg_data_pos;
uint32_t more_in_segs;
uint16_t fragment_offset, flag_offset, frag_size, header_len;
-   uint16_t frag_bytes_remaining, not_last_frag;
+   uint16_t frag_bytes_remaining;
 
/*
 * Formal parameter checking.
@@ -116,9 +116,7 @@ static inline void __free_fragments(struct rte_mbuf *mb[], 
uint32_t num)
in_seg = pkt_in;
in_seg_data_pos = header_len;
out_pkt_pos = 0;
-   fragment_offset = (uint16_t)((flag_offset &
-   RTE_IPV4_HDR_OFFSET_MASK) << RTE_IPV4_HDR_FO_SHIFT);
-   not_last_frag = (uint16_t)(flag_offset & IPV4_HDR_MF_MASK);
+   fragment_offset = 0;
 
more_in_segs = 1;
while (likely(more_in_segs)) {
@@ -188,8 +186,7 @@ static inline void __free_fragments(struct rte_mbuf *mb[], 
uint32_t num)
 
__fill_ipv4hdr_frag(out_hdr, in_hdr, header_len,
(uint16_t)out_pkt->pkt_len,
-   flag_offset, fragment_offset,
-   not_last_frag || more_in_segs);
+   flag_offset, fragment_offset, more_in_segs);
 
fragment_offset = (uint16_t)(fragment_offset +
out_pkt->pkt_len - header_len);
-- 
1.8.3.1



[dpdk-dev] [PATCH] test/ipfrag: add test content to the test unit

2021-10-25 Thread huichao cai
Add checks of the fragment_offset field (offset and MF)
to the test_ip_frag function. Add test data for a fragment
that is not the last fragment.

Signed-off-by: huichao cai 
---
 app/test/test_ipfrag.c | 95 +-
 1 file changed, 79 insertions(+), 16 deletions(-)

diff --git a/app/test/test_ipfrag.c b/app/test/test_ipfrag.c
index da8c212..1ced25a 100644
--- a/app/test/test_ipfrag.c
+++ b/app/test/test_ipfrag.c
@@ -89,12 +89,14 @@ static void ut_teardown(void)
 }
 
 static void
-v4_allocate_packet_of(struct rte_mbuf *b, int fill, size_t s, int df,
+v4_allocate_packet_of(struct rte_mbuf *b, int fill,
+ size_t s, int df, uint8_t mf, uint16_t off,
  uint8_t ttl, uint8_t proto, uint16_t pktid)
 {
/* Create a packet, 2k bytes long */
b->data_off = 0;
char *data = rte_pktmbuf_mtod(b, char *);
+   rte_be16_t fragment_offset = 0; /**< fragmentation offset */
 
memset(data, fill, sizeof(struct rte_ipv4_hdr) + s);
 
@@ -106,9 +108,17 @@ static void ut_teardown(void)
b->data_len = b->pkt_len;
hdr->total_length = rte_cpu_to_be_16(b->pkt_len);
hdr->packet_id = rte_cpu_to_be_16(pktid);
-   hdr->fragment_offset = 0;
+
if (df)
-   hdr->fragment_offset = rte_cpu_to_be_16(0x4000);
+   fragment_offset |= 0x4000;
+
+   if (mf)
+   fragment_offset |= 0x2000;
+
+   if (off)
+   fragment_offset |= off;
+
+   hdr->fragment_offset = rte_cpu_to_be_16(fragment_offset);
 
if (!ttl)
ttl = 64; /* default to 64 */
@@ -155,38 +165,73 @@ static void ut_teardown(void)
rte_pktmbuf_free(mb[i]);
 }
 
+static inline void
+test_get_offset(struct rte_mbuf **mb, int32_t len,
+   uint16_t *offset, int ipv)
+{
+   int32_t i;
+
+   for (i = 0; i < len; i++) {
+   if (ipv == 4) {
+   struct rte_ipv4_hdr *iph =
+   rte_pktmbuf_mtod(mb[i], struct rte_ipv4_hdr *);
+   offset[i] = iph->fragment_offset;
+   } else if (ipv == 6) {
+   struct ipv6_extension_fragment *fh =
+   rte_pktmbuf_mtod_offset(
+   mb[i],
+   struct ipv6_extension_fragment *,
+   sizeof(struct rte_ipv6_hdr));
+   offset[i] = fh->frag_data;
+   }
+   }
+}
+
 static int
 test_ip_frag(void)
 {
static const uint16_t RND_ID = UINT16_MAX;
int result = TEST_SUCCESS;
-   size_t i;
+   size_t i, j;
 
struct test_ip_frags {
int  ipv;
size_t   mtu_size;
size_t   pkt_size;
int  set_df;
+   uint8_t  set_mf;
+   uint16_t set_of;
uint8_t  ttl;
uint8_t  proto;
uint16_t pkt_id;
int  expected_frags;
+   uint16_t expected_fragment_offset[BURST];
} tests[] = {
-{4, 1280, 1400, 0, 64, IPPROTO_ICMP, RND_ID, 2},
-{4, 1280, 1400, 0, 64, IPPROTO_ICMP, 0,  2},
-{4,  600, 1400, 0, 64, IPPROTO_ICMP, RND_ID, 3},
-{4,4, 1400, 0, 64, IPPROTO_ICMP, RND_ID, -EINVAL},
-{4,  600, 1400, 1, 64, IPPROTO_ICMP, RND_ID, -ENOTSUP},
-{4,  600, 1400, 0,  0, IPPROTO_ICMP, RND_ID, 3},
-
-{6, 1280, 1400, 0, 64, IPPROTO_ICMP, RND_ID, 2},
-{6, 1300, 1400, 0, 64, IPPROTO_ICMP, RND_ID, 2},
-{6,4, 1400, 0, 64, IPPROTO_ICMP, RND_ID, -EINVAL},
-{6, 1300, 1400, 0,  0, IPPROTO_ICMP, RND_ID, 2},
+{4, 1280, 1400, 0, 0, 0, 64, IPPROTO_ICMP, RND_ID,   2,
+ {0x2000, 0x009D}},
+{4, 1280, 1400, 0, 0, 0, 64, IPPROTO_ICMP, 0,2,
+ {0x2000, 0x009D}},
+{4,  600, 1400, 0, 0, 0, 64, IPPROTO_ICMP, RND_ID,   3,
+ {0x2000, 0x2048, 0x0090}},
+{4, 4, 1400, 0, 0, 0, 64, IPPROTO_ICMP, RND_ID,-EINVAL},
+{4, 600, 1400, 1, 0, 0, 64, IPPROTO_ICMP, RND_ID, -ENOTSUP},
+{4, 600, 1400, 0, 0, 0, 0, IPPROTO_ICMP, RND_ID, 3,
+ {0x2000, 0x2048, 0x0090}},
+{4, 68, 104, 0, 1, 13, 0, IPPROTO_ICMP, RND_ID,  3,
+ {0x200D, 0x2013, 0x2019}},
+
+{6, 1280, 1400, 0, 0, 0, 64, IPPROTO_ICMP, RND_ID,   2,
+ {0x0001, 0x04D0}},
+{6, 1300, 1400, 0, 0, 0, 64, IPPROTO_ICMP, RND_ID,   2,
+ {0x0001, 0x04E0}},
+{6, 4, 1400, 0, 0, 0, 64, IPPROTO_ICMP, RND_ID,-EINVAL},
+{6, 1300, 1400, 0, 0, 0, 0, IPPROT

Re: [dpdk-dev] [RFC PATCH 0/1] Dataplane Workload Accelerator library

2021-10-25 Thread Jerin Jacob
On Mon, Oct 25, 2021 at 1:05 PM Mattias Rönnblom
 wrote:
>
> On 2021-10-19 20:14, jer...@marvell.com wrote:
> > From: Jerin Jacob 
> >
> >
> > Dataplane Workload Accelerator library
> > ======================================
> >
> > Definition of Dataplane Workload Accelerator
> > --------------------------------------------
> > Dataplane Workload Accelerator(DWA) typically contains a set of CPUs,
> > Network controllers and programmable data acceleration engines for
> > packet processing, cryptography, regex engines, baseband processing, etc.
> > This allows DWA to offload  compute/packet processing/baseband/
> > cryptography-related workload from the host CPU to save the cost and power.
> > Also to enable scaling the workload by adding DWAs to the Host CPU as 
> > needed.
> >
> > Unlike other devices in DPDK, the DWA device is not fixed-function
> > due to the fact that it has CPUs and programmable HW accelerators.
>
>
> There are already several instances of DPDK devices with pure-software
> implementation. In this regard, a DPU/SmartNIC represents nothing new.
> What's new, it seems to me, is a much-increased need to
> configure/arrange the processing in complex manners, to avoid bouncing
> everything to the host CPU.

Yes and no. It depends on the profile. The TLV type TYPE_USER_PLANE will
carry user plane traffic from/to the host. For example, when offloading the
ORAN split 7.2 baseband profile, transport blocks are sent to/from the host as
TYPE_USER_PLANE.

> Something like P4 or rte_flow-based hooks or
> some other kind of extension. The eventdev adapters solve the same
> problem (where on some systems packets go through the host CPU on their
> way to the event device, and others do not) - although on a *much*
> smaller scale.

Yes. Eventdev adapters are only for event device plumbing.


>
>
> "Not-fixed function" seems to call for more hot plug support in the
> device APIs. Such functionality could then be reused by anything that
> can be reconfigured dynamically (FPGAs, firmware-programmed
> accelerators, etc.),

Yes.

> but which may not be able to serve as a RPC
> endpoint, like a SmartNIC.

It can. That's the reason for choosing TLVs, so that
any higher level language can use TLVs, e.g. via https://github.com/ustropo/uttlv,
to communicate with the accelerator. TLVs follow a request and
response scheme like RPC, so it can be wrapped under the application if needed.
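
To make the TLV point concrete, here is a minimal sketch of encoding and
decoding a single type-length-value message in a byte buffer. It is not the
layout proposed in the RFC: `struct tlv_hdr`, `tlv_put()` and `tlv_get()` are
hypothetical, and the header is written in host byte order, whereas a real
host-to-device contract would have to fix the endianness and field widths.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical TLV header: 16-bit type followed by 16-bit value length. */
struct tlv_hdr {
	uint16_t type;
	uint16_t len;
};

/*
 * Encode one TLV into buf. Returns the number of bytes written,
 * or 0 if the buffer is too small (buf is left untouched).
 */
static size_t
tlv_put(uint8_t *buf, size_t cap, uint16_t type, const void *val, uint16_t len)
{
	struct tlv_hdr h = { type, len };

	assert(buf != NULL);
	if (cap < sizeof(h) + len)
		return 0;

	memcpy(buf, &h, sizeof(h));
	if (len != 0)
		memcpy(buf + sizeof(h), val, len);

	return sizeof(h) + len;
}

/*
 * Decode one TLV from buf. Returns a pointer to the value bytes and fills
 * type/len, or NULL if the buffer does not hold a complete TLV.
 */
static const uint8_t *
tlv_get(const uint8_t *buf, size_t avail, uint16_t *type, uint16_t *len)
{
	struct tlv_hdr h;

	assert(buf != NULL && type != NULL && len != NULL);
	if (avail < sizeof(h))
		return NULL;

	memcpy(&h, buf, sizeof(h));
	if (avail - sizeof(h) < h.len)
		return NULL;

	*type = h.type;
	*len = h.len;
	return buf + sizeof(h);
}
```

A request is then just one or more such records written by the host, and the
response is read back with the same decoder, which is what makes the scheme
easy to mirror from other languages.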

>
>
> DWA could be some kind of DPDK-internal framework for managing certain
> type of DPUs, but should it be exposed to the user application?


Could you clarify a bit more?
The offload is represented as a set of TLVs in a generic fashion. There
is no DPU specific bit in the offload representation. See the
rte_dwa_profile_l3fwd.h header file.

The TB is hosting a meeting on this. Date: Wednesday, October 27th, Time:
3pm UTC, https://meet.jit.si/DPDK
Feel free to join.


>
>
> > This enables DWA personality/workload to be completely programmable.
> > Typical examples of DWA offloads are Flow/Session management,
> > Virtual switch, TLS offload, IPsec offload, l3fwd offload, etc.
> > Motivation for the new library
> > --
> > Even though, a lot of semiconductor vendors offers a different form of DWA,
> > such as DPU(often called Smart-NIC), GPU, IPU, XPU, etc.,
> > Due to the lack of standard APIs to "Define the workload" and
> > "Communication between HOST and DWA", it is difficult for DPDK
> > consumers to use them in a portable way across different DWA vendors
> > and enable it in cloud environments.
> >
> >
> > Contents of RFC
> > --
> > This RFC attempts to define standard APIs for:
> >
> > 1) Definition of Profiles corresponding to well defined workloads, which 
> > includes
> > a set of TLV(Messages) as a request  and response scheme to define
> > the contract between host and DWA to offload a workload.
> > (See lib/dwa/rte_dwa_profile_* header files)
> > 2) Discovery of a DWAs capabilities (e.g. which specific workloads it can 
> > support)
> > in a vendor independent fashion. (See rte_dwa_dev_disc_profiles())
> > 3) Attaching a set of profiles to a DWA device(See rte_dwa_dev_attach())
> > 4) A communication framework between Host and DWA(See rte_dwa_ctrl_op() for
> > control plane and rte_dwa_port_host_* for user plane)
> > 5) Virtualization of DWA hardware and firmware (Use standard DPDK 
> > device/bus model)
> > 6) Enablement of administrative functions such as FW updates,
> > resource partitioning in a DWA like items in global in
> > nature that is applicable for all DWA device under the DWA.
> > (See rte_dwa_profile_admin.h)
> >
> > Also, this RFC define the L3FWD profile to offload L3FWD workload to DWA.
> > This RFC defines an ethernet-style host port for Host to DWA communication.
> > Different host port types may be required to cover the large spectrum of 
> > DWA types as
> > transports like PCIe DMA, Shared Memory, or Ethernet are fundamentally 
> > different,
> > and optimal performan

Re: [dpdk-dev] [EXT] [PATCH 1/2] ipsec: add transmit segmentation offload support

2021-10-25 Thread Akhil Goyal
> Subject: [EXT] [PATCH 1/2] ipsec: add transmit segmentation offload support
> 
Title should be ipsec: support TSO

> Add support for transmit segmentation offload to inline crypto processing
> mode. This offload is not supported by other offload modes, as at a
> minimum it requires inline crypto for IPsec to be supported on the
> network interface.
> 
> Signed-off-by: Declan Doherty 
> Signed-off-by: Radu Nicolau 
> Signed-off-by: Abhijit Sinha 
> Signed-off-by: Daniel Martin Buckley 
> Acked-by: Fan Zhang 
> ---
>  doc/guides/prog_guide/ipsec_lib.rst|   2 +
>  doc/guides/rel_notes/release_21_11.rst |   1 +
>  lib/ipsec/esp_outb.c   | 131 +++--
>  3 files changed, 106 insertions(+), 28 deletions(-)
> 
> diff --git a/doc/guides/prog_guide/ipsec_lib.rst
> b/doc/guides/prog_guide/ipsec_lib.rst
> index 1bafdc608c..2a262f8c51 100644
> --- a/doc/guides/prog_guide/ipsec_lib.rst
> +++ b/doc/guides/prog_guide/ipsec_lib.rst
> @@ -315,6 +315,8 @@ Supported features
> 
>  *  NAT-T / UDP encapsulated ESP.
> 
> +*  TSO support (only for inline crypto mode)
> +
The word support can be dropped here as it is a list of supported features.

>  *  algorithms: 3DES-CBC, AES-CBC, AES-CTR, AES-GCM, AES_CCM,
> CHACHA20_POLY1305,
> AES_GMAC, HMAC-SHA1, NULL.
> 
> diff --git a/doc/guides/rel_notes/release_21_11.rst
> b/doc/guides/rel_notes/release_21_11.rst
> index f6d2bc6f48..955b0bd68f 100644
> --- a/doc/guides/rel_notes/release_21_11.rst
> +++ b/doc/guides/rel_notes/release_21_11.rst
> @@ -201,6 +201,7 @@ New Features
>* Added support for NAT-T / UDP encapsulated ESP
>* Added support for SA telemetry.
>* Added support for setting a non default starting ESN value.
> +  * Added support TSO offload support; only supported for inline crypto
> mode.

The word support is added three times in a single sentence.
It can be rephrased as
* Added support for TSO in inline crypto mode.


> 
> 
>  Removed Items
> diff --git a/lib/ipsec/esp_outb.c b/lib/ipsec/esp_outb.c
> index b6c72f58a4..c9fba662f2 100644
> --- a/lib/ipsec/esp_outb.c
> +++ b/lib/ipsec/esp_outb.c
> @@ -18,7 +18,7 @@
> 
>  typedef int32_t (*esp_outb_prepare_t)(struct rte_ipsec_sa *sa, rte_be64_t
> sqc,
>   const uint64_t ivp[IPSEC_MAX_IV_QWORD], struct rte_mbuf *mb,
> - union sym_op_data *icv, uint8_t sqh_len);
> + union sym_op_data *icv, uint8_t sqh_len, uint8_t tso);
> 
>  /*
>   * helper function to fill crypto_sym op for cipher+auth algorithms.
> @@ -139,7 +139,7 @@ outb_cop_prepare(struct rte_crypto_op *cop,
>  static inline int32_t
>  outb_tun_pkt_prepare(struct rte_ipsec_sa *sa, rte_be64_t sqc,
>   const uint64_t ivp[IPSEC_MAX_IV_QWORD], struct rte_mbuf *mb,
> - union sym_op_data *icv, uint8_t sqh_len)
> + union sym_op_data *icv, uint8_t sqh_len, uint8_t tso)
>  {
>   uint32_t clen, hlen, l2len, pdlen, pdofs, plen, tlen;
>   struct rte_mbuf *ml;
> @@ -157,11 +157,20 @@ outb_tun_pkt_prepare(struct rte_ipsec_sa *sa,
> rte_be64_t sqc,
> 
>   /* number of bytes to encrypt */
>   clen = plen + sizeof(*espt);
> - clen = RTE_ALIGN_CEIL(clen, sa->pad_align);
> +
> + /* We don't need to pad/align packet when using TSO offload */
> + if (!tso)
> + clen = RTE_ALIGN_CEIL(clen, sa->pad_align);
> +
> 
>   /* pad length + esp tail */
>   pdlen = clen - plen;
> - tlen = pdlen + sa->icv_len + sqh_len;
> +
> + /* We don't append ICV length when using TSO offload */
> + if (!tso)
> + tlen = pdlen + sa->icv_len + sqh_len;
> + else
> + tlen = pdlen + sqh_len;
This is a data path function; two extra checks for tso are added in the same
function. They can be merged into a single branch:
	if (tso) {
		pdlen = clen - plen;
		tlen = pdlen + sqh_len;
	} else {
		clen = RTE_ALIGN_CEIL(clen, sa->pad_align);
		pdlen = clen - plen;
		tlen = pdlen + sa->icv_len + sqh_len;
	}
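
As a standalone illustration of the single-branch form above, the following sketch computes the three lengths outside of DPDK. The struct and function names (esp_tail_len, esp_tail_len_calc) and the local ALIGN_CEIL macro are hypothetical stand-ins, not part of lib/ipsec; the macro assumes a power-of-two alignment.

```c
#include <assert.h>
#include <stdint.h>

/* Round v up to a multiple of align; align must be a power of two.
 * Local stand-in for illustration, not the DPDK macro itself. */
#define ALIGN_CEIL(v, align) (((v) + ((align) - 1)) & ~((uint32_t)(align) - 1))

/* Hypothetical result holder for the three lengths discussed in the review. */
struct esp_tail_len {
	uint32_t clen;  /* number of bytes to encrypt */
	uint32_t pdlen; /* pad length + ESP tail */
	uint32_t tlen;  /* total bytes appended to the packet */
};

static struct esp_tail_len
esp_tail_len_calc(uint32_t plen, uint32_t espt_len, uint32_t pad_align,
		  uint32_t icv_len, uint32_t sqh_len, int tso)
{
	struct esp_tail_len r;

	assert(pad_align != 0 && (pad_align & (pad_align - 1)) == 0);

	r.clen = plen + espt_len;
	if (tso) {
		/* TSO: no padding/alignment and no ICV appended here. */
		r.pdlen = r.clen - plen;
		r.tlen = r.pdlen + sqh_len;
	} else {
		r.clen = ALIGN_CEIL(r.clen, pad_align);
		r.pdlen = r.clen - plen;
		r.tlen = r.pdlen + icv_len + sqh_len;
	}
	return r;
}
```

With a 100-byte payload, a 2-byte ESP tail, 16-byte alignment and a 16-byte ICV, the non-TSO path gives clen 112, pdlen 12, tlen 28, while the TSO path gives clen 102, pdlen 2, tlen 2.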


> 
>   /* do append and prepend */
>   ml = rte_pktmbuf_lastseg(mb);
> @@ -309,7 +318,7 @@ esp_outb_tun_prepare(const struct
> rte_ipsec_session *ss, struct rte_mbuf *mb[],
> 
>   /* try to update the packet itself */
>   rc = outb_tun_pkt_prepare(sa, sqc, iv, mb[i], &icv,
> -   sa->sqh_len);
> +   sa->sqh_len, 0);
>   /* success, setup crypto op */
>   if (rc >= 0) {
>   outb_pkt_xprepare(sa, sqc, &icv);
> @@ -336,7 +345,7 @@ esp_outb_tun_prepare(const struct
> rte_ipsec_session *ss, struct rte_mbuf *mb[],
>  static inline int32_t
>  outb_trs_pkt_prepare(struct rte_ipsec_sa *sa, rte_be64_t sqc,
>   const uint64_t ivp[IPSEC_MAX_IV_QWORD], struct rte_mbuf *mb,
> - union sym_op_data *icv, uint8_t sqh_len)
> + union sym_op_data *icv, uint8_t sqh_len, uint8_t tso)
>  {
>   uint8_t np;
>   uint32_t clen, hlen, pdlen, pdofs, plen, tlen, uhlen;
> @@ -358,11 +367,19 @@ outb_trs_pkt_prepare(struct rte_ipsec_sa *sa,
> rte_

[dpdk-dev] Re: [PATCH v4 1/5] eal: add new definitions for wait scheme

2021-10-25 Thread Feifei Wang


> -Original Message-
> From: Ananyev, Konstantin 
> Sent: Friday, October 22, 2021 12:25 AM
> To: Feifei Wang ; Ruifeng Wang
> 
> Cc: dev@dpdk.org; nd 
> Subject: RE: [PATCH v4 1/5] eal: add new definitions for wait scheme
> 
> > Introduce macros as generic interface for address monitoring.
> >
> > Signed-off-by: Feifei Wang 
> > Reviewed-by: Ruifeng Wang 
> > ---
> >  lib/eal/arm/include/rte_pause_64.h  | 126
> >  lib/eal/include/generic/rte_pause.h |  32 +++
> >  2 files changed, 104 insertions(+), 54 deletions(-)
> >
> > diff --git a/lib/eal/arm/include/rte_pause_64.h
> > b/lib/eal/arm/include/rte_pause_64.h
> > index e87d10b8cc..23954c2de2 100644
> > --- a/lib/eal/arm/include/rte_pause_64.h
> > +++ b/lib/eal/arm/include/rte_pause_64.h
> > @@ -31,20 +31,12 @@ static inline void rte_pause(void)
> >  /* Put processor into low power WFE(Wait For Event) state. */
> > #define __WFE() { asm volatile("wfe" : : : "memory"); }
> >
> > -static __rte_always_inline void
> > -rte_wait_until_equal_16(volatile uint16_t *addr, uint16_t expected,
> > -   int memorder)
> > -{
> > -   uint16_t value;
> > -
> > -   assert(memorder == __ATOMIC_ACQUIRE || memorder == __ATOMIC_RELAXED);
> > -
> > -   /*
> > -* Atomic exclusive load from addr, it returns the 16-bit content of
> > -* *addr while making it 'monitored',when it is written by someone
> > -* else, the 'monitored' state is cleared and a event is generated
> > -* implicitly to exit WFE.
> > -*/
> > +/*
> > + * Atomic exclusive load from addr, it returns the 16-bit content of
> > + * *addr while making it 'monitored', when it is written by someone
> > + * else, the 'monitored' state is cleared and a event is generated
> > + * implicitly to exit WFE.
> > + */
> >  #define __LOAD_EXC_16(src, dst, memorder) {   \
> > if (memorder == __ATOMIC_RELAXED) {   \
> > asm volatile("ldxrh %w[tmp], [%x[addr]]"  \
> > @@ -58,6 +50,52 @@ rte_wait_until_equal_16(volatile uint16_t *addr, uint16_t expected,
> > : "memory");  \
> > } }
> >
> > +/*
> > + * Atomic exclusive load from addr, it returns the 32-bit content of
> > + * *addr while making it 'monitored', when it is written by someone
> > + * else, the 'monitored' state is cleared and a event is generated
> > + * implicitly to exit WFE.
> > + */
> > +#define __LOAD_EXC_32(src, dst, memorder) {  \
> > +   if (memorder == __ATOMIC_RELAXED) {  \
> > +   asm volatile("ldxr %w[tmp], [%x[addr]]"  \
> > +   : [tmp] "=&r" (dst)  \
> > +   : [addr] "r"(src)\
> > +   : "memory"); \
> > +   } else { \
> > +   asm volatile("ldaxr %w[tmp], [%x[addr]]" \
> > +   : [tmp] "=&r" (dst)  \
> > +   : [addr] "r"(src)\
> > +   : "memory"); \
> > +   } }
> > +
> > +/*
> > + * Atomic exclusive load from addr, it returns the 64-bit content of
> > + * *addr while making it 'monitored', when it is written by someone
> > + * else, the 'monitored' state is cleared and a event is generated
> > + * implicitly to exit WFE.
> > + */
> > +#define __LOAD_EXC_64(src, dst, memorder) {  \
> > +   if (memorder == __ATOMIC_RELAXED) {  \
> > +   asm volatile("ldxr %x[tmp], [%x[addr]]"  \
> > +   : [tmp] "=&r" (dst)  \
> > +   : [addr] "r"(src)\
> > +   : "memory"); \
> > +   } else { \
> > +   asm volatile("ldaxr %x[tmp], [%x[addr]]" \
> > +   : [tmp] "=&r" (dst)  \
> > +   : [addr] "r"(src)\
> > +   : "memory"); \
> > +   } }
> > +
> > +static __rte_always_inline void
> > +rte_wait_until_equal_16(volatile uint16_t *addr, uint16_t expected,
> > +   int memorder)
> > +{
> > +   uint16_t value;
> > +
> > +   assert(memorder == __ATOMIC_ACQUIRE || memorder ==
> > +__ATOMIC_RELAXED);
> > +
> > __LOAD_EXC_16(addr, value, memorder)
> > if (value != expected) {
> > __SEVL()
> > @@ -66,7 +104,6 @@ rte_wait_until_equal_16(volatile uint16_t *addr, uint16_t expected,
> > __LOAD_EXC_16(addr, value, memorder)
> > } while (value != expected);
> > }
> > -#undef __LOAD_EXC_16
> >  }
> >
> >  static __rte_always_inline void
> > @@ -77,25 +114,6 @@ rte_wait_until_equal_32(volatile uint32_t *addr,
> > uint32_t expected,
> >
> > assert(memorder == __ATOMIC_ACQUIRE || memorder ==
> > __ATOMIC_RELAXED);
> >
> > -   /*
> > -* Atomic exclusive load from addr, it returns the 32-bit content of
> > -* *addr while making it 'monitored',when it is

[dpdk-dev] Re: [PATCH v4 1/5] eal: add new definitions for wait scheme

2021-10-25 Thread Feifei Wang
> -Original Message-
> From: Jerin Jacob 
> Sent: Friday, October 22, 2021 8:10 AM
> To: Feifei Wang 
> Cc: Ruifeng Wang ; Ananyev, Konstantin
> ; dpdk-dev ; nd
> 
> Subject: Re: [dpdk-dev] [PATCH v4 1/5] eal: add new definitions for wait scheme
> 
> On Wed, Oct 20, 2021 at 2:16 PM Feifei Wang 
> wrote:
> >
> > Introduce macros as generic interface for address monitoring.
> >
> > Signed-off-by: Feifei Wang 
> > Reviewed-by: Ruifeng Wang 
> > ---
> >  lib/eal/arm/include/rte_pause_64.h  | 126
> >  lib/eal/include/generic/rte_pause.h |  32 +++
> >  2 files changed, 104 insertions(+), 54 deletions(-)
> >
> > diff --git a/lib/eal/arm/include/rte_pause_64.h
> > b/lib/eal/arm/include/rte_pause_64.h
> > index e87d10b8cc..23954c2de2 100644
> > --- a/lib/eal/arm/include/rte_pause_64.h
> > +++ b/lib/eal/arm/include/rte_pause_64.h
> > @@ -31,20 +31,12 @@ static inline void rte_pause(void)
> >  /* Put processor into low power WFE(Wait For Event) state. */
> > #define __WFE() { asm volatile("wfe" : : : "memory"); }
> >
> > -static __rte_always_inline void
> > -rte_wait_until_equal_16(volatile uint16_t *addr, uint16_t expected,
> > -   int memorder)
> > -{
> > -   uint16_t value;
> > -
> > -   assert(memorder == __ATOMIC_ACQUIRE || memorder == __ATOMIC_RELAXED);
> > -
> > -   /*
> > -* Atomic exclusive load from addr, it returns the 16-bit content of
> > -* *addr while making it 'monitored',when it is written by someone
> > -* else, the 'monitored' state is cleared and a event is generated
> 
> a event -> an event in all the occurrence.
> 
> > -* implicitly to exit WFE.
> > -*/
> > +/*
> > + * Atomic exclusive load from addr, it returns the 16-bit content of
> > + * *addr while making it 'monitored', when it is written by someone
> > + * else, the 'monitored' state is cleared and a event is generated
> > + * implicitly to exit WFE.
> > + */
> >  #define __LOAD_EXC_16(src, dst, memorder) {   \
> > if (memorder == __ATOMIC_RELAXED) {   \
> > asm volatile("ldxrh %w[tmp], [%x[addr]]"  \
> > @@ -58,6 +50,52 @@ rte_wait_until_equal_16(volatile uint16_t *addr, uint16_t expected,
> > : "memory");  \
> > } }
> >
> > +/*
> > + * Atomic exclusive load from addr, it returns the 32-bit content of
> > + * *addr while making it 'monitored', when it is written by someone
> > + * else, the 'monitored' state is cleared and a event is generated
> > + * implicitly to exit WFE.
> > + */
> > +#define __LOAD_EXC_32(src, dst, memorder) {  \
> > +   if (memorder == __ATOMIC_RELAXED) {  \
> > +   asm volatile("ldxr %w[tmp], [%x[addr]]"  \
> > +   : [tmp] "=&r" (dst)  \
> > +   : [addr] "r"(src)\
> > +   : "memory"); \
> > +   } else { \
> > +   asm volatile("ldaxr %w[tmp], [%x[addr]]" \
> > +   : [tmp] "=&r" (dst)  \
> > +   : [addr] "r"(src)\
> > +   : "memory"); \
> > +   } }
> > +
> > +/*
> > + * Atomic exclusive load from addr, it returns the 64-bit content of
> > + * *addr while making it 'monitored', when it is written by someone
> > + * else, the 'monitored' state is cleared and a event is generated
> > + * implicitly to exit WFE.
> > + */
> > +#define __LOAD_EXC_64(src, dst, memorder) {  \
> > +   if (memorder == __ATOMIC_RELAXED) {  \
> > +   asm volatile("ldxr %x[tmp], [%x[addr]]"  \
> > +   : [tmp] "=&r" (dst)  \
> > +   : [addr] "r"(src)\
> > +   : "memory"); \
> > +   } else { \
> > +   asm volatile("ldaxr %x[tmp], [%x[addr]]" \
> > +   : [tmp] "=&r" (dst)  \
> > +   : [addr] "r"(src)\
> > +   : "memory"); \
> > +   } }
> > +
> > +static __rte_always_inline void
> > +rte_wait_until_equal_16(volatile uint16_t *addr, uint16_t expected,
> > +   int memorder)
> > +{
> > +   uint16_t value;
> > +
> > +   assert(memorder == __ATOMIC_ACQUIRE || memorder ==
> > + __ATOMIC_RELAXED);
> > +
> > __LOAD_EXC_16(addr, value, memorder)
> > if (value != expected) {
> > __SEVL()
> > @@ -66,7 +104,6 @@ rte_wait_until_equal_16(volatile uint16_t *addr, uint16_t expected,
> > __LOAD_EXC_16(addr, value, memorder)
> > } while (value != expected);
> > }
> > -#undef __LOAD_EXC_16
> >  }
> >
> >  static __rte_always_inline void
> > @@ -77,25 +114,6

Re: [dpdk-dev] [PATCH 1/2] ethdev: fix log level of Tx and Rx dummy functions

2021-10-25 Thread Thomas Monjalon
24/10/2021 13:48, Ananyev, Konstantin:
> 
> > > > > When stopping a port, the data path Tx and Rx burst functions
> > > should
> > > > > be stopped firstly conventionally. Then the dummy functions are
> > > used
> > > > > to replace the callback functions provided by the PMD.
> > > > >
> > > > > When the application stops a port without or before stopping the
> > > > > data path handling.
> > >
> > > If the application really does that, then it is a severe bug in the
> > > application, then needs to be fixed ASAP.
> > 
> > I agree, this should be some improper / wrong behavior in the application.
> > 
> > >
> > > > The dummy functions may be invoked heavily and a lot
> > > > > of logs in these dummy functions will result in a flood.
> > > >
> > > > Why does it happen? We should not use a stopped port.
> > > > Is it a problem of core synchronization?
> > > >
> > > > > Debug level log should be enough instead of the error level.
> > > >
> > > >
> > >
> > > Dummy function is supposed to be set only when device is not able to
> > > do RX/TX properly (not attached, or attached but not configured, or
> > > attached and configured, but not started).
> > > Obviously if app calls rx/tx_burst for such port it is a major issue,
> > > that should be flagged immediately.
> > > So I believe having ERR level here makes a perfect sense here.
> > 
> > I do not insist on this. Some notification to the application may be 
> > needed. While to my understanding, the log flood should be prevented,
> > or the logs may slow down the application, the IO, and would also have 
> > impact on other logs and some information may get lost (but that is
> > the users' decision).
> > Since the rx/tx burst are usually in the data path and invoked heavily, if 
> > the log is needed, how about print it only once? WDYT?
> > 
> 
> Correctly behaving app should never call these stub functions and should 
> never see these messages.
> If your app ended up inside this function, then there something really wrong 
> is going on,
> that can cause app crash, silent memory corruption, NIC HW hang, or many 
> other nasty things.
> The aim of this stubs mechanism:
> 1) minimize (but not completely avoid) risk of such damage to happen in case 
> of
> programming error within user app.
> 2) flag to the user that something very wrong is going on within his app.
> In such situation, possible slowdown of misbehaving program is out of my 
> concern.  

There is a concern about getting efficient log reports,
especially when looking at CI issues.





Re: [dpdk-dev] [PATCH v4 1/5] eal: add new definitions for wait scheme

2021-10-25 Thread Jerin Jacob
On Mon, Oct 25, 2021 at 3:01 PM Feifei Wang  wrote:
>
> > -Original Message-
> > From: Jerin Jacob 
> > Sent: Friday, October 22, 2021 8:10 AM
> > To: Feifei Wang 
> > Cc: Ruifeng Wang ; Ananyev, Konstantin
> > ; dpdk-dev ; nd
> > 
> > Subject: Re: [dpdk-dev] [PATCH v4 1/5] eal: add new definitions for wait scheme
> >
> > On Wed, Oct 20, 2021 at 2:16 PM Feifei Wang 
> > wrote:
> > >
> > > Introduce macros as generic interface for address monitoring.
> > >
> > > Signed-off-by: Feifei Wang 
> > > Reviewed-by: Ruifeng Wang 
> > > ---
> > > >  lib/eal/arm/include/rte_pause_64.h  | 126
> > > >  lib/eal/include/generic/rte_pause.h |  32 +++
> > >  2 files changed, 104 insertions(+), 54 deletions(-)
> > >
> > > diff --git a/lib/eal/arm/include/rte_pause_64.h
> > > b/lib/eal/arm/include/rte_pause_64.h
> > > index e87d10b8cc..23954c2de2 100644
> > > --- a/lib/eal/arm/include/rte_pause_64.h
> > > +++ b/lib/eal/arm/include/rte_pause_64.h
> > > @@ -31,20 +31,12 @@ static inline void rte_pause(void)
> > >  /* Put processor into low power WFE(Wait For Event) state. */
> > > #define __WFE() { asm volatile("wfe" : : : "memory"); }
> > >
> > > -static __rte_always_inline void
> > > -rte_wait_until_equal_16(volatile uint16_t *addr, uint16_t expected,
> > > -   int memorder)
> > > -{
> > > -   uint16_t value;
> > > -
> > > > -   assert(memorder == __ATOMIC_ACQUIRE || memorder == __ATOMIC_RELAXED);
> > > -
> > > -   /*
> > > > -* Atomic exclusive load from addr, it returns the 16-bit content of
> > > -* *addr while making it 'monitored',when it is written by someone
> > > -* else, the 'monitored' state is cleared and a event is generated
> >
> > a event -> an event in all the occurrence.
> >
> > > -* implicitly to exit WFE.
> > > -*/
> > > +/*
> > > + * Atomic exclusive load from addr, it returns the 16-bit content of
> > > + * *addr while making it 'monitored', when it is written by someone
> > > + * else, the 'monitored' state is cleared and a event is generated
> > > + * implicitly to exit WFE.
> > > + */
> > >  #define __LOAD_EXC_16(src, dst, memorder) {   \
> > > if (memorder == __ATOMIC_RELAXED) {   \
> > > > asm volatile("ldxrh %w[tmp], [%x[addr]]"  \
> > > > @@ -58,6 +50,52 @@ rte_wait_until_equal_16(volatile uint16_t *addr, uint16_t expected,
> > > : "memory");  \
> > > } }
> > >
> > > +/*
> > > + * Atomic exclusive load from addr, it returns the 32-bit content of
> > > + * *addr while making it 'monitored', when it is written by someone
> > > + * else, the 'monitored' state is cleared and a event is generated
> > > + * implicitly to exit WFE.
> > > + */
> > > +#define __LOAD_EXC_32(src, dst, memorder) {  \
> > > +   if (memorder == __ATOMIC_RELAXED) {  \
> > > +   asm volatile("ldxr %w[tmp], [%x[addr]]"  \
> > > +   : [tmp] "=&r" (dst)  \
> > > +   : [addr] "r"(src)\
> > > +   : "memory"); \
> > > +   } else { \
> > > +   asm volatile("ldaxr %w[tmp], [%x[addr]]" \
> > > +   : [tmp] "=&r" (dst)  \
> > > +   : [addr] "r"(src)\
> > > +   : "memory"); \
> > > +   } }
> > > +
> > > +/*
> > > + * Atomic exclusive load from addr, it returns the 64-bit content of
> > > + * *addr while making it 'monitored', when it is written by someone
> > > + * else, the 'monitored' state is cleared and a event is generated
> > > + * implicitly to exit WFE.
> > > + */
> > > +#define __LOAD_EXC_64(src, dst, memorder) {  \
> > > +   if (memorder == __ATOMIC_RELAXED) {  \
> > > +   asm volatile("ldxr %x[tmp], [%x[addr]]"  \
> > > +   : [tmp] "=&r" (dst)  \
> > > +   : [addr] "r"(src)\
> > > +   : "memory"); \
> > > +   } else { \
> > > +   asm volatile("ldaxr %x[tmp], [%x[addr]]" \
> > > +   : [tmp] "=&r" (dst)  \
> > > +   : [addr] "r"(src)\
> > > +   : "memory"); \
> > > +   } }
> > > +
> > > +static __rte_always_inline void
> > > +rte_wait_until_equal_16(volatile uint16_t *addr, uint16_t expected,
> > > +   int memorder)
> > > +{
> > > +   uint16_t value;
> > > +
> > > +   assert(memorder == __ATOMIC_ACQUIRE || memorder ==
> > > + __ATOMIC_RELAXED);
> > > +
> > > __LOAD_EXC_16(addr, value, memorder)
> > > if (value != expected) {
> > > __SEVL()
> > > @@ -66,7 +104,6 @@ rte_wa

Re: [dpdk-dev] [PATCH 1/2] ethdev: fix log level of Tx and Rx dummy functions

2021-10-25 Thread David Marchand
On Mon, Oct 25, 2021 at 11:43 AM Thomas Monjalon  wrote:
> > Correctly behaving app should never call these stub functions and should 
> > never see these messages.
> > If your app ended up inside this function, then there something really 
> > wrong is going on,
> > that can cause app crash, silent memory corruption, NIC HW hang, or many 
> > other nasty things.
> > The aim of this stubs mechanism:
> > 1) minimize (but not completely avoid) risk of such damage to happen in 
> > case of
> > programming error within user app.
> > 2) flag to the user that something very wrong is going on within his app.
> > In such situation, possible slowdown of misbehaving program is out of my 
> > concern.

If a correctly behaving app should never do this, why not put an assert()
or an rte_panic() there?
This way, users will definitely catch it.


>
> There is a concern about getting efficient log report,
> especially when looking at CI issues.

+1.
The current solution with logs is a real pain.


-- 
David Marchand



Re: [dpdk-dev] [PATCH] app/test: fix event timer adapter create unit test

2021-10-25 Thread Thomas Monjalon
22/10/2021 06:34, Shijith Thotton:
> >>
> >> Removed freeing of unallocated mempool in event timer adapter create unit
> >> test.
> >>
> >> Fixes: d1f3385d0076 ("test: add event timer adapter auto-test")
> >>
> >> Signed-off-by: Shijith Thotton 
> >Acked-by: Erik Gabriel Carrillo 
>  
> Thomas, Please merge this patch.

It is eventdev-related, so it should have been delegated to Jerin.




Re: [dpdk-dev] [PATCH] app/test: fix event timer adapter create unit test

2021-10-25 Thread Thomas Monjalon
09/09/2021 22:11, Carrillo, Erik G:
> > From: Shijith Thotton 
> > 
> > Removed freeing of unallocated mempool in event timer adapter create unit
> > test.
> > 
> > Fixes: d1f3385d0076 ("test: add event timer adapter auto-test")
> > 
> > Signed-off-by: Shijith Thotton 
> Acked-by: Erik Gabriel Carrillo 

Applied, thanks.




Re: [dpdk-dev] [PATCH v1 08/14] vhost: improve IO vector logic

2021-10-25 Thread Maxime Coquelin

Hi Jiayu,

On 10/25/21 09:22, Hu, Jiayu wrote:
> Hi Maxime,


> > -Original Message-
> > From: Maxime Coquelin 
> > Sent: Monday, October 18, 2021 9:02 PM
> > To: dev@dpdk.org; Xia, Chenbo ; Hu, Jiayu
> > ; Wang, YuanX ; Ma,
> > WenwuX ; Richardson, Bruce
> > ; Mcnamara, John
> > ; david.march...@redhat.com
> > Cc: Maxime Coquelin 
> > Subject: [PATCH v1 08/14] vhost: improve IO vector logic
> >
> > IO vectors and their iterators arrays were part of the async metadata but not
> > their indexes.
> >
> > In order to makes this more consistent, the patch adds the indexes to the
> > async metadata. Doing that, we can avoid triggering DMA transfer within the
> > loop as it IO vector index overflow is now prevented in the
> > async_mbuf_to_desc() function.
> >
> > Note that previous detection mechanism was broken since the overflow
> > already happened when detected, so OOB memory access would already
> > have happened.
> >
> > With this changes done, virtio_dev_rx_async_submit_split()
> > and virtio_dev_rx_async_submit_packed() can be further simplified.
> >
> > Signed-off-by: Maxime Coquelin 
> > ---
> >   lib/vhost/vhost.h  |   2 +
> >   lib/vhost/virtio_net.c | 291 ++---
> >   2 files changed, 131 insertions(+), 162 deletions(-)
> >
> > diff --git a/lib/vhost/vhost.h b/lib/vhost/vhost.h
> > index dae9a1ac2d..812d4c55a5 100644
> > --- a/lib/vhost/vhost.h
> > +++ b/lib/vhost/vhost.h
> > @@ -134,6 +134,8 @@ struct vhost_async {
> >
> > struct rte_vhost_iov_iter iov_iter[VHOST_MAX_ASYNC_IT];
> > struct rte_vhost_iovec iovec[VHOST_MAX_ASYNC_VEC];
> > +   uint16_t iter_idx;
> > +   uint16_t iovec_idx;
> >
> > /* data transfer status */
> > struct async_inflight_info *pkts_info;
> > diff --git a/lib/vhost/virtio_net.c b/lib/vhost/virtio_net.c
> > index ae7dded979..c80823a8de 100644
> > --- a/lib/vhost/virtio_net.c
> > +++ b/lib/vhost/virtio_net.c
> > @@ -924,33 +924,86 @@ copy_mbuf_to_desc(struct virtio_net *dev, struct vhost_virtqueue *vq,
> > return error;
> >   }
> >
> > +static __rte_always_inline int
> > +async_iter_initialize(struct vhost_async *async) {
> > +   struct rte_vhost_iov_iter *iter;
> > +
> > +   if (unlikely(async->iovec_idx >= VHOST_MAX_ASYNC_VEC)) {
> > +   VHOST_LOG_DATA(ERR, "no more async iovec available\n");
> > +   return -1;
> > +   }
> > +
> > +   iter = async->iov_iter + async->iter_idx;
> > +   iter->iov = async->iovec + async->iovec_idx;
> > +   iter->nr_segs = 0;
> > +
> > +   return 0;
> > +}
> > +
> > +static __rte_always_inline int
> > +async_iter_add_iovec(struct vhost_async *async, void *src, void *dst,
> > +size_t len) {
> > +   struct rte_vhost_iov_iter *iter;
> > +   struct rte_vhost_iovec *iovec;
> > +
> > +   if (unlikely(async->iovec_idx >= VHOST_MAX_ASYNC_VEC)) {
> > +   VHOST_LOG_DATA(ERR, "no more async iovec available\n");
> > +   return -1;
> > +   }

> For large packets, like 64KB in iperf test, async_iter_add_iovec() frequently
> reports the log above, as we run out of iovecs. I think it's better to change
> the log from ERR to DEBUG.


I think it is better to keep it as an error; we want to see it if it
happens, without requiring the user to enable debug logs.

But maybe we can print it only once, so as not to flood the logs.
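
A minimal sketch of the print-once idea, kept outside of DPDK so it builds on its own. LOG_ERR_ONCE, MAX_ASYNC_VEC, struct async_state and async_reserve_iovec() are hypothetical names for illustration; the real code would keep VHOST_LOG_DATA and VHOST_MAX_ASYNC_VEC. The C11 atomic_flag keeps the "already logged" test safe if several lcores hit the limit at the same time.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdio.h>

#define MAX_ASYNC_VEC 4 /* hypothetical, deliberately tiny limit */

/* Counts emitted error lines; for demonstration only, not thread-safe. */
static unsigned int err_lines_emitted;

/* Print the message the first time this call site is reached, then stay
 * silent. Each expansion gets its own flag, so the limit is per call site. */
#define LOG_ERR_ONCE(...) do {                                  \
	static atomic_flag logged_ = ATOMIC_FLAG_INIT;          \
	if (!atomic_flag_test_and_set(&logged_)) {              \
		fprintf(stderr, __VA_ARGS__);                   \
		err_lines_emitted++;                            \
	}                                                       \
} while (0)

struct async_state {
	unsigned int iovec_idx; /* next free iovec slot */
};

/* Reserve one iovec slot; returns 0 on success, -1 when none is left. */
static int
async_reserve_iovec(struct async_state *async)
{
	assert(async != NULL);
	if (async->iovec_idx >= MAX_ASYNC_VEC) {
		LOG_ERR_ONCE("no more async iovec available\n");
		return -1;
	}
	async->iovec_idx++;
	return 0;
}
```

The error stays at ERR level and remains visible without enabling debug, but a burst of large packets produces one line instead of one per failed reservation. The trade-off is that later occurrences are silent, so a counter exposed through stats may still be useful.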


> In addition, the size of iovec is too small. For burst 32 and 64KB pkts, it's
> easy to run out of iovecs and we will drop the pkts to enqueue if it happens,
> which hurts performance. Enlarging the array is a choice to mitigate the
> issue, but another solution is to reallocate iovec once we run out of it. How do
> you think?


I would prefer we enlarge the array; reallocating the array when the
issue happens sounds like over-engineering to me.

Any idea what size it should be based on your experiments?

Thanks,
Maxime


> Thanks,
> Jiayu
>
> > +
> > +   iter = async->iov_iter + async->iter_idx;
> > +   iovec = async->iovec + async->iovec_idx;
> > +
> > +   iovec->src_addr = src;
> > +   iovec->dst_addr = dst;
> > +   iovec->len = len;
> > +
> > +   iter->nr_segs++;
> > +   async->iovec_idx++;
> > +
> > +   return 0;
> > +}





[dpdk-dev] [Bug 836] DPDK doesn't build under meson 0.60

2021-10-25 Thread bugzilla
https://bugs.dpdk.org/show_bug.cgi?id=836

Bug ID: 836
   Summary: DPDK doesn't build under meson 0.60
   Product: DPDK
   Version: 21.08
  Hardware: x86
OS: Linux
Status: UNCONFIRMED
  Severity: major
  Priority: Normal
 Component: meson
  Assignee: dev@dpdk.org
  Reporter: michallinuxst...@gmail.com
  Target Milestone: ---

Due to
https://github.com/mesonbuild/meson/commit/0a3a9fa0c3ebf45c94d9009a59cead571cbecf7b
gen-pmdinfo-cfile.py fails to run `ar x` against the .a files which were
created as thin archives. This happens for all the .a files for which meson
generates `LINK_ARGS = csrDT` (note the T option) inside the build.ninja.
Here's a failing trace:

ninja: Entering directory `/git_repos/spdk_repo/spdk/dpdk/build-tmp'
[243/272] Generating drivers/rte_bus_vdev.pmd.c with a custom command
FAILED: drivers/rte_bus_vdev.pmd.c
/usr/bin/python3 ../buildtools/gen-pmdinfo-cfile.py
/git_repos/spdk_repo/spdk/dpdk/build-tmp/buildtools ar
/git_repos/spdk_repo/spdk/dpdk/build-tmp/drivers/libtmp_rte_bus_vdev.a
drivers/rte_bus_vdev.pmd.c /usr/bin/python3 ../buildtools/pmdinfogen.py elf
ar: `x' cannot be used on thin archives.
Traceback (most recent call last):
  File "/git_repos/spdk_repo/spdk/dpdk/build-tmp/../buildtools/gen-pmdinfo-cfile.py", line 28, in <module>
    run_ar("x")
  File "/git_repos/spdk_repo/spdk/dpdk/build-tmp/../buildtools/gen-pmdinfo-cfile.py", line 23, in <lambda>
    run_ar = lambda command: subprocess.run(
  File "/usr/lib64/python3.9/subprocess.py", line 528, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ar', 'x', '/git_repos/spdk_repo/spdk/dpdk/build-tmp/drivers/libtmp_rte_bus_vdev.a']' returned non-zero exit status 1.


If I comment out the `run_ar("x")` call, the build succeeds (my understanding is
that if the .a is indeed a thin archive, it consists only of references to .c.o
files which are already available, so we are not missing anything). Skipping the
thin .a files also seems to work. That said, my dpdk-fu regarding its build
components is not strong enough, so I am not really sure what the right fix
would be here.

Reverting to meson < 0.60 (e.g. 0.59.2) also does the trick, but any hints,
suggestions as to the right fix would be appreciated.
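
For reference, a thin archive can be told apart from a regular one by its first 8 bytes: GNU ar writes "!<thin>\n" instead of "!<arch>\n". The sketch below shows that check in C; the function names are hypothetical, and the actual helper (gen-pmdinfo-cfile.py) is a Python script, so the same logic would have to be ported there. It is only meant to illustrate the "skip thin archives" option, not to propose the final fix.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define AR_MAGIC_LEN 8

/* Returns 1 if the header bytes are the GNU thin-archive magic, else 0. */
static int
ar_magic_is_thin(const unsigned char *hdr, size_t n)
{
	return n >= AR_MAGIC_LEN && memcmp(hdr, "!<thin>\n", AR_MAGIC_LEN) == 0;
}

/* Returns 1 for a thin archive, 0 for anything else, -1 if the file
 * cannot be opened. */
static int
ar_file_is_thin(const char *path)
{
	unsigned char hdr[AR_MAGIC_LEN];
	size_t n;
	FILE *f;

	assert(path != NULL);
	f = fopen(path, "rb");
	if (f == NULL)
		return -1;
	n = fread(hdr, 1, sizeof(hdr), f);
	fclose(f);
	return ar_magic_is_thin(hdr, n);
}
```

A caller would run `ar x` only when ar_file_is_thin() returns 0 and otherwise use the member object files in place.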

-- 
You are receiving this mail because:
You are the assignee for the bug.

Re: [dpdk-dev] [PATCH v6 4/9] alarm: remove direct access to interrupt handle

2021-10-25 Thread Dmitry Kozlyuk
2021-10-24 22:04 (UTC+0200), David Marchand:
> From: Harman Kalra 
> 
> Removing direct access to interrupt handle structure fields,
> rather use respective get set APIs for the same.
> Making changes to all the libraries access the interrupt handle fields.
> 
> Implementing alarm cleanup routine, where the memory allocated
> for interrupt instance can be freed.
> 
> Signed-off-by: Harman Kalra 
> Signed-off-by: David Marchand 
> ---
> Changes since v5:
> - split from patch4,
> - merged patch6,
> - renamed rte_eal_alarm_fini as rte_eal_alarm_cleanup,
> 
> ---
[...]
> diff --git a/lib/eal/freebsd/eal_alarm.c b/lib/eal/freebsd/eal_alarm.c
> index c38b2e04f8..1a8fcf24c5 100644
> --- a/lib/eal/freebsd/eal_alarm.c
> +++ b/lib/eal/freebsd/eal_alarm.c
> @@ -32,7 +32,7 @@
>  
>  struct alarm_entry {
>   LIST_ENTRY(alarm_entry) next;
> - struct rte_intr_handle handle;
> + struct rte_intr_handle *handle;

This field is never used and can be just removed.

>   struct timespec time;
>   rte_eal_alarm_callback cb_fn;
>   void *cb_arg;
[...]


Re: [dpdk-dev] [PATCH v6 9/9] interrupts: extend event list

2021-10-25 Thread Dmitry Kozlyuk
Hi David,

With some nits below,
Acked-by: Dmitry Kozlyuk 

2021-10-24 22:04 (UTC+0200), David Marchand:
> From: Harman Kalra 
> 
> Dynamically allocating the efds and elist array os intr_handle

Typo: "os" -> "of"

> structure, based on size provided by user. Eg size can be
> MSIX interrupts supported by a PCI device.
> 
> Signed-off-by: Harman Kalra 
> Signed-off-by: David Marchand 
> ---
> Changes since v5:
> - split from patch5,
> 
> ---
[...]
> diff --git a/lib/eal/common/eal_common_interrupts.c 
> b/lib/eal/common/eal_common_interrupts.c
> index 3285c4335f..7feb9da8fa 100644
> --- a/lib/eal/common/eal_common_interrupts.c
> +++ b/lib/eal/common/eal_common_interrupts.c
[...]
>  int rte_intr_fd_set(struct rte_intr_handle *intr_handle, int fd)
> @@ -239,6 +330,12 @@ int rte_intr_efds_index_get(const struct rte_intr_handle 
> *intr_handle,
>  {
>   CHECK_VALID_INTR_HANDLE(intr_handle);
>  
> + if (intr_handle->efds == NULL) {
> + RTE_LOG(ERR, EAL, "Event fd list not allocated\n");
> + rte_errno = EFAULT;
> + goto fail;
> + }
> +

Here and below:
The check for `nb_intr` will already catch not allocated `efds`,
because `nb_intr` is necessarily 0 in this case.
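
A minimal sketch of why the extra check is redundant. It uses a hypothetical `toy_intr_handle`, not the real `struct rte_intr_handle`, and assumes that `nb_intr` only becomes non-zero after the array has been allocated:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in, NOT the real struct rte_intr_handle. */
struct toy_intr_handle {
	int nb_intr; /* number of entries in efds; 0 until allocated */
	int *efds;   /* NULL until allocated */
};

/* Assumption: nb_intr becomes non-zero only after efds is allocated. */
static int
toy_efds_alloc(struct toy_intr_handle *h, int n)
{
	if (n <= 0)
		return -EINVAL;
	h->efds = calloc((size_t)n, sizeof(*h->efds));
	if (h->efds == NULL)
		return -ENOMEM;
	h->nb_intr = n;
	return 0;
}

static int
toy_efds_index_get(const struct toy_intr_handle *h, int index, int *fd)
{
	/* No NULL test on efds: with nb_intr == 0 every index is rejected. */
	if (index < 0 || index >= h->nb_intr)
		return -ERANGE;
	*fd = h->efds[index];
	return 0;
}
```

If some path could set `nb_intr` without allocating the array, the assumption breaks and the NULL check would be needed again.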

>   if (index >= intr_handle->nb_intr) {
>   RTE_LOG(ERR, EAL, "Invalid index %d, max limit %d\n", index,
>   intr_handle->nb_intr);
> @@ -256,6 +353,12 @@ int rte_intr_efds_index_set(struct rte_intr_handle 
> *intr_handle,
>  {
>   CHECK_VALID_INTR_HANDLE(intr_handle);
>  
> + if (intr_handle->efds == NULL) {
> + RTE_LOG(ERR, EAL, "Event fd list not allocated\n");
> + rte_errno = EFAULT;
> + goto fail;
> + }
> +
>   if (index >= intr_handle->nb_intr) {
>   RTE_LOG(ERR, EAL, "Invalid size %d, max limit %d\n", index,
>   intr_handle->nb_intr);
> @@ -275,6 +378,12 @@ struct rte_epoll_event *rte_intr_elist_index_get(
>  {
>   CHECK_VALID_INTR_HANDLE(intr_handle);
>  
> + if (intr_handle->elist == NULL) {
> + RTE_LOG(ERR, EAL, "Event list not allocated\n");
> + rte_errno = EFAULT;
> + goto fail;
> + }
> +
>   if (index >= intr_handle->nb_intr) {
>   RTE_LOG(ERR, EAL, "Invalid index %d, max limit %d\n", index,
>   intr_handle->nb_intr);
> @@ -292,6 +401,12 @@ int rte_intr_elist_index_set(struct rte_intr_handle 
> *intr_handle,
>  {
>   CHECK_VALID_INTR_HANDLE(intr_handle);
>  
> + if (intr_handle->elist == NULL) {
> + RTE_LOG(ERR, EAL, "Event list not allocated\n");
> + rte_errno = EFAULT;
> + goto fail;
> + }
> +
>   if (index >= intr_handle->nb_intr) {
>   RTE_LOG(ERR, EAL, "Invalid index %d, max limit %d\n", index,
>   intr_handle->nb_intr);
[...]



[dpdk-dev] [PATCH 1/7] net/sfc: do not allow flow rules to refer to VF representors

2021-10-25 Thread Ivan Malov
VF representors do not own dedicated m-ports and thus cannot
be referred to as traffic endpoints in flow items or actions.

Fixes: a62ec90522a6 ("net/sfc: add port representors infrastructure")
Fixes: f55b61cec94a ("net/sfc: support port representor flow item")

Signed-off-by: Ivan Malov 
Reviewed-by: Andrew Rybchenko 
---
 drivers/net/sfc/sfc_switch.c | 22 +++---
 1 file changed, 19 insertions(+), 3 deletions(-)

diff --git a/drivers/net/sfc/sfc_switch.c b/drivers/net/sfc/sfc_switch.c
index a28e861de5..265a17f4c4 100644
--- a/drivers/net/sfc/sfc_switch.c
+++ b/drivers/net/sfc/sfc_switch.c
@@ -512,7 +512,7 @@ sfc_mae_clear_switch_port(uint16_t switch_domain_id,
 static int
 sfc_mae_find_switch_port_by_ethdev(uint16_t switch_domain_id,
   uint16_t ethdev_port_id,
-  efx_mport_sel_t *mport_sel)
+  struct sfc_mae_switch_port **switch_port)
 {
struct sfc_mae_switch_domain *domain;
struct sfc_mae_switch_port *port;
@@ -528,7 +528,7 @@ sfc_mae_find_switch_port_by_ethdev(uint16_t 
switch_domain_id,
 
TAILQ_FOREACH(port, &domain->ports, switch_domain_ports) {
if (port->ethdev_port_id == ethdev_port_id) {
-   *mport_sel = port->ethdev_mport;
+   *switch_port = port;
return 0;
}
}
@@ -541,11 +541,27 @@ sfc_mae_switch_port_by_ethdev(uint16_t switch_domain_id,
  uint16_t ethdev_port_id,
  efx_mport_sel_t *mport_sel)
 {
+   struct sfc_mae_switch_port *port;
int rc;
 
rte_spinlock_lock(&sfc_mae_switch.lock);
rc = sfc_mae_find_switch_port_by_ethdev(switch_domain_id,
-   ethdev_port_id, mport_sel);
+   ethdev_port_id, &port);
+   if (rc != 0)
+   goto unlock;
+
+   if (port->type != SFC_MAE_SWITCH_PORT_INDEPENDENT) {
+   /*
+* The ethdev is a "VF representor". It does not own
+* a dedicated m-port suitable for use in flow rules.
+*/
+   rc = ENOTSUP;
+   goto unlock;
+   }
+
+   *mport_sel = port->ethdev_mport;
+
+unlock:
rte_spinlock_unlock(&sfc_mae_switch.lock);
 
return rc;
-- 
2.20.1
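
For readers skimming the thread, the lookup behaviour this patch introduces can be sketched standalone as below. This is only an illustration: every `toy_` name is hypothetical, locking is left out, and returning ENOENT for a missing entry is an assumption, since the patch context does not show that path.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical simplified types; the driver's real structures differ. */
enum toy_port_type {
	TOY_PORT_INDEPENDENT,
	TOY_PORT_REPRESENTOR,
};

struct toy_switch_port {
	enum toy_port_type type;
	uint16_t ethdev_port_id;
	uint32_t ethdev_mport;
};

/* Returns 0 on success or a positive errno value, as the patch does. */
static int
toy_get_ethdev_mport(const struct toy_switch_port *ports, size_t nb_ports,
		     uint16_t ethdev_port_id, uint32_t *mport)
{
	size_t i;

	for (i = 0; i < nb_ports; i++) {
		if (ports[i].ethdev_port_id != ethdev_port_id)
			continue;

		/* A representor owns no dedicated m-port usable in flow rules. */
		if (ports[i].type != TOY_PORT_INDEPENDENT)
			return ENOTSUP;

		*mport = ports[i].ethdev_mport;
		return 0;
	}

	return ENOENT; /* assumption: the real "not found" code is not shown */
}
```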



[dpdk-dev] [PATCH 3/7] net/sfc: improve m-port related log messages

2021-10-25 Thread Ivan Malov
Make these messages more specific.

Signed-off-by: Ivan Malov 
Reviewed-by: Andrew Rybchenko 
---
 drivers/net/sfc/sfc_mae.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/net/sfc/sfc_mae.c b/drivers/net/sfc/sfc_mae.c
index 84b13925ff..a4a22f32c6 100644
--- a/drivers/net/sfc/sfc_mae.c
+++ b/drivers/net/sfc/sfc_mae.c
@@ -1281,7 +1281,7 @@ sfc_mae_rule_parse_item_port_id(const struct 
rte_flow_item *item,
if (rc != 0) {
return rte_flow_error_set(error, rc,
RTE_FLOW_ERROR_TYPE_ITEM, item,
-   "Can't find RTE ethdev by the port ID");
+   "Can't get m-port for the given ethdev");
}
 
rc = efx_mae_match_spec_mport_set(ctx_mae->match_spec,
@@ -1341,7 +1341,7 @@ sfc_mae_rule_parse_item_port_representor(const struct 
rte_flow_item *item,
if (rc != 0) {
return rte_flow_error_set(error, rc,
RTE_FLOW_ERROR_TYPE_ITEM, item,
-   "Can't find RTE ethdev by the port ID");
+   "Can't get m-port for the given ethdev");
}
 
rc = efx_mae_match_spec_mport_set(ctx_mae->match_spec,
@@ -3409,7 +3409,7 @@ sfc_mae_rule_parse_action_port_id(struct sfc_adapter *sa,
rc = sfc_mae_switch_get_ethdev_mport(mae->switch_domain_id,
 port_id, &mport);
if (rc != 0) {
-   sfc_err(sa, "failed to find MAE switch port SW entry for RTE 
ethdev port %u: %s",
+   sfc_err(sa, "failed to get m-port for the given ethdev 
(port_id=%u): %s",
port_id, strerror(rc));
return rc;
}
-- 
2.20.1



[dpdk-dev] [PATCH 2/7] net/sfc: rename ethdev m-port retrieval helper

2021-10-25 Thread Ivan Malov
The function in question has an unfortunate name that reads
like finding a SW switch port entry. In fact just one of
the two m-ports is retrieved from that entry.

Signed-off-by: Ivan Malov 
Reviewed-by: Andrew Rybchenko 
---
 drivers/net/sfc/sfc_mae.c| 10 +-
 drivers/net/sfc/sfc_switch.c |  6 +++---
 drivers/net/sfc/sfc_switch.h |  6 +++---
 3 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/drivers/net/sfc/sfc_mae.c b/drivers/net/sfc/sfc_mae.c
index 455744c570..84b13925ff 100644
--- a/drivers/net/sfc/sfc_mae.c
+++ b/drivers/net/sfc/sfc_mae.c
@@ -1276,8 +1276,8 @@ sfc_mae_rule_parse_item_port_id(const struct 
rte_flow_item *item,
  "The port ID is too large");
}
 
-   rc = sfc_mae_switch_port_by_ethdev(ctx_mae->sa->mae.switch_domain_id,
-  spec->id, &mport_sel);
+   rc = sfc_mae_switch_get_ethdev_mport(ctx_mae->sa->mae.switch_domain_id,
+spec->id, &mport_sel);
if (rc != 0) {
return rte_flow_error_set(error, rc,
RTE_FLOW_ERROR_TYPE_ITEM, item,
@@ -1335,7 +1335,7 @@ sfc_mae_rule_parse_item_port_representor(const struct 
rte_flow_item *item,
if (spec == NULL)
return 0;
 
-   rc = sfc_mae_switch_port_by_ethdev(
+   rc = sfc_mae_switch_get_ethdev_mport(
ctx_mae->sa->mae.switch_domain_id,
spec->port_id, &mport_sel);
if (rc != 0) {
@@ -3406,8 +3406,8 @@ sfc_mae_rule_parse_action_port_id(struct sfc_adapter *sa,
 
port_id = (conf->original != 0) ? sas->port_id : conf->id;
 
-   rc = sfc_mae_switch_port_by_ethdev(mae->switch_domain_id,
-  port_id, &mport);
+   rc = sfc_mae_switch_get_ethdev_mport(mae->switch_domain_id,
+port_id, &mport);
if (rc != 0) {
sfc_err(sa, "failed to find MAE switch port SW entry for RTE 
ethdev port %u: %s",
port_id, strerror(rc));
diff --git a/drivers/net/sfc/sfc_switch.c b/drivers/net/sfc/sfc_switch.c
index 265a17f4c4..3f7518fa30 100644
--- a/drivers/net/sfc/sfc_switch.c
+++ b/drivers/net/sfc/sfc_switch.c
@@ -537,9 +537,9 @@ sfc_mae_find_switch_port_by_ethdev(uint16_t 
switch_domain_id,
 }
 
 int
-sfc_mae_switch_port_by_ethdev(uint16_t switch_domain_id,
- uint16_t ethdev_port_id,
- efx_mport_sel_t *mport_sel)
+sfc_mae_switch_get_ethdev_mport(uint16_t switch_domain_id,
+   uint16_t ethdev_port_id,
+   efx_mport_sel_t *mport_sel)
 {
struct sfc_mae_switch_port *port;
int rc;
diff --git a/drivers/net/sfc/sfc_switch.h b/drivers/net/sfc/sfc_switch.h
index 7917141038..a5a0fb4fc5 100644
--- a/drivers/net/sfc/sfc_switch.h
+++ b/drivers/net/sfc/sfc_switch.h
@@ -102,9 +102,9 @@ int sfc_mae_assign_switch_port(uint16_t switch_domain_id,
 int sfc_mae_clear_switch_port(uint16_t switch_domain_id,
  uint16_t switch_port_id);
 
-int sfc_mae_switch_port_by_ethdev(uint16_t switch_domain_id,
- uint16_t ethdev_port_id,
- efx_mport_sel_t *mport_sel);
+int sfc_mae_switch_get_ethdev_mport(uint16_t switch_domain_id,
+   uint16_t ethdev_port_id,
+   efx_mport_sel_t *mport_sel);
 
 int sfc_mae_switch_port_id_by_entity(uint16_t switch_domain_id,
 const efx_mport_sel_t *entity_mportp,
-- 
2.20.1



[dpdk-dev] [PATCH 4/7] net/sfc: assign correct m-ports to independent switch ports

2021-10-25 Thread Ivan Malov
In accordance with commits [1-4], the MAE admin ethdev represents a
network port and not the PF which it sits on. Rework how the
"ethdev" and "entity" m-ports are assigned in SW switch
port entries of independent ethdevs. Explain in comments.

[1] commit 081e42dab11d ("ethdev: add port representor item to flow API")
[2] commit 49863ae2bf95 ("ethdev: add represented port item to flow API")
[3] commit 8edb6bc0263e ("ethdev: add port representor action
to flow API")
[4] commit 88caad251c8d ("ethdev: add represented port action
to flow API")

Signed-off-by: Ivan Malov 
Reviewed-by: Andrew Rybchenko 
---
 drivers/net/sfc/sfc_mae.c | 41 ---
 1 file changed, 38 insertions(+), 3 deletions(-)

diff --git a/drivers/net/sfc/sfc_mae.c b/drivers/net/sfc/sfc_mae.c
index a4a22f32c6..bd8a913a49 100644
--- a/drivers/net/sfc/sfc_mae.c
+++ b/drivers/net/sfc/sfc_mae.c
@@ -23,7 +23,7 @@
 #include "sfc_service.h"
 
 static int
-sfc_mae_assign_entity_mport(struct sfc_adapter *sa,
+sfc_mae_assign_ethdev_mport(struct sfc_adapter *sa,
efx_mport_sel_t *mportp)
 {
const efx_nic_cfg_t *encp = efx_nic_cfg_get(sa->nic);
@@ -32,6 +32,35 @@ sfc_mae_assign_entity_mport(struct sfc_adapter *sa,
  mportp);
 }
 
+static int
+sfc_mae_assign_entity_mport(struct sfc_adapter *sa,
+   efx_mport_sel_t *mportp)
+{
+   const efx_nic_cfg_t *encp = efx_nic_cfg_get(sa->nic);
+   int rc = 0;
+
+   if (encp->enc_mae_admin) {
+   /*
+* This ethdev sits on MAE admin PF. The represented
+* entity is the network port assigned to that PF.
+*/
+   rc = efx_mae_mport_by_phy_port(encp->enc_assigned_port, mportp);
+   } else {
+   /*
+* This ethdev sits on unprivileged PF / VF. The entity
+* represented by the ethdev can change dynamically
+* as MAE admin changes default traffic rules.
+*
+* For the sake of simplicity, do not fill in the m-port
+* and assume that flow rules should not be allowed to
+* reference the entity represented by this ethdev.
+*/
+   efx_mae_mport_invalid(mportp);
+   }
+
+   return rc;
+}
+
 static int
 sfc_mae_counter_registry_init(struct sfc_mae_counter_registry *registry,
  uint32_t nb_counters_max)
@@ -184,6 +213,7 @@ sfc_mae_attach(struct sfc_adapter *sa)
struct sfc_adapter_shared * const sas = sfc_sa2shared(sa);
struct sfc_mae_switch_port_request switch_port_request = {0};
const efx_nic_cfg_t *encp = efx_nic_cfg_get(sa->nic);
+   efx_mport_sel_t ethdev_mport;
efx_mport_sel_t entity_mport;
struct sfc_mae *mae = &sa->mae;
struct sfc_mae_bounce_eh *bounce_eh = &mae->bounce_eh;
@@ -218,6 +248,11 @@ sfc_mae_attach(struct sfc_adapter *sa)
}
}
 
+   sfc_log_init(sa, "assign ethdev MPORT");
+   rc = sfc_mae_assign_ethdev_mport(sa, &ethdev_mport);
+   if (rc != 0)
+   goto fail_mae_assign_ethdev_mport;
+
sfc_log_init(sa, "assign entity MPORT");
rc = sfc_mae_assign_entity_mport(sa, &entity_mport);
if (rc != 0)
@@ -230,9 +265,8 @@ sfc_mae_attach(struct sfc_adapter *sa)
 
sfc_log_init(sa, "assign RTE switch port");
switch_port_request.type = SFC_MAE_SWITCH_PORT_INDEPENDENT;
+   switch_port_request.ethdev_mportp = &ethdev_mport;
switch_port_request.entity_mportp = &entity_mport;
-   /* RTE ethdev MPORT matches that of the entity for independent ports. */
-   switch_port_request.ethdev_mportp = &entity_mport;
switch_port_request.ethdev_port_id = sas->port_id;
switch_port_request.port_data.indep.mae_admin =
encp->enc_mae_admin == B_TRUE;
@@ -272,6 +306,7 @@ sfc_mae_attach(struct sfc_adapter *sa)
 fail_mae_assign_switch_port:
 fail_mae_assign_switch_domain:
 fail_mae_assign_entity_mport:
+fail_mae_assign_ethdev_mport:
if (encp->enc_mae_admin)
sfc_mae_counter_registry_fini(&mae->counter_registry);
 
-- 
2.20.1



[dpdk-dev] [PATCH 5/7] net/sfc: support represented port flow item

2021-10-25 Thread Ivan Malov
Add support for item REPRESENTED_PORT to match on traffic entering
the embedded switch from the entity represented by the given
ethdev (network port or VF).

Signed-off-by: Ivan Malov 
Reviewed-by: Andrew Rybchenko 
---
 doc/guides/nics/features/sfc.ini |  1 +
 doc/guides/nics/sfc_efx.rst  |  2 ++
 drivers/net/sfc/sfc_mae.c| 52 +---
 drivers/net/sfc/sfc_switch.c | 29 ++
 drivers/net/sfc/sfc_switch.h |  4 +++
 5 files changed, 77 insertions(+), 11 deletions(-)

diff --git a/doc/guides/nics/features/sfc.ini b/doc/guides/nics/features/sfc.ini
index 7db868e59f..c830426eb2 100644
--- a/doc/guides/nics/features/sfc.ini
+++ b/doc/guides/nics/features/sfc.ini
@@ -53,6 +53,7 @@ port_id  = Y
 port_representor = Y
 pppoed   = Y
 pppoes   = Y
+represented_port = Y
 tcp  = Y
 udp  = Y
 vf   = Y
diff --git a/doc/guides/nics/sfc_efx.rst b/doc/guides/nics/sfc_efx.rst
index 843c24991c..8dbd250b3c 100644
--- a/doc/guides/nics/sfc_efx.rst
+++ b/doc/guides/nics/sfc_efx.rst
@@ -194,6 +194,8 @@ Supported pattern items (***transfer*** rules):
 
 - PORT_REPRESENTOR (cannot repeat; conflicts with other traffic source items)
 
+- REPRESENTED_PORT (cannot repeat; conflicts with other traffic source items)
+
 - PORT_ID (cannot repeat; conflicts with other traffic source items)
 
 - PHY_PORT (cannot repeat; conflicts with other traffic source items)
diff --git a/drivers/net/sfc/sfc_mae.c b/drivers/net/sfc/sfc_mae.c
index bd8a913a49..8ec4036275 100644
--- a/drivers/net/sfc/sfc_mae.c
+++ b/drivers/net/sfc/sfc_mae.c
@@ -1333,9 +1333,9 @@ sfc_mae_rule_parse_item_port_id(const struct 
rte_flow_item *item,
 }
 
 static int
-sfc_mae_rule_parse_item_port_representor(const struct rte_flow_item *item,
-struct sfc_flow_parse_ctx *ctx,
-struct rte_flow_error *error)
+sfc_mae_rule_parse_item_ethdev_based(const struct rte_flow_item *item,
+struct sfc_flow_parse_ctx *ctx,
+struct rte_flow_error *error)
 {
struct sfc_mae_parse_ctx *ctx_mae = ctx->mae;
const struct rte_flow_item_ethdev supp_mask = {
@@ -1363,20 +1363,38 @@ sfc_mae_rule_parse_item_port_representor(const struct 
rte_flow_item *item,
if (mask->port_id != supp_mask.port_id) {
return rte_flow_error_set(error, EINVAL,
RTE_FLOW_ERROR_TYPE_ITEM, item,
-   "Bad mask in the PORT_REPRESENTOR pattern 
item");
+   "Bad mask in the ethdev-based pattern item");
}
 
/* If "spec" is not set, could be any port ID */
if (spec == NULL)
return 0;
 
-   rc = sfc_mae_switch_get_ethdev_mport(
-   ctx_mae->sa->mae.switch_domain_id,
-   spec->port_id, &mport_sel);
-   if (rc != 0) {
-   return rte_flow_error_set(error, rc,
+   switch (item->type) {
+   case RTE_FLOW_ITEM_TYPE_PORT_REPRESENTOR:
+   rc = sfc_mae_switch_get_ethdev_mport(
+   ctx_mae->sa->mae.switch_domain_id,
+   spec->port_id, &mport_sel);
+   if (rc != 0) {
+   return rte_flow_error_set(error, rc,
+   RTE_FLOW_ERROR_TYPE_ITEM, item,
+   "Can't get m-port for the given 
ethdev");
+   }
+   break;
+   case RTE_FLOW_ITEM_TYPE_REPRESENTED_PORT:
+   rc = sfc_mae_switch_get_entity_mport(
+   ctx_mae->sa->mae.switch_domain_id,
+   spec->port_id, &mport_sel);
+   if (rc != 0) {
+   return rte_flow_error_set(error, rc,
+   RTE_FLOW_ERROR_TYPE_ITEM, item,
+   "Can't get m-port for the given 
ethdev");
+   }
+   break;
+   default:
+   return rte_flow_error_set(error, EINVAL,
RTE_FLOW_ERROR_TYPE_ITEM, item,
-   "Can't get m-port for the given ethdev");
+   "Unsupported ethdev-based flow item");
}
 
rc = efx_mae_match_spec_mport_set(ctx_mae->match_spec,
@@ -2329,7 +2347,19 @@ static const struct sfc_flow_item sfc_flow_items[] = {
.prev_layer = SFC_FLOW_ITEM_ANY_LAYER,
.layer = SFC_FLOW_ITEM_ANY_LAYER,
.ctx_type = SFC_FLOW_PARSE_CTX_MAE,
-   .parse = sfc_mae_rule_parse_item_port_representor,
+   .parse = sfc_mae_rule_parse_item_ethdev_based,
+   },
+   {
+   .type = RTE_FLOW_ITEM_TYPE_REPRESENTED_PORT,
+   .name = "

[dpdk-dev] [PATCH 6/7] net/sfc: support port representor related flow actions

2021-10-25 Thread Ivan Malov
Add support for actions PORT_REPRESENTOR and REPRESENTED_PORT.

The former should be used instead of ambiguous PORT_ID.

The latter sends traffic to the entity represented by
the given ethdev (network port or VF).

Signed-off-by: Ivan Malov 
Reviewed-by: Andrew Rybchenko 
---
 doc/guides/nics/features/sfc.ini |  2 +
 doc/guides/nics/sfc_efx.rst  |  4 ++
 drivers/net/sfc/sfc_mae.c| 66 
 3 files changed, 72 insertions(+)

diff --git a/doc/guides/nics/features/sfc.ini b/doc/guides/nics/features/sfc.ini
index c830426eb2..0d785f4765 100644
--- a/doc/guides/nics/features/sfc.ini
+++ b/doc/guides/nics/features/sfc.ini
@@ -73,6 +73,8 @@ of_set_vlan_vid  = Y
 pf   = Y
 phy_port = Y
 port_id  = Y
+port_representor = Y
+represented_port = Y
 queue= Y
 rss  = Y
 vf   = Y
diff --git a/doc/guides/nics/sfc_efx.rst b/doc/guides/nics/sfc_efx.rst
index 8dbd250b3c..960e25bf98 100644
--- a/doc/guides/nics/sfc_efx.rst
+++ b/doc/guides/nics/sfc_efx.rst
@@ -248,6 +248,10 @@ Supported actions (***transfer*** rules):
 
 - VF
 
+- PORT_REPRESENTOR
+
+- REPRESENTED_PORT
+
 - PORT_ID
 
 - COUNT
diff --git a/drivers/net/sfc/sfc_mae.c b/drivers/net/sfc/sfc_mae.c
index 8ec4036275..411f2ac27e 100644
--- a/drivers/net/sfc/sfc_mae.c
+++ b/drivers/net/sfc/sfc_mae.c
@@ -3488,6 +3488,58 @@ sfc_mae_rule_parse_action_port_id(struct sfc_adapter *sa,
return rc;
 }
 
+static int
+sfc_mae_rule_parse_action_port_representor(struct sfc_adapter *sa,
+   const struct rte_flow_action_ethdev *conf,
+   efx_mae_actions_t *spec)
+{
+   struct sfc_mae *mae = &sa->mae;
+   efx_mport_sel_t mport;
+   int rc;
+
+   rc = sfc_mae_switch_get_ethdev_mport(mae->switch_domain_id,
+conf->port_id, &mport);
+   if (rc != 0) {
+   sfc_err(sa, "failed to get m-port for the given ethdev 
(port_id=%u): %s",
+   conf->port_id, strerror(rc));
+   return rc;
+   }
+
+   rc = efx_mae_action_set_populate_deliver(spec, &mport);
+   if (rc != 0) {
+   sfc_err(sa, "failed to request action DELIVER with m-port 
selector 0x%08x: %s",
+   mport.sel, strerror(rc));
+   }
+
+   return rc;
+}
+
+static int
+sfc_mae_rule_parse_action_represented_port(struct sfc_adapter *sa,
+   const struct rte_flow_action_ethdev *conf,
+   efx_mae_actions_t *spec)
+{
+   struct sfc_mae *mae = &sa->mae;
+   efx_mport_sel_t mport;
+   int rc;
+
+   rc = sfc_mae_switch_get_entity_mport(mae->switch_domain_id,
+conf->port_id, &mport);
+   if (rc != 0) {
+   sfc_err(sa, "failed to get m-port for the given ethdev 
(port_id=%u): %s",
+   conf->port_id, strerror(rc));
+   return rc;
+   }
+
+   rc = efx_mae_action_set_populate_deliver(spec, &mport);
+   if (rc != 0) {
+   sfc_err(sa, "failed to request action DELIVER with m-port 
selector 0x%08x: %s",
+   mport.sel, strerror(rc));
+   }
+
+   return rc;
+}
+
 static const char * const action_names[] = {
[RTE_FLOW_ACTION_TYPE_VXLAN_DECAP] = "VXLAN_DECAP",
[RTE_FLOW_ACTION_TYPE_OF_POP_VLAN] = "OF_POP_VLAN",
@@ -3501,6 +3553,8 @@ static const char * const action_names[] = {
[RTE_FLOW_ACTION_TYPE_PF] = "PF",
[RTE_FLOW_ACTION_TYPE_VF] = "VF",
[RTE_FLOW_ACTION_TYPE_PORT_ID] = "PORT_ID",
+   [RTE_FLOW_ACTION_TYPE_PORT_REPRESENTOR] = "PORT_REPRESENTOR",
+   [RTE_FLOW_ACTION_TYPE_REPRESENTED_PORT] = "REPRESENTED_PORT",
[RTE_FLOW_ACTION_TYPE_DROP] = "DROP",
[RTE_FLOW_ACTION_TYPE_JUMP] = "JUMP",
 };
@@ -3609,6 +3663,18 @@ sfc_mae_rule_parse_action(struct sfc_adapter *sa,
   bundle->actions_mask);
rc = sfc_mae_rule_parse_action_port_id(sa, action->conf, spec);
break;
+   case RTE_FLOW_ACTION_TYPE_PORT_REPRESENTOR:
+   SFC_BUILD_SET_OVERFLOW(RTE_FLOW_ACTION_TYPE_PORT_REPRESENTOR,
+  bundle->actions_mask);
+   rc = sfc_mae_rule_parse_action_port_representor(sa,
+   action->conf, spec);
+   break;
+   case RTE_FLOW_ACTION_TYPE_REPRESENTED_PORT:
+   SFC_BUILD_SET_OVERFLOW(RTE_FLOW_ACTION_TYPE_REPRESENTED_PORT,
+  bundle->actions_mask);
+   rc = sfc_mae_rule_parse_action_represented_port(sa,
+   action->conf, spec);
+   break;
case RTE_FLOW_ACTION_TYPE_DROP:
SFC_BUILD_SET_OVERFLOW(RTE_FLOW_ACTION_TYPE_DROP,
   bundle->actions_mask);
-- 
2.20.1
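
The distinction between the two ethdev-based items and actions (PORT_REPRESENTOR uses the ethdev's own m-port, REPRESENTED_PORT uses the m-port of the entity behind it) can be sketched roughly as follows. All `toy_` names are hypothetical and stand in for the driver's switch port entry and the rte_flow item/action types; this is not the driver code.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-ins for the rte_flow item/action types. */
enum toy_ethdev_ref {
	TOY_PORT_REPRESENTOR, /* refers to the ethdev itself */
	TOY_REPRESENTED_PORT, /* refers to the entity the ethdev represents */
	TOY_OTHER,
};

/* Hypothetical switch port entry holding both m-ports. */
struct toy_port_pair {
	uint32_t ethdev_mport;
	uint32_t entity_mport;
};

/* Returns 0 on success or a positive errno value. */
static int
toy_resolve_mport(const struct toy_port_pair *port, enum toy_ethdev_ref ref,
		  uint32_t *mport)
{
	switch (ref) {
	case TOY_PORT_REPRESENTOR:
		*mport = port->ethdev_mport;
		return 0;
	case TOY_REPRESENTED_PORT:
		*mport = port->entity_mport;
		return 0;
	default:
		/* Not an ethdev-based reference. */
		return EINVAL;
	}
}
```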



[dpdk-dev] [PATCH 7/7] net/sfc: ignore direction attributes in transfer flows

2021-10-25 Thread Ivan Malov
[1] has deprecated the use of direction attributes in "transfer"
flows. Ignore them during the transition period.

[1] commit 9d2a349b388a ("ethdev: deprecate direction attributes
in transfer flows")

Signed-off-by: Ivan Malov 
Reviewed-by: Andrew Rybchenko 
---
 drivers/net/sfc/sfc_flow.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/sfc/sfc_flow.c b/drivers/net/sfc/sfc_flow.c
index be2dfe778a..fc74c8035e 100644
--- a/drivers/net/sfc/sfc_flow.c
+++ b/drivers/net/sfc/sfc_flow.c
@@ -1252,13 +1252,13 @@ sfc_flow_parse_attr(struct sfc_adapter *sa,
   "Groups are not supported");
return -rte_errno;
}
-   if (attr->egress != 0) {
+   if (attr->egress != 0 && attr->transfer == 0) {
rte_flow_error_set(error, ENOTSUP,
   RTE_FLOW_ERROR_TYPE_ATTR_EGRESS, attr,
   "Egress is not supported");
return -rte_errno;
}
-   if (attr->ingress == 0) {
+   if (attr->ingress == 0 && attr->transfer == 0) {
rte_flow_error_set(error, ENOTSUP,
   RTE_FLOW_ERROR_TYPE_ATTR_INGRESS, attr,
   "Ingress is compulsory");
-- 
2.20.1
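
The resulting attribute check can be sketched in isolation as below. `toy_flow_attr` is a hypothetical reduced version of the flow attributes with only the three bits involved, and the function returns a negative errno directly instead of filling an error structure.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical reduced attribute set; only the bits the check looks at. */
struct toy_flow_attr {
	unsigned int ingress:1;
	unsigned int egress:1;
	unsigned int transfer:1;
};

static int
toy_check_direction(const struct toy_flow_attr *attr)
{
	/* Direction attributes only matter for non-transfer rules. */
	if (attr->egress != 0 && attr->transfer == 0)
		return -ENOTSUP; /* egress is not supported */

	if (attr->ingress == 0 && attr->transfer == 0)
		return -ENOTSUP; /* ingress is compulsory */

	return 0;
}
```

With `transfer` set, any combination of `ingress` and `egress` passes, which matches the "ignore them during the transition period" wording of the commit message.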



Re: [dpdk-dev] [PATCH v6 4/9] alarm: remove direct access to interrupt handle

2021-10-25 Thread David Marchand
On Mon, Oct 25, 2021 at 12:49 PM Dmitry Kozlyuk
 wrote:
> > diff --git a/lib/eal/freebsd/eal_alarm.c b/lib/eal/freebsd/eal_alarm.c
> > index c38b2e04f8..1a8fcf24c5 100644
> > --- a/lib/eal/freebsd/eal_alarm.c
> > +++ b/lib/eal/freebsd/eal_alarm.c
> > @@ -32,7 +32,7 @@
> >
> >  struct alarm_entry {
> >   LIST_ENTRY(alarm_entry) next;
> > - struct rte_intr_handle handle;
> > + struct rte_intr_handle *handle;
>
> This field is never used and can be just removed.

Indeed, removed.
>
> >   struct timespec time;
> >   rte_eal_alarm_callback cb_fn;
> >   void *cb_arg;
> [...]
>


-- 
David Marchand



Re: [dpdk-dev] [PATCH v6 9/9] interrupts: extend event list

2021-10-25 Thread David Marchand
On Mon, Oct 25, 2021 at 12:49 PM Dmitry Kozlyuk
 wrote:
> > diff --git a/lib/eal/common/eal_common_interrupts.c 
> > b/lib/eal/common/eal_common_interrupts.c
> > index 3285c4335f..7feb9da8fa 100644
> > --- a/lib/eal/common/eal_common_interrupts.c
> > +++ b/lib/eal/common/eal_common_interrupts.c
> [...]
> >  int rte_intr_fd_set(struct rte_intr_handle *intr_handle, int fd)
> > @@ -239,6 +330,12 @@ int rte_intr_efds_index_get(const struct 
> > rte_intr_handle *intr_handle,
> >  {
> >   CHECK_VALID_INTR_HANDLE(intr_handle);
> >
> > + if (intr_handle->efds == NULL) {
> > + RTE_LOG(ERR, EAL, "Event fd list not allocated\n");
> > + rte_errno = EFAULT;
> > + goto fail;
> > + }
> > +
>
> Here and below:
> The check for `nb_intr` will already catch not allocated `efds`,
> because `nb_intr` is necessarily 0 in this case.

+1.
Thanks Dmitry.


-- 
David Marchand



[dpdk-dev] [PATCH v18 0/5] Add PIE support for HQoS library

2021-10-25 Thread Liguzinski, WojciechX
The DPDK sched library has a mechanism that protects it from the
bufferbloat problem, a situation in which excess buffers in the network
cause high latency and latency variation. Currently, it supports RED for
active queue management. However, more advanced queue management is
required to address this problem and provide the desired quality of
service to users.

This solution (RFC) proposes using a new algorithm called "PIE"
(Proportional Integral controller Enhanced) that can effectively and
directly control queuing latency to address the bufferbloat problem.
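
As a rough illustration of the idea, the core of the PIE controller (as I understand RFC 8033) periodically adjusts a drop probability from the current queue delay, a reference delay and the previous delay sample. The sketch below is not the code from this series: the `toy_` names are hypothetical, delays are in seconds, the 0.125 and 1.25 gains are taken to be the RFC's suggested defaults and should be treated as assumptions, and auto-tuning, burst allowance and the dequeue-rate estimation are all left out.

```c
#include <assert.h>

/* Assumed controller gains; see the note above. */
#define TOY_PIE_ALPHA 0.125
#define TOY_PIE_BETA  1.25

/* Hypothetical minimal state, NOT struct rte_pie from the patch. */
struct toy_pie {
	double drop_prob;  /* current drop probability, kept in [0, 1] */
	double qdelay_old; /* queue delay at the previous update, seconds */
};

/* One periodic update of the drop probability. */
static void
toy_pie_update(struct toy_pie *pie, double qdelay, double qdelay_ref)
{
	/* Proportional term: distance from the reference delay.
	 * Second term: trend of the delay since the last update. */
	double delta = TOY_PIE_ALPHA * (qdelay - qdelay_ref) +
		       TOY_PIE_BETA * (qdelay - pie->qdelay_old);

	pie->drop_prob += delta;

	/* Keep the probability in range. */
	if (pie->drop_prob < 0.0)
		pie->drop_prob = 0.0;
	if (pie->drop_prob > 1.0)
		pie->drop_prob = 1.0;

	pie->qdelay_old = qdelay;
}
```

The point of the second term is that the probability reacts to whether latency is rising or falling, not only to how far it is from the target.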

The implementation modifies existing data structures in the library,
adds a new set of data structures, and adds PIE-related APIs.
This affects structures in the public API/ABI, so a deprecation notice
is going to be prepared and sent.

Liguzinski, WojciechX (5):
  sched: add PIE based congestion management
  example/qos_sched: add PIE support
  example/ip_pipeline: add PIE support
  doc/guides/prog_guide: added PIE
  app/test: add tests for PIE

 app/test/meson.build |4 +
 app/test/test_pie.c  | 1065 ++
 config/rte_config.h  |1 -
 doc/guides/prog_guide/glossary.rst   |3 +
 doc/guides/prog_guide/qos_framework.rst  |   64 +-
 doc/guides/prog_guide/traffic_management.rst |   13 +-
 drivers/net/softnic/rte_eth_softnic_tm.c |6 +-
 examples/ip_pipeline/tmgr.c  |  142 +--
 examples/qos_sched/cfg_file.c|  127 ++-
 examples/qos_sched/cfg_file.h|5 +
 examples/qos_sched/init.c|   27 +-
 examples/qos_sched/main.h|3 +
 examples/qos_sched/profile.cfg   |  196 ++--
 lib/sched/meson.build|3 +-
 lib/sched/rte_pie.c  |   86 ++
 lib/sched/rte_pie.h  |  398 +++
 lib/sched/rte_sched.c|  241 ++--
 lib/sched/rte_sched.h|   63 +-
 lib/sched/version.map|4 +
 19 files changed, 2172 insertions(+), 279 deletions(-)
 create mode 100644 app/test/test_pie.c
 create mode 100644 lib/sched/rte_pie.c
 create mode 100644 lib/sched/rte_pie.h

-- 
2.25.1

Series-acked-by: Cristian Dumitrescu 


[dpdk-dev] [PATCH v18 1/5] sched: add PIE based congestion management

2021-10-25 Thread Liguzinski, WojciechX
Implement PIE-based congestion management as described in RFC 8033.

Signed-off-by: Liguzinski, WojciechX 
--
Changes in V18:
- Resolved merge conflict in lib/sched/meson.build after rebasing on top of main
- Reverted whitespace change in app_thread.c - comment from Stephen Hemminger

Changes in V17:
- Corrected paragraph link naming in qos_framework.rst to fix CI builds

Changes in V16:
- Fixed 'title underline too short' error in qos_framework.rst
- Applied __rte_unused macro to parameters in rte_sched_port_pie_dequeue()

---
 drivers/net/softnic/rte_eth_softnic_tm.c |   6 +-
 lib/sched/meson.build|   3 +-
 lib/sched/rte_pie.c  |  82 +
 lib/sched/rte_pie.h  | 393 +++
 lib/sched/rte_sched.c| 241 +-
 lib/sched/rte_sched.h|  63 +++-
 lib/sched/version.map|   4 +
 7 files changed, 702 insertions(+), 90 deletions(-)
 create mode 100644 lib/sched/rte_pie.c
 create mode 100644 lib/sched/rte_pie.h

diff --git a/drivers/net/softnic/rte_eth_softnic_tm.c 
b/drivers/net/softnic/rte_eth_softnic_tm.c
index 90baba15ce..e74092ce7f 100644
--- a/drivers/net/softnic/rte_eth_softnic_tm.c
+++ b/drivers/net/softnic/rte_eth_softnic_tm.c
@@ -420,7 +420,7 @@ pmd_tm_node_type_get(struct rte_eth_dev *dev,
return 0;
 }
 
-#ifdef RTE_SCHED_RED
+#ifdef RTE_SCHED_CMAN
 #define WRED_SUPPORTED 1
 #else
 #define WRED_SUPPORTED 0
@@ -2306,7 +2306,7 @@ tm_tc_wred_profile_get(struct rte_eth_dev *dev, uint32_t 
tc_id)
return NULL;
 }
 
-#ifdef RTE_SCHED_RED
+#ifdef RTE_SCHED_CMAN
 
 static void
 wred_profiles_set(struct rte_eth_dev *dev, uint32_t subport_id)
@@ -2321,7 +2321,7 @@ wred_profiles_set(struct rte_eth_dev *dev, uint32_t 
subport_id)
for (tc_id = 0; tc_id < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; tc_id++)
for (color = RTE_COLOR_GREEN; color < RTE_COLORS; color++) {
struct rte_red_params *dst =
-   &pp->red_params[tc_id][color];
+   &pp->cman_params->red_params[tc_id][color];
struct tm_wred_profile *src_wp =
tm_tc_wred_profile_get(dev, tc_id);
struct rte_tm_red_params *src =
diff --git a/lib/sched/meson.build b/lib/sched/meson.build
index 8ced4547aa..df75db51ed 100644
--- a/lib/sched/meson.build
+++ b/lib/sched/meson.build
@@ -7,11 +7,12 @@ if is_windows
 subdir_done()
 endif
 
-sources = files('rte_sched.c', 'rte_red.c', 'rte_approx.c')
+sources = files('rte_sched.c', 'rte_red.c', 'rte_approx.c', 'rte_pie.c')
 headers = files(
 'rte_approx.h',
 'rte_red.h',
 'rte_sched.h',
 'rte_sched_common.h',
+'rte_pie.h',
 )
 deps += ['mbuf', 'meter']
diff --git a/lib/sched/rte_pie.c b/lib/sched/rte_pie.c
new file mode 100644
index 00..2fcecb2db4
--- /dev/null
+++ b/lib/sched/rte_pie.c
@@ -0,0 +1,82 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2020 Intel Corporation
+ */
+
+#include 
+
+#include "rte_pie.h"
+#include 
+#include 
+#include 
+
+#ifdef __INTEL_COMPILER
+#pragma warning(disable:2259) /* conversion may lose significant bits */
+#endif
+
+void
+rte_pie_rt_data_init(struct rte_pie *pie)
+{
+   if (pie == NULL) {
+   /* Allocate memory to use the PIE data structure */
+   pie = rte_malloc(NULL, sizeof(struct rte_pie), 0);
+
+   if (pie == NULL)
+   RTE_LOG(ERR, SCHED, "%s: Memory allocation fails\n", 
__func__);
+   }
+
+   pie->active = 0;
+   pie->in_measurement = 0;
+   pie->departed_bytes_count = 0;
+   pie->start_measurement = 0;
+   pie->last_measurement = 0;
+   pie->qlen = 0;
+   pie->avg_dq_time = 0;
+   pie->burst_allowance = 0;
+   pie->qdelay_old = 0;
+   pie->drop_prob = 0;
+   pie->accu_prob = 0;
+}
+
+int
+rte_pie_config_init(struct rte_pie_config *pie_cfg,
+   const uint16_t qdelay_ref,
+   const uint16_t dp_update_interval,
+   const uint16_t max_burst,
+   const uint16_t tailq_th)
+{
+   uint64_t tsc_hz = rte_get_tsc_hz();
+
+   if (pie_cfg == NULL)
+   return -1;
+
+   if (qdelay_ref <= 0) {
+   RTE_LOG(ERR, SCHED,
+   "%s: Incorrect value for qdelay_ref\n", __func__);
+   return -EINVAL;
+   }
+
+   if (dp_update_interval <= 0) {
+   RTE_LOG(ERR, SCHED,
+   "%s: Incorrect value for dp_update_interval\n", 
__func__);
+   return -EINVAL;
+   }
+
+   if (max_burst <= 0) {
+   RTE_LOG(ERR, SCHED,
+   "%s: Incorrect value for max_burst\n", __func__);
+   return -EINVAL;
+   }
+
+   if (tailq_th <= 0) {
+   RTE_LOG(ERR, SCHED,
+  

[dpdk-dev] [PATCH v18 2/5] example/qos_sched: add PIE support

2021-10-25 Thread Liguzinski, WojciechX
Add support for enabling PIE or RED by
parsing the config file.

Signed-off-by: Liguzinski, WojciechX 
---
 config/rte_config.h|   1 -
 examples/qos_sched/cfg_file.c  | 127 +++--
 examples/qos_sched/cfg_file.h  |   5 +
 examples/qos_sched/init.c  |  27 +++--
 examples/qos_sched/main.h  |   3 +
 examples/qos_sched/profile.cfg | 196 ++---
 6 files changed, 250 insertions(+), 109 deletions(-)

diff --git a/config/rte_config.h b/config/rte_config.h
index e0ead8b251..740f42c7e9 100644
--- a/config/rte_config.h
+++ b/config/rte_config.h
@@ -90,7 +90,6 @@
 #define RTE_MAX_LCORE_FREQS 64
 
 /* rte_sched defines */
-#undef RTE_SCHED_RED
 #undef RTE_SCHED_COLLECT_STATS
 #undef RTE_SCHED_SUBPORT_TC_OV
 #define RTE_SCHED_PORT_N_GRINDERS 8
diff --git a/examples/qos_sched/cfg_file.c b/examples/qos_sched/cfg_file.c
index cd167bd8e6..450482f07d 100644
--- a/examples/qos_sched/cfg_file.c
+++ b/examples/qos_sched/cfg_file.c
@@ -229,6 +229,40 @@ cfg_load_subport_profile(struct rte_cfgfile *cfg,
return 0;
 }
 
+#ifdef RTE_SCHED_CMAN
+void set_subport_cman_params(struct rte_sched_subport_params *subport_p,
+   struct rte_sched_cman_params cman_p)
+{
+   int j, k;
+   subport_p->cman_params->cman_mode = cman_p.cman_mode;
+
+   for (j = 0; j < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; j++) {
+   if (subport_p->cman_params->cman_mode ==
+   RTE_SCHED_CMAN_RED) {
+   for (k = 0; k < RTE_COLORS; k++) {
+   subport_p->cman_params->red_params[j][k].min_th 
=
+   cman_p.red_params[j][k].min_th;
+   subport_p->cman_params->red_params[j][k].max_th 
=
+   cman_p.red_params[j][k].max_th;
+   
subport_p->cman_params->red_params[j][k].maxp_inv =
+   cman_p.red_params[j][k].maxp_inv;
+   
subport_p->cman_params->red_params[j][k].wq_log2 =
+   cman_p.red_params[j][k].wq_log2;
+   }
+   } else {
+   subport_p->cman_params->pie_params[j].qdelay_ref =
+   cman_p.pie_params[j].qdelay_ref;
+   
subport_p->cman_params->pie_params[j].dp_update_interval =
+   cman_p.pie_params[j].dp_update_interval;
+   subport_p->cman_params->pie_params[j].max_burst =
+   cman_p.pie_params[j].max_burst;
+   subport_p->cman_params->pie_params[j].tailq_th =
+   cman_p.pie_params[j].tailq_th;
+   }
+   }
+}
+#endif
+
 int
 cfg_load_subport(struct rte_cfgfile *cfg, struct rte_sched_subport_params *subport_params)
 {
@@ -242,25 +276,26 @@ cfg_load_subport(struct rte_cfgfile *cfg, struct rte_sched_subport_params *subpo
memset(active_queues, 0, sizeof(active_queues));
n_active_queues = 0;
 
-#ifdef RTE_SCHED_RED
-   char sec_name[CFG_NAME_LEN];
-   struct rte_red_params red_params[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE][RTE_COLORS];
+#ifdef RTE_SCHED_CMAN
+   struct rte_sched_cman_params cman_params = {
+   .cman_mode = RTE_SCHED_CMAN_RED,
+   .red_params = { },
+   };
 
-   snprintf(sec_name, sizeof(sec_name), "red");
-
-   if (rte_cfgfile_has_section(cfg, sec_name)) {
+   if (rte_cfgfile_has_section(cfg, "red")) {
+   cman_params.cman_mode = RTE_SCHED_CMAN_RED;
 
for (i = 0; i < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; i++) {
char str[32];
 
-   /* Parse WRED min thresholds */
-   snprintf(str, sizeof(str), "tc %d wred min", i);
-   entry = rte_cfgfile_get_entry(cfg, sec_name, str);
+   /* Parse RED min thresholds */
+   snprintf(str, sizeof(str), "tc %d red min", i);
+   entry = rte_cfgfile_get_entry(cfg, "red", str);
if (entry) {
char *next;
/* for each packet colour (green, yellow, red) */
for (j = 0; j < RTE_COLORS; j++) {
-   red_params[i][j].min_th
+   cman_params.red_params[i][j].min_th
= (uint16_t)strtol(entry, &next, 10);
if (next == NULL)
break;
@@ -268,14 +303,14 @@ cfg_load_subport(struct rte_cfgfile *cfg, struct rte_sched_subport_params *subpo
}
}
 
-   /* Parse WRED m

[dpdk-dev] [PATCH v18 3/5] example/ip_pipeline: add PIE support

2021-10-25 Thread Liguzinski, WojciechX
Add PIE support to the IP Pipeline example.

Signed-off-by: Liguzinski, WojciechX 
---
 examples/ip_pipeline/tmgr.c | 142 +++-
 1 file changed, 74 insertions(+), 68 deletions(-)

diff --git a/examples/ip_pipeline/tmgr.c b/examples/ip_pipeline/tmgr.c
index e4e364cbc0..b138e885cf 100644
--- a/examples/ip_pipeline/tmgr.c
+++ b/examples/ip_pipeline/tmgr.c
@@ -17,6 +17,77 @@ static uint32_t n_subport_profiles;
 static struct rte_sched_pipe_params
pipe_profile[TMGR_PIPE_PROFILE_MAX];
 
+#ifdef RTE_SCHED_CMAN
+static struct rte_sched_cman_params cman_params = {
+   .red_params = {
+   /* Traffic Class 0 Colors Green / Yellow / Red */
+   [0][0] = {.min_th = 48, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+   [0][1] = {.min_th = 40, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+   [0][2] = {.min_th = 32, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+
+   /* Traffic Class 1 - Colors Green / Yellow / Red */
+   [1][0] = {.min_th = 48, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+   [1][1] = {.min_th = 40, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+   [1][2] = {.min_th = 32, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+
+   /* Traffic Class 2 - Colors Green / Yellow / Red */
+   [2][0] = {.min_th = 48, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+   [2][1] = {.min_th = 40, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+   [2][2] = {.min_th = 32, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+
+   /* Traffic Class 3 - Colors Green / Yellow / Red */
+   [3][0] = {.min_th = 48, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+   [3][1] = {.min_th = 40, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+   [3][2] = {.min_th = 32, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+
+   /* Traffic Class 4 - Colors Green / Yellow / Red */
+   [4][0] = {.min_th = 48, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+   [4][1] = {.min_th = 40, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+   [4][2] = {.min_th = 32, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+
+   /* Traffic Class 5 - Colors Green / Yellow / Red */
+   [5][0] = {.min_th = 48, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+   [5][1] = {.min_th = 40, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+   [5][2] = {.min_th = 32, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+
+   /* Traffic Class 6 - Colors Green / Yellow / Red */
+   [6][0] = {.min_th = 48, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+   [6][1] = {.min_th = 40, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+   [6][2] = {.min_th = 32, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+
+   /* Traffic Class 7 - Colors Green / Yellow / Red */
+   [7][0] = {.min_th = 48, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+   [7][1] = {.min_th = 40, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+   [7][2] = {.min_th = 32, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+
+   /* Traffic Class 8 - Colors Green / Yellow / Red */
+   [8][0] = {.min_th = 48, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+   [8][1] = {.min_th = 40, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+   [8][2] = {.min_th = 32, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+
+   /* Traffic Class 9 - Colors Green / Yellow / Red */
+   [9][0] = {.min_th = 48, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+   [9][1] = {.min_th = 40, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+   [9][2] = {.min_th = 32, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+
+   /* Traffic Class 10 - Colors Green / Yellow / Red */
+   [10][0] = {.min_th = 48, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+   [10][1] = {.min_th = 40, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+   [10][2] = {.min_th = 32, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+
+   /* Traffic Class 11 - Colors Green / Yellow / Red */
+   [11][0] = {.min_th = 48, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+   [11][1] = {.min_th = 40, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+   [11][2] = {.min_th = 32, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+
+   /* Traffic Class 12 - Colors Green / Yellow / Red */
+   [12][0] = {.min_th = 48, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+   [12][1] = {.min_th = 40, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+   [12][2] = {.min_th = 32, .max_th = 64, .maxp_inv = 10, .wq_log2 = 9},
+   },
+};
+#endif /* RTE_SCHED_CMAN */
+
 static uint32_t n_pipe_profiles;
 
 static const str

[dpdk-dev] [PATCH v18 4/5] doc/guides/prog_guide: added PIE

2021-10-25 Thread Liguzinski, WojciechX
Added PIE related information to documentation.

Signed-off-by: Liguzinski, WojciechX 
---
 doc/guides/prog_guide/glossary.rst   |  3 +
 doc/guides/prog_guide/qos_framework.rst  | 64 +---
 doc/guides/prog_guide/traffic_management.rst | 13 +++-
 3 files changed, 68 insertions(+), 12 deletions(-)

diff --git a/doc/guides/prog_guide/glossary.rst b/doc/guides/prog_guide/glossary.rst
index 7044a7df2a..fb0910ba5b 100644
--- a/doc/guides/prog_guide/glossary.rst
+++ b/doc/guides/prog_guide/glossary.rst
@@ -158,6 +158,9 @@ PCI
 PHY
An abbreviation for the physical layer of the OSI model.
 
+PIE
+   Proportional Integral Controller Enhanced (RFC8033)
+
 pktmbuf
An *mbuf* carrying a network packet.
 
diff --git a/doc/guides/prog_guide/qos_framework.rst b/doc/guides/prog_guide/qos_framework.rst
index 3b8a1184b0..7c37b78804 100644
--- a/doc/guides/prog_guide/qos_framework.rst
+++ b/doc/guides/prog_guide/qos_framework.rst
@@ -56,7 +56,8 @@ A functional description of each block is provided in the 
following table.
|   ||  
  |

+---+++
| 7 | Dropper| Congestion management using the Random Early 
Detection (RED) algorithm |
-   |   || (specified by the Sally Floyd - Van Jacobson 
paper) or Weighted RED (WRED).|
+   |   || (specified by the Sally Floyd - Van Jacobson 
paper) or Weighted RED (WRED) |
+   |   || or Proportional Integral Controller Enhanced 
(PIE).|
|   || Drop packets based on the current scheduler 
queue load level and packet|
|   || priority. When congestion is experienced, 
lower priority packets are dropped   |
|   || first.   
  |
@@ -421,7 +422,7 @@ No input packet can be part of more than one pipeline stage at a given time.
 The congestion management scheme implemented by the enqueue pipeline described above is very basic:
 packets are enqueued until a specific queue becomes full,
 then all the packets destined to the same queue are dropped until packets are consumed (by the dequeue operation).
-This can be improved by enabling RED/WRED as part of the enqueue pipeline which looks at the queue occupancy and
+This can be improved by enabling RED/WRED or PIE as part of the enqueue pipeline which looks at the queue occupancy and
 packet priority in order to yield the enqueue/drop decision for a specific packet
 (as opposed to enqueuing all packets / dropping all packets indiscriminately).
 
@@ -1155,13 +1156,13 @@ If the number of queues is small,
 then the performance of the port scheduler for the same level of active traffic is expected to be worse than
 the performance of a small set of message passing queues.
 
-.. _Dropper:
+.. _Droppers:
 
-Dropper

+Droppers
+
 
 The purpose of the DPDK dropper is to drop packets arriving at a packet scheduler to avoid congestion.
-The dropper supports the Random Early Detection (RED),
+The dropper supports the Proportional Integral Controller Enhanced (PIE), Random Early Detection (RED),
 Weighted Random Early Detection (WRED) and tail drop algorithms.
 :numref:`figure_blk_diag_dropper` illustrates how the dropper integrates with the scheduler.
 The DPDK currently does not support congestion management
@@ -1174,9 +1175,13 @@ so the dropper provides the only method for congestion avoidance.
High-level Block Diagram of the DPDK Dropper
 
 
-The dropper uses the Random Early Detection (RED) congestion avoidance algorithm as documented in the reference publication.
-The purpose of the RED algorithm is to monitor a packet queue,
+The dropper uses one of two congestion avoidance algorithms:
+   - the Random Early Detection (RED) as documented in the reference publication.
+   - the Proportional Integral Controller Enhanced (PIE) as documented in RFC8033 publication.
+
+The purpose of the RED/PIE algorithm is to monitor a packet queue,
 determine the current congestion level in the queue and decide whether an arriving packet should be enqueued or dropped.
+
 The RED algorithm uses an Exponential Weighted Moving Average (EWMA) filter to compute average queue size which
 gives an indication of the current congestion level in the queue.
 
@@ -1192,7 +1197,7 @@ This occurs when a packet queue has reached maximum capacity and cannot store an
 In this situation, all arriving packets are dropped.
 
 The flow through the dropper is illustrated in :numref:`figure_flow_tru_droppper`.
-The RED/WRED algorithm is exercised first and tail drop second.
+The RED/WRED/PIE algorithm is exercised first and tail drop

[dpdk-dev] [PATCH v18 5/5] app/test: add tests for PIE

2021-10-25 Thread Liguzinski, WojciechX
Tests for PIE code added to test application.

Signed-off-by: Liguzinski, WojciechX 
---
 app/test/meson.build |4 +
 app/test/test_pie.c  | 1065 ++
 lib/sched/rte_pie.c  |6 +-
 lib/sched/rte_pie.h  |   17 +-
 4 files changed, 1085 insertions(+), 7 deletions(-)
 create mode 100644 app/test/test_pie.c

diff --git a/app/test/meson.build b/app/test/meson.build
index 20f36a1803..2ac716629b 100644
--- a/app/test/meson.build
+++ b/app/test/meson.build
@@ -115,6 +115,7 @@ test_sources = files(
 'test_reciprocal_division.c',
 'test_reciprocal_division_perf.c',
 'test_red.c',
+'test_pie.c',
 'test_reorder.c',
 'test_rib.c',
 'test_rib6.c',
@@ -249,6 +250,7 @@ fast_tests = [
 ['prefetch_autotest', true],
 ['rcu_qsbr_autotest', true],
 ['red_autotest', true],
+['pie_autotest', true],
 ['rib_autotest', true],
 ['rib6_autotest', true],
 ['ring_autotest', true],
@@ -300,6 +302,7 @@ perf_test_names = [
 'fib_slow_autotest',
 'fib_perf_autotest',
 'red_all',
+'pie_all',
 'barrier_autotest',
 'hash_multiwriter_autotest',
 'timer_racecond_autotest',
@@ -313,6 +316,7 @@ perf_test_names = [
 'fib6_perf_autotest',
 'rcu_qsbr_perf_autotest',
 'red_perf',
+'pie_perf',
 'distributor_perf_autotest',
 'pmd_perf_autotest',
 'stack_perf_autotest',
diff --git a/app/test/test_pie.c b/app/test/test_pie.c
new file mode 100644
index 00..dfa69d1c7e
--- /dev/null
+++ b/app/test/test_pie.c
@@ -0,0 +1,1065 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2010-2014 Intel Corporation
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "test.h"
+
+#include 
+
+#ifdef __INTEL_COMPILER
+#pragma warning(disable:2259)   /* conversion may lose significant bits */
+#pragma warning(disable:181)/* Arg incompatible with format string */
+#endif
+
+/**< structures for testing rte_pie performance and function */
+struct test_rte_pie_config {/**< Test structure for RTE_PIE config */
+   struct rte_pie_config *pconfig; /**< RTE_PIE configuration parameters */
+   uint8_t num_cfg; /**< Number of RTE_PIE configs to test */
+   uint16_t qdelay_ref; /**< Latency Target (milliseconds) */
+   uint16_t *dp_update_interval;   /**< Update interval for drop probability
+ * (milliseconds)
+ */
+   uint16_t *max_burst; /**< Max Burst Allowance (milliseconds) */
+   uint16_t tailq_th;  /**< Tailq drop threshold (packet counts) */
+};
+
+struct test_queue { /**< Test structure for RTE_PIE Queues */
+   struct rte_pie *pdata_in;   /**< RTE_PIE runtime data input */
+   struct rte_pie *pdata_out;  /**< RTE_PIE runtime data output */
+   uint32_t num_queues; /**< Number of RTE_PIE queues to test */
+   uint32_t *qlen; /**< Queue size */
+   uint32_t q_ramp_up; /**< Num of enqueues to ramp up the queue */
+   double drop_tolerance;  /**< Drop tolerance of packets not enqueued */
+};
+
+struct test_var {   /**< Test variables used for testing RTE_PIE */
+   uint32_t num_iterations;/**< Number of test iterations */
+   uint32_t num_ops;   /**< Number of test operations */
+   uint64_t clk_freq;  /**< CPU clock frequency */
+   uint32_t *dropped;  /**< Test operations dropped */
+   uint32_t *enqueued; /**< Test operations enqueued */
+   uint32_t *dequeued; /**< Test operations dequeued */
+};
+
+struct test_config {/**< Primary test structure for RTE_PIE */
+   const char *ifname; /**< Interface name */
+   const char *msg;/**< Test message for display */
+   const char *htxt;   /**< Header txt display for result output */
+   struct test_rte_pie_config *tconfig; /**< Test structure for RTE_PIE config */
+   struct test_queue *tqueue;  /**< Test structure for RTE_PIE Queues */
+   struct test_var *tvar;  /**< Test variables used for testing RTE_PIE */
+   uint32_t *tlevel;   /**< Queue levels */
+};
+
+enum test_result {
+   FAIL = 0,
+   PASS
+};
+
+/**< Test structure to define tests to run */
+struct tests {
+   struct test_config *testcfg;
+   enum test_result (*testfn)(struct test_config *cfg);
+};
+
+struct rdtsc_prof {
+   uint64_t clk_start;
+   uint64_t clk_min;   /**< min clocks */
+   uint64_t clk_max;   /**< max clocks */
+   uint64_t clk_avgc;  /**< cou

Re: [dpdk-dev] [dpdk-stable] [PATCH] pipeline: fix instruction label check

2021-10-25 Thread Thomas Monjalon
21/10/2021 05:23, Yogesh Jangra:
> The instruction_data array was incorrectly indexed, which resulted in
> the array index getting out of bounds and sometimes segfault.
> 
> Fixes: a1711f (“pipeline: add SWX Rx and extract instructions“)
> Cc: sta...@dpdk.org
> 
> Signed-off-by: Yogesh Jangra 
> Acked-by: Cristian Dumitrescu 

Applied, thanks.





[dpdk-dev] [PATCH] eal/windows: fix IOVA mode detection and handling

2021-10-25 Thread Dmitry Kozlyuk
Windows EAL did not detect IOVA mode and worked incorrectly
if physical addresses could not be obtained
(if virt2phys driver was missing or inaccessible).
In this case, rte_mem_virt2iova() reported RTE_BAD_IOVA for any address.
Inability to obtain IOVA, be it PA or VA, should cause a failure
for the DPDK allocator, but it was hidden by the implementation,
so allocations did not fail when they should.
The mode when DPDK cannot obtain PA but can work is IOVA-as-VA mode.
However, rte_eal_iova_mode() always returned RTE_IOVA_DC
(while it should only ever return RTE_IOVA_PA or RTE_IOVA_VA),
because IOVA mode detection was not implemented.

Implement IOVA mode detection:
1. Always allow forcing --iova-mode=va.
2. Allow forcing --iova-mode=pa only if virt2phys is available.
3. If no mode is forced and virt2phys is available,
   select the mode according to bus requests, default to PA.
4. If no mode is forced but virt2phys is unavailable, default to VA.
Fix rte_mem_virt2iova() by returning VA when using IOVA-as-VA.
Fix rte_eal_iova_mode() by returning the selected mode.

Fixes: 2a5d547a4a9b ("eal/windows: implement basic memory management")
Cc: sta...@dpdk.org

Reported-by: Tal Shnaiderman 
Signed-off-by: Dmitry Kozlyuk 
---
Fixes tag points to the commit that introduced the wrong behavior.
Commit fec28ca0e3a9 ("net/mlx5: support mempool registration")
exposed it, because since commit 11541c5c81dd ("mempool: add non-IO flag")
RTE_MEMPOOL_F_NON_IO was mistakenly set for all mempools on Windows
when virt2phys was not available.

 lib/eal/windows/eal.c  | 63 --
 lib/eal/windows/eal_memalloc.c | 15 +++-
 lib/eal/windows/eal_memory.c   |  6 ++--
 3 files changed, 51 insertions(+), 33 deletions(-)

diff --git a/lib/eal/windows/eal.c b/lib/eal/windows/eal.c
index 3d8c520412..f7ce1b6671 100644
--- a/lib/eal/windows/eal.c
+++ b/lib/eal/windows/eal.c
@@ -276,6 +276,8 @@ rte_eal_init(int argc, char **argv)
const struct rte_config *config = rte_eal_get_configuration();
struct internal_config *internal_conf =
eal_get_internal_configuration();
+   bool has_phys_addr;
+   enum rte_iova_mode iova_mode;
int ret;
 
eal_log_init(NULL, 0);
@@ -322,18 +324,59 @@ rte_eal_init(int argc, char **argv)
internal_conf->memory = MEMSIZE_IF_NO_HUGE_PAGE;
}
 
+   if (rte_eal_intr_init() < 0) {
+   rte_eal_init_alert("Cannot init interrupt-handling thread");
+   return -1;
+   }
+
+   if (rte_eal_timer_init() < 0) {
+   rte_eal_init_alert("Cannot init TSC timer");
+   rte_errno = EFAULT;
+   return -1;
+   }
+
+   bscan = rte_bus_scan();
+   if (bscan < 0) {
+   rte_eal_init_alert("Cannot scan the buses");
+   rte_errno = ENODEV;
+   return -1;
+   }
+
if (eal_mem_win32api_init() < 0) {
rte_eal_init_alert("Cannot access Win32 memory management");
rte_errno = ENOTSUP;
return -1;
}
 
+   has_phys_addr = true;
if (eal_mem_virt2iova_init() < 0) {
/* Non-fatal error if physical addresses are not required. */
-   RTE_LOG(WARNING, EAL, "Cannot access virt2phys driver, "
+   RTE_LOG(DEBUG, EAL, "Cannot access virt2phys driver, "
"PA will not be available\n");
+   has_phys_addr = false;
}
 
+   iova_mode = internal_conf->iova_mode;
+   if (iova_mode == RTE_IOVA_PA && !has_phys_addr) {
+   rte_eal_init_alert("Cannot use IOVA as 'PA' since physical addresses are not available");
+   rte_errno = EINVAL;
+   return -1;
+   }
+   if (iova_mode == RTE_IOVA_DC) {
+   RTE_LOG(DEBUG, EAL, "Specific IOVA mode is not requested, autodetecting\n");
+   if (has_phys_addr) {
+   RTE_LOG(DEBUG, EAL, "Selecting IOVA mode according to bus requests\n");
+   iova_mode = rte_bus_get_iommu_class();
+   if (iova_mode == RTE_IOVA_DC)
+   iova_mode = RTE_IOVA_PA;
+   } else {
+   iova_mode = RTE_IOVA_VA;
+   }
+   }
+   RTE_LOG(DEBUG, EAL, "Selected IOVA mode '%s'\n",
+   iova_mode == RTE_IOVA_PA ? "PA" : "VA");
+   rte_eal_get_configuration()->iova_mode = iova_mode;
+
if (rte_eal_memzone_init() < 0) {
rte_eal_init_alert("Cannot init memzone");
rte_errno = ENODEV;
@@ -358,27 +401,9 @@ rte_eal_init(int argc, char **argv)
return -1;
}
 
-   if (rte_eal_intr_init() < 0) {
-   rte_eal_init_alert("Cannot init interrupt-handling thread");
-   return -1;
-   }
-
-   if (rte_eal_timer_init() < 0) {
-   rte_eal_init_alert("Cannot init TS

Re: [dpdk-dev] [PATCH] port: configure loop count for source port

2021-10-25 Thread Thomas Monjalon
17/09/2021 12:32, Yogesh Jangra:
> Add support for configurable number of loops through the input PCAP
> file for the source port. Added an additional parameter to source
> port CLI command.
> 
> Signed-off-by: Yogesh Jangra 
> Acked-by: Cristian Dumitrescu 

Applied, thanks.






[dpdk-dev] [PATCH] net: remove endianness annotations for L2TPv2 bitfields

2021-10-25 Thread David Marchand
Endianness is already handled by the checks on RTE_BYTE_ORDER.
Marking bitfields with endianness types is at best unneeded; at worst, it
breaks the build with OVS (with sparse enabled).

Example:
../../lib/ofp-packet.c: note: in included file (through
  .../ovs/dpdk-dir/build/include/rte_flow.h, ../../lib/netdev-dpdk.h,
  ../../lib/dp-packet.h):
.../ovs/dpdk-dir/build/include/rte_l2tpv2.h:92:37:
  error: invalid bitfield specifier for type restricted ovs_be16.
.../ovs/dpdk-dir/build/include/rte_l2tpv2.h:93:37:
  error: invalid bitfield specifier for type restricted ovs_be16.
.../ovs/dpdk-dir/build/include/rte_l2tpv2.h:94:40:
  error: invalid bitfield specifier for type restricted ovs_be16.
.../ovs/dpdk-dir/build/include/rte_l2tpv2.h:95:37:
  error: invalid bitfield specifier for type restricted ovs_be16.
.../ovs/dpdk-dir/build/include/rte_l2tpv2.h:96:40:
  error: invalid bitfield specifier for type restricted ovs_be16.
.../ovs/dpdk-dir/build/include/rte_l2tpv2.h:97:37:
  error: invalid bitfield specifier for type restricted ovs_be16.
.../ovs/dpdk-dir/build/include/rte_l2tpv2.h:98:37:
  error: invalid bitfield specifier for type restricted ovs_be16.
.../ovs/dpdk-dir/build/include/rte_l2tpv2.h:99:40:
  error: invalid bitfield specifier for type restricted ovs_be16.
.../ovs/dpdk-dir/build/include/rte_l2tpv2.h:100:39:
  error: invalid bitfield specifier for type restricted ovs_be16.
make[3]: *** [lib/ofp-packet.lo] Error 1

Fixes: 3a929df1f286 ("ethdev: support L2TPv2 and PPP procotol")

Signed-off-by: David Marchand 
---
 lib/net/rte_l2tpv2.h | 36 ++--
 1 file changed, 18 insertions(+), 18 deletions(-)

diff --git a/lib/net/rte_l2tpv2.h b/lib/net/rte_l2tpv2.h
index 670fe5470e..b90e36cf12 100644
--- a/lib/net/rte_l2tpv2.h
+++ b/lib/net/rte_l2tpv2.h
@@ -89,25 +89,25 @@ struct rte_l2tpv2_common_hdr {
__extension__
struct {
 #if RTE_BYTE_ORDER == RTE_LITTLE_ENDIAN
-   rte_be16_t t:1; /**< message Type */
-   rte_be16_t l:1; /**< length option bit */
-   rte_be16_t res1:2;  /**< reserved */
-   rte_be16_t s:1; /**< ns/nr option bit */
-   rte_be16_t res2:1;  /**< reserved */
-   rte_be16_t o:1; /**< offset option bit */
-   rte_be16_t p:1; /**< priority option bit */
-   rte_be16_t res3:4;  /**< reserved */
-   rte_be16_t ver:4;   /**< protocol version */
+   uint16_t t:1;   /**< message Type */
+   uint16_t l:1;   /**< length option bit */
+   uint16_t res1:2;/**< reserved */
+   uint16_t s:1;   /**< ns/nr option bit */
+   uint16_t res2:1;/**< reserved */
+   uint16_t o:1;   /**< offset option bit */
+   uint16_t p:1;   /**< priority option bit */
+   uint16_t res3:4;/**< reserved */
+   uint16_t ver:4; /**< protocol version */
 #elif RTE_BYTE_ORDER == RTE_BIG_ENDIAN
-   rte_be16_t ver:4;   /**< protocol version */
-   rte_be16_t res3:4;  /**< reserved */
-   rte_be16_t p:1; /**< priority option bit */
-   rte_be16_t o:1; /**< offset option bit */
-   rte_be16_t res2:1;  /**< reserved */
-   rte_be16_t s:1; /**< ns/nr option bit */
-   rte_be16_t res1:2;  /**< reserved */
-   rte_be16_t l:1; /**< length option bit */
-   rte_be16_t t:1; /**< message Type */
+   uint16_t ver:4; /**< protocol version */
+   uint16_t res3:4;/**< reserved */
+   uint16_t p:1;   /**< priority option bit */
+   uint16_t o:1;   /**< offset option bit */
+   uint16_t res2:1;/**< reserved */
+   uint16_t s:1;   /**< ns/nr option bit */
+   uint16_t res1:2;/**< reserved */
+   uint16_t l:1;   /**< length option bit */
+   uint16_t t:1;   /**< message Type */
 #endif
};
};
-- 
2.23.0



[dpdk-dev] [PATCH v3 0/5] Support MLX5 crypto driver on Windows

2021-10-25 Thread Tal Shnaiderman
Support the MLX5 crypto driver on Windows OS by moving the driver's
control path communication with the Kernel to be OS agnostic.
---
v3: Remove code which was already introduced in previous patches.
Rebase on master and remove "Depends-on" message.
v2: Split build change for mlx5 only and the rest of the drivers [AkhilG]
---

Tal Shnaiderman (5):
  common/mlx5: add DV enums to Windows defs file
  crypto/mlx5: modify unix pthread code
  crypto/mlx5: fix size of UMR WQE
  build: check Windows support per driver
  crypto/mlx5: support on Windows

 doc/guides/cryptodevs/mlx5.rst   | 15 ---
 doc/guides/rel_notes/release_21_11.rst   |  1 +
 drivers/common/mlx5/version.map  |  2 +-
 drivers/common/mlx5/windows/mlx5_common_os.c |  2 +-
 drivers/common/mlx5/windows/mlx5_win_defs.h  | 12 
 drivers/crypto/armv8/meson.build |  6 ++
 drivers/crypto/bcmfs/meson.build |  6 ++
 drivers/crypto/ccp/meson.build   |  1 +
 drivers/crypto/ipsec_mb/meson.build  |  6 ++
 drivers/crypto/meson.build   |  3 ---
 drivers/crypto/mlx5/meson.build  |  4 ++--
 drivers/crypto/mlx5/mlx5_crypto.c|  8 ++--
 drivers/crypto/mvsam/meson.build |  6 ++
 drivers/crypto/null/meson.build  |  6 ++
 drivers/crypto/octeontx/meson.build  |  6 ++
 drivers/crypto/openssl/meson.build   |  6 ++
 drivers/crypto/qat/meson.build   |  6 ++
 drivers/crypto/scheduler/meson.build |  6 ++
 drivers/crypto/virtio/meson.build|  6 ++
 19 files changed, 96 insertions(+), 12 deletions(-)

-- 
2.16.1.windows.4



[dpdk-dev] [PATCH v3 1/5] common/mlx5: add DV enums to Windows defs file

2021-10-25 Thread Tal Shnaiderman
Add the DV enums that are needed by the crypto PMD and are missing
on Windows.

Signed-off-by: Tal Shnaiderman 
Acked-by: Matan Azrad 
---
 drivers/common/mlx5/windows/mlx5_win_defs.h | 12 
 1 file changed, 12 insertions(+)

diff --git a/drivers/common/mlx5/windows/mlx5_win_defs.h b/drivers/common/mlx5/windows/mlx5_win_defs.h
index 47bfc907e7..9f709ff30d 100644
--- a/drivers/common/mlx5/windows/mlx5_win_defs.h
+++ b/drivers/common/mlx5/windows/mlx5_win_defs.h
@@ -93,6 +93,18 @@ enum {
MLX5_ETH_WQE_L4_CSUM = (1 << 7),
 };
 
+enum {
+   MLX5_WQE_CTRL_CQ_UPDATE = 2 << 2,
+   MLX5_WQE_CTRL_SOLICITED = 1 << 1,
+   MLX5_WQE_CTRL_FENCE = 4 << 5,
+   MLX5_WQE_CTRL_INITIATOR_SMALL_FENCE = 1 << 5,
+};
+
+enum {
+   MLX5_SEND_WQE_BB= 64,
+   MLX5_SEND_WQE_SHIFT = 6,
+};
+
 /*
  * RX Hash fields enable to set which incoming packet's field should
  * participates in RX Hash. Each flag represent certain packet's field,
-- 
2.16.1.windows.4



[dpdk-dev] [PATCH v3 2/5] crypto/mlx5: modify unix pthread code

2021-10-25 Thread Tal Shnaiderman
Remove the usage of PTHREAD_MUTEX_INITIALIZER, which is not
supported on Windows, and initialize priv_list_lock in RTE_INIT instead.
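
As a rough sketch of the pattern (not the driver code itself), the lock
can be left uninitialized at definition and set up in a function that
runs before main(). The sketch assumes that RTE_INIT behaves like a
GCC/Clang constructor function; the names list_lock, list_len,
list_lock_init() and list_add() are hypothetical.

```c
#include <assert.h>
#include <pthread.h>

/* No static initializer: the lock is set up at load time instead. */
static pthread_mutex_t list_lock;
static int list_len;

/* Runs before main(); stands in for an RTE_INIT function (assumption). */
__attribute__((constructor))
static void
list_lock_init(void)
{
	int ret = pthread_mutex_init(&list_lock, NULL);

	assert(ret == 0);
	(void)ret; /* keep the variable used when NDEBUG is defined */
}

/* Hypothetical user of the lock: counts entries under protection. */
static int
list_add(void)
{
	int len;

	pthread_mutex_lock(&list_lock);
	len = ++list_len;
	pthread_mutex_unlock(&list_lock);
	return len;
}
```

The trade-off is that the lock is only valid once constructors have run,
so code executing earlier than that must not take it.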

Signed-off-by: Tal Shnaiderman 
Acked-by: Matan Azrad 
---
 drivers/crypto/mlx5/mlx5_crypto.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/crypto/mlx5/mlx5_crypto.c b/drivers/crypto/mlx5/mlx5_crypto.c
index f430d8cde0..6bebc83c39 100644
--- a/drivers/crypto/mlx5/mlx5_crypto.c
+++ b/drivers/crypto/mlx5/mlx5_crypto.c
@@ -4,6 +4,7 @@
 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -33,7 +34,7 @@
 
 TAILQ_HEAD(mlx5_crypto_privs, mlx5_crypto_priv) mlx5_crypto_priv_list =
TAILQ_HEAD_INITIALIZER(mlx5_crypto_priv_list);
-static pthread_mutex_t priv_list_lock = PTHREAD_MUTEX_INITIALIZER;
+static pthread_mutex_t priv_list_lock;
 
 int mlx5_crypto_logtype;
 
@@ -967,6 +968,7 @@ static struct mlx5_class_driver mlx5_crypto_driver = {
 
 RTE_INIT(rte_mlx5_crypto_init)
 {
+   pthread_mutex_init(&priv_list_lock, NULL);
mlx5_common_init();
if (mlx5_glue != NULL)
mlx5_class_driver_register(&mlx5_crypto_driver);
-- 
2.16.1.windows.4



[dpdk-dev] [PATCH v3 4/5] build: check Windows support per driver

2021-10-25 Thread Tal Shnaiderman
Remove the Windows check from crypto/meson.build, which disabled
all crypto drivers when building for Windows.

Add an equivalent check to the meson.build file of each crypto PMD
that does not already enforce one, so that Windows support can be
enabled per driver where applicable.

Signed-off-by: Tal Shnaiderman 
Acked-by: Matan Azrad 
---
 drivers/crypto/armv8/meson.build | 6 ++
 drivers/crypto/bcmfs/meson.build | 6 ++
 drivers/crypto/ccp/meson.build   | 1 +
 drivers/crypto/ipsec_mb/meson.build  | 6 ++
 drivers/crypto/meson.build   | 3 ---
 drivers/crypto/mvsam/meson.build | 6 ++
 drivers/crypto/null/meson.build  | 6 ++
 drivers/crypto/octeontx/meson.build  | 6 ++
 drivers/crypto/openssl/meson.build   | 6 ++
 drivers/crypto/qat/meson.build   | 6 ++
 drivers/crypto/scheduler/meson.build | 6 ++
 drivers/crypto/virtio/meson.build| 6 ++
 12 files changed, 61 insertions(+), 3 deletions(-)

diff --git a/drivers/crypto/armv8/meson.build b/drivers/crypto/armv8/meson.build
index 40a4dbb7bb..5effba8bbc 100644
--- a/drivers/crypto/armv8/meson.build
+++ b/drivers/crypto/armv8/meson.build
@@ -1,6 +1,12 @@
 # SPDX-License-Identifier: BSD-3-Clause
 # Copyright(c) 2019 Arm Limited
 
+if is_windows
+build = false
+reason = 'not supported on Windows'
+subdir_done()
+endif
+
 dep = dependency('libAArch64crypto', required: false, method: 'pkg-config')
 if not dep.found()
 build = false
diff --git a/drivers/crypto/bcmfs/meson.build b/drivers/crypto/bcmfs/meson.build
index d67e78d51b..5842f83a3b 100644
--- a/drivers/crypto/bcmfs/meson.build
+++ b/drivers/crypto/bcmfs/meson.build
@@ -3,6 +3,12 @@
 # All rights reserved.
 #
 
+if is_windows
+build = false
+reason = 'not supported on Windows'
+subdir_done()
+endif
+
 deps += ['eal', 'bus_vdev']
 sources = files(
 'bcmfs_logs.c',
diff --git a/drivers/crypto/ccp/meson.build b/drivers/crypto/ccp/meson.build
index 0f82b9b90b..a4f3406009 100644
--- a/drivers/crypto/ccp/meson.build
+++ b/drivers/crypto/ccp/meson.build
@@ -4,6 +4,7 @@
 if not is_linux
 build = false
 reason = 'only supported on Linux'
+subdir_done()
 endif
 dep = dependency('libcrypto', required: false, method: 'pkg-config')
 if not dep.found()
diff --git a/drivers/crypto/ipsec_mb/meson.build b/drivers/crypto/ipsec_mb/meson.build
index d7037daea1..f3a34a60a8 100644
--- a/drivers/crypto/ipsec_mb/meson.build
+++ b/drivers/crypto/ipsec_mb/meson.build
@@ -1,6 +1,12 @@
 # SPDX-License-Identifier: BSD-3-Clause
 # Copyright(c) 2021 Intel Corporation
 
+if is_windows
+build = false
+reason = 'not supported on Windows'
+subdir_done()
+endif
+
 IMB_required_ver = '1.0.0'
 lib = cc.find_library('IPSec_MB', required: false)
 if not lib.found()
diff --git a/drivers/crypto/meson.build b/drivers/crypto/meson.build
index 2585471e93..59f02ea47c 100644
--- a/drivers/crypto/meson.build
+++ b/drivers/crypto/meson.build
@@ -1,9 +1,6 @@
 # SPDX-License-Identifier: BSD-3-Clause
 # Copyright(c) 2017 Intel Corporation
 
-if is_windows
-subdir_done()
-endif
 
 drivers = [
 'armv8',
diff --git a/drivers/crypto/mvsam/meson.build b/drivers/crypto/mvsam/meson.build
index fec167bf29..bf3c4323de 100644
--- a/drivers/crypto/mvsam/meson.build
+++ b/drivers/crypto/mvsam/meson.build
@@ -3,6 +3,12 @@
 # Copyright(c) 2018 Semihalf.
 # All rights reserved.
 
+if is_windows
+build = false
+reason = 'not supported on Windows'
+subdir_done()
+endif
+
 dep = dependency('libmusdk', required: false, method: 'pkg-config')
 if not dep.found()
 build = false
diff --git a/drivers/crypto/null/meson.build b/drivers/crypto/null/meson.build
index 1f7d644de1..acc16e7d81 100644
--- a/drivers/crypto/null/meson.build
+++ b/drivers/crypto/null/meson.build
@@ -1,5 +1,11 @@
 # SPDX-License-Identifier: BSD-3-Clause
 # Copyright(c) 2017 Intel Corporation
 
+if is_windows
+build = false
+reason = 'not supported on Windows'
+subdir_done()
+endif
+
 deps += 'bus_vdev'
 sources = files('null_crypto_pmd.c', 'null_crypto_pmd_ops.c')
diff --git a/drivers/crypto/octeontx/meson.build b/drivers/crypto/octeontx/meson.build
index bc6187e1cf..387727c6ab 100644
--- a/drivers/crypto/octeontx/meson.build
+++ b/drivers/crypto/octeontx/meson.build
@@ -8,6 +8,12 @@ if not is_linux or not dpdk_conf.get('RTE_ARCH_64')
 subdir_done()
 endif
 
+if is_windows
+build = false
+reason = 'not supported on Windows'
+subdir_done()
+endif
+
 deps += ['bus_pci']
 deps += ['bus_vdev']
 deps += ['common_cpt']
diff --git a/drivers/crypto/openssl/meson.build b/drivers/crypto/openssl/meson.build
index b21fca0be3..cd962da1d6 100644
--- a/drivers/crypto/openssl/meson.build
+++ b/drivers/crypto/openssl/meson.build
@@ -1,6 +1,12 @@
 # SPDX-License-Identifier: BSD-3-Clause
 # Copyright(c) 2017 Intel Corporation
 
+if is_windows
+build = false
+reason = 'not supported on Windows'
+subdir_done()
+endif
+
 dep = dependency('libcrypto', required: false, metho

Re: [dpdk-dev] [PATCH v2] pipeline: add support for action annotations

2021-10-25 Thread Thomas Monjalon
18/10/2021 03:22, Yogesh Jangra:
> Enable restricting the scope of an action to regular table entries or
> to the table default entry in order to support the P4 language
> tableonly or defaultonly annotations.
> 
> Signed-off-by: Yogesh Jangra 
> Acked-by: Cristian Dumitrescu 

Applied, thanks.





[dpdk-dev] [PATCH v3 3/5] crypto/mlx5: fix size of UMR WQE

2021-10-25 Thread Tal Shnaiderman
The size of the UMR WQE allocated object is decided by a sizeof
operation on the struct. However, since the struct contains
a union of flexible array members, the sizeof result can differ
between compilers.

GCC, for example, treats the union as zero-sized, while MSVC adds
16 bits of padding.

To resolve the ambiguity, the allocation size is calculated
from the sizes of the members, excluding the flexible union.
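
The idea can be illustrated with a minimal sketch. It is not the mlx5 code:
the demo_* types and their sizes are hypothetical stand-ins, and a single
flexible array member is used in place of the driver's union so that the
example stays within standard C.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical fixed-size segments; the byte sizes are illustrative only. */
struct demo_ctrl_seg { unsigned char raw[16]; };
struct demo_umr_seg  { unsigned char raw[48]; };
struct demo_mkey_seg { unsigned char raw[64]; };
struct demo_data_seg { unsigned char raw[16]; };

/* Fixed header followed by a variable-length tail. */
struct demo_wqe {
	struct demo_ctrl_seg ctrl;
	struct demo_umr_seg umr;
	struct demo_mkey_seg mkey;
	struct demo_data_seg segs[]; /* flexible array member */
};

/*
 * Sum the fixed members explicitly instead of using sizeof(struct demo_wqe),
 * so the result does not depend on how a compiler sizes the flexible tail.
 */
static size_t
demo_wqe_alloc_size(size_t nb_segs)
{
	return sizeof(struct demo_ctrl_seg) +
	       sizeof(struct demo_umr_seg) +
	       sizeof(struct demo_mkey_seg) +
	       nb_segs * sizeof(struct demo_data_seg);
}
```

With these assumed sizes, demo_wqe_alloc_size(4) evaluates to 192 bytes on
any conforming compiler, whatever sizeof(struct demo_wqe) reports.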

Fixes: a1978aa23bf4 ("crypto/mlx5: add maximum segments configuration")
Cc: sta...@dpdk.org

Signed-off-by: Tal Shnaiderman 
Acked-by: Matan Azrad 
---
 drivers/crypto/mlx5/mlx5_crypto.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/crypto/mlx5/mlx5_crypto.c b/drivers/crypto/mlx5/mlx5_crypto.c
index 6bebc83c39..07c2a9c68b 100644
--- a/drivers/crypto/mlx5/mlx5_crypto.c
+++ b/drivers/crypto/mlx5/mlx5_crypto.c
@@ -909,7 +909,9 @@ mlx5_crypto_dev_probe(struct mlx5_common_device *cdev)
priv->keytag = rte_cpu_to_be_64(devarg_prms.keytag);
priv->max_segs_num = devarg_prms.max_segs_num;
priv->umr_wqe_size = sizeof(struct mlx5_wqe_umr_bsf_seg) +
-sizeof(struct mlx5_umr_wqe) +
+sizeof(struct mlx5_wqe_cseg) +
+sizeof(struct mlx5_wqe_umr_cseg) +
+sizeof(struct mlx5_wqe_mkey_cseg) +
 RTE_ALIGN(priv->max_segs_num, 4) *
 sizeof(struct mlx5_wqe_dseg);
rdmw_wqe_size = sizeof(struct mlx5_rdma_write_wqe) +
-- 
2.16.1.windows.4



[dpdk-dev] [PATCH v3 5/5] crypto/mlx5: support on Windows

2021-10-25 Thread Tal Shnaiderman
Add support for the mlx5 crypto PMD on Windows.
Update the release notes and the PMD guide accordingly.

Signed-off-by: Tal Shnaiderman 
Acked-by: Matan Azrad 
---
 doc/guides/cryptodevs/mlx5.rst   | 15 ---
 doc/guides/rel_notes/release_21_11.rst   |  1 +
 drivers/common/mlx5/version.map  |  2 +-
 drivers/common/mlx5/windows/mlx5_common_os.c |  2 +-
 drivers/crypto/mlx5/meson.build  |  4 ++--
 5 files changed, 17 insertions(+), 7 deletions(-)

diff --git a/doc/guides/cryptodevs/mlx5.rst b/doc/guides/cryptodevs/mlx5.rst
index 68bfdf3a83..7338c0c493 100644
--- a/doc/guides/cryptodevs/mlx5.rst
+++ b/doc/guides/cryptodevs/mlx5.rst
@@ -39,12 +39,12 @@ or to access the hardware components directly.
 There are different levels of objects and bypassing abilities.
 To get the best performances:
 
-- Verbs is a complete high-level generic API.
-- Direct Verbs is a device-specific API.
+- Verbs is a complete high-level generic API (Linux only).
+- Direct Verbs is a device-specific API (Linux only).
 - DevX allows to access firmware objects.
 
 Enabling ``librte_crypto_mlx5`` causes DPDK applications
-to be linked against libibverbs.
+to be linked against libibverbs on Linux OS.
 
 In order to move the device to crypto operational mode, credential and KEK
 (Key Encrypting Key) should be set as the first step.
@@ -155,8 +155,17 @@ Limitations
 Prerequisites
 -
 
+Linux Prerequisites
+~~~
+
 - Mellanox OFED version: **5.3**
   see :doc:`../../nics/mlx5` guide for more Mellanox OFED details.
 
 - Compilation can be done also with rdma-core v15+.
   see :doc:`../../nics/mlx5` guide for more rdma-core details.
+
+Windows Prerequisites
+~
+
+- Mellanox WINOF-2 version: **2.60** or higher.
+  see :doc:`../../nics/mlx5` guide for more Mellanox WINOF-2 details.
diff --git a/doc/guides/rel_notes/release_21_11.rst b/doc/guides/rel_notes/release_21_11.rst
index 4fb2abf4ad..035a98d814 100644
--- a/doc/guides/rel_notes/release_21_11.rst
+++ b/doc/guides/rel_notes/release_21_11.rst
@@ -189,6 +189,7 @@ New Features
 
   * Added implicit mempool registration to avoid data path hiccups (opt-out).
   * Added NIC offloads for the PMD on Windows (TSO, VLAN strip, CRC keep).
+  * Added support for mlx5 crypto PMD on Windows operating system.
 
 * **Updated Solarflare network PMD.**
 
diff --git a/drivers/common/mlx5/version.map b/drivers/common/mlx5/version.map
index 0ea8325f9a..71cd7c625e 100644
--- a/drivers/common/mlx5/version.map
+++ b/drivers/common/mlx5/version.map
@@ -16,7 +16,7 @@ INTERNAL {
mlx5_dev_mempool_unregister;
mlx5_dev_mempool_subscribe;
 
-   mlx5_devx_alloc_uar; # WINDOWS_NO_EXPORT
+   mlx5_devx_alloc_uar;
 
mlx5_devx_cmd_alloc_pd;
mlx5_devx_cmd_create_conn_track_offload_obj;
diff --git a/drivers/common/mlx5/windows/mlx5_common_os.c b/drivers/common/mlx5/windows/mlx5_common_os.c
index 44e8ebec2b..ea478d7395 100644
--- a/drivers/common/mlx5/windows/mlx5_common_os.c
+++ b/drivers/common/mlx5/windows/mlx5_common_os.c
@@ -202,7 +202,7 @@ mlx5_os_open_device(struct mlx5_common_device *cdev, uint32_t classes)
struct mlx5_context *mlx5_ctx = NULL;
int n;
 
-   if (classes != MLX5_CLASS_ETH) {
+   if (classes != MLX5_CLASS_ETH && classes != MLX5_CLASS_CRYPTO) {
DRV_LOG(ERR,
"The chosen classes are not supported on Windows.");
rte_errno = ENOTSUP;
diff --git a/drivers/crypto/mlx5/meson.build b/drivers/crypto/mlx5/meson.build
index 1d6e413dd5..9d9c9c00bc 100644
--- a/drivers/crypto/mlx5/meson.build
+++ b/drivers/crypto/mlx5/meson.build
@@ -1,9 +1,9 @@
 # SPDX-License-Identifier: BSD-3-Clause
 # Copyright (c) 2021 NVIDIA Corporation & Affiliates
 
-if not is_linux
+if not (is_linux or is_windows)
 build = false
-reason = 'only supported on Linux'
+reason = 'only supported on Linux and Windows'
 subdir_done()
 endif
 
-- 
2.16.1.windows.4



Re: [dpdk-dev] [PATCH 1/2] ethdev: fix log level of Tx and Rx dummy functions

2021-10-25 Thread Ananyev, Konstantin

> > > Correctly behaving app should never call these stub functions and should 
> > > never see these messages.
> > > If your app ended up inside this function, then there something really 
> > > wrong is going on,
> > > that can cause app crash, silent memory corruption, NIC HW hang, or many 
> > > other nasty things.
> > > The aim of this stubs mechanism:
> > > 1) minimize (but not completely avoid) risk of such damage to happen in 
> > > case of
> > > programming error within user app.
> > > 2) flag to the user that something very wrong is going on within his app.
> > > In such situation, possible slowdown of misbehaving program is out of my 
> > > concern.
> 
> If correctly behaving app should not do this, why not put an assert()
> or a rte_panic?
> This way, the users will definitely catch it.

That was my first intention, though generic DPDK policy is
to avoid panics inside library functions.
But if everyone think it would be ok here, then I am fine with it too.   

> 
> 
> >
> > There is a concern about getting efficient log report,
> > especially when looking at CI issues.
> 
> +1.
> The current solution with logs is a real pain.

Are you guys talking about problems with
app/test/sample_packet_forward.* David reported?
Or some extra problems arise?
  


Re: [dpdk-dev] [PATCH v5 0/6] make rte_intr_handle internal

2021-10-25 Thread David Marchand
On Mon, Oct 25, 2021 at 3:04 PM Raslan Darawsheh  wrote:
>
> Hi,
>
> > -Original Message-
> > From: dev  On Behalf Of Harman Kalra
> > Sent: Friday, October 22, 2021 11:49 PM
> > To: dev@dpdk.org
> > Cc: david.march...@redhat.com; dmitry.kozl...@gmail.com;
> > m...@ashroe.eu; NBU-Contact-Thomas Monjalon ;
> > Harman Kalra 
> > Subject: [dpdk-dev] [PATCH v5 0/6] make rte_intr_handle internal
> >
> > Moving struct rte_intr_handle as an internal structure to
> > avoid any ABI breakages in future. Since this structure defines
> > some static arrays and changing respective macros breaks the ABI.
> > Eg:
> > Currently RTE_MAX_RXTX_INTR_VEC_ID imposes a limit of maximum 512
> > MSI-X interrupts that can be defined for a PCI device, while PCI
> > specification allows maximum 2048 MSI-X interrupts that can be used.
> > If some PCI device requires more than 512 vectors, either change the
> > RTE_MAX_RXTX_INTR_VEC_ID limit or dynamically allocate based on
> > PCI device MSI-X size on probe time. Either way its an ABI breakage.
> >
> > Change already included in 21.11 ABI improvement spreadsheet (item 42):
> > https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Furld
> > efense.proofpoint.com%2Fv2%2Furl%3Fu%3Dhttps-
> > 3A__docs.google.com_s&data=04%7C01%7Crasland%40nvidia.com%7C
> > 567d8ee2e3c842a9e59808d9959d822e%7C43083d15727340c1b7db39efd9ccc1
> > 7a%7C0%7C0%7C637705326003996997%7CUnknown%7CTWFpbGZsb3d8eyJ
> > WIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%
> > 7C1000&sdata=7UgxpkEtH%2Fnjk7xo9qELjqWi58XLzzCH2pimeDWLzvc%
> > 3D&reserved=0
> > preadsheets_d_1betlC000ua5SsSiJIcC54mCCCJnW6voH5Dqv9UxeyfE_edit-
> > 23gid-
> > 3D0&d=DwICaQ&c=nKjWec2b6R0mOyPaz7xtfQ&r=5ESHPj7V-
> > 7JdkxT_Z_SU6RrS37ys4U
> > XudBQ_rrS5LRo&m=7dl3OmXU7QHMmWYB6V1hYJtq1cUkjfhXUwze2Si_48c
> > &s=lh6DEGhR
> > Bg1shODpAy3RQk-H-0uQx5icRfUBf9dtCp4&e=
> >
> > This series makes struct rte_intr_handle totally opaque to the outside
> > world by wrapping it inside a .c file and providing get set wrapper APIs
> > to read or manipulate its fields.. Any changes to be made to any of the
> > fields should be done via these get set APIs.
> > Introduced a new eal_common_interrupts.c where all these APIs are
> > defined
> > and also hides struct rte_intr_handle definition.
> >
> > Details on each patch of the series:
> > Patch 1: eal/interrupts: implement get set APIs
> > This patch provides prototypes and implementation of all the new
> > get set APIs. Alloc APIs are implemented to allocate memory for
> > interrupt handle instance. Currently most of the drivers defines
> > interrupt handle instance as static but now it cant be static as
> > size of rte_intr_handle is unknown to all the drivers. Drivers are
> > expected to allocate interrupt instances during initialization
> > and free these instances during cleanup phase.
> > This patch also rearranges the headers related to interrupt
> > framework. Epoll related definitions prototypes are moved into a
> > new header i.e. rte_epoll.h and APIs defined in rte_eal_interrupts.h
> > which were driver specific are moved to rte_interrupts.h (as anyways
> > it was accessible and used outside DPDK library. Later in the series
> > rte_eal_interrupts.h is removed.
> >
> > Patch 2: eal/interrupts: avoid direct access to interrupt handle
> > Modifying the interrupt framework for linux and freebsd to use these
> > get set alloc APIs as per requirement and avoid accessing the fields
> > directly.
> >
> > Patch 3: test/interrupt: apply get set interrupt handle APIs
> > Updating interrupt test suite to use interrupt handle APIs.
> >
> > Patch 4: drivers: remove direct access to interrupt handle fields
> > Modifying all the drivers and libraries which are currently directly
> > accessing the interrupt handle fields. Drivers are expected to
> > allocated the interrupt instance, use get set APIs with the allocated
> > interrupt handle and free it on cleanup.
> >
> > Patch 5: eal/interrupts: make interrupt handle structure opaque
> > In this patch rte_eal_interrupt.h is removed, struct rte_intr_handle
> > definition is moved to c file to make it completely opaque. As part of
> > interrupt handle allocation, array like efds and elist(which are currently
> > static) are dynamically allocated with default size
> > (RTE_MAX_RXTX_INTR_VEC_ID). Later these arrays can be reallocated as per
> > device requirement using new API rte_intr_handle_event_list_update().
> > Eg, on PCI device probing MSIX size can be queried and these arrays can
> > be reallocated accordingly.
> >
> > Patch 6: eal/alarm: introduce alarm fini routine
> > Introducing alarm fini routine, as the memory allocated for alarm interrupt
> > instance can be freed in alarm fini.
> >
> > Testing performed:
> > 1. Validated the series by running interrupts and alarm test suite.
> > 2. Validate l3fwd power functionality with octeontx2 and i40e intel cards,
> >where interrupts are expected on packet arrival.
> >
> > v1:
> > * Fixed free

Re: [dpdk-dev] [PATCH] sched: remove experimental tag from the API

2021-10-25 Thread Thomas Monjalon
> > This API was introduced in 18.05, therefore removing
> > experimental tag to promote it to stable state
> > 
> > Signed-off-by: Jasvinder Singh 
> Acked-by: Ray Kinsella 

Applied, thanks.




Re: [dpdk-dev] [PATCH 1/2] ethdev: fix log level of Tx and Rx dummy functions

2021-10-25 Thread Thomas Monjalon
25/10/2021 14:55, Ananyev, Konstantin:
> 
> > > > Correctly behaving app should never call these stub functions and 
> > > > should never see these messages.
> > > > If your app ended up inside this function, then there something really 
> > > > wrong is going on,
> > > > that can cause app crash, silent memory corruption, NIC HW hang, or 
> > > > many other nasty things.
> > > > The aim of this stubs mechanism:
> > > > 1) minimize (but not completely avoid) risk of such damage to happen in 
> > > > case of
> > > > programming error within user app.
> > > > 2) flag to the user that something very wrong is going on within his 
> > > > app.
> > > > In such situation, possible slowdown of misbehaving program is out of 
> > > > my concern.
> > 
> > If correctly behaving app should not do this, why not put an assert()
> > or a rte_panic?
> > This way, the users will definitely catch it.
> 
> That was my first intention, though generic DPDK policy is
> to avoid panics inside library functions.
> But if everyone think it would be ok here, then I am fine with it too.

I would prefer not having panic/assert in the lib.

> > > There is a concern about getting efficient log report,
> > > especially when looking at CI issues.
> > 
> > +1.
> > The current solution with logs is a real pain.
> 
> Are you guys talking about problems with
> app/test/sample_packet_forward.* David reported?
> Or some extra problems arise?

The problem will arise each time an app is misbehaving.
That's going to be a recurring problem in the CI.




Re: [dpdk-dev] [PATCH 1/2] ethdev: fix log level of Tx and Rx dummy functions

2021-10-25 Thread David Marchand
On Mon, Oct 25, 2021 at 3:27 PM Thomas Monjalon  wrote:
> > > > There is a concern about getting efficient log report,
> > > > especially when looking at CI issues.
> > >
> > > +1.
> > > The current solution with logs is a real pain.
> >
> > Are you guys talking about problems with
> > app/test/sample_packet_forward.* David reported?
> > Or some extra problems arise?
>
> The problem will arise each time an app is misbehaving.
> That's going to be a recurring problem in the CI.
>

One thing that could be done is compiling with asserts in CI, and let
default build not have those asserts.

Otherwise, logging once should be enough (I have a patch for this latter idea).
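
A minimal sketch of the log-once idea, for illustration only. It is not the
patch mentioned above; the demo_* names are hypothetical and plain C11
atomics are used instead of any DPDK logging helper.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdio.h>

/* Set on the first call; later calls see it set and stay silent. */
static atomic_flag demo_warned = ATOMIC_FLAG_INIT;
/* Counts emitted messages, only so the behaviour can be checked. */
static atomic_int demo_log_count;

/* Hypothetical stand-in for a dummy Rx burst function: always 0 packets. */
static unsigned int
demo_dummy_rx_burst(void)
{
	/* test_and_set returns the previous value: false only the first time. */
	if (!atomic_flag_test_and_set(&demo_warned)) {
		atomic_fetch_add(&demo_log_count, 1);
		fprintf(stderr, "demo: Rx burst called on a port that is not ready\n");
	}
	return 0;
}
```

Calling demo_dummy_rx_burst() in a loop prints the message a single time, so
a misbehaving app is still flagged without flooding a CI log.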


-- 
David Marchand



Re: [dpdk-dev] [PATCH v2] mempool: fix non-IO flag inference

2021-10-25 Thread Olivier Matz
On Sat, Oct 23, 2021 at 12:09:19AM +0300, Dmitry Kozlyuk wrote:
> When mempool had been created with RTE_MEMPOOL_F_NO_IOVA_CONTIG flag
> but later populated with valid IOVA, RTE_MEMPOOL_F_NON_IO was unset,
> while it should be kept. The unit test did not catch this
> because rte_mempool_populate_default() it used was populating
> with RTE_BAD_IOVA.
> 
> Keep setting RTE_MEMPOOL_F_NON_IO at an empty mempool creation
> and add an assert for it in the unit test (remove the separate case).
> Do not reset the flag if RTE_MEMPOOL_F_NO_IOVA_CONTIG is set.
> 
> Fixes: 11541c5c81dd ("mempool: add non-IO flag")
> 
> Signed-off-by: Dmitry Kozlyuk 

Acked-by: Olivier Matz 

Thanks


Re: [dpdk-dev] [PATCH v2] kni: fix build for SLES15-SP3

2021-10-25 Thread Thomas Monjalon
> > From: Aman Singh 
> > 
> > As SUSE version numbering is too inconsistent to determine which Linux
> > kernel API to use, this patch checks the parameter of the 'ndo_tx_timeout'
> > API directly from the kernel source. This is done only for SUSE builds.
> > 
> > Bugzilla ID: 812
> > Cc: sta...@dpdk.org
> > 
> > Signed-off-by: Aman Singh 
> > Acked-by: Ferruh Yigit 
> Tested-by: Longfeng Liang 

Applied, thanks.





[dpdk-dev] [PATCH v7 0/9] make rte_intr_handle internal

2021-10-25 Thread David Marchand
Make struct rte_intr_handle an internal structure to
avoid any ABI breakage in the future, since this structure defines
some static arrays and changing the respective macros breaks the ABI.
Eg:
Currently RTE_MAX_RXTX_INTR_VEC_ID imposes a limit of maximum 512
MSI-X interrupts that can be defined for a PCI device, while PCI
specification allows maximum 2048 MSI-X interrupts that can be used.
If some PCI device requires more than 512 vectors, either change the
RTE_MAX_RXTX_INTR_VEC_ID limit or dynamically allocate based on
PCI device MSI-X size at probe time. Either way, it's an ABI breakage.
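
A minimal sketch of the dynamic-allocation alternative, assuming a
hypothetical demo_vec_list type (none of these names are part of the series):
the list starts at the current default of 512 entries and is grown once the
device's MSI-X size is known.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

#define DEMO_DEFAULT_NB_VEC 512 /* mirrors the current fixed limit */

/* Hypothetical event-fd list that is heap-allocated rather than a static array. */
struct demo_vec_list {
	int *efds;     /* one fd per vector, -1 when unused */
	size_t nb_vec; /* number of allocated entries */
};

static int
demo_vec_list_init(struct demo_vec_list *l)
{
	l->efds = malloc(DEMO_DEFAULT_NB_VEC * sizeof(*l->efds));
	if (l->efds == NULL)
		return -ENOMEM;
	for (size_t i = 0; i < DEMO_DEFAULT_NB_VEC; i++)
		l->efds[i] = -1;
	l->nb_vec = DEMO_DEFAULT_NB_VEC;
	return 0;
}

/* Resize to the size reported by the device, e.g. at probe time. */
static int
demo_vec_list_resize(struct demo_vec_list *l, size_t nb_vec)
{
	int *efds;

	if (nb_vec == 0)
		return -EINVAL;
	efds = realloc(l->efds, nb_vec * sizeof(*efds));
	if (efds == NULL)
		return -ENOMEM; /* the old list is still valid */
	for (size_t i = l->nb_vec; i < nb_vec; i++)
		efds[i] = -1; /* mark newly added entries as unused */
	l->efds = efds;
	l->nb_vec = nb_vec;
	return 0;
}

static void
demo_vec_list_free(struct demo_vec_list *l)
{
	free(l->efds);
	l->efds = NULL;
	l->nb_vec = 0;
}
```

Because callers only hold a pointer and a count, growing from 512 to 2048
entries changes no structure layout that an application could depend on.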

Change already included in 21.11 ABI improvement spreadsheet (item 42):
https://urldefense.proofpoint.com/v2/url?u=https-3A__docs.google.com_s
preadsheets_d_1betlC000ua5SsSiJIcC54mCCCJnW6voH5Dqv9UxeyfE_edit-23gid-
3D0&d=DwICaQ&c=nKjWec2b6R0mOyPaz7xtfQ&r=5ESHPj7V-7JdkxT_Z_SU6RrS37ys4U
XudBQ_rrS5LRo&m=7dl3OmXU7QHMmWYB6V1hYJtq1cUkjfhXUwze2Si_48c&s=lh6DEGhR
Bg1shODpAy3RQk-H-0uQx5icRfUBf9dtCp4&e=

This series makes struct rte_intr_handle totally opaque to the outside
world by wrapping it inside a .c file and providing get/set wrapper APIs
to read or manipulate its fields. Any change to any of the
fields should be made via these get/set APIs.
A new eal_common_interrupts.c is introduced, where all these APIs are
defined and which also hides the struct rte_intr_handle definition.
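
The opaque-handle pattern described above can be sketched as follows. This is
a simplified illustration with hypothetical demo_* names, not the actual
rte_intr_* implementation; the header and source parts are shown in one block
for brevity.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* What a public header would expose: an incomplete type and accessors. */
struct demo_handle;
struct demo_handle *demo_handle_alloc(void);
void demo_handle_free(struct demo_handle *h);
int demo_handle_fd_set(struct demo_handle *h, int fd);
int demo_handle_fd_get(const struct demo_handle *h);

/* What the private .c file would contain: the layout is visible only here. */
struct demo_handle {
	int fd;
};

struct demo_handle *
demo_handle_alloc(void)
{
	struct demo_handle *h = calloc(1, sizeof(*h));

	if (h != NULL)
		h->fd = -1; /* no file descriptor attached yet */
	return h;
}

void
demo_handle_free(struct demo_handle *h)
{
	free(h); /* free(NULL) is a no-op */
}

int
demo_handle_fd_set(struct demo_handle *h, int fd)
{
	if (h == NULL)
		return -EINVAL;
	h->fd = fd;
	return 0;
}

int
demo_handle_fd_get(const struct demo_handle *h)
{
	if (h == NULL)
		return -1;
	return h->fd;
}
```

Since users never see the structure definition, fields can be added or arrays
resized later without changing the size or layout that applications were
compiled against.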

v1:
* Fixed freebsd compilation failure
* Fixed seg fault in case of memif

v2:
* Merged the prototype and implementation patch to 1.
* Restricting allocation of single interrupt instance.
* Removed base APIs, as they were exposing internally
allocated memory information.
* Fixed some memory leak issues.
* Marked some library specific APIs as internal.

v3:
* Removed flag from instance alloc API, rather auto detect
if memory should be allocated using glibc malloc APIs or
rte_malloc*
* Added APIs for get/set windows handle.
* Defined macros for repeated checks.

v4:
* Rectified some typos in the API documentation.
* Better names for some internal variables.

v5:
* Reverted to passing a flag to the instance alloc API, as
with auto detect some multiprocess issues existing in the
library were causing test failures.
* Rebased to top of tree.

v6:
* renamed RTE_INTR_INSTANCE_F_UNSHARED as RTE_INTR_INSTANCE_F_PRIVATE,
* changed API and removed need for alloc_flag content exposure
  (see rte_intr_instance_dup() in patch 1 and 2),
* exported all symbols for Windows,
* fixed leak in unit tests in case of alloc failure,
* split (previously) patch 4 into three patches
  * (now) patch 4 only concerns alarm and (previously) patch 6 cleanup bits
are squashed in it,
  * (now) patch 5 concerns other libraries updates,
  * (now) patch 6 concerns drivers updates:
* instance allocation is moved to probing for auxiliary,
* there might be a bug for PCI drivers not requesting
  RTE_PCI_DRV_NEED_MAPPING, but code is left as v5,
* split (previously) patch 5 into three patches
  * (now) patch 7 only hides the structure, but keeps it in an EAL private
header, this makes it possible to keep info in tracepoints,
  * (now) patch 8 deals with VFIO/UIO internal fds merge,
  * (now) patch 9 extends event list,

v7:
* fixed compilation on FreeBSD,
* removed unused interrupt handle in FreeBSD alarm code,
* fixed interrupt handle allocation for PCI drivers without
  RTE_PCI_DRV_NEED_MAPPING,

-- 
David Marchand

Harman Kalra (9):
  interrupts: add allocator and accessors
  interrupts: remove direct access to interrupt handle
  test/interrupts: remove direct access to interrupt handle
  alarm: remove direct access to interrupt handle
  lib: remove direct access to interrupt handle
  drivers: remove direct access to interrupt handle
  interrupts: make interrupt handle structure opaque
  interrupts: rename device specific file descriptor
  interrupts: extend event list

 MAINTAINERS   |   1 +
 app/test/test_interrupts.c| 164 +++--
 drivers/baseband/acc100/rte_acc100_pmd.c  |  14 +-
 .../fpga_5gnr_fec/rte_fpga_5gnr_fec.c |  24 +-
 drivers/baseband/fpga_lte_fec/fpga_lte_fec.c  |  24 +-
 drivers/bus/auxiliary/auxiliary_common.c  |  17 +-
 drivers/bus/auxiliary/rte_bus_auxiliary.h |   2 +-
 drivers/bus/dpaa/dpaa_bus.c   |  28 +-
 drivers/bus/dpaa/rte_dpaa_bus.h   |   2 +-
 drivers/bus/fslmc/fslmc_bus.c |  14 +-
 drivers/bus/fslmc/fslmc_vfio.c|  30 +-
 drivers/bus/fslmc/portal/dpaa2_hw_dpio.c  |  18 +-
 drivers/bus/fslmc/portal/dpaa2_hw_pvt.h   |   2 +-
 drivers/bus/fslmc/rte_fslmc.h |   2 +-
 drivers/bus/ifpga/ifpga_bus.c |  13 +-
 drivers/bus/ifpga/rte_bus_ifpga.h |   2 +-
 drivers/bus/pci/bsd/pci.c |  20 +-
 drivers/bus/pci/linux/pci.c   |   4 +-
 drivers/bus/pci/linux/pci_uio.c   |  69 +-
 drivers/bus/pci/linux/pci_vfio.c  | 108 ++-
 drivers/

[dpdk-dev] [PATCH v7 1/9] interrupts: add allocator and accessors

2021-10-25 Thread David Marchand
From: Harman Kalra 

Prototype and implement get/set APIs for interrupt handle fields.
Users won't be able to access any of the interrupt handle fields
directly and should instead use these get/set APIs to access or
manipulate them.

The internal interrupt header, rte_eal_interrupts.h, is rearranged:
the APIs it defined are moved to rte_interrupts.h and epoll-specific
definitions are moved to a new header, rte_epoll.h.
Later in the series, rte_eal_interrupts.h will be removed.

Signed-off-by: Harman Kalra 
Acked-by: Ray Kinsella 
Acked-by: Dmitry Kozlyuk 
Signed-off-by: David Marchand 
---
Changes since v5:
- renamed RTE_INTR_INSTANCE_F_UNSHARED as RTE_INTR_INSTANCE_F_PRIVATE,
- used a single bit to mark instance as shared (default is private),
- removed rte_intr_instance_copy / rte_intr_instance_alloc_flag_get
  with a single rte_intr_instance_dup helper,
- made rte_intr_vec_list_alloc alloc_flags-aware,
- exported all symbols for Windows,

---
 MAINTAINERS|   1 +
 lib/eal/common/eal_common_interrupts.c | 411 
 lib/eal/common/meson.build |   1 +
 lib/eal/include/meson.build|   1 +
 lib/eal/include/rte_eal_interrupts.h   | 207 +---
 lib/eal/include/rte_epoll.h| 118 +
 lib/eal/include/rte_interrupts.h   | 627 +
 lib/eal/version.map|  45 +-
 8 files changed, 1201 insertions(+), 210 deletions(-)
 create mode 100644 lib/eal/common/eal_common_interrupts.c
 create mode 100644 lib/eal/include/rte_epoll.h

diff --git a/MAINTAINERS b/MAINTAINERS
index 587632dce0..097a57f7f6 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -211,6 +211,7 @@ F: app/test/test_memzone.c
 
 Interrupt Subsystem
 M: Harman Kalra 
+F: lib/eal/include/rte_epoll.h
 F: lib/eal/*/*interrupts.*
 F: app/test/test_interrupts.c
 
diff --git a/lib/eal/common/eal_common_interrupts.c b/lib/eal/common/eal_common_interrupts.c
new file mode 100644
index 00..d6e6654fbb
--- /dev/null
+++ b/lib/eal/common/eal_common_interrupts.c
@@ -0,0 +1,411 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(C) 2021 Marvell.
+ */
+
+#include 
+#include 
+
+#include 
+#include 
+#include 
+#include 
+
+/* Macros to check for valid interrupt handle */
+#define CHECK_VALID_INTR_HANDLE(intr_handle) do { \
+   if (intr_handle == NULL) { \
+   RTE_LOG(ERR, EAL, "Interrupt instance unallocated\n"); \
+   rte_errno = EINVAL; \
+   goto fail; \
+   } \
+} while (0)
+
+#define RTE_INTR_INSTANCE_KNOWN_FLAGS (RTE_INTR_INSTANCE_F_PRIVATE \
+   | RTE_INTR_INSTANCE_F_SHARED \
+   )
+
+#define RTE_INTR_INSTANCE_USES_RTE_MEMORY(flags) \
+   !!(flags & RTE_INTR_INSTANCE_F_SHARED)
+
+struct rte_intr_handle *rte_intr_instance_alloc(uint32_t flags)
+{
+   struct rte_intr_handle *intr_handle;
+   bool uses_rte_memory;
+
+   /* Check the flag passed by user, it should be part of the
+* defined flags.
+*/
+   if ((flags & ~RTE_INTR_INSTANCE_KNOWN_FLAGS) != 0) {
+   RTE_LOG(ERR, EAL, "Invalid alloc flag passed 0x%x\n", flags);
+   rte_errno = EINVAL;
+   return NULL;
+   }
+
+   uses_rte_memory = RTE_INTR_INSTANCE_USES_RTE_MEMORY(flags);
+   if (uses_rte_memory)
+   intr_handle = rte_zmalloc(NULL, sizeof(*intr_handle), 0);
+   else
+   intr_handle = calloc(1, sizeof(*intr_handle));
+   if (intr_handle == NULL) {
+   RTE_LOG(ERR, EAL, "Fail to allocate intr_handle\n");
+   rte_errno = ENOMEM;
+   return NULL;
+   }
+
+   intr_handle->alloc_flags = flags;
+   intr_handle->nb_intr = RTE_MAX_RXTX_INTR_VEC_ID;
+
+   return intr_handle;
+}
+
+struct rte_intr_handle *rte_intr_instance_dup(const struct rte_intr_handle *src)
+{
+   struct rte_intr_handle *intr_handle;
+
+   if (src == NULL) {
+   RTE_LOG(ERR, EAL, "Source interrupt instance unallocated\n");
+   rte_errno = EINVAL;
+   return NULL;
+   }
+
+   intr_handle = rte_intr_instance_alloc(src->alloc_flags);
+
+   intr_handle->fd = src->fd;
+   intr_handle->vfio_dev_fd = src->vfio_dev_fd;
+   intr_handle->type = src->type;
+   intr_handle->max_intr = src->max_intr;
+   intr_handle->nb_efd = src->nb_efd;
+   intr_handle->efd_counter_size = src->efd_counter_size;
+   memcpy(intr_handle->efds, src->efds, src->nb_intr);
+   memcpy(intr_handle->elist, src->elist, src->nb_intr);
+
+   return intr_handle;
+}
+
+void rte_intr_instance_free(struct rte_intr_handle *intr_handle)
+{
+   if (intr_handle == NULL)
+   return;
+   if (RTE_INTR_INSTANCE_USES_RTE_MEMORY(intr_handle->alloc_flags))
+   rte_free(intr_handle);
+   else
+   free(intr_handle);
+}
+
+int rte_intr_fd_set(struct rte_intr_handle *intr_handle, int fd)
+{
+   CHECK_VALID_INTR_HANDLE(intr_handle);
+
+   intr_hand

[dpdk-dev] [PATCH v7 2/9] interrupts: remove direct access to interrupt handle

2021-10-25 Thread David Marchand
From: Harman Kalra 

Update the interrupt framework to use the interrupt handle
APIs to get/set any field.

Signed-off-by: Harman Kalra 
Signed-off-by: David Marchand 
---
Changes since v6:
- fixed compilation on FreeBSD,

Changes since v5:
- used new helper rte_intr_instance_dup,

---
 lib/eal/freebsd/eal_interrupts.c |  85 +
 lib/eal/linux/eal_interrupts.c   | 304 +--
 2 files changed, 219 insertions(+), 170 deletions(-)

diff --git a/lib/eal/freebsd/eal_interrupts.c b/lib/eal/freebsd/eal_interrupts.c
index 86810845fe..10aa91cc09 100644
--- a/lib/eal/freebsd/eal_interrupts.c
+++ b/lib/eal/freebsd/eal_interrupts.c
@@ -40,7 +40,7 @@ struct rte_intr_callback {
 
 struct rte_intr_source {
TAILQ_ENTRY(rte_intr_source) next;
-   struct rte_intr_handle intr_handle; /**< interrupt handle */
+   struct rte_intr_handle *intr_handle; /**< interrupt handle */
struct rte_intr_cb_list callbacks;  /**< user callbacks */
uint32_t active;
 };
@@ -60,7 +60,7 @@ static int
 intr_source_to_kevent(const struct rte_intr_handle *ih, struct kevent *ke)
 {
/* alarm callbacks are special case */
-   if (ih->type == RTE_INTR_HANDLE_ALARM) {
+   if (rte_intr_type_get(ih) == RTE_INTR_HANDLE_ALARM) {
uint64_t timeout_ns;
 
/* get soonest alarm timeout */
@@ -75,7 +75,7 @@ intr_source_to_kevent(const struct rte_intr_handle *ih, struct kevent *ke)
} else {
ke->filter = EVFILT_READ;
}
-   ke->ident = ih->fd;
+   ke->ident = rte_intr_fd_get(ih);
 
return 0;
 }
@@ -89,7 +89,7 @@ rte_intr_callback_register(const struct rte_intr_handle *intr_handle,
int ret = 0, add_event = 0;
 
/* first do parameter checking */
-   if (intr_handle == NULL || intr_handle->fd < 0 || cb == NULL) {
+   if (rte_intr_fd_get(intr_handle) < 0 || cb == NULL) {
RTE_LOG(ERR, EAL,
"Registering with invalid input parameter\n");
return -EINVAL;
@@ -103,7 +103,7 @@ rte_intr_callback_register(const struct rte_intr_handle *intr_handle,
 
/* find the source for this intr_handle */
TAILQ_FOREACH(src, &intr_sources, next) {
-   if (src->intr_handle.fd == intr_handle->fd)
+   if (rte_intr_fd_get(src->intr_handle) == rte_intr_fd_get(intr_handle))
break;
}
 
@@ -112,8 +112,9 @@ rte_intr_callback_register(const struct rte_intr_handle *intr_handle,
 * thing on the list should be eal_alarm_callback() and we may
 * be called just to reset the timer.
 */
-   if (src != NULL && src->intr_handle.type == RTE_INTR_HANDLE_ALARM &&
-!TAILQ_EMPTY(&src->callbacks)) {
+   if (src != NULL &&
+   rte_intr_type_get(src->intr_handle) == RTE_INTR_HANDLE_ALARM &&
+   !TAILQ_EMPTY(&src->callbacks)) {
callback = NULL;
} else {
/* allocate a new interrupt callback entity */
@@ -135,7 +136,14 @@ rte_intr_callback_register(const struct rte_intr_handle *intr_handle,
ret = -ENOMEM;
goto fail;
} else {
-   src->intr_handle = *intr_handle;
+   src->intr_handle = rte_intr_instance_dup(intr_handle);
+   if (src->intr_handle == NULL) {
+   RTE_LOG(ERR, EAL, "Can not create intr instance\n");
+   ret = -ENOMEM;
+   free(src);
+   src = NULL;
+   goto fail;
+   }
TAILQ_INIT(&src->callbacks);
TAILQ_INSERT_TAIL(&intr_sources, src, next);
}
@@ -151,7 +159,8 @@ rte_intr_callback_register(const struct rte_intr_handle *intr_handle,
/* add events to the queue. timer events are special as we need to
 * re-set the timer.
 */
-   if (add_event || src->intr_handle.type == RTE_INTR_HANDLE_ALARM) {
+   if (add_event ||
+   rte_intr_type_get(src->intr_handle) == RTE_INTR_HANDLE_ALARM) {
struct kevent ke;
 
memset(&ke, 0, sizeof(ke));
@@ -173,12 +182,11 @@ rte_intr_callback_register(const struct rte_intr_handle *intr_handle,
 */
if (errno == ENODEV)
RTE_LOG(DEBUG, EAL, "Interrupt handle %d not supported\n",
-   src->intr_handle.fd);
+   rte_intr_fd_get(src->intr_handle));
else
-   RTE_LOG(ERR, EAL, "Error adding fd %d "
- 

[dpdk-dev] [PATCH v7 3/9] test/interrupts: remove direct access to interrupt handle

2021-10-25 Thread David Marchand
From: Harman Kalra 

Update the interrupt test suite to use the interrupt
handle get/set APIs.

Signed-off-by: Harman Kalra 
Signed-off-by: David Marchand 
---
Changes since v5:
- fixed leak when an interrupt handle can't be allocated,

---
 app/test/test_interrupts.c | 164 ++---
 1 file changed, 98 insertions(+), 66 deletions(-)

diff --git a/app/test/test_interrupts.c b/app/test/test_interrupts.c
index 233b14a70b..2a05399f96 100644
--- a/app/test/test_interrupts.c
+++ b/app/test/test_interrupts.c
@@ -16,7 +16,7 @@
 
 /* predefined interrupt handle types */
 enum test_interrupt_handle_type {
-   TEST_INTERRUPT_HANDLE_INVALID,
+   TEST_INTERRUPT_HANDLE_INVALID = 0,
TEST_INTERRUPT_HANDLE_VALID,
TEST_INTERRUPT_HANDLE_VALID_UIO,
TEST_INTERRUPT_HANDLE_VALID_ALARM,
@@ -27,7 +27,7 @@ enum test_interrupt_handle_type {
 
 /* flag of if callback is called */
 static volatile int flag;
-static struct rte_intr_handle intr_handles[TEST_INTERRUPT_HANDLE_MAX];
+static struct rte_intr_handle *intr_handles[TEST_INTERRUPT_HANDLE_MAX];
 static enum test_interrupt_handle_type test_intr_type =
TEST_INTERRUPT_HANDLE_MAX;
 
@@ -50,7 +50,7 @@ static union intr_pipefds pfds;
 static inline int
 test_interrupt_handle_sanity_check(struct rte_intr_handle *intr_handle)
 {
-   if (!intr_handle || intr_handle->fd < 0)
+   if (!intr_handle || rte_intr_fd_get(intr_handle) < 0)
return -1;
 
return 0;
@@ -62,31 +62,54 @@ test_interrupt_handle_sanity_check(struct rte_intr_handle *intr_handle)
 static int
 test_interrupt_init(void)
 {
+   struct rte_intr_handle *test_intr_handle;
+   int i;
+
if (pipe(pfds.pipefd) < 0)
return -1;
 
-   intr_handles[TEST_INTERRUPT_HANDLE_INVALID].fd = -1;
-   intr_handles[TEST_INTERRUPT_HANDLE_INVALID].type =
-   RTE_INTR_HANDLE_UNKNOWN;
+   for (i = 0; i < TEST_INTERRUPT_HANDLE_MAX; i++) {
+   intr_handles[i] =
+   rte_intr_instance_alloc(RTE_INTR_INSTANCE_F_PRIVATE);
+   if (!intr_handles[i])
+   return -1;
+   }
 
-   intr_handles[TEST_INTERRUPT_HANDLE_VALID].fd = pfds.readfd;
-   intr_handles[TEST_INTERRUPT_HANDLE_VALID].type =
-   RTE_INTR_HANDLE_UNKNOWN;
+   test_intr_handle = intr_handles[TEST_INTERRUPT_HANDLE_INVALID];
+   if (rte_intr_fd_set(test_intr_handle, -1))
+   return -1;
+   if (rte_intr_type_set(test_intr_handle, RTE_INTR_HANDLE_UNKNOWN))
+   return -1;
 
-   intr_handles[TEST_INTERRUPT_HANDLE_VALID_UIO].fd = pfds.readfd;
-   intr_handles[TEST_INTERRUPT_HANDLE_VALID_UIO].type =
-   RTE_INTR_HANDLE_UIO;
+   test_intr_handle = intr_handles[TEST_INTERRUPT_HANDLE_VALID];
+   if (rte_intr_fd_set(test_intr_handle, pfds.readfd))
+   return -1;
+   if (rte_intr_type_set(test_intr_handle, RTE_INTR_HANDLE_UNKNOWN))
+   return -1;
+
+   test_intr_handle = intr_handles[TEST_INTERRUPT_HANDLE_VALID_UIO];
+   if (rte_intr_fd_set(test_intr_handle, pfds.readfd))
+   return -1;
+   if (rte_intr_type_set(test_intr_handle, RTE_INTR_HANDLE_UIO))
+   return -1;
 
-   intr_handles[TEST_INTERRUPT_HANDLE_VALID_ALARM].fd = pfds.readfd;
-   intr_handles[TEST_INTERRUPT_HANDLE_VALID_ALARM].type =
-   RTE_INTR_HANDLE_ALARM;
+   test_intr_handle = intr_handles[TEST_INTERRUPT_HANDLE_VALID_ALARM];
+   if (rte_intr_fd_set(test_intr_handle, pfds.readfd))
+   return -1;
+   if (rte_intr_type_set(test_intr_handle, RTE_INTR_HANDLE_ALARM))
+   return -1;
 
-   intr_handles[TEST_INTERRUPT_HANDLE_VALID_DEV_EVENT].fd = pfds.readfd;
-   intr_handles[TEST_INTERRUPT_HANDLE_VALID_DEV_EVENT].type =
-   RTE_INTR_HANDLE_DEV_EVENT;
+   test_intr_handle = intr_handles[TEST_INTERRUPT_HANDLE_VALID_DEV_EVENT];
+   if (rte_intr_fd_set(test_intr_handle, pfds.readfd))
+   return -1;
+   if (rte_intr_type_set(test_intr_handle, RTE_INTR_HANDLE_DEV_EVENT))
+   return -1;
 
-   intr_handles[TEST_INTERRUPT_HANDLE_CASE1].fd = pfds.writefd;
-   intr_handles[TEST_INTERRUPT_HANDLE_CASE1].type = RTE_INTR_HANDLE_UIO;
+   test_intr_handle = intr_handles[TEST_INTERRUPT_HANDLE_CASE1];
+   if (rte_intr_fd_set(test_intr_handle, pfds.writefd))
+   return -1;
+   if (rte_intr_type_set(test_intr_handle, RTE_INTR_HANDLE_UIO))
+   return -1;
 
return 0;
 }
@@ -97,6 +120,10 @@ test_interrupt_init(void)
 static int
 test_interrupt_deinit(void)
 {
+   int i;
+
+   for (i = 0; i < TEST_INTERRUPT_HANDLE_MAX; i++)
+   rte_intr_instance_free(intr_handles[i]);
close(

[dpdk-dev] [PATCH v7 4/9] alarm: remove direct access to interrupt handle

2021-10-25 Thread David Marchand
From: Harman Kalra 

Remove direct access to interrupt handle structure fields and use the
respective get/set APIs instead. This patch updates the alarm code on
FreeBSD and Linux.

Implement an alarm cleanup routine, rte_eal_alarm_cleanup(), which frees
the memory allocated for the interrupt instance.
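
The init/cleanup pairing described above can be sketched in plain C as
below. This is only an illustration under stated assumptions: the
demo_* names, the struct layout and the simulate_failure knob are
hypothetical, and calloc()/free() stand in for the interrupt instance
allocator used in the patch. Unlike the patch, the sketch also resets
the pointer after freeing, so that a repeated cleanup is harmless.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for an interrupt handle; not the DPDK struct. */
struct demo_handle {
	int type;
	int fd;
};

static struct demo_handle *demo_alarm_handle;

/* Returns 1 while the alarm handle is allocated, 0 otherwise. */
int demo_alarm_is_ready(void)
{
	return demo_alarm_handle != NULL;
}

/*
 * Allocate and set up the handle. simulate_failure is a hypothetical
 * knob that makes a later setup step fail, to exercise the error path.
 */
int demo_alarm_init(int simulate_failure)
{
	demo_alarm_handle = calloc(1, sizeof(*demo_alarm_handle));
	if (demo_alarm_handle == NULL)
		goto error;

	demo_alarm_handle->type = 1;
	demo_alarm_handle->fd = -1;

	if (simulate_failure)
		goto error;

	return 0;
error:
	free(demo_alarm_handle); /* free(NULL) is a no-op */
	demo_alarm_handle = NULL;
	return -1;
}

/* Release the handle; safe to call more than once. */
void demo_alarm_cleanup(void)
{
	free(demo_alarm_handle);
	demo_alarm_handle = NULL;
}

/* Usage: a successful init/cleanup cycle, then a failing init. */
void demo_alarm_usage(void)
{
	assert(demo_alarm_init(0) == 0);
	demo_alarm_cleanup();
	demo_alarm_cleanup(); /* pointer was reset, so this is harmless */
	assert(demo_alarm_init(1) == -1);
}
```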

Signed-off-by: Harman Kalra 
Signed-off-by: David Marchand 
---
Changes since v6:
- removed unused interrupt handle in FreeBSD alarm code,

Changes since v5:
- split from patch4,
- merged patch6,
- renamed rte_eal_alarm_fini as rte_eal_alarm_cleanup,

---
 lib/eal/common/eal_private.h | 10 ++
 lib/eal/freebsd/eal.c|  1 +
 lib/eal/freebsd/eal_alarm.c  | 35 +--
 lib/eal/linux/eal.c  |  1 +
 lib/eal/linux/eal_alarm.c| 32 +---
 5 files changed, 66 insertions(+), 13 deletions(-)

diff --git a/lib/eal/common/eal_private.h b/lib/eal/common/eal_private.h
index 86dab1f057..36bcc0b5a4 100644
--- a/lib/eal/common/eal_private.h
+++ b/lib/eal/common/eal_private.h
@@ -163,6 +163,16 @@ int rte_eal_intr_init(void);
  */
 int rte_eal_alarm_init(void);
 
+/**
+ * Alarm mechanism cleanup.
+ *
+ * This function is private to EAL.
+ *
+ * @return
+ *  0 on success, negative on error
+ */
+void rte_eal_alarm_cleanup(void);
+
 /**
  * Function is to check if the kernel module(like, vfio, vfio_iommu_type1,
  * etc.) loaded.
diff --git a/lib/eal/freebsd/eal.c b/lib/eal/freebsd/eal.c
index 56a60f13e9..9935356ed4 100644
--- a/lib/eal/freebsd/eal.c
+++ b/lib/eal/freebsd/eal.c
@@ -975,6 +975,7 @@ rte_eal_cleanup(void)
rte_mp_channel_cleanup();
/* after this point, any DPDK pointers will become dangling */
rte_eal_memory_detach();
+   rte_eal_alarm_cleanup();
rte_trace_save();
eal_trace_fini();
eal_cleanup_config(internal_conf);
diff --git a/lib/eal/freebsd/eal_alarm.c b/lib/eal/freebsd/eal_alarm.c
index c38b2e04f8..1023c32937 100644
--- a/lib/eal/freebsd/eal_alarm.c
+++ b/lib/eal/freebsd/eal_alarm.c
@@ -32,7 +32,6 @@
 
 struct alarm_entry {
LIST_ENTRY(alarm_entry) next;
-   struct rte_intr_handle handle;
struct timespec time;
rte_eal_alarm_callback cb_fn;
void *cb_arg;
@@ -43,22 +42,46 @@ struct alarm_entry {
 static LIST_HEAD(alarm_list, alarm_entry) alarm_list = LIST_HEAD_INITIALIZER();
 static rte_spinlock_t alarm_list_lk = RTE_SPINLOCK_INITIALIZER;
 
-static struct rte_intr_handle intr_handle = {.fd = -1 };
+static struct rte_intr_handle *intr_handle;
 static void eal_alarm_callback(void *arg);
 
+void
+rte_eal_alarm_cleanup(void)
+{
+   rte_intr_instance_free(intr_handle);
+}
+
 int
 rte_eal_alarm_init(void)
 {
-   intr_handle.type = RTE_INTR_HANDLE_ALARM;
+   int fd;
+
+   intr_handle = rte_intr_instance_alloc(RTE_INTR_INSTANCE_F_PRIVATE);
+   if (intr_handle == NULL) {
+   RTE_LOG(ERR, EAL, "Fail to allocate intr_handle\n");
+   goto error;
+   }
+
+   if (rte_intr_type_set(intr_handle, RTE_INTR_HANDLE_ALARM))
+   goto error;
+
+   if (rte_intr_fd_set(intr_handle, -1))
+   goto error;
 
/* on FreeBSD, timers don't use fd's, and their identifiers are stored
 * in separate namespace from fd's, so using any value is OK. however,
 * EAL interrupts handler expects fd's to be unique, so use an actual fd
 * to guarantee unique timer identifier.
 */
-   intr_handle.fd = open("/dev/zero", O_RDONLY);
+   fd = open("/dev/zero", O_RDONLY);
+
+   if (rte_intr_fd_set(intr_handle, fd))
+   goto error;
 
return 0;
+error:
+   rte_intr_instance_free(intr_handle);
+   return -1;
 }
 
 static inline int
@@ -118,7 +141,7 @@ unregister_current_callback(void)
ap = LIST_FIRST(&alarm_list);
 
do {
-   ret = rte_intr_callback_unregister(&intr_handle,
+   ret = rte_intr_callback_unregister(intr_handle,
eal_alarm_callback, &ap->time);
} while (ret == -EAGAIN);
}
@@ -136,7 +159,7 @@ register_first_callback(void)
ap = LIST_FIRST(&alarm_list);
 
/* register a new callback */
-   ret = rte_intr_callback_register(&intr_handle,
+   ret = rte_intr_callback_register(intr_handle,
eal_alarm_callback, &ap->time);
}
return ret;
diff --git a/lib/eal/linux/eal.c b/lib/eal/linux/eal.c
index 0d0fc8..81fdebc6a0 100644
--- a/lib/eal/linux/eal.c
+++ b/lib/eal/linux/eal.c
@@ -1368,6 +1368,7 @@ rte_eal_cleanup(void)
rte_mp_channel_cleanup();
/* after this point, any DPDK pointers will become dangling */
rte_eal_memory_detach();
+   rte_eal_alarm_cleanup();
rte_trace_save();
eal_trace_fini();
eal_cleanup_config(internal_conf);
diff --git a/lib/eal/linux/eal_alarm.c

[dpdk-dev] [PATCH v7 5/9] lib: remove direct access to interrupt handle

2021-10-25 Thread David Marchand
From: Harman Kalra 

Remove direct access to interrupt handle structure fields and use the
respective get/set APIs instead. This patch updates all the libraries
that access the interrupt handle fields.

Signed-off-by: Harman Kalra 
Signed-off-by: David Marchand 
---
Changes since v5:
- split from patch4,

---
 lib/bbdev/rte_bbdev.c   |  4 +--
 lib/eal/linux/eal_dev.c | 57 -
 lib/ethdev/rte_ethdev.c | 14 +-
 3 files changed, 43 insertions(+), 32 deletions(-)

diff --git a/lib/bbdev/rte_bbdev.c b/lib/bbdev/rte_bbdev.c
index defddcfc28..b86c5fdcc0 100644
--- a/lib/bbdev/rte_bbdev.c
+++ b/lib/bbdev/rte_bbdev.c
@@ -1094,7 +1094,7 @@ rte_bbdev_queue_intr_ctl(uint16_t dev_id, uint16_t queue_id, int epfd, int op,
VALID_QUEUE_OR_RET_ERR(queue_id, dev);
 
intr_handle = dev->intr_handle;
-   if (!intr_handle || !intr_handle->intr_vec) {
+   if (intr_handle == NULL) {
rte_bbdev_log(ERR, "Device %u intr handle unset\n", dev_id);
return -ENOTSUP;
}
@@ -1105,7 +1105,7 @@ rte_bbdev_queue_intr_ctl(uint16_t dev_id, uint16_t queue_id, int epfd, int op,
return -ENOTSUP;
}
 
-   vec = intr_handle->intr_vec[queue_id];
+   vec = rte_intr_vec_list_index_get(intr_handle, queue_id);
ret = rte_intr_rx_ctl(intr_handle, epfd, op, vec, data);
if (ret && (ret != -EEXIST)) {
rte_bbdev_log(ERR,
diff --git a/lib/eal/linux/eal_dev.c b/lib/eal/linux/eal_dev.c
index 3b905e18f5..06820a3666 100644
--- a/lib/eal/linux/eal_dev.c
+++ b/lib/eal/linux/eal_dev.c
@@ -23,10 +23,7 @@
 
 #include "eal_private.h"
 
-static struct rte_intr_handle intr_handle = {
-   .type = RTE_INTR_HANDLE_DEV_EVENT,
-   .fd = -1,
-};
+static struct rte_intr_handle *intr_handle;
 static rte_rwlock_t monitor_lock = RTE_RWLOCK_INITIALIZER;
 static uint32_t monitor_refcount;
 static bool hotplug_handle;
@@ -109,12 +106,11 @@ static int
 dev_uev_socket_fd_create(void)
 {
struct sockaddr_nl addr;
-   int ret;
+   int ret, fd;
 
-   intr_handle.fd = socket(PF_NETLINK, SOCK_RAW | SOCK_CLOEXEC |
-   SOCK_NONBLOCK,
-   NETLINK_KOBJECT_UEVENT);
-   if (intr_handle.fd < 0) {
+   fd = socket(PF_NETLINK, SOCK_RAW | SOCK_CLOEXEC | SOCK_NONBLOCK,
+   NETLINK_KOBJECT_UEVENT);
+   if (fd < 0) {
RTE_LOG(ERR, EAL, "create uevent fd failed.\n");
return -1;
}
@@ -124,16 +120,19 @@ dev_uev_socket_fd_create(void)
addr.nl_pid = 0;
addr.nl_groups = 0x;
 
-   ret = bind(intr_handle.fd, (struct sockaddr *) &addr, sizeof(addr));
+   ret = bind(fd, (struct sockaddr *) &addr, sizeof(addr));
if (ret < 0) {
RTE_LOG(ERR, EAL, "Failed to bind uevent socket.\n");
goto err;
}
 
+   if (rte_intr_fd_set(intr_handle, fd))
+   goto err;
+
return 0;
 err:
-   close(intr_handle.fd);
-   intr_handle.fd = -1;
+   close(fd);
+   fd = -1;
return ret;
 }
 
@@ -217,9 +216,9 @@ dev_uev_parse(const char *buf, struct rte_dev_event *event, int length)
 static void
 dev_delayed_unregister(void *param)
 {
-   rte_intr_callback_unregister(&intr_handle, dev_uev_handler, param);
-   close(intr_handle.fd);
-   intr_handle.fd = -1;
+   rte_intr_callback_unregister(intr_handle, dev_uev_handler, param);
+   close(rte_intr_fd_get(intr_handle));
+   rte_intr_fd_set(intr_handle, -1);
 }
 
 static void
@@ -235,7 +234,8 @@ dev_uev_handler(__rte_unused void *param)
memset(&uevent, 0, sizeof(struct rte_dev_event));
memset(buf, 0, EAL_UEV_MSG_LEN);
 
-   ret = recv(intr_handle.fd, buf, EAL_UEV_MSG_LEN, MSG_DONTWAIT);
+   ret = recv(rte_intr_fd_get(intr_handle), buf, EAL_UEV_MSG_LEN,
+  MSG_DONTWAIT);
if (ret < 0 && errno == EAGAIN)
return;
else if (ret <= 0) {
@@ -311,24 +311,35 @@ rte_dev_event_monitor_start(void)
goto exit;
}
 
+   intr_handle = rte_intr_instance_alloc(RTE_INTR_INSTANCE_F_PRIVATE);
+   if (intr_handle == NULL) {
+   RTE_LOG(ERR, EAL, "Fail to allocate intr_handle\n");
+   goto exit;
+   }
+
+   if (rte_intr_type_set(intr_handle, RTE_INTR_HANDLE_DEV_EVENT))
+   goto exit;
+
+   if (rte_intr_fd_set(intr_handle, -1))
+   goto exit;
+
ret = dev_uev_socket_fd_create();
if (ret) {
RTE_LOG(ERR, EAL, "error create device event fd.\n");
goto exit;
}
 
-   ret = rte_intr_callback_register(&intr_handle, dev_uev_handler, NULL);
+   ret = rte_intr_callback_register(intr_handle, dev_uev_handler, NULL);
 
if (ret) {
-   RTE_LOG(ERR, EAL, "fail to register uevent callback.\n");
-   close(intr_handle.fd);
-   int

[dpdk-dev] [PATCH v7 7/9] interrupts: make interrupt handle structure opaque

2021-10-25 Thread David Marchand
From: Harman Kalra 

Move the interrupt handle structure definition into an EAL private
header to make its fields totally opaque to the outside world.
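
As a rough sketch of what "opaque" means here: callers only see an
incomplete type plus functions, while the full definition lives where
only the library can include it. The names below (demo_handle and its
helpers) are hypothetical and kept in a single file for brevity; they
are not the EAL symbols.

```c
#include <assert.h>
#include <stdlib.h>

/* Public view (what an installed header would carry): incomplete type. */
struct demo_handle;
struct demo_handle *demo_handle_alloc(void);
void demo_handle_free(struct demo_handle *h);
int demo_handle_fd_get(const struct demo_handle *h);

/* Private view (library-internal header): the full definition. */
struct demo_handle {
	int fd;
	int type;
};

/* Allocate a zeroed handle with an invalid fd; NULL on failure. */
struct demo_handle *demo_handle_alloc(void)
{
	struct demo_handle *h = calloc(1, sizeof(*h));

	if (h != NULL)
		h->fd = -1;
	return h;
}

void demo_handle_free(struct demo_handle *h)
{
	free(h);
}

/* Read the fd through the accessor; -1 if the handle is NULL. */
int demo_handle_fd_get(const struct demo_handle *h)
{
	return h != NULL ? h->fd : -1;
}

/* Usage: callers only ever hold a pointer and go through functions. */
void demo_opaque_usage(void)
{
	struct demo_handle *h = demo_handle_alloc();

	assert(h != NULL);
	assert(demo_handle_fd_get(h) == -1);
	demo_handle_free(h);
}
```

With this split, fields can be added, resized or reordered in the
private definition without changing what external code was compiled
against.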

Signed-off-by: Harman Kalra 
Signed-off-by: David Marchand 
---
Changes since v5:
- left rte_intr_handle fields untouched:
  - split vfio / uio fd renames in a separate commit,
  - split event list update in a separate commit,
- moved rte_intr_handle definition to an EAL private header,
- preserved dumping all info in interrupt tracepoints,

---
 lib/eal/common/eal_common_interrupts.c |  2 +
 lib/eal/common/eal_interrupts.h| 37 +
 lib/eal/include/meson.build|  1 -
 lib/eal/include/rte_eal_interrupts.h   | 72 --
 lib/eal/include/rte_eal_trace.h|  2 +
 lib/eal/include/rte_interrupts.h   | 24 -
 6 files changed, 63 insertions(+), 75 deletions(-)
 create mode 100644 lib/eal/common/eal_interrupts.h
 delete mode 100644 lib/eal/include/rte_eal_interrupts.h

diff --git a/lib/eal/common/eal_common_interrupts.c b/lib/eal/common/eal_common_interrupts.c
index d6e6654fbb..1337c560e4 100644
--- a/lib/eal/common/eal_common_interrupts.c
+++ b/lib/eal/common/eal_common_interrupts.c
@@ -10,6 +10,8 @@
 #include 
 #include 
 
+#include "eal_interrupts.h"
+
 /* Macros to check for valid interrupt handle */
 #define CHECK_VALID_INTR_HANDLE(intr_handle) do { \
if (intr_handle == NULL) { \
diff --git a/lib/eal/common/eal_interrupts.h b/lib/eal/common/eal_interrupts.h
new file mode 100644
index 00..beacc04b62
--- /dev/null
+++ b/lib/eal/common/eal_interrupts.h
@@ -0,0 +1,37 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2010-2014 Intel Corporation
+ */
+
+#ifndef EAL_INTERRUPTS_H
+#define EAL_INTERRUPTS_H
+
+struct rte_intr_handle {
+   RTE_STD_C11
+   union {
+   struct {
+   RTE_STD_C11
+   union {
+   /** VFIO device file descriptor */
+   int vfio_dev_fd;
+   /** UIO cfg file desc for uio_pci_generic */
+   int uio_cfg_fd;
+   };
+   int fd; /**< interrupt event file descriptor */
+   };
+   void *windows_handle; /**< device driver handle */
+   };
+   uint32_t alloc_flags;   /**< flags passed at allocation */
+   enum rte_intr_handle_type type;  /**< handle type */
+   uint32_t max_intr; /**< max interrupt requested */
+   uint32_t nb_efd;   /**< number of available efd(event fd) */
+   uint8_t efd_counter_size;  /**< size of efd counter, used for vdev */
+   uint16_t nb_intr;
+   /**< Max vector count, default RTE_MAX_RXTX_INTR_VEC_ID */
+   int efds[RTE_MAX_RXTX_INTR_VEC_ID];  /**< intr vectors/efds mapping */
+   struct rte_epoll_event elist[RTE_MAX_RXTX_INTR_VEC_ID];
+  /**< intr vector epoll event */
+   uint16_t vec_list_size;
+   int *intr_vec; /**< intr vector number array */
+};
+
+#endif /* EAL_INTERRUPTS_H */
diff --git a/lib/eal/include/meson.build b/lib/eal/include/meson.build
index 8e258607b8..86468d1a2b 100644
--- a/lib/eal/include/meson.build
+++ b/lib/eal/include/meson.build
@@ -49,7 +49,6 @@ headers += files(
 'rte_version.h',
 'rte_vfio.h',
 )
-indirect_headers += files('rte_eal_interrupts.h')
 
 # special case install the generic headers, since they go in a subdir
 generic_headers = files(
diff --git a/lib/eal/include/rte_eal_interrupts.h b/lib/eal/include/rte_eal_interrupts.h
deleted file mode 100644
index 60bb60ca59..00
--- a/lib/eal/include/rte_eal_interrupts.h
+++ /dev/null
@@ -1,72 +0,0 @@
-/* SPDX-License-Identifier: BSD-3-Clause
- * Copyright(c) 2010-2014 Intel Corporation
- */
-
-#ifndef _RTE_INTERRUPTS_H_
-#error "don't include this file directly, please include generic 
"
-#endif
-
-/**
- * @file rte_eal_interrupts.h
- * @internal
- *
- * Contains function prototypes exposed by the EAL for interrupt handling by
- * drivers and other DPDK internal consumers.
- */
-
-#ifndef _RTE_EAL_INTERRUPTS_H_
-#define _RTE_EAL_INTERRUPTS_H_
-
-#define RTE_MAX_RXTX_INTR_VEC_ID  512
-#define RTE_INTR_VEC_ZERO_OFFSET  0
-#define RTE_INTR_VEC_RXTX_OFFSET  1
-
-/**
- * The interrupt source type, e.g. UIO, VFIO, ALARM etc.
- */
-enum rte_intr_handle_type {
-   RTE_INTR_HANDLE_UNKNOWN = 0,  /**< generic unknown handle */
-   RTE_INTR_HANDLE_UIO,  /**< uio device handle */
-   RTE_INTR_HANDLE_UIO_INTX, /**< uio generic handle */
-   RTE_INTR_HANDLE_VFIO_LEGACY,  /**< vfio device handle (legacy) */
-   RTE_INTR_HANDLE_VFIO_MSI, /**< vfio device handle (MSI) */
-   RTE_INTR_HANDLE_VFIO_MSIX,/**< vfio device handle (MSIX) */
-   RTE_INTR_HANDLE_ALARM,/**< alarm handle */
-   RTE_INTR_HANDLE_EXT,  /**

[dpdk-dev] [PATCH v7 8/9] interrupts: rename device specific file descriptor

2021-10-25 Thread David Marchand
From: Harman Kalra 

VFIO and UIO are mutually exclusive, so storing the device file
descriptor in a single field is enough.
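
A small sketch of why the rename is layout-neutral: two int members of
a union already share the same storage, so replacing the union with one
int keeps the same size and offsets. The struct names below are
hypothetical reductions of the real structure, kept to the two fd
fields only.

```c
#include <assert.h>
#include <stddef.h>

/* Reduced "before" layout: two names aliasing one int. */
struct demo_handle_before {
	union {
		int vfio_dev_fd;
		int uio_cfg_fd;
	};
	int fd;
};

/* Reduced "after" layout: a single device fd field. */
struct demo_handle_after {
	int dev_fd;
	int fd;
};

static_assert(sizeof(struct demo_handle_before) ==
	sizeof(struct demo_handle_after), "size is unchanged");
static_assert(offsetof(struct demo_handle_before, fd) ==
	offsetof(struct demo_handle_after, fd), "fd offset is unchanged");

/* Write through one union member and read it back through the other. */
int demo_union_alias(int fd)
{
	struct demo_handle_before h;

	h.vfio_dev_fd = fd;
	h.fd = -1;
	return h.uio_cfg_fd;
}
```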

Signed-off-by: Harman Kalra 
Signed-off-by: David Marchand 
---
Changes since v5:
- split from patch5,

---
 lib/eal/common/eal_common_interrupts.c | 6 +++---
 lib/eal/common/eal_interrupts.h| 8 +---
 lib/eal/include/rte_eal_trace.h| 8 
 3 files changed, 8 insertions(+), 14 deletions(-)

diff --git a/lib/eal/common/eal_common_interrupts.c b/lib/eal/common/eal_common_interrupts.c
index 1337c560e4..3285c4335f 100644
--- a/lib/eal/common/eal_common_interrupts.c
+++ b/lib/eal/common/eal_common_interrupts.c
@@ -72,7 +72,7 @@ struct rte_intr_handle *rte_intr_instance_dup(const struct rte_intr_handle *src)
intr_handle = rte_intr_instance_alloc(src->alloc_flags);
 
intr_handle->fd = src->fd;
-   intr_handle->vfio_dev_fd = src->vfio_dev_fd;
+   intr_handle->dev_fd = src->dev_fd;
intr_handle->type = src->type;
intr_handle->max_intr = src->max_intr;
intr_handle->nb_efd = src->nb_efd;
@@ -139,7 +139,7 @@ int rte_intr_dev_fd_set(struct rte_intr_handle *intr_handle, int fd)
 {
CHECK_VALID_INTR_HANDLE(intr_handle);
 
-   intr_handle->vfio_dev_fd = fd;
+   intr_handle->dev_fd = fd;
 
return 0;
 fail:
@@ -150,7 +150,7 @@ int rte_intr_dev_fd_get(const struct rte_intr_handle *intr_handle)
 {
CHECK_VALID_INTR_HANDLE(intr_handle);
 
-   return intr_handle->vfio_dev_fd;
+   return intr_handle->dev_fd;
 fail:
return -1;
 }
diff --git a/lib/eal/common/eal_interrupts.h b/lib/eal/common/eal_interrupts.h
index beacc04b62..1a4e5573b2 100644
--- a/lib/eal/common/eal_interrupts.h
+++ b/lib/eal/common/eal_interrupts.h
@@ -9,13 +9,7 @@ struct rte_intr_handle {
RTE_STD_C11
union {
struct {
-   RTE_STD_C11
-   union {
-   /** VFIO device file descriptor */
-   int vfio_dev_fd;
-   /** UIO cfg file desc for uio_pci_generic */
-   int uio_cfg_fd;
-   };
+   int dev_fd; /**< VFIO/UIO cfg device file descriptor */
int fd; /**< interrupt event file descriptor */
};
void *windows_handle; /**< device driver handle */
diff --git a/lib/eal/include/rte_eal_trace.h b/lib/eal/include/rte_eal_trace.h
index af7b2d0bf0..5ef4398230 100644
--- a/lib/eal/include/rte_eal_trace.h
+++ b/lib/eal/include/rte_eal_trace.h
@@ -151,7 +151,7 @@ RTE_TRACE_POINT(
RTE_TRACE_POINT_ARGS(const struct rte_intr_handle *handle,
rte_intr_callback_fn cb, void *cb_arg, int rc),
rte_trace_point_emit_int(rc);
-   rte_trace_point_emit_int(handle->vfio_dev_fd);
+   rte_trace_point_emit_int(handle->dev_fd);
rte_trace_point_emit_int(handle->fd);
rte_trace_point_emit_int(handle->type);
rte_trace_point_emit_u32(handle->max_intr);
@@ -164,7 +164,7 @@ RTE_TRACE_POINT(
RTE_TRACE_POINT_ARGS(const struct rte_intr_handle *handle,
rte_intr_callback_fn cb, void *cb_arg, int rc),
rte_trace_point_emit_int(rc);
-   rte_trace_point_emit_int(handle->vfio_dev_fd);
+   rte_trace_point_emit_int(handle->dev_fd);
rte_trace_point_emit_int(handle->fd);
rte_trace_point_emit_int(handle->type);
rte_trace_point_emit_u32(handle->max_intr);
@@ -176,7 +176,7 @@ RTE_TRACE_POINT(
rte_eal_trace_intr_enable,
RTE_TRACE_POINT_ARGS(const struct rte_intr_handle *handle, int rc),
rte_trace_point_emit_int(rc);
-   rte_trace_point_emit_int(handle->vfio_dev_fd);
+   rte_trace_point_emit_int(handle->dev_fd);
rte_trace_point_emit_int(handle->fd);
rte_trace_point_emit_int(handle->type);
rte_trace_point_emit_u32(handle->max_intr);
@@ -186,7 +186,7 @@ RTE_TRACE_POINT(
rte_eal_trace_intr_disable,
RTE_TRACE_POINT_ARGS(const struct rte_intr_handle *handle, int rc),
rte_trace_point_emit_int(rc);
-   rte_trace_point_emit_int(handle->vfio_dev_fd);
+   rte_trace_point_emit_int(handle->dev_fd);
rte_trace_point_emit_int(handle->fd);
rte_trace_point_emit_int(handle->type);
rte_trace_point_emit_u32(handle->max_intr);
-- 
2.23.0



[dpdk-dev] [PATCH v7 9/9] interrupts: extend event list

2021-10-25 Thread David Marchand
From: Harman Kalra 

Dynamically allocate the efds and elist arrays of the intr_handle
structure, based on the size provided by the user. E.g. the size can be
the number of MSI-X interrupts supported by a PCI device.
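
The resizing idea can be sketched with plain realloc() as below. This
is an illustration, not the patch's code: the demo_* names are
hypothetical, only the efds array is modelled, errno replaces
rte_errno, and the sketch rejects any non-positive size. The point is
the temporary pointer, which keeps the old array usable if the
reallocation fails.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical reduced handle holding only the event fd array. */
struct demo_handle {
	int *efds;   /* heap-allocated event fd array */
	int nb_intr; /* number of entries in efds */
};

/* Resize h->efds to size entries; on failure the old array is kept. */
int demo_event_list_update(struct demo_handle *h, int size)
{
	int *tmp;

	if (h == NULL || size <= 0) {
		errno = EINVAL;
		return -1;
	}

	tmp = realloc(h->efds, (size_t)size * sizeof(*tmp));
	if (tmp == NULL) {
		errno = ENOMEM;
		return -1;
	}

	h->efds = tmp;
	h->nb_intr = size;
	return 0;
}

/* Usage: grow from 512 to 2048 entries; existing entries survive. */
void demo_event_list_usage(void)
{
	struct demo_handle h = { .efds = NULL, .nb_intr = 0 };

	assert(demo_event_list_update(&h, 512) == 0);
	h.efds[511] = 42;
	assert(demo_event_list_update(&h, 2048) == 0);
	assert(h.efds[511] == 42 && h.nb_intr == 2048);
	assert(demo_event_list_update(&h, 0) == -1 && h.nb_intr == 2048);
	free(h.efds);
}
```

Entries added by a grow are left uninitialized by realloc(), so a
caller has to fill them before reading them.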

Signed-off-by: Harman Kalra 
Signed-off-by: David Marchand 
Acked-by: Dmitry Kozlyuk 
---
Changes since v6:
- removed unneeded checks on elist/efds array initialisation,

Changes since v5:
- split from patch5,

---
 drivers/bus/pci/linux/pci_vfio.c   |  6 ++
 drivers/common/cnxk/roc_platform.h |  1 +
 lib/eal/common/eal_common_interrupts.c | 95 +-
 lib/eal/common/eal_interrupts.h|  5 +-
 4 files changed, 102 insertions(+), 5 deletions(-)

diff --git a/drivers/bus/pci/linux/pci_vfio.c b/drivers/bus/pci/linux/pci_vfio.c
index 7b2f8296c5..f622e7f8e6 100644
--- a/drivers/bus/pci/linux/pci_vfio.c
+++ b/drivers/bus/pci/linux/pci_vfio.c
@@ -266,6 +266,12 @@ pci_vfio_setup_interrupts(struct rte_pci_device *dev, int vfio_dev_fd)
return -1;
}
 
+   /* Reallocate the efds and elist fields of intr_handle based
+* on PCI device MSIX size.
+*/
+   if (rte_intr_event_list_update(dev->intr_handle, irq.count))
+   return -1;
+
 /* if this vector cannot be used with eventfd, fail if we explicitly
 * specified interrupt type, otherwise continue */
if ((irq.flags & VFIO_IRQ_INFO_EVENTFD) == 0) {
diff --git a/drivers/common/cnxk/roc_platform.h b/drivers/common/cnxk/roc_platform.h
index 60227b72d0..5da23fe5f8 100644
--- a/drivers/common/cnxk/roc_platform.h
+++ b/drivers/common/cnxk/roc_platform.h
@@ -121,6 +121,7 @@
 #define plt_intr_instance_allocrte_intr_instance_alloc
 #define plt_intr_instance_dup  rte_intr_instance_dup
 #define plt_intr_instance_free rte_intr_instance_free
+#define plt_intr_event_list_update rte_intr_event_list_update
 #define plt_intr_max_intr_get  rte_intr_max_intr_get
 #define plt_intr_max_intr_set  rte_intr_max_intr_set
 #define plt_intr_nb_efd_getrte_intr_nb_efd_get
diff --git a/lib/eal/common/eal_common_interrupts.c b/lib/eal/common/eal_common_interrupts.c
index 3285c4335f..636bbfce72 100644
--- a/lib/eal/common/eal_common_interrupts.c
+++ b/lib/eal/common/eal_common_interrupts.c
@@ -53,10 +53,46 @@ struct rte_intr_handle *rte_intr_instance_alloc(uint32_t flags)
return NULL;
}
 
+   if (uses_rte_memory) {
+   intr_handle->efds = rte_zmalloc(NULL,
+   RTE_MAX_RXTX_INTR_VEC_ID * sizeof(int), 0);
+   } else {
+   intr_handle->efds = calloc(RTE_MAX_RXTX_INTR_VEC_ID,
+   sizeof(int));
+   }
+   if (intr_handle->efds == NULL) {
+   RTE_LOG(ERR, EAL, "Fail to allocate event fd list\n");
+   rte_errno = ENOMEM;
+   goto fail;
+   }
+
+   if (uses_rte_memory) {
+   intr_handle->elist = rte_zmalloc(NULL,
+   RTE_MAX_RXTX_INTR_VEC_ID * sizeof(struct rte_epoll_event),
+   0);
+   } else {
+   intr_handle->elist = calloc(RTE_MAX_RXTX_INTR_VEC_ID,
+   sizeof(struct rte_epoll_event));
+   }
+   if (intr_handle->elist == NULL) {
+   RTE_LOG(ERR, EAL, "fail to allocate event fd list\n");
+   rte_errno = ENOMEM;
+   goto fail;
+   }
+
intr_handle->alloc_flags = flags;
intr_handle->nb_intr = RTE_MAX_RXTX_INTR_VEC_ID;
 
return intr_handle;
+fail:
+   if (uses_rte_memory) {
+   rte_free(intr_handle->efds);
+   rte_free(intr_handle);
+   } else {
+   free(intr_handle->efds);
+   free(intr_handle);
+   }
+   return NULL;
 }
 
 struct rte_intr_handle *rte_intr_instance_dup(const struct rte_intr_handle *src)
@@ -83,14 +119,69 @@ struct rte_intr_handle *rte_intr_instance_dup(const struct rte_intr_handle *src)
return intr_handle;
 }
 
+int rte_intr_event_list_update(struct rte_intr_handle *intr_handle, int size)
+{
+   struct rte_epoll_event *tmp_elist;
+   bool uses_rte_memory;
+   int *tmp_efds;
+
+   CHECK_VALID_INTR_HANDLE(intr_handle);
+
+   if (size == 0) {
+   RTE_LOG(ERR, EAL, "Size can't be zero\n");
+   rte_errno = EINVAL;
+   goto fail;
+   }
+
+   uses_rte_memory =
+   RTE_INTR_INSTANCE_USES_RTE_MEMORY(intr_handle->alloc_flags);
+   if (uses_rte_memory) {
+   tmp_efds = rte_realloc(intr_handle->efds, size * sizeof(int),
+   0);
+   } else {
+   tmp_efds = realloc(intr_handle->efds, size * sizeof(int));
+   }
+   if (tmp_efds == NULL) {
+   RTE_LOG(ERR, EAL, "Failed to realloc the efds list\n");
+   rte_errno = ENOMEM;
+  

[dpdk-dev] [PATCH v8 0/9] make rte_intr_handle internal

2021-10-25 Thread David Marchand
Move struct rte_intr_handle to an internal structure to avoid any ABI
breakage in the future. This structure defines some static arrays, and
changing the respective macros breaks the ABI.
E.g.:
Currently RTE_MAX_RXTX_INTR_VEC_ID imposes a limit of 512 MSI-X
interrupts that can be defined for a PCI device, while the PCI
specification allows up to 2048 MSI-X interrupts.
If some PCI device requires more than 512 vectors, either change the
RTE_MAX_RXTX_INTR_VEC_ID limit or dynamically allocate based on the
PCI device MSI-X size at probe time. Either way it's an ABI breakage.
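
To make the ABI point concrete, here is a small sketch showing how a
fixed-size array ties the structure size to the macro value. The demo_*
structs and macros are hypothetical reductions, not the real
definitions; only the 512 and 2048 limits come from the text above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the vector limit before and after a bump. */
#define DEMO_MAX_VEC_OLD 512
#define DEMO_MAX_VEC_NEW 2048

struct demo_handle_old {
	int fd;
	int efds[DEMO_MAX_VEC_OLD];
};

struct demo_handle_new {
	int fd;
	int efds[DEMO_MAX_VEC_NEW];
};

/* Code built against the old size would under-allocate the new struct. */
static_assert(sizeof(struct demo_handle_new) >
	sizeof(struct demo_handle_old), "raising the limit grows the struct");

/* Byte growth of the structure caused only by raising the macro. */
size_t demo_size_growth(void)
{
	return sizeof(struct demo_handle_new) - sizeof(struct demo_handle_old);
}
```

An application that embeds or allocates the structure itself bakes the
old size in at compile time, which is why hiding the definition behind
an allocator removes the problem.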

Change already included in 21.11 ABI improvement spreadsheet (item 42):
https://docs.google.com/spreadsheets/d/1betlC000ua5SsSiJIcC54mCCCJnW6voH5Dqv9UxeyfE/edit#gid=0

This series makes struct rte_intr_handle totally opaque to the outside
world by wrapping it inside a .c file and providing get/set wrapper APIs
to read or manipulate its fields. Any change to any of the fields
should be done via these get/set APIs.
A new eal_common_interrupts.c is introduced, where all these APIs are
defined and which also hides the struct rte_intr_handle definition.

v1:
* Fixed freebsd compilation failure
* Fixed seg fault in case of memif

v2:
* Merged the prototype and implementation patch to 1.
* Restricting allocation of single interrupt instance.
* Removed base APIs, as they were exposing internally
allocated memory information.
* Fixed some memory leak issues.
* Marked some library specific APIs as internal.

v3:
* Removed flag from instance alloc API, rather auto detect
if memory should be allocated using glibc malloc APIs or
rte_malloc*
* Added APIs for get/set windows handle.
* Defined macros for repeated checks.

v4:
* Rectified some typos in the API documentation.
* Better names for some internal variables.

v5:
* Reverted back to passing flag to instance alloc API, as
with auto detect some multiprocess issues existing in the
library were causing tests failure.
* Rebased to top of tree.

v6:
* renamed RTE_INTR_INSTANCE_F_UNSHARED as RTE_INTR_INSTANCE_F_PRIVATE,
* changed API and removed need for alloc_flag content exposure
  (see rte_intr_instance_dup() in patch 1 and 2),
* exported all symbols for Windows,
* fixed leak in unit tests in case of alloc failure,
* split (previously) patch 4 into three patches
  * (now) patch 4 only concerns alarm and (previously) patch 6 cleanup bits
are squashed in it,
  * (now) patch 5 concerns other libraries updates,
  * (now) patch 6 concerns drivers updates:
* instance allocation is moved to probing for auxiliary,
* there might be a bug for PCI drivers not requesting
  RTE_PCI_DRV_NEED_MAPPING, but code is left as v5,
* split (previously) patch 5 into three patches
  * (now) patch 7 only hides the structure, but keeps it in an EAL private
header, this makes it possible to keep info in tracepoints,
  * (now) patch 8 deals with VFIO/UIO internal fds merge,
  * (now) patch 9 extends event list,

v7:
* fixed compilation on FreeBSD,
* removed unused interrupt handle in FreeBSD alarm code,
* fixed interrupt handle allocation for PCI drivers without
  RTE_PCI_DRV_NEED_MAPPING,

v8:
* lowered logs level to DEBUG in sanity checks,
* fixed corner case with vector list access,

-- 
David Marchand

Harman Kalra (9):
  interrupts: add allocator and accessors
  interrupts: remove direct access to interrupt handle
  test/interrupts: remove direct access to interrupt handle
  alarm: remove direct access to interrupt handle
  lib: remove direct access to interrupt handle
  drivers: remove direct access to interrupt handle
  interrupts: make interrupt handle structure opaque
  interrupts: rename device specific file descriptor
  interrupts: extend event list

 MAINTAINERS   |   1 +
 app/test/test_interrupts.c| 164 +++--
 drivers/baseband/acc100/rte_acc100_pmd.c  |  14 +-
 .../fpga_5gnr_fec/rte_fpga_5gnr_fec.c |  24 +-
 drivers/baseband/fpga_lte_fec/fpga_lte_fec.c  |  24 +-
 drivers/bus/auxiliary/auxiliary_common.c  |  17 +-
 drivers/bus/auxiliary/rte_bus_auxiliary.h |   2 +-
 drivers/bus/dpaa/dpaa_bus.c   |  28 +-
 drivers/bus/dpaa/rte_dpaa_bus.h   |   2 +-
 drivers/bus/fslmc/fslmc_bus.c |  14 +-
 drivers/bus/fslmc/fslmc_vfio.c|  30 +-
 drivers/bus/fslmc/portal/dpaa2_hw_dpio.c  |  18 +-
 drivers/bus/fslmc/portal/dpaa2_hw_pvt.h   |   2 +-
 drivers/bus/fslmc/rte_fslmc.h |   2 +-
 drivers/bus/ifpga/ifpga_bus.c |  13 +-
 drivers/bus/ifpga/rte_bus_ifpga.h |   2 +-
 drivers/bus/pci/bsd/pci.c |  20 +-
 drivers/bus/pci/linux/pci.c   |   4 +-
 drivers/bus/pci/linux/p

[dpdk-dev] [PATCH v8 1/9] interrupts: add allocator and accessors

2021-10-25 Thread David Marchand
From: Harman Kalra 

Prototype and implement get/set APIs for interrupt handle fields.
Users won't be able to access any of the interrupt handle fields
directly, and should use these get/set APIs to access or manipulate
them.

The internal interrupt header, rte_eal_interrupts.h, is rearranged:
the APIs it defined are moved to rte_interrupts.h and epoll specific
definitions are moved to a new header, rte_epoll.h.
Later in the series rte_eal_interrupts.h will be removed.
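
The accessors in this patch validate the handle through a macro that
jumps to a common "fail" label. A minimal standalone sketch of that
pattern follows; the demo_* names are hypothetical, the struct is
reduced to one field, and errno is used where the patch sets rte_errno.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical miniature of an interrupt handle; not the DPDK struct. */
struct demo_intr_handle {
	int fd;
};

/* Reject a NULL handle: record EINVAL and jump to the caller's label. */
#define DEMO_CHECK_VALID_HANDLE(h) do { \
	if ((h) == NULL) { \
		errno = EINVAL; \
		goto fail; \
	} \
} while (0)

/* Store fd in the handle; 0 on success, -1 if the handle is NULL. */
int demo_intr_fd_set(struct demo_intr_handle *h, int fd)
{
	DEMO_CHECK_VALID_HANDLE(h);

	h->fd = fd;

	return 0;
fail:
	return -1;
}

/* Return the stored fd, or -1 if the handle is NULL. */
int demo_intr_fd_get(const struct demo_intr_handle *h)
{
	DEMO_CHECK_VALID_HANDLE(h);

	return h->fd;
fail:
	return -1;
}

/* Usage: round-trip a value, then exercise the NULL check. */
void demo_accessor_usage(void)
{
	struct demo_intr_handle h = { .fd = -1 };

	assert(demo_intr_fd_set(&h, 7) == 0);
	assert(demo_intr_fd_get(&h) == 7);
	assert(demo_intr_fd_set(NULL, 7) == -1 && errno == EINVAL);
}
```

One consequence of this shape is that a getter returning -1 cannot
distinguish "invalid handle" from "fd is -1" by return value alone; the
error code has to be checked as well.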

Signed-off-by: Harman Kalra 
Acked-by: Ray Kinsella 
Acked-by: Dmitry Kozlyuk 
Signed-off-by: David Marchand 
---
Changes since v7:
- lowered checks log level to DEBUG,
- removed asserts on vector list size, and fixed check on list size
  for drivers like mlx5 that expect the list not to be initialized,

Changes since v5:
- renamed RTE_INTR_INSTANCE_F_UNSHARED as RTE_INTR_INSTANCE_F_PRIVATE,
- used a single bit to mark instance as shared (default is private),
- removed rte_intr_instance_copy / rte_intr_instance_alloc_flag_get
  with a single rte_intr_instance_dup helper,
- made rte_intr_vec_list_alloc alloc_flags-aware,
- exported all symbols for Windows,

---
 MAINTAINERS|   1 +
 lib/eal/common/eal_common_interrupts.c | 407 
 lib/eal/common/meson.build |   1 +
 lib/eal/include/meson.build|   1 +
 lib/eal/include/rte_eal_interrupts.h   | 207 +---
 lib/eal/include/rte_epoll.h| 118 +
 lib/eal/include/rte_interrupts.h   | 627 +
 lib/eal/version.map|  45 +-
 8 files changed, 1197 insertions(+), 210 deletions(-)
 create mode 100644 lib/eal/common/eal_common_interrupts.c
 create mode 100644 lib/eal/include/rte_epoll.h

diff --git a/MAINTAINERS b/MAINTAINERS
index 587632dce0..097a57f7f6 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -211,6 +211,7 @@ F: app/test/test_memzone.c
 
 Interrupt Subsystem
 M: Harman Kalra 
+F: lib/eal/include/rte_epoll.h
 F: lib/eal/*/*interrupts.*
 F: app/test/test_interrupts.c
 
diff --git a/lib/eal/common/eal_common_interrupts.c b/lib/eal/common/eal_common_interrupts.c
new file mode 100644
index 00..46064870f4
--- /dev/null
+++ b/lib/eal/common/eal_common_interrupts.c
@@ -0,0 +1,407 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(C) 2021 Marvell.
+ */
+
+#include 
+#include 
+
+#include 
+#include 
+#include 
+#include 
+
+/* Macros to check for valid interrupt handle */
+#define CHECK_VALID_INTR_HANDLE(intr_handle) do { \
+   if (intr_handle == NULL) { \
+   RTE_LOG(DEBUG, EAL, "Interrupt instance unallocated\n"); \
+   rte_errno = EINVAL; \
+   goto fail; \
+   } \
+} while (0)
+
+#define RTE_INTR_INSTANCE_KNOWN_FLAGS (RTE_INTR_INSTANCE_F_PRIVATE \
+   | RTE_INTR_INSTANCE_F_SHARED \
+   )
+
+#define RTE_INTR_INSTANCE_USES_RTE_MEMORY(flags) \
+   !!(flags & RTE_INTR_INSTANCE_F_SHARED)
+
+struct rte_intr_handle *rte_intr_instance_alloc(uint32_t flags)
+{
+   struct rte_intr_handle *intr_handle;
+   bool uses_rte_memory;
+
+   /* Check the flag passed by user, it should be part of the
+* defined flags.
+*/
+   if ((flags & ~RTE_INTR_INSTANCE_KNOWN_FLAGS) != 0) {
+   RTE_LOG(DEBUG, EAL, "Invalid alloc flag passed 0x%x\n", flags);
+   rte_errno = EINVAL;
+   return NULL;
+   }
+
+   uses_rte_memory = RTE_INTR_INSTANCE_USES_RTE_MEMORY(flags);
+   if (uses_rte_memory)
+   intr_handle = rte_zmalloc(NULL, sizeof(*intr_handle), 0);
+   else
+   intr_handle = calloc(1, sizeof(*intr_handle));
+   if (intr_handle == NULL) {
+   RTE_LOG(ERR, EAL, "Failed to allocate intr_handle\n");
+   rte_errno = ENOMEM;
+   return NULL;
+   }
+
+   intr_handle->alloc_flags = flags;
+   intr_handle->nb_intr = RTE_MAX_RXTX_INTR_VEC_ID;
+
+   return intr_handle;
+}
+
+struct rte_intr_handle *rte_intr_instance_dup(const struct rte_intr_handle *src)
+{
+   struct rte_intr_handle *intr_handle;
+
+   if (src == NULL) {
+   RTE_LOG(DEBUG, EAL, "Source interrupt instance unallocated\n");
+   rte_errno = EINVAL;
+   return NULL;
+   }
+
+   intr_handle = rte_intr_instance_alloc(src->alloc_flags);
+
+   intr_handle->fd = src->fd;
+   intr_handle->vfio_dev_fd = src->vfio_dev_fd;
+   intr_handle->type = src->type;
+   intr_handle->max_intr = src->max_intr;
+   intr_handle->nb_efd = src->nb_efd;
+   intr_handle->efd_counter_size = src->efd_counter_size;
+   memcpy(intr_handle->efds, src->efds, src->nb_intr);
+   memcpy(intr_handle->elist, src->elist, src->nb_intr);
+
+   return intr_handle;
+}
+
+void rte_intr_instance_free(struct rte_intr_handle *intr_handle)
+{
+   if (intr_handle == NULL)
+   return;
+   if (RTE_INTR_INSTANCE_USES_RTE_MEMORY(intr_handle->alloc_flags))
+   rte_free(intr_han

[dpdk-dev] [PATCH v8 2/9] interrupts: remove direct access to interrupt handle

2021-10-25 Thread David Marchand
From: Harman Kalra 

Update the interrupt framework to read and write interrupt handle
fields through the get/set APIs rather than accessing them directly.
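
The accessor pattern this patch moves to can be sketched as below. This
is only an illustration: the demo_intr_* names and the struct are
hypothetical stand-ins, not the DPDK definitions, and the choice of
EINVAL is an assumption.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the interrupt handle; not the DPDK struct. */
struct demo_intr_handle {
	int fd;
	int type;
};

/* Setter: reject a NULL handle instead of dereferencing it. */
int demo_intr_fd_set(struct demo_intr_handle *h, int fd)
{
	if (h == NULL) {
		errno = EINVAL;
		return -1;
	}
	h->fd = fd;
	return 0;
}

/* Getter: returns -1 for a NULL handle, so a caller can fold the NULL
 * check and the "fd < 0" check into a single comparison.
 */
int demo_intr_fd_get(const struct demo_intr_handle *h)
{
	if (h == NULL) {
		errno = EINVAL;
		return -1;
	}
	return h->fd;
}
```

With a getter of this shape, a check such as
"intr_handle == NULL || intr_handle->fd < 0" can shrink to a single
"fd_get(intr_handle) < 0" test, which matches the form the
rte_intr_callback_register() hunk in this patch takes.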

Signed-off-by: Harman Kalra 
Signed-off-by: David Marchand 
---
Changes since v6:
- fixed compilation on FreeBSD,

Changes since v5:
- used new helper rte_intr_instance_dup,

---
 lib/eal/freebsd/eal_interrupts.c |  85 +
 lib/eal/linux/eal_interrupts.c   | 304 +--
 2 files changed, 219 insertions(+), 170 deletions(-)

diff --git a/lib/eal/freebsd/eal_interrupts.c b/lib/eal/freebsd/eal_interrupts.c
index 86810845fe..10aa91cc09 100644
--- a/lib/eal/freebsd/eal_interrupts.c
+++ b/lib/eal/freebsd/eal_interrupts.c
@@ -40,7 +40,7 @@ struct rte_intr_callback {
 
 struct rte_intr_source {
TAILQ_ENTRY(rte_intr_source) next;
-   struct rte_intr_handle intr_handle; /**< interrupt handle */
+   struct rte_intr_handle *intr_handle; /**< interrupt handle */
struct rte_intr_cb_list callbacks;  /**< user callbacks */
uint32_t active;
 };
@@ -60,7 +60,7 @@ static int
 intr_source_to_kevent(const struct rte_intr_handle *ih, struct kevent *ke)
 {
/* alarm callbacks are special case */
-   if (ih->type == RTE_INTR_HANDLE_ALARM) {
+   if (rte_intr_type_get(ih) == RTE_INTR_HANDLE_ALARM) {
uint64_t timeout_ns;
 
/* get soonest alarm timeout */
@@ -75,7 +75,7 @@ intr_source_to_kevent(const struct rte_intr_handle *ih, struct kevent *ke)
} else {
ke->filter = EVFILT_READ;
}
-   ke->ident = ih->fd;
+   ke->ident = rte_intr_fd_get(ih);
 
return 0;
 }
@@ -89,7 +89,7 @@ rte_intr_callback_register(const struct rte_intr_handle *intr_handle,
int ret = 0, add_event = 0;
 
/* first do parameter checking */
-   if (intr_handle == NULL || intr_handle->fd < 0 || cb == NULL) {
+   if (rte_intr_fd_get(intr_handle) < 0 || cb == NULL) {
RTE_LOG(ERR, EAL,
"Registering with invalid input parameter\n");
return -EINVAL;
@@ -103,7 +103,7 @@ rte_intr_callback_register(const struct rte_intr_handle *intr_handle,
 
/* find the source for this intr_handle */
TAILQ_FOREACH(src, &intr_sources, next) {
-   if (src->intr_handle.fd == intr_handle->fd)
+   if (rte_intr_fd_get(src->intr_handle) == rte_intr_fd_get(intr_handle))
break;
}
 
@@ -112,8 +112,9 @@ rte_intr_callback_register(const struct rte_intr_handle *intr_handle,
 * thing on the list should be eal_alarm_callback() and we may
 * be called just to reset the timer.
 */
-   if (src != NULL && src->intr_handle.type == RTE_INTR_HANDLE_ALARM &&
-!TAILQ_EMPTY(&src->callbacks)) {
+   if (src != NULL &&
+   rte_intr_type_get(src->intr_handle) == RTE_INTR_HANDLE_ALARM &&
+   !TAILQ_EMPTY(&src->callbacks)) {
callback = NULL;
} else {
/* allocate a new interrupt callback entity */
@@ -135,7 +136,14 @@ rte_intr_callback_register(const struct rte_intr_handle *intr_handle,
ret = -ENOMEM;
goto fail;
} else {
-   src->intr_handle = *intr_handle;
+   src->intr_handle = rte_intr_instance_dup(intr_handle);
+   if (src->intr_handle == NULL) {
+   RTE_LOG(ERR, EAL, "Can not create intr instance\n");
+   ret = -ENOMEM;
+   free(src);
+   src = NULL;
+   goto fail;
+   }
TAILQ_INIT(&src->callbacks);
TAILQ_INSERT_TAIL(&intr_sources, src, next);
}
@@ -151,7 +159,8 @@ rte_intr_callback_register(const struct rte_intr_handle *intr_handle,
/* add events to the queue. timer events are special as we need to
 * re-set the timer.
 */
-   if (add_event || src->intr_handle.type == RTE_INTR_HANDLE_ALARM) {
+   if (add_event ||
+   rte_intr_type_get(src->intr_handle) == RTE_INTR_HANDLE_ALARM) {
struct kevent ke;
 
memset(&ke, 0, sizeof(ke));
@@ -173,12 +182,11 @@ rte_intr_callback_register(const struct rte_intr_handle *intr_handle,
 */
if (errno == ENODEV)
RTE_LOG(DEBUG, EAL, "Interrupt handle %d not supported\n",
-   src->intr_handle.fd);
+   rte_intr_fd_get(src->intr_handle));
else
-   RTE_LOG(ERR, EAL, "Error adding fd %d "
- 

[dpdk-dev] [PATCH v8 3/9] test/interrupts: remove direct access to interrupt handle

2021-10-25 Thread David Marchand
From: Harman Kalra 

Update the interrupt test suite to use the interrupt handle
get/set APIs.
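
Because the test now keeps an array of handle pointers, each slot has to
be allocated and later released. A minimal sketch of one way to avoid
leaking earlier slots when a later allocation fails is shown below. All
names are hypothetical, and the fail_at parameter exists only to
simulate an allocation failure; the test suite itself may handle this
differently.

```c
#include <stdlib.h>

#define DEMO_HANDLE_MAX 4

/* Hypothetical stand-in for an interrupt handle. */
struct demo_test_handle {
	int fd;
};

/* Free every slot. free(NULL) is a no-op, so a partially filled
 * array is handled without special cases.
 */
void demo_handles_free(struct demo_test_handle *handles[], int n)
{
	int i;

	for (i = 0; i < n; i++) {
		free(handles[i]);
		handles[i] = NULL;
	}
}

/* Allocate n handles. On failure, release what was already allocated.
 * fail_at simulates a failed allocation at that index (-1: never fail).
 */
int demo_handles_alloc(struct demo_test_handle *handles[], int n, int fail_at)
{
	int i;

	for (i = 0; i < n; i++)
		handles[i] = NULL;

	for (i = 0; i < n; i++) {
		handles[i] = (i == fail_at) ?
			NULL : calloc(1, sizeof(*handles[i]));
		if (handles[i] == NULL) {
			demo_handles_free(handles, n);
			return -1;
		}
		handles[i]->fd = -1;
	}
	return 0;
}
```

Clearing every slot up front keeps the error path and the normal
teardown path on the same free routine.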

Signed-off-by: Harman Kalra 
Signed-off-by: David Marchand 
---
Changes since v5:
- fixed leak when some interrupt handle can't be allocated,

---
 app/test/test_interrupts.c | 164 ++---
 1 file changed, 98 insertions(+), 66 deletions(-)

diff --git a/app/test/test_interrupts.c b/app/test/test_interrupts.c
index 233b14a70b..2a05399f96 100644
--- a/app/test/test_interrupts.c
+++ b/app/test/test_interrupts.c
@@ -16,7 +16,7 @@
 
 /* predefined interrupt handle types */
 enum test_interrupt_handle_type {
-   TEST_INTERRUPT_HANDLE_INVALID,
+   TEST_INTERRUPT_HANDLE_INVALID = 0,
TEST_INTERRUPT_HANDLE_VALID,
TEST_INTERRUPT_HANDLE_VALID_UIO,
TEST_INTERRUPT_HANDLE_VALID_ALARM,
@@ -27,7 +27,7 @@ enum test_interrupt_handle_type {
 
 /* flag of if callback is called */
 static volatile int flag;
-static struct rte_intr_handle intr_handles[TEST_INTERRUPT_HANDLE_MAX];
+static struct rte_intr_handle *intr_handles[TEST_INTERRUPT_HANDLE_MAX];
 static enum test_interrupt_handle_type test_intr_type =
TEST_INTERRUPT_HANDLE_MAX;
 
@@ -50,7 +50,7 @@ static union intr_pipefds pfds;
 static inline int
 test_interrupt_handle_sanity_check(struct rte_intr_handle *intr_handle)
 {
-   if (!intr_handle || intr_handle->fd < 0)
+   if (!intr_handle || rte_intr_fd_get(intr_handle) < 0)
return -1;
 
return 0;
@@ -62,31 +62,54 @@ test_interrupt_handle_sanity_check(struct rte_intr_handle *intr_handle)
 static int
 test_interrupt_init(void)
 {
+   struct rte_intr_handle *test_intr_handle;
+   int i;
+
if (pipe(pfds.pipefd) < 0)
return -1;
 
-   intr_handles[TEST_INTERRUPT_HANDLE_INVALID].fd = -1;
-   intr_handles[TEST_INTERRUPT_HANDLE_INVALID].type =
-   RTE_INTR_HANDLE_UNKNOWN;
+   for (i = 0; i < TEST_INTERRUPT_HANDLE_MAX; i++) {
+   intr_handles[i] =
+   rte_intr_instance_alloc(RTE_INTR_INSTANCE_F_PRIVATE);
+   if (!intr_handles[i])
+   return -1;
+   }
 
-   intr_handles[TEST_INTERRUPT_HANDLE_VALID].fd = pfds.readfd;
-   intr_handles[TEST_INTERRUPT_HANDLE_VALID].type =
-   RTE_INTR_HANDLE_UNKNOWN;
+   test_intr_handle = intr_handles[TEST_INTERRUPT_HANDLE_INVALID];
+   if (rte_intr_fd_set(test_intr_handle, -1))
+   return -1;
+   if (rte_intr_type_set(test_intr_handle, RTE_INTR_HANDLE_UNKNOWN))
+   return -1;
 
-   intr_handles[TEST_INTERRUPT_HANDLE_VALID_UIO].fd = pfds.readfd;
-   intr_handles[TEST_INTERRUPT_HANDLE_VALID_UIO].type =
-   RTE_INTR_HANDLE_UIO;
+   test_intr_handle = intr_handles[TEST_INTERRUPT_HANDLE_VALID];
+   if (rte_intr_fd_set(test_intr_handle, pfds.readfd))
+   return -1;
+   if (rte_intr_type_set(test_intr_handle, RTE_INTR_HANDLE_UNKNOWN))
+   return -1;
+
+   test_intr_handle = intr_handles[TEST_INTERRUPT_HANDLE_VALID_UIO];
+   if (rte_intr_fd_set(test_intr_handle, pfds.readfd))
+   return -1;
+   if (rte_intr_type_set(test_intr_handle, RTE_INTR_HANDLE_UIO))
+   return -1;
 
-   intr_handles[TEST_INTERRUPT_HANDLE_VALID_ALARM].fd = pfds.readfd;
-   intr_handles[TEST_INTERRUPT_HANDLE_VALID_ALARM].type =
-   RTE_INTR_HANDLE_ALARM;
+   test_intr_handle = intr_handles[TEST_INTERRUPT_HANDLE_VALID_ALARM];
+   if (rte_intr_fd_set(test_intr_handle, pfds.readfd))
+   return -1;
+   if (rte_intr_type_set(test_intr_handle, RTE_INTR_HANDLE_ALARM))
+   return -1;
 
-   intr_handles[TEST_INTERRUPT_HANDLE_VALID_DEV_EVENT].fd = pfds.readfd;
-   intr_handles[TEST_INTERRUPT_HANDLE_VALID_DEV_EVENT].type =
-   RTE_INTR_HANDLE_DEV_EVENT;
+   test_intr_handle = intr_handles[TEST_INTERRUPT_HANDLE_VALID_DEV_EVENT];
+   if (rte_intr_fd_set(test_intr_handle, pfds.readfd))
+   return -1;
+   if (rte_intr_type_set(test_intr_handle, RTE_INTR_HANDLE_DEV_EVENT))
+   return -1;
 
-   intr_handles[TEST_INTERRUPT_HANDLE_CASE1].fd = pfds.writefd;
-   intr_handles[TEST_INTERRUPT_HANDLE_CASE1].type = RTE_INTR_HANDLE_UIO;
+   test_intr_handle = intr_handles[TEST_INTERRUPT_HANDLE_CASE1];
+   if (rte_intr_fd_set(test_intr_handle, pfds.writefd))
+   return -1;
+   if (rte_intr_type_set(test_intr_handle, RTE_INTR_HANDLE_UIO))
+   return -1;
 
return 0;
 }
@@ -97,6 +120,10 @@ test_interrupt_init(void)
 static int
 test_interrupt_deinit(void)
 {
+   int i;
+
+   for (i = 0; i < TEST_INTERRUPT_HANDLE_MAX; i++)
+   rte_intr_instance_free(intr_handles[i]);
close(

[dpdk-dev] [PATCH v8 4/9] alarm: remove direct access to interrupt handle

2021-10-25 Thread David Marchand
From: Harman Kalra 

Remove direct access to interrupt handle structure fields and use
the respective get/set APIs instead.
Update all the libraries that access the interrupt handle fields.

Implement an alarm cleanup routine that frees the memory allocated
for the interrupt instance.
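
The init/cleanup pairing described above can be sketched as follows.
The demo_alarm_* names are hypothetical, and resetting the pointer to
NULL after freeing is a choice made for this sketch so that a second
cleanup call is harmless; it is not taken from the patch.

```c
#include <stdlib.h>

/* Hypothetical stand-in for an interrupt handle. */
struct demo_alarm_handle {
	int fd;
	int type;
};

/* Single module-level instance, allocated at init time. */
static struct demo_alarm_handle *demo_alarm_intr;

int demo_alarm_init(void)
{
	demo_alarm_intr = calloc(1, sizeof(*demo_alarm_intr));
	if (demo_alarm_intr == NULL)
		return -1;
	demo_alarm_intr->fd = -1; /* no descriptor attached yet */
	return 0;
}

/* Release the instance; safe to call more than once. */
void demo_alarm_cleanup(void)
{
	free(demo_alarm_intr);
	demo_alarm_intr = NULL;
}

/* Report whether the instance is currently allocated. */
int demo_alarm_is_ready(void)
{
	return demo_alarm_intr != NULL;
}
```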

Signed-off-by: Harman Kalra 
Signed-off-by: David Marchand 
---
Changes since v6:
- removed unused interrupt handle in FreeBSD alarm code,

Changes since v5:
- split from patch4,
- merged patch6,
- renamed rte_eal_alarm_fini as rte_eal_alarm_cleanup,

---
 lib/eal/common/eal_private.h | 10 ++
 lib/eal/freebsd/eal.c|  1 +
 lib/eal/freebsd/eal_alarm.c  | 35 +--
 lib/eal/linux/eal.c  |  1 +
 lib/eal/linux/eal_alarm.c| 32 +---
 5 files changed, 66 insertions(+), 13 deletions(-)

diff --git a/lib/eal/common/eal_private.h b/lib/eal/common/eal_private.h
index 86dab1f057..36bcc0b5a4 100644
--- a/lib/eal/common/eal_private.h
+++ b/lib/eal/common/eal_private.h
@@ -163,6 +163,16 @@ int rte_eal_intr_init(void);
  */
 int rte_eal_alarm_init(void);
 
+/**
+ * Alarm mechanism cleanup.
+ *
+ * This function is private to EAL.
+ *
+ * @return
+ *  0 on success, negative on error
+ */
+void rte_eal_alarm_cleanup(void);
+
 /**
  * Function is to check if the kernel module(like, vfio, vfio_iommu_type1,
  * etc.) loaded.
diff --git a/lib/eal/freebsd/eal.c b/lib/eal/freebsd/eal.c
index 56a60f13e9..9935356ed4 100644
--- a/lib/eal/freebsd/eal.c
+++ b/lib/eal/freebsd/eal.c
@@ -975,6 +975,7 @@ rte_eal_cleanup(void)
rte_mp_channel_cleanup();
/* after this point, any DPDK pointers will become dangling */
rte_eal_memory_detach();
+   rte_eal_alarm_cleanup();
rte_trace_save();
eal_trace_fini();
eal_cleanup_config(internal_conf);
diff --git a/lib/eal/freebsd/eal_alarm.c b/lib/eal/freebsd/eal_alarm.c
index c38b2e04f8..1023c32937 100644
--- a/lib/eal/freebsd/eal_alarm.c
+++ b/lib/eal/freebsd/eal_alarm.c
@@ -32,7 +32,6 @@
 
 struct alarm_entry {
LIST_ENTRY(alarm_entry) next;
-   struct rte_intr_handle handle;
struct timespec time;
rte_eal_alarm_callback cb_fn;
void *cb_arg;
@@ -43,22 +42,46 @@ struct alarm_entry {
 static LIST_HEAD(alarm_list, alarm_entry) alarm_list = LIST_HEAD_INITIALIZER();
 static rte_spinlock_t alarm_list_lk = RTE_SPINLOCK_INITIALIZER;
 
-static struct rte_intr_handle intr_handle = {.fd = -1 };
+static struct rte_intr_handle *intr_handle;
 static void eal_alarm_callback(void *arg);
 
+void
+rte_eal_alarm_cleanup(void)
+{
+   rte_intr_instance_free(intr_handle);
+}
+
 int
 rte_eal_alarm_init(void)
 {
-   intr_handle.type = RTE_INTR_HANDLE_ALARM;
+   int fd;
+
+   intr_handle = rte_intr_instance_alloc(RTE_INTR_INSTANCE_F_PRIVATE);
+   if (intr_handle == NULL) {
+   RTE_LOG(ERR, EAL, "Fail to allocate intr_handle\n");
+   goto error;
+   }
+
+   if (rte_intr_type_set(intr_handle, RTE_INTR_HANDLE_ALARM))
+   goto error;
+
+   if (rte_intr_fd_set(intr_handle, -1))
+   goto error;
 
/* on FreeBSD, timers don't use fd's, and their identifiers are stored
 * in separate namespace from fd's, so using any value is OK. however,
 * EAL interrupts handler expects fd's to be unique, so use an actual fd
 * to guarantee unique timer identifier.
 */
-   intr_handle.fd = open("/dev/zero", O_RDONLY);
+   fd = open("/dev/zero", O_RDONLY);
+
+   if (rte_intr_fd_set(intr_handle, fd))
+   goto error;
 
return 0;
+error:
+   rte_intr_instance_free(intr_handle);
+   return -1;
 }
 
 static inline int
@@ -118,7 +141,7 @@ unregister_current_callback(void)
ap = LIST_FIRST(&alarm_list);
 
do {
-   ret = rte_intr_callback_unregister(&intr_handle,
+   ret = rte_intr_callback_unregister(intr_handle,
eal_alarm_callback, &ap->time);
} while (ret == -EAGAIN);
}
@@ -136,7 +159,7 @@ register_first_callback(void)
ap = LIST_FIRST(&alarm_list);
 
/* register a new callback */
-   ret = rte_intr_callback_register(&intr_handle,
+   ret = rte_intr_callback_register(intr_handle,
eal_alarm_callback, &ap->time);
}
return ret;
diff --git a/lib/eal/linux/eal.c b/lib/eal/linux/eal.c
index 0d0fc8..81fdebc6a0 100644
--- a/lib/eal/linux/eal.c
+++ b/lib/eal/linux/eal.c
@@ -1368,6 +1368,7 @@ rte_eal_cleanup(void)
rte_mp_channel_cleanup();
/* after this point, any DPDK pointers will become dangling */
rte_eal_memory_detach();
+   rte_eal_alarm_cleanup();
rte_trace_save();
eal_trace_fini();
eal_cleanup_config(internal_conf);
diff --git a/lib/eal/linux/eal_alarm.c

[dpdk-dev] [PATCH v8 5/9] lib: remove direct access to interrupt handle

2021-10-25 Thread David Marchand
From: Harman Kalra 

Remove direct access to interrupt handle structure fields and use
the respective get/set APIs instead.
Update all the libraries that access the interrupt handle fields.
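
As an illustration of why a library can drop its own check on the
vector array once it goes through an accessor, here is a sketch of an
index getter that validates its inputs. The names, the -1 return
convention, and the errno values are assumptions for this example; the
behaviour of the real rte_intr_vec_list_index_get() is not shown in
this thread.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in holding only the vector list. */
struct demo_vec_handle {
	int *intr_vec;     /* vector number per queue */
	int vec_list_size; /* number of entries in intr_vec */
};

/* Return the vector stored at index, or -1 with errno set when the
 * handle, the list, or the index is not usable. This assumes valid
 * vector numbers are never negative.
 */
int demo_vec_list_index_get(const struct demo_vec_handle *h, int index)
{
	if (h == NULL || h->intr_vec == NULL) {
		errno = ENOTSUP;
		return -1;
	}
	if (index < 0 || index >= h->vec_list_size) {
		errno = ERANGE;
		return -1;
	}
	return h->intr_vec[index];
}
```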

Signed-off-by: Harman Kalra 
Signed-off-by: David Marchand 
---
Changes since v5:
- split from patch4,

---
 lib/bbdev/rte_bbdev.c   |  4 +--
 lib/eal/linux/eal_dev.c | 57 -
 lib/ethdev/rte_ethdev.c | 14 +-
 3 files changed, 43 insertions(+), 32 deletions(-)

diff --git a/lib/bbdev/rte_bbdev.c b/lib/bbdev/rte_bbdev.c
index defddcfc28..b86c5fdcc0 100644
--- a/lib/bbdev/rte_bbdev.c
+++ b/lib/bbdev/rte_bbdev.c
@@ -1094,7 +1094,7 @@ rte_bbdev_queue_intr_ctl(uint16_t dev_id, uint16_t queue_id, int epfd, int op,
VALID_QUEUE_OR_RET_ERR(queue_id, dev);
 
intr_handle = dev->intr_handle;
-   if (!intr_handle || !intr_handle->intr_vec) {
+   if (intr_handle == NULL) {
rte_bbdev_log(ERR, "Device %u intr handle unset\n", dev_id);
return -ENOTSUP;
}
@@ -1105,7 +1105,7 @@ rte_bbdev_queue_intr_ctl(uint16_t dev_id, uint16_t queue_id, int epfd, int op,
return -ENOTSUP;
}
 
-   vec = intr_handle->intr_vec[queue_id];
+   vec = rte_intr_vec_list_index_get(intr_handle, queue_id);
ret = rte_intr_rx_ctl(intr_handle, epfd, op, vec, data);
if (ret && (ret != -EEXIST)) {
rte_bbdev_log(ERR,
diff --git a/lib/eal/linux/eal_dev.c b/lib/eal/linux/eal_dev.c
index 3b905e18f5..06820a3666 100644
--- a/lib/eal/linux/eal_dev.c
+++ b/lib/eal/linux/eal_dev.c
@@ -23,10 +23,7 @@
 
 #include "eal_private.h"
 
-static struct rte_intr_handle intr_handle = {
-   .type = RTE_INTR_HANDLE_DEV_EVENT,
-   .fd = -1,
-};
+static struct rte_intr_handle *intr_handle;
 static rte_rwlock_t monitor_lock = RTE_RWLOCK_INITIALIZER;
 static uint32_t monitor_refcount;
 static bool hotplug_handle;
@@ -109,12 +106,11 @@ static int
 dev_uev_socket_fd_create(void)
 {
struct sockaddr_nl addr;
-   int ret;
+   int ret, fd;
 
-   intr_handle.fd = socket(PF_NETLINK, SOCK_RAW | SOCK_CLOEXEC |
-   SOCK_NONBLOCK,
-   NETLINK_KOBJECT_UEVENT);
-   if (intr_handle.fd < 0) {
+   fd = socket(PF_NETLINK, SOCK_RAW | SOCK_CLOEXEC | SOCK_NONBLOCK,
+   NETLINK_KOBJECT_UEVENT);
+   if (fd < 0) {
RTE_LOG(ERR, EAL, "create uevent fd failed.\n");
return -1;
}
@@ -124,16 +120,19 @@ dev_uev_socket_fd_create(void)
addr.nl_pid = 0;
addr.nl_groups = 0x;
 
-   ret = bind(intr_handle.fd, (struct sockaddr *) &addr, sizeof(addr));
+   ret = bind(fd, (struct sockaddr *) &addr, sizeof(addr));
if (ret < 0) {
RTE_LOG(ERR, EAL, "Failed to bind uevent socket.\n");
goto err;
}
 
+   if (rte_intr_fd_set(intr_handle, fd))
+   goto err;
+
return 0;
 err:
-   close(intr_handle.fd);
-   intr_handle.fd = -1;
+   close(fd);
+   fd = -1;
return ret;
 }
 
@@ -217,9 +216,9 @@ dev_uev_parse(const char *buf, struct rte_dev_event *event, int length)
 static void
 dev_delayed_unregister(void *param)
 {
-   rte_intr_callback_unregister(&intr_handle, dev_uev_handler, param);
-   close(intr_handle.fd);
-   intr_handle.fd = -1;
+   rte_intr_callback_unregister(intr_handle, dev_uev_handler, param);
+   close(rte_intr_fd_get(intr_handle));
+   rte_intr_fd_set(intr_handle, -1);
 }
 
 static void
@@ -235,7 +234,8 @@ dev_uev_handler(__rte_unused void *param)
memset(&uevent, 0, sizeof(struct rte_dev_event));
memset(buf, 0, EAL_UEV_MSG_LEN);
 
-   ret = recv(intr_handle.fd, buf, EAL_UEV_MSG_LEN, MSG_DONTWAIT);
+   ret = recv(rte_intr_fd_get(intr_handle), buf, EAL_UEV_MSG_LEN,
+  MSG_DONTWAIT);
if (ret < 0 && errno == EAGAIN)
return;
else if (ret <= 0) {
@@ -311,24 +311,35 @@ rte_dev_event_monitor_start(void)
goto exit;
}
 
+   intr_handle = rte_intr_instance_alloc(RTE_INTR_INSTANCE_F_PRIVATE);
+   if (intr_handle == NULL) {
+   RTE_LOG(ERR, EAL, "Fail to allocate intr_handle\n");
+   goto exit;
+   }
+
+   if (rte_intr_type_set(intr_handle, RTE_INTR_HANDLE_DEV_EVENT))
+   goto exit;
+
+   if (rte_intr_fd_set(intr_handle, -1))
+   goto exit;
+
ret = dev_uev_socket_fd_create();
if (ret) {
RTE_LOG(ERR, EAL, "error create device event fd.\n");
goto exit;
}
 
-   ret = rte_intr_callback_register(&intr_handle, dev_uev_handler, NULL);
+   ret = rte_intr_callback_register(intr_handle, dev_uev_handler, NULL);
 
if (ret) {
-   RTE_LOG(ERR, EAL, "fail to register uevent callback.\n");
-   close(intr_handle.fd);
-   int

[dpdk-dev] [PATCH v8 7/9] interrupts: make interrupt handle structure opaque

2021-10-25 Thread David Marchand
From: Harman Kalra 

Move the interrupt handle structure definition into an EAL private
header to make its fields totally opaque to the outside world.
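
The opaque-type arrangement can be sketched in a single file as below:
the first part is what a public header would expose, the second part is
what would stay private. Every demo_* and DEMO_* name is hypothetical,
and the flag values are assumptions for the example only.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* --- What a public header would expose (hypothetical names) --- */
struct demo_opaque_handle; /* incomplete type: fields are not visible */

#define DEMO_INSTANCE_F_PRIVATE 0u
#define DEMO_INSTANCE_F_SHARED  (1u << 0)

struct demo_opaque_handle *demo_instance_alloc(uint32_t flags);
void demo_instance_free(struct demo_opaque_handle *h);
uint32_t demo_instance_flags(const struct demo_opaque_handle *h);

/* --- What would stay in a private header or source file --- */
struct demo_opaque_handle {
	uint32_t alloc_flags;
	int fd;
};

#define DEMO_INSTANCE_KNOWN_FLAGS \
	(DEMO_INSTANCE_F_PRIVATE | DEMO_INSTANCE_F_SHARED)

struct demo_opaque_handle *demo_instance_alloc(uint32_t flags)
{
	struct demo_opaque_handle *h;

	/* Reject any bit outside the defined flag set. */
	if ((flags & ~DEMO_INSTANCE_KNOWN_FLAGS) != 0) {
		errno = EINVAL;
		return NULL;
	}
	h = calloc(1, sizeof(*h));
	if (h == NULL) {
		errno = ENOMEM;
		return NULL;
	}
	h->alloc_flags = flags;
	h->fd = -1;
	return h;
}

void demo_instance_free(struct demo_opaque_handle *h)
{
	free(h);
}

uint32_t demo_instance_flags(const struct demo_opaque_handle *h)
{
	return h != NULL ? h->alloc_flags : 0;
}
```

Code that only sees the first part cannot embed the structure or touch
its fields, so the layout can change later without breaking callers.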

Signed-off-by: Harman Kalra 
Signed-off-by: David Marchand 
---
Changes since v5:
- left rte_intr_handle fields untouched:
  - split vfio / uio fd renames in a separate commit,
  - split event list update in a separate commit,
- moved rte_intr_handle definition to a EAL private header,
- preserved dumping all info in interrupt tracepoints,

---
 lib/eal/common/eal_common_interrupts.c |  2 +
 lib/eal/common/eal_interrupts.h| 37 +
 lib/eal/include/meson.build|  1 -
 lib/eal/include/rte_eal_interrupts.h   | 72 --
 lib/eal/include/rte_eal_trace.h|  2 +
 lib/eal/include/rte_interrupts.h   | 24 -
 6 files changed, 63 insertions(+), 75 deletions(-)
 create mode 100644 lib/eal/common/eal_interrupts.h
 delete mode 100644 lib/eal/include/rte_eal_interrupts.h

diff --git a/lib/eal/common/eal_common_interrupts.c b/lib/eal/common/eal_common_interrupts.c
index 46064870f4..5886376d84 100644
--- a/lib/eal/common/eal_common_interrupts.c
+++ b/lib/eal/common/eal_common_interrupts.c
@@ -10,6 +10,8 @@
 #include 
 #include 
 
+#include "eal_interrupts.h"
+
 /* Macros to check for valid interrupt handle */
 #define CHECK_VALID_INTR_HANDLE(intr_handle) do { \
if (intr_handle == NULL) { \
diff --git a/lib/eal/common/eal_interrupts.h b/lib/eal/common/eal_interrupts.h
new file mode 100644
index 00..beacc04b62
--- /dev/null
+++ b/lib/eal/common/eal_interrupts.h
@@ -0,0 +1,37 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2010-2014 Intel Corporation
+ */
+
+#ifndef EAL_INTERRUPTS_H
+#define EAL_INTERRUPTS_H
+
+struct rte_intr_handle {
+   RTE_STD_C11
+   union {
+   struct {
+   RTE_STD_C11
+   union {
+   /** VFIO device file descriptor */
+   int vfio_dev_fd;
+   /** UIO cfg file desc for uio_pci_generic */
+   int uio_cfg_fd;
+   };
+   int fd; /**< interrupt event file descriptor */
+   };
+   void *windows_handle; /**< device driver handle */
+   };
+   uint32_t alloc_flags;   /**< flags passed at allocation */
+   enum rte_intr_handle_type type;  /**< handle type */
+   uint32_t max_intr; /**< max interrupt requested */
+   uint32_t nb_efd;   /**< number of available efd(event fd) */
+   uint8_t efd_counter_size;  /**< size of efd counter, used for vdev */
+   uint16_t nb_intr;
+   /**< Max vector count, default RTE_MAX_RXTX_INTR_VEC_ID */
+   int efds[RTE_MAX_RXTX_INTR_VEC_ID];  /**< intr vectors/efds mapping */
+   struct rte_epoll_event elist[RTE_MAX_RXTX_INTR_VEC_ID];
+  /**< intr vector epoll event */
+   uint16_t vec_list_size;
+   int *intr_vec; /**< intr vector number array */
+};
+
+#endif /* EAL_INTERRUPTS_H */
diff --git a/lib/eal/include/meson.build b/lib/eal/include/meson.build
index 8e258607b8..86468d1a2b 100644
--- a/lib/eal/include/meson.build
+++ b/lib/eal/include/meson.build
@@ -49,7 +49,6 @@ headers += files(
 'rte_version.h',
 'rte_vfio.h',
 )
-indirect_headers += files('rte_eal_interrupts.h')
 
 # special case install the generic headers, since they go in a subdir
 generic_headers = files(
diff --git a/lib/eal/include/rte_eal_interrupts.h b/lib/eal/include/rte_eal_interrupts.h
deleted file mode 100644
index 60bb60ca59..00
--- a/lib/eal/include/rte_eal_interrupts.h
+++ /dev/null
@@ -1,72 +0,0 @@
-/* SPDX-License-Identifier: BSD-3-Clause
- * Copyright(c) 2010-2014 Intel Corporation
- */
-
-#ifndef _RTE_INTERRUPTS_H_
-#error "don't include this file directly, please include generic 
"
-#endif
-
-/**
- * @file rte_eal_interrupts.h
- * @internal
- *
- * Contains function prototypes exposed by the EAL for interrupt handling by
- * drivers and other DPDK internal consumers.
- */
-
-#ifndef _RTE_EAL_INTERRUPTS_H_
-#define _RTE_EAL_INTERRUPTS_H_
-
-#define RTE_MAX_RXTX_INTR_VEC_ID  512
-#define RTE_INTR_VEC_ZERO_OFFSET  0
-#define RTE_INTR_VEC_RXTX_OFFSET  1
-
-/**
- * The interrupt source type, e.g. UIO, VFIO, ALARM etc.
- */
-enum rte_intr_handle_type {
-   RTE_INTR_HANDLE_UNKNOWN = 0,  /**< generic unknown handle */
-   RTE_INTR_HANDLE_UIO,  /**< uio device handle */
-   RTE_INTR_HANDLE_UIO_INTX, /**< uio generic handle */
-   RTE_INTR_HANDLE_VFIO_LEGACY,  /**< vfio device handle (legacy) */
-   RTE_INTR_HANDLE_VFIO_MSI, /**< vfio device handle (MSI) */
-   RTE_INTR_HANDLE_VFIO_MSIX,/**< vfio device handle (MSIX) */
-   RTE_INTR_HANDLE_ALARM,/**< alarm handle */
-   RTE_INTR_HANDLE_EXT,  /**

[dpdk-dev] [PATCH v8 8/9] interrupts: rename device specific file descriptor

2021-10-25 Thread David Marchand
From: Harman Kalra 

VFIO and UIO are mutually exclusive, so storing the device file
descriptor in a single field is enough.
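
A small sketch of why nothing is lost by the rename: in the old layout
the two names already shared one int through an anonymous union. The
demo_* structs below mirror only the two descriptor fields and are not
the full DPDK structure.

```c
/* Layout before the rename: two names aliasing one int
 * (C11 anonymous union).
 */
struct demo_old_handle {
	union {
		int vfio_dev_fd;
		int uio_cfg_fd;
	};
	int fd;
};

/* Layout after the rename: one field, interpreted according to
 * the handle type.
 */
struct demo_new_handle {
	int dev_fd;
	int fd;
};

/* Write through one union member and read through the other: both
 * names refer to the same storage, so the value comes back unchanged.
 */
int demo_old_alias(int value)
{
	struct demo_old_handle h = { .fd = -1 };

	h.vfio_dev_fd = value;
	return h.uio_cfg_fd;
}
```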

Signed-off-by: Harman Kalra 
Signed-off-by: David Marchand 
---
Changes since v5:
- split from patch5,

---
 lib/eal/common/eal_common_interrupts.c | 6 +++---
 lib/eal/common/eal_interrupts.h| 8 +---
 lib/eal/include/rte_eal_trace.h| 8 
 3 files changed, 8 insertions(+), 14 deletions(-)

diff --git a/lib/eal/common/eal_common_interrupts.c b/lib/eal/common/eal_common_interrupts.c
index 5886376d84..2146b933bb 100644
--- a/lib/eal/common/eal_common_interrupts.c
+++ b/lib/eal/common/eal_common_interrupts.c
@@ -72,7 +72,7 @@ struct rte_intr_handle *rte_intr_instance_dup(const struct rte_intr_handle *src)
intr_handle = rte_intr_instance_alloc(src->alloc_flags);
 
intr_handle->fd = src->fd;
-   intr_handle->vfio_dev_fd = src->vfio_dev_fd;
+   intr_handle->dev_fd = src->dev_fd;
intr_handle->type = src->type;
intr_handle->max_intr = src->max_intr;
intr_handle->nb_efd = src->nb_efd;
@@ -139,7 +139,7 @@ int rte_intr_dev_fd_set(struct rte_intr_handle *intr_handle, int fd)
 {
CHECK_VALID_INTR_HANDLE(intr_handle);
 
-   intr_handle->vfio_dev_fd = fd;
+   intr_handle->dev_fd = fd;
 
return 0;
 fail:
@@ -150,7 +150,7 @@ int rte_intr_dev_fd_get(const struct rte_intr_handle *intr_handle)
 {
CHECK_VALID_INTR_HANDLE(intr_handle);
 
-   return intr_handle->vfio_dev_fd;
+   return intr_handle->dev_fd;
 fail:
return -1;
 }
diff --git a/lib/eal/common/eal_interrupts.h b/lib/eal/common/eal_interrupts.h
index beacc04b62..1a4e5573b2 100644
--- a/lib/eal/common/eal_interrupts.h
+++ b/lib/eal/common/eal_interrupts.h
@@ -9,13 +9,7 @@ struct rte_intr_handle {
RTE_STD_C11
union {
struct {
-   RTE_STD_C11
-   union {
-   /** VFIO device file descriptor */
-   int vfio_dev_fd;
-   /** UIO cfg file desc for uio_pci_generic */
-   int uio_cfg_fd;
-   };
+   int dev_fd; /**< VFIO/UIO cfg device file descriptor */
int fd; /**< interrupt event file descriptor */
};
void *windows_handle; /**< device driver handle */
diff --git a/lib/eal/include/rte_eal_trace.h b/lib/eal/include/rte_eal_trace.h
index af7b2d0bf0..5ef4398230 100644
--- a/lib/eal/include/rte_eal_trace.h
+++ b/lib/eal/include/rte_eal_trace.h
@@ -151,7 +151,7 @@ RTE_TRACE_POINT(
RTE_TRACE_POINT_ARGS(const struct rte_intr_handle *handle,
rte_intr_callback_fn cb, void *cb_arg, int rc),
rte_trace_point_emit_int(rc);
-   rte_trace_point_emit_int(handle->vfio_dev_fd);
+   rte_trace_point_emit_int(handle->dev_fd);
rte_trace_point_emit_int(handle->fd);
rte_trace_point_emit_int(handle->type);
rte_trace_point_emit_u32(handle->max_intr);
@@ -164,7 +164,7 @@ RTE_TRACE_POINT(
RTE_TRACE_POINT_ARGS(const struct rte_intr_handle *handle,
rte_intr_callback_fn cb, void *cb_arg, int rc),
rte_trace_point_emit_int(rc);
-   rte_trace_point_emit_int(handle->vfio_dev_fd);
+   rte_trace_point_emit_int(handle->dev_fd);
rte_trace_point_emit_int(handle->fd);
rte_trace_point_emit_int(handle->type);
rte_trace_point_emit_u32(handle->max_intr);
@@ -176,7 +176,7 @@ RTE_TRACE_POINT(
rte_eal_trace_intr_enable,
RTE_TRACE_POINT_ARGS(const struct rte_intr_handle *handle, int rc),
rte_trace_point_emit_int(rc);
-   rte_trace_point_emit_int(handle->vfio_dev_fd);
+   rte_trace_point_emit_int(handle->dev_fd);
rte_trace_point_emit_int(handle->fd);
rte_trace_point_emit_int(handle->type);
rte_trace_point_emit_u32(handle->max_intr);
@@ -186,7 +186,7 @@ RTE_TRACE_POINT(
rte_eal_trace_intr_disable,
RTE_TRACE_POINT_ARGS(const struct rte_intr_handle *handle, int rc),
rte_trace_point_emit_int(rc);
-   rte_trace_point_emit_int(handle->vfio_dev_fd);
+   rte_trace_point_emit_int(handle->dev_fd);
rte_trace_point_emit_int(handle->fd);
rte_trace_point_emit_int(handle->type);
rte_trace_point_emit_u32(handle->max_intr);
-- 
2.23.0



Re: [dpdk-dev] [PATCH v4 1/5] eal: add new definitions for wait scheme

2021-10-25 Thread Ananyev, Konstantin

> > > Introduce macros as generic interface for address monitoring.
> > >
> > > Signed-off-by: Feifei Wang 
> > > Reviewed-by: Ruifeng Wang 
> > > ---
> > >  lib/eal/arm/include/rte_pause_64.h  | 126
> > >  lib/eal/include/generic/rte_pause.h |  32 +++
> > >  2 files changed, 104 insertions(+), 54 deletions(-)
> > >
> > > diff --git a/lib/eal/arm/include/rte_pause_64.h b/lib/eal/arm/include/rte_pause_64.h
> > > index e87d10b8cc..23954c2de2 100644
> > > --- a/lib/eal/arm/include/rte_pause_64.h
> > > +++ b/lib/eal/arm/include/rte_pause_64.h
> > > @@ -31,20 +31,12 @@ static inline void rte_pause(void)
> > >  /* Put processor into low power WFE(Wait For Event) state. */
> > > #define __WFE() { asm volatile("wfe" : : : "memory"); }
> > >
> > > -static __rte_always_inline void
> > > -rte_wait_until_equal_16(volatile uint16_t *addr, uint16_t expected,
> > > - int memorder)
> > > -{
> > > - uint16_t value;
> > > -
> > > - assert(memorder == __ATOMIC_ACQUIRE || memorder == __ATOMIC_RELAXED);
> > > -
> > > - /*
> > > -  * Atomic exclusive load from addr, it returns the 16-bit content of
> > > -  * *addr while making it 'monitored',when it is written by someone
> > > -  * else, the 'monitored' state is cleared and a event is generated
> > > -  * implicitly to exit WFE.
> > > -  */
> > > +/*
> > > + * Atomic exclusive load from addr, it returns the 16-bit content of
> > > + * *addr while making it 'monitored', when it is written by someone
> > > + * else, the 'monitored' state is cleared and a event is generated
> > > + * implicitly to exit WFE.
> > > + */
> > >  #define __LOAD_EXC_16(src, dst, memorder) {   \
> > >   if (memorder == __ATOMIC_RELAXED) {   \
> > >   asm volatile("ldxrh %w[tmp], [%x[addr]]"  \
> > > @@ -58,6 +50,52 @@ rte_wait_until_equal_16(volatile uint16_t *addr, uint16_t expected,
> > >   : "memory");  \
> > >   } }
> > >
> > > +/*
> > > + * Atomic exclusive load from addr, it returns the 32-bit content of
> > > + * *addr while making it 'monitored', when it is written by someone
> > > + * else, the 'monitored' state is cleared and a event is generated
> > > + * implicitly to exit WFE.
> > > + */
> > > +#define __LOAD_EXC_32(src, dst, memorder) {  \
> > > + if (memorder == __ATOMIC_RELAXED) {  \
> > > + asm volatile("ldxr %w[tmp], [%x[addr]]"  \
> > > + : [tmp] "=&r" (dst)  \
> > > + : [addr] "r"(src)\
> > > + : "memory"); \
> > > + } else { \
> > > + asm volatile("ldaxr %w[tmp], [%x[addr]]" \
> > > + : [tmp] "=&r" (dst)  \
> > > + : [addr] "r"(src)\
> > > + : "memory"); \
> > > + } }
> > > +
> > > +/*
> > > + * Atomic exclusive load from addr, it returns the 64-bit content of
> > > + * *addr while making it 'monitored', when it is written by someone
> > > + * else, the 'monitored' state is cleared and a event is generated
> > > + * implicitly to exit WFE.
> > > + */
> > > +#define __LOAD_EXC_64(src, dst, memorder) {  \
> > > + if (memorder == __ATOMIC_RELAXED) {  \
> > > + asm volatile("ldxr %x[tmp], [%x[addr]]"  \
> > > + : [tmp] "=&r" (dst)  \
> > > + : [addr] "r"(src)\
> > > + : "memory"); \
> > > + } else { \
> > > + asm volatile("ldaxr %x[tmp], [%x[addr]]" \
> > > + : [tmp] "=&r" (dst)  \
> > > + : [addr] "r"(src)\
> > > + : "memory"); \
> > > + } }
> > > +
> > > +static __rte_always_inline void
> > > +rte_wait_until_equal_16(volatile uint16_t *addr, uint16_t expected,
> > > + int memorder)
> > > +{
> > > + uint16_t value;
> > > +
> > > + assert(memorder == __ATOMIC_ACQUIRE || memorder == __ATOMIC_RELAXED);
> > > +
> > >   __LOAD_EXC_16(addr, value, memorder)
> > >   if (value != expected) {
> > >   __SEVL()
> > > @@ -66,7 +104,6 @@ rte_wait_until_equal_16(volatile uint16_t *addr, uint16_t expected,
> > >   __LOAD_EXC_16(addr, value, memorder)
> > >   } while (value != expected);
> > >   }
> > > -#undef __LOAD_EXC_16
> > >  }
> > >
> > >  static __rte_always_inline void
> > > @@ -77,25 +114,6 @@ rte_wait_until_equal_32(volatile uint32_t *addr, uint32_t expected,
> > >
> > >   assert(memorder == __ATOMIC_ACQUIRE || memorder == __ATOMIC_RELAXED);
> > >
> > > - /*
> > > -  * Atomic exclusive load from addr, it returns the 32-bit content of
> > > -  * *addr while making it 'monitored',when it is written by someone
> > > -  * else, the 'monitored' state is cleared and a event 

[dpdk-dev] [PATCH v8 9/9] interrupts: extend event list

2021-10-25 Thread David Marchand
From: Harman Kalra 

Dynamically allocate the efds and elist arrays of the intr_handle
structure, based on a size provided by the user. For example, the size
can be the number of MSI-X interrupts supported by a PCI device.
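
Resizing such a list to a caller-provided size can be sketched as
below. This is not the rte_intr_event_list_update() implementation: the
names are hypothetical, only an efds-like array is shown, and plain
realloc() is assumed where the patch chooses between rte memory and the
libc heap.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in carrying only an event fd list. */
struct demo_evlist_handle {
	int *efds;   /* dynamically sized event fd list */
	int nb_intr; /* current number of entries */
};

/* Resize the list to size entries. Existing entries are preserved and
 * any newly added tail is zeroed. On failure the old list is left
 * untouched.
 */
int demo_event_list_update(struct demo_evlist_handle *h, int size)
{
	int *tmp;

	if (h == NULL || size <= 0) {
		errno = EINVAL;
		return -1;
	}

	tmp = realloc(h->efds, (size_t)size * sizeof(*tmp));
	if (tmp == NULL) {
		errno = ENOMEM;
		return -1;
	}

	if (size > h->nb_intr)
		memset(tmp + h->nb_intr, 0,
		       (size_t)(size - h->nb_intr) * sizeof(*tmp));

	h->efds = tmp;
	h->nb_intr = size;
	return 0;
}
```

Assigning the realloc() result to a temporary first keeps the original
array reachable if the resize fails.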

Signed-off-by: Harman Kalra 
Signed-off-by: David Marchand 
Acked-by: Dmitry Kozlyuk 
---
Changes since v6:
- removed unneeded checks on elist/efds array initialisation,

Changes since v5:
- split from patch5,

---
 drivers/bus/pci/linux/pci_vfio.c   |  6 ++
 drivers/common/cnxk/roc_platform.h |  1 +
 lib/eal/common/eal_common_interrupts.c | 95 +-
 lib/eal/common/eal_interrupts.h|  5 +-
 4 files changed, 102 insertions(+), 5 deletions(-)

diff --git a/drivers/bus/pci/linux/pci_vfio.c b/drivers/bus/pci/linux/pci_vfio.c
index 7b2f8296c5..f622e7f8e6 100644
--- a/drivers/bus/pci/linux/pci_vfio.c
+++ b/drivers/bus/pci/linux/pci_vfio.c
@@ -266,6 +266,12 @@ pci_vfio_setup_interrupts(struct rte_pci_device *dev, int 
vfio_dev_fd)
return -1;
}
 
+   /* Reallocate the efds and elist fields of intr_handle based
+* on PCI device MSIX size.
+*/
+   if (rte_intr_event_list_update(dev->intr_handle, irq.count))
+   return -1;
+
/* if this vector cannot be used with eventfd, fail if we 
explicitly
 * specified interrupt type, otherwise continue */
if ((irq.flags & VFIO_IRQ_INFO_EVENTFD) == 0) {
diff --git a/drivers/common/cnxk/roc_platform.h 
b/drivers/common/cnxk/roc_platform.h
index 60227b72d0..5da23fe5f8 100644
--- a/drivers/common/cnxk/roc_platform.h
+++ b/drivers/common/cnxk/roc_platform.h
@@ -121,6 +121,7 @@
 #define plt_intr_instance_allocrte_intr_instance_alloc
 #define plt_intr_instance_dup  rte_intr_instance_dup
 #define plt_intr_instance_free rte_intr_instance_free
+#define plt_intr_event_list_update rte_intr_event_list_update
 #define plt_intr_max_intr_get  rte_intr_max_intr_get
 #define plt_intr_max_intr_set  rte_intr_max_intr_set
 #define plt_intr_nb_efd_getrte_intr_nb_efd_get
diff --git a/lib/eal/common/eal_common_interrupts.c 
b/lib/eal/common/eal_common_interrupts.c
index 2146b933bb..da3ab006b8 100644
--- a/lib/eal/common/eal_common_interrupts.c
+++ b/lib/eal/common/eal_common_interrupts.c
@@ -53,10 +53,46 @@ struct rte_intr_handle *rte_intr_instance_alloc(uint32_t 
flags)
return NULL;
}
 
+   if (uses_rte_memory) {
+   intr_handle->efds = rte_zmalloc(NULL,
+   RTE_MAX_RXTX_INTR_VEC_ID * sizeof(int), 0);
+   } else {
+   intr_handle->efds = calloc(RTE_MAX_RXTX_INTR_VEC_ID,
+   sizeof(int));
+   }
+   if (intr_handle->efds == NULL) {
+   RTE_LOG(ERR, EAL, "Fail to allocate event fd list\n");
+   rte_errno = ENOMEM;
+   goto fail;
+   }
+
+   if (uses_rte_memory) {
+   intr_handle->elist = rte_zmalloc(NULL,
+   RTE_MAX_RXTX_INTR_VEC_ID * sizeof(struct 
rte_epoll_event),
+   0);
+   } else {
+   intr_handle->elist = calloc(RTE_MAX_RXTX_INTR_VEC_ID,
+   sizeof(struct rte_epoll_event));
+   }
+   if (intr_handle->elist == NULL) {
+   RTE_LOG(ERR, EAL, "fail to allocate event fd list\n");
+   rte_errno = ENOMEM;
+   goto fail;
+   }
+
intr_handle->alloc_flags = flags;
intr_handle->nb_intr = RTE_MAX_RXTX_INTR_VEC_ID;
 
return intr_handle;
+fail:
+   if (uses_rte_memory) {
+   rte_free(intr_handle->efds);
+   rte_free(intr_handle);
+   } else {
+   free(intr_handle->efds);
+   free(intr_handle);
+   }
+   return NULL;
 }
 
 struct rte_intr_handle *rte_intr_instance_dup(const struct rte_intr_handle 
*src)
@@ -83,14 +119,69 @@ struct rte_intr_handle *rte_intr_instance_dup(const struct 
rte_intr_handle *src)
return intr_handle;
 }
 
+int rte_intr_event_list_update(struct rte_intr_handle *intr_handle, int size)
+{
+   struct rte_epoll_event *tmp_elist;
+   bool uses_rte_memory;
+   int *tmp_efds;
+
+   CHECK_VALID_INTR_HANDLE(intr_handle);
+
+   if (size == 0) {
+   RTE_LOG(DEBUG, EAL, "Size can't be zero\n");
+   rte_errno = EINVAL;
+   goto fail;
+   }
+
+   uses_rte_memory =
+   RTE_INTR_INSTANCE_USES_RTE_MEMORY(intr_handle->alloc_flags);
+   if (uses_rte_memory) {
+   tmp_efds = rte_realloc(intr_handle->efds, size * sizeof(int),
+   0);
+   } else {
+   tmp_efds = realloc(intr_handle->efds, size * sizeof(int));
+   }
+   if (tmp_efds == NULL) {
+   RTE_LOG(ERR, EAL, "Failed to realloc the efds list\n");
+   rte_errno = ENOMEM;
+

[dpdk-dev] Minutes of Technical Board Meeting, 2021-Oct-20

2021-10-25 Thread Ananyev, Konstantin


Minutes of Technical Board Meeting, 2021-Oct-20

Members Attending
-----------------
-Bruce
-Ferruh
-Hemant
-Honnappa
-Jerin
-Kevin
-Konstantin (Chair)
-Maxime
-Stephen
-Thomas

NOTE: The technical board meets every second Wednesday at
https://meet.jit.si/DPDK at 3 pm UTC.
Meetings are public, and DPDK community members are welcome to attend.

NOTE: Next meeting will be on Wednesday 2021-Oct-27 @3pm UTC, and will
be chaired by Maxime.

1) DTS proposal discussion [1]
  TB agreed on the following roadmap for DTS:
  a) DTS adoption:
    - The DTS framework will be merged into the DPDK repository.
    - Existing DTS test cases will be uploaded to the DPDK repo on a
      step-by-step basis (submit a patch to dpdk.org, review, accept).
    - All communication regarding the new DTS framework will go through
      the dev@dpdk.org mailing list and will follow usual DPDK policies.
    - The existing DTS repository and mailing list will remain available
      for legacy purposes.

    A concern was expressed that, due to companies' internal policies,
    DTS code adopted into DPDK will be subject to extra scanning
    (legacy, security, etc.).
    Need to figure out the exact requirements for this scanning
    (AR to John McNamara).

  b) Patch submission changes:
    When a new feature is added to DPDK, test case(s) have to be provided
    along with it (with either the DPDK UT or the DPDK DTS framework).
    When a new DTS test case is required:
    - The same patch series will contain patches for both DPDK and DTS.
    - DTS patches will follow the DPDK patch review/accept policy.

  c) Patch backporting changes:
    - DTS patches will follow the DPDK backporting policy.
    - Any backport of DPDK patches should also backport the DTS
      changes (if there are any).

2) DTS framework adoption timeframe and resources
  - There are a few people working on it half of their time, but more
    help is needed.
    2-3 extra people working 50% of their time on that task would be ideal.
  - First DTS-related patches are expected in the 22.02 timeframe.
  - 22.11 is the preliminary target for DTS framework adoption.

3) DTS framework maintainer
  - Need to talk with the current DTS maintainer:
    is Lijuan OK to also maintain the DTS framework within DPDK? (AR to Honnappa)
  - It would be good to have at least one more maintainer for DTS
    (better coverage, split load, etc.).

4) GPU library / DWA library inclusion
  http://inbox.dpdk.org/dev/dm6pr12mb41079fae6b5da35102b1bbfacd...@dm6pr12mb4107.namprd12.prod.outlook.com/
  http://mails.dpdk.org/archives/dev/2021-October/226070.html

  A separate TB meeting is scheduled to cover that topic.
  Date: Wednesday, October 27th
  Time: 3pm UTC
  https://meet.jit.si/DPDK
  Backup on IRC: libera.chat #dpdk-board

[1]
https://docs.google.com/presentation/d/1gTMJGP40FlWoSxMwdZsE2ydmd5SrMvGfWA9wtqvdbbM/edit?usp=sharing




Re: [dpdk-dev] [PATCH v2] mempool: fix non-IO flag inference

2021-10-25 Thread Thomas Monjalon
25/10/2021 15:33, Olivier Matz:
> On Sat, Oct 23, 2021 at 12:09:19AM +0300, Dmitry Kozlyuk wrote:
> > When mempool had been created with RTE_MEMPOOL_F_NO_IOVA_CONTIG flag
> > but later populated with valid IOVA, RTE_MEMPOOL_F_NON_IO was unset,
> > while it should be kept. The unit test did not catch this
> > because rte_mempool_populate_default() it used was populating
> > with RTE_BAD_IOVA.
> > 
> > Keep setting RTE_MEMPOOL_F_NON_IO at an empty mempool creation
> > and add an assert for it in the unit test (remove the separate case).
> > Do not reset the flag if RTE_MEMPOOL_F_NO_IOVA_CONTIG is set.
> > 
> > Fixes: 11541c5c81dd ("mempool: add non-IO flag")
> > 
> > Signed-off-by: Dmitry Kozlyuk 
> 
> Acked-by: Olivier Matz 

Applied, thanks.
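
For readers following the flag logic in the commit log above, here is a
minimal sketch of the inference rule as described there. The DEMO_* names,
the flag values and struct demo_pool are hypothetical; the real flags and
populate path live in the mempool library and are not reproduced here.

```c
#include <stdint.h>

/* Hypothetical flag values and names, chosen only for this sketch. */
#define DEMO_F_NO_IOVA_CONTIG 0x1u
#define DEMO_F_NON_IO         0x2u
#define DEMO_BAD_IOVA         UINT64_MAX

struct demo_pool {
	unsigned int flags;
};

/* An empty pool has no IO-usable objects yet, so NON_IO starts set. */
static void
demo_pool_create_empty(struct demo_pool *mp, unsigned int user_flags)
{
	mp->flags = user_flags | DEMO_F_NON_IO;
}

/* Account for one populated chunk starting at the given IOVA. */
static void
demo_pool_populate(struct demo_pool *mp, uint64_t iova)
{
	/*
	 * Clear NON_IO only when the chunk has a valid IOVA and the pool
	 * was not created with NO_IOVA_CONTIG.
	 */
	if (!(mp->flags & DEMO_F_NO_IOVA_CONTIG) && iova != DEMO_BAD_IOVA)
		mp->flags &= ~DEMO_F_NON_IO;
}
```

With this rule, a pool created with the NO_IOVA_CONTIG flag keeps NON_IO
even when it is later populated with a valid IOVA, which is the case the
fix addresses.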




Re: [dpdk-dev] [PATCH] usertools/pmdinfo: fix plugin auto scan

2021-10-25 Thread Thomas Monjalon
20/10/2021 21:31, Robin Jarry:
> Hello,
> 
> 2021-10-19, David Marchand:
> > Migration to argparse was incomplete.
> > 
> > $ dpdk-pmdinfo.py -p $(which dpdk-testpmd)
> > Traceback (most recent call last):
> >   File "/usr/bin/dpdk-pmdinfo.py", line 626, in <module>
> > main()
> >   File "/usr/bin/dpdk-pmdinfo.py", line 596, in main
> > exit(scan_for_autoload_pmds(args[0]))
> > TypeError: 'Namespace' object does not support indexing
> > 
> > Fixes: 81255f27c65c ("usertools: replace optparse with argparse")
> > Cc: sta...@dpdk.org
> > 
> > Signed-off-by: David Marchand 
> 
> It is sad that pylint does not report an error:
> 
> https://pycodequ.al/docs/pylint-messages/e1136-unsubscriptable-object.html
> 
> Anyway, this fix looks good.
> 
> Reviewed-by: Robin Jarry 

Applied, thanks.





Re: [dpdk-dev] [PATCH] doc: fix default mempool option

2021-10-25 Thread Thomas Monjalon
15/10/2021 14:26, Olivier Matz:
> On Fri, Oct 15, 2021 at 10:39:41AM +0200, David Marchand wrote:
> > This option should be prefixed with -- for consistency with others.
> > 
> > Fixes: a103a97e7191 ("eal: allow user to override default mempool driver")
> > Cc: sta...@dpdk.org
> > 
> > Signed-off-by: David Marchand 
> 
> Reviewed-by: Olivier Matz 

Applied, thanks.





Re: [dpdk-dev] [dpdk-stable] [PATCH V2 3/3] eal/x86: avoid cast-align warning in x86 memcpy functions

2021-10-25 Thread Thomas Monjalon
21/10/2021 10:51, Eli Britstein:
> Functions and macros in x86 rte_memcpy.h may cause cast-align warnings,
> when using strict cast align flag with supporting gcc:
> gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0
> CFLAGS="-Wcast-align=strict" make V=1 -C examples/l2fwd clean static
> 
> For example:
> In file included from main.c:24:
> /dpdk/build/include/rte_memcpy.h: In function 'rte_mov16':
> /dpdk/build/include/rte_memcpy.h:306:25: warning: cast increases
> required alignment of target type [-Wcast-align]
>   306 |  xmm0 = _mm_loadu_si128((const __m128i *)src);
>   | ^
> 
> As the code assumes correct alignment, first add a (void *) or (const
> void *) cast, to avoid the warnings.
> 
> Fixes: 9484092baad3 ("eal/x86: optimize memcpy for AVX512 platforms")
> Cc: sta...@dpdk.org
> 
> Signed-off-by: Eli Britstein 

Series applied, thanks.
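
The cast pattern from the commit log can be illustrated outside of
rte_memcpy.h as follows. This is a sketch, not the patched code:
demo_load_u64() is a hypothetical helper, and it uses a plain 64-bit load
instead of the SSE/AVX intrinsics so that it builds on any architecture.
The caller is assumed to guarantee suitable alignment.

```c
#include <stdint.h>

/*
 * Direct form, which -Wcast-align=strict flags:
 *     const uint64_t *p = (const uint64_t *)src;
 * Going through (const void *) tells the compiler the cast is intended.
 * The caller must still guarantee that src is suitably aligned.
 */
static uint64_t
demo_load_u64(const uint8_t *src)
{
	const uint64_t *p = (const uint64_t *)(const void *)src;

	return *p;
}
```

The intermediate cast only silences the diagnostic; it does not make a
misaligned access valid.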




Re: [dpdk-dev] [PATCH v6 01/12] lib: build libraries that some tests depend on

2021-10-25 Thread Thomas Monjalon
14/10/2021 18:21, Jie Zhou:
> Enable building subset of libraries that tests depend on for Windows
> 
> Signed-off-by: Jie Zhou 
> ---
>  lib/meson.build | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/lib/meson.build b/lib/meson.build
> index b2ba7258d8..bd6c27deef 100644
> --- a/lib/meson.build
> +++ b/lib/meson.build
> @@ -82,9 +82,11 @@ if is_windows
>  'bitratestats',
>  'cryptodev',
>  'cfgfile',
> +'efd',
>  'gro',
>  'gso',
>  'latencystats',
> +'lpm',
>  'pdump',
>  'stack',
>  'security',
> 

It needs to be rebased after the recent changes to disable libs on Windows.
But instead of such a patch, I would prefer one patch per lib,
with Tested-by tags showing that the lib is tested and working on Windows.
Also you need to Cc the maintainer of the lib being enabled.




Re: [dpdk-dev] [PATCH v3] ci: update machine meson option to platform

2021-10-25 Thread Thomas Monjalon
14/10/2021 14:26, Aaron Conole:
> Juraj Linkeš  writes:
> 
> > The way we're building DPDK in CI, with -Dmachine=default, has not been
> > updated when the option got replaced to preserve a backwards-compatible
> > build call to facilitate ABI verification between DPDK versions. Update
> > the call to use -Dplatform=generic, which is the most up to date way to
> > execute the same build which is now present in all DPDK versions the ABI
> > check verifies.
> >
> > Signed-off-by: Juraj Linkeš 
> 
> Acked-by: Aaron Conole 

Applied, thanks.





Re: [dpdk-dev] [PATCH v5 1/2] build: add meson options of atomic_mbuf_ref_counts

2021-10-25 Thread Thomas Monjalon
14/10/2021 10:20, Bruce Richardson:
> On Thu, Oct 14, 2021 at 04:54:18AM +0800, Kefu Chai wrote:
> > Atomic mbuf reference counting (RTE_MBUF_REFCNT_ATOMIC) is not necessary
> > for applications like Seastar, where it's safe to assume that the mbuf
> > refcnt is updated by a single core only.
> > 
> > Signed-off-by: Kefu Chai 
> > ---
> 
> For this, I think it's a setting that needs to be a global one for DPDK, so
> I'm ok with adding it as a meson option.
> 
> Acked-by: Bruce Richardson 

Changed the option name to "mbuf_refcnt_atomic" to match the flag.
Applied, thanks.
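
To make the trade-off behind this option concrete, here is a small sketch
of a reference count update that is atomic or plain depending on a
build-time switch. DEMO_REFCNT_ATOMIC, struct demo_buf and
demo_refcnt_update() are hypothetical names for illustration, not the mbuf
library code, and the atomic branch assumes the GCC/Clang __atomic
builtins.

```c
#include <stdint.h>

/* Hypothetical build-time switch mirroring the "mbuf_refcnt_atomic" option. */
#ifndef DEMO_REFCNT_ATOMIC
#define DEMO_REFCNT_ATOMIC 1
#endif

struct demo_buf {
	uint16_t refcnt;
};

/* Add delta to the reference count and return the new value. */
static inline uint16_t
demo_refcnt_update(struct demo_buf *b, int16_t delta)
{
#if DEMO_REFCNT_ATOMIC
	/* Safe when several cores may touch the same buffer. */
	return __atomic_add_fetch(&b->refcnt, (uint16_t)delta,
				  __ATOMIC_ACQ_REL);
#else
	/* Cheaper, but only valid when a single core owns the buffer. */
	b->refcnt = (uint16_t)(b->refcnt + delta);
	return b->refcnt;
#endif
}
```

Because the switch changes how every user of the buffer must behave, it
makes sense as a global build option rather than a per-application define.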




[dpdk-dev] [PATCH] config: sort Meson options by categories

2021-10-25 Thread Thomas Monjalon
Options used to be sorted alphabetically.
It looks easier to read when major options are first,
then path tuning, libs options, and drivers options.

Signed-off-by: Thomas Monjalon 
---
 meson_options.txt | 75 ---
 1 file changed, 39 insertions(+), 36 deletions(-)

diff --git a/meson_options.txt b/meson_options.txt
index 7c220ad68d..281719c794 100644
--- a/meson_options.txt
+++ b/meson_options.txt
@@ -1,50 +1,53 @@
-# Please keep these options sorted alphabetically.
-
-option('check_includes', type: 'boolean', value: false, description:
-   'build "chkincs" to verify each header file can compile alone')
-option('cpu_instruction_set', type: 'string', value: 'auto',
-   description: 'Set the target machine ISA (instruction set 
architecture). Will be set according to the platform option by default.')
-option('developer_mode', type: 'feature', description:
-   'turn on additional build checks relevant for DPDK developers')
-option('disable_drivers', type: 'string', value: '', description:
-   'Comma-separated list of drivers to explicitly disable.')
-option('disable_libs', type: 'string', value: '', description:
-   'Comma-separated list of libraries to explicitly disable. [NOTE: not 
all libs can be disabled]')
+# general compilation tuning
+option('platform', type: 'string', value: 'native', description:
+   'Platform to build, either "native", "generic" or a SoC. Please refer 
to the Linux build guide for more information.')
+option('cpu_instruction_set', type: 'string', value: 'auto', description:
+   'Set the target machine ISA (instruction set architecture). Will be set 
according to the platform option by default.')
+option('machine', type: 'string', value: 'auto', description:
+   'Alias of cpu_instruction_set.')
+option('include_subdir_arch', type: 'string', value: '', description:
+   'Subdirectory where to install arch-dependent headers.')
 option('drivers_install_subdir', type: 'string', value: 'dpdk/pmds-', 
description:
'Subdirectory of libdir where to install PMDs. Defaults to using a 
versioned subdirectory.')
-option('enable_docs', type: 'boolean', value: false, description:
-   'build documentation')
+option('disable_libs', type: 'string', value: '', description:
+   'Comma-separated list of libraries to explicitly disable. [NOTE: not 
all libs can be disabled]')
+option('disable_drivers', type: 'string', value: '', description:
+   'Comma-separated list of drivers to explicitly disable.')
 option('enable_drivers', type: 'string', value: '', description:
'Comma-separated list of drivers to build. If unspecified, build all 
drivers.')
 option('enable_driver_sdk', type: 'boolean', value: false, description:
'Install headers to build drivers.')
 option('enable_kmods', type: 'boolean', value: false, description:
-   'build kernel modules')
-option('examples', type: 'string', value: '', description:
-   'Comma-separated list of examples to build by default')
-option('flexran_sdk', type: 'string', value: '', description:
-   'Path to FlexRAN SDK optional Libraries for BBDEV device')
-option('ibverbs_link', type: 'combo', choices : ['static', 'shared', 
'dlopen'], value: 'shared', description:
-   'Linkage method (static/shared/dlopen) for Mellanox PMDs with ibverbs 
dependencies.')
-option('include_subdir_arch', type: 'string', value: '', description:
-   'subdirectory where to install arch-dependent headers')
+   'Build kernel modules.')
 option('kernel_dir', type: 'string', value: '', description:
'Path to the kernel for building kernel modules. Headers must be in 
$kernel_dir or $kernel_dir/build. Modules will be installed in /lib/modules.')
-option('machine', type: 'string', value: 'auto', description:
-   'Alias of cpu_instruction_set.')
-option('max_ethports', type: 'integer', value: 32, description:
-   'maximum number of Ethernet devices')
-option('max_lcores', type: 'string', value: 'default', description:
-   'Set maximum number of cores/threads supported by EAL; "default" is 
different per-arch, "detect" detects the number of cores on the build machine.')
+option('enable_docs', type: 'boolean', value: false, description:
+   'Build documentation.')
+option('examples', type: 'string', value: '', description:
+   'Comma-separated list of examples to build by default.')
+option('tests', type: 'boolean', value: true, description:
+   'Build unit tests.')
+option('developer_mode', type: 'feature', description:
+   'Turn on additional build checks relevant for DPDK developers.')
+option('check_includes', type: 'boolean', value: false, description:
+   'Build chkincs to verify each header file can compile alone.')
+
+# library-specific options
 option('max_numa_nodes', type: 'string', value: 'default', description:
'Set the highest NUMA node supported by EAL; "default" is different 
per-arch, "detect" detects

Re: [dpdk-dev] [PATCH v3 0/4] Use correct memory ordering in eal functions

2021-10-25 Thread David Marchand
On Mon, Oct 25, 2021 at 6:53 AM Honnappa Nagarahalli
 wrote:
>
> v3:
> a) Added Fixes, Cc:stable#dpdk.org in 1/6
> b) Merged 3/6 & 4/6 and moved after the first commit in the series
> c) Merged 2/6 & 5/6 as they need to be in a single commit
> d) Removed use of volatile in 6/6 (Konstantin)
>
> rte_eal_remote_launch and rte_eal_wait_lcore need to provide
> correct memory ordering to address the data communication from
> main core to worker core.
>
> There are 2 use cases:
> 1) All the store operations (meant for worker) by main core
> should be visible to worker core before the worker core runs the
> assigned function
>
> 2) All the store operations by the worker core should be visible
> to the main core after rte_eal_wait_lcore returns.
>
> For the data that needs to be communicated to the worker after
> the rte_eal_remote_launch returns, load-acquire and store-release
> semantics should be used.
>
> For the main to worker communication, the pointer to function
> to execute is used as the guard variable. Hence, resetting of
> the function pointer is important.
>
> For the worker to main communication, the existing code uses the
> lcore state as the guard variable. However, it looks like
> the FINISHED state is not really required. Hence the FINISHED state
> is removed before using the state as the guard variable.
>
> Honnappa Nagarahalli (4):
>   eal: reset lcore function pointer and argument
>   eal: lcore state FINISHED is not required
>   eal: ensure correct memory ordering
>   test/ring: use relaxed barriers for ring stress test
>
>  app/test/test_ring_stress_impl.h  | 18 +++
>  drivers/event/dpaa2/dpaa2_eventdev_selftest.c |  2 +-
>  drivers/event/octeontx/ssovf_evdev_selftest.c |  2 +-
>  drivers/event/sw/sw_evdev_selftest.c  |  4 +-
>  examples/l2fwd-keepalive/main.c   |  3 +-
>  lib/eal/common/eal_common_launch.c| 13 ++---
>  lib/eal/freebsd/eal_thread.c  | 45 +
>  lib/eal/include/rte_launch.h  | 21 
>  lib/eal/include/rte_service.h |  4 +-
>  lib/eal/linux/eal_thread.c| 48 +--
>  lib/eal/windows/eal_thread.c  | 48 +--
>  11 files changed, 132 insertions(+), 76 deletions(-)

Tweaked commit titles, removed deprecation notice and updated release
notes in patch 2.
Series applied, thanks.


-- 
David Marchand
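
The two guard variables described in the cover letter can be sketched with
C11 atomics and a POSIX thread. This is an illustration of the
release/acquire pairing only: struct demo_lcore and the demo_* functions
are hypothetical and much simpler than the EAL thread loop, which handles
many launches and uses its own primitives.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

typedef int (demo_fn_t)(void *);

enum demo_state { DEMO_WAIT, DEMO_RUNNING };

struct demo_lcore {
	_Atomic(demo_fn_t *) f; /* guard variable: main -> worker */
	void *arg;
	int ret;
	_Atomic int state;      /* guard variable: worker -> main */
};

/* Example payload: square the int that arg points to. */
static int
demo_square(void *arg)
{
	int v = *(int *)arg;

	return v * v;
}

/* Worker: wait for one function, run it, publish the result. */
static void *
demo_worker(void *p)
{
	struct demo_lcore *lc = p;
	demo_fn_t *f;

	/* Acquire pairs with the release store in demo_remote_launch(). */
	while ((f = atomic_load_explicit(&lc->f, memory_order_acquire)) == NULL)
		;
	lc->ret = f(lc->arg);
	/* Reset the guard so a later launch is seen as a new request. */
	atomic_store_explicit(&lc->f, (demo_fn_t *)NULL, memory_order_relaxed);
	/* Release makes ret and the worker's other stores visible to main. */
	atomic_store_explicit(&lc->state, DEMO_WAIT, memory_order_release);
	return NULL;
}

/* Main side: publish the argument, then the function pointer. */
static void
demo_remote_launch(struct demo_lcore *lc, demo_fn_t *f, void *arg)
{
	lc->arg = arg;
	atomic_store_explicit(&lc->state, DEMO_RUNNING, memory_order_relaxed);
	/* Release: arg and state are visible before the worker sees f. */
	atomic_store_explicit(&lc->f, f, memory_order_release);
}

/* Main side: wait for completion and fetch the return value. */
static int
demo_wait_lcore(struct demo_lcore *lc)
{
	/* Acquire pairs with the worker's release store of DEMO_WAIT. */
	while (atomic_load_explicit(&lc->state, memory_order_acquire) !=
	       DEMO_WAIT)
		;
	return lc->ret;
}
```

In use, the worker thread is created first on a zero-initialised
struct demo_lcore, then main calls demo_remote_launch() followed by
demo_wait_lcore(). Only two states are needed here, which matches the
observation in the cover letter that a separate FINISHED state is not
required for the ordering to work.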



Re: [dpdk-dev] [PATCH] config: sort Meson options by categories

2021-10-25 Thread Thomas Monjalon
25/10/2021 18:17, Thomas Monjalon:
> Options used to be sorted alphabetically.
> It looks easier to read when major options are first,
> then path tuning, libs options, and drivers options.

Even better, we could insert a blank line between each option.





Re: [dpdk-dev] [dpdk-stable] [PATCH] hash: fix doxygen comments

2021-10-25 Thread Thomas Monjalon
10/09/2021 11:46, Mcnamara, John:
> From: Medvedkin, Vladimir 
> 
> The git diff makes this look like the ifdef is moving but I see that you are 
> moving the doc into the right place so Doxygen can pick it up.
> 
> Acked-by: John McNamara 

title: hash: fix Doxygen comment of Toeplitz file

Applied, thanks.





Re: [dpdk-dev] [PATCH v5 5/5] test/thash: add performance tests for the Toeplitz hash

2021-10-25 Thread Thomas Monjalon
21/10/2021 20:54, Vladimir Medvedkin:
> This patch adds performance tests for different implementations
> of the Toeplitz hash function.

Please name them.

> Signed-off-by: Vladimir Medvedkin 

There is some garbage,

> @@ -320,6 +321,7 @@ perf_test_names = [
>  'hash_readwrite_lf_perf_autotest',
>  'trace_perf_autotest',
>  'ipsec_perf_autotest',
> + 'thash_perf_autotest',

here (tabs instead of spaces)

>  driver_test_names = [
> diff --git a/app/test/test_thash_perf.c b/app/test/test_thash_perf.c
> new file mode 100644
> index 000..fb66e20
> --- /dev/null
> +++ b/app/test/test_thash_perf.c
> @@ -0,0 +1,120 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2021 Intel Corporation
> + */
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#include "test.h"
> +
> +#define ITERATIONS   (1 << 15)
> +#define  BATCH_SZ(1 << 10)
> +
> +#define IPV4_2_TUPLE_LEN (8)
> +#define IPV4_4_TUPLE_LEN (12)
> +#define IPV6_2_TUPLE_LEN (32)
> +#define IPV6_4_TUPLE_LEN (36)
> +
> +
> +static uint8_t default_rss_key[] = {
> + 0x6d, 0x5a, 0x56, 0xda, 0x25, 0x5b, 0x0e, 0xc2,
> + 0x41, 0x67, 0x25, 0x3d, 0x43, 0xa3, 0x8f, 0xb0,
> + 0xd0, 0xca, 0x2b, 0xcb, 0xae, 0x7b, 0x30, 0xb4,
> + 0x77, 0xcb, 0x2d, 0xa3, 0x80, 0x30, 0xf2, 0x0c,
> + 0x6a, 0x42, 0xb7, 0x3b, 0xbe, 0xac, 0x01, 0xfa,
> +};
> +
> +static void
> +run_thash_test(unsigned int tuple_len)
> +{
> + uint32_t *tuples[BATCH_SZ];
> + unsigned int i, j;
> + uint64_t start_tsc, end_tsc;
> + uint32_t len = RTE_ALIGN_CEIL(tuple_len, sizeof(uint32_t));
> + volatile uint32_t hash = 0;
> + uint32_t bulk_hash[BATCH_SZ] = { 0 };
> +
> + for (i = 0; i < BATCH_SZ; i++) {
> + tuples[i] = rte_zmalloc(NULL, len, 0);
> + for (j = 0; j < len / sizeof(uint32_t); j++)
> + tuples[i][j] = rte_rand();
> + }
> +
> + start_tsc = rte_rdtsc_precise();
> + for (i = 0; i < ITERATIONS; i++) {
> + for (j = 0; j < BATCH_SZ; j++) {
> + hash ^= rte_softrss(tuples[j], len / sizeof(uint32_t),
> + default_rss_key);
> + }
> + }
> + end_tsc = rte_rdtsc_precise();
> +
> + printf("Average rte_softrss() takes \t\t%.1f cycles for key len %d\n",
> + (double)(end_tsc - start_tsc) / (double)(ITERATIONS *
> + BATCH_SZ), len);
> +
> + start_tsc = rte_rdtsc_precise();
> + for (i = 0; i < ITERATIONS; i++) {
> + for (j = 0; j < BATCH_SZ; j++) {
> + hash ^= rte_softrss_be(tuples[j], len /
> + sizeof(uint32_t), default_rss_key);
> + }
> + }
> + end_tsc = rte_rdtsc_precise();
> +
> + printf("Average rte_softrss_be() takes \t\t%.1f cycles for key len 
> %d\n",
> + (double)(end_tsc - start_tsc) / (double)(ITERATIONS *
> + BATCH_SZ), len);

The function could stop here (one function per type of implementation).

> +
> + if (!rte_thash_gfni_supported())
> + return;
> +
> + uint64_t rss_key_matrixes[RTE_DIM(default_rss_key)];
> +
> + rte_thash_complete_matrix(rss_key_matrixes, default_rss_key,
> + RTE_DIM(default_rss_key));
> +
> + start_tsc = rte_rdtsc_precise();
> + for (i = 0; i < ITERATIONS; i++) {
> + for (j = 0; j < BATCH_SZ; j++)
> + hash ^= rte_thash_gfni(rss_key_matrixes,
> + (uint8_t *)tuples[j], len);
> + }
> + end_tsc = rte_rdtsc_precise();
> +
> + printf("Average rte_thash_gfni takes \t\t%.1f cycles for key len %d\n",
> + (double)(end_tsc - start_tsc) / (double)(ITERATIONS *
> + BATCH_SZ), len);
> +
> + start_tsc = rte_rdtsc_precise();
> + for (i = 0; i < ITERATIONS; i++)
> + rte_thash_gfni_bulk(rss_key_matrixes, len, (uint8_t **)tuples,
> + bulk_hash, BATCH_SZ);
> +
> + end_tsc = rte_rdtsc_precise();
> +
> + printf("Average rte_thash_gfni_x2 takes \t%.1f cycles for key len %d\n",

and here, the function name is not updated.

> + (double)(end_tsc - start_tsc) / (double)(ITERATIONS *
> + BATCH_SZ), len);
> +

useless blank line

> +}
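
One way to read the "one function per type of implementation" comment is
a single timing driver that takes the implementation as a callback, so
each variant gets its own short test function. The sketch below assumes a
hypothetical common signature (demo_hash_fn) and uses the standard clock()
call instead of the TSC helpers; none of these names come from the patch.

```c
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical signature shared by every implementation under test. */
typedef uint32_t (*demo_hash_fn)(const uint32_t *tuple, size_t words);

struct demo_result {
	uint32_t checksum;  /* XOR of all hashes, keeps the loop alive */
	double ns_per_call; /* average CPU time of one call */
};

/* Trivial stand-in implementation: XOR all words of the tuple. */
static uint32_t
demo_xor_fold(const uint32_t *tuple, size_t words)
{
	uint32_t h = 0;
	size_t i;

	for (i = 0; i < words; i++)
		h ^= tuple[i];
	return h;
}

/* Time one implementation over iterations * batch calls. */
static struct demo_result
demo_bench(demo_hash_fn fn, const uint32_t *tuples, size_t batch,
	   size_t words, unsigned int iterations)
{
	struct demo_result r = { 0, 0.0 };
	volatile uint32_t sink = 0;
	clock_t start, end;
	unsigned int i;
	size_t j;

	start = clock();
	for (i = 0; i < iterations; i++)
		for (j = 0; j < batch; j++)
			sink ^= fn(&tuples[j * words], words);
	end = clock();

	r.checksum = sink;
	if (iterations > 0 && batch > 0)
		r.ns_per_call = (double)(end - start) * 1e9 /
			(double)CLOCKS_PER_SEC /
			((double)iterations * (double)batch);
	return r;
}
```

Each implementation then needs only a thin wrapper with the shared
signature and one call to the driver, which also makes it easy to print
the correct function name per result.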





Re: [dpdk-dev] [PATCH v5 0/6] make rte_intr_handle internal

2021-10-25 Thread Raslan Darawsheh
Hi,

> -Original Message-
> From: dev  On Behalf Of Harman Kalra
> Sent: Friday, October 22, 2021 11:49 PM
> To: dev@dpdk.org
> Cc: david.march...@redhat.com; dmitry.kozl...@gmail.com;
> m...@ashroe.eu; NBU-Contact-Thomas Monjalon ;
> Harman Kalra 
> Subject: [dpdk-dev] [PATCH v5 0/6] make rte_intr_handle internal
> 
> Moving struct rte_intr_handle as an internal structure to
> avoid any ABI breakages in future. Since this structure defines
> some static arrays and changing respective macros breaks the ABI.
> Eg:
> Currently RTE_MAX_RXTX_INTR_VEC_ID imposes a limit of maximum 512
> MSI-X interrupts that can be defined for a PCI device, while PCI
> specification allows maximum 2048 MSI-X interrupts that can be used.
> If some PCI device requires more than 512 vectors, either change the
> RTE_MAX_RXTX_INTR_VEC_ID limit or dynamically allocate based on
> PCI device MSI-X size on probe time. Either way it's an ABI breakage.
> 
> Change already included in 21.11 ABI improvement spreadsheet (item 42):
> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Furld
> efense.proofpoint.com%2Fv2%2Furl%3Fu%3Dhttps-
> 3A__docs.google.com_s&data=04%7C01%7Crasland%40nvidia.com%7C
> 567d8ee2e3c842a9e59808d9959d822e%7C43083d15727340c1b7db39efd9ccc1
> 7a%7C0%7C0%7C637705326003996997%7CUnknown%7CTWFpbGZsb3d8eyJ
> WIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%
> 7C1000&sdata=7UgxpkEtH%2Fnjk7xo9qELjqWi58XLzzCH2pimeDWLzvc%
> 3D&reserved=0
> preadsheets_d_1betlC000ua5SsSiJIcC54mCCCJnW6voH5Dqv9UxeyfE_edit-
> 23gid-
> 3D0&d=DwICaQ&c=nKjWec2b6R0mOyPaz7xtfQ&r=5ESHPj7V-
> 7JdkxT_Z_SU6RrS37ys4U
> XudBQ_rrS5LRo&m=7dl3OmXU7QHMmWYB6V1hYJtq1cUkjfhXUwze2Si_48c
> &s=lh6DEGhR
> Bg1shODpAy3RQk-H-0uQx5icRfUBf9dtCp4&e=
> 
> This series makes struct rte_intr_handle totally opaque to the outside
> world by wrapping it inside a .c file and providing get/set wrapper APIs
> to read or manipulate its fields. Any changes to be made to any of the
> fields should be done via these get/set APIs.
> Introduced a new eal_common_interrupts.c where all these APIs are
> defined and which also hides the struct rte_intr_handle definition.
> 
> Details on each patch of the series:
> Patch 1: eal/interrupts: implement get set APIs
> This patch provides prototypes and implementation of all the new
> get set APIs. Alloc APIs are implemented to allocate memory for the
> interrupt handle instance. Currently most of the drivers define the
> interrupt handle instance as static, but now it can't be static as the
> size of rte_intr_handle is unknown to all the drivers. Drivers are
> expected to allocate interrupt instances during initialization
> and free these instances during the cleanup phase.
> This patch also rearranges the headers related to the interrupt
> framework. Epoll related definitions and prototypes are moved into a
> new header, i.e. rte_epoll.h, and APIs defined in rte_eal_interrupts.h
> which were driver specific are moved to rte_interrupts.h (as it was
> anyway accessible and used outside the DPDK library). Later in the series
> rte_eal_interrupts.h is removed.
> 
> Patch 2: eal/interrupts: avoid direct access to interrupt handle
> Modifying the interrupt framework for linux and freebsd to use these
> get set alloc APIs as per requirement and avoid accessing the fields
> directly.
> 
> Patch 3: test/interrupt: apply get set interrupt handle APIs
> Updating interrupt test suite to use interrupt handle APIs.
> 
> Patch 4: drivers: remove direct access to interrupt handle fields
> Modifying all the drivers and libraries which are currently directly
> accessing the interrupt handle fields. Drivers are expected to
> allocate the interrupt instance, use get set APIs with the allocated
> interrupt handle and free it on cleanup.
> 
> Patch 5: eal/interrupts: make interrupt handle structure opaque
> In this patch rte_eal_interrupts.h is removed and the struct rte_intr_handle
> definition is moved to a .c file to make it completely opaque. As part of
> interrupt handle allocation, arrays like efds and elist (which are currently
> static) are dynamically allocated with default size
> (RTE_MAX_RXTX_INTR_VEC_ID). Later these arrays can be reallocated as per
> device requirement using new API rte_intr_handle_event_list_update().
> Eg, on PCI device probing MSIX size can be queried and these arrays can
> be reallocated accordingly.
> 
> Patch 6: eal/alarm: introduce alarm fini routine
> Introducing alarm fini routine, as the memory allocated for alarm interrupt
> instance can be freed in alarm fini.
> 
> Testing performed:
> 1. Validated the series by running interrupts and alarm test suite.
> 2. Validated l3fwd power functionality with octeontx2 and i40e intel cards,
>where interrupts are expected on packet arrival.
> 
> v1:
> * Fixed freebsd compilation failure
> * Fixed seg fault in case of memif
> 
> v2:
> * Merged the prototype and implementation patch to 1.
> * Restricting allocation of single interrupt instance.
> * Removed base APIs, as they were exposing internally
> al

Re: [dpdk-dev] [PATCH v5 3/5] doc/hash: update documentation for the thash library

2021-10-25 Thread Thomas Monjalon
Vladimir, your patches are late and not perfect.
You need reviews. Please ask other maintainers to help with reviews.


21/10/2021 20:54, Vladimir Medvedkin:
> This patch adds documentation for the new optimized Toeplitz hash
> implementation using GFNI.
> 
> Signed-off-by: Vladimir Medvedkin 
> ---
>  doc/guides/prog_guide/toeplitz_hash_lib.rst | 37 
> +
>  doc/guides/rel_notes/release_21_11.rst  |  4 
>  2 files changed, 37 insertions(+), 4 deletions(-)
> 
> diff --git a/doc/guides/prog_guide/toeplitz_hash_lib.rst 
> b/doc/guides/prog_guide/toeplitz_hash_lib.rst
> index f916857..88b152e 100644
> --- a/doc/guides/prog_guide/toeplitz_hash_lib.rst
> +++ b/doc/guides/prog_guide/toeplitz_hash_lib.rst
> @@ -19,24 +19,53 @@ to calculate the RSS hash sum to spread the traffic among 
> the queues.
>  Toeplitz hash function API
>  --
>  
> -There are two functions that provide calculation of the Toeplitz hash sum:
> +There are four functions that provide calculation of the Toeplitz hash sum:
>  
>  * ``rte_softrss()``
>  * ``rte_softrss_be()``
> +* ``rte_thash_gfni()``
> +* ``rte_thash_gfni_x2()``

The last function doesn't exist. I think it should be the _bulk one.

Also please squash the doc and test with the relevant code addition.
Maybe 2 patches for each implementation?
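
As background for the functions listed in the quoted documentation, the
hash they all compute can be written as a short bitwise reference in
plain C. This is a byte-oriented sketch for illustration, not the library
code: demo_toeplitz() is a hypothetical name, and its argument layout
(raw bytes in network order) is an assumption that differs from the
word-based arguments of the library functions.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Bitwise Toeplitz hash: for every set bit i of the input, XOR in the
 * 32-bit window of the key that starts at key bit i. The key must be at
 * least len + 4 bytes long.
 */
static uint32_t
demo_toeplitz(const uint8_t *key, const uint8_t *data, size_t len)
{
	/* The window initially holds key bits 0..31. */
	uint32_t window = ((uint32_t)key[0] << 24) | ((uint32_t)key[1] << 16) |
			  ((uint32_t)key[2] << 8) | (uint32_t)key[3];
	uint32_t hash = 0;
	size_t i;
	int bit;

	for (i = 0; i < len; i++) {
		for (bit = 7; bit >= 0; bit--) {
			if (data[i] & (1u << bit))
				hash ^= window;
			/* Slide one bit, pulling in the next key bit. */
			window = (window << 1) |
				 ((uint32_t)(key[i + 4] >> bit) & 1u);
		}
	}
	return hash;
}
```

The hash is linear over XOR of inputs, which is the property that lets an
optimized implementation process the input in larger blocks with
precomputed matrices instead of one bit at a time.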




Re: [dpdk-dev] [PATCH v5 1/5] hash: add new toeplitz hash implementation

2021-10-25 Thread Thomas Monjalon
21/10/2021 20:54, Vladimir Medvedkin:
> This patch adds a new Toeplitz hash implementation using
> Galois Field New Instructions (GFNI).
> 
> Signed-off-by: Vladimir Medvedkin 
> Acked-by: Konstantin Ananyev 
> ---
> --- a/lib/hash/version.map
> +++ b/lib/hash/version.map
> @@ -39,10 +39,12 @@ EXPERIMENTAL {
>  
>   rte_thash_add_helper;
>   rte_thash_adjust_tuple;
> + rte_thash_complete_matrix;
>   rte_thash_find_existing;
>   rte_thash_free_ctx;
>   rte_thash_get_complement;
>   rte_thash_get_helper;
>   rte_thash_get_key;
> + rte_thash_gfni_supported;
>   rte_thash_init_ctx;
>  };
> 

It should be like this:

--- a/lib/hash/version.map
+++ b/lib/hash/version.map
@@ -37,6 +37,7 @@ DPDK_22 {
 EXPERIMENTAL {
global:
 
+   # added in 21.05
rte_thash_add_helper;
rte_thash_adjust_tuple;
rte_thash_find_existing;
@@ -45,4 +46,8 @@ EXPERIMENTAL {
rte_thash_get_helper;
rte_thash_get_key;
rte_thash_init_ctx;
+
+   # added in 21.11
+   rte_thash_complete_matrix;
+   rte_thash_gfni_supported;
 };






Re: [dpdk-dev] [dpdk-stable] [PATCH v2] lpm: fix buffer overflow

2021-10-25 Thread Thomas Monjalon
22/10/2021 11:07, Bruce Richardson:
> On Thu, Oct 21, 2021 at 06:15:49PM +0100, Vladimir Medvedkin wrote:
> > This patch fixes buffer overflow reported by ASAN,
> > please reference https://bugs.dpdk.org/show_bug.cgi?id=819
> > 
> > The rte_lpm6 keeps routing information for control plane purpose
> > inside the rte_hash table which uses rte_jhash() as a hash function.
> > From the rte_jhash() documentation: If input key is not aligned to
> > four byte boundaries or a multiple of four bytes in length,
> > the memory region just after may be read (but not used in the
> > computation).
> > rte_lpm6 uses 17-byte keys consisting of an IPv6 address (16 bytes) +
> > depth (1 byte).
> > 
> > This patch widens the depth field to uint32_t
> > and sets the alignment to 4 bytes.
> > 
> > Bugzilla ID: 819
> > Fixes: 86b3b21952a8 ("lpm6: store rules in hash table")
> > Cc: a...@therouter.net
> > Cc: sta...@dpdk.org
> > 
> > Signed-off-by: Vladimir Medvedkin 
> 
> Acked-by: Bruce Richardson 

Applied, thanks.
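
For illustration, the size effect described in the commit log can be checked
with a small sketch. The struct and function names below are hypothetical
stand-ins, not the actual rte_lpm6 definitions, and the sizes assume a common
ABI where uint32_t is 4 bytes with 4-byte alignment. The helper models a hash
that consumes its input in 4-byte words, which is an assumption about the
general technique rather than a description of rte_jhash() internals.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define IPV6_ADDR_SIZE 16 /* bytes in an IPv6 address */

/* Hypothetical old layout: 16-byte address + 1-byte depth = 17 bytes. */
struct rule_key_old {
	uint8_t ip[IPV6_ADDR_SIZE];
	uint8_t depth;
};

/* Hypothetical new layout: depth widened to 32 bits, so the key is
 * 20 bytes long on common ABIs, a multiple of four. */
struct rule_key_new {
	uint8_t ip[IPV6_ADDR_SIZE];
	uint32_t depth;
};

/* Bytes that a reader working in 4-byte words touches beyond the end
 * of a key of the given length. */
static size_t
bytes_read_past_end(size_t key_len)
{
	return (4 - key_len % 4) % 4;
}

/* Self-check of the sizes and the over-read for both layouts. */
static void
rule_key_check(void)
{
	assert(sizeof(struct rule_key_old) == 17);
	assert(sizeof(struct rule_key_new) == 20);
	assert(bytes_read_past_end(sizeof(struct rule_key_old)) == 3);
	assert(bytes_read_past_end(sizeof(struct rule_key_new)) == 0);
}
```

With the 17-byte key, a word-wise reader touches 3 bytes after the key, which
is the kind of access ASAN flags; with the 20-byte key there is nothing left
to over-read.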





Re: [dpdk-dev] [PATCH] rib: fix the IPv6 depth mask

2021-10-25 Thread Thomas Monjalon
06/09/2021 17:54, Vladimir Medvedkin:
> Fixes: 03b8372a9a73 ("rib: fix max depth IPv6 lookup")
> Cc: ohily...@iol.unh.edu
> Cc: sta...@dpdk.org
> 
> Signed-off-by: Vladimir Medvedkin 

An explanation would have been appreciated.

> - index = (depth & (UINT8_MAX - 1)) / CHAR_BIT;
> + index = (depth & INT8_MAX) / CHAR_BIT;

Applied, thanks.
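
Since the commit log gives no explanation, here is a minimal sketch of what
the two masks do. It assumes CHAR_BIT is 8 and that the index is used on a
16-byte IPv6 address; the function names are made up for the example and only
the two expressions come from the patch.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

#define IPV6_ADDR_SIZE 16 /* assumed size of the indexed array */

/* Mask before the fix: UINT8_MAX - 1 == 0xFE keeps bit 7, so a depth
 * of 128 passes through unchanged. */
static uint8_t
index_old(uint8_t depth)
{
	return (depth & (UINT8_MAX - 1)) / CHAR_BIT;
}

/* Mask after the fix: INT8_MAX == 0x7F limits the masked depth
 * to the range 0..127. */
static uint8_t
index_new(uint8_t depth)
{
	return (depth & INT8_MAX) / CHAR_BIT;
}

/* Self-check: the variants differ only for depth >= 128. */
static void
depth_mask_check(void)
{
	unsigned int d;

	for (d = 0; d < 128; d++)
		assert(index_old((uint8_t)d) == index_new((uint8_t)d));

	/* depth 128: old index is one past a 16-byte array. */
	assert(index_old(128) == IPV6_ADDR_SIZE);
	assert(index_new(128) == 0);

	/* The new mask can never index outside 16 bytes. */
	for (d = 0; d <= UINT8_MAX; d++)
		assert(index_new((uint8_t)d) < IPV6_ADDR_SIZE);
}
```

Under these assumptions the old mask yields index 16 for a depth of 128,
while the new mask keeps every result within 0..15.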




Re: [dpdk-dev] [PATCH] fib: add rib extension size parameter

2021-10-25 Thread Thomas Monjalon
06/09/2021 17:55, Vladimir Medvedkin:
> This patch adds a new parameter to the fib configuration to specify
> the size of the extension for internal RIB structure.

It looks to be an unannounced API change.
What happens if the new field is not initialized in the app?
At least it would deserve a note in the release notes in API changes I think.
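
As a sketch of the initialization question above: in standard C, a structure
declared on the stack and filled field by field leaves any member that is not
assigned with an indeterminate value, while a designated initializer or a
memset() zeroes it. The struct below is a hypothetical, trimmed-down stand-in
for struct rte_fib_conf, used only to illustrate the two safe patterns.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the configuration structure. */
struct fib_conf_sketch {
	uint64_t default_nh;
	int max_routes;
	unsigned int rib_ext_sz; /* the newly added field */
};

/* Designated initializers zero every member that is not named. */
static struct fib_conf_sketch
conf_designated(void)
{
	struct fib_conf_sketch conf = {
		.default_nh = 0,
		.max_routes = 1 << 16,
	};

	return conf;
}

/* Zeroing with memset() before assigning fields has the same effect. */
static struct fib_conf_sketch
conf_memset(void)
{
	struct fib_conf_sketch conf;

	memset(&conf, 0, sizeof(conf));
	conf.max_routes = 1 << 16;
	return conf;
}

/* Self-check: the new field defaults to 0 in both patterns. */
static void
conf_defaults_check(void)
{
	assert(conf_designated().rib_ext_sz == 0);
	assert(conf_memset().rib_ext_sz == 0);
}
```

An application using either pattern would pick up a zero extension size
without source changes; one that assigns fields on an uninitialized structure
would not, which is why a release note seems worthwhile.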

> --- a/examples/l3fwd/l3fwd_fib.c
> +++ b/examples/l3fwd/l3fwd_fib.c
> @@ -426,6 +426,7 @@ setup_fib(const int socketid)
>   /* Create the fib IPv4 table. */
>   config_ipv4.type = RTE_FIB_DIR24_8;
>   config_ipv4.max_routes = (1 << 16);
> + config_ipv4.rib_ext_sz = 0;
>   config_ipv4.default_nh = FIB_DEFAULT_HOP;
>   config_ipv4.dir24_8.nh_sz = RTE_FIB_DIR24_8_4B;
>   config_ipv4.dir24_8.num_tbl8 = (1 << 15);
> @@ -475,6 +476,7 @@ setup_fib(const int socketid)
>  
>   config.type = RTE_FIB6_TRIE;
>   config.max_routes = (1 << 16) - 1;
> + config.rib_ext_sz = 0;
>   config.default_nh = FIB_DEFAULT_HOP;
>   config.trie.nh_sz = RTE_FIB6_TRIE_4B;
>   config.trie.num_tbl8 = (1 << 15);
> diff --git a/lib/fib/rte_fib.c b/lib/fib/rte_fib.c
> index b354d4b..6ca180d 100644
> --- a/lib/fib/rte_fib.c
> +++ b/lib/fib/rte_fib.c
> @@ -164,7 +164,7 @@ rte_fib_create(const char *name, int socket_id, struct 
> rte_fib_conf *conf)
>   return NULL;
>   }
>  
> - rib_conf.ext_sz = 0;
> + rib_conf.ext_sz = conf->rib_ext_sz;
>   rib_conf.max_nodes = conf->max_routes * 2;
>  
>   rib = rte_rib_create(name, socket_id, &rib_conf);
> diff --git a/lib/fib/rte_fib.h b/lib/fib/rte_fib.h
> index acad209..570b4b6 100644
> --- a/lib/fib/rte_fib.h
> +++ b/lib/fib/rte_fib.h
> @@ -84,6 +84,8 @@ struct rte_fib_conf {
>   /** Default value returned on lookup if there is no route */
>   uint64_t default_nh;
>   int max_routes;
> + /** Size of the node extension in the internal RIB struct */
> + unsigned int rib_ext_sz;
>   union {
>   struct {
>   enum rte_fib_dir24_8_nh_sz nh_sz;





Re: [dpdk-dev] [PATCH v5 5/5] test/thash: add performance tests for the Toeplitz hash

2021-10-25 Thread Stephen Hemminger
On Thu, 21 Oct 2021 19:54:29 +0100
Vladimir Medvedkin  wrote:

> +static uint8_t default_rss_key[] = {

Should this be const?

That way you can make sure the API isn't modifying it.
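
A minimal sketch of the suggestion, with placeholder bytes rather than the
key from the patch, and a hypothetical consumer function:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder contents, not the RSS key from the patch. */
static const uint8_t example_key[] = { 0x01, 0x02, 0x03, 0x04 };

/* Hypothetical consumer: the const-qualified parameter documents that
 * the key is only read; a write through it would not compile. */
static uint32_t
key_byte_sum(const uint8_t *key, size_t len)
{
	uint32_t sum = 0;
	size_t i;

	for (i = 0; i < len; i++)
		sum += key[i];
	return sum;
}

/* Self-check: the const array can be passed without a cast. */
static void
key_sketch_check(void)
{
	assert(key_byte_sum(example_key, sizeof(example_key)) == 10);
}
```

If an API took a plain uint8_t pointer instead, passing the const array would
draw a compiler diagnostic, which is the check being asked for.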


Re: [dpdk-dev] [PATCH v8 0/9] make rte_intr_handle internal

2021-10-25 Thread Raslan Darawsheh
Hi,
> -Original Message-
> From: David Marchand 
> Sent: Monday, October 25, 2021 5:27 PM
> To: hka...@marvell.com; dev@dpdk.org
> Cc: dmitry.kozl...@gmail.com; Raslan Darawsheh ;
> NBU-Contact-Thomas Monjalon 
> Subject: [PATCH v8 0/9] make rte_intr_handle internal
> 
> Move struct rte_intr_handle to an internal structure to avoid ABI
> breakages in the future. The structure defines some static arrays, and
> changing the macros that size them breaks the ABI.
> E.g.:
> Currently RTE_MAX_RXTX_INTR_VEC_ID imposes a limit of 512 MSI-X
> interrupts that can be defined for a PCI device, while the PCI specification
> allows up to 2048 MSI-X interrupts.
> If some PCI device requires more than 512 vectors, either change the
> RTE_MAX_RXTX_INTR_VEC_ID limit or dynamically allocate based on the PCI
> device MSI-X size at probe time. Either way it's an ABI breakage.
> 
> Change already included in 21.11 ABI improvement spreadsheet (item 42):
> https://docs.google.com/spreadsheets/d/1betlC000ua5SsSiJIcC54mCCCJnW6voH5Dqv9UxeyfE/edit#gid=0
> 
> This series makes struct rte_intr_handle totally opaque to the outside world
> by wrapping it inside a .c file and providing get/set wrapper APIs to read or
> manipulate its fields. Any changes to any of the fields should be
> made via these get/set APIs.
> A new file, eal_common_interrupts.c, defines all these APIs and
> hides the struct rte_intr_handle definition.
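
The opaque-handle approach described above can be sketched in a few lines.
All names here are hypothetical and do not reproduce the API added by the
series; the point is only that callers see an incomplete type plus accessors,
and that the vector list is sized at allocation time instead of by a macro.

```c
#include <assert.h>
#include <stdlib.h>

/* Public view (hypothetical names): callers only see an incomplete type. */
struct intr_handle_sketch;

/* Private view: in a real library this would live in a .c file. */
struct intr_handle_sketch {
	int fd;        /* device file descriptor, -1 when unset */
	int nb_vec;    /* number of entries in vec_list */
	int *vec_list; /* sized at allocation time, not by a macro */
};

static struct intr_handle_sketch *
intr_sketch_alloc(int nb_vec)
{
	struct intr_handle_sketch *h;

	if (nb_vec <= 0)
		return NULL;
	h = calloc(1, sizeof(*h));
	if (h == NULL)
		return NULL;
	h->vec_list = calloc((size_t)nb_vec, sizeof(*h->vec_list));
	if (h->vec_list == NULL) {
		free(h);
		return NULL;
	}
	h->fd = -1;
	h->nb_vec = nb_vec;
	return h;
}

static void
intr_sketch_free(struct intr_handle_sketch *h)
{
	if (h == NULL)
		return;
	free(h->vec_list);
	free(h);
}

static int
intr_sketch_fd_set(struct intr_handle_sketch *h, int fd)
{
	if (h == NULL)
		return -1;
	h->fd = fd;
	return 0;
}

static int
intr_sketch_fd_get(const struct intr_handle_sketch *h)
{
	return h == NULL ? -1 : h->fd;
}

static int
intr_sketch_vec_set(struct intr_handle_sketch *h, int idx, int val)
{
	if (h == NULL || idx < 0 || idx >= h->nb_vec)
		return -1;
	h->vec_list[idx] = val;
	return 0;
}

/* Usage: 2048 vectors need no change to any public structure. */
static void
intr_sketch_demo(void)
{
	struct intr_handle_sketch *h = intr_sketch_alloc(2048);

	assert(h != NULL);
	assert(intr_sketch_fd_set(h, 7) == 0);
	assert(intr_sketch_fd_get(h) == 7);
	assert(intr_sketch_vec_set(h, 2047, 1) == 0);
	assert(intr_sketch_vec_set(h, 2048, 1) == -1);
	intr_sketch_free(h);
}
```

Because the layout is private, growing the vector list or adding fields
changes only the implementation file and not the size of any type that
applications compile against.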
> 
> v1:
> * Fixed freebsd compilation failure
> * Fixed seg fault in case of memif
> 
> v2:
> * Merged the prototype and implementation patch to 1.
> * Restricting allocation of single interrupt instance.
> * Removed base APIs, as they were exposing internally allocated memory
> information.
> * Fixed some memory leak issues.
> * Marked some library specific APIs as internal.
> 
> v3:
> * Removed flag from instance alloc API; instead, auto-detect whether memory
> should be allocated using glibc malloc APIs or rte_malloc*
> * Added APIs for get/set windows handle.
> * Defined macros for repeated checks.
> 
> v4:
> * Rectified some typo in the APIs documentation.
> * Better names for some internal variables.
> 
> v5:
> * Reverted to passing a flag to the instance alloc API, because with
> auto-detection some existing multiprocess issues in the library caused test failures.
> * Rebased to top of tree.
> 
> v6:
> * renamed RTE_INTR_INSTANCE_F_UNSHARED as
> RTE_INTR_INSTANCE_F_PRIVATE,
> * changed API and removed need for alloc_flag content exposure
>   (see rte_intr_instance_dup() in patch 1 and 2),
> * exported all symbols for Windows,
> * fixed leak in unit tests in case of alloc failure,
> * split (previously) patch 4 into three patches
>   * (now) patch 4 only concerns alarm and (previously) patch 6 cleanup bits
> are squashed in it,
>   * (now) patch 5 concerns other libraries updates,
>   * (now) patch 6 concerns drivers updates:
> * instance allocation is moved to probing for auxiliary,
> * there might be a bug for PCI drivers not requesting
>   RTE_PCI_DRV_NEED_MAPPING, but code is left as v5,
> * split (previously) patch 5 into three patches
>   * (now) patch 7 only hides structure, but keeps it in an EAL private
> header, this makes it possible to keep info in tracepoints,
>   * (now) patch 8 deals with VFIO/UIO internal fds merge,
>   * (now) patch 9 extends event list,
> 
> v7:
> * fixed compilation on FreeBSD,
> * removed unused interrupt handle in FreeBSD alarm code,
> * fixed interrupt handle allocation for PCI drivers without
>   RTE_PCI_DRV_NEED_MAPPING,
> 
> v8:
> * lowered logs level to DEBUG in sanity checks,
> * fixed corner case with vector list access,
> 
> --
> David Marchand
> 
> Harman Kalra (9):
>   interrupts: add allocator and accessors
>   interrupts: remove direct access to interrupt handle
>   test/interrupts: remove direct access to interrupt handle
>   alarm: remove direct access to interrupt handle
>   lib: remove direct access to interrupt handle
>   drivers: remove direct access to interrupt handle
>   interrupts: make interrupt handle structure opaque
>   interrupts: rename device specific file descriptor
>   interrupts: extend event list
> 
>  MAINTAINERS   |   1 +
>  app/test/test_interrupts.c| 164 +++--
>  drivers/baseband/acc100/rte_acc100_pmd.c   

Re: [dpdk-dev] [PATCH] eal/windows: fix IOVA mode detection and handling

2021-10-25 Thread Kadam, Pallavi



On 10/25/2021 5:20 AM, Dmitry Kozlyuk wrote:

Windows EAL did not detect IOVA mode and worked incorrectly
if physical addresses could not be obtained
(if virt2phys driver was missing or inaccessible).
In this case, rte_mem_virt2iova() reported RTE_BAD_IOVA for any address.
Inability to obtain IOVA, be it PA or VA, should cause a failure
for the DPDK allocator, but it was hidden by the implementation,
so allocations did not fail when they should have.
The mode in which DPDK can work without obtaining PA is IOVA-as-VA mode.
However, rte_eal_iova_mode() always returned RTE_IOVA_DC
(while it should only ever return RTE_IOVA_PA or RTE_IOVA_VA),
because IOVA mode detection was not implemented.

Implement IOVA mode detection:
1. Always allow forcing --iova-mode=va.
2. Allow forcing --iova-mode=pa only if virt2phys is available.
3. If no mode is forced and virt2phys is available,
   select the mode according to bus requests, defaulting to PA.
4. If no mode is forced but virt2phys is unavailable, default to VA.
Fix rte_mem_virt2iova() by returning VA when using IOVA-as-VA.
Fix rte_eal_iova_mode() by returning the selected mode.
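
The four detection rules listed above can be sketched as a single decision
function. This is an illustration under stated assumptions, not the code from
the patch: the enum values and the function name are hypothetical stand-ins
for the RTE_IOVA_* modes and the EAL logic, and a forced PA mode without
virt2phys is treated here as an error.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for RTE_IOVA_DC, RTE_IOVA_PA and RTE_IOVA_VA. */
enum iova_mode { IOVA_DC, IOVA_PA, IOVA_VA };

/*
 * forced: mode given with --iova-mode, or IOVA_DC if none was given.
 * bus_request: mode requested by the buses, IOVA_DC if they do not care.
 * Returns 0 and stores the selected mode in *out,
 * or -1 if a forced PA mode cannot be honoured.
 */
static int
select_iova_mode(enum iova_mode forced, bool virt2phys_available,
		enum iova_mode bus_request, enum iova_mode *out)
{
	if (forced == IOVA_VA) {            /* rule 1 */
		*out = IOVA_VA;
		return 0;
	}
	if (forced == IOVA_PA) {            /* rule 2 */
		if (!virt2phys_available)
			return -1;
		*out = IOVA_PA;
		return 0;
	}
	if (virt2phys_available)            /* rule 3 */
		*out = (bus_request == IOVA_DC) ? IOVA_PA : bus_request;
	else                                /* rule 4 */
		*out = IOVA_VA;
	return 0;
}

/* Self-check covering one case per rule. */
static void
iova_mode_check(void)
{
	enum iova_mode m = IOVA_DC;

	assert(select_iova_mode(IOVA_VA, false, IOVA_PA, &m) == 0 && m == IOVA_VA);
	assert(select_iova_mode(IOVA_PA, false, IOVA_DC, &m) == -1);
	assert(select_iova_mode(IOVA_PA, true, IOVA_DC, &m) == 0 && m == IOVA_PA);
	assert(select_iova_mode(IOVA_DC, true, IOVA_DC, &m) == 0 && m == IOVA_PA);
	assert(select_iova_mode(IOVA_DC, true, IOVA_VA, &m) == 0 && m == IOVA_VA);
	assert(select_iova_mode(IOVA_DC, false, IOVA_PA, &m) == 0 && m == IOVA_VA);
}
```

In this sketch the function never returns IOVA_DC as the selected mode, which
matches the statement that rte_eal_iova_mode() should only ever return PA or
VA.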

Fixes: 2a5d547a4a9b ("eal/windows: implement basic memory management")
Cc: sta...@dpdk.org

Reported-by: Tal Shnaiderman 
Signed-off-by: Dmitry Kozlyuk 
---


Tested-by: Pallavi Kadam 

Acked-by: Pallavi Kadam 





Re: [dpdk-dev] [PATCH v2] eal: add telemetry callbacks for memory info

2021-10-25 Thread Thomas Monjalon
> > From a Telemetry usage point of view,
> > 
> > Acked-by: Ciara Power 
> 
> Agree, this patch is much more in keeping with the existing way of working
> than the v1.
> 
> Acked-by: Bruce Richardson 

Applied, thanks.




