[dpdk-dev] [PATCH v2] doc: add reserve fields to eventdev public structures

2020-08-03 Thread pbhagavatula
From: Pavan Nikhilesh 

Add 64 bytes of padding at the end of the event device public structures to
allow future extensions.
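
A rough sketch of what such a reserved tail field looks like on a public
configuration structure (the struct and field names below are made up for
illustration; they are not the actual eventdev definitions):

#include <stdint.h>

struct example_adapter_conf {
	uint8_t event_port_id;	/* existing configuration fields ... */
	uint32_t max_nb;
	uint8_t reserved[64];	/* padding reserved for future extensions;
				 * typically kept zeroed by applications so new
				 * fields can be added without an ABI break */
};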

Signed-off-by: Pavan Nikhilesh 
Acked-by: Jerin Jacob 
---
 v2 Changes:
 - Modify commit title.
 - Add patch reference to doc.

 doc/guides/rel_notes/deprecation.rst | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/doc/guides/rel_notes/deprecation.rst 
b/doc/guides/rel_notes/deprecation.rst
index ea4cfa7a4..ec5db68e9 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -151,3 +151,14 @@ Deprecation Notices
   Python 2 support will be completely removed in 20.11.
   In 20.08, explicit deprecation warnings will be displayed when running
   scripts with Python 2.
+
+* eventdev: A 64 byte padding is added at the end of the following structures
+  in event device library to support future extensions:
+  ``rte_event_crypto_adapter_conf``, ``rte_event_eth_rx_adapter_conf``,
+  ``rte_event_eth_rx_adapter_queue_conf``, ``rte_event_eth_tx_adapter_conf``,
+  ``rte_event_timer_adapter_conf``, ``rte_event_timer_adapter_info``,
+  ``rte_event_dev_info``, ``rte_event_dev_config``, ``rte_event_queue_conf``,
+  ``rte_event_port_conf``, ``rte_event_timer_adapter``,
+  ``rte_event_timer_adapter_data``.
+  Reference:
+  http://patches.dpdk.org/project/dpdk/list/?series=10728&archive=both&state=*
--
2.17.1



Re: [dpdk-dev] [PATCH V1 2/3] mbuf: change free_cb interface to adapt to GSO case

2020-08-03 Thread Olivier Matz
On Mon, Aug 03, 2020 at 09:26:40AM +0800, yang_y_yi wrote:
> At 2020-08-03 04:29:07, "Olivier Matz"  wrote:
> >Hi,
> >
> >On Sun, Aug 02, 2020 at 07:12:36AM +0800, yang_y_yi wrote:
> >> 
> >> 
> >> At 2020-07-31 23:15:43, "Olivier Matz"  wrote:
> >> >Hi,
> >> >
> >> >On Thu, Jul 30, 2020 at 08:08:59PM +0800, yang_y...@163.com wrote:
> >> >> From: Yi Yang 
> >> >> 
> >> >> In GSO case, segmented mbufs are attached to original
> >> >> mbuf which can't be freed when it is external. The issue
> >> >> is free_cb doesn't know original mbuf and doesn't free
> >> >> it when refcnt of shinfo is 0.
> >> >> 
> >> >> Original mbuf can be freed by rte_pktmbuf_free segmented
> >> >> mbufs or by rte_pktmbuf_free original mbuf. Two kind of
> >> >> cases should have different behaviors. free_cb won't
> >> >> explicitly call rte_pktmbuf_free to free original mbuf
> >> >> if it is freed by rte_pktmbuf_free original mbuf, but it
> >> >> has to call rte_pktmbuf_free to free original mbuf if it
> >> >> is freed by rte_pktmbuf_free segmented mbufs.
> >> >> 
> >> >> In order to fix this issue, free_cb interface has to been
> >> >> changed, __rte_pktmbuf_free_extbuf must deliver called
> >> >> mbuf pointer to free_cb, argument opaque can be defined
> >> >> as a custom struct by user, it can includes original mbuf
> >> >> pointer, user-defined free_cb can compare caller mbuf with
> >> >> mbuf in opaque struct, free_cb should free original mbuf
> >> >> if they are not same, this corresponds to rte_pktmbuf_free
> >> >> segmented mbufs case, otherwise, free_cb won't free original
> >> >> mbuf because the caller explicitly called rte_pktmbuf_free
> >> >> to free it.
> >> >> 
> >> >> Here is pseudo code to show the two kinds of cases.
> >> >> 
> >> >> case 1. rte_pktmbuf_free segmented mbufs
> >> >> 
> >> >> nb_tx = rte_gso_segment(original_mbuf, /* original mbuf */
> >> >>   &gso_ctx,
> >> >>   /* segmented mbuf */
> >> >>   (struct rte_mbuf **)&gso_mbufs,
> >> >>   MAX_GSO_MBUFS);
> >> >
> >> >I'm sorry but it is not very clear to me what operations are done by
> >> >rte_gso_segment().
> >> >
> >> >In the current code, I only see calls to rte_pktmbuf_attach(),
> >> >which do not deal with external buffers. Am I missing something?
> >> >
> >> >Are you able to show the issue only with mbuf functions? It would
> >> >be helpful to understand what does not work.
> >> >
> >> >
> >> >Thanks,
> >> >Olivier
> >> >
> >> Olivier, thank you for the comment; let me show why it doesn't work for my
> >> use case. In OVS DPDK, a VM uses vhostuserclient to send large packets of
> >> about 64K because TSO & UFO are enabled. These large packets use rte_mbufs
> >> allocated by the DPDK virtio_net function virtio_dev_pktmbuf_alloc() (in
> >> lib/librte_vhost/virtio_net.c). Please refer to [PATCH V1 3/3]; I changed
> >> free_cb as below. These packets use the same allocation function and the
> >> same free_cb no matter whether they are TCP or UDP packets. In the VXLAN
> >> TSO case, most NICs can't support inner UDP fragment offload, so OVS DPDK
> >> has to do it in software. For the UDP case, the original rte_mbuf can only
> >> be freed via the segmented rte_mbufs that rte_gso_segment outputs, i.e. the
> >> original rte_mbuf can only be freed by free_cb. As you can see, it
> >> explicitly calls rte_pktmbuf_free(arg->mbuf), and the condition statement
> >> "if (caller_m != arg->mbuf)" is true for this case, so there is no problem.
> >> But for the TCP case, the original mbuf is delivered to rte_eth_tx_burst(),
> >> not the segmented rte_mbufs output by rte_gso_segment; the PMD driver will
> >> call rte_pktmbuf_free(original_rte_mbuf) but not
> >> rte_pktmbuf_free(segmented_rte_mbufs), the same free_cb will be called, and
> >> that means original_rte_mbuf will be freed twice. You know what will
> >> happen; this is just the issue I'm fixing. I bring in the caller_m
> >> argument; it helps work around this because caller_m is arg->mbuf and the
> >> condition statement "if (caller_m != arg->mbuf)" is false. You can't fix it
> >> without the change this patch series did.
> >
> >I'm still not sure I get your issue. Please, if you have a simple test
> >case using only mbuf functions (without virtio, gso, ...), it would be
> >very helpful because we will be sure that we are talking about the same
> >thing. In case there is an issue, it can easily become a unit test.
> 
> Olivier, I think you're missing the point: the free operation can't be
> controlled by the application itself, it is done by the PMD driver and
> triggered by rte_eth_tx_burst. I have shown pseudo code; rte_gso_segment
> just segments a large mbuf into multiple mbufs, it won't send them, the
> application will finally call rte_eth_tx_burst to send them.
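
For reference, a minimal sketch of the free_cb pattern proposed in this thread.
The extra caller_m parameter does not exist in the current rte_mbuf API; it is
what this series proposes to add, and gso_free_arg is a hypothetical
user-defined opaque struct:

#include <rte_mbuf.h>
#include <rte_common.h>

struct gso_free_arg {
	struct rte_mbuf *mbuf;	/* the original (pre-GSO) mbuf */
};

/* proposed callback shape: the mbuf that triggered the free is passed in */
static void
gso_extbuf_free_cb(void *addr, void *opaque, struct rte_mbuf *caller_m)
{
	struct gso_free_arg *arg = opaque;

	RTE_SET_USED(addr);
	if (caller_m != arg->mbuf)
		/* freed via a segmented mbuf: nobody else frees the original */
		rte_pktmbuf_free(arg->mbuf);
	/* else: the caller is the original mbuf itself, already being freed
	 * by rte_pktmbuf_free(), so freeing it again would be a double free */
}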
>
> >
> >That said, I looked at vhost mbuf allocation and gso segmentation, and
> >I found some strange things:
> >
> >1/ In virtio_dev_extbuf_alloc(), and I there 

[dpdk-dev] [PATCH] vdpa/mlx5: fix virtq unset

2020-08-03 Thread Matan Azrad
When a virtq is destroyed, the SW should be able to continue the virtq
processing from where the HW stopped.

The current destroy behavior in the driver saves the virtq state (used
and available indexes) only when LM is requested.
So, when LM is not requested the queue state is not saved and the SW
indexes stay invalid.

Save the virtq state in the virtq destroy process.

Fixes: bff735011078 ("vdpa/mlx5: prepare virtio queues")
Cc: sta...@dpdk.org

Signed-off-by: Matan Azrad 
Acked-by: Xueming Li 
---
 drivers/vdpa/mlx5/mlx5_vdpa.h   |  1 +
 drivers/vdpa/mlx5/mlx5_vdpa_virtq.c | 17 -
 2 files changed, 13 insertions(+), 5 deletions(-)

diff --git a/drivers/vdpa/mlx5/mlx5_vdpa.h b/drivers/vdpa/mlx5/mlx5_vdpa.h
index 57044d9..5963e35 100644
--- a/drivers/vdpa/mlx5/mlx5_vdpa.h
+++ b/drivers/vdpa/mlx5/mlx5_vdpa.h
@@ -86,6 +86,7 @@ struct mlx5_vdpa_virtq {
uint16_t index;
uint16_t vq_size;
uint8_t notifier_state;
+   bool stopped;
struct mlx5_vdpa_priv *priv;
struct mlx5_devx_obj *virtq;
struct mlx5_devx_obj *counters;
diff --git a/drivers/vdpa/mlx5/mlx5_vdpa_virtq.c 
b/drivers/vdpa/mlx5/mlx5_vdpa_virtq.c
index 19554f6..17e71cf 100644
--- a/drivers/vdpa/mlx5/mlx5_vdpa_virtq.c
+++ b/drivers/vdpa/mlx5/mlx5_vdpa_virtq.c
@@ -72,8 +72,13 @@
}
virtq->intr_handle.fd = -1;
}
-   if (virtq->virtq)
+   if (virtq->virtq) {
+   ret = mlx5_vdpa_virtq_stop(virtq->priv, virtq->index);
+   if (ret)
+   DRV_LOG(WARNING, "Failed to stop virtq %d.",
+   virtq->index);
claim_zero(mlx5_devx_cmd_destroy(virtq->virtq));
+   }
virtq->virtq = NULL;
for (i = 0; i < RTE_DIM(virtq->umems); ++i) {
if (virtq->umems[i].obj)
@@ -135,10 +140,14 @@
 {
struct mlx5_devx_virtq_attr attr = {0};
struct mlx5_vdpa_virtq *virtq = &priv->virtqs[index];
-   int ret = mlx5_vdpa_virtq_modify(virtq, 0);
+   int ret;
 
+   if (virtq->stopped)
+   return 0;
+   ret = mlx5_vdpa_virtq_modify(virtq, 0);
if (ret)
return -1;
+   virtq->stopped = true;
if (mlx5_devx_cmd_query_virtq(virtq->virtq, &attr)) {
DRV_LOG(ERR, "Failed to query virtq %d.", index);
return -1;
@@ -323,6 +332,7 @@
virtq->intr_handle.fd, index);
}
}
+   virtq->stopped = false;
DRV_LOG(DEBUG, "vid %u virtq %u was created successfully.", priv->vid,
index);
return 0;
@@ -489,9 +499,6 @@
DRV_LOG(WARNING, "Failed to disable steering "
"for virtq %d.", index);
}
-   ret = mlx5_vdpa_virtq_stop(priv, index);
-   if (ret)
-   DRV_LOG(WARNING, "Failed to stop virtq %d.", index);
mlx5_vdpa_virtq_unset(virtq);
}
if (enable) {
-- 
1.8.3.1



Re: [dpdk-dev] [20.11, PATCH v2] baseband/fpga_5gnr_fec: add companion PF config App

2020-08-03 Thread Maxime Coquelin
Hi Nicolas,

On 7/31/20 5:17 PM, Chautru, Nicolas wrote:
> Hi Maxime, 
> 
>>
>> Hi Nicolas,
>>
>> On 7/16/20 10:20 PM, Nicolas Chautru wrote:
>>> Adding companion application to configure HW Device from the PF.
>>> Then the device can be accessed through BBDEV from VF (or PF).
>>>
>>> Signed-off-by: Nicolas Chautru 
>>> ---
>>>  doc/guides/bbdevs/fpga_5gnr_fec.rst|  80 +++--
>>>  .../baseband/fpga_5gnr_fec/pf_config_app/Makefile  |  36 +++
>>>  .../fpga_5gnr_fec/pf_config_app/config_app.c   | 333
>> +++
>>>  .../pf_config_app/fpga_5gnr_cfg_app.c  | 351
>> +
>>>  .../pf_config_app/fpga_5gnr_cfg_app.h  | 102 ++
>>>  .../pf_config_app/fpga_5gnr_cfg_parser.c   | 187 +++
>>>  .../pf_config_app/fpga_5gnr_config.cfg |  18 ++
>>>  7 files changed, 1087 insertions(+), 20 deletions(-)  create mode
>>> 100644 drivers/baseband/fpga_5gnr_fec/pf_config_app/Makefile
>>>  create mode 100644
>>> drivers/baseband/fpga_5gnr_fec/pf_config_app/config_app.c
>>>  create mode 100644
>>> drivers/baseband/fpga_5gnr_fec/pf_config_app/fpga_5gnr_cfg_app.c
>>>  create mode 100644
>>> drivers/baseband/fpga_5gnr_fec/pf_config_app/fpga_5gnr_cfg_app.h
>>>  create mode 100644
>>> drivers/baseband/fpga_5gnr_fec/pf_config_app/fpga_5gnr_cfg_parser.c
>>>  create mode 100644
>>> drivers/baseband/fpga_5gnr_fec/pf_config_app/fpga_5gnr_config.cfg
>>
>> I think having the pf_config_app in the driver directory is not a good idea,
>> this is not the place for applications.
>>
>> Also, it is not integrated in the DPDK build system, so it cannot benefit 
>> from
>> the CI. Having an external dependency that is not packaged in distributions
>> will not help to have it integrated in the build system.
>>
> 
> Thanks for sharing.
> Note that all these points were raised openly and explicitly earlier, as you
> know, i.e. as part of both pros and cons.
> Still happy to get feedback from others, notably Thomas. It appears you had
> side conversations with him on this very topic.
> 
>> I see some alternatives:
>> 1. Move it in another directory in the main DPDK repo, but it is not a DPDK
>> example, not a dev tool and not a build tool, so it would need a new
>> directory.
>> 2. Create a BBDEV tools repository on dpdk.org (It would require techboard
>> approval).
>> 3. Host it in a dedicated repository on Intel's github 4. Move it into some
>> Intel FPGA tools repository
> 
> There are several other options which were indeed considered in case this
> option was not viable.
> Still, DPDK was considered the best option so far to keep everything in one
> recognized place for BBDEV devices, but happy to get further input from
> others.
> 
>> I think option 3 would be the best to get it packaged into distributions as 
>> it
>> has no build dependency with any DPDK library.
>> You could maybe add inih library as a git sub-repository within this repo.
>> Other advantage is you wouldn't depend on DPDK release cycles to get fixes
>> merged.
>>
>> Regarding the tool itself, I understand from the commit message that the
>> tool has a dependency on the BBDEV PMD version, but the tool runs on the
>> host while the PMD driver is used in the guest/container. So having it in the
>> driver directory will not really help, as host DPDK (if any) and guest DPDK
>> may come from different parties.
> 
> Yes, this was captured earlier; it is purely stored there as a companion
> application for a given version of the PMD (i.e. different subdirectories for
> each PMD directory).
> They do not run in the same container for deployment and indeed are not built
> at the same time; their interface is really the HW, and one needs to be run
> before the other one can be functional.
> 
>> One question I have is whether this is the tool itself that has a dependency 
>> on
>> the PMD, or just the config file?
> 
> Each PMD directory would have its own version of the companion PF config 
> application.
> Ie. the patch above is only for baseband/fpga_5gnr_fec ie. Intel Vista Creek 
> with 5G LDPC user image.

OK. Does it mean the same application and configuration will work for
baseband/fpga_5gnr_fec PMD versions v20.11, v21.02, v21.05, etc, ...?

If not, is there a way for the PMD driver to detect whether a wrong
configuration was applied? Something like checking the FW version of a
NIC is supported by the PMD driver.

> There will be different companion applications upstreamed for each of the
> other PMD directories (current and future), as they rely on different HW
> devices with independent MMIO access.
> Said otherwise, both the config file (features exposed) and the implementation
> (registers required for these features) are defined per HW device (+ user
> image for FPGA), hence per PMD version.


Would it make sense to have a single application, with the register map and
the values to apply provided in a configuration file?
It would avoid code duplication between devices and so ease the maintenance.

Re: [dpdk-dev] The mbuf API needs some cleaning up

2020-08-03 Thread Morten Brørup
> From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Olivier Matz
> Sent: Friday, July 31, 2020 5:25 PM
> 
> Hi Morten,
> 
> Thanks for the feedback.
> 
> On Mon, Jul 13, 2020 at 11:57:38AM +0200, Morten Brørup wrote:
> 
> > The MBUF library exposes some macros and constants without the RTE_
> prefix. I
> > propose cleaning up these, so better names get into the coming LTS
> release.
> 
> Yes, Thomas talked about it some time ago and he even drafted a patch
> to
> fix it. We can target 20.11 for the changes, but I think we'll have to
> keep a compat API until 21.11.
> 

Great, then I will back off. No need for multiple patches fixing the same 
things. :-)

And I agree with all your feedback... although I do consider the mbuf port_id 
so much at the core of DPDK that I suggested RTE_PORT_INVALID over 
RTE_MBUF_PORT_INVALID, but I don't feel strongly about it. Whatever you and 
Thomas prefer is probably fine.
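
For illustration, the rename-plus-alias pattern under discussion would look
roughly like this (the final names are still being decided in this thread):

#include <stdint.h>

/* new, RTE_-prefixed name, with documentation */
#define RTE_MBUF_PORT_INVALID UINT16_MAX /**< Port is invalid/unset in the mbuf. */

/* old name kept as a deprecated alias for a transition period */
#define MBUF_INVALID_PORT RTE_MBUF_PORT_INVALID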

> > The worst is:
> > #define MBUF_INVALID_PORT UINT16_MAX
> >
> > I say it's the worst because when we were looking for the official
> "invalid"
> > port value for our application, we didn't find this one. (Probably
> because its
> > documentation is wrong.)
> >
> > MBUF_INVALID_PORT is defined in rte_mbuf_core.h without any
> description, and
> > in rte_mbuf.h, where it is injected between the rte_pktmbuf_reset()
> function
> > and its description, so the API documentation shows the function's
> description
> > for the constant, and no description for the function.
> 
> The one in rte_mbuf_core.h should be kept, with a documentation.
> 
> > I propose keeping it at a sensible location in rte_mbuf_core.h only,
> adding a description, and renaming it to:
> > #define RTE_PORT_INVALID UINT16_MAX
> 
> I suggest RTE_MBUF_PORT_INVALID
> 
> > For backwards compatibility, we could add:
> > /* this old name is deprecated */
> > #define MBUF_INVALID_PORT RTE_PORT_INVALID
> >
> > I also wonder why there are no compiler warnings about the double
> definition?
> 
> If the value is the same, the compiler won't complain.
> 
> > There are also the data buffer location constants:
> > #define EXT_ATTACHED_MBUF(1ULL << 61)
> > and
> > #define IND_ATTACHED_MBUF(1ULL << 62)
> >
> >
> > There are already macros (with good names) for reading these, so
> > simply adding the RTE_ prefix to these two constants suffices.
> 
> Some applications use it, we also need a compat here.
> 
> > And all the packet offload flags, such as:
> > #define PKT_RX_VLAN  (1ULL << 0)
> >
> >
> > They are supposed to be used by applications, so I guess we should
> > keep them unchanged for ABI stability reasons.
> 
> I propose RTE_MBUF_F_ for the mbuf flags.
> 
> > And the local macro:
> > #define MBUF_RAW_ALLOC_CHECK(m) do { \
> >
> > This might as well be an internal inline function:
> > /* internal */
> > static inline void
> > __rte_mbuf_raw_alloc_check(const struct rte_mbuf *m)
> >
> 
> agree, I don't think a macro is mandatory here
> 
> 
> Thanks,
> Olivier
> 
> 
> > Or we could keep it a macro and move it next to
> > __rte_mbuf_sanity_check(), keeping it clear that it is only relevant
> when
> > RTE_LIBRTE_MBUF_DEBUG is set. But rename it to lower case, similar to
> the
> > __rte_mbuf_sanity_check() macro.
> >
> >
> > Med venlig hilsen / kind regards
> > - Morten Brørup
> >



[dpdk-dev] [PATCH] net/mlx5: fix number of retries for UAR allocation

2020-08-03 Thread Dekel Peled
A previous patch added a definition of the number of retries for UAR allocation.
That value is adequate for x86 systems with 4K pages.
On Power9 systems with 64K pages the required value is 32.
This patch updates the defined value from 2 to 32.
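
A sketch of the retry pattern this constant serves, with placeholder helpers
(uar_alloc() and uar_base() are not the real glue calls, and cleanup of
unusable UARs is omitted):

#include <stddef.h>

#define MLX5_ALLOC_UAR_RETRY 32	/* large enough for 64K-page systems */

/* Keep allocating UARs until one with a usable (non-NULL) base address is
 * returned, or the retry budget is exhausted. */
static void *
alloc_usable_uar(void *(*uar_alloc)(void), void *(*uar_base)(void *))
{
	int i;

	for (i = 0; i < MLX5_ALLOC_UAR_RETRY; i++) {
		void *uar = uar_alloc();

		if (uar != NULL && uar_base(uar) != NULL)
			return uar;	/* first UAR with a valid base address */
	}
	return NULL;	/* give up after MLX5_ALLOC_UAR_RETRY attempts */
}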

Fixes: a0bfe9d56f74 ("net/mlx5: fix UAR memory mapping type")
Cc: sta...@dpdk.org

Signed-off-by: Dekel Peled 
Acked-by: Viacheslav Ovsiienko 
---
 drivers/net/mlx5/mlx5_defs.h | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_defs.h b/drivers/net/mlx5/mlx5_defs.h
index e5f7acc..c26d5a2 100644
--- a/drivers/net/mlx5/mlx5_defs.h
+++ b/drivers/net/mlx5/mlx5_defs.h
@@ -202,9 +202,7 @@
  * UAR base address if UAR was not the first object in the UAR page.
  * It caused the PMD failure and we should try to get another UAR
  * till we get the first one with non-NULL base address returned.
- * Should follow the rdma_core internal (not exported) definition
- * MLX5_NUM_NON_FP_BFREGS_PER_UAR.
  */
-#define MLX5_ALLOC_UAR_RETRY 2
+#define MLX5_ALLOC_UAR_RETRY 32
 
 #endif /* RTE_PMD_MLX5_DEFS_H_ */
-- 
1.8.3.1



Re: [dpdk-dev] [PATCH V1 2/3] mbuf: change free_cb interface to adapt to GSO case

2020-08-03 Thread yang_y_yi
At 2020-08-03 16:11:39, "Olivier Matz"  wrote:
>On Mon, Aug 03, 2020 at 09:26:40AM +0800, yang_y_yi wrote:
>> At 2020-08-03 04:29:07, "Olivier Matz"  wrote:
>> >Hi,
>> >
>> >On Sun, Aug 02, 2020 at 07:12:36AM +0800, yang_y_yi wrote:
>> >> 
>> >> 
>> >> At 2020-07-31 23:15:43, "Olivier Matz"  wrote:
>> >> >Hi,
>> >> >
>> >> >On Thu, Jul 30, 2020 at 08:08:59PM +0800, yang_y...@163.com wrote:
>> >> >> From: Yi Yang 
>> >> >> 
>> >> >> In GSO case, segmented mbufs are attached to original
>> >> >> mbuf which can't be freed when it is external. The issue
>> >> >> is free_cb doesn't know original mbuf and doesn't free
>> >> >> it when refcnt of shinfo is 0.
>> >> >> 
>> >> >> Original mbuf can be freed by rte_pktmbuf_free segmented
>> >> >> mbufs or by rte_pktmbuf_free original mbuf. Two kind of
>> >> >> cases should have different behaviors. free_cb won't
>> >> >> explicitly call rte_pktmbuf_free to free original mbuf
>> >> >> if it is freed by rte_pktmbuf_free original mbuf, but it
>> >> >> has to call rte_pktmbuf_free to free original mbuf if it
>> >> >> is freed by rte_pktmbuf_free segmented mbufs.
>> >> >> 
>> >> >> In order to fix this issue, free_cb interface has to been
>> >> >> changed, __rte_pktmbuf_free_extbuf must deliver called
>> >> >> mbuf pointer to free_cb, argument opaque can be defined
>> >> >> as a custom struct by user, it can includes original mbuf
>> >> >> pointer, user-defined free_cb can compare caller mbuf with
>> >> >> mbuf in opaque struct, free_cb should free original mbuf
>> >> >> if they are not same, this corresponds to rte_pktmbuf_free
>> >> >> segmented mbufs case, otherwise, free_cb won't free original
>> >> >> mbuf because the caller explicitly called rte_pktmbuf_free
>> >> >> to free it.
>> >> >> 
> >> >> Here is pseudo code to show the two kinds of cases.
>> >> >> 
>> >> >> case 1. rte_pktmbuf_free segmented mbufs
>> >> >> 
>> >> >> nb_tx = rte_gso_segment(original_mbuf, /* original mbuf */
>> >> >>   &gso_ctx,
>> >> >>   /* segmented mbuf */
>> >> >>   (struct rte_mbuf **)&gso_mbufs,
>> >> >>   MAX_GSO_MBUFS);
>> >> >
>> >> >I'm sorry but it is not very clear to me what operations are done by
>> >> >rte_gso_segment().
>> >> >
>> >> >In the current code, I only see calls to rte_pktmbuf_attach(),
>> >> >which do not deal with external buffers. Am I missing something?
>> >> >
>> >> >Are you able to show the issue only with mbuf functions? It would
>> >> >be helpful to understand what does not work.
>> >> >
>> >> >
>> >> >Thanks,
>> >> >Olivier
>> >> >
>> >> Oliver, thank you for comment, let me show you why it doesn't work for my 
>> >> use case.  In OVS DPDK, VM uses vhostuserclient to send large packets 
>> >> whose size is about 64K because we enabled TSO & UFO, these large packets 
>> >> use rte_mbufs allocated by DPDK virtio_net function 
>> >> virtio_dev_pktmbuf_alloc() (in file lib/librte_vhost/virtio_net.c. Please 
>> >> refer to [PATCH V1 3/3], I changed free_cb as below, these packets use 
>> >> the same allocate function and the same free_cb no matter they are TCP 
>> >> packet or UDP packets, in case of VXLAN TSO, most NICs can't support 
>> >> inner UDP fragment offload, so OVS DPDK has to do it by software, for UDP 
>> >> case, the original rte_mbuf only can be freed by segmented rte_mbufs 
>> >> which are output packets of rte_gso_segment, i.e. the original rte_mbuf 
>> >> only can freed by free_cb, you can see, it explicitly called 
>> >> rte_pktmbuf_free(arg->mbuf), the condition statement "if (caller_m != 
>> >> arg->mbuf)" is true for this case, this has no problem, but for TCP case, 
>> >> the original mbuf is delivered to rte_eth_tx_burst() but not segmented 
>> >> rte_mbufs output by rte_gso_segment, PMD driver will call 
>> >> rte_pktmbuf_free(original_rte_mbuf) but not 
>> >> rte_pktmbuf_free(segmented_rte_mbufs), the same free_cb will be called, 
>> >> that means original_rte_mbuf will be freed twice, you know what will 
>> >> happen, this is just the issue I'm fixing. I bring in caller_m argument, 
>> >> it can help work around this because caller_m is arg->mbuf and the 
>> >> condition statement "if (caller_m != arg->mbuf)" is false, you can't fix 
>> >> it without the change this patch series did.
>> >
>> >I'm sill not sure to get your issue. Please, if you have a simple test
>> >case using only mbufs functions (without virtio, gso, ...), it would be
>> >very helpful because we will be sure that we are talking about the same
>> >thing. In case there is an issue, it can easily become a unit test.
>> 
>> Oliver, I think you don't get the point, free operation can't be controlled 
>> by the application itself, 
>> it is done by PMD driver and triggered by rte_eth_tx_burst, I have shown 
>> pseudo code,
>> rte_gso_segment just segments a large mbuf to multiple mbufs, it won't send 
>> them, the application
>> will call rte_eth_tx_burst to send them finally.
>>
>> >
>> >That said

Re: [dpdk-dev] [v2] vdpa/mlx5: fix queue update synchronization

2020-08-03 Thread Maxime Coquelin



On 8/2/20 11:21 AM, Xueming Li wrote:
> The driver CQ event management is done by non vhost library thread,
> either the dpdk host thread or the internal vDPA driver thread.
> 
> When a queue is updated the CQ may be destroyed and created by the vhost
> library thread via the queue state operation.
> 
> When the queue update feature was added, it didn't synchronize the CQ
> management with the queue update, which may cause invalid memory access.
> 
> Add the aforementioned synchronization by a new per device configuration
> mutex.
> 
> Fixes: c47d6e83334e ("vdpa/mlx5: support queue update")
> 
> Signed-off-by: Xueming Li 
> Acked-by: Matan Azrad 
> ---
>  drivers/vdpa/mlx5/mlx5_vdpa.c   | 8 +++-
>  drivers/vdpa/mlx5/mlx5_vdpa.h   | 1 +
>  drivers/vdpa/mlx5/mlx5_vdpa_event.c | 8 
>  3 files changed, 16 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/vdpa/mlx5/mlx5_vdpa.c b/drivers/vdpa/mlx5/mlx5_vdpa.c
> index c0b87bcc01..a8f3e4b1de 100644
> --- a/drivers/vdpa/mlx5/mlx5_vdpa.c
> +++ b/drivers/vdpa/mlx5/mlx5_vdpa.c
> @@ -133,6 +133,7 @@ mlx5_vdpa_set_vring_state(int vid, int vring, int state)
>   struct rte_vdpa_device *vdev = rte_vhost_get_vdpa_device(vid);
>   struct mlx5_vdpa_priv *priv =
>   mlx5_vdpa_find_priv_resource_by_vdev(vdev);
> + int ret;
>  
>   if (priv == NULL) {
>   DRV_LOG(ERR, "Invalid vDPA device: %s.", vdev->device->name);
> @@ -142,7 +143,10 @@ mlx5_vdpa_set_vring_state(int vid, int vring, int state)
>   DRV_LOG(ERR, "Too big vring id: %d.", vring);
>   return -E2BIG;
>   }
> - return mlx5_vdpa_virtq_enable(priv, vring, state);
> + pthread_mutex_lock(&priv->vq_config_lock);
> + ret = mlx5_vdpa_virtq_enable(priv, vring, state);
> + pthread_mutex_unlock(&priv->vq_config_lock);
> + return ret;
>  }
>  
>  static int
> @@ -742,6 +746,7 @@ mlx5_vdpa_pci_probe(struct rte_pci_driver *pci_drv 
> __rte_unused,
>   }
>   mlx5_vdpa_config_get(pci_dev->device.devargs, priv);
>   SLIST_INIT(&priv->mr_list);
> + pthread_mutex_init(&priv->vq_config_lock, NULL);
>   pthread_mutex_lock(&priv_list_lock);
>   TAILQ_INSERT_TAIL(&priv_list, priv, next);
>   pthread_mutex_unlock(&priv_list_lock);
> @@ -793,6 +798,7 @@ mlx5_vdpa_pci_remove(struct rte_pci_device *pci_dev)
>   priv->var = NULL;
>   }
>   mlx5_glue->close_device(priv->ctx);
> + pthread_mutex_destroy(&priv->vq_config_lock);
>   rte_free(priv);
>   }
>   return 0;
> diff --git a/drivers/vdpa/mlx5/mlx5_vdpa.h b/drivers/vdpa/mlx5/mlx5_vdpa.h
> index 57044d9d33..462805a352 100644
> --- a/drivers/vdpa/mlx5/mlx5_vdpa.h
> +++ b/drivers/vdpa/mlx5/mlx5_vdpa.h
> @@ -120,6 +120,7 @@ enum {
>  struct mlx5_vdpa_priv {
>   TAILQ_ENTRY(mlx5_vdpa_priv) next;
>   uint8_t configured;
> + pthread_mutex_t vq_config_lock;
>   uint64_t last_traffic_tic;
>   pthread_t timer_tid;
>   pthread_mutex_t timer_lock;
> diff --git a/drivers/vdpa/mlx5/mlx5_vdpa_event.c 
> b/drivers/vdpa/mlx5/mlx5_vdpa_event.c
> index 7dc1ac0fa9..4a8b7b0bd9 100644
> --- a/drivers/vdpa/mlx5/mlx5_vdpa_event.c
> +++ b/drivers/vdpa/mlx5/mlx5_vdpa_event.c
> @@ -274,6 +274,7 @@ mlx5_vdpa_poll_handle(void *arg)
>priv->event_us;
>   while (1) {
>   max = 0;
> + pthread_mutex_lock(&priv->vq_config_lock);
>   for (i = 0; i < priv->nr_virtqs; i++) {
>   cq = &priv->virtqs[i].eqp.cq;
>   if (cq->cq && !cq->armed) {
> @@ -298,6 +299,7 @@ mlx5_vdpa_poll_handle(void *arg)
>   priv->vdev->device->name);
>   mlx5_vdpa_arm_all_cqs(priv);
>   pthread_mutex_lock(&priv->timer_lock);
> + pthread_mutex_unlock(&priv->vq_config_lock);

Is it mandatory to hold timer_lock before releasing vq_config_lock?
If not, swapping would be maybe safer.
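
As a sketch, the suggested ordering in the loop quoted above would be to drop
the config lock before taking the timer lock, so the two mutexes are not held
at the same time:

	/* suggested order: release the config lock first */
	pthread_mutex_unlock(&priv->vq_config_lock);
	pthread_mutex_lock(&priv->timer_lock);
	priv->timer_on = 0;
	while (!priv->timer_on)
		pthread_cond_wait(&priv->timer_cond, &priv->timer_lock);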

>   priv->timer_on = 0;
>   while (!priv->timer_on)
>   pthread_cond_wait(&priv->timer_cond,
> @@ -312,6 +314,7 @@ mlx5_vdpa_poll_handle(void *arg)
>   } else {
>   priv->last_traffic_tic = current_tic;
>   }
> + pthread_mutex_unlock(&priv->vq_config_lock);
>   mlx5_vdpa_timer_sleep(priv, max);
>   }
>   return NULL;
> @@ -327,6 +330,7 @@ mlx5_vdpa_interrupt_handler(void *cb_arg)
>   uint8_t buf[sizeof(struct mlx5dv_devx_async_event_hdr) + 128];
>   } out;
>  
> + pthread_mutex_lock(&priv->vq_config_lock);
>   while (mlx5_glue->devx_get_event(priv->eventc, &out.event_resp,
>sizeof(out.buf)) >=
>  (ssize_t)sizeof(out.event_r

[dpdk-dev] [PATCH v3 2/5] eal: updated export list for Windows

2020-08-03 Thread Fady Bader
Added eal functions used by ethdev lib to the export list under Windows.

Signed-off-by: Fady Bader 
---
 lib/librte_eal/rte_eal_exports.def | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/lib/librte_eal/rte_eal_exports.def 
b/lib/librte_eal/rte_eal_exports.def
index 69204a92c6..782e6fc721 100644
--- a/lib/librte_eal/rte_eal_exports.def
+++ b/lib/librte_eal/rte_eal_exports.def
@@ -28,6 +28,7 @@ EXPORTS
rte_eal_tailq_register
rte_eal_using_phys_addrs
rte_free
+   rte_get_tsc_hz
rte_hexdump
rte_malloc
rte_malloc_dump_stats
@@ -145,3 +146,13 @@ EXPORTS
rte_mem_map
rte_mem_page_size
rte_mem_unmap
+
+   rte_class_register
+   rte_devargs_parse
+   rte_class_find_by_name
+   rte_socket_id
+   rte_strerror
+   rte_intr_rx_ctl
+   rte_log_register
+   rte_log_set_level
+   rte_log_register_type_and_pick_level
\ No newline at end of file
-- 
2.16.1.windows.4



[dpdk-dev] [PATCH v3 4/5] telemetry: implement empty stubs for Windows

2020-08-03 Thread Fady Bader
The telemetry library didn't compile under Windows.
An empty stub implementation was added for Windows.
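
The pattern used throughout the patch is roughly the following (a simplified
sketch of a single stub, not the literal hunks below; the real patch keeps the
Unix code under #ifndef RTE_EXEC_ENV_WINDOWS):

#ifdef RTE_EXEC_ENV_WINDOWS
#include <rte_common.h>
#include "rte_telemetry.h"

/* On Windows the socket-based telemetry service is not available, so the
 * public entry points are reduced to no-ops that report success. */
int
rte_telemetry_register_cmd(const char *cmd, telemetry_cb fn, const char *help)
{
	RTE_SET_USED(cmd);
	RTE_SET_USED(fn);
	RTE_SET_USED(help);
	return 0;
}
#endif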

Signed-off-by: Fady Bader 
---
 lib/librte_telemetry/rte_telemetry.h|  4 +++
 lib/librte_telemetry/telemetry.c| 52 -
 lib/librte_telemetry/telemetry_legacy.c | 26 -
 3 files changed, 80 insertions(+), 2 deletions(-)

diff --git a/lib/librte_telemetry/rte_telemetry.h 
b/lib/librte_telemetry/rte_telemetry.h
index d13010b8fb..c3c0e44599 100644
--- a/lib/librte_telemetry/rte_telemetry.h
+++ b/lib/librte_telemetry/rte_telemetry.h
@@ -5,6 +5,10 @@
 #include 
 #include 
 
+#ifdef RTE_EXEC_ENV_WINDOWS
+#include 
+#endif
+
 #ifndef _RTE_TELEMETRY_H_
 #define _RTE_TELEMETRY_H_
 
diff --git a/lib/librte_telemetry/telemetry.c b/lib/librte_telemetry/telemetry.c
index 0252282735..51fc0de6ae 100644
--- a/lib/librte_telemetry/telemetry.c
+++ b/lib/librte_telemetry/telemetry.c
@@ -1,7 +1,7 @@
 /* SPDX-License-Identifier: BSD-3-Clause
  * Copyright(c) 2020 Intel Corporation
  */
-
+#ifndef RTE_EXEC_ENV_WINDOWS
 #include 
 #include 
 #include 
@@ -20,6 +20,20 @@
 #include "telemetry_data.h"
 #include "rte_telemetry_legacy.h"
 
+#else
+
+/* includes for Windows */
+#include 
+#include 
+
+#include "rte_telemetry.h"
+#include "telemetry_json.h"
+#include "telemetry_data.h"
+#include "rte_telemetry_legacy.h"
+
+#endif
+
+#ifndef RTE_EXEC_ENV_WINDOWS
 #define MAX_CMD_LEN 56
 #define MAX_HELP_LEN 64
 #define MAX_OUTPUT_LEN (1024 * 16)
@@ -42,17 +56,25 @@ struct socket {
 };
 static struct socket v2_socket; /* socket for v2 telemetry */
 static struct socket v1_socket; /* socket for v1 telemetry */
+#endif
+
 static char telemetry_log_error[1024]; /* Will contain error on init failure */
+
+#ifndef RTE_EXEC_ENV_WINDOWS
 /* list of command callbacks, with one command registered by default */
 static struct cmd_callback callbacks[TELEMETRY_MAX_CALLBACKS];
 static int num_callbacks; /* How many commands are registered */
 /* Used when accessing or modifying list of command callbacks */
 static rte_spinlock_t callback_sl = RTE_SPINLOCK_INITIALIZER;
 static uint16_t v2_clients;
+int
+#endif
 
 int
 rte_telemetry_register_cmd(const char *cmd, telemetry_cb fn, const char *help)
 {
+#ifndef RTE_EXEC_ENV_WINDOWS
+
int i = 0;
 
if (strlen(cmd) >= MAX_CMD_LEN || fn == NULL || cmd[0] != '/'
@@ -76,8 +98,19 @@ rte_telemetry_register_cmd(const char *cmd, telemetry_cb fn, 
const char *help)
rte_spinlock_unlock(&callback_sl);
 
return 0;
+
+#else
+
+   RTE_SET_USED(cmd);
+   RTE_SET_USED(fn);
+   RTE_SET_USED(help);
+
+   return 0;
+
+#endif
 }
 
+#ifndef RTE_EXEC_ENV_WINDOWS
 static int
 list_commands(const char *cmd __rte_unused, const char *params __rte_unused,
struct rte_tel_data *d)
@@ -411,11 +444,14 @@ telemetry_v2_init(const char *runtime_dir, rte_cpuset_t 
*cpuset)
 
return 0;
 }
+#endif
 
 int32_t
 rte_telemetry_init(const char *runtime_dir, rte_cpuset_t *cpuset,
const char **err_str)
 {
+#ifndef RTE_EXEC_ENV_WINDOWS
+
if (telemetry_v2_init(runtime_dir, cpuset) != 0) {
*err_str = telemetry_log_error;
return -1;
@@ -424,4 +460,18 @@ rte_telemetry_init(const char *runtime_dir, rte_cpuset_t 
*cpuset,
*err_str = telemetry_log_error;
}
return 0;
+
+#else
+
+   RTE_SET_USED(runtime_dir);
+   RTE_SET_USED(cpuset);
+   RTE_SET_USED(err_str);
+
+   snprintf(telemetry_log_error, sizeof(telemetry_log_error),
+   "Telemetry is not supported on Windows.");
+
+   return 0;
+
+#endif
 }
+
diff --git a/lib/librte_telemetry/telemetry_legacy.c 
b/lib/librte_telemetry/telemetry_legacy.c
index a341fe4ebd..674f9c40ef 100644
--- a/lib/librte_telemetry/telemetry_legacy.c
+++ b/lib/librte_telemetry/telemetry_legacy.c
@@ -1,7 +1,7 @@
 /* SPDX-License-Identifier: BSD-3-Clause
  * Copyright(c) 2020 Intel Corporation
  */
-
+#ifndef RTE_EXEC_ENV_WINDOWS
 #include 
 #include 
 #include 
@@ -15,6 +15,15 @@
 
 #include "rte_telemetry_legacy.h"
 
+#else
+
+#include 
+
+#include "rte_telemetry_legacy.h"
+
+#endif
+#ifndef RTE_EXEC_ENV_WINDOWS
+
 #define MAX_LEN 128
 #define BUF_SIZE 1024
 #define CLIENTS_UNREG_ACTION "\"action\":2"
@@ -48,12 +57,15 @@ struct json_command 
callbacks[TELEMETRY_LEGACY_MAX_CALLBACKS] = {
 };
 int num_legacy_callbacks = 1;
 static rte_spinlock_t callback_sl = RTE_SPINLOCK_INITIALIZER;
+#endif
 
 int
 rte_telemetry_legacy_register(const char *cmd,
enum rte_telemetry_legacy_data_req data_req,
telemetry_legacy_cb fn)
 {
+#ifndef RTE_EXEC_ENV_WINDOWS
+
if (fn == NULL)
return -EINVAL;
if (num_legacy_callbacks >= (int) RTE_DIM(callbacks))
@@ -71,8 +83,19 @@ rte_telemetry_legacy_register(const char *cmd,
rte_spinlock_unlock(&callback_sl);
 
return 0;
+
+#else
+
+   RTE_SET_USED(cmd);
+   RTE_SET_USED(data_req);
+   RTE_S

[dpdk-dev] [PATCH v3 3/5] ethdev: remove structs from export list

2020-08-03 Thread Fady Bader
Some ethdev structs were present in the ethdev export list.
These structs were removed from the export list.

Signed-off-by: Fady Bader 
---
 lib/librte_ethdev/rte_ethdev_version.map | 2 --
 1 file changed, 2 deletions(-)

diff --git a/lib/librte_ethdev/rte_ethdev_version.map 
b/lib/librte_ethdev/rte_ethdev_version.map
index 1212a17d32..21a2177756 100644
--- a/lib/librte_ethdev/rte_ethdev_version.map
+++ b/lib/librte_ethdev/rte_ethdev_version.map
@@ -89,7 +89,6 @@ DPDK_20.0 {
rte_eth_iterator_next;
rte_eth_led_off;
rte_eth_led_on;
-   rte_eth_link;
rte_eth_link_get;
rte_eth_link_get_nowait;
rte_eth_macaddr_get;
@@ -104,7 +103,6 @@ DPDK_20.0 {
rte_eth_rx_queue_setup;
rte_eth_set_queue_rate_limit;
rte_eth_speed_bitflag;
-   rte_eth_stats;
rte_eth_stats_get;
rte_eth_stats_reset;
rte_eth_timesync_adjust_time;
-- 
2.16.1.windows.4



[dpdk-dev] [PATCH v3 1/5] eal: added interrupts empty stubs

2020-08-03 Thread Fady Bader
The ethdev lib uses interrupts. Interrupts are not implemented for Windows.
To solve this, empty interrupt stubs were added under Windows.

Signed-off-by: Fady Bader 
---
 lib/librte_eal/windows/eal_interrupts.c | 17 +
 lib/librte_eal/windows/meson.build  |  1 +
 2 files changed, 18 insertions(+)
 create mode 100644 lib/librte_eal/windows/eal_interrupts.c

diff --git a/lib/librte_eal/windows/eal_interrupts.c 
b/lib/librte_eal/windows/eal_interrupts.c
new file mode 100644
index 00..1e3c6d20d2
--- /dev/null
+++ b/lib/librte_eal/windows/eal_interrupts.c
@@ -0,0 +1,17 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2020 Mellanox Technologies, Ltd
+ */
+#include 
+
+int
+rte_intr_rx_ctl(struct rte_intr_handle *intr_handle,
+   int epfd, int op, unsigned int vec, void *data)
+{
+   RTE_SET_USED(intr_handle);
+   RTE_SET_USED(epfd);
+   RTE_SET_USED(op);
+   RTE_SET_USED(vec);
+   RTE_SET_USED(data);
+
+   return -ENOTSUP;
+}
diff --git a/lib/librte_eal/windows/meson.build 
b/lib/librte_eal/windows/meson.build
index 08c888e018..c5a19648d6 100644
--- a/lib/librte_eal/windows/meson.build
+++ b/lib/librte_eal/windows/meson.build
@@ -17,6 +17,7 @@ sources += files(
'eal_timer.c',
'fnmatch.c',
'getopt.c',
+   'eal_interrupts.c',
 )
 
 dpdk_conf.set10('RTE_EAL_NUMA_AWARE_HUGEPAGES', true)
-- 
2.16.1.windows.4



[dpdk-dev] [PATCH v3 5/5] ethdev: compiling ethdev under Windows

2020-08-03 Thread Fady Bader
Compile the libraries needed for ethdev under Windows.

Signed-off-by: Fady Bader 
---
 lib/meson.build | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/lib/meson.build b/lib/meson.build
index 6bbaf242a9..c145240eb9 100644
--- a/lib/meson.build
+++ b/lib/meson.build
@@ -38,9 +38,10 @@ libraries = [
 if is_windows
libraries = [
'kvargs',
+   'telemetry',
'eal',
'ring',
-   'mempool', 'mbuf', 'pci', 'net',
+   'mempool', 'mbuf', 'net', 'meter', 'ethdev', 'pci', # core
] # only supported libraries for windows
 endif
 
-- 
2.16.1.windows.4



[dpdk-dev] [PATCH v3 0/5] compiling ethdev lib under windows

2020-08-03 Thread Fady Bader
Added the changes needed in order to get ethdev compiling under Windows.

Depends-on: series-10382 ("compile librte_net for windows")

v3: rebased on current master, added more exports to export list

v2: fixed logging issue in telemetry lib.

Fady Bader (5):
  eal: added interrupts empty stubs
  eal: updated export list for Windows
  ethdev: remove structs from export list
  telemetry: implement empty stubs for Windows
  ethdev: compiling ethdev under Windows

 lib/librte_eal/rte_eal_exports.def   | 11 +++
 lib/librte_eal/windows/eal_interrupts.c  | 17 +++
 lib/librte_eal/windows/meson.build   |  1 +
 lib/librte_ethdev/rte_ethdev_version.map |  2 --
 lib/librte_telemetry/rte_telemetry.h |  4 +++
 lib/librte_telemetry/telemetry.c | 52 +++-
 lib/librte_telemetry/telemetry_legacy.c  | 26 +++-
 lib/meson.build  |  3 +-
 8 files changed, 111 insertions(+), 5 deletions(-)
 create mode 100644 lib/librte_eal/windows/eal_interrupts.c

-- 
2.16.1.windows.4



Re: [dpdk-dev] The mbuf API needs some cleaning up

2020-08-03 Thread Thomas Monjalon
03/08/2020 10:42, Morten Brørup:
> > From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Olivier Matz
> > Sent: Friday, July 31, 2020 5:25 PM
> > 
> > Hi Morten,
> > 
> > Thanks for the feedback.
> > 
> > On Mon, Jul 13, 2020 at 11:57:38AM +0200, Morten Brørup wrote:
> > 
> > > The MBUF library exposes some macros and constants without the RTE_
> > prefix. I
> > > propose cleaning up these, so better names get into the coming LTS
> > release.
> > 
> > Yes, Thomas talked about it some time ago and he even drafted a patch
> > to
> > fix it. We can target 20.11 for the changes, but I think we'll have to
> > keep a compat API until 21.11.
> > 
> 
> Great, then I will back off. No need for multiple patches fixing the same 
> things. :-)

Morten, you are very welcome to work on cleaning the mbuf API.
I think we must focus on introducing the new names,
while keeping the old names as aliases for one year.

The other thing to do is to provide a devtools script
to help convert applications to the new names.




[dpdk-dev] [PATCH] net/bnxt: update resource allocation settings

2020-08-03 Thread Somnath Kotur
From: Shahaji Bhosle 

Adjusted the allocations for hardware resources such as TCAM entries,
action records, stat counters and exact match records to scale up
offload flows.
Also increased IPv4 NAT entries to 1023.
This patch is a critical fix to enable driver load on current and all
FW versions going forward.

Fixes: cef3749d501e2 ("net/bnxt: update TruFlow resource allocation numbers")

Signed-off-by: Shahaji Bhosle 
Signed-off-by: Kishore Padmanabha 
Reviewed-by: Ajit Kumar Khaparde 
Reviewed-by: Michael Baucom 
Signed-off-by: Venkat Duvvuru 
---
 drivers/net/bnxt/tf_ulp/bnxt_ulp.c | 42 +++---
 1 file changed, 21 insertions(+), 21 deletions(-)

diff --git a/drivers/net/bnxt/tf_ulp/bnxt_ulp.c 
b/drivers/net/bnxt/tf_ulp/bnxt_ulp.c
index c19cd1d..0d4a455 100644
--- a/drivers/net/bnxt/tf_ulp/bnxt_ulp.c
+++ b/drivers/net/bnxt/tf_ulp/bnxt_ulp.c
@@ -86,68 +86,68 @@ ulp_ctx_session_open(struct bnxt *bp,
resources = ¶ms.resources;
/** RX **/
/* Identifiers */
-   resources->ident_cnt[TF_DIR_RX].cnt[TF_IDENT_TYPE_L2_CTXT_HIGH] = 200;
-   resources->ident_cnt[TF_DIR_RX].cnt[TF_IDENT_TYPE_L2_CTXT_LOW] = 20;
+   resources->ident_cnt[TF_DIR_RX].cnt[TF_IDENT_TYPE_L2_CTXT_HIGH] = 422;
+   resources->ident_cnt[TF_DIR_RX].cnt[TF_IDENT_TYPE_L2_CTXT_LOW] = 6;
resources->ident_cnt[TF_DIR_RX].cnt[TF_IDENT_TYPE_WC_PROF] = 8;
resources->ident_cnt[TF_DIR_RX].cnt[TF_IDENT_TYPE_PROF_FUNC] = 8;
resources->ident_cnt[TF_DIR_RX].cnt[TF_IDENT_TYPE_EM_PROF] = 8;
 
/* Table Types */
-   resources->tbl_cnt[TF_DIR_RX].cnt[TF_TBL_TYPE_FULL_ACT_RECORD] = 720;
-   resources->tbl_cnt[TF_DIR_RX].cnt[TF_TBL_TYPE_ACT_STATS_64] = 720;
-   resources->tbl_cnt[TF_DIR_RX].cnt[TF_TBL_TYPE_ACT_MODIFY_IPV4] = 8;
+   resources->tbl_cnt[TF_DIR_RX].cnt[TF_TBL_TYPE_FULL_ACT_RECORD] = 8192;
+   resources->tbl_cnt[TF_DIR_RX].cnt[TF_TBL_TYPE_ACT_STATS_64] = 8192;
+   resources->tbl_cnt[TF_DIR_RX].cnt[TF_TBL_TYPE_ACT_MODIFY_IPV4] = 1023;
 
/* ENCAP */
resources->tbl_cnt[TF_DIR_RX].cnt[TF_TBL_TYPE_ACT_ENCAP_8B] = 16;
-   resources->tbl_cnt[TF_DIR_RX].cnt[TF_TBL_TYPE_ACT_ENCAP_16B] = 16;
+   resources->tbl_cnt[TF_DIR_RX].cnt[TF_TBL_TYPE_ACT_ENCAP_16B] = 63;
 
/* TCAMs */
resources->tcam_cnt[TF_DIR_RX].cnt[TF_TCAM_TBL_TYPE_L2_CTXT_TCAM_HIGH] =
-   200;
+   422;
resources->tcam_cnt[TF_DIR_RX].cnt[TF_TCAM_TBL_TYPE_L2_CTXT_TCAM_LOW] =
-   20;
+   6;
resources->tcam_cnt[TF_DIR_RX].cnt[TF_TCAM_TBL_TYPE_PROF_TCAM] = 8;
-   resources->tcam_cnt[TF_DIR_RX].cnt[TF_TCAM_TBL_TYPE_WC_TCAM] = 416;
+   resources->tcam_cnt[TF_DIR_RX].cnt[TF_TCAM_TBL_TYPE_WC_TCAM] = 88;
 
/* EM */
-   resources->em_cnt[TF_DIR_RX].cnt[TF_EM_TBL_TYPE_EM_RECORD] = 2048;
+   resources->em_cnt[TF_DIR_RX].cnt[TF_EM_TBL_TYPE_EM_RECORD] = 13176;
 
/* EEM */
resources->em_cnt[TF_DIR_RX].cnt[TF_EM_TBL_TYPE_TBL_SCOPE] = 1;
 
/** TX **/
/* Identifiers */
-   resources->ident_cnt[TF_DIR_TX].cnt[TF_IDENT_TYPE_L2_CTXT_HIGH] = 200;
-   resources->ident_cnt[TF_DIR_TX].cnt[TF_IDENT_TYPE_L2_CTXT_LOW] = 20;
+   resources->ident_cnt[TF_DIR_TX].cnt[TF_IDENT_TYPE_L2_CTXT_HIGH] = 292;
+   resources->ident_cnt[TF_DIR_TX].cnt[TF_IDENT_TYPE_L2_CTXT_LOW] = 144;
resources->ident_cnt[TF_DIR_TX].cnt[TF_IDENT_TYPE_WC_PROF] = 8;
resources->ident_cnt[TF_DIR_TX].cnt[TF_IDENT_TYPE_PROF_FUNC] = 8;
resources->ident_cnt[TF_DIR_TX].cnt[TF_IDENT_TYPE_EM_PROF] = 8;
 
/* Table Types */
-   resources->tbl_cnt[TF_DIR_TX].cnt[TF_TBL_TYPE_FULL_ACT_RECORD] = 16;
-   resources->tbl_cnt[TF_DIR_TX].cnt[TF_TBL_TYPE_ACT_STATS_64] = 16;
-   resources->tbl_cnt[TF_DIR_TX].cnt[TF_TBL_TYPE_ACT_MODIFY_IPV4] = 8;
+   resources->tbl_cnt[TF_DIR_TX].cnt[TF_TBL_TYPE_FULL_ACT_RECORD] = 8192;
+   resources->tbl_cnt[TF_DIR_TX].cnt[TF_TBL_TYPE_ACT_STATS_64] = 8192;
+   resources->tbl_cnt[TF_DIR_TX].cnt[TF_TBL_TYPE_ACT_MODIFY_IPV4] = 1023;
 
/* ENCAP */
-   resources->tbl_cnt[TF_DIR_TX].cnt[TF_TBL_TYPE_ACT_ENCAP_64B] = 16;
-   resources->tbl_cnt[TF_DIR_TX].cnt[TF_TBL_TYPE_ACT_ENCAP_16B] = 16;
+   resources->tbl_cnt[TF_DIR_TX].cnt[TF_TBL_TYPE_ACT_ENCAP_64B] = 511;
+   resources->tbl_cnt[TF_DIR_TX].cnt[TF_TBL_TYPE_ACT_ENCAP_16B] = 200;
 
/* TCAMs */
resources->tcam_cnt[TF_DIR_TX].cnt[TF_TCAM_TBL_TYPE_L2_CTXT_TCAM_HIGH] =
-   200;
+   292;
resources->tcam_cnt[TF_DIR_TX].cnt[TF_TCAM_TBL_TYPE_L2_CTXT_TCAM_LOW] =
-   20;
+   144;
resources->tcam_cnt[TF_DIR_TX].cnt[TF_TCAM_TBL_TYPE_PROF_TCAM] = 8;
resources->tcam_cnt[TF_DIR_TX].cnt[TF_TCAM_TBL_TYPE_WC_TCAM] = 8;
 
/* EM */
-   resources->em_cnt[TF_DIR_TX].cnt[TF_EM_TBL_TYPE_EM_RECORD] = 2048;
+   resources->em_cnt[TF_DIR_TX].cnt[TF_EM_TBL_T

[dpdk-dev] [PATCH] doc: announce changes to ethdev rxconf structure

2020-08-03 Thread Viacheslav Ovsiienko
The DPDK datapath in the transmit direction is very flexible.
Applications can build multisegment packets and manage
almost all data aspects - the memory pools the segments
are allocated from, the segment lengths, the memory attributes
like external, registered, etc.

In the receiving direction, the datapath is much less flexible:
applications can only specify the memory pool to configure
the receiving queue, and nothing more. In order to extend the
receiving datapath capabilities it is proposed to add the following
new fields to the rte_eth_rxconf structure:

struct rte_eth_rxconf {
...
uint16_t rx_split_num; /* number of segments to split */
uint16_t *rx_split_len; /* array of segment lengths */
struct rte_mempool **mp; /* array of segment memory pools */
...
};

A non-zero value of the rx_split_num field configures the receiving
queue to split ingress packets into multiple segments placed in mbufs
allocated from various memory pools according to the specified
lengths. A zero value of the rx_split_num field provides
backward compatibility and the queue is configured in the regular
way (with single/multiple mbufs of the same data buffer length
allocated from a single memory pool).

The new approach would allow splitting the ingress packets into
multiple parts pushed to memory with different attributes.
For example, the packet headers can be pushed to the embedded data
buffers within mbufs and the application data into external
buffers attached to mbufs allocated from different memory
pools. The memory attributes for the split parts may differ
as well - for example the application data may be pushed into
external memory located on a dedicated physical device,
say a GPU or NVMe. This would improve the DPDK receiving datapath
flexibility while preserving compatibility with the existing API.
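
As an illustration only (assuming the proposed fields are added exactly as
sketched above; they are not part of the current rte_eth_rxconf), an
application could then configure a queue to split each packet into a header
part and a data part taken from two different pools:

#include <string.h>
#include <rte_ethdev.h>

/* Hypothetical usage of the proposed fields: 128-byte header segments from
 * hdr_pool, the rest of the packet into buffers from data_pool. */
static int
setup_split_rxq(uint16_t port_id, uint16_t queue_id, uint16_t nb_desc,
		unsigned int socket_id, struct rte_mempool *hdr_pool,
		struct rte_mempool *data_pool)
{
	static uint16_t split_len[2] = { 128, 2048 };
	static struct rte_mempool *split_mp[2];
	struct rte_eth_rxconf rxconf;

	memset(&rxconf, 0, sizeof(rxconf));
	split_mp[0] = hdr_pool;
	split_mp[1] = data_pool;
	rxconf.rx_split_num = 2;		/* proposed field */
	rxconf.rx_split_len = split_len;	/* proposed field */
	rxconf.mp = split_mp;			/* proposed field */
	return rte_eth_rx_queue_setup(port_id, queue_id, nb_desc, socket_id,
				      &rxconf, data_pool);
}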

Signed-off-by: Viacheslav Ovsiienko 
---
 doc/guides/rel_notes/deprecation.rst | 5 +
 1 file changed, 5 insertions(+)

diff --git a/doc/guides/rel_notes/deprecation.rst 
b/doc/guides/rel_notes/deprecation.rst
index ea4cfa7..cd700ae 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -99,6 +99,11 @@ Deprecation Notices
   In 19.11 PMDs will still update the field even when the offload is not
   enabled.
 
+* ethdev: add new fields to ``rte_eth_rxconf`` to configure the receiving
+  queues to split ingress packets into multiple segments according to the
+  specified lengths into the buffers allocated from the specified
+  memory pools. The backward compatibility to existing API is preserved.
+
 * ethdev: ``rx_descriptor_done`` dev_ops and ``rte_eth_rx_descriptor_done``
   will be deprecated in 20.11 and will be removed in 21.11.
   Existing ``rte_eth_rx_descriptor_status`` and 
``rte_eth_tx_descriptor_status``
-- 
1.8.3.1



[dpdk-dev] [Bug 520] i40e: UDP PTPv2 packets sent to port 320 not timestamped

2020-08-03 Thread bugzilla
https://bugs.dpdk.org/show_bug.cgi?id=520

Bug ID: 520
   Summary: i40e: UDP PTPv2 packets sent to port 320 not
timestamped
   Product: DPDK
   Version: 20.05
  Hardware: All
OS: All
Status: UNCONFIRMED
  Severity: normal
  Priority: Normal
 Component: ethdev
  Assignee: dev@dpdk.org
  Reporter: pawel...@gmail.com
  Target Milestone: ---

Hi,

PTPv2 UDP SYNC packets sent to port 320 are not timestamped with an X710 NIC,
while the same packets sent to port 319 are timestamped.

I checked registers configuration under debugger after i40e_timesync_enable()
is called:

PRTTSYN_CTL1.TSYNTYPE = 0x2
PRTTSYN_CTL1.UDP_ENA = 0x3

So both ports 319 and 320 should work fine.

-- 
You are receiving this mail because:
You are the assignee for the bug.

Re: [dpdk-dev] [v2] vdpa/mlx5: fix queue update synchronization

2020-08-03 Thread Xueming(Steven) Li
Hi Maxime,

> -Original Message-
> From: Maxime Coquelin 
> Sent: Monday, August 3, 2020 5:51 PM
> To: Xueming(Steven) Li 
> Cc: dev@dpdk.org; a...@dpdk.org; "Penso
> Subject: Re: [dpdk-dev] [v2] vdpa/mlx5: fix queue update synchronization
> 
> 
> 
> On 8/2/20 11:21 AM, Xueming Li wrote:
> > The driver CQ event management is done by non vhost library thread,
> > either the dpdk host thread or the internal vDPA driver thread.
> >
> > When a queue is updated the CQ may be destroyed and created by the
> > vhost library thread via the queue state operation.
> >
> > When the queue update feature was added, it didn't synchronize the CQ
> > management to the queue update what may cause invalid memory access.
> >
> > Add the aforementioned synchronization by a new per device
> > configuration mutex.
> >
> > Fixes: c47d6e83334e ("vdpa/mlx5: support queue update")
> >
> > Signed-off-by: Xueming Li 
> > Acked-by: Matan Azrad 
> > ---
> >  drivers/vdpa/mlx5/mlx5_vdpa.c   | 8 +++-
> >  drivers/vdpa/mlx5/mlx5_vdpa.h   | 1 +
> >  drivers/vdpa/mlx5/mlx5_vdpa_event.c | 8 
> >  3 files changed, 16 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/vdpa/mlx5/mlx5_vdpa.c
> > b/drivers/vdpa/mlx5/mlx5_vdpa.c index c0b87bcc01..a8f3e4b1de 100644
> > --- a/drivers/vdpa/mlx5/mlx5_vdpa.c
> > +++ b/drivers/vdpa/mlx5/mlx5_vdpa.c
> > @@ -133,6 +133,7 @@ mlx5_vdpa_set_vring_state(int vid, int vring, int state)
> > struct rte_vdpa_device *vdev = rte_vhost_get_vdpa_device(vid);
> > struct mlx5_vdpa_priv *priv =
> > mlx5_vdpa_find_priv_resource_by_vdev(vdev);
> > +   int ret;
> >
> > if (priv == NULL) {
> > DRV_LOG(ERR, "Invalid vDPA device: %s.", vdev->device-
> >name); @@
> > -142,7 +143,10 @@ mlx5_vdpa_set_vring_state(int vid, int vring, int state)
> > DRV_LOG(ERR, "Too big vring id: %d.", vring);
> > return -E2BIG;
> > }
> > -   return mlx5_vdpa_virtq_enable(priv, vring, state);
> > +   pthread_mutex_lock(&priv->vq_config_lock);
> > +   ret = mlx5_vdpa_virtq_enable(priv, vring, state);
> > +   pthread_mutex_unlock(&priv->vq_config_lock);
> > +   return ret;
> >  }
> >
> >  static int
> > @@ -742,6 +746,7 @@ mlx5_vdpa_pci_probe(struct rte_pci_driver *pci_drv
> __rte_unused,
> > }
> > mlx5_vdpa_config_get(pci_dev->device.devargs, priv);
> > SLIST_INIT(&priv->mr_list);
> > +   pthread_mutex_init(&priv->vq_config_lock, NULL);
> > pthread_mutex_lock(&priv_list_lock);
> > TAILQ_INSERT_TAIL(&priv_list, priv, next);
> > pthread_mutex_unlock(&priv_list_lock);
> > @@ -793,6 +798,7 @@ mlx5_vdpa_pci_remove(struct rte_pci_device
> *pci_dev)
> > priv->var = NULL;
> > }
> > mlx5_glue->close_device(priv->ctx);
> > +   pthread_mutex_destroy(&priv->vq_config_lock);
> > rte_free(priv);
> > }
> > return 0;
> > diff --git a/drivers/vdpa/mlx5/mlx5_vdpa.h
> > b/drivers/vdpa/mlx5/mlx5_vdpa.h index 57044d9d33..462805a352 100644
> > --- a/drivers/vdpa/mlx5/mlx5_vdpa.h
> > +++ b/drivers/vdpa/mlx5/mlx5_vdpa.h
> > @@ -120,6 +120,7 @@ enum {
> >  struct mlx5_vdpa_priv {
> > TAILQ_ENTRY(mlx5_vdpa_priv) next;
> > uint8_t configured;
> > +   pthread_mutex_t vq_config_lock;
> > uint64_t last_traffic_tic;
> > pthread_t timer_tid;
> > pthread_mutex_t timer_lock;
> > diff --git a/drivers/vdpa/mlx5/mlx5_vdpa_event.c
> > b/drivers/vdpa/mlx5/mlx5_vdpa_event.c
> > index 7dc1ac0fa9..4a8b7b0bd9 100644
> > --- a/drivers/vdpa/mlx5/mlx5_vdpa_event.c
> > +++ b/drivers/vdpa/mlx5/mlx5_vdpa_event.c
> > @@ -274,6 +274,7 @@ mlx5_vdpa_poll_handle(void *arg)
> >  priv-
> >event_us;
> > while (1) {
> > max = 0;
> > +   pthread_mutex_lock(&priv->vq_config_lock);
> > for (i = 0; i < priv->nr_virtqs; i++) {
> > cq = &priv->virtqs[i].eqp.cq;
> > if (cq->cq && !cq->armed) {
> > @@ -298,6 +299,7 @@ mlx5_vdpa_poll_handle(void *arg)
> > priv->vdev->device->name);
> > mlx5_vdpa_arm_all_cqs(priv);
> > pthread_mutex_lock(&priv->timer_lock);
> > +   pthread_mutex_unlock(&priv-
> >vq_config_lock);
> 
> Is it mandatory to hold timer_lock before releasing vq_config_lock?
> If not, swapping would be maybe safer.

Yes, could you please swap the lines when integrating? According to the lock
usage in the CQ handling, we don't care about the order.

> 
> > priv->timer_on = 0;
> > while (!priv->timer_on)
> > pthread_cond_wait(&priv-
> >timer_cond,
> > @@ -312,6 +314,7 @@ mlx5_vdpa_poll_handle(void *arg)
> > } else {
> > priv->last_traffic_tic = current_tic;
> > }
> > +   pthread_mutex_unlock(&priv->vq_config_lock);
> > mlx5_v

[dpdk-dev] [PATCH] doc: announce API change in timer

2020-08-03 Thread Sarosh Arif
If the user tries to reset/stop some other timer, which is also about to
expire, in its callback function using
rte_timer_reset_sync/rte_timer_stop_sync, the application goes into
an infinite loop. This happens because
rte_timer_reset_sync/rte_timer_stop_sync loop until the timer is
reset/stopped, and there is a check inside timer_set_config_state which
prevents a running timer from being reset/stopped by anything other than
its own timer_cb. Therefore timer_set_config_state returns -1, due to which
rte_timer_reset returns -1 and rte_timer_reset_sync goes into an
infinite loop.

To prevent this, rte_timer_reset_sync and rte_timer_stop_sync should
have int return types, so that -1 can be returned if the above condition
occurs.
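
A minimal sketch of the problematic pattern (illustrative; timer names and
setup are hypothetical):

#include <rte_timer.h>
#include <rte_common.h>

static struct rte_timer timer_b;

/* Callback registered (elsewhere) for some timer A. It synchronously stops
 * timer_b, which may itself be about to expire on another lcore. With the
 * current void return type, rte_timer_stop_sync() can spin forever here,
 * because the internal stop request keeps being rejected. */
static void
timer_a_cb(struct rte_timer *tim, void *arg)
{
	RTE_SET_USED(tim);
	RTE_SET_USED(arg);
	rte_timer_stop_sync(&timer_b);	/* may never return */
}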

Signed-off-by: Sarosh Arif 
---
 doc/guides/rel_notes/deprecation.rst | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/doc/guides/rel_notes/deprecation.rst 
b/doc/guides/rel_notes/deprecation.rst
index ea4cfa7a4..ed93a707d 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -151,3 +151,9 @@ Deprecation Notices
   Python 2 support will be completely removed in 20.11.
   In 20.08, explicit deprecation warnings will be displayed when running
   scripts with Python 2.
+
+* timer: The timer library can get stuck in an infinite loop if the
+  application tries to reset/stop some other timer, which is also about to
+  expire, in its callback function. The functions ``rte_timer_reset_sync``
+  and ``rte_timer_stop_sync`` will get an int return type in order to
+  return -1 when this condition occurs.
-- 
2.17.1



[dpdk-dev] [PATCH] doc: announce removal of legacy ethdev filtering API

2020-08-03 Thread Thomas Monjalon
Deprecation of rte_eth_dev_filter_ctrl() was announced in 2016,
and confirmed last year by avoiding include of rte_eth_ctrl.h.
After 4 years, it is time to complete the removal of the API
which is replaced with rte_flow.
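
For reference, a minimal example of the replacement API: steering IPv4 traffic
to a given Rx queue with rte_flow instead of the legacy filter types (the
function name, queue index and matching pattern are chosen arbitrarily for
illustration):

#include <rte_flow.h>

/* Create a flow rule on port_id directing all IPv4 ingress to one queue. */
static struct rte_flow *
ipv4_to_queue(uint16_t port_id, uint16_t queue, struct rte_flow_error *err)
{
	struct rte_flow_attr attr = { .ingress = 1 };
	struct rte_flow_action_queue queue_act = { .index = queue };
	struct rte_flow_item pattern[] = {
		{ .type = RTE_FLOW_ITEM_TYPE_ETH },
		{ .type = RTE_FLOW_ITEM_TYPE_IPV4 },
		{ .type = RTE_FLOW_ITEM_TYPE_END },
	};
	struct rte_flow_action actions[] = {
		{ .type = RTE_FLOW_ACTION_TYPE_QUEUE, .conf = &queue_act },
		{ .type = RTE_FLOW_ACTION_TYPE_END },
	};

	return rte_flow_create(port_id, &attr, pattern, actions, err);
}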

Signed-off-by: Thomas Monjalon 
---
 doc/guides/rel_notes/deprecation.rst | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/doc/guides/rel_notes/deprecation.rst 
b/doc/guides/rel_notes/deprecation.rst
index ea4cfa7a48..6e12e57a13 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -81,8 +81,7 @@ Deprecation Notices
   as filter types MACVLAN, ETHERTYPE, FLEXIBLE, SYN, NTUPLE, TUNNEL, FDIR,
   HASH and L2_TUNNEL, is superseded by the generic flow API (rte_flow) in
   PMDs that implement the latter.
-  Target release for removal of the legacy API will be defined once most
-  PMDs have switched to rte_flow.
+  The legacy API will be removed in DPDK 20.11.
 
 * ethdev: Update API functions returning ``void`` to return ``int`` with
   negative errno values to indicate various error conditions (e.g.
-- 
2.27.0



Re: [dpdk-dev] [PATCH] doc: announce removal of legacy ethdev filtering API

2020-08-03 Thread Andrew Rybchenko
On 8/3/20 2:49 PM, Thomas Monjalon wrote:
> Deprecation of rte_eth_dev_filter_ctrl() was announced in 2016,
> and confirmed last year by avoiding include of rte_eth_ctrl.h.
> After 4 years, it is time to complete the removal of the API
> which is replaced with rte_flow.
>
> Signed-off-by: Thomas Monjalon 
>

Acked-by: Andrew Rybchenko 



Re: [dpdk-dev] [PATCH] doc: announce removal of legacy ethdev filtering API

2020-08-03 Thread Jerin Jacob
On Mon, Aug 3, 2020 at 5:24 PM Andrew Rybchenko
 wrote:
>
> On 8/3/20 2:49 PM, Thomas Monjalon wrote:
> > Deprecation of rte_eth_dev_filter_ctrl() was announced in 2016,
> > and confirmed last year by avoiding include of rte_eth_ctrl.h.
> > After 4 years, it is time to complete the removal of the API
> > which is replaced with rte_flow.
> >
> > Signed-off-by: Thomas Monjalon 
> >
>
> Acked-by: Andrew Rybchenko 

Acked-by: Jerin Jacob 


>


Re: [dpdk-dev] [PATCH] doc: announce changes to ethdev rxconf structure

2020-08-03 Thread Jerin Jacob
On Mon, Aug 3, 2020 at 4:28 PM Viacheslav Ovsiienko
 wrote:
>
> The DPDK datapath in the transmit direction is very flexible.
> The applications can build multisegment packets and manages
> almost all data aspects - the memory pools where segments
> are allocated from, the segment lengths, the memory attributes
> like external, registered, etc.
>
> In the receiving direction, the datapath is much less flexible,
> the applications can only specify the memory pool to configure
> the receiving queue and nothing more. In order to extend the
> receiving datapath capabilities it is proposed to add the new
> fields into rte_eth_rxconf structure:
>
> struct rte_eth_rxconf {
> ...
> uint16_t rx_split_num; /* number of segments to split */
> uint16_t *rx_split_len; /* array of segment lengthes */
> struct rte_mempool **mp; /* array of segment memory pools */

The pool has the packet length it's been configured for.
So I think, rx_split_len can be removed.

This feature is also available in Marvell HW, so it is not specific to one vendor.
Maybe we could just mention the use case in the deprecation notice
and the tentative change in rte_eth_rxconf, and the exact details can be worked
out at the time of implementation.

With the above change:
Acked-by: Jerin Jacob 


> ...
> };
>
> The non-zero value of rx_split_num field configures the receiving
> queue to split ingress packets into multiple segments to the mbufs
> allocated from various memory pools according to the specified
> lengths. The zero value of rx_split_num field provides the
> backward compatibility and queue should be configured in a regular
> way (with single/multiple mbufs of the same data buffer length
> allocated from the single memory pool).
>
> The new approach would allow splitting the ingress packets into
> multiple parts pushed to the memory with different attributes.
> For example, the packet headers can be pushed to the embedded data
> buffers within mbufs and the application data into the external
> buffers attached to mbufs allocated from the different memory
> pools. The memory attributes for the split parts may differ
> either - for example the application data may be pushed into
> the external memory located on the dedicated physical device,
> say GPU or NVMe. This would improve the DPDK receiving datapath
> flexibility preserving compatibility with existing API.
>
> Signed-off-by: Viacheslav Ovsiienko 
> ---
>  doc/guides/rel_notes/deprecation.rst | 5 +
>  1 file changed, 5 insertions(+)
>
> diff --git a/doc/guides/rel_notes/deprecation.rst 
> b/doc/guides/rel_notes/deprecation.rst
> index ea4cfa7..cd700ae 100644
> --- a/doc/guides/rel_notes/deprecation.rst
> +++ b/doc/guides/rel_notes/deprecation.rst
> @@ -99,6 +99,11 @@ Deprecation Notices
>In 19.11 PMDs will still update the field even when the offload is not
>enabled.
>
> +* ethdev: add new fields to ``rte_eth_rxconf`` to configure the receiving
> +  queues to split ingress packets into multiple segments according to the
> +  specified lengths into the buffers allocated from the specified
> +  memory pools. The backward compatibility to existing API is preserved.
> +
>  * ethdev: ``rx_descriptor_done`` dev_ops and ``rte_eth_rx_descriptor_done``
>will be deprecated in 20.11 and will be removed in 21.11.
>Existing ``rte_eth_rx_descriptor_status`` and 
> ``rte_eth_tx_descriptor_status``
> --
> 1.8.3.1
>


Re: [dpdk-dev] [PATCH V1 2/3] mbuf: change free_cb interface to adapt to GSO case

2020-08-03 Thread Olivier Matz
On Mon, Aug 03, 2020 at 05:42:13PM +0800, yang_y_yi wrote:
> At 2020-08-03 16:11:39, "Olivier Matz"  wrote:
> >On Mon, Aug 03, 2020 at 09:26:40AM +0800, yang_y_yi wrote:
> >> At 2020-08-03 04:29:07, "Olivier Matz"  wrote:
> >> >Hi,
> >> >
> >> >On Sun, Aug 02, 2020 at 07:12:36AM +0800, yang_y_yi wrote:
> >> >> 
> >> >> 
> >> >> At 2020-07-31 23:15:43, "Olivier Matz"  wrote:
> >> >> >Hi,
> >> >> >
> >> >> >On Thu, Jul 30, 2020 at 08:08:59PM +0800, yang_y...@163.com wrote:
> >> >> >> From: Yi Yang 
> >> >> >> 
> >> >> >> In GSO case, segmented mbufs are attached to original
> >> >> >> mbuf which can't be freed when it is external. The issue
> >> >> >> is free_cb doesn't know original mbuf and doesn't free
> >> >> >> it when refcnt of shinfo is 0.
> >> >> >> 
> >> >> >> Original mbuf can be freed by rte_pktmbuf_free segmented
> >> >> >> mbufs or by rte_pktmbuf_free original mbuf. Two kind of
> >> >> >> cases should have different behaviors. free_cb won't
> >> >> >> explicitly call rte_pktmbuf_free to free original mbuf
> >> >> >> if it is freed by rte_pktmbuf_free original mbuf, but it
> >> >> >> has to call rte_pktmbuf_free to free original mbuf if it
> >> >> >> is freed by rte_pktmbuf_free segmented mbufs.
> >> >> >> 
> >> >> >> In order to fix this issue, free_cb interface has to been
> >> >> >> changed, __rte_pktmbuf_free_extbuf must deliver called
> >> >> >> mbuf pointer to free_cb, argument opaque can be defined
> >> >> >> as a custom struct by user, it can includes original mbuf
> >> >> >> pointer, user-defined free_cb can compare caller mbuf with
> >> >> >> mbuf in opaque struct, free_cb should free original mbuf
> >> >> >> if they are not same, this corresponds to rte_pktmbuf_free
> >> >> >> segmented mbufs case, otherwise, free_cb won't free original
> >> >> >> mbuf because the caller explicitly called rte_pktmbuf_free
> >> >> >> to free it.
> >> >> >> 
> >> >> >> Here is pseduo code to show two kind of cases.
> >> >> >> 
> >> >> >> case 1. rte_pktmbuf_free segmented mbufs
> >> >> >> 
> >> >> >> nb_tx = rte_gso_segment(original_mbuf, /* original mbuf */
> >> >> >>   &gso_ctx,
> >> >> >>   /* segmented mbuf */
> >> >> >>   (struct rte_mbuf **)&gso_mbufs,
> >> >> >>   MAX_GSO_MBUFS);
> >> >> >
> >> >> >I'm sorry but it is not very clear to me what operations are done by
> >> >> >rte_gso_segment().
> >> >> >
> >> >> >In the current code, I only see calls to rte_pktmbuf_attach(),
> >> >> >which do not deal with external buffers. Am I missing something?
> >> >> >
> >> >> >Are you able to show the issue only with mbuf functions? It would
> >> >> >be helpful to understand what does not work.
> >> >> >
> >> >> >
> >> >> >Thanks,
> >> >> >Olivier
> >> >> >
> >> >> Oliver, thank you for comment, let me show you why it doesn't work for 
> >> >> my use case.  In OVS DPDK, VM uses vhostuserclient to send large 
> >> >> packets whose size is about 64K because we enabled TSO & UFO, these 
> >> >> large packets use rte_mbufs allocated by DPDK virtio_net function 
> >> >> virtio_dev_pktmbuf_alloc() (in file lib/librte_vhost/virtio_net.c. 
> >> >> Please refer to [PATCH V1 3/3], I changed free_cb as below, these 
> >> >> packets use the same allocate function and the same free_cb no matter 
> >> >> they are TCP packet or UDP packets, in case of VXLAN TSO, most NICs 
> >> >> can't support inner UDP fragment offload, so OVS DPDK has to do it by 
> >> >> software, for UDP case, the original rte_mbuf only can be freed by 
> >> >> segmented rte_mbufs which are output packets of rte_gso_segment, i.e. 
> >> >> the original rte_mbuf only can freed by free_cb, you can see, it 
> >> >> explicitly called rte_pktmbuf_free(arg->mbuf), the condition statement 
> >> >> "if (caller_m != arg->mbuf)" is true for this case, this has no 
> >> >> problem, but for TCP case, the original mbuf is delivered to 
> >> >> rte_eth_tx_burst() but not segmented rte_mbufs output by 
> >> >> rte_gso_segment, PMD driver will call 
> >> >> rte_pktmbuf_free(original_rte_mbuf) but not 
> >> >> rte_pktmbuf_free(segmented_rte_mbufs), the same free_cb will be called, 
> >> >> that means original_rte_mbuf will be freed twice, you know what will 
> >> >> happen, this is just the issue I'm fixing. I bring in caller_m 
> >> >> argument, it can help work around this because caller_m is arg->mbuf 
> >> >> and the condition statement "if (caller_m != arg->mbuf)" is false, you 
> >> >> can't fix it without the change this patch series did.
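
A minimal sketch of the arrangement described above (hypothetical struct and
callback names; the free_cb signature carrying the caller mbuf is the one
proposed by this series, not the current rte_mbuf API):

#include <rte_mbuf.h>

/* Opaque argument carrying the original (source) mbuf. */
struct gso_free_arg {
	struct rte_mbuf *mbuf;
};

/* free_cb with the extra caller mbuf argument proposed in this series. */
static void
gso_extbuf_free_cb(struct rte_mbuf *caller_m, void *addr, void *opaque)
{
	struct gso_free_arg *arg = opaque;

	RTE_SET_USED(addr);
	if (caller_m != arg->mbuf)
		/* Freed via a segmented mbuf: also free the original mbuf. */
		rte_pktmbuf_free(arg->mbuf);
	/* Otherwise the caller is already freeing the original mbuf itself. */
}
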
> >> >
> >> >I'm sill not sure to get your issue. Please, if you have a simple test
> >> >case using only mbufs functions (without virtio, gso, ...), it would be
> >> >very helpful because we will be sure that we are talking about the same
> >> >thing. In case there is an issue, it can easily become a unit test.
> >> 
> >> Oliver, I think you don't get the point, free operation can't be 
> >> controlled by the application itself, 
> >> it is done by 

Re: [dpdk-dev] [PATCH] doc: announce API change in mbuf

2020-08-03 Thread Andrew Rybchenko
On 8/2/20 10:28 PM, Olivier Matz wrote:
> On Fri, Jul 31, 2020 at 06:03:49PM +0200, Thomas Monjalon wrote:
>> In order to prepare for adding more features requiring more space in mbuf,
>> some static fields must become dynamic.
>> Some small layout changes may have performance benefits as well.
>>
>> The deprecation notice for atomic refcount is moved and reworded
>> to fit below the layout deprecation.
>>
>> Signed-off-by: Thomas Monjalon 
> 
> Acked-by: Olivier Matz 
> 

Acked-by: Andrew Rybchenko 
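
For readers unfamiliar with the mechanism the notice refers to, moving a field
out of the static layout typically relies on the existing dynamic field API.
A minimal sketch (the field name and the uint64_t payload are illustrative
only; in real code the offset would be registered once at init time):

#include <rte_errno.h>
#include <rte_mbuf.h>
#include <rte_mbuf_dyn.h>

static int
example_store_value(struct rte_mbuf *m, uint64_t value)
{
	static const struct rte_mbuf_dynfield desc = {
		.name = "example_dynfield_u64",
		.size = sizeof(uint64_t),
		.align = __alignof__(uint64_t),
	};
	int offset = rte_mbuf_dynfield_register(&desc);

	if (offset < 0)
		return -rte_errno;   /* no room left in the mbuf */
	/* Access the per-mbuf field through the returned offset. */
	*RTE_MBUF_DYNFIELD(m, offset, uint64_t *) = value;
	return 0;
}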


[dpdk-dev] [PATCH] doc: announce GENEVE header options support

2020-08-03 Thread Viacheslav Ovsiienko
In order to add support for the GENEVE header variable length
options, the rte_flow_item_geneve_option item will be introduced:

struct rte_flow_item_geneve_option {
	rte_be16_t option_class;
	uint8_t option_type:7;
	uint8_t critical:1;
	uint8_t length:5;
	uint8_t rsvd0:3;
	uint8_t data[];
};

This option item (one or multiple) must follow the rte_flow_item_geneve
item and provides the pattern to match the GENEVE header option.
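
For illustration, a pattern using the proposed item might look as follows
(sketch only: the final item type name, here written as
RTE_FLOW_ITEM_TYPE_GENEVE_OPT, is not defined yet and the option values are
arbitrary):

struct rte_flow_item_geneve geneve_spec = {
	.protocol = RTE_BE16(RTE_ETHER_TYPE_IPV4),
};
struct rte_flow_item_geneve_option opt_spec = {
	.option_class = RTE_BE16(0x0102),
	.option_type = 0x21,
	.length = 1,                     /* option data length in 4-byte words */
};
struct rte_flow_item pattern[] = {
	{ .type = RTE_FLOW_ITEM_TYPE_ETH },
	{ .type = RTE_FLOW_ITEM_TYPE_IPV4 },
	{ .type = RTE_FLOW_ITEM_TYPE_UDP },
	{ .type = RTE_FLOW_ITEM_TYPE_GENEVE, .spec = &geneve_spec },
	/* The option item(s) must directly follow the GENEVE item. */
	{ .type = RTE_FLOW_ITEM_TYPE_GENEVE_OPT, .spec = &opt_spec },
	{ .type = RTE_FLOW_ITEM_TYPE_END },
};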

Signed-off-by: Viacheslav Ovsiienko 
---
 doc/guides/rel_notes/deprecation.rst | 4 
 1 file changed, 4 insertions(+)

diff --git a/doc/guides/rel_notes/deprecation.rst 
b/doc/guides/rel_notes/deprecation.rst
index cd700ae..b6bdb83 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -109,6 +109,10 @@ Deprecation Notices
   Existing ``rte_eth_rx_descriptor_status`` and 
``rte_eth_tx_descriptor_status``
   APIs can be used as replacement.
 
+* ethdev: in order to provide support for the GENEVE header variable length
+  options in the rte_flow API, the rte_flow_item_geneve_option item will be
+  introduced.
+
 * ethdev: Some internal APIs for driver usage are exported in the .map file.
   Now DPDK has ``__rte_internal`` marker so we can mark internal APIs and move
   them to the INTERNAL block in .map. Although these APIs are internal it will
-- 
1.8.3.1



Re: [dpdk-dev] [PATCH] doc: announce changes to ethdev rxconf structure

2020-08-03 Thread Slava Ovsiienko
Hi, Jerin,

Thanks for the comment,  please, see below.

> -Original Message-
> From: Jerin Jacob 
> Sent: Monday, August 3, 2020 14:57
> To: Slava Ovsiienko 
> Cc: dpdk-dev ; Matan Azrad ;
> Raslan Darawsheh ; Thomas Monjalon
> ; Ferruh Yigit ; Stephen
> Hemminger ; Andrew Rybchenko
> ; Ajit Khaparde
> ; Maxime Coquelin
> ; Olivier Matz ;
> David Marchand 
> Subject: Re: [PATCH] doc: announce changes to ethdev rxconf structure
> 
> On Mon, Aug 3, 2020 at 4:28 PM Viacheslav Ovsiienko
>  wrote:
> >
> > The DPDK datapath in the transmit direction is very flexible.
> > The applications can build multisegment packets and manages almost all
> > data aspects - the memory pools where segments are allocated from, the
> > segment lengths, the memory attributes like external, registered, etc.
> >
> > In the receiving direction, the datapath is much less flexible, the
> > applications can only specify the memory pool to configure the
> > receiving queue and nothing more. In order to extend the receiving
> > datapath capabilities it is proposed to add the new fields into
> > rte_eth_rxconf structure:
> >
> > struct rte_eth_rxconf {
> > ...
> > uint16_t rx_split_num; /* number of segments to split */
> > uint16_t *rx_split_len; /* array of segment lengthes */
> > struct rte_mempool **mp; /* array of segment memory pools */
> 
> The pool has the packet length it's been configured for.
> So I think, rx_split_len can be removed.

Yes, it is one of the supposed options - if the pointer to the array of segment
lengths is NULL, the queue_setup() could use the lengths from the pool's properties.
But we are talking about packet split; in general, it should not depend
on pool properties. What if the application provides a single pool
and just wants to have the tunnel header in the first dedicated mbuf?

> 
> This feature also available in Marvell HW. So it not specific to one vendor.
> Maybe we could just the use case mention the use case in the depreciation
> notice and the tentative change in rte_eth_rxconf and exact details can be
> worked out at the time of implementation.
> 
So, if I understand correctly, the struct changes in the commit message
should be marked as just a possible implementation?

With best regards,
Slava

> With the above change:
> Acked-by: Jerin Jacob 
> 
> 
> > ...
> > };
> >
> > The non-zero value of rx_split_num field configures the receiving
> > queue to split ingress packets into multiple segments to the mbufs
> > allocated from various memory pools according to the specified
> > lengths. The zero value of rx_split_num field provides the backward
> > compatibility and queue should be configured in a regular way (with
> > single/multiple mbufs of the same data buffer length allocated from
> > the single memory pool).
> >
> > The new approach would allow splitting the ingress packets into
> > multiple parts pushed to the memory with different attributes.
> > For example, the packet headers can be pushed to the embedded data
> > buffers within mbufs and the application data into the external
> > buffers attached to mbufs allocated from the different memory pools.
> > The memory attributes for the split parts may differ either - for
> > example the application data may be pushed into the external memory
> > located on the dedicated physical device, say GPU or NVMe. This would
> > improve the DPDK receiving datapath flexibility preserving
> > compatibility with existing API.
> >
> > Signed-off-by: Viacheslav Ovsiienko 
> > ---
> >  doc/guides/rel_notes/deprecation.rst | 5 +
> >  1 file changed, 5 insertions(+)
> >
> > diff --git a/doc/guides/rel_notes/deprecation.rst
> > b/doc/guides/rel_notes/deprecation.rst
> > index ea4cfa7..cd700ae 100644
> > --- a/doc/guides/rel_notes/deprecation.rst
> > +++ b/doc/guides/rel_notes/deprecation.rst
> > @@ -99,6 +99,11 @@ Deprecation Notices
> >In 19.11 PMDs will still update the field even when the offload is not
> >enabled.
> >
> > +* ethdev: add new fields to ``rte_eth_rxconf`` to configure the
> > +receiving
> > +  queues to split ingress packets into multiple segments according to
> > +the
> > +  specified lengths into the buffers allocated from the specified
> > +  memory pools. The backward compatibility to existing API is preserved.
> > +
> >  * ethdev: ``rx_descriptor_done`` dev_ops and
> ``rte_eth_rx_descriptor_done``
> >will be deprecated in 20.11 and will be removed in 21.11.
> >Existing ``rte_eth_rx_descriptor_status`` and
> > ``rte_eth_tx_descriptor_status``
> > --
> > 1.8.3.1
> >


Re: [dpdk-dev] [PATCH] doc: announce API change in mbuf

2020-08-03 Thread David Marchand
On Fri, Jul 31, 2020 at 6:04 PM Thomas Monjalon  wrote:
>
> In order to prepare for adding more features requiring more space in mbuf,
> some static fields must become dynamic.
> Some small layout changes may have performance benefits as well.
>
> The deprecation notice for atomic refcount is moved and reworded
> to fit below the layout deprecation.
>
> Signed-off-by: Thomas Monjalon 

Acked-by: David Marchand 


-- 
David Marchand



[dpdk-dev] [PATCH] doc: update RSS action with best effort

2020-08-03 Thread Ori Kam
Using the rte_flow action RSS types field
may result in an undefined outcome.

For example: selecting both UDP and TCP, or
selecting the TCP RSS type while the pattern targets UDP traffic.
Another option is that the PMD doesn't support all requested types.

Until now, it wasn't clear what would happen in such cases.
This commit clarifies this issue by stating that the PMD
will work in best-effort mode.
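
For illustration, a conflicting rule of the kind described above could look
like this (sketch using the existing rte_flow API; queue numbers and types are
arbitrary examples):

uint16_t rss_queues[] = { 0, 1, 2, 3 };
struct rte_flow_action_rss rss_conf = {
	.types = ETH_RSS_TCP,              /* conflicts with the UDP pattern below */
	.queue_num = RTE_DIM(rss_queues),
	.queue = rss_queues,
};
struct rte_flow_item pattern[] = {
	{ .type = RTE_FLOW_ITEM_TYPE_ETH },
	{ .type = RTE_FLOW_ITEM_TYPE_IPV4 },
	{ .type = RTE_FLOW_ITEM_TYPE_UDP },
	{ .type = RTE_FLOW_ITEM_TYPE_END },
};
struct rte_flow_action actions[] = {
	{ .type = RTE_FLOW_ACTION_TYPE_RSS, .conf = &rss_conf },
	{ .type = RTE_FLOW_ACTION_TYPE_END },
};
/* The PMD hashes on a best-effort basis, e.g. falling back to IPv4 fields. */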

Signed-off-by: Ori Kam 
---
 doc/guides/prog_guide/rte_flow.rst | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/doc/guides/prog_guide/rte_flow.rst 
b/doc/guides/prog_guide/rte_flow.rst
index 3e5cd1e..d000560 100644
--- a/doc/guides/prog_guide/rte_flow.rst
+++ b/doc/guides/prog_guide/rte_flow.rst
@@ -1735,6 +1735,16 @@ unspecified "best-effort" settings from the underlying 
PMD, which depending
 on the flow rule, may result in anything ranging from empty (single queue)
 to all-inclusive RSS.
 
+Best effort will be used in case there is a conflict inside the rule.
+The conflict can be the result of:
+
+- Conflicting RSS types, for example setting both UDP and TCP.
+
+- A conflict between the RSS types and the requested pattern to match,
+  for example matching on UDP and hashing RSS on TCP.
+
+- Hashing on types that are not supported by the PMD.
+
 Note: RSS hash result is stored in the ``hash.rss`` mbuf field which
 overlaps ``hash.fdir.lo``. Since `Action: MARK`_ sets the ``hash.fdir.hi``
 field only, both can be requested simultaneously.
-- 
1.8.3.1



Re: [dpdk-dev] ***Spam*** [PATCH] doc: announce changes to ethdev rxconf structure

2020-08-03 Thread Andrew Rybchenko
On 8/3/20 1:58 PM, Viacheslav Ovsiienko wrote:
> The DPDK datapath in the transmit direction is very flexible.
> The applications can build multisegment packets and manages
> almost all data aspects - the memory pools where segments
> are allocated from, the segment lengths, the memory attributes
> like external, registered, etc.
>
> In the receiving direction, the datapath is much less flexible,
> the applications can only specify the memory pool to configure
> the receiving queue and nothing more. In order to extend the
> receiving datapath capabilities it is proposed to add the new
> fields into rte_eth_rxconf structure:
>
> struct rte_eth_rxconf {
> ...
> uint16_t rx_split_num; /* number of segments to split */
> uint16_t *rx_split_len; /* array of segment lengthes */
> struct rte_mempool **mp; /* array of segment memory pools */
> ...
> };
>
> The non-zero value of rx_split_num field configures the receiving
> queue to split ingress packets into multiple segments to the mbufs
> allocated from various memory pools according to the specified
> lengths. The zero value of rx_split_num field provides the
> backward compatibility and queue should be configured in a regular
> way (with single/multiple mbufs of the same data buffer length
> allocated from the single memory pool).

From the above description it is not 100% clear how it will
coexist with:
 - existing mb_pool argument of the rte_eth_rx_queue_setup()
 - DEV_RX_OFFLOAD_SCATTER
 - DEV_RX_OFFLOAD_HEADER_SPLIT
How will the application know that the feature is supported? Limitations?
Is it always split by a specified/fixed length?
What happens if the header length is actually different?

> The new approach would allow splitting the ingress packets into
> multiple parts pushed to the memory with different attributes.
> For example, the packet headers can be pushed to the embedded data
> buffers within mbufs and the application data into the external
> buffers attached to mbufs allocated from the different memory
> pools. The memory attributes for the split parts may differ
> either - for example the application data may be pushed into
> the external memory located on the dedicated physical device,
> say GPU or NVMe. This would improve the DPDK receiving datapath
> flexibility preserving compatibility with existing API.
>
> Signed-off-by: Viacheslav Ovsiienko 
> ---
>  doc/guides/rel_notes/deprecation.rst | 5 +
>  1 file changed, 5 insertions(+)
>
> diff --git a/doc/guides/rel_notes/deprecation.rst 
> b/doc/guides/rel_notes/deprecation.rst
> index ea4cfa7..cd700ae 100644
> --- a/doc/guides/rel_notes/deprecation.rst
> +++ b/doc/guides/rel_notes/deprecation.rst
> @@ -99,6 +99,11 @@ Deprecation Notices
>In 19.11 PMDs will still update the field even when the offload is not
>enabled.
>  
> +* ethdev: add new fields to ``rte_eth_rxconf`` to configure the receiving
> +  queues to split ingress packets into multiple segments according to the
> +  specified lengths into the buffers allocated from the specified
> +  memory pools. The backward compatibility to existing API is preserved.
> +
>  * ethdev: ``rx_descriptor_done`` dev_ops and ``rte_eth_rx_descriptor_done``
>will be deprecated in 20.11 and will be removed in 21.11.
>Existing ``rte_eth_rx_descriptor_status`` and 
> ``rte_eth_tx_descriptor_status``



Re: [dpdk-dev] Userspace testing

2020-08-03 Thread Owen Hilyard
On Fri, Jul 31, 2020 at 5:12 AM Burakov, Anatoly 
wrote:

> On 30-Jul-20 5:54 PM, Owen Hilyard wrote:
> > Thanks for the advice.
> >
> > I was wondering about the state of the "Setup VFIO permissions" option
> > in the setup script. It seems to just modify the character device's
> > permissions and then check their memory limit. Should this option also
> > handle the hugepages setup?
>
> I was under the (mis?)impression that the hugepage setup part of the
> script did that?
>
>
It doesn't appear to set them up so that non-root users can use them. From
what I can tell it only creates the pages, but doesn't change permissions
on them in any way.


Re: [dpdk-dev] [PATCH] doc: update RSS action with best effort

2020-08-03 Thread Andrew Rybchenko
On 8/3/20 5:28 PM, Ori Kam wrote:
> Using the rte_flow action RSS types field,
> may result in an undefined outcome.
> 
> For example selecting both UDP and TCP,
> selecting TCP RSS type but the pattern is targeting UDP traffic.
> another option is that the PMD doesn't support all requested types.
> 
> Until now, it wasn't clear what will happen in such cases.
> This commit clarify this issue by stating that the PMD
> will work in the best-effort mode.
> 
> Signed-off-by: Ori Kam 
> ---
>  doc/guides/prog_guide/rte_flow.rst | 10 ++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/doc/guides/prog_guide/rte_flow.rst 
> b/doc/guides/prog_guide/rte_flow.rst
> index 3e5cd1e..d000560 100644
> --- a/doc/guides/prog_guide/rte_flow.rst
> +++ b/doc/guides/prog_guide/rte_flow.rst
> @@ -1735,6 +1735,16 @@ unspecified "best-effort" settings from the underlying 
> PMD, which depending
>  on the flow rule, may result in anything ranging from empty (single queue)
>  to all-inclusive RSS.
>  
> +Best effort will be used, in case there is a conflict inside the rule.
> +The conflict can be the result of:
> +
> +- Conflicting RSS types, for example setting both UDP and TCP.
> +
> +- Conflicting between RSS types and the requested pattern to match,
> +  for example matching on UDP and hashing RSS on TCP.

IMHO it is not a problem at all, but good to clarify anyway.

> +
> +- Hashing on types that are not supported by the PMD.

Shouldn't it return error to the caller?

> +
>  Note: RSS hash result is stored in the ``hash.rss`` mbuf field which
>  overlaps ``hash.fdir.lo``. Since `Action: MARK`_ sets the ``hash.fdir.hi``
>  field only, both can be requested simultaneously.
> 



Re: [dpdk-dev] [v2] vdpa/mlx5: fix queue update synchronization

2020-08-03 Thread Maxime Coquelin



On 8/3/20 1:12 PM, Xueming(Steven) Li wrote:
> Hi Maxime,
> 
>> -Original Message-
>> From: Maxime Coquelin 
>> Sent: Monday, August 3, 2020 5:51 PM
>> To: Xueming(Steven) Li 
>> Cc: dev@dpdk.org; a...@dpdk.org; "Penso
>> Subject: Re: [dpdk-dev] [v2] vdpa/mlx5: fix queue update synchronization
>>
>>
>>
>> On 8/2/20 11:21 AM, Xueming Li wrote:
>>> The driver CQ event management is done by non vhost library thread,
>>> either the dpdk host thread or the internal vDPA driver thread.
>>>
>>> When a queue is updated the CQ may be destroyed and created by the
>>> vhost library thread via the queue state operation.
>>>
>>> When the queue update feature was added, it didn't synchronize the CQ
>>> management to the queue update what may cause invalid memory access.
>>>
>>> Add the aforementioned synchronization by a new per device
>>> configuration mutex.
>>>
>>> Fixes: c47d6e83334e ("vdpa/mlx5: support queue update")
>>>
>>> Signed-off-by: Xueming Li 
>>> Acked-by: Matan Azrad 
>>> ---
>>>  drivers/vdpa/mlx5/mlx5_vdpa.c   | 8 +++-
>>>  drivers/vdpa/mlx5/mlx5_vdpa.h   | 1 +
>>>  drivers/vdpa/mlx5/mlx5_vdpa_event.c | 8 
>>>  3 files changed, 16 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/vdpa/mlx5/mlx5_vdpa.c
>>> b/drivers/vdpa/mlx5/mlx5_vdpa.c index c0b87bcc01..a8f3e4b1de 100644
>>> --- a/drivers/vdpa/mlx5/mlx5_vdpa.c
>>> +++ b/drivers/vdpa/mlx5/mlx5_vdpa.c
>>> @@ -133,6 +133,7 @@ mlx5_vdpa_set_vring_state(int vid, int vring, int state)
>>> struct rte_vdpa_device *vdev = rte_vhost_get_vdpa_device(vid);
>>> struct mlx5_vdpa_priv *priv =
>>> mlx5_vdpa_find_priv_resource_by_vdev(vdev);
>>> +   int ret;
>>>
>>> if (priv == NULL) {
>>> DRV_LOG(ERR, "Invalid vDPA device: %s.", vdev->device-
>>> name); @@
>>> -142,7 +143,10 @@ mlx5_vdpa_set_vring_state(int vid, int vring, int state)
>>> DRV_LOG(ERR, "Too big vring id: %d.", vring);
>>> return -E2BIG;
>>> }
>>> -   return mlx5_vdpa_virtq_enable(priv, vring, state);
>>> +   pthread_mutex_lock(&priv->vq_config_lock);
>>> +   ret = mlx5_vdpa_virtq_enable(priv, vring, state);
>>> +   pthread_mutex_unlock(&priv->vq_config_lock);
>>> +   return ret;
>>>  }
>>>
>>>  static int
>>> @@ -742,6 +746,7 @@ mlx5_vdpa_pci_probe(struct rte_pci_driver *pci_drv
>> __rte_unused,
>>> }
>>> mlx5_vdpa_config_get(pci_dev->device.devargs, priv);
>>> SLIST_INIT(&priv->mr_list);
>>> +   pthread_mutex_init(&priv->vq_config_lock, NULL);
>>> pthread_mutex_lock(&priv_list_lock);
>>> TAILQ_INSERT_TAIL(&priv_list, priv, next);
>>> pthread_mutex_unlock(&priv_list_lock);
>>> @@ -793,6 +798,7 @@ mlx5_vdpa_pci_remove(struct rte_pci_device
>> *pci_dev)
>>> priv->var = NULL;
>>> }
>>> mlx5_glue->close_device(priv->ctx);
>>> +   pthread_mutex_destroy(&priv->vq_config_lock);
>>> rte_free(priv);
>>> }
>>> return 0;
>>> diff --git a/drivers/vdpa/mlx5/mlx5_vdpa.h
>>> b/drivers/vdpa/mlx5/mlx5_vdpa.h index 57044d9d33..462805a352 100644
>>> --- a/drivers/vdpa/mlx5/mlx5_vdpa.h
>>> +++ b/drivers/vdpa/mlx5/mlx5_vdpa.h
>>> @@ -120,6 +120,7 @@ enum {
>>>  struct mlx5_vdpa_priv {
>>> TAILQ_ENTRY(mlx5_vdpa_priv) next;
>>> uint8_t configured;
>>> +   pthread_mutex_t vq_config_lock;
>>> uint64_t last_traffic_tic;
>>> pthread_t timer_tid;
>>> pthread_mutex_t timer_lock;
>>> diff --git a/drivers/vdpa/mlx5/mlx5_vdpa_event.c
>>> b/drivers/vdpa/mlx5/mlx5_vdpa_event.c
>>> index 7dc1ac0fa9..4a8b7b0bd9 100644
>>> --- a/drivers/vdpa/mlx5/mlx5_vdpa_event.c
>>> +++ b/drivers/vdpa/mlx5/mlx5_vdpa_event.c
>>> @@ -274,6 +274,7 @@ mlx5_vdpa_poll_handle(void *arg)
>>>  priv-
>>> event_us;
>>> while (1) {
>>> max = 0;
>>> +   pthread_mutex_lock(&priv->vq_config_lock);
>>> for (i = 0; i < priv->nr_virtqs; i++) {
>>> cq = &priv->virtqs[i].eqp.cq;
>>> if (cq->cq && !cq->armed) {
>>> @@ -298,6 +299,7 @@ mlx5_vdpa_poll_handle(void *arg)
>>> priv->vdev->device->name);
>>> mlx5_vdpa_arm_all_cqs(priv);
>>> pthread_mutex_lock(&priv->timer_lock);
>>> +   pthread_mutex_unlock(&priv-
>>> vq_config_lock);
>>
>> Is it mandatory to hold timer_lock before releasing vq_config_lock?
>> If not, swapping would be maybe safer.
> 
> Yes, could you please replace lines in integration? We don't care according 
> the lock usage in cq handling.

OK, will do the change while applying:
Reviewed-by: Maxime Coquelin 

Thanks,
Maxime
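
For clarity, the agreed swap amounts to releasing the configuration lock
before taking the timer lock in mlx5_vdpa_poll_handle() (sketch based on the
hunk quoted above):

	pthread_mutex_unlock(&priv->vq_config_lock);
	pthread_mutex_lock(&priv->timer_lock);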



Re: [dpdk-dev] [PATCH] vdpa/mlx5: fix virtq unset

2020-08-03 Thread Maxime Coquelin



On 8/3/20 10:25 AM, Matan Azrad wrote:
> When a virtq is destroyed, the SW should be able to continue the virtq
> processing from where the HW stopped.
> 
> The current destroy behavior in the driver saves the virtq state (used
> and available indexes) only when LM is requested.
> So, when LM is not requested the queue state is not saved and the SW
> indexes stay invalid.
> 
> Save the virtq state in the virtq destroy process.
> 
> Fixes: bff735011078 ("vdpa/mlx5: prepare virtio queues")
> Cc: sta...@dpdk.org
> 
> Signed-off-by: Matan Azrad 
> Acked-by: Xueming Li 
> ---
>  drivers/vdpa/mlx5/mlx5_vdpa.h   |  1 +
>  drivers/vdpa/mlx5/mlx5_vdpa_virtq.c | 17 -
>  2 files changed, 13 insertions(+), 5 deletions(-)
> 

Reviewed-by: Maxime Coquelin 

Thanks,
Maxime



[dpdk-dev] [PATCH] doc: announce Vhost dequeue zero-copy removal

2020-08-03 Thread Maxime Coquelin
Vhost-user dequeue zero-copy support will be removed in
20.11. The only known user is OVS where the feature is
still experimental, and has not received any update for
several years. This feature faces reliability issues and
is often conflicting with new features being implemented.

Signed-off-by: Maxime Coquelin 
---

Hi, the topic was discussed during OVS-DPDK public meeting
on July 22nd. Ian, if you had time to discuss the topic with
your team and agree with the removal, please ack the patch.
If not, please let me know.

 doc/guides/rel_notes/deprecation.rst | 5 +
 1 file changed, 5 insertions(+)

diff --git a/doc/guides/rel_notes/deprecation.rst 
b/doc/guides/rel_notes/deprecation.rst
index ea4cfa7a48..a923849419 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -151,3 +151,8 @@ Deprecation Notices
   Python 2 support will be completely removed in 20.11.
   In 20.08, explicit deprecation warnings will be displayed when running
   scripts with Python 2.
+
+* vhost: Vhost-user dequeue zero-copy support will be removed in 20.11. The
+  only known user is OVS where the feature is still experimental, and has not
+  received any update for 2.5 years. This feature faces reliability issues and
+  is often conflicting with new features being implemented.
-- 
2.26.2



Re: [dpdk-dev] [PATCH] doc: announce changes to ethdev rxconf structure

2020-08-03 Thread Slava Ovsiienko
Hi, Andrew

Thanks for the comment, please, see below.

> -Original Message-
> From: Andrew Rybchenko 
> Sent: Monday, August 3, 2020 17:31
> To: Slava Ovsiienko ; dev@dpdk.org
> Cc: Matan Azrad ; Raslan Darawsheh
> ; Thomas Monjalon ;
> ferruh.yi...@intel.com; jerinjac...@gmail.com;
> step...@networkplumber.org; ajit.khapa...@broadcom.com;
> maxime.coque...@redhat.com; olivier.m...@6wind.com;
> david.march...@redhat.com
> Subject: Re: ***Spam*** [PATCH] doc: announce changes to ethdev rxconf
> structure
> 
> On 8/3/20 1:58 PM, Viacheslav Ovsiienko wrote:
> > The DPDK datapath in the transmit direction is very flexible.
> > The applications can build multisegment packets and manages almost all
> > data aspects - the memory pools where segments are allocated from, the
> > segment lengths, the memory attributes like external, registered, etc.
> >
> > In the receiving direction, the datapath is much less flexible, the
> > applications can only specify the memory pool to configure the
> > receiving queue and nothing more. In order to extend the receiving
> > datapath capabilities it is proposed to add the new fields into
> > rte_eth_rxconf structure:
> >
> > struct rte_eth_rxconf {
> > ...
> > uint16_t rx_split_num; /* number of segments to split */
> > uint16_t *rx_split_len; /* array of segment lengthes */
> > struct rte_mempool **mp; /* array of segment memory pools */
> > ...
> > };
> >
> > The non-zero value of rx_split_num field configures the receiving
> > queue to split ingress packets into multiple segments to the mbufs
> > allocated from various memory pools according to the specified
> > lengths. The zero value of rx_split_num field provides the backward
> > compatibility and queue should be configured in a regular way (with
> > single/multiple mbufs of the same data buffer length allocated from
> > the single memory pool).
> 
> From the above description it is not 100% clear how it will coexist with:
>  - existing mb_pool argument of the rte_eth_rx_queue_setup()
>  - DEV_RX_OFFLOAD_SCATTER

The DEV_RX_OFFLOAD_SCATTER flag is required to be reported and configured
for the new feature, to indicate that the application is prepared for
multisegment packets.

But SCATTER just tells that the ingress packet length can exceed
the mbuf data buffer length and a chain of mbufs must be built to store
the entire packet. There is a limitation, though - all mbufs are allocated
from the same memory pool, and all data buffers have the same length.
The new feature provides an opportunity to allocate mbufs from the desired
pools and to specify the length of each buffer/part.

>  - DEV_RX_OFFLOAD_HEADER_SPLIT
The new feature (let's name it "BUFFER_SPLIT") might be supported
in conjunction with HEADER_SPLIT (say, split the rest of the data after the
header) or rejected if HEADER_SPLIT is configured on the port, depending on the
PMD implementation (return ENOTSUP if both features are requested on the same
port).

> How will application know that the feature is supported? Limitations?
It is a subject for further discussion, I see two options:
 - introduce the DEV_RX_OFFLOAD_BUFFER_SPLIT flag
 - return ENOTSUP/EINVAL from rx_queue_setup() if the feature is requested
   (the mp parameter is supposed to be NULL for this case)

> Is it always split by specified/fixed length?
Yes, it is a simple feature: it splits the data into the buffers with the
required memory attributes, provided by the specified pools, according to the
fixed lengths. It should be OK for protocols like eCPRI or some tunneling.

> What happens if header length is actually different?
It is a per-queue configuration; packets might be sorted between the queues
with the rte_flow engine. The supposed use case is to filter out specific
protocol packets (say eCPRI with a fixed header length)
and split them on a specific Rx queue.
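
To make the intended usage concrete, a hypothetical setup of such a queue
could look like this (sketch only: the rx_split_num/rx_split_len/mp fields do
not exist yet, and hdr_pool, data_pool, dev_info, port_id, queue_id, nb_desc
and socket_id are assumed to be prepared by the application):

uint16_t seg_len[] = { 128, 2048 };           /* header part, data part */
struct rte_mempool *seg_mp[] = { hdr_pool, data_pool };
struct rte_eth_rxconf rxconf = dev_info.default_rxconf;

rxconf.rx_split_num = RTE_DIM(seg_len);
rxconf.rx_split_len = seg_len;
rxconf.mp = seg_mp;
rxconf.offloads |= DEV_RX_OFFLOAD_SCATTER;    /* multisegment packets expected */

ret = rte_eth_rx_queue_setup(port_id, queue_id, nb_desc, socket_id,
			     &rxconf, NULL /* per-queue mb_pool unused here */);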


With best regards,
Slava

> 
> > The new approach would allow splitting the ingress packets into
> > multiple parts pushed to the memory with different attributes.
> > For example, the packet headers can be pushed to the embedded data
> > buffers within mbufs and the application data into the external
> > buffers attached to mbufs allocated from the different memory pools.
> > The memory attributes for the split parts may differ either - for
> > example the application data may be pushed into the external memory
> > located on the dedicated physical device, say GPU or NVMe. This would
> > improve the DPDK receiving datapath flexibility preserving
> > compatibility with existing API.
> >
> > Signed-off-by: Viacheslav Ovsiienko 
> > ---
> >  doc/guides/rel_notes/deprecation.rst | 5 +
> >  1 file changed, 5 insertions(+)
> >
> > diff --git a/doc/guides/rel_notes/deprecation.rst
> > b/doc/guides/rel_notes/deprecation.rst
> > index ea4cfa7..cd700ae 100644
> > --- a/doc/guides/rel_notes/deprecation.rst
> > +++ b/doc/guides/rel_notes/deprecation.rst
> > @@ -99,6 +99,11 @@ Deprecation Notices
> >In 19.11 PMDs will still update the field even when

Re: [dpdk-dev] [PATCH] doc: update RSS action with best effort

2020-08-03 Thread Ori Kam
Hi Andrew,

> -Original Message-
> From: Andrew Rybchenko 
> 
> On 8/3/20 5:28 PM, Ori Kam wrote:
> > Using the rte_flow action RSS types field,
> > may result in an undefined outcome.
> >
> > For example selecting both UDP and TCP,
> > selecting TCP RSS type but the pattern is targeting UDP traffic.
> > another option is that the PMD doesn't support all requested types.
> >
> > Until now, it wasn't clear what will happen in such cases.
> > This commit clarify this issue by stating that the PMD
> > will work in the best-effort mode.
> >
> > Signed-off-by: Ori Kam 
> > ---
> >  doc/guides/prog_guide/rte_flow.rst | 10 ++
> >  1 file changed, 10 insertions(+)
> >
> > diff --git a/doc/guides/prog_guide/rte_flow.rst
> b/doc/guides/prog_guide/rte_flow.rst
> > index 3e5cd1e..d000560 100644
> > --- a/doc/guides/prog_guide/rte_flow.rst
> > +++ b/doc/guides/prog_guide/rte_flow.rst
> > @@ -1735,6 +1735,16 @@ unspecified "best-effort" settings from the
> underlying PMD, which depending
> >  on the flow rule, may result in anything ranging from empty (single queue)
> >  to all-inclusive RSS.
> >
> > +Best effort will be used, in case there is a conflict inside the rule.
> > +The conflict can be the result of:
> > +
> > +- Conflicting RSS types, for example setting both UDP and TCP.
> > +
> > +- Conflicting between RSS types and the requested pattern to match,
> > +  for example matching on UDP and hashing RSS on TCP.
> 
> IMHO it is not a problem at all, but good to clarify anyway.
> 
Agreed, this patch is just for clarification.

> > +
> > +- Hashing on types that are not supported by the PMD.
> 
> Shouldn't it return error to the caller?
> 
That depends. If, for example, the application requested ETH and IPv4,
and the PMD can do only IPv4, then according to this the PMD will just do IPv4.

> > +
> >  Note: RSS hash result is stored in the ``hash.rss`` mbuf field which
> >  overlaps ``hash.fdir.lo``. Since `Action: MARK`_ sets the ``hash.fdir.hi``
> >  field only, both can be requested simultaneously.
> >



Re: [dpdk-dev] [PATCH] doc: announce changes to ethdev rxconf structure

2020-08-03 Thread Andrew Rybchenko
Hi Slava,

On 8/3/20 6:18 PM, Slava Ovsiienko wrote:
> Hi, Andrew
> 
> Thanks for the comment, please, see below.
> 
>> -Original Message-
>> From: Andrew Rybchenko 
>> Sent: Monday, August 3, 2020 17:31
>> To: Slava Ovsiienko ; dev@dpdk.org
>> Cc: Matan Azrad ; Raslan Darawsheh
>> ; Thomas Monjalon ;
>> ferruh.yi...@intel.com; jerinjac...@gmail.com;
>> step...@networkplumber.org; ajit.khapa...@broadcom.com;
>> maxime.coque...@redhat.com; olivier.m...@6wind.com;
>> david.march...@redhat.com
>> Subject: Re: ***Spam*** [PATCH] doc: announce changes to ethdev rxconf
>> structure
>>
>> On 8/3/20 1:58 PM, Viacheslav Ovsiienko wrote:
>>> The DPDK datapath in the transmit direction is very flexible.
>>> The applications can build multisegment packets and manages almost all
>>> data aspects - the memory pools where segments are allocated from, the
>>> segment lengths, the memory attributes like external, registered, etc.
>>>
>>> In the receiving direction, the datapath is much less flexible, the
>>> applications can only specify the memory pool to configure the
>>> receiving queue and nothing more. In order to extend the receiving
>>> datapath capabilities it is proposed to add the new fields into
>>> rte_eth_rxconf structure:
>>>
>>> struct rte_eth_rxconf {
>>> ...
>>> uint16_t rx_split_num; /* number of segments to split */
>>> uint16_t *rx_split_len; /* array of segment lengthes */
>>> struct rte_mempool **mp; /* array of segment memory pools */
>>> ...
>>> };
>>>
>>> The non-zero value of rx_split_num field configures the receiving
>>> queue to split ingress packets into multiple segments to the mbufs
>>> allocated from various memory pools according to the specified
>>> lengths. The zero value of rx_split_num field provides the backward
>>> compatibility and queue should be configured in a regular way (with
>>> single/multiple mbufs of the same data buffer length allocated from
>>> the single memory pool).
>>
>> From the above description it is not 100% clear how it will coexist with:
>>  - existing mb_pool argument of the rte_eth_rx_queue_setup()
>>  - DEV_RX_OFFLOAD_SCATTER
> 
> DEV_RX_OFFLOAD_SCATTER flag is required to be reported and configured
> for the new feature to indicate the application is prepared for the 
> multisegment packets.

I hope it will be mentioned in the feature documentation in the future,
but I'm not 100% sure that it is required. See below.

> 
> But SCATTER it just tells that ingress packet length can exceed
> the mbuf data buffer length and the chain of mbufs must be built to store
> the entire packet. But there is the limitation - all mbufs are allocated
>  from the same memory pool, and all data buffers have the same length.
> The new feature provides an opportunity to allocated mbufs from the desired
> pools and specifies the length of each buffer/part.

Yes, it is clear, but what happens if a packet does not fit into
the provided pools chain? Is the last one used many times? Maybe
it is logical to use the Rx queue setup mb_pool as well for this
purpose, i.e. use the pools suggested here only once and use
mb_pool many times for the rest if SCATTER is supported, and
only once if SCATTER is not supported.

> 
>>  - DEV_RX_OFFLOAD_HEADER_SPLIT
> The new feature (let's name it as "BUFFER_SPLIT") might be supported
> in conjunction with HEADER_SPLIT (say, split the rest of the data after the 
> header)
> or rejected if HEADER_SPLIT is configured on the port, depending on PMD
> implementation (return ENOTSUP if both features are requested on the same 
> port).

OK, consider making SCATTER and BUFFER_SPLIT independent as
suggested above.

> 
>> How will application know that the feature is supported? Limitations?
> It is subject for further discussion, I see two options:
>  - introduce the DEV_RX_OFFLOAD_BUFFER_SPLIT flag

+1

> - return ENOTSUP/EINVAL from rx_queue_setup() if feature is requested
>   (mp parameter is supposed to be NULL for the case)

I'd say that it should be used for corner cases only which are
hard to formalize. It could be important to know maximum
number of buffers to split, total length which could be split
from the remaining, limitations on split lengths.

> 
>> Is it always split by specified/fixed length?
> Yes, it is simple feature, it splits the data to the buffers with required
> memory attributes provided by specified pools according to the fixed lengths.
> It should be OK for protocols like eCPRI or some tunneling.

I see. Thanks.

> 
>> What happens if header length is actually different?
> It is per queue configuration, packets might be sorted with rte_flow engine 
> between the queues.
> The supposed use case is to filter out specific protocol packets (say eCPRI 
> with fixed header length)
> and split ones on specific Rx queue.

Got it.

Thanks,
Andrew.

> 
> 
> With best regards,
> Slava
> 
>>
>>> The new approach would allow splitting the ingress packets into
>>> multiple parts pushed to the memory with different attributes.
>>> For exa

[dpdk-dev] [PATCH] doc: announce deprecation of port mirroring API

2020-08-03 Thread Thomas Monjalon
A new API is planned to be introduced for sampling and mirroring
with rte_flow. It should be more generic and allow more use cases.

This deprecation is to show the direction, avoiding overlapping APIs.

Signed-off-by: Thomas Monjalon 
---
 doc/guides/rel_notes/deprecation.rst | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/doc/guides/rel_notes/deprecation.rst 
b/doc/guides/rel_notes/deprecation.rst
index 6e12e57a13..ca0d8af96a 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -103,6 +103,12 @@ Deprecation Notices
   Existing ``rte_eth_rx_descriptor_status`` and 
``rte_eth_tx_descriptor_status``
   APIs can be used as replacement.
 
+* ethdev: The port mirroring API can be replaced with a more fine grain flow 
API.
+  The structs ``rte_eth_mirror_conf``, ``rte_eth_vlan_mirror`` and the 
functions
+  ``rte_eth_mirror_rule_set``, ``rte_eth_mirror_rule_reset`` will be marked
+  as deprecated in DPDK 20.11, along with the associated macros 
``ETH_MIRROR_*``.
+  This API will be fully removed in DPDK 21.11.
+
 * ethdev: Some internal APIs for driver usage are exported in the .map file.
   Now DPDK has ``__rte_internal`` marker so we can mark internal APIs and move
   them to the INTERNAL block in .map. Although these APIs are internal it will
-- 
2.27.0



Re: [dpdk-dev] [PATCH] doc: update RSS action with best effort

2020-08-03 Thread Andrew Rybchenko
On 8/3/20 6:22 PM, Ori Kam wrote:
> Hi Andrew,
>
>> -Original Message-
>> From: Andrew Rybchenko 
>>
>> On 8/3/20 5:28 PM, Ori Kam wrote:
>>> Using the rte_flow action RSS types field,
>>> may result in an undefined outcome.
>>>
>>> For example selecting both UDP and TCP,
>>> selecting TCP RSS type but the pattern is targeting UDP traffic.
>>> another option is that the PMD doesn't support all requested types.
>>>
>>> Until now, it wasn't clear what will happen in such cases.
>>> This commit clarify this issue by stating that the PMD
>>> will work in the best-effort mode.
>>>
>>> Signed-off-by: Ori Kam 
>>> ---
>>>  doc/guides/prog_guide/rte_flow.rst | 10 ++
>>>  1 file changed, 10 insertions(+)
>>>
>>> diff --git a/doc/guides/prog_guide/rte_flow.rst
>> b/doc/guides/prog_guide/rte_flow.rst
>>> index 3e5cd1e..d000560 100644
>>> --- a/doc/guides/prog_guide/rte_flow.rst
>>> +++ b/doc/guides/prog_guide/rte_flow.rst
>>> @@ -1735,6 +1735,16 @@ unspecified "best-effort" settings from the
>> underlying PMD, which depending
>>>  on the flow rule, may result in anything ranging from empty (single queue)
>>>  to all-inclusive RSS.
>>>
>>> +Best effort will be used, in case there is a conflict inside the rule.
>>> +The conflict can be the result of:
>>> +
>>> +- Conflicting RSS types, for example setting both UDP and TCP.
>>> +
>>> +- Conflicting between RSS types and the requested pattern to match,
>>> +  for example matching on UDP and hashing RSS on TCP.
>> IMHO it is not a problem at all, but good to clarify anyway.
>>
> Agree this patch is just for clarification.
>
>>> +
>>> +- Hashing on types that are not supported by the PMD.
>> Shouldn't it return error to the caller?
>>
> That’s depends, if for example the application requested eth and IPv4,
> and the PMD can do only IPv4 so according to this, the PMD will just do IPv4.

I think it is bad behaviour. The application has the required information to
provide correct parameters, and therefore incorrect ones should be rejected.
Am I missing something?

>>> +
>>>  Note: RSS hash result is stored in the ``hash.rss`` mbuf field which
>>>  overlaps ``hash.fdir.lo``. Since `Action: MARK`_ sets the ``hash.fdir.hi``
>>>  field only, both can be requested simultaneously.
>>>



Re: [dpdk-dev] [PATCH] doc: announce deprecation of port mirroring API

2020-08-03 Thread Jerin Jacob
On Mon, Aug 3, 2020 at 9:03 PM Thomas Monjalon  wrote:
>
> A new API is planned to be introduced for sampling and mirroring
> with rte_flow. It should be more generic and allow more use cases.
>
> This deprecation is to show the direction, avoiding overlapping APIs.
>
> Signed-off-by: Thomas Monjalon 

Acked-by: Jerin Jacob 


> ---
>  doc/guides/rel_notes/deprecation.rst | 6 ++
>  1 file changed, 6 insertions(+)
>
> diff --git a/doc/guides/rel_notes/deprecation.rst 
> b/doc/guides/rel_notes/deprecation.rst
> index 6e12e57a13..ca0d8af96a 100644
> --- a/doc/guides/rel_notes/deprecation.rst
> +++ b/doc/guides/rel_notes/deprecation.rst
> @@ -103,6 +103,12 @@ Deprecation Notices
>Existing ``rte_eth_rx_descriptor_status`` and 
> ``rte_eth_tx_descriptor_status``
>APIs can be used as replacement.
>
> +* ethdev: The port mirroring API can be replaced with a more fine grain flow 
> API.
> +  The structs ``rte_eth_mirror_conf``, ``rte_eth_vlan_mirror`` and the 
> functions
> +  ``rte_eth_mirror_rule_set``, ``rte_eth_mirror_rule_reset`` will be marked
> +  as deprecated in DPDK 20.11, along with the associated macros 
> ``ETH_MIRROR_*``.
> +  This API will be fully removed in DPDK 21.11.
> +
>  * ethdev: Some internal APIs for driver usage are exported in the .map file.
>Now DPDK has ``__rte_internal`` marker so we can mark internal APIs and 
> move
>them to the INTERNAL block in .map. Although these APIs are internal it 
> will
> --
> 2.27.0
>


Re: [dpdk-dev] [PATCH] doc: announce deprecation of port mirroring API

2020-08-03 Thread Andrew Rybchenko
On 8/3/20 6:33 PM, Thomas Monjalon wrote:
> A new API is planned to be introduced for sampling and mirroring
> with rte_flow. It should be more generic and allow more use cases.
>
> This deprecation is to show the direction, avoiding overlapping APIs.
>
> Signed-off-by: Thomas Monjalon 

I like the idea. Not everything is 100% clear, but the direction is right.

Acked-by: Andrew Rybchenko 



Re: [dpdk-dev] [PATCH] doc: update RSS action with best effort

2020-08-03 Thread David Marchand
On Mon, Aug 3, 2020 at 5:23 PM Ori Kam  wrote:
> > > +
> > > +- Hashing on types that are not supported by the PMD.
> >
> > Shouldn't it return error to the caller?
> >
> That’s depends, if for example the application requested eth and IPv4,
> and the PMD can do only IPv4 so according to this, the PMD will just do IPv4.

We should have some validation in ethdev to catch this.
Is it not the case?


-- 
David Marchand



Re: [dpdk-dev] [PATCH] doc: update RSS action with best effort

2020-08-03 Thread Ori Kam


> -Original Message-
> From: Andrew Rybchenko 
> 
> On 8/3/20 6:22 PM, Ori Kam wrote:
> > Hi Andrew,
> >
> >> -Original Message-
> >> From: Andrew Rybchenko 
> >>
> >> On 8/3/20 5:28 PM, Ori Kam wrote:
> >>> Using the rte_flow action RSS types field,
> >>> may result in an undefined outcome.
> >>>
> >>> For example selecting both UDP and TCP,
> >>> selecting TCP RSS type but the pattern is targeting UDP traffic.
> >>> another option is that the PMD doesn't support all requested types.
> >>>
> >>> Until now, it wasn't clear what will happen in such cases.
> >>> This commit clarify this issue by stating that the PMD
> >>> will work in the best-effort mode.
> >>>
> >>> Signed-off-by: Ori Kam 
> >>> ---
> >>>  doc/guides/prog_guide/rte_flow.rst | 10 ++
> >>>  1 file changed, 10 insertions(+)
> >>>
> >>> diff --git a/doc/guides/prog_guide/rte_flow.rst
> >> b/doc/guides/prog_guide/rte_flow.rst
> >>> index 3e5cd1e..d000560 100644
> >>> --- a/doc/guides/prog_guide/rte_flow.rst
> >>> +++ b/doc/guides/prog_guide/rte_flow.rst
> >>> @@ -1735,6 +1735,16 @@ unspecified "best-effort" settings from the
> >> underlying PMD, which depending
> >>>  on the flow rule, may result in anything ranging from empty (single 
> >>> queue)
> >>>  to all-inclusive RSS.
> >>>
> >>> +Best effort will be used, in case there is a conflict inside the rule.
> >>> +The conflict can be the result of:
> >>> +
> >>> +- Conflicting RSS types, for example setting both UDP and TCP.
> >>> +
> >>> +- Conflicting between RSS types and the requested pattern to match,
> >>> +  for example matching on UDP and hashing RSS on TCP.
> >> IMHO it is not a problem at all, but good to clarify anyway.
> >>
> > Agree this patch is just for clarification.
> >
> >>> +
> >>> +- Hashing on types that are not supported by the PMD.
> >> Shouldn't it return error to the caller?
> >>
> > That’s depends, if for example the application requested eth and IPv4,
> > and the PMD can do only IPv4 so according to this, the PMD will just do 
> > IPv4.
> 
> I think it is a bad behaviour. Application has required information to
> provide correct parameters and therefore incorrect should be rejected.
> Am I missing something?
> 
In rte_flow you can't know which parameters are supported.
One option is to fail the rule, but this means failing the rule even if some of
the types are supported.
Or we can choose to fail only if all the types are unsupported, but this will
result in a mismatch between adding two types and adding just one type.
For example, the PMD doesn't support RSS on ETH and the application requests
RSS on ETH and IPv4, so the PMD will do IPv4 RSS, while if the user selects
only ETH the rule will fail.
This is exactly the same case as the mismatch where the application requests
something that the PMD can't supply.
In any case, I think this is the current behavior of the PMDs, so this just
clarifies it.

> >>> +
> >>>  Note: RSS hash result is stored in the ``hash.rss`` mbuf field which
> >>>  overlaps ``hash.fdir.lo``. Since `Action: MARK`_ sets the 
> >>> ``hash.fdir.hi``
> >>>  field only, both can be requested simultaneously.
> >>>



Re: [dpdk-dev] [PATCH] doc: update RSS action with best effort

2020-08-03 Thread Ori Kam
Hi David,

> -Original Message-
> From: David Marchand 
> 
> On Mon, Aug 3, 2020 at 5:23 PM Ori Kam  wrote:
> > > > +
> > > > +- Hashing on types that are not supported by the PMD.
> > >
> > > Shouldn't it return error to the caller?
> > >
> > That’s depends, if for example the application requested eth and IPv4,
> > and the PMD can do only IPv4 so according to this, the PMD will just do 
> > IPv4.
> 
> We should have some validation in ethdev to catch this.
> Is it not the case?
> 
Like I said in my reply to Andrew, in rte_flow we don't have such caps.
Maybe we should think about adding them for the RSS case, but again it is tricky:
what if one RSS type is supported only depending on the flow or on other types?

> 
> --
> David Marchand



Re: [dpdk-dev] [v2] vdpa/mlx5: fix queue update synchronization

2020-08-03 Thread Maxime Coquelin



On 8/2/20 11:21 AM, Xueming Li wrote:
> The driver CQ event management is done by non vhost library thread,
> either the dpdk host thread or the internal vDPA driver thread.
> 
> When a queue is updated the CQ may be destroyed and created by the vhost
> library thread via the queue state operation.
> 
> When the queue update feature was added, it didn't synchronize the CQ
> management to the queue update what may cause invalid memory access.
> 
> Add the aforementioned synchronization by a new per device configuration
> mutex.
> 
> Fixes: c47d6e83334e ("vdpa/mlx5: support queue update")
> 
> Signed-off-by: Xueming Li 
> Acked-by: Matan Azrad 
> ---
>  drivers/vdpa/mlx5/mlx5_vdpa.c   | 8 +++-
>  drivers/vdpa/mlx5/mlx5_vdpa.h   | 1 +
>  drivers/vdpa/mlx5/mlx5_vdpa_event.c | 8 
>  3 files changed, 16 insertions(+), 1 deletion(-)
> 

Applied to dpdk-next-virtio/master with suggested change.

Thanks,
Maxime



Re: [dpdk-dev] [PATCH] doc: update RSS action with best effort

2020-08-03 Thread Andrew Rybchenko
On 8/3/20 6:49 PM, Ori Kam wrote:
> Hi David,
>
>> -Original Message-
>> From: David Marchand 
>>
>> On Mon, Aug 3, 2020 at 5:23 PM Ori Kam  wrote:
> +
> +- Hashing on types that are not supported by the PMD.
 Shouldn't it return error to the caller?

>>> That’s depends, if for example the application requested eth and IPv4,
>>> and the PMD can do only IPv4 so according to this, the PMD will just do 
>>> IPv4.
>> We should have some validation in ethdev to catch this.
>> Is it not the case?
>>
> Like I said in my reply to Andrew, in rte_flow we don't have such caps.
> Maybe we should think about adding them for the RSS case, but again it is 
> tricky
> What if one RSS type is supported depending on the flow or other types?

Also I'd like to add that the ethdev layer is a dummy for the rte_flow API.
It does not parse patterns/actions etc. Maybe we should change that one day.



Re: [dpdk-dev] [PATCH] vdpa/mlx5: fix virtq unset

2020-08-03 Thread Maxime Coquelin



On 8/3/20 10:25 AM, Matan Azrad wrote:
> When a virtq is destroyed, the SW should be able to continue the virtq
> processing from where the HW stopped.
> 
> The current destroy behavior in the driver saves the virtq state (used
> and available indexes) only when LM is requested.
> So, when LM is not requested the queue state is not saved and the SW
> indexes stay invalid.
> 
> Save the virtq state in the virtq destroy process.
> 
> Fixes: bff735011078 ("vdpa/mlx5: prepare virtio queues")
> Cc: sta...@dpdk.org
> 
> Signed-off-by: Matan Azrad 
> Acked-by: Xueming Li 
> ---
>  drivers/vdpa/mlx5/mlx5_vdpa.h   |  1 +
>  drivers/vdpa/mlx5/mlx5_vdpa_virtq.c | 17 -
>  2 files changed, 13 insertions(+), 5 deletions(-)

Applied to dpdk-next-virtio/master.

Thanks,
Maxime



Re: [dpdk-dev] [PATCH] doc: update RSS action with best effort

2020-08-03 Thread Andrew Rybchenko
On 8/3/20 6:47 PM, Ori Kam wrote:
> 
> 
>> -Original Message-
>> From: Andrew Rybchenko 
>>
>> On 8/3/20 6:22 PM, Ori Kam wrote:
>>> Hi Andrew,
>>>
 -Original Message-
 From: Andrew Rybchenko 

 On 8/3/20 5:28 PM, Ori Kam wrote:
> Using the rte_flow action RSS types field,
> may result in an undefined outcome.
>
> For example selecting both UDP and TCP,
> selecting TCP RSS type but the pattern is targeting UDP traffic.
> another option is that the PMD doesn't support all requested types.
>
> Until now, it wasn't clear what will happen in such cases.
> This commit clarify this issue by stating that the PMD
> will work in the best-effort mode.
>
> Signed-off-by: Ori Kam 
> ---
>  doc/guides/prog_guide/rte_flow.rst | 10 ++
>  1 file changed, 10 insertions(+)
>
> diff --git a/doc/guides/prog_guide/rte_flow.rst
 b/doc/guides/prog_guide/rte_flow.rst
> index 3e5cd1e..d000560 100644
> --- a/doc/guides/prog_guide/rte_flow.rst
> +++ b/doc/guides/prog_guide/rte_flow.rst
> @@ -1735,6 +1735,16 @@ unspecified "best-effort" settings from the
 underlying PMD, which depending
>  on the flow rule, may result in anything ranging from empty (single 
> queue)
>  to all-inclusive RSS.
>
> +Best effort will be used, in case there is a conflict inside the rule.
> +The conflict can be the result of:
> +
> +- Conflicting RSS types, for example setting both UDP and TCP.
> +
> +- Conflicting between RSS types and the requested pattern to match,
> +  for example matching on UDP and hashing RSS on TCP.
 IMHO it is not a problem at all, but good to clarify anyway.

>>> Agree this patch is just for clarification.
>>>
> +
> +- Hashing on types that are not supported by the PMD.
 Shouldn't it return error to the caller?

>>> That’s depends, if for example the application requested eth and IPv4,
>>> and the PMD can do only IPv4 so according to this, the PMD will just do 
>>> IPv4.
>>
>> I think it is a bad behaviour. Application has required information to
>> provide correct parameters and therefore incorrect should be rejected.
>> Am I missing something?
>>
> In RTE flow you can't know which parameters are supported.
> One option is to fail the rule, but this means failing the rule even if some
> of the types are supported.
> Or we can choose to fail only if all the types are unsupported, but this will
> result in a mismatch between adding two types and adding just one type.
> For example, the PMD doesn't support RSS on eth and the application requests
> RSS on ETH and IPv4, so the PMD will do IPv4 RSS, while if the user selects
> only ETH the rule will fail.
> This is exactly the same mismatch case: the application requests something
> that the PMD can't supply.
> In any case I think this is the current behavior of the PMDs, so this just
> clarifies it.

I'm sorry, but the PMD is hardly applicable here.
Do you mean the "mlx5" PMD? Or have you reviewed all PMDs?

No, I understand that the topic is more complicated, but
covering complex cases should not spoil the simple ones.
I think that the information provided in dev_info should
specify all supported protocols/fields.
Yes, it is acceptable to behave best-effort if the requested
field is supported in general but not for this particular flow.
Please, make it a bit clearer in the suggested description.

> 
> +
>  Note: RSS hash result is stored in the ``hash.rss`` mbuf field which
>  overlaps ``hash.fdir.lo``. Since `Action: MARK`_ sets the 
> ``hash.fdir.hi``
>  field only, both can be requested simultaneously.
>
> 



Re: [dpdk-dev] [PATCH] doc: update RSS action with best effort

2020-08-03 Thread Thomas Monjalon
03/08/2020 17:55, Andrew Rybchenko:
> On 8/3/20 6:49 PM, Ori Kam wrote:
> > From: David Marchand 
> >> On Mon, Aug 3, 2020 at 5:23 PM Ori Kam  wrote:
> > +
> > +- Hashing on types that are not supported by the PMD.
>  Shouldn't it return error to the caller?
> 
> >>> That’s depends, if for example the application requested eth and IPv4,
> >>> and the PMD can do only IPv4 so according to this, the PMD will just do 
> >>> IPv4.
> >> We should have some validation in ethdev to catch this.
> >> Is it not the case?
> >>
> > Like I said in my reply to Andrew, in rte_flow we don't have such caps.
> > Maybe we should think about adding them for the RSS case, but again it is 
> > tricky
> > What if one RSS type is supported depending on the flow or other types?
> 
> Also I'd like to add that ethdev layer is dummy for rte_flow API.
> It does not parse pattern/actions etc. May be should change it one day.

For now, what we can do is to have "best effort" (sic) checks.
In the case we are 100% sure a rule cannot be fully supported,
I think we should reject the rule.
If there are more complex cases to handle, we can accept the rule
and support it with "best effort" handling.

The other appropriate action is to implement tests in DTS
to check the behaviour of the PMDs, validating the compliance
of the accepted rules with real flows.




Re: [dpdk-dev] [20.11, PATCH v2] baseband/fpga_5gnr_fec: add companion PF config App

2020-08-03 Thread Chautru, Nicolas
Hi Maxime, Thomas, 

> From: Maxime Coquelin 
> Hi Nicolas,
> 
> On 7/31/20 5:17 PM, Chautru, Nicolas wrote:
> > Hi Maxime,
> >
> >>
> >> Hi Nicolas,
> >>
> >> On 7/16/20 10:20 PM, Nicolas Chautru wrote:
> >>> Adding companion application to configure HW Device from the PF.
> >>> Then the device can be accessed through BBDEV from VF (or PF).
> >>>
> >>> Signed-off-by: Nicolas Chautru 
> >>> ---
> >>>  doc/guides/bbdevs/fpga_5gnr_fec.rst|  80 +++--
> >>>  .../baseband/fpga_5gnr_fec/pf_config_app/Makefile  |  36 +++
> >>>  .../fpga_5gnr_fec/pf_config_app/config_app.c   | 333
> >> +++
> >>>  .../pf_config_app/fpga_5gnr_cfg_app.c  | 351
> >> +
> >>>  .../pf_config_app/fpga_5gnr_cfg_app.h  | 102 ++
> >>>  .../pf_config_app/fpga_5gnr_cfg_parser.c   | 187 +++
> >>>  .../pf_config_app/fpga_5gnr_config.cfg |  18 ++
> >>>  7 files changed, 1087 insertions(+), 20 deletions(-)  create mode
> >>> 100644 drivers/baseband/fpga_5gnr_fec/pf_config_app/Makefile
> >>>  create mode 100644
> >>> drivers/baseband/fpga_5gnr_fec/pf_config_app/config_app.c
> >>>  create mode 100644
> >>> drivers/baseband/fpga_5gnr_fec/pf_config_app/fpga_5gnr_cfg_app.c
> >>>  create mode 100644
> >>> drivers/baseband/fpga_5gnr_fec/pf_config_app/fpga_5gnr_cfg_app.h
> >>>  create mode 100644
> >>>
> drivers/baseband/fpga_5gnr_fec/pf_config_app/fpga_5gnr_cfg_parser.c
> >>>  create mode 100644
> >>> drivers/baseband/fpga_5gnr_fec/pf_config_app/fpga_5gnr_config.cfg
> >>
> >> I think having the pf_config_app in the driver directory is not a
> >> good idea, this is not the place for applications.
> >>
> >> Also, it is not integrated in the DPDK build system, so it cannot
> >> benefit from the CI. Having an external dependency that is not
> >> packaged in distributions will not help to have it integrated in the build
> system.
> >>
> >
> > Thanks for sharing.
> > Note that all these points were raised openly explicitly earlier as you 
> > know,
> ie part of both pros and cons.
> > Still happy to get feedback from others notably Thomas. It appears you had
> side conversations with him on this very topic.
> >
> >> I see some alternatives:
> >> 1. Move it in another directory in the main DPDK repo, but it is not
> >> a DPDK example, not a dev tool and not a build tool, so it would need
> >> a new directory.
> >> 2. Create a BBDEV tools repository on dpdk.org (It would require
> >> techboard approval).
> >> 3. Host it in a dedicated repository on Intel's github 4. Move it
> >> into some Intel FPGA tools repository
> >
> > There are several others options which were indeed considered in case this
> option was not viable.
> > Still DPDK was considered best option so far to keep everything in one
> recognized place for BBDEV devices but happy to get further input from
> others.
> >
> >> I think option 3 would be the best to get it packaged into
> >> distributions as it has no build dependency with any DPDK library.
> >> You could maybe add inih library as a git sub-repository within this repo.
> >> Other advantage is you wouldn't depend on DPDK release cycles to get
> >> fixes merged.
> >>
> >> Regarding the tool itself, I understand from the commit message that
> >> the tool has a dependency on the BBDEV PMD version, but the tool run
> >> on the host while the PMD driver is used in the guest/container. So
> >> having it in the driver directory will not really help, as host DPDK
> >> (if any) and guest DPDK may come from different parties.
> >
> > Yes this was captured earlier, purely stored there as a companion
> > application for a given version of the PMD (ie. different subdirectories for
> each PMD directory).
> > They do no run in the same container for deployment and are not built at
> the same time indeed, their interface is the HW really and one being needed
> to be run prior to the other one to be functional.
> >
> >> One question I have is whether this is the tool itself that has a
> >> dependency on the PMD, or just the config file?
> >
> > Each PMD directory would have its own version of the companion PF
> config application.
> > Ie. the patch above is only for baseband/fpga_5gnr_fec ie. Intel Vista Creek
> with 5G LDPC user image.
> 
> OK. Does it mean the same application and configuration will work for
> baseband/fpga_5gnr_fec PMD versions v20.11, v21.02, v21.05, etc, ...?
> 
> If not, is there a way for the PMD driver to detect whether a wrong
> configuration was applied? Something like checking the FW version of a NIC
> is supported by the PMD driver.
> 

As mentioned above, there is no API between the PF config companion app and 
BBDEV/DPDK: they logically belong to two different entities (possibly 
containers) with no shared interface except, indirectly, the HW. 
There is no strict API which could be broken between the two, just two SW 
ingredients that ideally should be aligned to avoid discrepancies during 
deployment or validation.

Re: [dpdk-dev] [PATCH] doc: announce API change in timer

2020-08-03 Thread Carrillo, Erik G
> -Original Message-
> From: Sarosh Arif 
> Sent: Monday, August 3, 2020 6:21 AM
> To: Carrillo, Erik G ; rsanf...@akamai.com
> Cc: dev@dpdk.org; Sarosh Arif 
> Subject: [PATCH] doc: announce API change in timer
> 
> If the user tries to reset/stop some other timer in it's callback function, 
> which
> is also about to expire, using rte_timer_reset_sync/rte_timer_stop_sync the
> application goes into an infinite loop. This happens because
> rte_timer_reset_sync/rte_timer_stop_sync loop until the timer resets/stops
> and there is check inside timer_set_config_state which prevents a running
> timer from being reset/stopped by not it's own timer_cb. Therefore
> timer_set_config_state returns -1 due to which rte_timer_reset returns -1
> and rte_timer_reset_sync goes into an infinite loop
> 
> To to prevent this rte_timer_reset_sync and rte_timer_stop_sync should
> have int return types, so that -1 can be returned if the above condition
> occurs
> 
> Signed-off-by: Sarosh Arif 
> ---
>  doc/guides/rel_notes/deprecation.rst | 6 ++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/doc/guides/rel_notes/deprecation.rst
> b/doc/guides/rel_notes/deprecation.rst
> index ea4cfa7a4..ed93a707d 100644
> --- a/doc/guides/rel_notes/deprecation.rst
> +++ b/doc/guides/rel_notes/deprecation.rst
> @@ -151,3 +151,9 @@ Deprecation Notices
>Python 2 support will be completely removed in 20.11.
>In 20.08, explicit deprecation warnings will be displayed when running
>scripts with Python 2.
> +
> +* timer: Since timer can get stuck in an infinite loop if the
> +application tries to
> +  reset/stop some other timer in it's callback function, which is also
> +about to
> +  expire. The function ``rte_timer_stop_sync`` and

It looks like this should be rte_timer_reset_sync.  Maybe something like:

timer:  Timers can get stuck in an infinite loop if their callback tries to 
synchronously reset/stop some other timer that is also about to expire.  The 
functions ``rte_timer_reset_sync`` and ``rte_timer_stop_sync`` will be updated 
with an int return type so that an error code can be returned when this 
condition occurs.

Thanks,
Erik

> +``rte_timer_stop_sync``  will
> +  have a int return type in order to return with -1 in when this
> +condition
> +  occures.
> --
> 2.17.1



Re: [dpdk-dev] [PATCH] doc: announce changes to ethdev rxconf structure

2020-08-03 Thread Slava Ovsiienko
> -Original Message-
> From: Andrew Rybchenko 
> Sent: Monday, August 3, 2020 18:31
> To: Slava Ovsiienko ; dev@dpdk.org
> Cc: Matan Azrad ; Raslan Darawsheh
> ; Thomas Monjalon ;
> ferruh.yi...@intel.com; jerinjac...@gmail.com;
> step...@networkplumber.org; ajit.khapa...@broadcom.com;
> maxime.coque...@redhat.com; olivier.m...@6wind.com;
> david.march...@redhat.com
> Subject: Re: [PATCH] doc: announce changes to ethdev rxconf structure
> 
> Hi Slava,
> 
> On 8/3/20 6:18 PM, Slava Ovsiienko wrote:
> > Hi, Andrew
> >
> > Thanks for the comment, please, see below.
> >
> >> -Original Message-
> >> From: Andrew Rybchenko 
> >> Sent: Monday, August 3, 2020 17:31
> >> To: Slava Ovsiienko ; dev@dpdk.org
> >> Cc: Matan Azrad ; Raslan Darawsheh
> >> ; Thomas Monjalon ;
> >> ferruh.yi...@intel.com; jerinjac...@gmail.com;
> >> step...@networkplumber.org; ajit.khapa...@broadcom.com;
> >> maxime.coque...@redhat.com; olivier.m...@6wind.com;
> >> david.march...@redhat.com
> >> Subject: Re: ***Spam*** [PATCH] doc: announce changes to ethdev
> >> rxconf structure
> >>
> >> On 8/3/20 1:58 PM, Viacheslav Ovsiienko wrote:
> >>> The DPDK datapath in the transmit direction is very flexible.
> >>> The applications can build multisegment packets and manages almost
> >>> all data aspects - the memory pools where segments are allocated
> >>> from, the segment lengths, the memory attributes like external,
> registered, etc.
> >>>
> >>> In the receiving direction, the datapath is much less flexible, the
> >>> applications can only specify the memory pool to configure the
> >>> receiving queue and nothing more. In order to extend the receiving
> >>> datapath capabilities it is proposed to add the new fields into
> >>> rte_eth_rxconf structure:
> >>>
> >>> struct rte_eth_rxconf {
> >>> ...
> >>> uint16_t rx_split_num; /* number of segments to split */
> >>> uint16_t *rx_split_len; /* array of segment lengthes */
> >>> struct rte_mempool **mp; /* array of segment memory pools */
> >>> ...
> >>> };
> >>>
> >>> The non-zero value of rx_split_num field configures the receiving
> >>> queue to split ingress packets into multiple segments to the mbufs
> >>> allocated from various memory pools according to the specified
> >>> lengths. The zero value of rx_split_num field provides the backward
> >>> compatibility and queue should be configured in a regular way (with
> >>> single/multiple mbufs of the same data buffer length allocated from
> >>> the single memory pool).
> >>
> >> From the above description it is not 100% clear how it will coexist with:
> >>  - existing mb_pool argument of the rte_eth_rx_queue_setup()
> >>  - DEV_RX_OFFLOAD_SCATTER
> >
> > DEV_RX_OFFLOAD_SCATTER flag is required to be reported and configured
> > for the new feature to indicate the application is prepared for the
> > multisegment packets.
> 
> I hope it will be mentioned in the feature documentation in the future, but
> I'm not 100% sure that it is required. See below.
I suppose there is a hierarchy:
- the application configures DEV_RX_OFFLOAD_SCATTER on the port and in this 
way tells the PMD:
"Hey, driver, I'm ready to handle multi-segment packets" - readiness in general.
- the application configures BUFFER_SPLIT and tells the PMD _HOW_ it wants to 
split, in a particular way:
"Hey, driver, please drop ten bytes here, here and here, and the rest - over 
there".


> >
> > But SCATTER it just tells that ingress packet length can exceed the
> > mbuf data buffer length and the chain of mbufs must be built to store
> > the entire packet. But there is the limitation - all mbufs are
> > allocated  from the same memory pool, and all data buffers have the same
> length.
> > The new feature provides an opportunity to allocated mbufs from the
> > desired pools and specifies the length of each buffer/part.
> 
> Yes, it is clear, but what happens if packet does not fit into the provided
> pools chain? Is the last used many times? May be it logical to use Rx queue
> setup mb_pool as well for the purpose? I.e. use suggested here pools only
> once and use mb_pool many times for the rest if SCATTER is supported and
> only once if SCATTER is not supported.

It could be easily configured without involving the SCATTER flag - just 
specify the last pool multiple times. I.e.:
pool 0 - 14B
pool 1 - 20B
...
pool N - 512B
pool N - 512B
pool N - 512B, sum of lengths >= max packet size 1518

The assumption is that the sum of the lengths in the array covers the maximal 
packet size. Currently there is a limitation on packet size; for example, the 
mlx5 PMD just drops packets whose length exceeds the one the queue is 
configured for.
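
To illustrate (a rough sketch only - the rx_split_num/rx_split_len/mp fields
below are the rte_eth_rxconf extensions *proposed* in this thread, not an
existing DPDK API; pool names, lengths and ids are hypothetical):

#include <rte_common.h>
#include <rte_ethdev.h>
#include <rte_mempool.h>

static int
setup_buffer_split_queue(uint16_t port_id, uint16_t queue_id, uint16_t nb_desc,
			 unsigned int socket_id, struct rte_mempool *p14,
			 struct rte_mempool *p20, struct rte_mempool *p512)
{
	/* last pool repeated so that the sum of lengths >= max frame (1518) */
	uint16_t seg_len[] = { 14, 20, 512, 512, 512 };
	struct rte_mempool *seg_mp[] = { p14, p20, p512, p512, p512 };
	struct rte_eth_dev_info dev_info;
	struct rte_eth_rxconf rxconf;

	rte_eth_dev_info_get(port_id, &dev_info);
	rxconf = dev_info.default_rxconf;

	rxconf.rx_split_num = RTE_DIM(seg_len);	/* proposed field */
	rxconf.rx_split_len = seg_len;		/* proposed field */
	rxconf.mp = seg_mp;			/* proposed field */

	/* DEV_RX_OFFLOAD_SCATTER is assumed to be enabled on the port, since
	 * the PMD will deliver each packet as a chain of segments. */
	return rte_eth_rx_queue_setup(port_id, queue_id, nb_desc, socket_id,
				      &rxconf, p512 /* regular mb_pool arg */);
}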

> 
> >
> >>  - DEV_RX_OFFLOAD_HEADER_SPLIT
> > The new feature (let's name it as "BUFFER_SPLIT") might be supported
> > in conjunction with HEADER_SPLIT (say, split the rest of the data
> > after the header) or rejected if HEADER_SPLIT is configured on the
> > port, depending on PMD implementation (return ENOTSUP if both features
> are requested on the same port).
> 
> 

Re: [dpdk-dev] [PATCH] doc: update RSS action with best effort

2020-08-03 Thread Ori Kam
Hi Andrew,

> -Original Message-
> From: Andrew Rybchenko 
> 
> On 8/3/20 6:47 PM, Ori Kam wrote:
> >
> >
> >> -Original Message-
> >> From: Andrew Rybchenko 
> >>
> >> On 8/3/20 6:22 PM, Ori Kam wrote:
> >>> Hi Andrew,
> >>>
>  -Original Message-
>  From: Andrew Rybchenko 
> 
>  On 8/3/20 5:28 PM, Ori Kam wrote:
> > Using the rte_flow action RSS types field
> > may result in an undefined outcome.
> >
> > For example, selecting both UDP and TCP,
> > selecting the TCP RSS type while the pattern targets UDP traffic,
> > or requesting types that the PMD doesn't support.
> >
> > Until now, it wasn't clear what will happen in such cases.
> > This commit clarifies the issue by stating that the PMD
> > will work in best-effort mode.
> >
> > Signed-off-by: Ori Kam 
> > ---
> >  doc/guides/prog_guide/rte_flow.rst | 10 ++
> >  1 file changed, 10 insertions(+)
> >
> > diff --git a/doc/guides/prog_guide/rte_flow.rst
>  b/doc/guides/prog_guide/rte_flow.rst
> > index 3e5cd1e..d000560 100644
> > --- a/doc/guides/prog_guide/rte_flow.rst
> > +++ b/doc/guides/prog_guide/rte_flow.rst
> > @@ -1735,6 +1735,16 @@ unspecified "best-effort" settings from the
>  underlying PMD, which depending
> >  on the flow rule, may result in anything ranging from empty (single
> queue)
> >  to all-inclusive RSS.
> >
> > +Best effort will be used, in case there is a conflict inside the rule.
> > +The conflict can be the result of:
> > +
> > +- Conflicting RSS types, for example setting both UDP and TCP.
> > +
> > +- Conflicting between RSS types and the requested pattern to match,
> > +  for example matching on UDP and hashing RSS on TCP.
>  IMHO it is not a problem at all, but good to clarify anyway.
> 
> >>> Agree this patch is just for clarification.
> >>>
> > +
> > +- Hashing on types that are not supported by the PMD.
>  Shouldn't it return error to the caller?
> 
> >>> That’s depends, if for example the application requested eth and IPv4,
> >>> and the PMD can do only IPv4 so according to this, the PMD will just do
> IPv4.
> >>
> >> I think it is a bad behaviour. Application has required information to
> >> provide correct parameters and therefore incorrect should be rejected.
> >> Am I missing something?
> >>
> > In RTE flow you can't know which parameters are supported.
> > One option is to fail the rule, but this means failing the rule even if some
> > of the types are supported.
> > Or we can choose to fail only if all the types are unsupported, but this
> > will result in a mismatch between adding two types and adding just one type.
> > For example, the PMD doesn't support RSS on eth and the application requests
> > RSS on ETH and IPv4, so the PMD will do IPv4 RSS, while if the user selects
> > only ETH the rule will fail.
> > This is exactly the same mismatch case: the application requests something
> > that the PMD can't supply.
> > In any case I think this is the current behavior of the PMDs, so this just
> > clarifies it.
> 
> I'm sorry, but the PMD is hardly applicable here.
> Do you mean the "mlx5" PMD? Or have you reviewed all PMDs?
> 
I assume that each PMD has its own limitations; I'm sure that no
PMD can support all possible RSS types. But maybe there is such a perfect PMD.

> No, I understand that the topic is more complicated, but
> covering complex cases should not spoil the simple ones.
> I think that the information provided in dev_info should
> specify all supported protocols/fields.
> Yes, it is acceptable to behave best-effort if the requested
> field is supported in general but not for this particular flow.
> Please, make it a bit clearer in the suggested description.
> 
So what about the following wording: (after the current part)
If requested RSS hash type is not supported, and no supported hash function
is requested then the PMD may reject the flow.
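
For illustration, this is the kind of conflicting rule being discussed - the
pattern matches UDP while the RSS action asks to hash on TCP (a sketch using
the existing rte_flow API; port id and queue numbers are hypothetical):

#include <rte_common.h>
#include <rte_ethdev.h>
#include <rte_flow.h>

static struct rte_flow *
create_conflicting_rss_rule(uint16_t port_id, struct rte_flow_error *error)
{
	static const uint16_t queues[] = { 0, 1 };
	struct rte_flow_attr attr = { .ingress = 1 };
	struct rte_flow_item pattern[] = {
		{ .type = RTE_FLOW_ITEM_TYPE_ETH },
		{ .type = RTE_FLOW_ITEM_TYPE_IPV4 },
		{ .type = RTE_FLOW_ITEM_TYPE_UDP },
		{ .type = RTE_FLOW_ITEM_TYPE_END },
	};
	struct rte_flow_action_rss rss = {
		.func = RTE_ETH_HASH_FUNCTION_DEFAULT,
		.types = ETH_RSS_TCP,	/* conflicts with the UDP pattern */
		.queue_num = RTE_DIM(queues),
		.queue = queues,
	};
	struct rte_flow_action actions[] = {
		{ .type = RTE_FLOW_ACTION_TYPE_RSS, .conf = &rss },
		{ .type = RTE_FLOW_ACTION_TYPE_END },
	};

	/* Per the wording discussed above, the PMD handles such a rule
	 * "best effort", or may reject it if none of the requested types
	 * is supported. */
	return rte_flow_create(port_id, &attr, pattern, actions, error);
}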

> >
> > +
> >  Note: RSS hash result is stored in the ``hash.rss`` mbuf field which
> >  overlaps ``hash.fdir.lo``. Since `Action: MARK`_ sets the 
> > ``hash.fdir.hi``
> >  field only, both can be requested simultaneously.
> >
> >



[dpdk-dev] [RFC v2] ethdev: add extensions attributes to IPv6 item

2020-08-03 Thread Dekel Peled
Using the current implementation of DPDK, an application cannot match on
IPv6 packets, based on the existing extension headers, in a simple way.

Field 'Next Header' in IPv6 header indicates type of the first extension
header only. Following extension headers can't be identified by
inspecting the IPv6 header.
As a result, the existence or absence of specific extension headers
can't be used for packet matching.

For example, fragmented IPv6 packets contain a dedicated extension header,
as detailed in RFC [1], which is not yet supported in rte_flow.
Non-fragmented packets don't contain the fragment extension header.
For an application to match on non-fragmented IPv6 packets, the current
implementation doesn't provide a suitable solution.
Matching on the Next Header field is not sufficient, since additional
extension headers might be present in the same packet.
To match on fragmented IPv6 packets, the same difficulty exists.

Proposed update:
A set of additional values will be added to IPv6 header struct.
These values will indicate the existence of every defined extension
header type, providing simple means for identification of existing
extensions in the packet header.
Continuing the above example, fragmented packets can be identified using
the specific value indicating existence of fragment extension header.

This update changes ABI, and is proposed for the 20.11 LTS version.

[1] http://mails.dpdk.org/archives/dev/2020-March/160255.html

Signed-off-by: Dekel Peled 
---
 lib/librte_ethdev/rte_flow.h | 22 ++
 1 file changed, 22 insertions(+)

diff --git a/lib/librte_ethdev/rte_flow.h b/lib/librte_ethdev/rte_flow.h
index da8bfa5..8d2073d 100644
--- a/lib/librte_ethdev/rte_flow.h
+++ b/lib/librte_ethdev/rte_flow.h
@@ -792,11 +792,33 @@ struct rte_flow_item_ipv4 {
  *
  * Matches an IPv6 header.
  *
+ * Dedicated flags indicate existence of specific extension headers.
+ *
  * Note: IPv6 options are handled by dedicated pattern items, see
  * RTE_FLOW_ITEM_TYPE_IPV6_EXT.
  */
 struct rte_flow_item_ipv6 {
struct rte_ipv6_hdr hdr; /**< IPv6 header definition. */
+   uint64_t hop_ext_exist:1;
+   /**< Hop-by-Hop Options extension header exists. */
+uint64_t rout_ext_exist:1;
+/**< Routing extension header exists. */
+uint64_t frag_ext_exist:1;
+/**< Fragment extension header exists. */
+uint64_t auth_ext_exist:1;
+/**< Authentication extension header exists. */
+uint64_t esp_ext_exist:1;
+/**< Encapsulation Security Payload extension header exists. */
+uint64_t dest_ext_exist:1;
+/**< Destination Options extension header exists. */
+uint64_t mobil_ext_exist:1;
+/**< Mobility extension header exists. */
+uint64_t hip_ext_exist:1;
+/**< Host Identity Protocol extension header exists. */
+uint64_t shim6_ext_exist:1;
+/**< Shim6 Protocol extension header exists. */
+uint64_t reserved:55;
+/**< Reserved for future extension headers, must be zero. */
 };
 
 /** Default mask for RTE_FLOW_ITEM_TYPE_IPV6. */
-- 
1.8.3.1



[dpdk-dev] [RFC v3] ethdev: add extensions attributes to IPv6 item

2020-08-03 Thread Dekel Peled
Using the current implementation of DPDK, an application cannot match on
IPv6 packets, based on the existing extension headers, in a simple way.

Field 'Next Header' in IPv6 header indicates type of the first extension
header only. Following extension headers can't be identified by
inspecting the IPv6 header.
As a result, the existence or absence of specific extension headers
can't be used for packet matching.

For example, fragmented IPv6 packets contain a dedicated extension header,
as detailed in RFC [1], which is not yet supported in rte_flow.
Non-fragmented packets don't contain the fragment extension header.
For an application to match on non-fragmented IPv6 packets, the current
implementation doesn't provide a suitable solution.
Matching on the Next Header field is not sufficient, since additional
extension headers might be present in the same packet.
To match on fragmented IPv6 packets, the same difficulty exists.

Proposed update:
A set of additional values will be added to IPv6 header struct.
These values will indicate the existence of every defined extension
header type, providing simple means for identification of existing
extensions in the packet header.
Continuing the above example, fragmented packets can be identified using
the specific value indicating existence of fragment extension header.
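
For illustration, a rough sketch of how an application could then match only
non-fragmented IPv6 packets (frag_ext_exist is the field *proposed* by this
RFC, not an existing API; port id and queue index are hypothetical):

#include <rte_flow.h>

static struct rte_flow *
match_non_fragmented_ipv6(uint16_t port_id, uint16_t rx_queue,
			  struct rte_flow_error *error)
{
	struct rte_flow_attr attr = { .ingress = 1 };
	/* spec = 0 with mask = 1 on the proposed bit means: match only
	 * packets that do NOT carry the fragment extension header */
	struct rte_flow_item_ipv6 ipv6_spec = { .frag_ext_exist = 0 };
	struct rte_flow_item_ipv6 ipv6_mask = { .frag_ext_exist = 1 };
	struct rte_flow_action_queue queue = { .index = rx_queue };
	struct rte_flow_item pattern[] = {
		{ .type = RTE_FLOW_ITEM_TYPE_ETH },
		{ .type = RTE_FLOW_ITEM_TYPE_IPV6,
		  .spec = &ipv6_spec, .mask = &ipv6_mask },
		{ .type = RTE_FLOW_ITEM_TYPE_END },
	};
	struct rte_flow_action actions[] = {
		{ .type = RTE_FLOW_ACTION_TYPE_QUEUE, .conf = &queue },
		{ .type = RTE_FLOW_ACTION_TYPE_END },
	};

	return rte_flow_create(port_id, &attr, pattern, actions, error);
}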

This update changes ABI, and is proposed for the 20.11 LTS version.

[1] http://mails.dpdk.org/archives/dev/2020-March/160255.html

---
v3: Fix checkpatch comments.
v2: Update from fragment attribute to all extensions attributes.
---

Signed-off-by: Dekel Peled 
---
 lib/librte_ethdev/rte_flow.h | 22 ++
 1 file changed, 22 insertions(+)

diff --git a/lib/librte_ethdev/rte_flow.h b/lib/librte_ethdev/rte_flow.h
index da8bfa5..246918e 100644
--- a/lib/librte_ethdev/rte_flow.h
+++ b/lib/librte_ethdev/rte_flow.h
@@ -792,11 +792,33 @@ struct rte_flow_item_ipv4 {
  *
  * Matches an IPv6 header.
  *
+ * Dedicated flags indicate existence of specific extension headers.
+ *
  * Note: IPv6 options are handled by dedicated pattern items, see
  * RTE_FLOW_ITEM_TYPE_IPV6_EXT.
  */
 struct rte_flow_item_ipv6 {
struct rte_ipv6_hdr hdr; /**< IPv6 header definition. */
+   uint64_t hop_ext_exist:1;
+   /**< Hop-by-Hop Options extension header exists. */
+   uint64_t rout_ext_exist:1;
+   /**< Routing extension header exists. */
+   uint64_t frag_ext_exist:1;
+   /**< Fragment extension header exists. */
+   uint64_t auth_ext_exist:1;
+   /**< Authentication extension header exists. */
+   uint64_t esp_ext_exist:1;
+   /**< Encapsulation Security Payload extension header exists. */
+   uint64_t dest_ext_exist:1;
+   /**< Destination Options extension header exists. */
+   uint64_t mobil_ext_exist:1;
+   /**< Mobility extension header exists. */
+   uint64_t hip_ext_exist:1;
+   /**< Host Identity Protocol extension header exists. */
+   uint64_t shim6_ext_exist:1;
+   /**< Shim6 Protocol extension header exists. */
+   uint64_t reserved:55;
+   /**< Reserved for future extension headers, must be zero. */
 };
 
 /** Default mask for RTE_FLOW_ITEM_TYPE_IPV6. */
-- 
1.8.3.1



Re: [dpdk-dev] Idem with Multicast Packets Feature

2020-08-03 Thread Daniel Kirichok
Hi all,

I wanted to send a follow-up question regarding the development of the Idem
for Multicast Packets feature. Based on the plan we have so far, it
involves having one port designated to send the packet, and two ports
designated to receive the packet. In order to send out one packet for the
two ports to receive, it looks like this would require all three ports
to be on the same network, and a 1-2 pairing configuration would be needed
on the server. Is this the right approach for developing this feature?
Would an alternate approach that avoids server reconfiguration be better,
such as sending out a multicast packet from each of the ports (two total
packets sent out) and checking whether each receives what was sent?

Thanks,
Dan

On Fri, Jul 10, 2020 at 1:27 PM Daniel Kirichok 
wrote:

> Hi all,
>
> The Idem with Multicast Packets test will first create three groups of
> port combinations. The first group will have both ports be capable of
> receiving a packet. The second group will have one port be open to receive
> a packet and one be closed. The third group will have both ports be closed.
> Each of the ports will be pinged to ensure that they are reachable.
> Afterwards, a packet will be generated and sent to each of these groups,
> and it will ensure that the correct behavior of a failed or successful
> delivery occurs on each port.
>
> Let me know if there are any comments or questions.
>
> Thanks,
> Dan
>
> --
>
> Dan Kirichok
>
> UNH InterOperability Laboratory
>
> 21 Madbury Rd, Suite 100, Durham, NH 03824
>
> dkiric...@iol.unh.edu
>
> www.iol.unh.edu
>
>
>
>

-- 

Dan Kirichok

UNH InterOperability Laboratory

21 Madbury Rd, Suite 100, Durham, NH 03824

dkiric...@iol.unh.edu

www.iol.unh.edu


[dpdk-dev] [PATCH] doc: announce removal of ethdev flow director API

2020-08-03 Thread Thomas Monjalon
The flow director config, part of the legacy filtering API,
was marked as deprecated last year.
A separate notice is added to make clear that these specific structs
will be removed as well in DPDK 20.11, as the rest of the legacy
filtering API.

Signed-off-by: Thomas Monjalon 
---
 doc/guides/rel_notes/deprecation.rst | 4 
 1 file changed, 4 insertions(+)

diff --git a/doc/guides/rel_notes/deprecation.rst 
b/doc/guides/rel_notes/deprecation.rst
index ea4cfa7a48..24474c563f 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -84,6 +84,10 @@ Deprecation Notices
   Target release for removal of the legacy API will be defined once most
   PMDs have switched to rte_flow.
 
+* ethdev: The flow director API, including ``rte_eth_conf.fdir_conf`` field,
+  and the related structures (``rte_fdir_*`` and ``rte_eth_fdir_*``),
+  will be removed in DPDK 20.11.
+
 * ethdev: Update API functions returning ``void`` to return ``int`` with
   negative errno values to indicate various error conditions (e.g.
   invalid port ID, unsupported operation, failed operation):
-- 
2.27.0



[dpdk-dev] [PATCH] doc: eventdev ABI change to support DLB PMD

2020-08-03 Thread McDaniel, Timothy
From: "McDaniel, Timothy" 

The ABI changes associated with this notification will better support
devices that:
1. Have limits on the number of queues that may be linked to a port
2. Have ports that are limited to exactly one linked queue
3. Are not able to transparently transfer the event flow_id field

Signed-off-by: McDaniel Timothy

---
 doc/guides/rel_notes/deprecation.rst |   11 +++
 1 file changed, 11 insertions(+)

diff --git a/doc/guides/rel_notes/deprecation.rst 
b/doc/guides/rel_notes/deprecation.rst
index 99c9806..bfe6661 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -148,3 +148,14 @@ Deprecation Notices
   Python 2 support will be completely removed in 20.11.
   In 20.08, explicit deprecation warnings will be displayed when running
   scripts with Python 2.
+
+* eventdev: ABI changes to support DLB PMD and future extensions:
+  ``rte_event_dev_info``, ``rte_event_dev_config``, ``rte_event_port_conf`` 
will
+  be modified to support DLB PMD and future extensions in the eventdev library.
+  Patches containing justification, documentation, and proposed modifications
+  can be found at:
+
+  - https://patches.dpdk.org/patch/71457/
+  - https://patches.dpdk.org/patch/71456/
+
+
-- 
1.7.10



[dpdk-dev] [PATCH] doc: announce removal of L2 tunnel filtering API

2020-08-03 Thread Thomas Monjalon
The functions for L2 tunnel were missed when marking the legacy
filtering API as deprecated. That's why a separate notice is added
to make clear that it will be removed as well in DPDK 20.11.

Signed-off-by: Thomas Monjalon 
---
 doc/guides/rel_notes/deprecation.rst | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/doc/guides/rel_notes/deprecation.rst 
b/doc/guides/rel_notes/deprecation.rst
index ea4cfa7a48..cad2f777f2 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -84,6 +84,12 @@ Deprecation Notices
   Target release for removal of the legacy API will be defined once most
   PMDs have switched to rte_flow.
 
+* ethdev: the legacy L2 tunnel filtering API is deprecated as the rest of
+  the legacy filtering API.
+  The functions ``rte_eth_dev_l2_tunnel_eth_type_conf`` and
+  ``rte_eth_dev_l2_tunnel_offload_set`` which were not marked as deprecated,
+  will be removed in DPDK 20.11.
+
 * ethdev: Update API functions returning ``void`` to return ``int`` with
   negative errno values to indicate various error conditions (e.g.
   invalid port ID, unsupported operation, failed operation):
-- 
2.27.0



[dpdk-dev] [PATCH] net/bnxt: fixes in flow counter mgr

2020-08-03 Thread Somnath Kotur
OVS-DPDK seems to set the reset bit for every flow query. Honor
the bit by resetting the SW counter values after assigning them.
Also set the 'hit' bit only if the counter value retrieved by HW
is non-zero.
While querying flow stats, use max possible entries in the fc table scan
for valid entries instead of active entries as the active entry can be in
any slot in the table.

This is a critical fix for OVS-DPDK flow aging.

Fixes: 306c2d28e247 ("net/bnxt: support count action in flow query")

Reviewed-by: Venkat Duvvuru 
Signed-off-by: Somnath Kotur 
---
 drivers/net/bnxt/tf_ulp/ulp_fc_mgr.c | 25 +
 1 file changed, 17 insertions(+), 8 deletions(-)

diff --git a/drivers/net/bnxt/tf_ulp/ulp_fc_mgr.c 
b/drivers/net/bnxt/tf_ulp/ulp_fc_mgr.c
index febda94..df1921d 100644
--- a/drivers/net/bnxt/tf_ulp/ulp_fc_mgr.c
+++ b/drivers/net/bnxt/tf_ulp/ulp_fc_mgr.c
@@ -339,7 +339,7 @@ ulp_fc_mgr_alarm_cb(void *arg)
struct bnxt_ulp_fc_info *ulp_fc_info;
struct bnxt_ulp_device_params *dparms;
struct tf *tfp;
-   uint32_t dev_id, hw_cntr_id = 0;
+   uint32_t dev_id, hw_cntr_id = 0, num_entries = 0;
 
ulp_fc_info = bnxt_ulp_cntxt_ptr2_fc_info_get(ctxt);
if (!ulp_fc_info)
@@ -384,8 +384,9 @@ ulp_fc_mgr_alarm_cb(void *arg)
break;
}
*/
+   num_entries = dparms->flow_count_db_entries / 2;
for (i = 0; i < TF_DIR_MAX; i++) {
-   for (j = 0; j < ulp_fc_info->num_entries; j++) {
+   for (j = 0; j < num_entries; j++) {
if (!ulp_fc_info->sw_acc_tbl[i][j].valid)
continue;
hw_cntr_id = ulp_fc_info->sw_acc_tbl[i][j].hw_cntr_id;
@@ -551,7 +552,7 @@ int ulp_fc_mgr_query_count_get(struct bnxt_ulp_context 
*ctxt,
struct ulp_flow_db_res_params params;
enum tf_dir dir;
uint32_t hw_cntr_id = 0, sw_cntr_idx = 0;
-   struct sw_acc_counter sw_acc_tbl_entry;
+   struct sw_acc_counter *sw_acc_tbl_entry;
bool found_cntr_resource = false;
 
ulp_fc_info = bnxt_ulp_cntxt_ptr2_fc_info_get(ctxt);
@@ -584,13 +585,21 @@ int ulp_fc_mgr_query_count_get(struct bnxt_ulp_context 
*ctxt,
hw_cntr_id = params.resource_hndl;
sw_cntr_idx = hw_cntr_id -
ulp_fc_info->shadow_hw_tbl[dir].start_idx;
-   sw_acc_tbl_entry = ulp_fc_info->sw_acc_tbl[dir][sw_cntr_idx];
+   sw_acc_tbl_entry = &ulp_fc_info->sw_acc_tbl[dir][sw_cntr_idx];
if (params.resource_sub_type ==
BNXT_ULP_RESOURCE_SUB_TYPE_INDEX_TYPE_INT_COUNT) {
-   count->hits_set = 1;
-   count->bytes_set = 1;
-   count->hits = sw_acc_tbl_entry.pkt_count;
-   count->bytes = sw_acc_tbl_entry.byte_count;
+   pthread_mutex_lock(&ulp_fc_info->fc_lock);
+   if (sw_acc_tbl_entry->pkt_count) {
+   count->hits_set = 1;
+   count->bytes_set = 1;
+   count->hits = sw_acc_tbl_entry->pkt_count;
+   count->bytes = sw_acc_tbl_entry->byte_count;
+   }
+   if (count->reset) {
+   sw_acc_tbl_entry->pkt_count = 0;
+   sw_acc_tbl_entry->byte_count = 0;
+   }
+   pthread_mutex_unlock(&ulp_fc_info->fc_lock);
} else {
/* TBD: Handle External counters */
rc = -EINVAL;
-- 
2.7.4



Re: [dpdk-dev] [PATCH] net/bnxt: update resource allocation settings

2020-08-03 Thread Ajit Khaparde
On Sun, Aug 2, 2020 at 11:51 PM Somnath Kotur 
wrote:

> From: Shahaji Bhosle 
>
> Adjusted resource allocations for the hardware resources
> like TCAM entries, action records, stat counters, exact match records to
> scale up offload flows.
> Also increased ipv4 nat entries to 1023.
> This patch is a critical fix to enable driver load on current and all
> FW versions going forward.
>
> Fixes: cef3749d501e2 ("net/bnxt: update TruFlow resource allocation
> numbers")
>
> Signed-off-by: Shahaji Bhosle 
> Signed-off-by: Kishore Padmanabha 
> Reviewed-by: Ajit Kumar Khaparde 
> Reviewed-by: Michael Baucom 
> Signed-off-by: Venkat Duvvuru 
>
Applied to dpdk-next-net-brcm.
Ferruh, Thomas,
I applied this patch since it is a critical fix needed to work with
different versions of FW.

Thanks
Ajit

> ---
>  drivers/net/bnxt/tf_ulp/bnxt_ulp.c | 42
> +++---
>  1 file changed, 21 insertions(+), 21 deletions(-)
>
> diff --git a/drivers/net/bnxt/tf_ulp/bnxt_ulp.c
> b/drivers/net/bnxt/tf_ulp/bnxt_ulp.c
> index c19cd1d..0d4a455 100644
> --- a/drivers/net/bnxt/tf_ulp/bnxt_ulp.c
> +++ b/drivers/net/bnxt/tf_ulp/bnxt_ulp.c
> @@ -86,68 +86,68 @@ ulp_ctx_session_open(struct bnxt *bp,
> resources = ¶ms.resources;
> /** RX **/
> /* Identifiers */
> -   resources->ident_cnt[TF_DIR_RX].cnt[TF_IDENT_TYPE_L2_CTXT_HIGH] =
> 200;
> -   resources->ident_cnt[TF_DIR_RX].cnt[TF_IDENT_TYPE_L2_CTXT_LOW] =
> 20;
> +   resources->ident_cnt[TF_DIR_RX].cnt[TF_IDENT_TYPE_L2_CTXT_HIGH] =
> 422;
> +   resources->ident_cnt[TF_DIR_RX].cnt[TF_IDENT_TYPE_L2_CTXT_LOW] = 6;
> resources->ident_cnt[TF_DIR_RX].cnt[TF_IDENT_TYPE_WC_PROF] = 8;
> resources->ident_cnt[TF_DIR_RX].cnt[TF_IDENT_TYPE_PROF_FUNC] = 8;
> resources->ident_cnt[TF_DIR_RX].cnt[TF_IDENT_TYPE_EM_PROF] = 8;
>
> /* Table Types */
> -   resources->tbl_cnt[TF_DIR_RX].cnt[TF_TBL_TYPE_FULL_ACT_RECORD] =
> 720;
> -   resources->tbl_cnt[TF_DIR_RX].cnt[TF_TBL_TYPE_ACT_STATS_64] = 720;
> -   resources->tbl_cnt[TF_DIR_RX].cnt[TF_TBL_TYPE_ACT_MODIFY_IPV4] = 8;
> +   resources->tbl_cnt[TF_DIR_RX].cnt[TF_TBL_TYPE_FULL_ACT_RECORD] =
> 8192;
> +   resources->tbl_cnt[TF_DIR_RX].cnt[TF_TBL_TYPE_ACT_STATS_64] = 8192;
> +   resources->tbl_cnt[TF_DIR_RX].cnt[TF_TBL_TYPE_ACT_MODIFY_IPV4] =
> 1023;
>
> /* ENCAP */
> resources->tbl_cnt[TF_DIR_RX].cnt[TF_TBL_TYPE_ACT_ENCAP_8B] = 16;
> -   resources->tbl_cnt[TF_DIR_RX].cnt[TF_TBL_TYPE_ACT_ENCAP_16B] = 16;
> +   resources->tbl_cnt[TF_DIR_RX].cnt[TF_TBL_TYPE_ACT_ENCAP_16B] = 63;
>
> /* TCAMs */
>
> resources->tcam_cnt[TF_DIR_RX].cnt[TF_TCAM_TBL_TYPE_L2_CTXT_TCAM_HIGH] =
> -   200;
> +   422;
>
> resources->tcam_cnt[TF_DIR_RX].cnt[TF_TCAM_TBL_TYPE_L2_CTXT_TCAM_LOW] =
> -   20;
> +   6;
> resources->tcam_cnt[TF_DIR_RX].cnt[TF_TCAM_TBL_TYPE_PROF_TCAM] = 8;
> -   resources->tcam_cnt[TF_DIR_RX].cnt[TF_TCAM_TBL_TYPE_WC_TCAM] = 416;
> +   resources->tcam_cnt[TF_DIR_RX].cnt[TF_TCAM_TBL_TYPE_WC_TCAM] = 88;
>
> /* EM */
> -   resources->em_cnt[TF_DIR_RX].cnt[TF_EM_TBL_TYPE_EM_RECORD] = 2048;
> +   resources->em_cnt[TF_DIR_RX].cnt[TF_EM_TBL_TYPE_EM_RECORD] = 13176;
>
> /* EEM */
> resources->em_cnt[TF_DIR_RX].cnt[TF_EM_TBL_TYPE_TBL_SCOPE] = 1;
>
> /** TX **/
> /* Identifiers */
> -   resources->ident_cnt[TF_DIR_TX].cnt[TF_IDENT_TYPE_L2_CTXT_HIGH] =
> 200;
> -   resources->ident_cnt[TF_DIR_TX].cnt[TF_IDENT_TYPE_L2_CTXT_LOW] =
> 20;
> +   resources->ident_cnt[TF_DIR_TX].cnt[TF_IDENT_TYPE_L2_CTXT_HIGH] =
> 292;
> +   resources->ident_cnt[TF_DIR_TX].cnt[TF_IDENT_TYPE_L2_CTXT_LOW] =
> 144;
> resources->ident_cnt[TF_DIR_TX].cnt[TF_IDENT_TYPE_WC_PROF] = 8;
> resources->ident_cnt[TF_DIR_TX].cnt[TF_IDENT_TYPE_PROF_FUNC] = 8;
> resources->ident_cnt[TF_DIR_TX].cnt[TF_IDENT_TYPE_EM_PROF] = 8;
>
> /* Table Types */
> -   resources->tbl_cnt[TF_DIR_TX].cnt[TF_TBL_TYPE_FULL_ACT_RECORD] =
> 16;
> -   resources->tbl_cnt[TF_DIR_TX].cnt[TF_TBL_TYPE_ACT_STATS_64] = 16;
> -   resources->tbl_cnt[TF_DIR_TX].cnt[TF_TBL_TYPE_ACT_MODIFY_IPV4] = 8;
> +   resources->tbl_cnt[TF_DIR_TX].cnt[TF_TBL_TYPE_FULL_ACT_RECORD] =
> 8192;
> +   resources->tbl_cnt[TF_DIR_TX].cnt[TF_TBL_TYPE_ACT_STATS_64] = 8192;
> +   resources->tbl_cnt[TF_DIR_TX].cnt[TF_TBL_TYPE_ACT_MODIFY_IPV4] =
> 1023;
>
> /* ENCAP */
> -   resources->tbl_cnt[TF_DIR_TX].cnt[TF_TBL_TYPE_ACT_ENCAP_64B] = 16;
> -   resources->tbl_cnt[TF_DIR_TX].cnt[TF_TBL_TYPE_ACT_ENCAP_16B] = 16;
> +   resources->tbl_cnt[TF_DIR_TX].cnt[TF_TBL_TYPE_ACT_ENCAP_64B] = 511;
> +   resources->tbl_cnt[TF_DIR_TX].cnt[TF_TBL_TYPE_ACT_ENCAP_16B] = 200;
>
> /* TCAMs */
>
> resources->tcam_cnt[TF_DIR_TX].cnt[TF_TCAM_TBL_TYPE_L2_CTXT_TCAM_HIGH] =
> -   200;
> +   292;
>
> resources->tcam_cnt[

Re: [dpdk-dev] [PATCH] doc: announce removal of legacy ethdev filtering API

2020-08-03 Thread Ajit Khaparde
On Mon, Aug 3, 2020 at 4:57 AM Jerin Jacob  wrote:

> On Mon, Aug 3, 2020 at 5:24 PM Andrew Rybchenko
>  wrote:
> >
> > On 8/3/20 2:49 PM, Thomas Monjalon wrote:
> > > Deprecation of rte_eth_dev_filter_ctrl() was announced in 2016,
> > > and confirmed last year by avoiding include of rte_eth_ctrl.h.
> > > After 4 years, it is time to complete the removal of the API
> > > which is replaced with rte_flow.
> > >
> > > Signed-off-by: Thomas Monjalon 
> > >
> >
> > Acked-by: Andrew Rybchenko 
>
> Acked-by: Jerin Jacob 
>
Acked-by: Ajit Khaparde 


>
>
> >
>


Re: [dpdk-dev] CALL to eth PMD maintainers: complete closing of port

2020-08-03 Thread Thomas Monjalon
18/04/2019 12:59, Thomas Monjalon:
> Hi all,
> 
> Since DPDK 18.11, the behaviour of the close operation is changed
> if RTE_ETH_DEV_CLOSE_REMOVE is enabled in the driver:
> port is released (i.e. totally freed and data erased) on close.
> This new behaviour is enabled per driver for a migration period.
> 
> Looking at the code, you can see these comments:
> /* old behaviour: only free queue arrays */
> RTE_ETHDEV_LOG(DEBUG, "Port closing is using an old behaviour.\n"
>   "The driver %s should migrate to the new behaviour.\n",
> /* new behaviour: send event + reset state + free all data */
> 
> You can find an advice in the commit:
>   http://git.dpdk.org/dpdk/commit/?id=23ea57a2a
> "
> When enabling RTE_ETH_DEV_CLOSE_REMOVE,
> the PMD must free all its private resources for the port,
> in its dev_close function.
> It is advised to call the dev_close function in the remove function
> in order to support removing a device without closing its ports.
> "
> 
> It would be great to complete this migration for the next LTS
> version, which will be 19.11.

For the record, it did not happen in 19.11.

> It means the work should be ideally finished during the summer.
> 
> The migration to the new behaviour is done in 4 drivers:
> git grep -l RTE_ETH_DEV_CLOSE_REMOVE drivers | cut -d/ -f3
>   ena
>   enic
>   mlx5
>   vmxnet3
> 
> Following drivers should be migrated:
> ( find drivers/net -mindepth 1 -maxdepth 1 -type d | cut -d/ -f3 ; git grep 
> -l RTE_ETH_DEV_CLOSE_REMOVE drivers | cut -d/ -f3 ) | sort | uniq -u
[...]

The progress in April 2019 was 4 of 46 (9%).

> Please let's progress smoothly on this topic, thanks.

More than one year later, the progress is 26 of 53 (49%).

> The concerned maintainers (Cc) can be found with the following command:
> devtools/get-maintainer.sh $(( find drivers/net -mindepth 1 -maxdepth 1 -type 
> d | cut -d/ -f-3 ; git grep -l RTE_ETH_DEV_CLOSE_REMOVE drivers ) | sort | 
> uniq -u)

We cannot wait forever. Temporary cannot be longer than 2 years.
I am going to send a deprecation notice to remove the "temporary"
flag RTE_ETH_DEV_CLOSE_REMOVE.
It will break drivers which are not migrated.
It will probably help to find motivation in new priorities.

More details on what to do can be found in this mail thread:
http://inbox.dpdk.org/dev/1748144.UFpUr2FPnr@xps/




[dpdk-dev] CPU hog & memory leak on FreeBSD

2020-08-03 Thread Lewis Donzis
Hello. 

We've posted about this before, but I'm hoping to find someone willing to 
commit a patched version of lib/librte_eal/bsdapp/eal/eal_interrupts.c that 
corrects a memory leak and 100% CPU hog that is immediately noticeable with the 
i40e driver, at a minimum. We have modified this file to correct the problem, 
and would be happy to provide it back to whomever is a committer for this. 

The detailed explanation is: 

Currently, setting an alarm with rte_eal_alarm_set() registers an alarm 
interrupt by calling rte_intr_callback_register(), which links the callback 
function (eal_alarm_callback) onto a list for that source and sets up a 
one-shot timer via kevent. Setting additional alarms links them onto the 
alarm_list, but each call also invokes rte_intr_callback_register() again, 
which links the callback function onto the source callback list again. 

When the alarm triggers and eal_alarm_callback() gets called, it goes down the 
list of pending alarms, calling as many callback functions as possible and 
removing each one from the list until it reaches one which has not yet expired. 
Once it's done, if alarm_list is not empty, it calls 
rte_intr_callback_register(), which then links yet another callback onto the 
interrupt source's list, thus creating an infinite loop. 

The problem is that the source callback list grows infinitely under this 
condition (essentially, when more than one alarm is queued). However, the call 
is necessary in order to reset the kevent timer. 

The proposed fix recognizes and leverages the fact that an alarm interrupt in 
FreeBSD should never have more than one callback on its list, so if 
rte_intr_callback_register() is called with an interrupt handle type of 
RTE_INTR_HANDLE_ALARM, and if such an interrupt type already has a non-empty 
list, then a new callback is not created, but the kevent timer is restarted 
properly. 

A much simpler change is possible if we don't mind the overhead of allocating, 
filling-in, linking, de-linking, and freeing a callback unnecessarily. This 
proposed change makes every attempt to avoid that overhead. 
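
For reference, a minimal sketch of the usage pattern that, per the description
above, triggers the problem (intervals are arbitrary; rte_eal_alarm_set() is
the existing EAL API):

#include <rte_alarm.h>
#include <rte_common.h>

static void
periodic_cb(void *arg __rte_unused)
{
	/* re-arm, so at least one alarm is always pending */
	rte_eal_alarm_set(100 * 1000 /* us */, periodic_cb, NULL);
}

static void
start_alarms(void)
{
	/* Two outstanding alarms are enough: on FreeBSD each registration
	 * adds eal_alarm_callback() to the interrupt source's callback list
	 * again, so the list grows without bound and the interrupt thread
	 * ends up spinning. */
	rte_eal_alarm_set(100 * 1000, periodic_cb, NULL);
	rte_eal_alarm_set(200 * 1000, periodic_cb, NULL);
}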



[dpdk-dev] [PATCH] doc: announce change in IPv6 item struct

2020-08-03 Thread Dekel Peled
Struct rte_flow_item_ipv6 will be modified to include additional
values, indicating existence or absence of IPv6 extension headers
following the IPv6 header, as proposed in RFC
https://mails.dpdk.org/archives/dev/2020-August/177257.html.
Because of ABI break this change is proposed for 20.11.

Signed-off-by: Dekel Peled 
---
 doc/guides/rel_notes/deprecation.rst | 5 +
 1 file changed, 5 insertions(+)

diff --git a/doc/guides/rel_notes/deprecation.rst 
b/doc/guides/rel_notes/deprecation.rst
index ea4cfa7..5201142 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -110,6 +110,11 @@ Deprecation Notices
   break the ABI checks, that is why change is planned for 20.11.
   The list of internal APIs are mainly ones listed in ``rte_ethdev_driver.h``.
 
+* ethdev: The ``struct rte_flow_item_ipv6`` struct will be modified to include
+  additional values, indicating existence or absence of IPv6 extension headers
+  following the IPv6 header, as proposed in RFC
+  https://mails.dpdk.org/archives/dev/2020-August/177257.html.
+
 * traffic manager: All traffic manager API's in ``rte_tm.h`` were mistakenly 
made
   ABI stable in the v19.11 release. The TM maintainer and other contributors 
have
   agreed to keep the TM APIs as experimental in expectation of additional spec
-- 
1.8.3.1



Re: [dpdk-dev] [PATCH] doc: announce removal of ethdev flow director API

2020-08-03 Thread Ajit Khaparde
On Mon, Aug 3, 2020 at 10:49 AM Thomas Monjalon  wrote:

> The flow director config, part of the legacy filtering API,
> was marked as deprecated last year.
> A separate notice is added to make clear that these specific structs
> will be removed as well in DPDK 20.11, as the rest of the legacy
> filtering API.
>
> Signed-off-by: Thomas Monjalon 
>
Acked-by: Ajit Khaparde 


> ---
>  doc/guides/rel_notes/deprecation.rst | 4 
>  1 file changed, 4 insertions(+)
>
> diff --git a/doc/guides/rel_notes/deprecation.rst
> b/doc/guides/rel_notes/deprecation.rst
> index ea4cfa7a48..24474c563f 100644
> --- a/doc/guides/rel_notes/deprecation.rst
> +++ b/doc/guides/rel_notes/deprecation.rst
> @@ -84,6 +84,10 @@ Deprecation Notices
>Target release for removal of the legacy API will be defined once most
>PMDs have switched to rte_flow.
>
> +* ethdev: The flow director API, including ``rte_eth_conf.fdir_conf``
> field,
> +  and the related structures (``rte_fdir_*`` and ``rte_eth_fdir_*``),
> +  will be removed in DPDK 20.11.
> +
>  * ethdev: Update API functions returning ``void`` to return ``int`` with
>negative errno values to indicate various error conditions (e.g.
>invalid port ID, unsupported operation, failed operation):
> --
> 2.27.0
>
>


Re: [dpdk-dev] [PATCH] doc: announce deprecation of port mirroring API

2020-08-03 Thread Ajit Khaparde
On Mon, Aug 3, 2020 at 8:36 AM Andrew Rybchenko 
wrote:

> On 8/3/20 6:33 PM, Thomas Monjalon wrote:
> > A new API is planned to be introduced for sampling and mirroring
> > with rte_flow. It should be more generic and allow more use cases.
> >
> > This deprecation is to show the direction, avoiding overlapping APIs.
> >
> > Signed-off-by: Thomas Monjalon 
>
> I like the idea. Not everything is 100% clear, but the direction is right.
>
+1


>
> Acked-by: Andrew Rybchenko 
>
Acked-by: Ajit Khaparde 


Re: [dpdk-dev] [PATCH] doc: announce API change in timer

2020-08-03 Thread Honnappa Nagarahalli


> 
> If the user tries to reset/stop some other timer in it's callback function, 
> which
Is there any use case for this? Why not just document that the user is not 
allowed to reset another timer in the callback function? How does the user 
get access to another timer in the callback function?
Not sure if this was discussed earlier, I might have missed it.
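
For context, a minimal sketch of the pattern described here - a callback that
synchronously stops another timer which may itself be about to expire (timer
names and periods are hypothetical; the rte_timer calls are the existing API):

#include <rte_common.h>
#include <rte_cycles.h>
#include <rte_timer.h>

static struct rte_timer tim_a;
static struct rte_timer tim_b;

static void
tim_b_cb(struct rte_timer *tim __rte_unused, void *arg __rte_unused)
{
	/* hypothetical work done by the other timer */
}

static void
tim_a_cb(struct rte_timer *tim __rte_unused, void *arg __rte_unused)
{
	/* If tim_b is running its own callback on another lcore at this
	 * moment, timer_set_config_state() keeps refusing the stop and the
	 * void-returning sync wrapper retries forever - the infinite loop
	 * described in the deprecation notice. */
	rte_timer_stop_sync(&tim_b);
}

static void
setup_timers(unsigned int lcore_a, unsigned int lcore_b)
{
	uint64_t hz = rte_get_timer_hz();

	rte_timer_subsystem_init();
	rte_timer_init(&tim_a);
	rte_timer_init(&tim_b);
	/* same period on both timers, so they can expire together;
	 * each lcore is expected to call rte_timer_manage() in its loop */
	rte_timer_reset(&tim_a, hz, PERIODICAL, lcore_a, tim_a_cb, NULL);
	rte_timer_reset(&tim_b, hz, PERIODICAL, lcore_b, tim_b_cb, NULL);
}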

> is also about to expire, using rte_timer_reset_sync/rte_timer_stop_sync the
> application goes into an infinite loop. This happens because
> rte_timer_reset_sync/rte_timer_stop_sync loop until the timer resets/stops
> and there is check inside timer_set_config_state which prevents a running
> timer from being reset/stopped by not it's own timer_cb. Therefore
> timer_set_config_state returns -1 due to which rte_timer_reset returns -1 and
> rte_timer_reset_sync goes into an infinite loop
> 
> To to prevent this rte_timer_reset_sync and rte_timer_stop_sync should have
> int return types, so that -1 can be returned if the above condition occurs
> 
> Signed-off-by: Sarosh Arif 
> ---
>  doc/guides/rel_notes/deprecation.rst | 6 ++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/doc/guides/rel_notes/deprecation.rst
> b/doc/guides/rel_notes/deprecation.rst
> index ea4cfa7a4..ed93a707d 100644
> --- a/doc/guides/rel_notes/deprecation.rst
> +++ b/doc/guides/rel_notes/deprecation.rst
> @@ -151,3 +151,9 @@ Deprecation Notices
>Python 2 support will be completely removed in 20.11.
>In 20.08, explicit deprecation warnings will be displayed when running
>scripts with Python 2.
> +
> +* timer: Since timer can get stuck in an infinite loop if the
> +application tries to
> +  reset/stop some other timer in it's callback function, which is also
> +about to
> +  expire. The function ``rte_timer_stop_sync`` and
> +``rte_timer_stop_sync``  will
> +  have a int return type in order to return with -1 in when this
> +condition
> +  occures.
> --
> 2.17.1



Re: [dpdk-dev] CPU hog & memory leak on FreeBSD

2020-08-03 Thread Honnappa Nagarahalli
Hello,
I can take a look if you can post the patch.

> -Original Message-
> From: dev  On Behalf Of Lewis Donzis
> Sent: Monday, August 3, 2020 2:43 PM
> To: dev@dpdk.org
> Subject: [dpdk-dev] CPU hog & memory leak on FreeBSD
> 
> Hello.
> 
> We've posted about this before, but I'm hoping to find someone willing to
> commit a patched version of lib/librte_eal/bsdapp/eal/eal_interrupts.c that
> corrects a memory leak and 100% CPU hog that is immediately noticeable
> with the i40e driver, at a minimum. We have modified this file to correct the
> problem, and would be happy to provide it back to whomever is a committer
> for this.
If you can send a patch, I can take a look.

> 
> The detailed explanation is:
> 
> Currently, setting an alarm with rte_eal_alarm_set() registers an alarm
> interrupt by calling rte_intr_callback_register(), which links the callback
> function (eal_alarm_callback) onto a list for that source and sets up a one-
> shot timer via kevent. Setting additional alarms links them onto the
> alarm_list, but each call also invokes rte_intr_callback_register() again,
> which links the callback function onto the source callback list again.
> 
> When the alarm triggers and eal_alarm_callback() gets called, it goes down
> the list of pending alarms, calling as many callback functions as possible and
> removing each one from the list until it reaches one which has not yet 
> expired.
> Once it's done, if alarm_list is not empty, it calls 
> rte_intr_callback_register(),
> which then links yet another callback onto the interrupt source's list, thus
> creating an infinite loop.
> 
> The problem is that the source callback list grows infinitely under this
> condition (essentially, when more than one alarm is queued). However, the
> call is necessary in order to reset the kevent timer.
> 
> The proposed fix recognizes and leverages the fact that an alarm interrupt in
> FreeBSD should never have more than one callback on its list, so if
Is your fix applicable only for FreeBSD?

> rte_intr_callback_register() is called with an interrupt handle type of
> RTE_INTR_HANDLE_ALARM, and if such an interrupt type already has a non-
> empty list, then a new callback is not created, but the kevent timer is
> restarted properly.
> 
> A much simpler change is possible if we don't mind the overhead of allocating,
> filling-in, linking, de-linking, and freeing a callback unnecessarily. This
> proposed change makes every attempt to avoid that overhead.


Re: [dpdk-dev] CPU hog & memory leak on FreeBSD

2020-08-03 Thread Lewis Donzis
Sure, that would be great.  This is from 18.11.9.

Also, the entire modified file is attached.

Thanks very much!
lew


84,86c84,86
<   struct rte_intr_callback *callback = NULL;
<   struct rte_intr_source *src = NULL;
<   int ret, add_event;
---
>   struct rte_intr_callback *callback;
>   struct rte_intr_source *src;
>   int ret, add_event = 0;
99,106c99
<   /* allocate a new interrupt callback entity */
<   callback = calloc(1, sizeof(*callback));
<   if (callback == NULL) {
<   RTE_LOG(ERR, EAL, "Can not allocate memory\n");
<   return -ENOMEM;
<   }
<   callback->cb_fn = cb;
<   callback->cb_arg = cb_arg;
---
>   rte_spinlock_lock(&intr_lock);
108,110c101
<   rte_spinlock_lock(&intr_lock);
< 
<   /* check if there is at least one callback registered for the fd */
---
>   /* find the source for this intr_handle */
112,115c103,105
<   if (src->intr_handle.fd == intr_handle->fd) {
<   /* we had no interrupts for this */
<   if (TAILQ_EMPTY(&src->callbacks))
<   add_event = 1;
---
>   if (src->intr_handle.fd == intr_handle->fd)
> break;
> }
117,121c107,121
<   TAILQ_INSERT_TAIL(&(src->callbacks), callback, next);
<   ret = 0;
<   break;
<   }
<   }
---
> /* If this is an alarm interrupt and it already has a callback, then 
> we don't want to create a new callback
>  * because the only thing on the list should be eal_alarm_callback() 
> and we may be called just to reset the timer.
>  */
> if (src != NULL && src->intr_handle.type == RTE_INTR_HANDLE_ALARM && 
> !TAILQ_EMPTY(&src->callbacks)) {
> callback = NULL;
> } else {
>   /* allocate a new interrupt callback entity */
>   callback = calloc(1, sizeof(*callback));
>   if (callback == NULL) {
>   RTE_LOG(ERR, EAL, "Can not allocate memory\n");
>   ret = -ENOMEM;
> goto fail;
>   }
>   callback->cb_fn = cb;
>   callback->cb_arg = cb_arg;
123,134c123,137
<   /* no existing callbacks for this - add new source */
<   if (src == NULL) {
<   src = calloc(1, sizeof(*src));
<   if (src == NULL) {
<   RTE_LOG(ERR, EAL, "Can not allocate memory\n");
<   ret = -ENOMEM;
<   goto fail;
<   } else {
<   src->intr_handle = *intr_handle;
<   TAILQ_INIT(&src->callbacks);
<   TAILQ_INSERT_TAIL(&(src->callbacks), callback, next);
<   TAILQ_INSERT_TAIL(&intr_sources, src, next);
---
> if (src == NULL) {
>   src = calloc(1, sizeof(*src));
>   if (src == NULL) {
>   RTE_LOG(ERR, EAL, "Can not allocate memory\n");
>   ret = -ENOMEM;
>   goto fail;
>   } else {
>   src->intr_handle = *intr_handle;
>   TAILQ_INIT(&src->callbacks);
>   TAILQ_INSERT_TAIL(&intr_sources, src, next);
>   }
> }
> 
>   /* we had no interrupts for this */
>   if (TAILQ_EMPTY(&src->callbacks))
136,137c139,140
<   ret = 0;
<   }
---
> 
>   TAILQ_INSERT_TAIL(&(src->callbacks), callback, next);
177c180
<   return ret;
---
>   return 0;
181c184,185
<   TAILQ_REMOVE(&(src->callbacks), callback, next);
---
> if (callback != NULL)
>   TAILQ_REMOVE(&(src->callbacks), callback, next);



- On Aug 3, 2020, at 6:22 PM, Honnappa Nagarahalli 
honnappa.nagaraha...@arm.com wrote:

> Hello,
>   I can take a look if you can post the patch.
> 
>> -Original Message-
>> From: dev  On Behalf Of Lewis Donzis
>> Sent: Monday, August 3, 2020 2:43 PM
>> To: dev@dpdk.org
>> Subject: [dpdk-dev] CPU hog & memory leak on FreeBSD
>> 
>> Hello.
>> 
>> We've posted about this before, but I'm hoping to find someone willing to
>> commit a patched version of lib/librte_eal/bsdapp/eal/eal_interrupts.c that
>> corrects a memory leak and 100% CPU hog that is immediately noticeable
>> with the i40e driver, at a minimum. We have modified this file to correct the
>> problem, and would be happy to provide it back to whomever is a committer
>> for this.
> If you can send a patch, I can take a look.
> 
>> 
>> The detailed explanation is:
>> 
>> Currently, setting an alarm with rte_eal_alarm_set() registers an alarm
>> interrupt by calling rte_intr_callback_register(), which lin

Re: [dpdk-dev] [PATCH v3 4/5] telemetry: implement empty stubs for Windows

2020-08-03 Thread Narcisa Ana Maria Vasile
On Mon, Aug 03, 2020 at 01:38:42PM +0300, Fady Bader wrote:
> Telemetry didn't compile under Windows.
> Empty stubs implementation was added for Windows.
> 
> Signed-off-by: Fady Bader 
> ---
>  lib/librte_telemetry/rte_telemetry.h|  4 +++
>  lib/librte_telemetry/telemetry.c| 52 
> -
>  lib/librte_telemetry/telemetry_legacy.c | 26 -
>  3 files changed, 80 insertions(+), 2 deletions(-)
> 
> +
> +#ifndef RTE_EXEC_ENV_WINDOWS
>  /* list of command callbacks, with one command registered by default */
>  static struct cmd_callback callbacks[TELEMETRY_MAX_CALLBACKS];
>  static int num_callbacks; /* How many commands are registered */
>  /* Used when accessing or modifying list of command callbacks */
>  static rte_spinlock_t callback_sl = RTE_SPINLOCK_INITIALIZER;
>  static uint16_t v2_clients;
> +int
Remove 'int'

> +#endif
>  
>  int
>  rte_telemetry_register_cmd(const char *cmd, telemetry_cb fn, const char 
> *help)
>  {
> +#ifndef RTE_EXEC_ENV_WINDOWS
> +
>   int i = 0;
>  
>   if (strlen(cmd) >= MAX_CMD_LEN || fn == NULL || cmd[0] != '/'
> @@ -76,8 +98,19 @@ rte_telemetry_register_cmd(const char *cmd, telemetry_cb 
> fn, const char *help)
>   rte_spinlock_unlock(&callback_sl);
>  
>   return 0;
> +
> +#else
> +
> + RTE_SET_USED(cmd);
> + RTE_SET_USED(fn);
> + RTE_SET_USED(help);
> +
> + return 0;
> +
> +#endif
Maybe add a comment like /* RTE_EXEC_ENV_WINDOWS */ next to each #else/#endif, 
to make it easier to follow what's being guarded out. Your guards are rather small,
so they're easy to follow, but when big chunks of code are guarded it's harder to 
tell what a guard was for (e.g. line 446, which is the end of a big 
#ifdef/#endif). 
>  }
>  
> +#ifndef RTE_EXEC_ENV_WINDOWS
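
The labelling convention being suggested here is simply to repeat the guarded
macro on the closing directives, so that a reader landing on a distant #else or
#endif can tell which condition it belongs to, e.g.:

#ifndef RTE_EXEC_ENV_WINDOWS
        /* full implementation */
#else /* RTE_EXEC_ENV_WINDOWS */
        /* Windows stub */
#endif /* RTE_EXEC_ENV_WINDOWS */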


Re: [dpdk-dev] [PATCH] app/testpmd: fix the default RSS key configuration

2020-08-03 Thread oulijun




On 2020/7/17 15:29, Phil Yang wrote:




Subject: [dpdk-dev] [PATCH] app/testpmd: fix the default RSS key
configuration

When a user runs a flow create cmd in testpmd to configure an RSS rule
whose rss action leaves the key unspecified, the flow is meant to get
the default RSS settings. However, testpmd sets the default RSS key
incorrectly when no RSS key is given in the flow create cmd.

Hi Lijun,

I think it works.
When we create an RSS flow rule which doesn't specify any 'rss-hash-key',

the 'rss-hash-key' will be updated with the default key.

Step 1:
testpmd> show port 0 rss-hash key
RSS functions:
   all ipv4-frag ipv4-other ipv6-frag ipv6-other ip
RSS key:


4439796BB54C5023B675EA5B124F9F30B8A2C03DDFDC4D02A08C9B334AF64A4C05C6FA343958D8557D99583AE138C92E81150366

Step 2:
testpmd> flow create 0 ingress pattern eth / ipv4 / udp / end actions rss

types ipv4-udp end queues end / end

Flow rule #0 created

Step 3:
testpmd> show port 0 rss-hash key
RSS functions:
   all ipv4-udp udp
RSS key:


74657374706D6427732064656661756C74205253532068617368206B65792C206F7665727269646520697420666F722062657474


Thanks,
Phil

Yes. However, it is not the default value that users see when testpmd
starts, which may mislead them. When the driver is initialized, the
default key used by the driver is presented to users, and that key
varies by DPDK vendor. However, if the user then runs the flow create
command without specifying an RSS key, the driver picks up another
default key value, which may differ from the default value the user
expects.

It needs a dummy key.
a4391f8bae85 - app/testpmd: set default RSS key as null

Hi, Phil Yang
   Yes, I've reviewed the patch ("a4391f8bae85 - app/testpmd: set 
default RSS key as null") and read all the other comments.
However, that patch was reverted in V2 and restored to the original 
state.  The link is as follows:

https://patches.dpdk.org/patch/47961/
I also don't think the key_len parameter is very useful
for users to configure.  Others say that further optimizations will be 
considered later. Do you have any plans?


Best regards, Lijun Ou
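
The behaviour being argued for here maps onto the rte_flow API fairly directly:
leaving the key out of the RSS action lets the PMD keep whatever key it chose
at initialization. A hedged sketch follows (how a given PMD treats a NULL key
is driver-dependent, which is exactly the point under discussion); it mirrors
the "flow create ... actions rss types ipv4-udp end queues end / end" command
used in the steps above.

#include <rte_ethdev.h>
#include <rte_flow.h>

/* Sketch only: request ipv4-udp RSS without overriding the device key. */
static struct rte_flow *
create_ipv4_udp_rss_flow(uint16_t port_id, struct rte_flow_error *error)
{
        struct rte_flow_attr attr = { .ingress = 1 };
        struct rte_flow_item pattern[] = {
                { .type = RTE_FLOW_ITEM_TYPE_ETH },
                { .type = RTE_FLOW_ITEM_TYPE_IPV4 },
                { .type = RTE_FLOW_ITEM_TYPE_UDP },
                { .type = RTE_FLOW_ITEM_TYPE_END },
        };
        struct rte_flow_action_rss rss = {
                .types = ETH_RSS_NONFRAG_IPV4_UDP,
                .key = NULL,            /* keep the PMD/device default key */
                .key_len = 0,
                .queue_num = 0,         /* empty list: all configured queues */
                .queue = NULL,
        };
        struct rte_flow_action actions[] = {
                { .type = RTE_FLOW_ACTION_TYPE_RSS, .conf = &rss },
                { .type = RTE_FLOW_ACTION_TYPE_END },
        };

        return rte_flow_create(port_id, &attr, pattern, actions, error);
}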



.






Re: [dpdk-dev] [PATCH V1 2/3] mbuf: change free_cb interface to adapt to GSO case

2020-08-03 Thread yang_y_yi
At 2020-08-03 20:34:25, "Olivier Matz"  wrote:
>On Mon, Aug 03, 2020 at 05:42:13PM +0800, yang_y_yi wrote:
>> At 2020-08-03 16:11:39, "Olivier Matz"  wrote:
>> >On Mon, Aug 03, 2020 at 09:26:40AM +0800, yang_y_yi wrote:
>> >> At 2020-08-03 04:29:07, "Olivier Matz"  wrote:
>> >> >Hi,
>> >> >
>> >> >On Sun, Aug 02, 2020 at 07:12:36AM +0800, yang_y_yi wrote:
>> >> >> 
>> >> >> 
>> >> >> At 2020-07-31 23:15:43, "Olivier Matz"  wrote:
>> >> >> >Hi,
>> >> >> >
>> >> >> >On Thu, Jul 30, 2020 at 08:08:59PM +0800, yang_y...@163.com wrote:
>> >> >> >> From: Yi Yang 
>> >> >> >> 
>> >> >> >> In GSO case, segmented mbufs are attached to original
>> >> >> >> mbuf which can't be freed when it is external. The issue
>> >> >> >> is free_cb doesn't know original mbuf and doesn't free
>> >> >> >> it when refcnt of shinfo is 0.
>> >> >> >> 
>> >> >> >> Original mbuf can be freed by rte_pktmbuf_free segmented
>> >> >> >> mbufs or by rte_pktmbuf_free original mbuf. Two kind of
>> >> >> >> cases should have different behaviors. free_cb won't
>> >> >> >> explicitly call rte_pktmbuf_free to free original mbuf
>> >> >> >> if it is freed by rte_pktmbuf_free original mbuf, but it
>> >> >> >> has to call rte_pktmbuf_free to free original mbuf if it
>> >> >> >> is freed by rte_pktmbuf_free segmented mbufs.
>> >> >> >> 
>> >> >> >> In order to fix this issue, free_cb interface has to been
>> >> >> >> changed, __rte_pktmbuf_free_extbuf must deliver called
>> >> >> >> mbuf pointer to free_cb, argument opaque can be defined
>> >> >> >> as a custom struct by user, it can includes original mbuf
>> >> >> >> pointer, user-defined free_cb can compare caller mbuf with
>> >> >> >> mbuf in opaque struct, free_cb should free original mbuf
>> >> >> >> if they are not same, this corresponds to rte_pktmbuf_free
>> >> >> >> segmented mbufs case, otherwise, free_cb won't free original
>> >> >> >> mbuf because the caller explicitly called rte_pktmbuf_free
>> >> >> >> to free it.
>> >> >> >> 
>> >> >> >> Here is pseduo code to show two kind of cases.
>> >> >> >> 
>> >> >> >> case 1. rte_pktmbuf_free segmented mbufs
>> >> >> >> 
>> >> >> >> nb_tx = rte_gso_segment(original_mbuf, /* original mbuf */
>> >> >> >>   &gso_ctx,
>> >> >> >>   /* segmented mbuf */
>> >> >> >>   (struct rte_mbuf **)&gso_mbufs,
>> >> >> >>   MAX_GSO_MBUFS);
>> >> >> >
>> >> >> >I'm sorry but it is not very clear to me what operations are done by
>> >> >> >rte_gso_segment().
>> >> >> >
>> >> >> >In the current code, I only see calls to rte_pktmbuf_attach(),
>> >> >> >which do not deal with external buffers. Am I missing something?
>> >> >> >
>> >> >> >Are you able to show the issue only with mbuf functions? It would
>> >> >> >be helpful to understand what does not work.
>> >> >> >
>> >> >> >
>> >> >> >Thanks,
>> >> >> >Olivier
>> >> >> >
>> >> >> Oliver, thank you for comment, let me show you why it doesn't work for 
>> >> >> my use case.  In OVS DPDK, VM uses vhostuserclient to send large 
>> >> >> packets whose size is about 64K because we enabled TSO & UFO, these 
>> >> >> large packets use rte_mbufs allocated by DPDK virtio_net function 
>> >> >> virtio_dev_pktmbuf_alloc() (in file lib/librte_vhost/virtio_net.c. 
>> >> >> Please refer to [PATCH V1 3/3], I changed free_cb as below, these 
>> >> >> packets use the same allocate function and the same free_cb no matter 
>> >> >> they are TCP packet or UDP packets, in case of VXLAN TSO, most NICs 
>> >> >> can't support inner UDP fragment offload, so OVS DPDK has to do it by 
>> >> >> software, for UDP case, the original rte_mbuf only can be freed by 
>> >> >> segmented rte_mbufs which are output packets of rte_gso_segment, i.e. 
>> >> >> the original rte_mbuf only can freed by free_cb, you can see, it 
>> >> >> explicitly called rte_pktmbuf_free(arg->mbuf), the condition statement 
>> >> >> "if (caller_m != arg->mbuf)" is true for this case, this has no 
>> >> >> problem, but for TCP case, the original mbuf is delivered to 
>> >> >> rte_eth_tx_burst() but not segmented rte_mbufs output by 
>> >> >> rte_gso_segment, PMD driver will call 
>> >> >> rte_pktmbuf_free(original_rte_mbuf) but not 
>> >> >> rte_pktmbuf_free(segmented_rte_mbufs), the same free_cb will be 
>> >> >> called, that means original_rte_mbuf will be freed twice, you know 
>> >> >> what will happen, this is just the issue I'm fixing. I bring in 
>> >> >> caller_m argument, it can help work around this because caller_m is 
>> >> >> arg->mbuf and the condition statement "if (caller_m != arg->mbuf)" is 
>> >> >> false, you can't fix it without the change this patch series did.
>> >> >
>> >> >I'm still not sure I get your issue. Please, if you have a simple test
>> >> >case using only mbufs functions (without virtio, gso, ...), it would be
>> >> >very helpful because we will be sure that we are talking about the same
>> >> >thing. In case there is an issue, it can easily become a unit test.
>>
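
To make the shape of the proposal easier to see, here is a compact sketch of
the free_cb being described. Note it assumes the extra caller-mbuf argument
that this series proposes; the current DPDK prototype is
free_cb(void *addr, void *opaque) and does not carry it. The struct name and
field are illustrative only.

#include <rte_mbuf.h>

/* Illustrative only: assumes the proposed free_cb(addr, opaque, caller_m)
 * prototype from this series, not the current rte_mbuf one. */
struct gso_free_arg {
        struct rte_mbuf *mbuf;  /* original (pre-segmentation) mbuf */
};

static void
gso_extbuf_free_cb(void *addr __rte_unused, void *opaque,
                   struct rte_mbuf *caller_m)
{
        struct gso_free_arg *arg = opaque;

        /*
         * Freed via one of the segmented mbufs: the original mbuf has not
         * been freed yet, so release it here.  Freed via the original mbuf
         * itself: caller_m == arg->mbuf and rte_pktmbuf_free() is already
         * running on it, so freeing it again would be a double free.
         */
        if (caller_m != arg->mbuf)
                rte_pktmbuf_free(arg->mbuf);
}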

Re: [dpdk-dev] [PATCH] doc: announce Vhost dequeue zero-copy removal

2020-08-03 Thread Xia, Chenbo


> -Original Message-
> From: Maxime Coquelin 
> Sent: Monday, August 3, 2020 10:56 PM
> To: dev@dpdk.org; ktray...@redhat.com; Stokes, Ian ;
> Xia, Chenbo ; Wang, Zhihong
> 
> Cc: Loftus, Ciara ; Maxime Coquelin
> 
> Subject: [PATCH] doc: announce Vhost dequeue zero-copy removal
> 
> Vhost-user dequeue zero-copy support will be removed in 20.11. The only
> known user is OVS where the feature is still experimental, and has not 
> received
> any update for several years. This feature faces reliability issues and is 
> often
> conflicting with new features being implemented.
> 
> Signed-off-by: Maxime Coquelin 
> ---
> 
> Hi, the topic was discussed during OVS-DPDK public meeting on July 22nd. Ian, 
> if
> you had time to discuss the topic with your team and agree with the removal,
> please ack the patch.
> If not, please let me know.
> 
>  doc/guides/rel_notes/deprecation.rst | 5 +
>  1 file changed, 5 insertions(+)
> 
> diff --git a/doc/guides/rel_notes/deprecation.rst
> b/doc/guides/rel_notes/deprecation.rst
> index ea4cfa7a48..a923849419 100644
> --- a/doc/guides/rel_notes/deprecation.rst
> +++ b/doc/guides/rel_notes/deprecation.rst
> @@ -151,3 +151,8 @@ Deprecation Notices
>Python 2 support will be completely removed in 20.11.
>In 20.08, explicit deprecation warnings will be displayed when running
>scripts with Python 2.
> +
> +* vhost: Vhost-user dequeue zero-copy support will be removed in 20.11. The
> +  only known user is OVS where the feature is still experimental, and has not
> +  received any update for 2.5 years. This feature faces reliability issues and
> +  is often conflicting with new features being implemented.
> --
> 2.26.2

Acked-by: Chenbo Xia 


[dpdk-dev] [PATCH v1] net/iavf: fix hash default set

2020-08-03 Thread Jeff Guo
Different devices have different hash capabilities, so it should not be
expected that every entry in the default hash set can be programmed into
every device. Remove the return checking when the default hash is set,
and remove GTP-U from the default hash set; iavf only enables hash for
general protocols by default.

Fixes: c94366cfc641 ("net/iavf: add GTPU in default hash")
Signed-off-by: Jeff Guo 
---
 drivers/net/iavf/iavf_hash.c | 23 ++-
 1 file changed, 2 insertions(+), 21 deletions(-)

diff --git a/drivers/net/iavf/iavf_hash.c b/drivers/net/iavf/iavf_hash.c
index e2eebd2d3..c06b52ea9 100644
--- a/drivers/net/iavf/iavf_hash.c
+++ b/drivers/net/iavf/iavf_hash.c
@@ -3639,18 +3639,6 @@ struct virtchnl_proto_hdrs *iavf_hash_default_hdrs[] = {
&hdrs_hint_ipv6_udp,
&hdrs_hint_ipv6_tcp,
&hdrs_hint_ipv6_sctp,
-   &hdrs_hint_ipv4_gtpu_ip,
-   &hdrs_hint_ipv4_udp_gtpu_ip,
-   &hdrs_hint_ipv4_tcp_gtpu_ip,
-   &hdrs_hint_ipv4_gtpu_eh,
-   &hdrs_hint_ipv4_udp_gtpu_eh,
-   &hdrs_hint_ipv4_tcp_gtpu_eh,
-   &hdrs_hint_ipv6_gtpu_ip,
-   &hdrs_hint_ipv6_udp_gtpu_ip,
-   &hdrs_hint_ipv6_tcp_gtpu_ip,
-   &hdrs_hint_ipv6_gtpu_eh,
-   &hdrs_hint_ipv6_udp_gtpu_eh,
-   &hdrs_hint_ipv6_tcp_gtpu_eh,
 };
 
 static struct iavf_flow_engine iavf_hash_engine = {
@@ -3676,7 +3664,6 @@ iavf_hash_default_set(struct iavf_adapter *ad, bool add)
 {
struct virtchnl_rss_cfg *rss_cfg;
uint16_t i;
-   int ret;
 
rss_cfg = rte_zmalloc("iavf rss rule",
  sizeof(struct virtchnl_rss_cfg), 0);
@@ -3687,16 +3674,10 @@ iavf_hash_default_set(struct iavf_adapter *ad, bool add)
rss_cfg->proto_hdrs = *iavf_hash_default_hdrs[i];
rss_cfg->rss_algorithm = VIRTCHNL_RSS_ALG_TOEPLITZ_ASYMMETRIC;
 
-   ret = iavf_add_del_rss_cfg(ad, rss_cfg, add);
-   if (ret) {
-   PMD_DRV_LOG(ERR, "fail to %s RSS configure",
-   add ? "add" : "delete");
-   rte_free(rss_cfg);
-   return ret;
-   }
+   iavf_add_del_rss_cfg(ad, rss_cfg, add);
}
 
-   return ret;
+   return 0;
 }
 
 RTE_INIT(iavf_hash_engine_init)
-- 
2.20.1



Re: [dpdk-dev] [PATCH] doc: announce API change in timer

2020-08-03 Thread Sarosh Arif
Thank you Eric, I will fix the mistakes in v2

On Tue, Aug 4, 2020 at 4:16 AM Honnappa Nagarahalli
 wrote:
>
> 
>
> >
> > If the user tries to reset/stop some other timer in it's callback function, 
> > which
> Is there any use case for this? Why not just document that the user is 
> not allowed to reset some other timer in the callback function? How does the 
> user get access to some other timer in the call back function?
> Not sure if this was discussed earlier, I might have missed.
The issue is more clearly described in bug 491 here is a link:
https://bugs.dpdk.org/show_bug.cgi?id=491
further discussion on this issue was done on the following patch:
https://patches.dpdk.org/patch/73683/

>
> > is also about to expire, using rte_timer_reset_sync/rte_timer_stop_sync the
> > application goes into an infinite loop. This happens because
> > rte_timer_reset_sync/rte_timer_stop_sync loop until the timer resets/stops
> > and there is check inside timer_set_config_state which prevents a running
> > timer from being reset/stopped by not it's own timer_cb. Therefore
> > timer_set_config_state returns -1 due to which rte_timer_reset returns -1 
> > and
> > rte_timer_reset_sync goes into an infinite loop
> >
> > To to prevent this rte_timer_reset_sync and rte_timer_stop_sync should have
> > int return types, so that -1 can be returned if the above condition occurs
> >
> > Signed-off-by: Sarosh Arif 
> > ---
> >  doc/guides/rel_notes/deprecation.rst | 6 ++
> >  1 file changed, 6 insertions(+)
> >
> > diff --git a/doc/guides/rel_notes/deprecation.rst
> > b/doc/guides/rel_notes/deprecation.rst
> > index ea4cfa7a4..ed93a707d 100644
> > --- a/doc/guides/rel_notes/deprecation.rst
> > +++ b/doc/guides/rel_notes/deprecation.rst
> > @@ -151,3 +151,9 @@ Deprecation Notices
> >Python 2 support will be completely removed in 20.11.
> >In 20.08, explicit deprecation warnings will be displayed when running
> >scripts with Python 2.
> > +
> > +* timer: Since timer can get stuck in an infinite loop if the
> > +application tries to
> > +  reset/stop some other timer in it's callback function, which is also
> > +about to
> > +  expire. The function ``rte_timer_stop_sync`` and
> > +``rte_timer_stop_sync``  will
> > +  have a int return type in order to return with -1 in when this
> > +condition
> > +  occures.
> > --
> > 2.17.1
>


[dpdk-dev] [PATCH v2] doc: announce API change in timer

2020-08-03 Thread Sarosh Arif
If the user tries to reset/stop some other timer in its callback
function, and that timer is also about to expire, using
rte_timer_reset_sync/rte_timer_stop_sync the application goes into
an infinite loop. This happens because
rte_timer_reset_sync/rte_timer_stop_sync loop until the timer
resets/stops, and there is a check inside timer_set_config_state which
prevents a running timer from being reset/stopped by anything other
than its own timer_cb. Therefore timer_set_config_state returns -1,
rte_timer_reset in turn returns -1, and rte_timer_reset_sync loops
forever.

To prevent this, rte_timer_reset_sync and rte_timer_stop_sync should
have int return types, so that -1 can be returned if the above
condition occurs.

Signed-off-by: Sarosh Arif 
---
v2: rephrase and fix typo
---
 doc/guides/rel_notes/deprecation.rst | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/doc/guides/rel_notes/deprecation.rst 
b/doc/guides/rel_notes/deprecation.rst
index ea4cfa7a4..227950165 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -151,3 +151,9 @@ Deprecation Notices
   Python 2 support will be completely removed in 20.11.
   In 20.08, explicit deprecation warnings will be displayed when running
   scripts with Python 2.
+
+* timer: Timers can get stuck in an infinite loop if their callback tries to
+  synchronously reset/stop some other timer that is also about to expire. The
+  functions ``rte_timer_reset_sync`` and ``rte_timer_stop_sync`` will be 
updated
+  with an int return type so that an error code can be returned when this
+  condition occurs.
-- 
2.17.1
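
For readers skimming the notice, the pattern it is about looks roughly like the
sketch below: a timer callback that synchronously stops a different timer. With
the current void-returning API the call can spin forever when that other timer
is itself about to expire on another lcore; the change announced here would let
it return an error instead. The names below are illustrative only.

#include <rte_common.h>
#include <rte_timer.h>

static struct rte_timer other_tim;      /* another armed timer */

/* Sketch of the problematic pattern described above. */
static void
my_timer_cb(struct rte_timer *tim __rte_unused, void *arg __rte_unused)
{
        /* If other_tim is also about to expire on another lcore, this call
         * may never return with today's API; after the announced change it
         * would return -1 so the callback could back off and retry later. */
        rte_timer_stop_sync(&other_tim);
}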



Re: [dpdk-dev] [PATCH] doc: announce removal of ethdev flow director API

2020-08-03 Thread Jerin Jacob
On Mon, Aug 3, 2020 at 11:28 PM Ajit Khaparde
 wrote:
>
> On Mon, Aug 3, 2020 at 10:49 AM Thomas Monjalon  wrote:
>
> > The flow director config, part of the legacy filtering API,
> > was marked as deprecated last year.
> > A separate notice is added to make clear that these specific structs
> > will be removed as well in DPDK 20.11, as the rest of the legacy
> > filtering API.
> >
> > Signed-off-by: Thomas Monjalon 
> >
> Acked-by: Ajit Khaparde 

Acked-by: Jerin Jacob 



>
>
> > ---
> >  doc/guides/rel_notes/deprecation.rst | 4 
> >  1 file changed, 4 insertions(+)
> >
> > diff --git a/doc/guides/rel_notes/deprecation.rst
> > b/doc/guides/rel_notes/deprecation.rst
> > index ea4cfa7a48..24474c563f 100644
> > --- a/doc/guides/rel_notes/deprecation.rst
> > +++ b/doc/guides/rel_notes/deprecation.rst
> > @@ -84,6 +84,10 @@ Deprecation Notices
> >Target release for removal of the legacy API will be defined once most
> >PMDs have switched to rte_flow.
> >
> > +* ethdev: The flow director API, including ``rte_eth_conf.fdir_conf``
> > field,
> > +  and the related structures (``rte_fdir_*`` and ``rte_eth_fdir_*``),
> > +  will be removed in DPDK 20.11.
> > +
> >  * ethdev: Update API functions returning ``void`` to return ``int`` with
> >negative errno values to indicate various error conditions (e.g.
> >invalid port ID, unsupported operation, failed operation):
> > --
> > 2.27.0
> >
> >


Re: [dpdk-dev] [PATCH v1] net/iavf: fix hash default set

2020-08-03 Thread Zhang, Qi Z



> -Original Message-
> From: Guo, Jia 
> Sent: Tuesday, August 4, 2020 10:59 AM
> To: Zhang, Qi Z ; Wu, Jingjing ;
> Xing, Beilei 
> Cc: dev@dpdk.org; Guo, Junfeng ; Su, Simei
> ; Guo, Jia 
> Subject: [PATCH v1] net/iavf: fix hash default set
> 
> Different device has different hash capability, it should not be expected 
> that all
> hash set would be successful to set into all devices by default. So remove the
> return checking when hash default set. And remove gtpu hash default set, iavf
> only enable hash for general protocols.
> 
> Fixes: c94366cfc641 ("net/iavf: add GTPU in default hash")
> Signed-off-by: Jeff Guo 

Acked-by: Qi Zhang 

Applied to dpdk-next-net-intel.

Thanks
Qi



[dpdk-dev] [PATCH v2 0/3] add AVF RSS support for IPv6 prefix

2020-08-03 Thread Junfeng Guo
RSS on IPv6 prefix fields is supported in this patchset, so that
prefixes can be used instead of the full IPv6 addresses for AVF RSS.
The prefix here covers the first 64 bits of both the SRC and DST IPv6
addresses.

v2:
* add support RSS for GTPU IPv6 prefix 64bit.

[PATCH v2 1/3] common/iavf: support virtual channel for IPv6 prefix.
[PATCH v2 2/3] net/iavf: support RSS for IPv6 prefix 64bit.
[PATCH v2 3/3] net/iavf: support RSS for GTPU IPv6 prefix 64bit.

Junfeng Guo (3):
  common/iavf: support virtual channel for IPv6 prefix
  net/iavf: support RSS for IPv6 prefix 64bit
  net/iavf: support RSS for GTPU IPv6 prefix 64bit

 drivers/common/iavf/virtchnl.h |   13 +
 drivers/net/iavf/iavf_hash.c   | 1831 +++-
 2 files changed, 1597 insertions(+), 247 deletions(-)

-- 
2.25.1
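
On the application side, these virtchnl hints line up with the IPv6-prefix RSS
type flags added to ethdev around the same time. Assuming the ETH_RSS_L3_PRE64
flag from that work, selecting a 64-bit prefix hash is just a matter of the
types mask in the usual RSS flow action, for example:

#include <rte_ethdev.h>
#include <rte_flow.h>

/* Hedged sketch: hash on the first 64 bits of the IPv6 source/destination
 * addresses.  ETH_RSS_L3_PRE64 is assumed from the matching ethdev work;
 * actual support depends on the PMD and device. */
static const struct rte_flow_action_rss ipv6_pre64_rss = {
        .types = ETH_RSS_IPV6 | ETH_RSS_L3_PRE64,
        .key = NULL,            /* keep the device default key */
        .key_len = 0,
        .queue_num = 0,         /* empty list: all configured queues */
        .queue = NULL,
};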



[dpdk-dev] [PATCH v2 1/3] common/iavf: support virtual channel for IPv6 prefix

2020-08-03 Thread Junfeng Guo
Some IPv6 prefix related protocol header fields are defined in this
patch, so that we can use prefix instead of full IPv6 address for RSS.
Ref https://tools.ietf.org/html/rfc6052.

Signed-off-by: Junfeng Guo 
---
 drivers/common/iavf/virtchnl.h | 13 +
 1 file changed, 13 insertions(+)

diff --git a/drivers/common/iavf/virtchnl.h b/drivers/common/iavf/virtchnl.h
index 79515ee8b..424e18a11 100644
--- a/drivers/common/iavf/virtchnl.h
+++ b/drivers/common/iavf/virtchnl.h
@@ -881,6 +881,19 @@ enum virtchnl_proto_hdr_field {
VIRTCHNL_PROTO_HDR_IPV6_TC,
VIRTCHNL_PROTO_HDR_IPV6_HOP_LIMIT,
VIRTCHNL_PROTO_HDR_IPV6_PROT,
+   /* IPV6 Prefix */
+   VIRTCHNL_PROTO_HDR_IPV6_PREFIX32_SRC,
+   VIRTCHNL_PROTO_HDR_IPV6_PREFIX32_DST,
+   VIRTCHNL_PROTO_HDR_IPV6_PREFIX40_SRC,
+   VIRTCHNL_PROTO_HDR_IPV6_PREFIX40_DST,
+   VIRTCHNL_PROTO_HDR_IPV6_PREFIX48_SRC,
+   VIRTCHNL_PROTO_HDR_IPV6_PREFIX48_DST,
+   VIRTCHNL_PROTO_HDR_IPV6_PREFIX56_SRC,
+   VIRTCHNL_PROTO_HDR_IPV6_PREFIX56_DST,
+   VIRTCHNL_PROTO_HDR_IPV6_PREFIX64_SRC,
+   VIRTCHNL_PROTO_HDR_IPV6_PREFIX64_DST,
+   VIRTCHNL_PROTO_HDR_IPV6_PREFIX96_SRC,
+   VIRTCHNL_PROTO_HDR_IPV6_PREFIX96_DST,
/* TCP */
VIRTCHNL_PROTO_HDR_TCP_SRC_PORT =
PROTO_HDR_FIELD_START(VIRTCHNL_PROTO_HDR_TCP),
-- 
2.25.1



[dpdk-dev] [PATCH v2 2/3] net/iavf: support RSS for IPv6 prefix 64bit

2020-08-03 Thread Junfeng Guo
RSS for IPv6 prefix 64bit fields are supported in this patch, so that
we can use prefix instead of full IPv6 address for RSS. The prefix
here only includes the first 64 bits of both SRC and DST IPv6 address.

Signed-off-by: Junfeng Guo 
---
 drivers/net/iavf/iavf_hash.c | 232 +++
 1 file changed, 232 insertions(+)

diff --git a/drivers/net/iavf/iavf_hash.c b/drivers/net/iavf/iavf_hash.c
index e2eebd2d3..3dc96d0f6 100644
--- a/drivers/net/iavf/iavf_hash.c
+++ b/drivers/net/iavf/iavf_hash.c
@@ -445,6 +445,41 @@ static struct iavf_pattern_match_item 
iavf_hash_pattern_list[] = {
FIELD_SELECTOR(VIRTCHNL_PROTO_HDR_IPV6_DST) | \
FIELD_SELECTOR(VIRTCHNL_PROTO_HDR_IPV6_PROT), {BUFF_NOUSED } }
 
+/* IPV6 Prefix 64 for L3 */
+#define proto_hint_ipv6_pre64 { \
+   VIRTCHNL_PROTO_HDR_IPV6, \
+   FIELD_SELECTOR(VIRTCHNL_PROTO_HDR_IPV6_PREFIX64_SRC) | \
+   FIELD_SELECTOR(VIRTCHNL_PROTO_HDR_IPV6_PREFIX64_DST), {BUFF_NOUSED } }
+
+#define proto_hint_ipv6_pre64_src { \
+   VIRTCHNL_PROTO_HDR_IPV6, \
+   FIELD_SELECTOR(VIRTCHNL_PROTO_HDR_IPV6_PREFIX64_SRC), {BUFF_NOUSED } }
+
+#define proto_hint_ipv6_pre64_dst { \
+   VIRTCHNL_PROTO_HDR_IPV6, \
+   FIELD_SELECTOR(VIRTCHNL_PROTO_HDR_IPV6_PREFIX64_DST), {BUFF_NOUSED } }
+
+/* IPV6 Prefix 64 for L4 */
+#define proto_hint_ipv6_pre64_prot { \
+   VIRTCHNL_PROTO_HDR_IPV6, \
+   FIELD_SELECTOR(VIRTCHNL_PROTO_HDR_IPV6_PREFIX64_SRC) | \
+   FIELD_SELECTOR(VIRTCHNL_PROTO_HDR_IPV6_PREFIX64_DST) | \
+   FIELD_SELECTOR(VIRTCHNL_PROTO_HDR_IPV6_PROT), {BUFF_NOUSED } }
+
+#define proto_hint_ipv6_pre64_src_prot { \
+   VIRTCHNL_PROTO_HDR_IPV6, \
+   FIELD_SELECTOR(VIRTCHNL_PROTO_HDR_IPV6_PREFIX64_SRC) | \
+   FIELD_SELECTOR(VIRTCHNL_PROTO_HDR_IPV6_PROT), {BUFF_NOUSED } }
+
+#define proto_hint_ipv6_pre64_dst_prot { \
+   VIRTCHNL_PROTO_HDR_IPV6, \
+   FIELD_SELECTOR(VIRTCHNL_PROTO_HDR_IPV6_PREFIX64_DST) | \
+   FIELD_SELECTOR(VIRTCHNL_PROTO_HDR_IPV6_PROT), {BUFF_NOUSED } }
+
+#define proto_hint_ipv6_pre64_only_prot { \
+   VIRTCHNL_PROTO_HDR_IPV6, \
+   FIELD_SELECTOR(VIRTCHNL_PROTO_HDR_IPV6_PROT), {BUFF_NOUSED } }
+
 #define proto_hint_gtpu_ip_teid { \
VIRTCHNL_PROTO_HDR_GTPU_IP, \
FIELD_SELECTOR(VIRTCHNL_PROTO_HDR_GTPU_IP_TEID), {BUFF_NOUSED } }
@@ -1999,6 +2034,139 @@ struct virtchnl_proto_hdrs hdrs_hint_ipv6_udp_esp = {
proto_hint_udp_only, proto_hint_esp }
 };
 
+/* IPV6 Prefix 64 */
+struct virtchnl_proto_hdrs hdrs_hint_ipv6_pre64 = {
+   TUNNEL_LEVEL_OUTER, PROTO_COUNT_ONE, {proto_hint_ipv6_pre64 }
+};
+
+struct virtchnl_proto_hdrs hdrs_hint_ipv6_pre64_src = {
+   TUNNEL_LEVEL_OUTER, PROTO_COUNT_ONE, {proto_hint_ipv6_pre64_src }
+};
+
+struct virtchnl_proto_hdrs hdrs_hint_ipv6_pre64_dst = {
+   TUNNEL_LEVEL_OUTER, PROTO_COUNT_ONE, {proto_hint_ipv6_pre64_dst }
+};
+
+struct virtchnl_proto_hdrs hdrs_hint_ipv6_pre64_prot = {
+   TUNNEL_LEVEL_OUTER, PROTO_COUNT_ONE, {proto_hint_ipv6_pre64_prot }
+};
+
+struct virtchnl_proto_hdrs hdrs_hint_ipv6_pre64_src_prot = {
+   TUNNEL_LEVEL_OUTER, PROTO_COUNT_ONE, {proto_hint_ipv6_pre64_src_prot }
+};
+
+struct virtchnl_proto_hdrs hdrs_hint_ipv6_pre64_dst_prot = {
+   TUNNEL_LEVEL_OUTER, PROTO_COUNT_ONE, {proto_hint_ipv6_pre64_dst_prot }
+};
+
+/* IPV6 Prefix 64 UDP */
+struct virtchnl_proto_hdrs hdrs_hint_ipv6_pre64_udp = {
+   TUNNEL_LEVEL_OUTER, PROTO_COUNT_TWO, {proto_hint_ipv6_pre64_prot,
+   proto_hint_udp }
+};
+
+struct virtchnl_proto_hdrs hdrs_hint_ipv6_pre64_src_udp_src_port = {
+   TUNNEL_LEVEL_OUTER, PROTO_COUNT_TWO, {proto_hint_ipv6_pre64_src_prot,
+   proto_hint_udp_src_port }
+};
+
+struct virtchnl_proto_hdrs hdrs_hint_ipv6_pre64_src_udp_dst_port = {
+   TUNNEL_LEVEL_OUTER, PROTO_COUNT_TWO, {proto_hint_ipv6_pre64_src_prot,
+   proto_hint_udp_dst_port }
+};
+
+struct virtchnl_proto_hdrs hdrs_hint_ipv6_pre64_dst_udp_src_port = {
+   TUNNEL_LEVEL_OUTER, PROTO_COUNT_TWO, {proto_hint_ipv6_pre64_dst_prot,
+   proto_hint_udp_src_port }
+};
+
+struct virtchnl_proto_hdrs hdrs_hint_ipv6_pre64_dst_udp_dst_port = {
+   TUNNEL_LEVEL_OUTER, PROTO_COUNT_TWO, {proto_hint_ipv6_pre64_dst_prot,
+   proto_hint_udp_dst_port }
+};
+
+struct virtchnl_proto_hdrs hdrs_hint_ipv6_pre64_udp_src_port = {
+   TUNNEL_LEVEL_OUTER, PROTO_COUNT_TWO, {proto_hint_ipv6_pre64_only_prot,
+   proto_hint_udp_src_port }
+};
+
+struct virtchnl_proto_hdrs hdrs_hint_ipv6_pre64_udp_dst_port = {
+   TUNNEL_LEVEL_OUTER, PROTO_COUNT_TWO, {proto_hint_ipv6_pre64_only_prot,
+   proto_hint_udp_dst_port }
+};
+
+/* IPV6 Prefix 64 TCP */
+struct virtchnl_proto_hdrs hdrs_hint_ipv6_pre64_tcp = {
+   TUNNEL_LEVEL_OUTER, PROTO_COUNT_TWO, {proto_hint_ipv6_pre64_prot,
+   proto_hint_tcp }
+};
+
+struct virtchnl_proto_hdrs hdrs_hint_ipv6_pre64_src_tcp_src_port = {
+   TUNNEL_LEVEL_OUTER, PROTO_COUNT_TWO, {proto_hint_ipv6_pre64_src_prot,
+   proto_hint_tcp_

[dpdk-dev] [PATCH v2 3/3] net/iavf: support RSS for GTPU IPv6 prefix 64bit

2020-08-03 Thread Junfeng Guo
RSS for GTP-U IPv6 prefix 64bit fields are supported in this patch.
The prefix here includes the first 64 bits of both SRC and DST inner
IPv6 address for GTP-U.

Signed-off-by: Junfeng Guo 
---
 drivers/net/iavf/iavf_hash.c | 1119 +-
 1 file changed, 1112 insertions(+), 7 deletions(-)

diff --git a/drivers/net/iavf/iavf_hash.c b/drivers/net/iavf/iavf_hash.c
index 3dc96d0f6..7df45630c 100644
--- a/drivers/net/iavf/iavf_hash.c
+++ b/drivers/net/iavf/iavf_hash.c
@@ -2167,6 +2167,439 @@ struct virtchnl_proto_hdrs 
hdrs_hint_ipv6_pre64_sctp_dst_port = {
proto_hint_sctp_dst_port }
 };
 
+/* GTPU + inner IPV6 Prefix 64 */
+/* GTPU IP + inner IPV6 Prefix 64 */
+struct virtchnl_proto_hdrs hdrs_hint_ipv6_pre64_src_gtpu_ip = {
+   TUNNEL_LEVEL_FIRST_INNER, PROTO_COUNT_TWO, {proto_hint_gtpu_ip_only,
+   proto_hint_ipv6_pre64_src }
+};
+
+struct virtchnl_proto_hdrs hdrs_hint_ipv6_pre64_dst_gtpu_ip = {
+   TUNNEL_LEVEL_FIRST_INNER, PROTO_COUNT_TWO, {proto_hint_gtpu_ip_only,
+   proto_hint_ipv6_pre64_dst }
+};
+
+struct virtchnl_proto_hdrs hdrs_hint_ipv6_pre64_gtpu_ip = {
+   TUNNEL_LEVEL_FIRST_INNER, PROTO_COUNT_TWO, {proto_hint_gtpu_ip_only,
+   proto_hint_ipv6_pre64 }
+};
+
+/* GTPU IP + inner IPV6 Prefix 64 UDP */
+struct virtchnl_proto_hdrs hdrs_hint_ipv6_pre64_udp_gtpu_ip = {
+   TUNNEL_LEVEL_FIRST_INNER, PROTO_COUNT_THREE, {proto_hint_gtpu_ip_only,
+   proto_hint_ipv6_pre64_prot, proto_hint_udp }
+};
+
+struct virtchnl_proto_hdrs hdrs_hint_ipv6_pre64_src_udp_src_gtpu_ip = {
+   TUNNEL_LEVEL_FIRST_INNER, PROTO_COUNT_THREE, {proto_hint_gtpu_ip_only,
+   proto_hint_ipv6_pre64_src_prot, proto_hint_udp_src_port}
+};
+
+struct virtchnl_proto_hdrs hdrs_hint_ipv6_pre64_src_udp_dst_gtpu_ip = {
+   TUNNEL_LEVEL_FIRST_INNER, PROTO_COUNT_THREE, {proto_hint_gtpu_ip_only,
+   proto_hint_ipv6_pre64_src_prot, proto_hint_udp_dst_port}
+};
+
+struct virtchnl_proto_hdrs hdrs_hint_ipv6_pre64_dst_udp_src_gtpu_ip = {
+   TUNNEL_LEVEL_FIRST_INNER, PROTO_COUNT_THREE, {proto_hint_gtpu_ip_only,
+   proto_hint_ipv6_pre64_src_prot, proto_hint_udp_src_port}
+};
+
+struct virtchnl_proto_hdrs hdrs_hint_ipv6_pre64_dst_udp_dst_gtpu_ip = {
+   TUNNEL_LEVEL_FIRST_INNER, PROTO_COUNT_THREE, {proto_hint_gtpu_ip_only,
+   proto_hint_ipv6_pre64_src_prot, proto_hint_udp_dst_port}
+};
+
+struct virtchnl_proto_hdrs hdrs_hint_ipv6_pre64_src_udp_gtpu_ip = {
+   TUNNEL_LEVEL_FIRST_INNER, PROTO_COUNT_THREE, {proto_hint_gtpu_ip_only,
+   proto_hint_ipv6_pre64_src_prot, proto_hint_udp_only }
+};
+
+struct virtchnl_proto_hdrs hdrs_hint_ipv6_pre64_dst_udp_gtpu_ip = {
+   TUNNEL_LEVEL_FIRST_INNER, PROTO_COUNT_THREE, {proto_hint_gtpu_ip_only,
+   proto_hint_ipv6_pre64_dst_prot, proto_hint_udp_only }
+};
+
+struct virtchnl_proto_hdrs hdrs_hint_ipv6_pre64_udp_src_gtpu_ip = {
+   TUNNEL_LEVEL_FIRST_INNER, PROTO_COUNT_THREE, {proto_hint_gtpu_ip_only,
+   proto_hint_ipv6_pre64_only_prot, proto_hint_udp_src_port}
+};
+
+struct virtchnl_proto_hdrs hdrs_hint_ipv6_pre64_udp_dst_gtpu_ip = {
+   TUNNEL_LEVEL_FIRST_INNER, PROTO_COUNT_THREE, {proto_hint_gtpu_ip_only,
+   proto_hint_ipv6_pre64_only_prot, proto_hint_udp_dst_port}
+};
+
+/* GTPU IP + inner IPV6 Prefix 64 TCP */
+struct virtchnl_proto_hdrs hdrs_hint_ipv6_pre64_tcp_gtpu_ip = {
+   TUNNEL_LEVEL_FIRST_INNER, PROTO_COUNT_THREE, {proto_hint_gtpu_ip_only,
+   proto_hint_ipv6_pre64_prot, proto_hint_tcp }
+};
+
+struct virtchnl_proto_hdrs hdrs_hint_ipv6_pre64_src_tcp_src_gtpu_ip = {
+   TUNNEL_LEVEL_FIRST_INNER, PROTO_COUNT_THREE, {proto_hint_gtpu_ip_only,
+   proto_hint_ipv6_pre64_src_prot, proto_hint_tcp_src_port}
+};
+
+struct virtchnl_proto_hdrs hdrs_hint_ipv6_pre64_src_tcp_dst_gtpu_ip = {
+   TUNNEL_LEVEL_FIRST_INNER, PROTO_COUNT_THREE, {proto_hint_gtpu_ip_only,
+   proto_hint_ipv6_pre64_src_prot, proto_hint_tcp_dst_port}
+};
+
+struct virtchnl_proto_hdrs hdrs_hint_ipv6_pre64_dst_tcp_src_gtpu_ip = {
+   TUNNEL_LEVEL_FIRST_INNER, PROTO_COUNT_THREE, {proto_hint_gtpu_ip_only,
+   proto_hint_ipv6_pre64_src_prot, proto_hint_tcp_src_port}
+};
+
+struct virtchnl_proto_hdrs hdrs_hint_ipv6_pre64_dst_tcp_dst_gtpu_ip = {
+   TUNNEL_LEVEL_FIRST_INNER, PROTO_COUNT_THREE, {proto_hint_gtpu_ip_only,
+   proto_hint_ipv6_pre64_src_prot, proto_hint_tcp_dst_port}
+};
+
+struct virtchnl_proto_hdrs hdrs_hint_ipv6_pre64_src_tcp_gtpu_ip = {
+   TUNNEL_LEVEL_FIRST_INNER, PROTO_COUNT_THREE, {proto_hint_gtpu_ip_only,
+   proto_hint_ipv6_pre64_src_prot, proto_hint_tcp_only }
+};
+
+struct virtchnl_proto_hdrs hdrs_hint_ipv6_pre64_dst_tcp_gtpu_ip = {
+   TUNNEL_LEVEL_FIRST_INNER, PROTO_COUNT_THREE, {proto_hint_gtpu_ip_only,
+   proto_hint_ipv6_pre64_dst_prot, proto_hint_tcp_only }
+};
+
+struct virtchnl_proto_hdrs hdrs_hint_ipv6_pre64_tcp_src_gtpu_ip = {
+   TUNNEL_LEVEL_FIRST_INNER, PROTO_COUNT_THREE, {proto_hint_gtpu_ip_only,
+   proto_hint_ipv6_pre6

Re: [dpdk-dev] [PATCH v3 4/5] telemetry: implement empty stubs for Windows

2020-08-03 Thread Fady Bader



> -Original Message-
> From: Narcisa Ana Maria Vasile 
> Sent: Tuesday, August 4, 2020 4:27 AM
> To: Fady Bader 
> Cc: dev@dpdk.org; Thomas Monjalon ; Tasnim Bashar
> ; Tal Shnaiderman ; Yohad Tor
> ; dmitry.kozl...@gmail.com;
> harini.ramakrish...@microsoft.com; ocard...@microsoft.com;
> pallavi.ka...@intel.com; ranjit.me...@intel.com; kevin.la...@intel.com;
> ferruh.yi...@intel.com; arybche...@solarflare.com; ciara.po...@intel.com
> Subject: Re: [PATCH v3 4/5] telemetry: implement empty stubs for Windows
> 
> On Mon, Aug 03, 2020 at 01:38:42PM +0300, Fady Bader wrote:
> > Telemetry didn't compile under Windows.
> > Empty stubs implementation was added for Windows.
> >
> > Signed-off-by: Fady Bader 
> > ---
> >  lib/librte_telemetry/rte_telemetry.h|  4 +++
> >  lib/librte_telemetry/telemetry.c| 52
> -
> >  lib/librte_telemetry/telemetry_legacy.c | 26 -
> >  3 files changed, 80 insertions(+), 2 deletions(-)
> >
> > +
> > +#ifndef RTE_EXEC_ENV_WINDOWS
> >  /* list of command callbacks, with one command registered by default
> > */  static struct cmd_callback callbacks[TELEMETRY_MAX_CALLBACKS];
> >  static int num_callbacks; /* How many commands are registered */
> >  /* Used when accessing or modifying list of command callbacks */
> > static rte_spinlock_t callback_sl = RTE_SPINLOCK_INITIALIZER;  static
> > uint16_t v2_clients;
> > +int
> Remove 'int'
> 

OK.

> > +#endif
> >
> >  int
> >  rte_telemetry_register_cmd(const char *cmd, telemetry_cb fn, const
> > char *help)  {
> > +#ifndef RTE_EXEC_ENV_WINDOWS
> > +
> > int i = 0;
> >
> > if (strlen(cmd) >= MAX_CMD_LEN || fn == NULL || cmd[0] != '/'
> > @@ -76,8 +98,19 @@ rte_telemetry_register_cmd(const char *cmd,
> telemetry_cb fn, const char *help)
> > rte_spinlock_unlock(&callback_sl);
> >
> > return 0;
> > +
> > +#else
> > +
> > +   RTE_SET_USED(cmd);
> > +   RTE_SET_USED(fn);
> > +   RTE_SET_USED(help);
> > +
> > +   return 0;
> > +
> > +#endif
> Maybe add a comment like /* RTE_EXEC_ENV_WINDOWS */ next to each
> #else/#endif, to be easier to follow what's being guarded out. Your guards are
> rather small, so it's easy to follow, but sometimes when big chunks of code 
> are
> being guarded, it's harder to tell what the guard was for (e.g. line 446 it's 
> the end
> of a big #ifdef/#endif).

Ok, I'll send a new patch soon.

> >  }
> >
> > +#ifndef RTE_EXEC_ENV_WINDOWS


[dpdk-dev] [PATCH v4 1/5] eal: added interrupts empty stubs

2020-08-03 Thread Fady Bader
The ethdev lib uses interrupts. Interrupts are not implemented for Windows.
To solve this, empty interrupt stubs were added under Windows.

Signed-off-by: Fady Bader 
---
 lib/librte_eal/windows/eal_interrupts.c | 17 +
 lib/librte_eal/windows/meson.build  |  1 +
 2 files changed, 18 insertions(+)
 create mode 100644 lib/librte_eal/windows/eal_interrupts.c

diff --git a/lib/librte_eal/windows/eal_interrupts.c 
b/lib/librte_eal/windows/eal_interrupts.c
new file mode 100644
index 00..1e3c6d20d2
--- /dev/null
+++ b/lib/librte_eal/windows/eal_interrupts.c
@@ -0,0 +1,17 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2020 Mellanox Technologies, Ltd
+ */
+#include 
+
+int
+rte_intr_rx_ctl(struct rte_intr_handle *intr_handle,
+   int epfd, int op, unsigned int vec, void *data)
+{
+   RTE_SET_USED(intr_handle);
+   RTE_SET_USED(epfd);
+   RTE_SET_USED(op);
+   RTE_SET_USED(vec);
+   RTE_SET_USED(data);
+
+   return -ENOTSUP;
+}
diff --git a/lib/librte_eal/windows/meson.build 
b/lib/librte_eal/windows/meson.build
index 08c888e018..c5a19648d6 100644
--- a/lib/librte_eal/windows/meson.build
+++ b/lib/librte_eal/windows/meson.build
@@ -17,6 +17,7 @@ sources += files(
'eal_timer.c',
'fnmatch.c',
'getopt.c',
+   'eal_interrupts.c',
 )
 
 dpdk_conf.set10('RTE_EAL_NUMA_AWARE_HUGEPAGES', true)
-- 
2.16.1.windows.4
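
Because the stub unconditionally returns -ENOTSUP, code shared with Linux can
keep a single call site and simply treat that value as "no Rx interrupt
support, stay in polling mode". A hedged sketch follows; the epoll-style
constants are the ones used by the Linux EAL, and whether they are all exposed
on Windows at this stage is an assumption.

#include <errno.h>
#include <stdint.h>

#include <rte_interrupts.h>

/* Sketch only: enable an Rx interrupt if the platform supports it,
 * otherwise silently fall back to polling. */
static int
enable_rxq_interrupt(struct rte_intr_handle *intr_handle, uint32_t queue_id)
{
        int ret = rte_intr_rx_ctl(intr_handle, RTE_EPOLL_PER_THREAD,
                                  RTE_INTR_EVENT_ADD, queue_id, NULL);

        if (ret == -ENOTSUP)
                return 0;       /* e.g. the Windows stub above: keep polling */
        return ret;
}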



[dpdk-dev] [PATCH v4 5/5] ethdev: compiling ethdev under Windows

2020-08-03 Thread Fady Bader
Compiling needed libraries for ethdev under Windows.

Signed-off-by: Fady Bader 
---
 lib/meson.build | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/lib/meson.build b/lib/meson.build
index 6bbaf242a9..c145240eb9 100644
--- a/lib/meson.build
+++ b/lib/meson.build
@@ -38,9 +38,10 @@ libraries = [
 if is_windows
libraries = [
'kvargs',
+   'telemetry',
'eal',
'ring',
-   'mempool', 'mbuf', 'pci', 'net',
+   'mempool', 'mbuf', 'net', 'meter', 'ethdev', 'pci', # core
] # only supported libraries for windows
 endif
 
-- 
2.16.1.windows.4



[dpdk-dev] [PATCH v4 2/5] eal: updated export list for Windows

2020-08-03 Thread Fady Bader
Added eal functions used by ethdev lib to the export list under Windows.

Signed-off-by: Fady Bader 
---
 lib/librte_eal/rte_eal_exports.def | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/lib/librte_eal/rte_eal_exports.def 
b/lib/librte_eal/rte_eal_exports.def
index 69204a92c6..782e6fc721 100644
--- a/lib/librte_eal/rte_eal_exports.def
+++ b/lib/librte_eal/rte_eal_exports.def
@@ -28,6 +28,7 @@ EXPORTS
rte_eal_tailq_register
rte_eal_using_phys_addrs
rte_free
+   rte_get_tsc_hz
rte_hexdump
rte_malloc
rte_malloc_dump_stats
@@ -145,3 +146,13 @@ EXPORTS
rte_mem_map
rte_mem_page_size
rte_mem_unmap
+
+   rte_class_register
+   rte_devargs_parse
+   rte_class_find_by_name
+   rte_socket_id
+   rte_strerror
+   rte_intr_rx_ctl
+   rte_log_register
+   rte_log_set_level
+   rte_log_register_type_and_pick_level
\ No newline at end of file
-- 
2.16.1.windows.4



[dpdk-dev] [PATCH v4 3/5] ethdev: remove structs from export list

2020-08-03 Thread Fady Bader
Some ethdev structs were present in the ethdev export list.
These structs were removed from the export list.

Signed-off-by: Fady Bader 
---
 lib/librte_ethdev/rte_ethdev_version.map | 2 --
 1 file changed, 2 deletions(-)

diff --git a/lib/librte_ethdev/rte_ethdev_version.map 
b/lib/librte_ethdev/rte_ethdev_version.map
index 1212a17d32..21a2177756 100644
--- a/lib/librte_ethdev/rte_ethdev_version.map
+++ b/lib/librte_ethdev/rte_ethdev_version.map
@@ -89,7 +89,6 @@ DPDK_20.0 {
rte_eth_iterator_next;
rte_eth_led_off;
rte_eth_led_on;
-   rte_eth_link;
rte_eth_link_get;
rte_eth_link_get_nowait;
rte_eth_macaddr_get;
@@ -104,7 +103,6 @@ DPDK_20.0 {
rte_eth_rx_queue_setup;
rte_eth_set_queue_rate_limit;
rte_eth_speed_bitflag;
-   rte_eth_stats;
rte_eth_stats_get;
rte_eth_stats_reset;
rte_eth_timesync_adjust_time;
-- 
2.16.1.windows.4



[dpdk-dev] [PATCH v4 0/5] compiling ethdev lib under windows

2020-08-03 Thread Fady Bader
Added the changes needed to get ethdev compiling under Windows.

Depends-on: series-10382 ("compile librte_net for windows")

v4: added comments to #else and fixed code issue.

v3: rebased on current master, added more exports to export list

v2: fixed logging issue in telemetry lib.

Fady Bader (5):
  eal: added interrupts empty stubs
  eal: updated export list for Windows
  ethdev: remove structs from export list
  telemetry: implement empty stubs for Windows
  ethdev: compiling ethdev under Windows

 lib/librte_eal/rte_eal_exports.def   | 11 +++
 lib/librte_eal/windows/eal_interrupts.c  | 17 +++
 lib/librte_eal/windows/meson.build   |  1 +
 lib/librte_ethdev/rte_ethdev_version.map |  2 --
 lib/librte_telemetry/rte_telemetry.h |  4 +++
 lib/librte_telemetry/telemetry.c | 51 +++-
 lib/librte_telemetry/telemetry_legacy.c  | 26 +++-
 lib/meson.build  |  3 +-
 8 files changed, 110 insertions(+), 5 deletions(-)
 create mode 100644 lib/librte_eal/windows/eal_interrupts.c

-- 
2.16.1.windows.4



[dpdk-dev] [PATCH v4 4/5] telemetry: implement empty stubs for Windows

2020-08-03 Thread Fady Bader
Telemetry didn't compile under Windows.
Empty stubs implementation was added for Windows.

Signed-off-by: Fady Bader 
---
 lib/librte_telemetry/rte_telemetry.h|  4 +++
 lib/librte_telemetry/telemetry.c| 51 -
 lib/librte_telemetry/telemetry_legacy.c | 26 -
 3 files changed, 79 insertions(+), 2 deletions(-)

diff --git a/lib/librte_telemetry/rte_telemetry.h 
b/lib/librte_telemetry/rte_telemetry.h
index d13010b8fb..c3c0e44599 100644
--- a/lib/librte_telemetry/rte_telemetry.h
+++ b/lib/librte_telemetry/rte_telemetry.h
@@ -5,6 +5,10 @@
 #include 
 #include 
 
+#ifdef RTE_EXEC_ENV_WINDOWS
+#include 
+#endif
+
 #ifndef _RTE_TELEMETRY_H_
 #define _RTE_TELEMETRY_H_
 
diff --git a/lib/librte_telemetry/telemetry.c b/lib/librte_telemetry/telemetry.c
index 0252282735..d794e75e5e 100644
--- a/lib/librte_telemetry/telemetry.c
+++ b/lib/librte_telemetry/telemetry.c
@@ -1,7 +1,7 @@
 /* SPDX-License-Identifier: BSD-3-Clause
  * Copyright(c) 2020 Intel Corporation
  */
-
+#ifndef RTE_EXEC_ENV_WINDOWS
 #include 
 #include 
 #include 
@@ -20,6 +20,20 @@
 #include "telemetry_data.h"
 #include "rte_telemetry_legacy.h"
 
+#else /* RTE_EXEC_ENV_WINDOWS */
+
+/* includes for Windows */
+#include 
+#include 
+
+#include "rte_telemetry.h"
+#include "telemetry_json.h"
+#include "telemetry_data.h"
+#include "rte_telemetry_legacy.h"
+
+#endif
+
+#ifndef RTE_EXEC_ENV_WINDOWS
 #define MAX_CMD_LEN 56
 #define MAX_HELP_LEN 64
 #define MAX_OUTPUT_LEN (1024 * 16)
@@ -42,17 +56,24 @@ struct socket {
 };
 static struct socket v2_socket; /* socket for v2 telemetry */
 static struct socket v1_socket; /* socket for v1 telemetry */
+#endif
+
 static char telemetry_log_error[1024]; /* Will contain error on init failure */
+
+#ifndef RTE_EXEC_ENV_WINDOWS
 /* list of command callbacks, with one command registered by default */
 static struct cmd_callback callbacks[TELEMETRY_MAX_CALLBACKS];
 static int num_callbacks; /* How many commands are registered */
 /* Used when accessing or modifying list of command callbacks */
 static rte_spinlock_t callback_sl = RTE_SPINLOCK_INITIALIZER;
 static uint16_t v2_clients;
+#endif
 
 int
 rte_telemetry_register_cmd(const char *cmd, telemetry_cb fn, const char *help)
 {
+#ifndef RTE_EXEC_ENV_WINDOWS
+
int i = 0;
 
if (strlen(cmd) >= MAX_CMD_LEN || fn == NULL || cmd[0] != '/'
@@ -76,8 +97,19 @@ rte_telemetry_register_cmd(const char *cmd, telemetry_cb fn, 
const char *help)
rte_spinlock_unlock(&callback_sl);
 
return 0;
+
+#else /* RTE_EXEC_ENV_WINDOWS */
+
+   RTE_SET_USED(cmd);
+   RTE_SET_USED(fn);
+   RTE_SET_USED(help);
+
+   return 0;
+
+#endif
 }
 
+#ifndef RTE_EXEC_ENV_WINDOWS
 static int
 list_commands(const char *cmd __rte_unused, const char *params __rte_unused,
struct rte_tel_data *d)
@@ -411,11 +443,14 @@ telemetry_v2_init(const char *runtime_dir, rte_cpuset_t 
*cpuset)
 
return 0;
 }
+#endif
 
 int32_t
 rte_telemetry_init(const char *runtime_dir, rte_cpuset_t *cpuset,
const char **err_str)
 {
+#ifndef RTE_EXEC_ENV_WINDOWS
+
if (telemetry_v2_init(runtime_dir, cpuset) != 0) {
*err_str = telemetry_log_error;
return -1;
@@ -424,4 +459,18 @@ rte_telemetry_init(const char *runtime_dir, rte_cpuset_t 
*cpuset,
*err_str = telemetry_log_error;
}
return 0;
+
+#else /* RTE_EXEC_ENV_WINDOWS */
+
+   RTE_SET_USED(runtime_dir);
+   RTE_SET_USED(cpuset);
+   RTE_SET_USED(err_str);
+
+   snprintf(telemetry_log_error, sizeof(telemetry_log_error),
+   "Telemetry is not supported on Windows.");
+
+   return 0;
+
+#endif
 }
+
diff --git a/lib/librte_telemetry/telemetry_legacy.c 
b/lib/librte_telemetry/telemetry_legacy.c
index a341fe4ebd..ed2c1cc428 100644
--- a/lib/librte_telemetry/telemetry_legacy.c
+++ b/lib/librte_telemetry/telemetry_legacy.c
@@ -1,7 +1,7 @@
 /* SPDX-License-Identifier: BSD-3-Clause
  * Copyright(c) 2020 Intel Corporation
  */
-
+#ifndef RTE_EXEC_ENV_WINDOWS
 #include 
 #include 
 #include 
@@ -15,6 +15,15 @@
 
 #include "rte_telemetry_legacy.h"
 
+#else /* RTE_EXEC_ENV_WINDOWS */
+
+#include 
+
+#include "rte_telemetry_legacy.h"
+
+#endif
+#ifndef RTE_EXEC_ENV_WINDOWS
+
 #define MAX_LEN 128
 #define BUF_SIZE 1024
 #define CLIENTS_UNREG_ACTION "\"action\":2"
@@ -48,12 +57,15 @@ struct json_command 
callbacks[TELEMETRY_LEGACY_MAX_CALLBACKS] = {
 };
 int num_legacy_callbacks = 1;
 static rte_spinlock_t callback_sl = RTE_SPINLOCK_INITIALIZER;
+#endif
 
 int
 rte_telemetry_legacy_register(const char *cmd,
enum rte_telemetry_legacy_data_req data_req,
telemetry_legacy_cb fn)
 {
+#ifndef RTE_EXEC_ENV_WINDOWS
+
if (fn == NULL)
return -EINVAL;
if (num_legacy_callbacks >= (int) RTE_DIM(callbacks))
@@ -71,8 +83,19 @@ rte_telemetry_legacy_register(const char *cmd,
rte_spinlock_unlock(&callback_sl);