Re: [dpdk-dev] [[PATCH v3 0/4] pdump HW Rx timestamps for mlx5

2020-10-07 Thread Morten Brørup
> From: Ferruh Yigit [mailto:ferruh.yi...@intel.com]
> Sent: Tuesday, October 6, 2020 6:26 PM
> 
> On 7/24/2020 9:23 PM, Patrick Keroulas wrote:
> > The intention is to produce a pcap with nanosecond precision when
> > Rx timestamp offloading is activated on mlx5 NIC.
> >
> > The packets forwarded by testpmd hold the raw counter but a pcap
> > requires a time unit. Assuming that the NIC clock is already synced
> > with external master clock, this patchset simply integrates the
> > nanosecond converter that derives from device frequency and start
> time.
> >
> > v2 -> v3:
> >  - replace ib_verbs nanosecond converter with more generic method
> >based on device frequency and start time.
> >
> > Patrick Keroulas (3):
> >net/mlx5: query device frequency
> >ethdev: add API to query device frequency
> >pdump: convert timestamp to nanoseconds on Rx path
> >
> > Vivien Didelot (1):
> >net/pcap: support hardware Tx timestamps
> >
> 
> We have three patch/patchset for same issue,
> 
> 1) Current one, https://patches.dpdk.org/user/todo/dpdk/?series=11294
> 2) Vivien's series:
> https://patches.dpdk.org/user/todo/dpdk/?series=10678
> 3) Vivien's pcap patch:
> https://patches.dpdk.org/user/todo/dpdk/?series=10403
> 
> And one related one from Slava,
> 4)
> https://patchwork.dpdk.org/project/dpdk/list/?series=10948&state=%2A&ar
> chive=both
> 
> I am replying to this one since this is the latest, but first can we
> clarify if
> all are still valid and can we combine the effort?
> 
> 
> Second, the problems to solve,
> 1) Device provided timestamp value has no unit, it needs to be
> converted to
> nanosecond.
> There are different approaches in above patches,
> - One adds '.convert_ts_to_ns' dev_ops to make PMD convert the
> timestamp
> - Other adds '.eth_get_clock_freq' dev_ops, to get frequency from
> device clock
>so that conversion can be done within the app.
> - I wonder why existing 'rte_eth_read_clock()' can't be enough for
> conversion,
>as described in its documentation:
> 
> https://doc.dpdk.org/api/rte__ethdev_8h.html#a4346bf07a0d302c9ba4fe06ba
> ffd3196
>  rte_eth_read_clock(port, start);
>  rte_delay_ms(100);
>  rte_eth_read_clock(port, end);
>  double freq = (end - start) * 10;
> 
> 2) Where to put the timestamps data, is it to the 'mbuf->timestamp' or
> dynamic
> filed 'RTE_MBUF_DYNFIELD_TIMESTAMP_NAME'? Using dynamic field of
> requires
> more work on registering and looking up the fields.
> 
> 3) Calculation in the datapath should be done in a performance
> optimized way,
> instead of using division.
> 
> 4) Should the timestamp value provided by the Rx device used, or should
> the time
> when the packet transmitted used. I can see current use case seems
> first one,
> but can there be cases we would like to record when packet exactly
> sent.
> 
> 5) Should we create a 'PKT_TX_TIMESTAMP' flag, instead of re-using the
> Rx one,
> to let the application explicitly define which packets to record
> timestamp.
> 
> Please add if I missing more.

Regarding RX timestamps:

I believe that it is on the roadmap to remove the timestamp field from the mbuf 
and make it a dynamic field instead.

Furthermore, I understand that some Mellanox NICs have accurate "wall clock" 
timestamping capability, possibly even PTP synchronized. So I propose to make 
two variants of the dynamic timestamp field, one with an accurate "wall clock" 
timestamp for NICs with that capability, and one with an NIC-specific unitless 
timestamp (like the current timestamp).

Regarding TX timestamps:

Do any NICs currently have a TX pipeline that puts a TX timestamp in the 
descriptor on transmission to the wire instead of just marking the descriptor 
as free? If not, then there is probably no need for TX timestamps at that level 
of accuracy for captured packets. The application can set the timestamp in the 
captured/mirrored packets while cloning the originals and passing them on to 
the NIC driver.

And if the dynamic mbuf field that instructs the NIC to transmit the packet at 
a specific time is present, the application might even use that timestamp, 
knowing that the NIC will transmit the packet at that time.


Med venlig hilsen / kind regards
- Morten Brørup



Re: [dpdk-dev] [PATCH v3 01/11] ethdev: add extensions attributes to IPv6 item

2020-10-07 Thread Dekel Peled
Thanks, PSB.

> -Original Message-
> From: Ajit Khaparde 
> Sent: Wednesday, October 7, 2020 2:44 AM
> To: Dekel Peled 
> Cc: Ori Kam ; NBU-Contact-Thomas Monjalon
> ; Ferruh Yigit ; Andrew
> Rybchenko ; Ananyev, Konstantin
> ; Olivier Matz ;
> Wenzhuo Lu ; Beilei Xing ;
> Iremonger, Bernard ; Matan Azrad
> ; Shahaf Shuler ; Slava
> Ovsiienko ; dpdk-dev 
> Subject: Re: [dpdk-dev] [PATCH v3 01/11] ethdev: add extensions attributes
> to IPv6 item
> 
> On Mon, Oct 5, 2020 at 1:37 AM Dekel Peled  wrote:
> >
> > Using the current implementation of DPDK, an application cannot match on
> > IPv6 packets, based on the existing extension headers, in a simple way.
> >
> > Field 'Next Header' in IPv6 header indicates type of the first extension
> > header only. Following extension headers can't be identified by
> > inspecting the IPv6 header.
> > As a result, the existence or absence of specific extension headers
> > can't be used for packet matching.
> >
> > For example, fragmented IPv6 packets contain a dedicated extension
> header
> > (which is implemented in a later patch of this series).
> > Non-fragmented packets don't contain the fragment extension header.
> > For an application to match on non-fragmented IPv6 packets, the current
> > implementation doesn't provide a suitable solution.
> > Matching on the Next Header field is not sufficient, since additional
> > extension headers might be present in the same packet.
> > To match on fragmented IPv6 packets, the same difficulty exists.
> >
> > This patch implements the update as detailed in RFC [1].
> > A set of additional values will be added to IPv6 header struct.
> > These values will indicate the existence of every defined extension
> > header type, providing simple means for identification of existing
> > extensions in the packet header.
> > Continuing the above example, fragmented packets can be identified using
> > the specific value indicating existence of fragment extension header.
> >
> > [1] https://mails.dpdk.org/archives/dev/2020-August/177257.html
> >
> > Signed-off-by: Dekel Peled 
> > Acked-by: Ori Kam 
> > ---
> >  doc/guides/prog_guide/rte_flow.rst | 16 +---
> >  lib/librte_ethdev/rte_flow.h   | 25 +++--
> >  2 files changed, 36 insertions(+), 5 deletions(-)
> >
> > diff --git a/doc/guides/prog_guide/rte_flow.rst
> b/doc/guides/prog_guide/rte_flow.rst
> > index 119b128..0b476da 100644
> > --- a/doc/guides/prog_guide/rte_flow.rst
> > +++ b/doc/guides/prog_guide/rte_flow.rst
> > @@ -946,11 +946,21 @@ Item: ``IPV6``
> >
> >  Matches an IPv6 header.
> >
> > -Note: IPv6 options are handled by dedicated pattern items, see `Item:
> > -IPV6_EXT`_.
> > +Dedicated flags indicate existence of specific extension headers.
> > +Every type of extension header can use a dedicated pattern item, or
> > +the generic `Item: IPV6_EXT`_.
> 
> Can you add how the PMD should interpret if these flags are not set
> by the application.
> There is a mention of some of that in the testpmd commit log.
> I think it is good to spell it out here.

Agree, I will add it in the git log for next version.

> 
> >
> >  - ``hdr``: IPv6 header definition (``rte_ip.h``).
> > -- Default ``mask`` matches source and destination addresses only.
> > +- ``hop_ext_exist``: Hop-by-Hop Options extension header exists.
> > +- ``rout_ext_exist``: Routing extension header exists.
> > +- ``frag_ext_exist``: Fragment extension header exists.
> > +- ``auth_ext_exist``: Authentication extension header exists.
> > +- ``esp_ext_exist``: Encapsulation Security Payload extension header
> exists.
> > +- ``dest_ext_exist``: Destination Options extension header exists.
> > +- ``mobil_ext_exist``: Mobility extension header exists.
> > +- ``hip_ext_exist``: Host Identity Protocol extension header exists.
> > +- ``shim6_ext_exist``: Shim6 Protocol extension header exists.
> > +- Default ``mask`` matches ``hdr`` source and destination addresses only.
> >
> >  Item: ``ICMP``
> >  ^^
> > diff --git a/lib/librte_ethdev/rte_flow.h b/lib/librte_ethdev/rte_flow.h
> > index da8bfa5..5b5bed2 100644
> > --- a/lib/librte_ethdev/rte_flow.h
> > +++ b/lib/librte_ethdev/rte_flow.h
> > @@ -792,11 +792,32 @@ struct rte_flow_item_ipv4 {
> >   *
> >   * Matches an IPv6 header.
> >   *
> > - * Note: IPv6 options are handled by dedicated pattern items, see
> > - * RTE_FLOW_ITEM_TYPE_IPV6_EXT.
> > + * Dedicated flags indicate existence of specific extension headers.
> > + * Every type of extension header can use a dedicated pattern item, or
> > + * the generic item RTE_FLOW_ITEM_TYPE_IPV6_EXT.
> >   */
> >  struct rte_flow_item_ipv6 {
> > struct rte_ipv6_hdr hdr; /**< IPv6 header definition. */
> > +   uint32_t hop_ext_exist:1;
> > +   /**< Hop-by-Hop Options extension header exists. */
> > +   uint32_t rout_ext_exist:1;
> > +   /**< Routing extension header exists. */
> > +   uint32_t frag_ext_exist:1;
> > +   /**< Fragment extension header exist

[dpdk-dev] [PATCH v2] eal/windows: export all built functions for clang

2020-10-07 Thread Tal Shnaiderman
export for clang build all the functions currently built
on Windows and listed in rte_eal_version.map by adding
them to rte_eal_exports.def.

Signed-off-by: Tal Shnaiderman 
Acked-by: Ranjit Menon 
---
v2: rebase to master
---
---
 lib/librte_eal/rte_eal_exports.def | 156 +++--
 1 file changed, 151 insertions(+), 5 deletions(-)

diff --git a/lib/librte_eal/rte_eal_exports.def 
b/lib/librte_eal/rte_eal_exports.def
index 7b35beb702..d7a47d0929 100644
--- a/lib/librte_eal/rte_eal_exports.def
+++ b/lib/librte_eal/rte_eal_exports.def
@@ -3,34 +3,83 @@ EXPORTS
per_lcore__lcore_id
per_lcore__rte_errno
per_lcore__thread_id
+   rte_bus_dump
+   rte_bus_find
+   rte_bus_find_by_device
+   rte_bus_find_by_name
+   rte_bus_get_iommu_class
+   rte_bus_probe
rte_bus_register
+   rte_bus_scan
+   rte_bus_unregister
rte_calloc
rte_calloc_socket
rte_cpu_get_flag_enabled
+   rte_cpu_get_flag_name
+   rte_ctrl_thread_create
+   rte_delay_us
+   rte_delay_us_block
+   rte_delay_us_callback_register
rte_dev_is_probed
+   rte_dev_probe
+   rte_dev_remove
+   rte_devargs_add
+   rte_devargs_dump
rte_devargs_insert
rte_devargs_next
rte_devargs_parse
+   rte_devargs_parsef
rte_devargs_remove
+   rte_devargs_type_count
+   rte_dump_physmem_layout
+   rte_dump_registers
+   rte_dump_stack
+   rte_dump_tailq
+   rte_eal_cleanup
+   rte_eal_get_lcore_state
+   rte_eal_get_physmem_size
+   rte_eal_get_runtime_dir
rte_eal_has_hugepages
rte_eal_has_pci
+   rte_eal_hotplug_add
+   rte_eal_hotplug_remove
rte_eal_init
rte_eal_iova_mode
+   rte_eal_lcore_role
rte_eal_mbuf_user_pool_ops
rte_eal_mp_remote_launch
rte_eal_mp_wait_lcore
rte_eal_process_type
rte_eal_remote_launch
-   rte_log
rte_eal_tailq_lookup
rte_eal_tailq_register
rte_eal_using_phys_addrs
+   rte_eal_wait_lcore
+   rte_exit
rte_free
+   rte_get_master_lcore
+   rte_get_next_lcore
rte_get_tsc_hz
rte_hexdump
+   rte_hypervisor_get
rte_intr_rx_ctl
+   rte_lcore_count
+   rte_lcore_has_role
+   rte_lcore_index
+   rte_lcore_is_enabled
+   rte_lcore_to_socket_id
+   rte_log
+   rte_log_cur_msg_loglevel
+   rte_log_cur_msg_logtype
+   rte_log_dump
+   rte_log_get_global_level
+   rte_log_get_level
+   rte_log_get_stream
rte_log_register
+   rte_log_set_global_level
rte_log_set_level
+   rte_log_set_level_pattern
+   rte_log_set_level_regexp
rte_malloc
rte_malloc_dump_stats
rte_malloc_get_socket_stats
@@ -53,6 +102,7 @@ EXPORTS
rte_mem_lock_page
rte_mem_virt2iova
rte_mem_virt2phy
+   rte_memdump
rte_memory_get_nchannel
rte_memory_get_nrank
rte_memzone_dump
@@ -62,16 +112,53 @@ EXPORTS
rte_memzone_reserve_aligned
rte_memzone_reserve_bounded
rte_memzone_walk
+   rte_openlog_stream
+   rte_realloc
+   rte_rtm_supported
+   rte_service_attr_get
+   rte_service_attr_reset_all
+   rte_service_component_register
+   rte_service_component_runstate_set
+   rte_service_component_unregister
+   rte_service_dump
+   rte_service_finalize
+   rte_service_get_by_name
+   rte_service_get_count
+   rte_service_get_name
+   rte_service_lcore_add
+   rte_service_lcore_attr_get
+   rte_service_lcore_attr_reset_all
+   rte_service_lcore_count
+   rte_service_lcore_count_services
+   rte_service_lcore_del
+   rte_service_lcore_list
+   rte_service_lcore_reset_all
+   rte_service_lcore_start
+   rte_service_lcore_stop
+   rte_service_map_lcore_get
+   rte_service_map_lcore_set
+   rte_service_may_be_active
+   rte_service_probe_capability
+   rte_service_run_iter_on_app_lcore
+   rte_service_runstate_get
+   rte_service_runstate_set
+   rte_service_set_runstate_mapped_check
+   rte_service_set_stats_enable
+   rte_service_start_with_defaults
+   rte_set_application_usage_hook
+   rte_socket_count
rte_socket_id
+   rte_socket_id_by_idx
rte_strerror
+   rte_strscpy
rte_strsplit
rte_sys_gettid
+   rte_thread_get_affinity
+   rte_thread_set_affinity
+   rte_thread_setname
rte_vfio_container_dma_map
rte_vfio_container_dma_unmap
rte_vlog
-   rte_realloc
-   rte_rtm_supported
-   rte_strscpy
rte_zmalloc
rte_zmalloc_socket
 
@@ -80,6 +167,8 @@ EXPORTS
rte_mp_reply
rte_mp_sendmsg
 
+   rte_dev_event_callback_register
+   rte_dev_event_callback_unregister
rte_fbarray_attach
  

Re: [dpdk-dev] [PATCH] ethdev: use rte_pktmbuf_free_bulk()

2020-10-07 Thread Morten Brørup
> From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Stephen Hemminger
> Sent: Wednesday, September 30, 2020 11:27 PM
> 
> The mbuf library now has routine to free multiple buffers.
> Loop is no longer needed.
> 
> Signed-off-by: Stephen Hemminger 
> ---
>  lib/librte_ethdev/rte_ethdev.c | 10 ++
>  1 file changed, 2 insertions(+), 8 deletions(-)
> 
> diff --git a/lib/librte_ethdev/rte_ethdev.c
> b/lib/librte_ethdev/rte_ethdev.c
> index dfe5c1b488a0..307fbeb3a798 100644
> --- a/lib/librte_ethdev/rte_ethdev.c
> +++ b/lib/librte_ethdev/rte_ethdev.c
> @@ -2179,10 +2179,7 @@ void
>  rte_eth_tx_buffer_drop_callback(struct rte_mbuf **pkts, uint16_t
> unsent,
>   void *userdata __rte_unused)
>  {
> - unsigned i;
> -
> - for (i = 0; i < unsent; i++)
> - rte_pktmbuf_free(pkts[i]);
> + rte_pktmbuf_free_bulk(pkts, unsent);
>  }
> 
>  void
> @@ -2190,11 +2187,8 @@ rte_eth_tx_buffer_count_callback(struct rte_mbuf
> **pkts, uint16_t unsent,
>   void *userdata)
>  {
>   uint64_t *count = userdata;
> - unsigned i;
> -
> - for (i = 0; i < unsent; i++)
> - rte_pktmbuf_free(pkts[i]);
> 
> + rte_pktmbuf_free_bulk(pkts, unsent);
>   *count += unsent;
>  }
> 
> --
> 2.27.0
> 

Reviewed-by: Morten Brørup 



[dpdk-dev] [PATCH] net/fm10k: fix memory leak when Tx thresh check fails

2020-10-07 Thread wangyunjian
From: Yunjian Wang 

In fm10k_tx_queue_setup(), we allocate memory for the queue
structure but not released when Tx thresh check fails.

Fixes: 98068e0e044e ("fm10k: add Tx queue setup/release")
Cc: sta...@dpdk.org

Signed-off-by: Yunjian Wang 
---
 drivers/net/fm10k/fm10k_ethdev.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/net/fm10k/fm10k_ethdev.c b/drivers/net/fm10k/fm10k_ethdev.c
index 309637071..c4a6fdf7f 100644
--- a/drivers/net/fm10k/fm10k_ethdev.c
+++ b/drivers/net/fm10k/fm10k_ethdev.c
@@ -2024,8 +2024,10 @@ fm10k_tx_queue_setup(struct rte_eth_dev *dev, uint16_t 
queue_id,
q->ops = &def_txq_ops;
q->tail_ptr = (volatile uint32_t *)
&((uint32_t *)hw->hw_addr)[FM10K_TDT(queue_id)];
-   if (handle_txconf(q, conf))
+   if (handle_txconf(q, conf)) {
+   rte_free(q);
return -EINVAL;
+   }
 
/* allocate memory for the software ring */
q->sw_ring = rte_zmalloc_socket("fm10k sw ring",
-- 
2.23.0



Re: [dpdk-dev] [PATCH] maintainers: improve coverage of arch-specific files

2020-10-07 Thread David Marchand
On Tue, Oct 6, 2020 at 10:23 PM Thomas Monjalon  wrote:
>
> The sub-directories of config/ are maintained by
> different architecture maintainers.
>
> Some wildcards are used to describe the lib and drivers files
> which are specific to some architectures.

I noticed "some" misses, see below.

>
> The EAL Arm files have split responsibilities depending on 32/64 suffix,
> and the common files are shared between Armv7 and Armv8 sections.
>
> Signed-off-by: Thomas Monjalon 
> ---
>  MAINTAINERS | 23 +++
>  1 file changed, 11 insertions(+), 12 deletions(-)
>
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 75a17d51c0..349fc9c38a 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -256,32 +256,31 @@ F: app/test/test_rand_perf.c
>  ARM v7
>  M: Jan Viktorin 
>  M: Ruifeng Wang 
> +F: config/arm/
>  F: lib/librte_eal/arm/
> +X: lib/librte_eal/arm/include/*_64.h
>
>  ARM v8
>  M: Jerin Jacob 
>  M: Ruifeng Wang 
> -F: lib/librte_eal/arm/include/*_64.h
> -F: lib/librte_net/net_crc_neon.h
> -F: lib/librte_acl/acl_run_neon.*
> -F: lib/librte_bpf/bpf_jit_arm64.c
> -F: lib/librte_lpm/rte_lpm_neon.h
> -F: lib/librte_hash/rte*_arm64.h
> -F: lib/librte_efd/rte*_arm64.h
> -F: lib/librte_table/rte*_arm64.h
> -F: drivers/net/ixgbe/ixgbe_rxtx_vec_neon.c
> -F: drivers/net/i40e/i40e_rxtx_vec_neon.c
> -F: drivers/net/virtio/virtio_rxtx_simple_neon.c
> +F: config/arm/
> +F: lib/librte_eal/arm/
> +X: lib/librte_eal/arm/include/*_32.h
> +F: lib/*/*_neon.*
> +F: lib/*/*_arm64.*
> +F: drivers/*/*/*_neon.*

app and examples directory are not caught in this:

./lib/librte_net/net_crc_neon.h
./lib/librte_acl/acl_run_neon.h
./lib/librte_acl/acl_run_neon.c
./lib/librte_lpm/rte_lpm_neon.h
./lib/librte_node/ip4_lookup_neon.h
./examples/l3fwd/l3fwd_lpm_neon.h
./examples/l3fwd/l3fwd_em_hlm_neon.h
./examples/l3fwd/l3fwd_neon.h
./app/test-pmd/macswap_neon.h
./drivers/net/hns3/hns3_rxtx_vec_neon.h
./drivers/net/bnxt/bnxt_rxtx_vec_neon.c
./drivers/net/mlx5/mlx5_rxtx_vec_neon.h
./drivers/net/virtio/virtio_rxtx_simple_neon.c
./drivers/net/ixgbe/ixgbe_rxtx_vec_neon.c
./drivers/net/i40e/i40e_rxtx_vec_neon.c

Can we use a wildcard at dpdk root?
*/*/*_neon.*

The drivers/*/*/*_neon.* entry would still be needed though.


>
>  IBM POWER (alpha)
>  M: David Christensen 
> +F: config/ppc/
>  F: lib/librte_eal/ppc/
> -F: drivers/net/i40e/i40e_rxtx_vec_altivec.c
> +F: drivers/*/*/*_altivec.*
>  F: examples/l3fwd/*altivec.h

Idem neon, lib files missing.

./lib/librte_eal/ppc/include/rte_altivec.h
./lib/librte_acl/acl_run_altivec.c
./lib/librte_acl/acl_run_altivec.h
./lib/librte_lpm/rte_lpm_altivec.h
./examples/l3fwd/l3fwd_altivec.h
./examples/l3fwd/l3fwd_lpm_altivec.h
./drivers/net/mlx5/mlx5_rxtx_vec_altivec.h
./drivers/net/virtio/virtio_rxtx_simple_altivec.c
./drivers/net/i40e/i40e_rxtx_vec_altivec.c


>
>  Intel x86
>  M: Bruce Richardson 
>  M: Konstantin Ananyev 
> +F: config/x86/
>  F: lib/librte_eal/x86/

Nothing for avx/sse ?


>
>  Linux EAL (with overlaps)
> --
> 2.28.0
>


-- 
David Marchand



Re: [dpdk-dev] [PATCH 0/2] baseband: fix drivers API

2020-10-07 Thread Maxime Coquelin
Hi Akhil,

Adding Thomas as he pulls your tree.

On 10/6/20 9:42 PM, Akhil Goyal wrote:
> 
>>
>> The series prefixes drivers APIs with rte__ in
>> order to avoid namespace pollution.
>>
>> These APIs are experimental, so no need to follow the
>> deprecation process.
>>
> Added Fixes commit in patch description.

Thanks for applying it to your tree.

I did not add Fixes tag and Cc'ed sta...@dpdk.org on purpose,
because I thought it is was not a good idea to backport API changes,
even if this is experimental it might be annoying for the user.

Thomas, do you confirm?

Thanks,
Maxime
> Series applied to dpdk-next-crypto
> 
> Thanks.
> 



Re: [dpdk-dev] [RFC v2 1/1] lib/ring: add scatter gather APIs

2020-10-07 Thread Olivier Matz
Hi Honnappa,

>From a quick walkthrough, I have some questions/comments, please
see below.

On Tue, Oct 06, 2020 at 08:29:05AM -0500, Honnappa Nagarahalli wrote:
> Add scatter gather APIs to avoid intermediate memcpy. Use cases
> that involve copying large amount of data to/from the ring
> can benefit from these APIs.
> 
> Signed-off-by: Honnappa Nagarahalli 
> ---
>  lib/librte_ring/meson.build|   3 +-
>  lib/librte_ring/rte_ring_elem.h|   1 +
>  lib/librte_ring/rte_ring_peek_sg.h | 552 +
>  3 files changed, 555 insertions(+), 1 deletion(-)
>  create mode 100644 lib/librte_ring/rte_ring_peek_sg.h
> 
> diff --git a/lib/librte_ring/meson.build b/lib/librte_ring/meson.build
> index 31c0b4649..377694713 100644
> --- a/lib/librte_ring/meson.build
> +++ b/lib/librte_ring/meson.build
> @@ -12,4 +12,5 @@ headers = files('rte_ring.h',
>   'rte_ring_peek.h',
>   'rte_ring_peek_c11_mem.h',
>   'rte_ring_rts.h',
> - 'rte_ring_rts_c11_mem.h')
> + 'rte_ring_rts_c11_mem.h',
> + 'rte_ring_peek_sg.h')
> diff --git a/lib/librte_ring/rte_ring_elem.h b/lib/librte_ring/rte_ring_elem.h
> index 938b398fc..7d3933f15 100644
> --- a/lib/librte_ring/rte_ring_elem.h
> +++ b/lib/librte_ring/rte_ring_elem.h
> @@ -1079,6 +1079,7 @@ rte_ring_dequeue_burst_elem(struct rte_ring *r, void 
> *obj_table,
>  
>  #ifdef ALLOW_EXPERIMENTAL_API
>  #include 
> +#include 
>  #endif
>  
>  #include 
> diff --git a/lib/librte_ring/rte_ring_peek_sg.h 
> b/lib/librte_ring/rte_ring_peek_sg.h
> new file mode 100644
> index 0..97d5764a6
> --- /dev/null
> +++ b/lib/librte_ring/rte_ring_peek_sg.h
> @@ -0,0 +1,552 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + *
> + * Copyright (c) 2020 Arm
> + * Copyright (c) 2007-2009 Kip Macy km...@freebsd.org
> + * All rights reserved.
> + * Derived from FreeBSD's bufring.h
> + * Used as BSD-3 Licensed with permission from Kip Macy.
> + */
> +
> +#ifndef _RTE_RING_PEEK_SG_H_
> +#define _RTE_RING_PEEK_SG_H_
> +
> +/**
> + * @file
> + * @b EXPERIMENTAL: this API may change without prior notice
> + * It is not recommended to include this file directly.
> + * Please include  instead.
> + *
> + * Ring Peek Scatter Gather APIs

I am not fully convinced by the API name. To me, "scatter/gather" is
associated to iovecs, like for instance in [1]. The wikipedia definition
[2] may be closer though.

[1] https://www.gnu.org/software/libc/manual/html_node/Scatter_002dGather.html
[2] https://en.wikipedia.org/wiki/Gather-scatter_(vector_addressing)

What about "zero-copy"?

Also, the "peek" term looks also a bit confusing to me, but it existed
before your patch. I understand it for dequeue, but not for enqueue.

Or, what about replacing the existing experimental peek API by this one?
They look quite similar to me.

> + * Introduction of rte_ring with scatter gather serialized producer/consumer
> + * (HTS sync mode) makes it possible to split public enqueue/dequeue API
> + * into 3 phases:
> + * - enqueue/dequeue start
> + * - copy data to/from the ring
> + * - enqueue/dequeue finish
> + * Along with the advantages of the peek APIs, these APIs provide the ability
> + * to avoid copying of the data to temporary area.
> + *
> + * Note that right now this new API is available only for two sync modes:
> + * 1) Single Producer/Single Consumer (RTE_RING_SYNC_ST)
> + * 2) Serialized Producer/Serialized Consumer (RTE_RING_SYNC_MT_HTS).
> + * It is a user responsibility to create/init ring with appropriate sync
> + * modes selected.
> + *
> + * Example usage:
> + * // read 1 elem from the ring:

Comment should be "prepare enqueuing 32 objects"

> + * n = rte_ring_enqueue_sg_bulk_start(ring, 32, &sgd, NULL);
> + * if (n != 0) {
> + *   //Copy objects in the ring
> + *   memcpy (sgd->ptr1, obj, sgd->n1 * sizeof(uintptr_t));
> + *   if (n != sgd->n1)
> + *   //Second memcpy because of wrapround
> + *   n2 = n - sgd->n1;
> + *   memcpy (sgd->ptr2, obj[n2], n2 * sizeof(uintptr_t));

Missing { }

> + *   rte_ring_dequeue_sg_finish(ring, n);

Should be enqueue

> + * }
> + *
> + * Note that between _start_ and _finish_ none other thread can proceed
> + * with enqueue(/dequeue) operation till _finish_ completes.
> + */
> +
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +#include 
> +
> +/* Rock that needs to be passed between reserve and commit APIs */
> +struct rte_ring_sg_data {
> + /* Pointer to the first space in the ring */
> + void **ptr1;
> + /* Pointer to the second space in the ring if there is wrap-around */
> + void **ptr2;
> + /* Number of elements in the first pointer. If this is equal to
> +  * the number of elements requested, then ptr2 is NULL.
> +  * Otherwise, subtracting n1 from number of elements requested
> +  * will give the number of elements available at ptr2.
> +  */
> + unsigned int n1;
> +};

Would it be possible to simply return the o

Re: [dpdk-dev] [RFC PATCH v4 3/3] app/testpmd: handle device recovery event

2020-10-07 Thread Ophir Munk
Adding Ferruh and Ajit

From: Kalesh Anakkur Purayil 
Sent: Wednesday, October 7, 2020 7:47 AM
To: Ophir Munk 
Cc: dev@dpdk.org; NBU-Contact-Thomas Monjalon 
Subject: Re: [dpdk-dev] [RFC PATCH v4 3/3] app/testpmd: handle device recovery 
event

Hi Ophir,

Thank you for the comments. I will address them in the next version.

I will push these changes as Patches next time and not as an RFC. Hope that is 
OK.

Regards,
Kalesh

On Tue, Oct 6, 2020 at 10:55 PM Ophir Munk 
mailto:ophi...@nvidia.com>> wrote:
Hi Kalesh,
Please find a few comments.
The name you gave to the event (EVENT_RESET) is very close to an already 
existing one: "EVENT_INTR_RESET".
But they are different.
EVENT_INTR_RESET originates from a port reset. It requires application 
reaction. It is widely used. It is documented in *.rst files.
EVENT_RESET originates from FW error (or maybe any error). It requires no 
application reaction (PMD manages by itself). It is not documented.
I therefore suggest renaming it (maybe EVENT_ERR_RECOVERING) and please 
document it in *.rst files.
More comments below:

> -Original Message-
> From: dev mailto:dev-boun...@dpdk.org>> On Behalf Of 
> Kalesh A P
> Sent: Wednesday, September 30, 2020 3:33 PM
> To: dev@dpdk.org
> Subject: [dpdk-dev] [RFC PATCH v4 3/3] app/testpmd: handle device recovery
> event
>
> From: Kalesh AP 
> mailto:kalesh-anakkur.pura...@broadcom.com>>
>
> Added code to handle device reset and recovery event in testpmd.
> This is an indication from the PMD that device has reset and recovered error
> condition.
>
> Signed-off-by: Kalesh AP 
> mailto:kalesh-anakkur.pura...@broadcom.com>>
> Reviewed-by: Ajit Kumar Khaparde 
> mailto:ajit.khapa...@broadcom.com>>
> ---
>  app/test-pmd/testpmd.c | 6 +-
>  1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c index
> fe6450c..1c8fb46 100644
> --- a/app/test-pmd/testpmd.c
> +++ b/app/test-pmd/testpmd.c
> @@ -380,6 +380,8 @@ static const char * const eth_event_desc[] = {
>   [RTE_ETH_EVENT_NEW] = "device probed",
>   [RTE_ETH_EVENT_DESTROY] = "device released",
>   [RTE_ETH_EVENT_FLOW_AGED] = "flow aged",
> + [RTE_ETH_EVENT_RESET] = "device reset",

"device reset" is similar to the existing "reset" string. Can you suggest a 
different one? Maybe "error under recovery" ?

> + [RTE_ETH_EVENT_RECOVERED] = "device recovery",

Wouldn't you prefer "device recovered" ?

>   [RTE_ETH_EVENT_MAX] = NULL,
>  };
>
> @@ -394,7 +396,9 @@ uint32_t event_print_mask = (UINT32_C(1) <<
> RTE_ETH_EVENT_UNKNOWN) |
>   (UINT32_C(1) << RTE_ETH_EVENT_IPSEC) |
>   (UINT32_C(1) << RTE_ETH_EVENT_MACSEC) |
>   (UINT32_C(1) << RTE_ETH_EVENT_INTR_RMV) |
> - (UINT32_C(1) << RTE_ETH_EVENT_FLOW_AGED);
> + (UINT32_C(1) << RTE_ETH_EVENT_FLOW_AGED) |
> + (UINT32_C(1) << RTE_ETH_EVENT_RESET) |
> + (UINT32_C(1) << RTE_ETH_EVENT_RECOVERED);
>  /*
>   * Decide if all memory are locked for performance.
>   */
> --
> 2.10.1


--
Regards,
Kalesh A P


Re: [dpdk-dev] [PATCH v3] doc: add stack mempool guide

2020-10-07 Thread Olivier Matz
Hi Gage,

On Mon, Sep 21, 2020 at 03:42:28PM +, Eads, Gage wrote:
> Hi Olivier,
> 
> 
> 
> > > +Stack Mempool Driver
> > > +
> > > +
> > > +**rte_mempool_stack** is a pure software mempool driver based on the
> > > +``rte_stack`` DPDK library. A stack-based mempool is often better suited 
> > > to
> > > +packet-processing workloads than a ring-based mempool, since its LIFO
> > behavior
> > > +results in better temporal locality and a minimal memory footprint even 
> > > if the
> > > +mempool is over-provisioned.
> > 
> > Would it make sense to give an example of a use-case where the stack
> > driver should be used in place of the standard ring-based one?
> > 
> > In most run-to-completion applications, the mbufs stay in per-core
> > caches, so changing the mempool driver won't have a big impact. However,
> > I suspect that for applications using a pipeline model (ex: rx on core0,
> > tx on core1), the stack model would be more efficient. Is it something
> > that you measured? If yes, it would be useful to explain this in the
> > documentation.
> > 
> 
> Good point, I was overlooking the impact of the per-core caches. I've seen 
> data showing
> better overall packet throughput with the stack mempool, and indeed that was 
> a pipelined
> application. How about this re-write?
> 
> "
> **rte_mempool_stack** is a pure software mempool driver based on the
> ``rte_stack`` DPDK library. For run-to-completion workloads with sufficiently
> large per-lcore caches, the mbufs will likely stay in the per-lcore caches 
> and the
> mempool type (ring, stack, etc.) will have a negligible impact on 
> performance. However
> a stack-based mempool is often better suited to pipelined packet-processing 
> workloads
> (which allocate and free mbufs on different lcores) than a ring-based 
> mempool, since its
> LIFO behavior results in better temporal locality and a minimal memory 
> footprint even
> if the mempool is over-provisioned. Users are encouraged to benchmark with 
> multiple
> mempool types to determine which works best for their specific application.
> "

Yes, this is clear, thanks!

Olivier


Re: [dpdk-dev] [RFC PATCH v4 3/3] app/testpmd: handle device recovery event

2020-10-07 Thread Kalesh Anakkur Purayil
Hi Ophir,

Thank you for the comments. I will address them in the next version.

I will push these changes as Patches next time and not as an RFC. Hope that
is OK.

Regards,
Kalesh

On Tue, Oct 6, 2020 at 10:55 PM Ophir Munk  wrote:

> Hi Kalesh,
> Please find a few comments.
> The name you gave to the event (EVENT_RESET) is very close to an already
> existing one: "EVENT_INTR_RESET".
> But they are different.
> EVENT_INTR_RESET originates from a port reset. It requires application
> reaction. It is widely used. It is documented in *.rst files.
> EVENT_RESET originates from FW error (or maybe any error). It requires no
> application reaction (PMD manages by itself). It is not documented.
> I therefore suggest renaming it (maybe EVENT_ERR_RECOVERING) and please
> document it in *.rst files.
> More comments below:
>
> > -Original Message-
> > From: dev  On Behalf Of Kalesh A P
> > Sent: Wednesday, September 30, 2020 3:33 PM
> > To: dev@dpdk.org
> > Subject: [dpdk-dev] [RFC PATCH v4 3/3] app/testpmd: handle device
> recovery
> > event
> >
> > From: Kalesh AP 
> >
> > Added code to handle device reset and recovery event in testpmd.
> > This is an indication from the PMD that device has reset and recovered
> error
> > condition.
> >
> > Signed-off-by: Kalesh AP 
> > Reviewed-by: Ajit Kumar Khaparde 
> > ---
> >  app/test-pmd/testpmd.c | 6 +-
> >  1 file changed, 5 insertions(+), 1 deletion(-)
> >
> > diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c index
> > fe6450c..1c8fb46 100644
> > --- a/app/test-pmd/testpmd.c
> > +++ b/app/test-pmd/testpmd.c
> > @@ -380,6 +380,8 @@ static const char * const eth_event_desc[] = {
> >   [RTE_ETH_EVENT_NEW] = "device probed",
> >   [RTE_ETH_EVENT_DESTROY] = "device released",
> >   [RTE_ETH_EVENT_FLOW_AGED] = "flow aged",
> > + [RTE_ETH_EVENT_RESET] = "device reset",
>
> "device reset" is similar to the existing "reset" string. Can you suggest
> a different one? Maybe "error under recovery" ?
>
> > + [RTE_ETH_EVENT_RECOVERED] = "device recovery",
>
> Wouldn't you prefer "device recovered" ?
>
> >   [RTE_ETH_EVENT_MAX] = NULL,
> >  };
> >
> > @@ -394,7 +396,9 @@ uint32_t event_print_mask = (UINT32_C(1) <<
> > RTE_ETH_EVENT_UNKNOWN) |
> >   (UINT32_C(1) << RTE_ETH_EVENT_IPSEC) |
> >   (UINT32_C(1) << RTE_ETH_EVENT_MACSEC) |
> >   (UINT32_C(1) << RTE_ETH_EVENT_INTR_RMV) |
> > - (UINT32_C(1) << RTE_ETH_EVENT_FLOW_AGED);
> > + (UINT32_C(1) << RTE_ETH_EVENT_FLOW_AGED) |
> > + (UINT32_C(1) << RTE_ETH_EVENT_RESET) |
> > + (UINT32_C(1) << RTE_ETH_EVENT_RECOVERED);
> >  /*
> >   * Decide if all memory are locked for performance.
> >   */
> > --
> > 2.10.1
>
>

-- 
Regards,
Kalesh A P


[dpdk-dev] [PATCH] doc: remove aes-gcm addition of j0 deprecation notice

2020-10-07 Thread Arek Kusztal
This patch removes information about deprecation of AES-GCM/GMAC
API for IV without J0.

Signed-off-by: Arek Kusztal 
---
 doc/guides/rel_notes/deprecation.rst | 6 --
 1 file changed, 6 deletions(-)

diff --git a/doc/guides/rel_notes/deprecation.rst 
b/doc/guides/rel_notes/deprecation.rst
index f9b72acb8..281f38048 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -235,12 +235,6 @@ Deprecation Notices
   ``RTE_CRYPTO_AUTH_LIST_END`` from ``enum rte_crypto_auth_algorithm``
   will be removed.
 
-* cryptodev: support for using IV with all sizes is added, J0 still can
-  be used but only when IV length in following structs 
``rte_crypto_auth_xform``,
-  ``rte_crypto_aead_xform`` is set to zero. When IV length is greater or equal
-  to one it means it represents IV, when is set to zero it means J0 is used
-  directly, in this case 16 bytes of J0 need to be passed.
-
 * scheduler: The functions ``rte_cryptodev_scheduler_slave_attach``,
   ``rte_cryptodev_scheduler_slave_detach`` and
   ``rte_cryptodev_scheduler_slaves_get`` will be replaced in 20.11 by
-- 
2.17.1



[dpdk-dev] [PATCH] baseband/turbo_sw: fix memory leak in error path

2020-10-07 Thread wangyunjian
From: Yunjian Wang 

In q_setup() allocated memory for the queue data, we should free
it when error happens, otherwise it will lead to memory leak.

Fixes: b8cfe2c9aed2 ("bb/turbo_sw: add software turbo driver")
Cc: sta...@dpdk.org

Signed-off-by: Yunjian Wang 
---
 drivers/baseband/turbo_sw/bbdev_turbo_software.c | 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/baseband/turbo_sw/bbdev_turbo_software.c 
b/drivers/baseband/turbo_sw/bbdev_turbo_software.c
index a36099e91..e55b32927 100644
--- a/drivers/baseband/turbo_sw/bbdev_turbo_software.c
+++ b/drivers/baseband/turbo_sw/bbdev_turbo_software.c
@@ -302,7 +302,7 @@ q_setup(struct rte_bbdev *dev, uint16_t q_id,
rte_bbdev_log(ERR,
"Creating queue name for device %u queue %u 
failed",
dev->data->dev_id, q_id);
-   return -ENAMETOOLONG;
+   goto free_q;
}
q->enc_out = rte_zmalloc_socket(name,
((RTE_BBDEV_TURBO_MAX_TB_SIZE >> 3) + 3) *
@@ -322,7 +322,7 @@ q_setup(struct rte_bbdev *dev, uint16_t q_id,
rte_bbdev_log(ERR,
"Creating queue name for device %u queue %u 
failed",
dev->data->dev_id, q_id);
-   return -ENAMETOOLONG;
+   goto free_q;
}
q->enc_in = rte_zmalloc_socket(name,
(RTE_BBDEV_LDPC_MAX_CB_SIZE >> 3) * sizeof(*q->enc_in),
@@ -340,7 +340,7 @@ q_setup(struct rte_bbdev *dev, uint16_t q_id,
rte_bbdev_log(ERR,
"Creating queue name for device %u queue %u 
failed",
dev->data->dev_id, q_id);
-   return -ENAMETOOLONG;
+   goto free_q;
}
q->ag = rte_zmalloc_socket(name,
RTE_BBDEV_TURBO_MAX_CB_SIZE * 10 * sizeof(*q->ag),
@@ -358,7 +358,7 @@ q_setup(struct rte_bbdev *dev, uint16_t q_id,
rte_bbdev_log(ERR,
"Creating queue name for device %u queue %u 
failed",
dev->data->dev_id, q_id);
-   return -ENAMETOOLONG;
+   goto free_q;
}
q->code_block = rte_zmalloc_socket(name,
RTE_BBDEV_TURBO_MAX_CB_SIZE * sizeof(*q->code_block),
@@ -377,7 +377,7 @@ q_setup(struct rte_bbdev *dev, uint16_t q_id,
rte_bbdev_log(ERR,
"Creating queue name for device %u queue %u 
failed",
dev->data->dev_id, q_id);
-   return -ENAMETOOLONG;
+   goto free_q;
}
q->deint_input = rte_zmalloc_socket(name,
DEINT_INPUT_BUF_SIZE * sizeof(*q->deint_input),
@@ -396,7 +396,7 @@ q_setup(struct rte_bbdev *dev, uint16_t q_id,
rte_bbdev_log(ERR,
"Creating queue name for device %u queue %u 
failed",
dev->data->dev_id, q_id);
-   return -ENAMETOOLONG;
+   goto free_q;
}
q->deint_output = rte_zmalloc_socket(NULL,
DEINT_OUTPUT_BUF_SIZE * sizeof(*q->deint_output),
@@ -415,7 +415,7 @@ q_setup(struct rte_bbdev *dev, uint16_t q_id,
rte_bbdev_log(ERR,
"Creating queue name for device %u queue %u 
failed",
dev->data->dev_id, q_id);
-   return -ENAMETOOLONG;
+   goto free_q;
}
q->adapter_output = rte_zmalloc_socket(NULL,
ADAPTER_OUTPUT_BUF_SIZE * sizeof(*q->adapter_output),
@@ -433,7 +433,7 @@ q_setup(struct rte_bbdev *dev, uint16_t q_id,
rte_bbdev_log(ERR,
"Creating queue name for device %u queue %u 
failed",
dev->data->dev_id, q_id);
-   return -ENAMETOOLONG;
+   goto free_q;
}
q->processed_pkts = rte_ring_create(name, queue_conf->queue_size,
queue_conf->socket, RING_F_SP_ENQ | RING_F_SC_DEQ);
-- 
2.23.0



Re: [dpdk-dev] [EXT] Re: [PATCH 1/1] eal: increase TRACE CTF SIZE to recommended size

2020-10-07 Thread David Marchand
On Tue, Oct 6, 2020 at 11:58 AM Jerin Jacob  wrote:
>
> On Tue, Oct 6, 2020 at 3:10 PM David Marchand  
> wrote:
> >
> > On Tue, Oct 6, 2020 at 11:22 AM Sunil Kumar Kori  wrote:
> > > >On Mon, Oct 5, 2020 at 10:16 PM Timothy McDaniel
> > > > wrote:
> > > >>
> > > >> Increase TRACE_CTF_FIELD_SIZE to 448, the recommended size.
> > > >
> > > >Repeating the same sentence in the title and the commitlog does not give
> > > >much info.
> > > >
> > > >Plus, what is this "recommendation"?
> > > When analyzed this issue, only one more byte was needed to fix this issue 
> > > but in future similar issue can occur again.
> > > So increasing this value by 64 bytes which actually equals to a cache 
> > > line. That’s why we have suggested this size.
> >
> > 384 is aligned to both 64 and 128 bytes cache lines.
> > 448 is only aligned to 64 bytes.
> >
> > Should we care about 128 bytes cache lines systems?
>
> it is on a slow path. 448 is OK.

Ah yes, this is for the ctf description.
Could it be changed to rely on dynamic allocations and we simply
remove this limit?


>
> >
> > >
> > > >
> > > >
> > > >> Fixes "CTF field is too long" error when running with trace enabled.
> > > >
> > > >Ok, you hit this limit, but it would help to get some context here.
> > > >Looking at this patch in the future, we won't know why it was necessary.
> >
> > How about following commitlog:
> >
> > """
> > trace: increase trace point buffer size
> >
> > The current buffer size is not big enough to accomodate traces for new
> > additions in the eventdev subsystem.
> > Increase this buffer size by XXX for reason YYY.
> > """
>
> Looks good to me.

Well in this case, there is no actual reason.
The increased value is deemed "enough for now", unless we change this
to dynamic allocations.



-- 
David Marchand



Re: [dpdk-dev] [PATCH v3] mbuf: minor cleanup

2020-10-07 Thread Olivier Matz
Hi Morten,

Thanks for this cleanup. Please see some comments below.

On Wed, Sep 16, 2020 at 12:40:13PM +0200, Morten Brørup wrote:
> The mbuf header files had some commenting style errors that affected the
> API documentation.
> Also, the RTE_ prefix was missing on a macro and a definition.
> 
> Note: This patch does not touch the offload and attachment flags that are
> also missing the RTE_ prefix.
> 
> Changes only affecting documentation:
> * Removed the MBUF_INVALID_PORT definition from rte_mbuf.h; it is
>   already defined in rte_mbuf_core.h.
>   This removal also reestablished the description of the
>   rte_pktmbuf_reset() function.
> * Corrected the comment related to RTE_MBUF_MAX_NB_SEGS.
> * Corrected the comment related to PKT_TX_QINQ_PKT.
> 
> Changes regarding missing RTE_ prefix:
> * Converted the MBUF_RAW_ALLOC_CHECK() macro to an
>   __rte_mbuf_raw_sanity_check() inline function.
>   Added backwards compatible macro with the original name.
> * Renamed the MBUF_INVALID_PORT definition to RTE_MBUF_PORT_INVALID.
>   Added backwards compatible definition with the original name.
> 
> v2:
> * Use RTE_MBUF_PORT_INVALID instead of MBUF_INVALID_PORT in rte_mbuf.c.
> 
> v3:
> * The functions/macros used in __rte_mbuf_raw_sanity_check() require
>   RTE_ENABLE_ASSERT or RTE_LIBRTE_MBUF_DEBUG, or they don't use the mbuf
>   parameter, which generates a compiler waning. So mark the mbuf parameter
>   __rte_unused if none of them are defined.
> 
> Signed-off-by: Morten Brørup 
> ---
>  doc/guides/rel_notes/deprecation.rst |  7 
>  lib/librte_mbuf/rte_mbuf.c   |  4 +-
>  lib/librte_mbuf/rte_mbuf.h   | 55 +++-
>  lib/librte_mbuf/rte_mbuf_core.h  |  9 +++--
>  4 files changed, 45 insertions(+), 30 deletions(-)
> 
> diff --git a/doc/guides/rel_notes/deprecation.rst 
> b/doc/guides/rel_notes/deprecation.rst
> index 279eccb04..88d7d0761 100644
> --- a/doc/guides/rel_notes/deprecation.rst
> +++ b/doc/guides/rel_notes/deprecation.rst
> @@ -294,13 +294,6 @@ Deprecation Notices
>- https://patches.dpdk.org/patch/71457/
>- https://patches.dpdk.org/patch/71456/
>  
> -* rawdev: The rawdev APIs which take a device-specific structure as
> -  parameter directly, or indirectly via a "private" pointer inside another
> -  structure, will be modified to take an additional parameter of the
> -  structure size. The affected APIs will include ``rte_rawdev_info_get``,
> -  ``rte_rawdev_configure``, ``rte_rawdev_queue_conf_get`` and
> -  ``rte_rawdev_queue_setup``.
> -
>  * acl: ``RTE_ACL_CLASSIFY_NUM`` enum value will be removed.
>This enum value is not used inside DPDK, while it prevents to add new
>classify algorithms without causing an ABI breakage.

I think this change is not related.

This makes me think that a deprecation notice could be done for the
old names without the RTE_ prefix, to be removed in 21.11.


> diff --git a/lib/librte_mbuf/rte_mbuf.c b/lib/librte_mbuf/rte_mbuf.c
> index 8a456e5e6..53a015311 100644
> --- a/lib/librte_mbuf/rte_mbuf.c
> +++ b/lib/librte_mbuf/rte_mbuf.c
> @@ -104,7 +104,7 @@ rte_pktmbuf_init(struct rte_mempool *mp,
>   /* init some constant fields */
>   m->pool = mp;
>   m->nb_segs = 1;
> - m->port = MBUF_INVALID_PORT;
> + m->port = RTE_MBUF_PORT_INVALID;
>   rte_mbuf_refcnt_set(m, 1);
>   m->next = NULL;
>  }
> @@ -207,7 +207,7 @@ __rte_pktmbuf_init_extmem(struct rte_mempool *mp,
>   /* init some constant fields */
>   m->pool = mp;
>   m->nb_segs = 1;
> - m->port = MBUF_INVALID_PORT;
> + m->port = RTE_MBUF_PORT_INVALID;
>   m->ol_flags = EXT_ATTACHED_MBUF;
>   rte_mbuf_refcnt_set(m, 1);
>   m->next = NULL;
> diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
> index 7259575a7..406d3abb2 100644
> --- a/lib/librte_mbuf/rte_mbuf.h
> +++ b/lib/librte_mbuf/rte_mbuf.h
> @@ -554,12 +554,36 @@ __rte_experimental
>  int rte_mbuf_check(const struct rte_mbuf *m, int is_header,
>  const char **reason);
>  
> -#define MBUF_RAW_ALLOC_CHECK(m) do { \
> - RTE_ASSERT(rte_mbuf_refcnt_read(m) == 1);   \
> - RTE_ASSERT((m)->next == NULL);  \
> - RTE_ASSERT((m)->nb_segs == 1);  \
> - __rte_mbuf_sanity_check(m, 0);  \
> -} while (0)
> +#if defined(RTE_ENABLE_ASSERT) || defined(RTE_LIBRTE_MBUF_DEBUG)

I don't see why this #if is needed. Wouldn't it work to have only
one function definition with the __rte_unused attribute?

> +/**
> + * Sanity checks on a reinitialized mbuf.
> + *
> + * Check the consistency of the given reinitialized mbuf.
> + * The function will cause a panic if corruption is detected.
> + *
> + * Check that the mbuf is properly reinitialized (refcnt=1, next=NULL,
> + * nb_segs=1), as done by rte_pktmbuf_prefree_seg().
> + *

Maybe indicate that these checks are only done when debug is on.

> + * @param m
> + *   The

Re: [dpdk-dev] [PATCH v4 0/2] net: add CRC run-time checks and AVX512/VPCLMULQDQ based CRC

2020-10-07 Thread David Marchand
On Tue, Oct 6, 2020 at 6:23 PM Mairtin o Loingsigh
 wrote:
>
> This patchset makes two significant enhancements to the CRC modules of
> the rte_net library:
>
> 1) Adds run-time selection of the optimal architecture-specific CRC path.
>Previously the selection was solely made at compile-time, meaning it
>could only be built and run on the same generation of CPU. Adding
>run-time selection ability means this can be used from distro packages
>and/or DPDK can be compiled on an older CPU and run on a newer CPU.
> 2) Adds an optimized CRC implementation based on the AVX512 and
>VPCLMULQDQ instruction sets.
>
> For further details, please see the commit messages of the individual
> patches.

Reviews please?
Thanks.


-- 
David Marchand



[dpdk-dev] [PATCH] net/af_xdp: use snprintf instead of strncpy

2020-10-07 Thread Ciara Loftus
strncpy may leave the destination buffer not NULL terminated so use
snprintf instead.

Coverity issue: 362975
Fixes: 339b88c6a91f ("net/af_xdp: support multi-queue")
Signed-off-by: Ciara Loftus 
---
 drivers/net/af_xdp/rte_eth_af_xdp.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/af_xdp/rte_eth_af_xdp.c 
b/drivers/net/af_xdp/rte_eth_af_xdp.c
index eaf2c9c873..52495cb8fb 100644
--- a/drivers/net/af_xdp/rte_eth_af_xdp.c
+++ b/drivers/net/af_xdp/rte_eth_af_xdp.c
@@ -1362,7 +1362,7 @@ xdp_get_channels_info(const char *if_name, int 
*max_queues,
 
channels.cmd = ETHTOOL_GCHANNELS;
ifr.ifr_data = (void *)&channels;
-   strncpy(ifr.ifr_name, if_name, IFNAMSIZ);
+   snprintf(ifr.ifr_name, IFNAMSIZ, "%s", if_name);
ret = ioctl(fd, SIOCETHTOOL, &ifr);
if (ret) {
if (errno == EOPNOTSUPP) {
-- 
2.17.1



Re: [dpdk-dev] [PATCH V15 3/3] app/testpmd: add FEC command

2020-10-07 Thread Ferruh Yigit

On 10/7/2020 1:15 AM, humin (Q) wrote:

HI,Ferruh,

      how about only considering the first patch:ethdev:add fec API. If this 
patch looks great to you, I wish it could be merged into 20.11.


       To that patch, app/testpmd add fec command, I will fix it later.


Hi Connor,

Better to get them together if possible, -rc1 has been postponed to October 16, 
can this additional week help to get a new version?





       thanks.

--
胡敏 Hu Min
Mobile: +86-13528728164 
Email: humi...@huawei.com 

*发件人:*Ferruh Yigit 
*收件人:*humin (Q) ;dev 
*抄 送:*konstantin.ananyev ;thomas 
;arybchenko ;Linuxarm 


*时 间:*2020-10-01 00:53:37
*主 题:*Re: [PATCH V15 3/3] app/testpmd: add FEC command

On 9/29/2020 2:03 AM, Min Hu (Connor) wrote:

This commit adds testpmd capability to query and config FEC
function of device. This includes:
- show FEC capabilities, example:
    testpmd> show port 0 fec capabilities
- show FEC mode, example:
    testpmd> show port 0 fec_mode
- config FEC mode, example:
    testpmd> set port  0 



I guess it is:
set port  fec_mode 


    where:

    auto|off|rs|baser are four kinds of FEC mode which dev
    support according to MAC link speed.

Signed-off-by: Min Hu (Connor) 
Reviewed-by: Wei Hu (Xavier) 
Reviewed-by: Chengwen Feng 
Reviewed-by: Chengchang Tang 
---
v12->v13:
change fec get capa interface.

---
v10->v11:
change mode to capa bitmask.

---
v8->v9:
added acked-by.

---
v6->v7:
used RTE_DIM(fec_mode_name) instead of RTE_ETH_FEC_NUM

---
v5->v6:
fixed code styles according to DPDK coding style.
added _eth prefix.

---
v4->v5:
Add RTE_ prefix for public FEC mode enum.

---
v3->v4:
adjust the display format of FEC mode

---
v2->v3:
adjust the display format of FEC capability.

---
   app/test-pmd/cmdline.c | 223 
+
   app/test-pmd/config.c  |  91 
   app/test-pmd/testpmd.h |   2 +
   3 files changed, 316 insertions(+)


Can you please update the testpmd documenatation for the new commands?

Also can add the new command to the --help output? ('cmd_help_long_parsed()')

<...>


+cmdline_parse_inst_t cmd_set_fec_mode = {
+ .f = cmd_set_port_fec_mode_parsed,
+ .data = NULL,
+ .help_str = "set port  fec_mode ",


Can you please update the help string as:
"set port  fec_mode auto|off|rs|baser"

'<>' is to define the variable name, like in '' you expect numbers like
0,1,2 .. but 'auto|off|rs|baser' are keywords, not variables.




Re: [dpdk-dev] [RFC PATCH v4 3/3] app/testpmd: handle device recovery event

2020-10-07 Thread Ferruh Yigit

On 10/7/2020 5:46 AM, Kalesh Anakkur Purayil wrote:

Hi Ophir,

Thank you for the comments. I will address them in the next version.

I will push these changes as Patches next time and not as an RFC. Hope that
is OK.

Regards,
Kalesh

On Tue, Oct 6, 2020 at 10:55 PM Ophir Munk  wrote:


Hi Kalesh,
Please find a few comments.
The name you gave to the event (EVENT_RESET) is very close to an already
existing one: "EVENT_INTR_RESET".
But they are different.
EVENT_INTR_RESET originates from a port reset. It requires application
reaction. It is widely used. It is documented in *.rst files.
EVENT_RESET originates from FW error (or maybe any error). It requires no
application reaction (PMD manages by itself). It is not documented.
I therefore suggest renaming it (maybe EVENT_ERR_RECOVERING) and please
document it in *.rst files.


+1 to renaming and documenting the event.

And agree to proceed as regular patch instead of RFC.



More comments below:


-Original Message-
From: dev  On Behalf Of Kalesh A P
Sent: Wednesday, September 30, 2020 3:33 PM
To: dev@dpdk.org
Subject: [dpdk-dev] [RFC PATCH v4 3/3] app/testpmd: handle device

recovery

event

From: Kalesh AP 

Added code to handle device reset and recovery event in testpmd.
This is an indication from the PMD that device has reset and recovered

error

condition.

Signed-off-by: Kalesh AP 
Reviewed-by: Ajit Kumar Khaparde 
---
  app/test-pmd/testpmd.c | 6 +-
  1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c index
fe6450c..1c8fb46 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -380,6 +380,8 @@ static const char * const eth_event_desc[] = {
   [RTE_ETH_EVENT_NEW] = "device probed",
   [RTE_ETH_EVENT_DESTROY] = "device released",
   [RTE_ETH_EVENT_FLOW_AGED] = "flow aged",
+ [RTE_ETH_EVENT_RESET] = "device reset",


"device reset" is similar to the existing "reset" string. Can you suggest
a different one? Maybe "error under recovery" ?


+ [RTE_ETH_EVENT_RECOVERED] = "device recovery",


Wouldn't you prefer "device recovered" ?


   [RTE_ETH_EVENT_MAX] = NULL,
  };

@@ -394,7 +396,9 @@ uint32_t event_print_mask = (UINT32_C(1) <<
RTE_ETH_EVENT_UNKNOWN) |
   (UINT32_C(1) << RTE_ETH_EVENT_IPSEC) |
   (UINT32_C(1) << RTE_ETH_EVENT_MACSEC) |
   (UINT32_C(1) << RTE_ETH_EVENT_INTR_RMV) |
- (UINT32_C(1) << RTE_ETH_EVENT_FLOW_AGED);
+ (UINT32_C(1) << RTE_ETH_EVENT_FLOW_AGED) |
+ (UINT32_C(1) << RTE_ETH_EVENT_RESET) |
+ (UINT32_C(1) << RTE_ETH_EVENT_RECOVERED);
  /*
   * Decide if all memory are locked for performance.
   */
--
2.10.1









Re: [dpdk-dev] [PATCH] net/af_xdp: use snprintf instead of strncpy

2020-10-07 Thread Ferruh Yigit

On 10/7/2020 10:01 AM, Ciara Loftus wrote:

strncpy may leave the destination buffer not NULL terminated so use
snprintf instead.


What do you think using 'strlcpy'?



Coverity issue: 362975
Fixes: 339b88c6a91f ("net/af_xdp: support multi-queue")
Signed-off-by: Ciara Loftus 
---
  drivers/net/af_xdp/rte_eth_af_xdp.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/af_xdp/rte_eth_af_xdp.c 
b/drivers/net/af_xdp/rte_eth_af_xdp.c
index eaf2c9c873..52495cb8fb 100644
--- a/drivers/net/af_xdp/rte_eth_af_xdp.c
+++ b/drivers/net/af_xdp/rte_eth_af_xdp.c
@@ -1362,7 +1362,7 @@ xdp_get_channels_info(const char *if_name, int 
*max_queues,
  
  	channels.cmd = ETHTOOL_GCHANNELS;

ifr.ifr_data = (void *)&channels;
-   strncpy(ifr.ifr_name, if_name, IFNAMSIZ);
+   snprintf(ifr.ifr_name, IFNAMSIZ, "%s", if_name);
ret = ioctl(fd, SIOCETHTOOL, &ifr);
if (ret) {
if (errno == EOPNOTSUPP) {





Re: [dpdk-dev] [PATCH v4 00/25] raw/ioat: enhancements and new hardware support

2020-10-07 Thread Bruce Richardson
On Tue, Oct 06, 2020 at 11:10:09PM +0200, Thomas Monjalon wrote:
> > This patchset adds some small enhancements, some rework and also support
> > for new hardware to the ioat rawdev driver. Most rework and enhancements
> > are largely self-explanatory from the individual patches.
> 
> Another ioat patch has been merged before.
> Please could you rebase?
> 
Absolutely, will do.


[dpdk-dev] [PATCH v2] net/af_xdp: use strlcpy instead of strncpy

2020-10-07 Thread Ciara Loftus
strncpy may leave the destination buffer not NULL terminated so use
strlcpy instead.

Coverity issue: 362975
Fixes: 339b88c6a91f ("net/af_xdp: support multi-queue")
Signed-off-by: Ciara Loftus 
---
v2:
* use strlcpy instead of snprintf

 drivers/net/af_xdp/rte_eth_af_xdp.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/af_xdp/rte_eth_af_xdp.c 
b/drivers/net/af_xdp/rte_eth_af_xdp.c
index eaf2c9c873..ac00cbab8e 100644
--- a/drivers/net/af_xdp/rte_eth_af_xdp.c
+++ b/drivers/net/af_xdp/rte_eth_af_xdp.c
@@ -1362,7 +1362,7 @@ xdp_get_channels_info(const char *if_name, int 
*max_queues,
 
channels.cmd = ETHTOOL_GCHANNELS;
ifr.ifr_data = (void *)&channels;
-   strncpy(ifr.ifr_name, if_name, IFNAMSIZ);
+   strlcpy(ifr.ifr_name, if_name, IFNAMSIZ);
ret = ioctl(fd, SIOCETHTOOL, &ifr);
if (ret) {
if (errno == EOPNOTSUPP) {
-- 
2.17.1



Re: [dpdk-dev] [PATCH V1 2/3] mbuf: change free_cb interface to adapt to GSO case

2020-10-07 Thread Olivier Matz
Hi,

On Sun, Sep 27, 2020 at 01:55:21PM +0800, yang_y_yi wrote:
> Per GSO requirement, this is a must-have change, Jiayu, can you help review
> this series?

I can't ack this patch until I have a simple and clear test case (only
with mbuf functions, without GSO or vhost) showing the issue we have
today with current.

> Olivier, if you used the old interface, maybe you need to change your code to
> adapt this, I don't think we can have a better way to handle GSO case.

Sorry, I don't get your point. What do I need to change in which code?

(some more comments below)

> At 2020-08-04 09:31:19, "yang_y_yi"  wrote:
> 
> At 2020-08-03 20:34:25, "Olivier Matz"  wrote:
> >On Mon, Aug 03, 2020 at 05:42:13PM +0800, yang_y_yi wrote:
> >> At 2020-08-03 16:11:39, "Olivier Matz"  wrote:
> >> >On Mon, Aug 03, 2020 at 09:26:40AM +0800, yang_y_yi wrote:
> >> >> At 2020-08-03 04:29:07, "Olivier Matz"  wrote:
> >> >> >Hi,
> >> >> >
> >> >> >On Sun, Aug 02, 2020 at 07:12:36AM +0800, yang_y_yi wrote:
> >> >> >> 
> >> >> >> 
> >> >> >> At 2020-07-31 23:15:43, "Olivier Matz"  
> >> >> >> wrote:
> >> >> >> >Hi,
> >> >> >> >
> >> >> >> >On Thu, Jul 30, 2020 at 08:08:59PM +0800, yang_y...@163.com wrote:
> >> >> >> >> From: Yi Yang 
> >> >> >> >> 
> >> >> >> >> In GSO case, segmented mbufs are attached to original
> >> >> >> >> mbuf which can't be freed when it is external. The issue
> >> >> >> >> is free_cb doesn't know original mbuf and doesn't free
> >> >> >> >> it when refcnt of shinfo is 0.
> >> >> >> >> 
> >> >> >> >> Original mbuf can be freed by rte_pktmbuf_free segmented
> >> >> >> >> mbufs or by rte_pktmbuf_free original mbuf. Two kind of
> >> >> >> >> cases should have different behaviors. free_cb won't
> >> >> >> >> explicitly call rte_pktmbuf_free to free original mbuf
> >> >> >> >> if it is freed by rte_pktmbuf_free original mbuf, but it
> >> >> >> >> has to call rte_pktmbuf_free to free original mbuf if it
> >> >> >> >> is freed by rte_pktmbuf_free segmented mbufs.
> >> >> >> >> 
> >> >> >> >> In order to fix this issue, free_cb interface has to been
> >> >> >> >> changed, __rte_pktmbuf_free_extbuf must deliver called
> >> >> >> >> mbuf pointer to free_cb, argument opaque can be defined
> >> >> >> >> as a custom struct by user, it can includes original mbuf
> >> >> >> >> pointer, user-defined free_cb can compare caller mbuf with
> >> >> >> >> mbuf in opaque struct, free_cb should free original mbuf
> >> >> >> >> if they are not same, this corresponds to rte_pktmbuf_free
> >> >> >> >> segmented mbufs case, otherwise, free_cb won't free original
> >> >> >> >> mbuf because the caller explicitly called rte_pktmbuf_free
> >> >> >> >> to free it.
> >> >> >> >> 
> >> >> >> >> Here is pseduo code to show two kind of cases.
> >> >> >> >> 
> >> >> >> >> case 1. rte_pktmbuf_free segmented mbufs
> >> >> >> >> 
> >> >> >> >> nb_tx = rte_gso_segment(original_mbuf, /* original mbuf */
> >> >> >> >>   &gso_ctx,
> >> >> >> >>   /* segmented mbuf */
> >> >> >> >>   (struct rte_mbuf **)&gso_mbufs,
> >> >> >> >>   MAX_GSO_MBUFS);
> >> >> >> >
> >> >> >> >I'm sorry but it is not very clear to me what operations are done by
> >> >> >> >rte_gso_segment().
> >> >> >> >
> >> >> >> >In the current code, I only see calls to rte_pktmbuf_attach(),
> >> >> >> >which do not deal with external buffers. Am I missing something?
> >> >> >> >
> >> >> >> >Are you able to show the issue only with mbuf functions? It would
> >> >> >> >be helpful to understand what does not work.
> >> >> >> >
> >> >> >> >
> >> >> >> >Thanks,
> >> >> >> >Olivier
> >> >> >> >
> >> >> >> Oliver, thank you for comment, let me show you why it doesn't work 
> >> >> >> for my use case.  In OVS DPDK, VM uses vhostuserclient to send large 
> >> >> >> packets whose size is about 64K because we enabled TSO & UFO, these 
> >> >> >> large packets use rte_mbufs allocated by DPDK virtio_net function 
> >> >> >> virtio_dev_pktmbuf_alloc() (in file lib/librte_vhost/virtio_net.c. 
> >> >> >> Please refer to [PATCH V1 3/3], I changed free_cb as below, these 
> >> >> >> packets use the same allocate function and the same free_cb no 
> >> >> >> matter they are TCP packet or UDP packets, in case of VXLAN TSO, 
> >> >> >> most NICs can't support inner UDP fragment offload, so OVS DPDK has 
> >> >> >> to do it by software, for UDP case, the original rte_mbuf only can 
> >> >> >> be freed by segmented rte_mbufs which are output packets of 
> >> >> >> rte_gso_segment, i.e. the original rte_mbuf only can freed by 
> >> >> >> free_cb, you can see, it explicitly called 
> >> >> >> rte_pktmbuf_free(arg->mbuf), the condition statement "if (caller_m 
> >> >> >> != arg->mbuf)" is true for this case, this has no problem, but for 
> >> >> >> TCP case, the original mbuf is delivered to rte_eth_tx_burst() but 
> >> >> >> not segmented rte_mbufs output by rte_gso_segment, PMD driver will 
> >> >> >> call rte_pktmbuf_free(original_rt

Re: [dpdk-dev] [PATCH] net/af_xdp: use snprintf instead of strncpy

2020-10-07 Thread Olivier Matz
On Wed, Oct 07, 2020 at 10:40:32AM +0100, Ferruh Yigit wrote:
> On 10/7/2020 10:01 AM, Ciara Loftus wrote:
> > strncpy may leave the destination buffer not NULL terminated so use
> > snprintf instead.
> 
> What do you think using 'strlcpy'?

Or even better, rte_strscpy()
https://git.dpdk.org/dpdk/commit/?id=b0236c7cf761

> 
> > 
> > Coverity issue: 362975
> > Fixes: 339b88c6a91f ("net/af_xdp: support multi-queue")
> > Signed-off-by: Ciara Loftus 
> > ---
> >   drivers/net/af_xdp/rte_eth_af_xdp.c | 2 +-
> >   1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/drivers/net/af_xdp/rte_eth_af_xdp.c 
> > b/drivers/net/af_xdp/rte_eth_af_xdp.c
> > index eaf2c9c873..52495cb8fb 100644
> > --- a/drivers/net/af_xdp/rte_eth_af_xdp.c
> > +++ b/drivers/net/af_xdp/rte_eth_af_xdp.c
> > @@ -1362,7 +1362,7 @@ xdp_get_channels_info(const char *if_name, int 
> > *max_queues,
> > channels.cmd = ETHTOOL_GCHANNELS;
> > ifr.ifr_data = (void *)&channels;
> > -   strncpy(ifr.ifr_name, if_name, IFNAMSIZ);
> > +   snprintf(ifr.ifr_name, IFNAMSIZ, "%s", if_name);
> > ret = ioctl(fd, SIOCETHTOOL, &ifr);
> > if (ret) {
> > if (errno == EOPNOTSUPP) {
> > 
> 


Re: [dpdk-dev] [PATCH] librte_eal: fix mcslock hang on weak memory

2020-10-07 Thread Diogo Behrens
Hi Thomas,

we are still waiting for the comments from Honnappa. In our understanding, the 
missing barrier is a bug according to the model. We reproduced the scenario in 
herd7, which represents the authoritative memory model: 
https://developer.arm.com/architectures/cpu-architecture/a-profile/memory-model-tool

Here is a litmus code that shows that the XCHG (when compiled to LDAXR and 
STLR) is not atomic wrt memory updates to other locations:
-
AArch64 XCHG-nonatomic
{
0:X1=locked; 0:X3=next;
1:X1=locked; 1:X3=next; 1:X5=tail;
}
 P0 | P1;
 LDR W0, [X3]   | MOV W0, #1;
 CBZ W0, end| STR W0, [X1]; (* init locked *) 
 MOV W2, #2 | MOV W2, #0;
 STR W2, [X1]   | xchg:;
 end:   | LDAXR W6, [X5];
 NOP| STLXR W4, W0, [X5];
 NOP| CBNZ W4, xchg;
 NOP| STR W0, [X3]; (* set next *) 
exists
(0:X2=2 /\ locked=1)
-
(web version of herd7: http://diy.inria.fr/www/?record=aarch64)

P1 is trying to acquire the lock:
- initializes locked
- does the xchg on the tail of the mcslock
- sets the next

P0 is releasing the lock:
- if next is not set, just terminates
- if next is set, stores 2 in locked

The initialization of locked should never overwrite the store 2 to locked, but 
it does.
To avoid that reordering to happen, one should make the last store of P1 to 
have a "release" barrier, ie, STLR.

This is equivalent to the reordering occurring in the mcslock of librte_eal.

Best regards,
-Diogo

-Original Message-
From: Thomas Monjalon [mailto:tho...@monjalon.net] 
Sent: Tuesday, October 6, 2020 11:50 PM
To: Phil Yang ; Diogo Behrens ; 
Honnappa Nagarahalli 
Cc: dev@dpdk.org; nd 
Subject: Re: [dpdk-dev] [PATCH] librte_eal: fix mcslock hang on weak memory

31/08/2020 20:45, Honnappa Nagarahalli:
> 
> Hi Diogo,
> 
> Thanks for your explanation.
> 
> As documented in https://developer.arm.com/documentation/ddi0487/fc  B2.9.5 
> Load-Exclusive and Store-Exclusive instruction usage restrictions:
> " Between the Load-Exclusive and the Store-Exclusive, there are no 
> explicit memory accesses, preloads, direct or indirect System register 
> writes, address translation instructions, cache or TLB maintenance 
> instructions, exception generating instructions, exception returns, or 
> indirect branches."
> [Honnappa] This is a requirement on the software, not on the 
> micro-architecture.
> We are having few discussions internally, will get back soon.
> 
> So it is not allowed to insert (1) & (4) between (2, 3). The cmpxchg 
> operation is atomic.


Please what is the conclusion?





Re: [dpdk-dev] [dpdk-stable] [PATCH] net/mlx5: disable flow dump for Verbs flow

2020-10-07 Thread Ferruh Yigit

On 10/6/2020 4:24 PM, Raslan Darawsheh wrote:

Hi,


-Original Message-
From: dev  On Behalf Of Xueming Li
Sent: Thursday, September 10, 2020 7:25 AM
To: Matan Azrad ; Slava Ovsiienko

Cc: dev@dpdk.org; Asaf Penso ; sta...@dpdk.org
Subject: [dpdk-dev] [PATCH] net/mlx5: disable flow dump for Verbs flow

There was a segment fault when dump flows with device argument of
dv_flow_en=0. In such case, Verbs flow engine was enabled and fdb
resources were not initialized. It's suggested to use mlx_fs_dump
for Verbs flow dump.

This patch adds verbs engine check, prints warning message and return
gracefully.

Cc: sta...@dpdk.org
Signed-off-by: Xueming Li 


Can you please provide the fixed commit information?
That is required to figure out which stable tress to backport.


---
  drivers/net/mlx5/mlx5_flow.c | 5 +
  1 file changed, 5 insertions(+)

diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c
index 4c29898203..5a28b80ee4 100644
--- a/drivers/net/mlx5/mlx5_flow.c
+++ b/drivers/net/mlx5/mlx5_flow.c
@@ -6268,6 +6268,11 @@ mlx5_flow_dev_dump(struct rte_eth_dev *dev,
struct mlx5_priv *priv = dev->data->dev_private;
struct mlx5_dev_ctx_shared *sh = priv->sh;

+   if (!priv->config.dv_flow_en) {
+   if (fputs("device dv flow disabled\n", file) <= 0)
+   return -errno;
+   return -ENOTSUP;
+   }
return mlx5_devx_cmd_flow_dump(sh->fdb_domain, sh-

rx_domain,

   sh->tx_domain, file);
  }
--
2.25.1


Patch applied to next-net-mlx,


Kindest regards,
Raslan Darawsheh





[dpdk-dev] [PATCH v2] net/memif: use abstract socket address

2020-10-07 Thread Jakub Grajciar
Abstract socket address has no connection with
filesystem pathnames and the socket dissapears
once all open references are closed.

Memif pmd will use abstract socket address by default.
For backwards compatibility use new argument
'socket-abstract=no'

Signed-off-by: Jakub Grajciar 
---
 doc/guides/nics/memif.rst |  1 +
 drivers/net/memif/memif_socket.c  | 28 +
 drivers/net/memif/rte_eth_memif.c | 34 ++-
 drivers/net/memif/rte_eth_memif.h | 10 +
 4 files changed, 60 insertions(+), 13 deletions(-)

v2:
- fixed coding style issues

diff --git a/doc/guides/nics/memif.rst b/doc/guides/nics/memif.rst
index ddeebed25..8e80105a4 100644
--- a/doc/guides/nics/memif.rst
+++ b/doc/guides/nics/memif.rst
@@ -43,6 +43,7 @@ client.
"bsize=1024", "Size of single packet buffer", "2048", "uint16_t"
"rsize=11", "Log2 of ring size. If rsize is 10, actual ring size is 1024", 
"10", "1-14"
"socket=/tmp/memif.sock", "Socket filename", "/tmp/memif.sock", "string len 
108"
+   "socket-abstract=no", "Set usage of abstract socket address", "yes", 
"yes|no"
"mac=01:23:45:ab:cd:ef", "Mac address", "01:ab:23:cd:45:ef", ""
"secret=abc123", "Secret is an optional security option, which if 
specified, must be matched by peer", "", "string len 24"
"zero-copy=yes", "Enable/disable zero-copy slave mode. Only relevant to 
slave, requires '--single-file-segments' eal argument", "no", "yes|no"
diff --git a/drivers/net/memif/memif_socket.c b/drivers/net/memif/memif_socket.c
index 67794cb6f..8441f5397 100644
--- a/drivers/net/memif/memif_socket.c
+++ b/drivers/net/memif/memif_socket.c
@@ -862,10 +862,10 @@ memif_listener_handler(void *arg)
 }
 
 static struct memif_socket *
-memif_socket_create(char *key, uint8_t listener)
+memif_socket_create(char *key, uint8_t listener, bool is_abstract)
 {
struct memif_socket *sock;
-   struct sockaddr_un un;
+   struct sockaddr_un un = { 0 };
int sockfd;
int ret;
int on = 1;
@@ -886,7 +886,13 @@ memif_socket_create(char *key, uint8_t listener)
goto error;
 
un.sun_family = AF_UNIX;
-   strlcpy(un.sun_path, sock->filename, MEMIF_SOCKET_UN_SIZE);
+   if (is_abstract) {
+   /* abstract address */
+   un.sun_path[0] = '\0';
+   strlcpy(un.sun_path + 1, sock->filename, 
MEMIF_SOCKET_UN_SIZE - 1);
+   } else {
+   strlcpy(un.sun_path, sock->filename, 
MEMIF_SOCKET_UN_SIZE);
+   }
 
ret = setsockopt(sockfd, SOL_SOCKET, SO_PASSCRED, &on,
 sizeof(on));
@@ -963,7 +969,8 @@ memif_socket_init(struct rte_eth_dev *dev, const char 
*socket_filename)
ret = rte_hash_lookup_data(hash, key, (void **)&socket);
if (ret < 0) {
socket = memif_socket_create(key,
-(pmd->role == MEMIF_ROLE_SLAVE) ? 
0 : 1);
+   (pmd->role == MEMIF_ROLE_SLAVE) ? 0 : 1,
+   pmd->flags & ETH_MEMIF_FLAG_SOCKET_ABSTRACT);
if (socket == NULL)
return -1;
ret = rte_hash_add_key_data(hash, key, socket);
@@ -1025,7 +1032,7 @@ memif_socket_remove_device(struct rte_eth_dev *dev)
/* remove socket, if this was the last device using it */
if (TAILQ_EMPTY(&socket->dev_queue)) {
rte_hash_del_key(hash, socket->filename);
-   if (socket->listener) {
+   if (socket->listener && !(pmd->flags & 
ETH_MEMIF_FLAG_SOCKET_ABSTRACT)) {
/* remove listener socket file,
 * so we can create new one later.
 */
@@ -1054,7 +1061,7 @@ memif_connect_slave(struct rte_eth_dev *dev)
 {
int sockfd;
int ret;
-   struct sockaddr_un sun;
+   struct sockaddr_un sun = { 0 };
struct pmd_internals *pmd = dev->data->dev_private;
 
memset(pmd->local_disc_string, 0, ETH_MEMIF_DISC_STRING_SIZE);
@@ -1068,8 +1075,13 @@ memif_connect_slave(struct rte_eth_dev *dev)
}
 
sun.sun_family = AF_UNIX;
-
-   memcpy(sun.sun_path, pmd->socket_filename, sizeof(sun.sun_path) - 1);
+   if (pmd->flags & ETH_MEMIF_FLAG_SOCKET_ABSTRACT) {
+   /* abstract address */
+   sun.sun_path[0] = '\0';
+   memcpy(sun.sun_path + 1, pmd->socket_filename, 
sizeof(sun.sun_path) - 2);
+   } else {
+   memcpy(sun.sun_path, pmd->socket_filename, sizeof(sun.sun_path) 
- 1);
+   }
 
ret = connect(sockfd, (struct sockaddr *)&sun,
  sizeof(struct sockaddr_un));
diff --git a/drivers/net/memif/rte_eth_memif.c 
b/drivers/net/memif/rte_eth_memif.c
index c1c7e9f8d..55c865d44 100644
--- a/drivers/net/memif/rte_eth_memif.c
+++ b/drivers/net/memif/rte_eth_memi

[dpdk-dev] [PATCH v3] net/af_xdp: use rte_strscpy instead of strncpy

2020-10-07 Thread Ciara Loftus
strncpy may leave the destination buffer not NULL terminated so use
rte_strscpy instead.

Coverity issue: 362975
Fixes: 339b88c6a91f ("net/af_xdp: support multi-queue")
Signed-off-by: Ciara Loftus 
---
v3:
* use rte_strscpy instead of strlcpy

v2:
* use strlcpy instead of snprintf

 drivers/net/af_xdp/rte_eth_af_xdp.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/af_xdp/rte_eth_af_xdp.c 
b/drivers/net/af_xdp/rte_eth_af_xdp.c
index eaf2c9c873..ac00cbab8e 100644
--- a/drivers/net/af_xdp/rte_eth_af_xdp.c
+++ b/drivers/net/af_xdp/rte_eth_af_xdp.c
@@ -1362,7 +1362,7 @@ xdp_get_channels_info(const char *if_name, int 
*max_queues,
 
channels.cmd = ETHTOOL_GCHANNELS;
ifr.ifr_data = (void *)&channels;
-   strncpy(ifr.ifr_name, if_name, IFNAMSIZ);
+   strlcpy(ifr.ifr_name, if_name, IFNAMSIZ);
ret = ioctl(fd, SIOCETHTOOL, &ifr);
if (ret) {
if (errno == EOPNOTSUPP) {
-- 
2.17.1



[dpdk-dev] [PATCH v4] net/af_xdp: use rte_strscpy instead of strncpy

2020-10-07 Thread Ciara Loftus
strncpy may leave the destination buffer not NULL terminated so use
rte_strscpy instead.

Coverity issue: 362975
Fixes: 339b88c6a91f ("net/af_xdp: support multi-queue")
Signed-off-by: Ciara Loftus 
---
v4:
* actually use rte_strscpy instead of strlcpy

v3:
* use rte_strscpy instead of strlcpy

v2:
* use strlcpy instead of snprintf

 drivers/net/af_xdp/rte_eth_af_xdp.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/af_xdp/rte_eth_af_xdp.c 
b/drivers/net/af_xdp/rte_eth_af_xdp.c
index eaf2c9c873..f394c57a74 100644
--- a/drivers/net/af_xdp/rte_eth_af_xdp.c
+++ b/drivers/net/af_xdp/rte_eth_af_xdp.c
@@ -1362,7 +1362,7 @@ xdp_get_channels_info(const char *if_name, int 
*max_queues,
 
channels.cmd = ETHTOOL_GCHANNELS;
ifr.ifr_data = (void *)&channels;
-   strncpy(ifr.ifr_name, if_name, IFNAMSIZ);
+   rte_strscpy(ifr.ifr_name, if_name, IFNAMSIZ);
ret = ioctl(fd, SIOCETHTOOL, &ifr);
if (ret) {
if (errno == EOPNOTSUPP) {
-- 
2.17.1



Re: [dpdk-dev] [PATCH] maintainers: improve coverage of arch-specific files

2020-10-07 Thread Thomas Monjalon
07/10/2020 09:44, David Marchand:
> On Tue, Oct 6, 2020 at 10:23 PM Thomas Monjalon  wrote:
> >
> > The sub-directories of config/ are maintained by
> > different architecture maintainers.
> >
> > Some wildcards are used to describe the lib and drivers files
> > which are specific to some architectures.
> 
> I noticed "some" misses, see below.
> 
> >
> > The EAL Arm files have split responsibilities depending on 32/64 suffix,
> > and the common files are shared between Armv7 and Armv8 sections.
> >
> > Signed-off-by: Thomas Monjalon 
> > ---
> >  MAINTAINERS | 23 +++
> >  1 file changed, 11 insertions(+), 12 deletions(-)
> >
> > diff --git a/MAINTAINERS b/MAINTAINERS
> > index 75a17d51c0..349fc9c38a 100644
> > --- a/MAINTAINERS
> > +++ b/MAINTAINERS
> > @@ -256,32 +256,31 @@ F: app/test/test_rand_perf.c
> >  ARM v7
> >  M: Jan Viktorin 
> >  M: Ruifeng Wang 
> > +F: config/arm/
> >  F: lib/librte_eal/arm/
> > +X: lib/librte_eal/arm/include/*_64.h
> >
> >  ARM v8
> >  M: Jerin Jacob 
> >  M: Ruifeng Wang 
> > -F: lib/librte_eal/arm/include/*_64.h
> > -F: lib/librte_net/net_crc_neon.h
> > -F: lib/librte_acl/acl_run_neon.*
> > -F: lib/librte_bpf/bpf_jit_arm64.c
> > -F: lib/librte_lpm/rte_lpm_neon.h
> > -F: lib/librte_hash/rte*_arm64.h
> > -F: lib/librte_efd/rte*_arm64.h
> > -F: lib/librte_table/rte*_arm64.h
> > -F: drivers/net/ixgbe/ixgbe_rxtx_vec_neon.c
> > -F: drivers/net/i40e/i40e_rxtx_vec_neon.c
> > -F: drivers/net/virtio/virtio_rxtx_simple_neon.c
> > +F: config/arm/
> > +F: lib/librte_eal/arm/
> > +X: lib/librte_eal/arm/include/*_32.h
> > +F: lib/*/*_neon.*
> > +F: lib/*/*_arm64.*
> > +F: drivers/*/*/*_neon.*
> 
> app and examples directory are not caught in this:
> 
> ./lib/librte_net/net_crc_neon.h
> ./lib/librte_acl/acl_run_neon.h
> ./lib/librte_acl/acl_run_neon.c
> ./lib/librte_lpm/rte_lpm_neon.h
> ./lib/librte_node/ip4_lookup_neon.h
> ./examples/l3fwd/l3fwd_lpm_neon.h
> ./examples/l3fwd/l3fwd_em_hlm_neon.h
> ./examples/l3fwd/l3fwd_neon.h
> ./app/test-pmd/macswap_neon.h
> ./drivers/net/hns3/hns3_rxtx_vec_neon.h
> ./drivers/net/bnxt/bnxt_rxtx_vec_neon.c
> ./drivers/net/mlx5/mlx5_rxtx_vec_neon.h
> ./drivers/net/virtio/virtio_rxtx_simple_neon.c
> ./drivers/net/ixgbe/ixgbe_rxtx_vec_neon.c
> ./drivers/net/i40e/i40e_rxtx_vec_neon.c
> 
> Can we use a wildcard at dpdk root?
> */*/*_neon.*
> 
> The drivers/*/*/*_neon.* entry would still be needed though.
> 
> 
> >
> >  IBM POWER (alpha)
> >  M: David Christensen 
> > +F: config/ppc/
> >  F: lib/librte_eal/ppc/
> > -F: drivers/net/i40e/i40e_rxtx_vec_altivec.c
> > +F: drivers/*/*/*_altivec.*
> >  F: examples/l3fwd/*altivec.h
> 
> Idem neon, lib files missing.
> 
> ./lib/librte_eal/ppc/include/rte_altivec.h
> ./lib/librte_acl/acl_run_altivec.c
> ./lib/librte_acl/acl_run_altivec.h
> ./lib/librte_lpm/rte_lpm_altivec.h
> ./examples/l3fwd/l3fwd_altivec.h
> ./examples/l3fwd/l3fwd_lpm_altivec.h
> ./drivers/net/mlx5/mlx5_rxtx_vec_altivec.h
> ./drivers/net/virtio/virtio_rxtx_simple_altivec.c
> ./drivers/net/i40e/i40e_rxtx_vec_altivec.c
> 
> 
> >
> >  Intel x86
> >  M: Bruce Richardson 
> >  M: Konstantin Ananyev 
> > +F: config/x86/
> >  F: lib/librte_eal/x86/
> 
> Nothing for avx/sse ?

I was not extending the scope too much.
I think you're right, we can go further in apps and x86-specific files.




Re: [dpdk-dev] [EXT] Re: [PATCH 1/1] eal: increase TRACE CTF SIZE to recommended size

2020-10-07 Thread Sunil Kumar Kori


Regards
Sunil Kumar Kori

>-Original Message-
>From: David Marchand 
>Sent: Wednesday, October 7, 2020 2:34 PM
>To: Jerin Jacob 
>Cc: Sunil Kumar Kori ; Timothy McDaniel
>; Jerin Jacob Kollanukkaran
>; dev ; Erik Gabriel Carrillo
>; Gage Eads ; Van Haaren
>Harry 
>Subject: Re: [dpdk-dev] [EXT] Re: [PATCH 1/1] eal: increase TRACE CTF SIZE to
>recommended size
>
>On Tue, Oct 6, 2020 at 11:58 AM Jerin Jacob  wrote:
>>
>> On Tue, Oct 6, 2020 at 3:10 PM David Marchand
> wrote:
>> >
>> > On Tue, Oct 6, 2020 at 11:22 AM Sunil Kumar Kori 
>wrote:
>> > > >On Mon, Oct 5, 2020 at 10:16 PM Timothy McDaniel
>> > > > wrote:
>> > > >>
>> > > >> Increase TRACE_CTF_FIELD_SIZE to 448, the recommended size.
>> > > >
>> > > >Repeating the same sentence in the title and the commitlog does
>> > > >not give much info.
>> > > >
>> > > >Plus, what is this "recommendation"?
>> > > When analyzed this issue, only one more byte was needed to fix this
>issue but in future similar issue can occur again.
>> > > So increasing this value by 64 bytes which actually equals to a cache 
>> > > line.
>That’s why we have suggested this size.
>> >
>> > 384 is aligned to both 64 and 128 bytes cache lines.
>> > 448 is only aligned to 64 bytes.
>> >
>> > Should we care about 128 bytes cache lines systems?
>>
>> it is on a slow path. 448 is OK.
>
>Ah yes, this is for the ctf description.
>Could it be changed to rely on dynamic allocations and we simply remove this
>limit?
Changing it to dynamic allocation is difficult because if we do this then every 
time memory is to reallocated.
So IMO, It is okay to keep it static with enough size.
>
>
>>
>> >
>> > >
>> > > >
>> > > >
>> > > >> Fixes "CTF field is too long" error when running with trace enabled.
>> > > >
>> > > >Ok, you hit this limit, but it would help to get some context here.
>> > > >Looking at this patch in the future, we won't know why it was
>necessary.
>> >
>> > How about following commitlog:
>> >
>> > """
>> > trace: increase trace point buffer size
>> >
>> > The current buffer size is not big enough to accomodate traces for
>> > new additions in the eventdev subsystem.
>> > Increase this buffer size by XXX for reason YYY.
>> > """
>>
>> Looks good to me.
>
>Well in this case, there is no actual reason.
>The increased value is deemed "enough for now", unless we change this to
>dynamic allocations.
>
>
>
>--
>David Marchand



Re: [dpdk-dev] [PATCH 0/2] baseband: fix drivers API

2020-10-07 Thread Thomas Monjalon
07/10/2020 10:22, Maxime Coquelin:
> Hi Akhil,
> 
> Adding Thomas as he pulls your tree.
> 
> On 10/6/20 9:42 PM, Akhil Goyal wrote:
> > 
> >>
> >> The series prefixes drivers APIs with rte__ in
> >> order to avoid namespace pollution.
> >>
> >> These APIs are experimental, so no need to follow the
> >> deprecation process.
> >>
> > Added Fixes commit in patch description.
> 
> Thanks for applying it to your tree.
> 
> I did not add Fixes tag and Cc'ed sta...@dpdk.org on purpose,
> because I thought it is was not a good idea to backport API changes,
> even if this is experimental it might be annoying for the user.
> 
> Thomas, do you confirm?

Absolutely: API must not change in a stable branch.

If an API is changed, it must be in the release notes.
But stable branches are not allowed to update such sections
of the release notes.




Re: [dpdk-dev] [PATCH 0/2] baseband: fix drivers API

2020-10-07 Thread Maxime Coquelin



On 10/7/20 12:09 PM, Thomas Monjalon wrote:
> 07/10/2020 10:22, Maxime Coquelin:
>> Hi Akhil,
>>
>> Adding Thomas as he pulls your tree.
>>
>> On 10/6/20 9:42 PM, Akhil Goyal wrote:
>>>

 The series prefixes drivers APIs with rte__ in
 order to avoid namespace pollution.

 These APIs are experimental, so no need to follow the
 deprecation process.

>>> Added Fixes commit in patch description.
>>
>> Thanks for applying it to your tree.
>>
>> I did not add Fixes tag and Cc'ed sta...@dpdk.org on purpose,
>> because I thought it is was not a good idea to backport API changes,
>> even if this is experimental it might be annoying for the user.
>>
>> Thomas, do you confirm?
> 
> Absolutely: API must not change in a stable branch.
> 
> If an API is changed, it must be in the release notes.

Ok, even for experimental APIs? I thought not.

> But stable branches are not allowed to update such sections
> of the release notes.
> 
> 



Re: [dpdk-dev] [PATCH 1/2] app/test: uninit vdevs in event eth Rx adapter autotest

2020-10-07 Thread Rao, Nikhil



> -Original Message-
> From: Jayatheerthan, Jay 
> Sent: Saturday, October 3, 2020 2:36 PM
> To: jer...@marvell.com; tho...@monjalon.net; Rao, Nikhil
> 
> Cc: dev@dpdk.org; sta...@dpdk.org; Jayatheerthan, Jay
> 
> Subject: [PATCH 1/2] app/test: uninit vdevs in event eth Rx adapter autotest
> 
> From: "Jay Jayatheerthan" 
> 
> adapter_multi_eth_add_del() does vdev init but doesn't uninit them.
> This causes issues when running event_eth_rx_adapter_autotest multiple
> times.
> 
> The fix does vdev uninit before exiting the test.
> 
> Signed-off-by: Jay Jayatheerthan 
> ---
>  app/test/test_event_eth_rx_adapter.c | 14 +-
>  1 file changed, 13 insertions(+), 1 deletion(-)
> 
> diff --git a/app/test/test_event_eth_rx_adapter.c
> b/app/test/test_event_eth_rx_adapter.c
> index dd3bce71b..71c946164 100644
> --- a/app/test/test_event_eth_rx_adapter.c
> +++ b/app/test/test_event_eth_rx_adapter.c
> @@ -464,7 +464,7 @@ adapter_multi_eth_add_del(void)
>   int err;
>   struct rte_event ev;
> 
> - uint16_t port_index, drv_id = 0;
> + uint16_t port_index, port_index_base, drv_id = 0;
>   char driver_name[50];
> 
>   struct rte_event_eth_rx_adapter_queue_conf queue_config; @@ -
> 484,6 +484,7 @@ adapter_multi_eth_add_del(void)
> 
>   /* add the max port for rx_adapter */
>   port_index = rte_eth_dev_count_total();
> + port_index_base = port_index;
>   for (; port_index < RTE_MAX_ETHPORTS; port_index += 1) {
>   snprintf(driver_name, sizeof(driver_name), "%s%u",
> "net_null",
>   drv_id);
> @@ -513,6 +514,17 @@ adapter_multi_eth_add_del(void)
>   TEST_ASSERT(err == 0, "Expected 0 got %d", err);
>   }
> 
> + /* delete vdev ports */
> + for (drv_id = 0, port_index = port_index_base;
> +  port_index < RTE_MAX_ETHPORTS;
> +  drv_id += 1, port_index += 1) {
> + snprintf(driver_name, sizeof(driver_name), "%s%u",
> "net_null",
> + drv_id);
> + err = rte_vdev_uninit(driver_name);
> + TEST_ASSERT(err == 0, "Failed driver %s got %d",
> + driver_name, err);
> + }
> +
>   return TEST_SUCCESS;
>  }
> 
> --
> 2.17.1
Reviewed-by: Nikhil Rao 


Re: [dpdk-dev] [PATCH 2/2] app/test: add net null dev creation in Rx adapter autotest

2020-10-07 Thread Rao, Nikhil
> -Original Message-
> From: Jayatheerthan, Jay 
> Sent: Saturday, October 3, 2020 2:36 PM
> To: jer...@marvell.com; tho...@monjalon.net; Rao, Nikhil
> 
> Cc: dev@dpdk.org; sta...@dpdk.org; Jayatheerthan, Jay
> 
> Subject: [PATCH 2/2] app/test: add net null dev creation in Rx adapter
> autotest
> 
> From: "Jay Jayatheerthan" 
> 
> Allows creation of net_null if vdev EAL option is not specified and uninit 
> vdev
> created in the test. The change also adds error checks for vdev init and 
> uninit.
> 
> Signed-off-by: Jay Jayatheerthan 
> ---
>  app/test/test_event_eth_rx_adapter.c | 61 +++-
>  1 file changed, 59 insertions(+), 2 deletions(-)
> 
> diff --git a/app/test/test_event_eth_rx_adapter.c
> b/app/test/test_event_eth_rx_adapter.c
> index 71c946164..dbf85be35 100644
> --- a/app/test/test_event_eth_rx_adapter.c
> +++ b/app/test/test_event_eth_rx_adapter.c
> @@ -30,6 +30,8 @@ struct event_eth_rx_adapter_test_params {  };
> 
>  static struct event_eth_rx_adapter_test_params default_params;
> +static bool event_dev_created;
> +static bool eth_dev_created;
> 
>  static inline int
>  port_init_common(uint16_t port, const struct rte_eth_conf *port_conf, @@ -
> 202,7 +204,10 @@ testsuite_setup(void)
>   if (!count) {
>   printf("Failed to find a valid event device,"
>   " testing with event_skeleton device\n");
> - rte_vdev_init("event_skeleton", NULL);
> + err = rte_vdev_init("event_skeleton", NULL);
> + TEST_ASSERT(err == 0, "Failed to create event_skeleton.
> err=%d",
> + err);
> + event_dev_created = true;
>   }
> 
>   struct rte_event_dev_config config = { @@ -222,6 +227,15 @@
> testsuite_setup(void)
>   TEST_ASSERT(err == 0, "Event device initialization failed err %d\n",
>   err);
> 
> + count = rte_eth_dev_count_total();
> + if (!count) {
> + printf("Testing with net_null device\n");
> + err = rte_vdev_init("net_null", NULL);
> + TEST_ASSERT(err == 0, "Failed to create net_null. err=%d",
> + err);
> + eth_dev_created = true;
> + }
> +
>   /*
>* eth devices like octeontx use event device to receive packets
>* so rte_eth_dev_start invokes rte_event_dev_start internally, so
> @@ -249,7 +263,10 @@ testsuite_setup_rx_intr(void)
>   if (!count) {
>   printf("Failed to find a valid event device,"
>   " testing with event_skeleton device\n");
> - rte_vdev_init("event_skeleton", NULL);
> + err = rte_vdev_init("event_skeleton", NULL);
> + TEST_ASSERT(err == 0, "Failed to create event_skeleton.
> err=%d",
> + err);
> + event_dev_created = true;
>   }
> 
>   struct rte_event_dev_config config = { @@ -270,6 +287,15 @@
> testsuite_setup_rx_intr(void)
>   TEST_ASSERT(err == 0, "Event device initialization failed err %d\n",
>   err);
> 
> + count = rte_eth_dev_count_total();
> + if (!count) {
> + printf("Testing with net_null device\n");
> + err = rte_vdev_init("net_null", NULL);
> + TEST_ASSERT(err == 0, "Failed to create net_null. err=%d",
> + err);
> + eth_dev_created = true;
> + }
> +
>   /*
>* eth devices like octeontx use event device to receive packets
>* so rte_eth_dev_start invokes rte_event_dev_start internally, so
> @@ -292,21 +318,52 @@ testsuite_setup_rx_intr(void)  static void
>  testsuite_teardown(void)
>  {
> + int err;
>   uint32_t i;
>   RTE_ETH_FOREACH_DEV(i)
>   rte_eth_dev_stop(i);
> 
> + if (eth_dev_created) {
> + err = rte_vdev_uninit("net_null");
> + if (err)
> + printf("Failed to delete net_null. err=%d", err);
> + eth_dev_created = false;
> + }
> +
>   rte_mempool_free(default_params.mp);
> + if (event_dev_created) {
> + err = rte_vdev_uninit("event_skeleton");
> + if (err)
> + printf("Failed to delete event_skeleton. err=%d", err);
> + event_dev_created = false;
> + }
> +
> + memset(&default_params, 0, sizeof(default_params));
>  }
> 
>  static void
>  testsuite_teardown_rx_intr(void)
>  {
> + int err;
>   if (!default_params.rx_intr_port_inited)
>   return;
> 
>   rte_eth_dev_stop(default_params.rx_intr_port);
> + if (eth_dev_created) {
> + err = rte_vdev_uninit("net_null");
> + if (err)
> + printf("Failed to delete net_null. err=%d", err);
> + eth_dev_created = false;
> + }
>   rte_mempool_free(default_params.mp);
> + if (event_dev_created) {
> + err = rte_vdev_uninit("event_skeleton");
> + if (err)
> +   

Re: [dpdk-dev] [PATCH] net/af_xdp: use snprintf instead of strncpy

2020-10-07 Thread Bruce Richardson
On Wed, Oct 07, 2020 at 11:51:31AM +0200, Olivier Matz wrote:
> On Wed, Oct 07, 2020 at 10:40:32AM +0100, Ferruh Yigit wrote:
> > On 10/7/2020 10:01 AM, Ciara Loftus wrote:
> > > strncpy may leave the destination buffer not NULL terminated so use
> > > snprintf instead.
> > 
> > What do you think using 'strlcpy'?
> 
> Or even better, rte_strscpy()
> https://git.dpdk.org/dpdk/commit/?id=b0236c7cf761
> 
I think this is largely a matter of preference, and unless there is a good
reason not to, I tend towards strlcpy as the older and more common (till
now) interface. The main thing is just to use a function that will
guarantee dest is null-terminated here, and both strlcpy and strscpy meet
that criteria.

/Bruce


Re: [dpdk-dev] [PATCH] net/af_xdp: use snprintf instead of strncpy

2020-10-07 Thread Bruce Richardson
On Wed, Oct 07, 2020 at 11:26:38AM +0100, Bruce Richardson wrote:
> On Wed, Oct 07, 2020 at 11:51:31AM +0200, Olivier Matz wrote:
> > On Wed, Oct 07, 2020 at 10:40:32AM +0100, Ferruh Yigit wrote:
> > > On 10/7/2020 10:01 AM, Ciara Loftus wrote:
> > > > strncpy may leave the destination buffer not NULL terminated so use
> > > > snprintf instead.
> > > 
> > > What do you think using 'strlcpy'?
> > 
> > Or even better, rte_strscpy()
> > https://git.dpdk.org/dpdk/commit/?id=b0236c7cf761
> > 
> I think this is largely a matter of preference, and unless there is a good
> reason not to, I tend towards strlcpy as the older and more common (till
> now) interface. The main thing is just to use a function that will
> guarantee dest is null-terminated here, and both strlcpy and strscpy meet
> that criteria.
> 
I'd also add that strlcpy is more likely to be recognised by tools like
coverity, compared to rte_strscpy which is DPDK-specific.

/Bruce


Re: [dpdk-dev] [PATCH 0/2] baseband: fix drivers API

2020-10-07 Thread Thomas Monjalon
07/10/2020 12:12, Maxime Coquelin:
> On 10/7/20 12:09 PM, Thomas Monjalon wrote:
> > 07/10/2020 10:22, Maxime Coquelin:
> >> On 10/6/20 9:42 PM, Akhil Goyal wrote:
> 
>  The series prefixes drivers APIs with rte__ in
>  order to avoid namespace pollution.
> 
>  These APIs are experimental, so no need to follow the
>  deprecation process.
> 
> >>> Added Fixes commit in patch description.
> >>
> >> Thanks for applying it to your tree.
> >>
> >> I did not add Fixes tag and Cc'ed sta...@dpdk.org on purpose,
> >> because I thought it is was not a good idea to backport API changes,
> >> even if this is experimental it might be annoying for the user.
> >>
> >> Thomas, do you confirm?
> > 
> > Absolutely: API must not change in a stable branch.
> > 
> > If an API is changed, it must be in the release notes.
> 
> Ok, even for experimental APIs? I thought not.

Yes, experimental means it can change in the main branch
without prior notice. But it must be noted when it's changed.





Re: [dpdk-dev] [PATCH v2] net/af_xdp: use strlcpy instead of strncpy

2020-10-07 Thread Bruce Richardson
On Wed, Oct 07, 2020 at 09:20:50AM +, Ciara Loftus wrote:
> strncpy may leave the destination buffer not NULL terminated so use
> strlcpy instead.
> 
> Coverity issue: 362975
> Fixes: 339b88c6a91f ("net/af_xdp: support multi-queue")
> Signed-off-by: Ciara Loftus 
> ---

Acked-by: Bruce Richardson 

If the community prefers using rte_strscpy, this ack can also apply to v4
too. :-)

> v2:
> * use strlcpy instead of snprintf
> 
>  drivers/net/af_xdp/rte_eth_af_xdp.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/net/af_xdp/rte_eth_af_xdp.c 
> b/drivers/net/af_xdp/rte_eth_af_xdp.c
> index eaf2c9c873..ac00cbab8e 100644
> --- a/drivers/net/af_xdp/rte_eth_af_xdp.c
> +++ b/drivers/net/af_xdp/rte_eth_af_xdp.c
> @@ -1362,7 +1362,7 @@ xdp_get_channels_info(const char *if_name, int 
> *max_queues,
>  
>   channels.cmd = ETHTOOL_GCHANNELS;
>   ifr.ifr_data = (void *)&channels;
> - strncpy(ifr.ifr_name, if_name, IFNAMSIZ);
> + strlcpy(ifr.ifr_name, if_name, IFNAMSIZ);
>   ret = ioctl(fd, SIOCETHTOOL, &ifr);
>   if (ret) {
>   if (errno == EOPNOTSUPP) {
> -- 
> 2.17.1
> 


Re: [dpdk-dev] [PATCH] cryptodev: revert ABI compatibility for ChaCha20-Poly1305

2020-10-07 Thread Doherty, Declan



On 06/10/2020 1:32 PM, David Marchand wrote:

For the title, I would suggest: "cryptodev: remove v20 ABI compatibility"

You did this change using a revert, but still, we can avoid restoring
coding style issues, see nits below.


On Fri, Aug 14, 2020 at 12:00 PM Adam Dybkowski
 wrote:

This reverts commit a0f0de06d457753c94688d551a6e8659b4d4e041 as the
rte_cryptodev_info_get function versioning was a temporary solution
to maintain ABI compatibility for ChaCha20-Poly1305 and is not
needed in 20.11.


...


  int
  rte_cryptodev_callback_register(uint8_t dev_id,
diff --git a/lib/librte_cryptodev/rte_cryptodev.h 
b/lib/librte_cryptodev/rte_cryptodev.h
index 7b3ebc20f..26abd0c52 100644
--- a/lib/librte_cryptodev/rte_cryptodev.h
+++ b/lib/librte_cryptodev/rte_cryptodev.h
@@ -219,14 +219,6 @@ struct rte_cryptodev_asym_capability_idx {
   *   - Return NULL if the capability not exist.
   */
  const struct rte_cryptodev_symmetric_capability *
-rte_cryptodev_sym_capability_get_v20(uint8_t dev_id,
-   const struct rte_cryptodev_sym_capability_idx *idx);
-
-const struct rte_cryptodev_symmetric_capability *
-rte_cryptodev_sym_capability_get_v21(uint8_t dev_id,
-   const struct rte_cryptodev_sym_capability_idx *idx);
-
-const struct rte_cryptodev_symmetric_capability *
  rte_cryptodev_sym_capability_get(uint8_t dev_id,
 const struct rte_cryptodev_sym_capability_idx *idx);

@@ -789,33 +781,9 @@ rte_cryptodev_stats_reset(uint8_t dev_id);
   * the last valid element has it's op field set to
   * RTE_CRYPTO_OP_TYPE_UNDEFINED.
   */
-
-void
+extern void

Nit: no need for extern.
Hey David, I think the cryptodev API consistently uses extern on nearly 
all it's function declarations. I'd proposed we do a separate patchset 
which removes extern on all function declarations to make it more 
consistent with the rest of DPDKs libraries.

  /**
   * Register a callback function for specific device id.

...

Thanks for working on this.
Note to others watching ABI, with this, it should be the last patch
about DPDK_20 ABI.




Re: [dpdk-dev] [PATCH v1] event/sw: performance improvements

2020-10-07 Thread Nicolau, Radu



On 10/6/2020 11:13 AM, Ananyev, Konstantin wrote:

-Original Message-
From: Jerin Jacob 
Sent: Monday, October 5, 2020 5:35 PM
To: Nicolau, Radu 
Cc: Honnappa Nagarahalli ; Richardson, Bruce
; Ananyev, Konstantin
; Van Haaren, Harry
; dev@dpdk.org; jer...@marvell.com; nd

Subject: Re: [dpdk-dev] [PATCH v1] event/sw: performance improvements

On Tue, Sep 29, 2020 at 2:32 PM Nicolau, Radu  wrote:


On 9/28/2020 5:02 PM, Honnappa Nagarahalli wrote:



Add minimum burst throughout the scheduler pipeline and a flush counter.
Replace ring API calls with local single threaded implementation where
possible.

Signed-off-by: Radu Nicolau mailto:radu.nico...@intel.com

Thanks for the patch, a few comments inline.

Why not make these APIs part of the rte_ring library? You could further

optimize them by keeping the indices on the same cacheline.

I'm not sure there is any need for non thread-safe rings outside this

particular case.

[Honnappa] I think if we add the APIs, we will find the use cases.
But, more than that, I understand that rte_ring structure is exposed to the

application. The reason for doing that is the inline functions that rte_ring
provides. IMO, we should still maintain modularity and should not use the
internals of the rte_ring structure outside of the library.

+1 to that.

BTW, is there any real perf benefit from such micor-optimisation?

I'd tend to view these as use-case specific, and I'm not sure we should clutter
up the ring library with yet more functions, especially since they can't be
mixed with the existing enqueue/dequeue functions, since they don't use
the head pointers.

IMO, the ring library is pretty organized with the recent addition of HTS/RTS

modes. This can be one of the modes and should allow us to use the existing
functions (though additional functions are required as well).

The other concern I have is, this implementation can be further optimized by

using a single cache line for the pointers. It uses 2 cache lines just because 
of the
layout of the rte_ring structure.

There was a question earlier about the performance improvements of this

patch? Are there any % performance improvements that can be shared?

It is also possible to change the above functions to use the head/tail pointers

from producer or the consumer cache line alone to check for perf differences.

I don't have a % for the final improvement for this change alone, but
there was some improvement in the memory overhead measurable during
development, which very likely resulted in the whole optimization having
more headroom.

I agree that this may be further optimized, maybe by having a local
implementation of a ring-like container instead.

Have we decided on the next steps for this patch? Is the plan to
supersede this patch and have different
one in rte_ring subsystem,

My preference is to merge this version of the patch;
1) The ring helper functions are stripped to the SW PMD usage, and not valid to 
use in the general.
2) Adding static inline APIs in an LTS without extensive doesn't seem a good 
idea.

If Honnappa is OK with the above solution for 20.11, we can see about moving 
the rings part of the
code to rte_ring library location in 21.02, and give ourselves some time to 
settle the usage/API before
the next LTS.


As ring library maintainer I share Honnappa concern that another library not 
uses public ring API,
but instead accesses ring internals directly. Obviously such coding practice is 
not welcomed
as it makes harder to maintain/extend ring library in future.
About 2) - these new API can(/shoud) be marked an experimental anyway.
As another thing - it is still unclear what a performance gain we are talking 
about here.
Is it really worth it comparing to just using SP/SC?


The change itself came after I analyzed the memory bound sections of the 
code, and I just did a quick test, I got about 3.5% improvement in 
throughput,  maybe not so much but significant for such a small change, 
and depending on the usecase it may be more.


As for the implementation itself, I would favour having a custom ring 
like container in the PMD code, this will solve the issue of using 
rte_ring internals while still allow for full optimisation. If this is 
acceptable, I will follow up by tomorrow.




Re: [dpdk-dev] [PATCH v3 01/18] eal: add max SIMD bitwidth

2020-10-07 Thread Power, Ciara
Hi Olivier,

Thanks for reviewing, some comments below.


>-Original Message-
>From: Olivier Matz 
>Sent: Tuesday 6 October 2020 10:32
>To: Power, Ciara 
>Cc: dev@dpdk.org; Ray Kinsella ; Neil Horman
>
>Subject: Re: [dpdk-dev] [PATCH v3 01/18] eal: add max SIMD bitwidth
>
>Hi Ciara,
>
>Please find some comments below.
>
>On Wed, Sep 30, 2020 at 02:03:57PM +0100, Ciara Power wrote:
>> This patch adds a max SIMD bitwidth EAL configuration. The API allows
>> for an app to set this value. It can also be set using EAL argument
>> --force-max-simd-bitwidth, which will lock the value and override any
>> modifications made by the app.
>>
>> Signed-off-by: Ciara Power 
>>
>> ---
>> v3:
>>   - Added enum value to essentially disable using max SIMD to choose
>> paths, intended for use by ARM SVE.
>>   - Fixed parsing bitwidth argument to return an error for values
>> greater than uint16_t.
>> v2: Added to Doxygen comment for API.
>> ---



>>
>> +uint16_t
>> +rte_get_max_simd_bitwidth(void)
>> +{
>> +const struct internal_config *internal_conf =
>> +eal_get_internal_configuration();
>> +return internal_conf->max_simd_bitwidth.bitwidth;
>> +}
>
>Should the return value be enum rte_max_simd_t?
>If not, do we really need the enum definition?
>

I kept the return value and param value below as uint16_t to allow for 
arbitrary values,
and will allow it be more flexible for future additions as new enums won't need 
to be added.
For the set function below, this is used when a user passes the EAL command 
line flag, which
passes an integer value rather than an enum one.
The enums are useful when checking the max_simd_bitwidth in drivers/libs, for 
example using
"RTE_MAX_256_SIMD" instead of "256" in the condition checks.

>> +
>> +int
>> +rte_set_max_simd_bitwidth(uint16_t bitwidth) {
>> +struct internal_config *internal_conf =
>> +eal_get_internal_configuration();
>> +if (internal_conf->max_simd_bitwidth.locked) {
>> +RTE_LOG(NOTICE, EAL, "Cannot set max SIMD bitwidth - user
>runtime override enabled");
>> +return -EPERM;
>> +}
>> +
>> +if (bitwidth != RTE_MAX_SIMD_DISABLE && (bitwidth <
>RTE_NO_SIMD ||
>> +!rte_is_power_of_2(bitwidth))) {
>> +RTE_LOG(ERR, EAL, "Invalid bitwidth value!\n");
>> +return -EINVAL;
>> +}
>> +internal_conf->max_simd_bitwidth.bitwidth = bitwidth;
>> +return 0;
>> +}
>
>Same question, should the parameter be enum rte_max_simd_t?
>



>> +enum rte_max_simd_t {
>> +RTE_NO_SIMD = 64,
>> +RTE_MAX_128_SIMD = 128,
>> +RTE_MAX_256_SIMD = 256,
>> +RTE_MAX_512_SIMD = 512,
>> +RTE_MAX_SIMD_DISABLE = UINT16_MAX,
>> +};
>
>What is the difference between RTE_NO_SIMD and
>RTE_MAX_SIMD_DISABLE?

RTE_NO_SIMD has value 64 to limit paths to scalar only.
RTE_MAX_SIMD_DISABLE sets the highest value possible,
so essentially disables the limit affecting which vector paths are taken.
This disable option was added to allow for ARM SVE which will be later added,
Discussed with Honnappa on a previous version: 
https://patchwork.dpdk.org/patch/76097/ 

>The default value in internal_config is 0, so in my understanding
>rte_get_max_simd_bitwidth() will return 0 if --force-max-simd-bitwidth is
>not passed. Is it expected?
>
>Maybe I'm missing something, but I don't understand why the value in
>internal_config is not set to the maximum supported SIMD bitwidth by
>default, and optionally overriden by the command line argument, or by the
>API.
>

The default value for max_simd_bitwidth is set depending on the architecture, 
256 for x86/ppc,
and UINT16_MAX for ARM. So for example the default on x86 allows for AVX2 and 
under.
The defaults can be seen in patch 2: https://patchwork.dpdk.org/patch/79339/ 



Thanks,
Ciara


Re: [dpdk-dev] [PATCH v3 01/18] eal: add max SIMD bitwidth

2020-10-07 Thread Bruce Richardson
On Wed, Oct 07, 2020 at 11:47:34AM +0100, Power, Ciara wrote:
> Hi Olivier,
> 
> Thanks for reviewing, some comments below.
> 
> 
> >-Original Message-
> >From: Olivier Matz 
> >Sent: Tuesday 6 October 2020 10:32
> >To: Power, Ciara 
> >Cc: dev@dpdk.org; Ray Kinsella ; Neil Horman
> >
> >Subject: Re: [dpdk-dev] [PATCH v3 01/18] eal: add max SIMD bitwidth
> >
> >Hi Ciara,
> >
> >Please find some comments below.
> >
> >On Wed, Sep 30, 2020 at 02:03:57PM +0100, Ciara Power wrote:
> >> This patch adds a max SIMD bitwidth EAL configuration. The API allows
> >> for an app to set this value. It can also be set using EAL argument
> >> --force-max-simd-bitwidth, which will lock the value and override any
> >> modifications made by the app.
> >>
> >> Signed-off-by: Ciara Power 
> >>
> >> ---
> >> v3:
> >>   - Added enum value to essentially disable using max SIMD to choose
> >> paths, intended for use by ARM SVE.
> >>   - Fixed parsing bitwidth argument to return an error for values
> >> greater than uint16_t.
> >> v2: Added to Doxygen comment for API.
> >> ---
> 
> 
> 
> >>
> >> +uint16_t
> >> +rte_get_max_simd_bitwidth(void)
> >> +{
> >> +const struct internal_config *internal_conf =
> >> +eal_get_internal_configuration();
> >> +return internal_conf->max_simd_bitwidth.bitwidth;
> >> +}
> >
> >Should the return value be enum rte_max_simd_t?
> >If not, do we really need the enum definition?
> >
> 
> I kept the return value and param value below as uint16_t to allow for 
> arbitrary values,
> and will allow it be more flexible for future additions as new enums won't 
> need to be added.
> For the set function below, this is used when a user passes the EAL command 
> line flag, which
> passes an integer value rather than an enum one.
> The enums are useful when checking the max_simd_bitwidth in drivers/libs, for 
> example using
> "RTE_MAX_256_SIMD" instead of "256" in the condition checks.
> 
So basically these enum values are #defines for readability, just in enum
form, right?


Re: [dpdk-dev] [PATCH v1] event/sw: performance improvements

2020-10-07 Thread Ananyev, Konstantin
> On 10/6/2020 11:13 AM, Ananyev, Konstantin wrote:
> >>> -Original Message-
> >>> From: Jerin Jacob 
> >>> Sent: Monday, October 5, 2020 5:35 PM
> >>> To: Nicolau, Radu 
> >>> Cc: Honnappa Nagarahalli ; Richardson, Bruce
> >>> ; Ananyev, Konstantin
> >>> ; Van Haaren, Harry
> >>> ; dev@dpdk.org; jer...@marvell.com; nd
> >>> 
> >>> Subject: Re: [dpdk-dev] [PATCH v1] event/sw: performance improvements
> >>>
> >>> On Tue, Sep 29, 2020 at 2:32 PM Nicolau, Radu  
> >>> wrote:
> 
>  On 9/28/2020 5:02 PM, Honnappa Nagarahalli wrote:
> > 
> >>> Add minimum burst throughout the scheduler pipeline and a flush 
> >>> counter.
> >>> Replace ring API calls with local single threaded implementation where
> >>> possible.
> >>>
> >>> Signed-off-by: Radu Nicolau mailto:radu.nico...@intel.com
> >>>
> >>> Thanks for the patch, a few comments inline.
> >>>
> >>> Why not make these APIs part of the rte_ring library? You could 
> >>> further
> >> optimize them by keeping the indices on the same cacheline.
> >>> I'm not sure there is any need for non thread-safe rings outside this
> >> particular case.
> >>> [Honnappa] I think if we add the APIs, we will find the use cases.
> >>> But, more than that, I understand that rte_ring structure is exposed 
> >>> to the
> >> application. The reason for doing that is the inline functions that 
> >> rte_ring
> >> provides. IMO, we should still maintain modularity and should not use 
> >> the
> >> internals of the rte_ring structure outside of the library.
> >>> +1 to that.
> >>>
> >>> BTW, is there any real perf benefit from such micor-optimisation?
> >> I'd tend to view these as use-case specific, and I'm not sure we 
> >> should clutter
> >> up the ring library with yet more functions, especially since they 
> >> can't be
> >> mixed with the existing enqueue/dequeue functions, since they don't use
> >> the head pointers.
> > IMO, the ring library is pretty organized with the recent addition of 
> > HTS/RTS
> >>> modes. This can be one of the modes and should allow us to use the 
> >>> existing
> >>> functions (though additional functions are required as well).
> > The other concern I have is, this implementation can be further 
> > optimized by
> >>> using a single cache line for the pointers. It uses 2 cache lines just 
> >>> because of the
> >>> layout of the rte_ring structure.
> > There was a question earlier about the performance improvements of this
> >>> patch? Are there any % performance improvements that can be shared?
> > It is also possible to change the above functions to use the head/tail 
> > pointers
> >>> from producer or the consumer cache line alone to check for perf 
> >>> differences.
>  I don't have a % for the final improvement for this change alone, but
>  there was some improvement in the memory overhead measurable during
>  development, which very likely resulted in the whole optimization having
>  more headroom.
> 
>  I agree that this may be further optimized, maybe by having a local
>  implementation of a ring-like container instead.
> >>> Have we decided on the next steps for this patch? Is the plan to
> >>> supersede this patch and have different
> >>> one in rte_ring subsystem,
> >> My preference is to merge this version of the patch;
> >> 1) The ring helper functions are stripped to the SW PMD usage, and not 
> >> valid to use in the general.
> >> 2) Adding static inline APIs in an LTS without extensive doesn't seem a 
> >> good idea.
> >>
> >> If Honnappa is OK with the above solution for 20.11, we can see about 
> >> moving the rings part of the
> >> code to rte_ring library location in 21.02, and give ourselves some time 
> >> to settle the usage/API before
> >> the next LTS.
> >>
> > As ring library maintainer I share Honnappa concern that another library 
> > not uses public ring API,
> > but instead accesses ring internals directly. Obviously such coding 
> > practice is not welcomed
> > as it makes harder to maintain/extend ring library in future.
> > About 2) - these new API can(/shoud) be marked an experimental anyway.
> > As another thing - it is still unclear what a performance gain we are 
> > talking about here.
> > Is it really worth it comparing to just using SP/SC?
> 
> The change itself came after I analyzed the memory bound sections of the
> code, and I just did a quick test, I got about 3.5% improvement in
> throughput,  maybe not so much but significant for such a small change,
> and depending on the usecase it may be more.
> 
> As for the implementation itself, I would favour having a custom ring
> like container in the PMD code, this will solve the issue of using
> rte_ring internals while still allow for full optimisation. If this is
> acceptable, I will follow up by tomorrow.

Sounds ok to me.
Thanks
Konstantin



Re: [dpdk-dev] [PATCH 0/2] baseband: fix drivers API

2020-10-07 Thread Maxime Coquelin



On 10/7/20 12:29 PM, Thomas Monjalon wrote:
> 07/10/2020 12:12, Maxime Coquelin:
>> On 10/7/20 12:09 PM, Thomas Monjalon wrote:
>>> 07/10/2020 10:22, Maxime Coquelin:
 On 10/6/20 9:42 PM, Akhil Goyal wrote:
>>
>> The series prefixes drivers APIs with rte__ in
>> order to avoid namespace pollution.
>>
>> These APIs are experimental, so no need to follow the
>> deprecation process.
>>
> Added Fixes commit in patch description.

 Thanks for applying it to your tree.

 I did not add Fixes tag and Cc'ed sta...@dpdk.org on purpose,
 because I thought it is was not a good idea to backport API changes,
 even if this is experimental it might be annoying for the user.

 Thomas, do you confirm?
>>>
>>> Absolutely: API must not change in a stable branch.
>>>
>>> If an API is changed, it must be in the release notes.
>>
>> Ok, even for experimental APIs? I thought not.
> 
> Yes, experimental means it can change in the main branch
> without prior notice. But it must be noted when it's changed.
> 
> 
> 

Ok, do you want me to send add-on patches that you will squash when
pulling Akhil's branch?



[dpdk-dev] [PATCH v4 00/11] support match on L3 fragmented packets

2020-10-07 Thread Dekel Peled
This series implements support of matching on packets based on the
fragmentation attribute of the packet, i.e. if packet is a fragment
of a larger packet, or the opposite - packet is not a fragment.

In ethdev, add API to support IPv6 extension headers, and specifically
the IPv6 fragment extension header item.
In MLX5 PMD, support match on IPv4 fragmented packets, IPv6 fragmented
packets, and IPv6 fragment extension header item.
Testpmd CLI is updated accordingly.
Documentation is updated accordingly.

---
v2: add patch 'net/mlx5: enforce limitation on IPv6 next proto'
v3: update patch 'ethdev: add IPv6 fragment extension header item' to avoid ABI 
breakage.
v4: update rte_flow documentation to clarify use of IPv6 extension header flags.
---

Dekel Peled (11):
  ethdev: add extensions attributes to IPv6 item
  ethdev: add IPv6 fragment extension header item
  app/testpmd: support IPv4 fragments
  app/testpmd: support IPv6 fragments
  app/testpmd: support IPv6 fragment extension item
  net/mlx5: remove handling of ICMP fragmented packets
  net/mlx5: support match on IPv4 fragment packets
  net/mlx5: support match on IPv6 fragment packets
  net/mlx5: support match on IPv6 fragment ext. item
  doc: update release notes for MLX5 L3 frag support
  net/mlx5: enforce limitation on IPv6 next proto

 app/test-pmd/cmdline_flow.c|  53 +
 doc/guides/nics/mlx5.rst   |   7 +
 doc/guides/prog_guide/rte_flow.rst |  34 ++-
 doc/guides/rel_notes/release_20_11.rst |  10 +
 drivers/net/mlx5/mlx5_flow.c   |  62 --
 drivers/net/mlx5/mlx5_flow.h   |  14 ++
 drivers/net/mlx5/mlx5_flow_dv.c| 382 +
 drivers/net/mlx5/mlx5_flow_verbs.c |   9 +-
 lib/librte_ethdev/rte_flow.c   |   1 +
 lib/librte_ethdev/rte_flow.h   |  45 +++-
 lib/librte_ip_frag/rte_ip_frag.h   |  26 +--
 lib/librte_net/rte_ip.h|  26 ++-
 12 files changed, 579 insertions(+), 90 deletions(-)

-- 
1.8.3.1



[dpdk-dev] [PATCH v4 02/11] ethdev: add IPv6 fragment extension header item

2020-10-07 Thread Dekel Peled
Applications handling fragmented IPv6 packets need to match on IPv6
fragment extension header, in order to identify the fragments order
and location in the packet.
This patch introduces the IPv6 fragment extension header item,
proposed in [1].

Relevant definitions are moved from lib/librte_ip_frag/rte_ip_frag.h
to lib/librte_net/rte_ip.h, as they are needed for IPv6 header handling.
struct ipv6_extension_fragment renamed to rte_ipv6_fragment_ext to
adapt it to the common naming convention.

Default mask is not defined, since all fields are optional.

[1] http://mails.dpdk.org/archives/dev/2020-March/160255.html

Signed-off-by: Dekel Peled 
---
 doc/guides/prog_guide/rte_flow.rst | 16 ++--
 lib/librte_ethdev/rte_flow.c   |  1 +
 lib/librte_ethdev/rte_flow.h   | 20 
 lib/librte_ip_frag/rte_ip_frag.h   | 26 ++
 lib/librte_net/rte_ip.h| 26 --
 5 files changed, 61 insertions(+), 28 deletions(-)

diff --git a/doc/guides/prog_guide/rte_flow.rst 
b/doc/guides/prog_guide/rte_flow.rst
index ae090db..02b1d58 100644
--- a/doc/guides/prog_guide/rte_flow.rst
+++ b/doc/guides/prog_guide/rte_flow.rst
@@ -947,8 +947,8 @@ Item: ``IPV6``
 Matches an IPv6 header.
 
 Dedicated flags indicate existence of specific extension headers.
-Every type of extension header can use a dedicated pattern item, or
-the generic `Item: IPV6_EXT`_.
+Every type of extension header can use a dedicated pattern item,
+for example `Item: IPV6_FRAG_EXT`_, or the generic `Item: IPV6_EXT`_.
 To match on packets containing a specific extension header, an application
 should match on the dedicated flag set to 1.
 To match on packets not containing a specific extension header, an application
@@ -1193,6 +1193,18 @@ Normally preceded by any of:
 - `Item: IPV6`_
 - `Item: IPV6_EXT`_
 
+Item: ``IPV6_FRAG_EXT``
+^^^
+
+Matches the presence of IPv6 fragment extension header.
+
+- ``hdr``: IPv6 fragment extension header definition (``rte_ip.h``).
+
+Normally preceded by any of:
+
+- `Item: IPV6`_
+- `Item: IPV6_EXT`_
+
 Item: ``ICMP6``
 ^^^
 
diff --git a/lib/librte_ethdev/rte_flow.c b/lib/librte_ethdev/rte_flow.c
index f8fdd68..c1f3132 100644
--- a/lib/librte_ethdev/rte_flow.c
+++ b/lib/librte_ethdev/rte_flow.c
@@ -72,6 +72,7 @@ struct rte_flow_desc_data {
MK_FLOW_ITEM(VXLAN_GPE, sizeof(struct rte_flow_item_vxlan_gpe)),
MK_FLOW_ITEM(ARP_ETH_IPV4, sizeof(struct rte_flow_item_arp_eth_ipv4)),
MK_FLOW_ITEM(IPV6_EXT, sizeof(struct rte_flow_item_ipv6_ext)),
+   MK_FLOW_ITEM(IPV6_FRAG_EXT, sizeof(struct rte_flow_item_ipv6_frag_ext)),
MK_FLOW_ITEM(ICMP6, sizeof(struct rte_flow_item_icmp6)),
MK_FLOW_ITEM(ICMP6_ND_NS, sizeof(struct rte_flow_item_icmp6_nd_ns)),
MK_FLOW_ITEM(ICMP6_ND_NA, sizeof(struct rte_flow_item_icmp6_nd_na)),
diff --git a/lib/librte_ethdev/rte_flow.h b/lib/librte_ethdev/rte_flow.h
index 5b5bed2..85376a3 100644
--- a/lib/librte_ethdev/rte_flow.h
+++ b/lib/librte_ethdev/rte_flow.h
@@ -537,6 +537,12 @@ enum rte_flow_item_type {
 */
RTE_FLOW_ITEM_TYPE_ECPRI,
 
+   /**
+* Matches the presence of IPv6 fragment extension header.
+*
+* See struct rte_flow_item_ipv6_frag_ext.
+*/
+   RTE_FLOW_ITEM_TYPE_IPV6_FRAG_EXT,
 };
 
 /**
@@ -1188,6 +1194,20 @@ struct rte_flow_item_ipv6_ext 
rte_flow_item_ipv6_ext_mask = {
 #endif
 
 /**
+ * RTE_FLOW_ITEM_TYPE_IPV6_FRAG_EXT
+ *
+ * Matches the presence of IPv6 fragment extension header.
+ *
+ * Preceded by any of:
+ *
+ * - RTE_FLOW_ITEM_TYPE_IPV6
+ * - RTE_FLOW_ITEM_TYPE_IPV6_EXT
+ */
+struct rte_flow_item_ipv6_frag_ext {
+   struct rte_ipv6_fragment_ext hdr;
+};
+
+/**
  * RTE_FLOW_ITEM_TYPE_ICMP6
  *
  * Matches any ICMPv6 header.
diff --git a/lib/librte_ip_frag/rte_ip_frag.h b/lib/librte_ip_frag/rte_ip_frag.h
index 66edd7e..0bfe64b 100644
--- a/lib/librte_ip_frag/rte_ip_frag.h
+++ b/lib/librte_ip_frag/rte_ip_frag.h
@@ -110,30 +110,8 @@ struct rte_ip_frag_tbl {
__extension__ struct ip_frag_pkt pkt[0]; /**< hash table. */
 };
 
-/** IPv6 fragment extension header */
-#defineRTE_IPV6_EHDR_MF_SHIFT  0
-#defineRTE_IPV6_EHDR_MF_MASK   1
-#defineRTE_IPV6_EHDR_FO_SHIFT  3
-#defineRTE_IPV6_EHDR_FO_MASK   (~((1 << 
RTE_IPV6_EHDR_FO_SHIFT) - 1))
-#defineRTE_IPV6_EHDR_FO_ALIGN  (1 << 
RTE_IPV6_EHDR_FO_SHIFT)
-
-#define RTE_IPV6_FRAG_USED_MASK\
-   (RTE_IPV6_EHDR_MF_MASK | RTE_IPV6_EHDR_FO_MASK)
-
-#define RTE_IPV6_GET_MF(x) ((x) & 
RTE_IPV6_EHDR_MF_MASK)
-#define RTE_IPV6_GET_FO(x) ((x) >> 
RTE_IPV6_EHDR_FO_SHIFT)
-
-#define RTE_IPV6_SET_FRAG_DATA(fo, mf) \
-   (((fo) & RTE_IPV6_EHDR_FO_MASK) | ((mf) & RTE_IPV6_EHDR_MF_MASK))
-
-struct ipv6_extension_fragment {
-   uint8_t 

[dpdk-dev] [PATCH v4 03/11] app/testpmd: support IPv4 fragments

2020-10-07 Thread Dekel Peled
This patch updates testpmd CLI to support fragment_offset field of
IPv4 header item.

To match on non-fragmented IPv4 packets, need to use pattern:
... ipv4 fragment_offset spec 0 fragment_offset mask 0x3fff ...
To match on fragmented IPv4 packets, need to use pattern:
... ipv4 fragment_offset spec 1 fragment_offset last 0x3fff
fragment_offset mask 0x3fff ...
(Use the full available range 1 to 0x3fff to include all possible
values.)
To match on any IPv4 packets, fragmented and non-fragmented,
the fragment_offset field should not be specified for match.

Signed-off-by: Dekel Peled 
Acked-by: Ori Kam 
---
 app/test-pmd/cmdline_flow.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 6e04d53..a9bf29f 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -129,6 +129,7 @@ enum index {
ITEM_VLAN_INNER_TYPE,
ITEM_IPV4,
ITEM_IPV4_TOS,
+   ITEM_IPV4_FRAGMENT_OFFSET,
ITEM_IPV4_TTL,
ITEM_IPV4_PROTO,
ITEM_IPV4_SRC,
@@ -873,6 +874,7 @@ struct parse_action_priv {
 
 static const enum index item_ipv4[] = {
ITEM_IPV4_TOS,
+   ITEM_IPV4_FRAGMENT_OFFSET,
ITEM_IPV4_TTL,
ITEM_IPV4_PROTO,
ITEM_IPV4_SRC,
@@ -2097,6 +2099,13 @@ static int comp_set_raw_index(struct context *, const 
struct token *,
.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_ipv4,
 hdr.type_of_service)),
},
+   [ITEM_IPV4_FRAGMENT_OFFSET] = {
+   .name = "fragment_offset",
+   .help = "fragmentation flags and fragment offset",
+   .next = NEXT(item_ipv4, NEXT_ENTRY(UNSIGNED), item_param),
+   .args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_ipv4,
+hdr.fragment_offset)),
+   },
[ITEM_IPV4_TTL] = {
.name = "ttl",
.help = "time to live",
-- 
1.8.3.1



[dpdk-dev] [PATCH v4 01/11] ethdev: add extensions attributes to IPv6 item

2020-10-07 Thread Dekel Peled
Using the current implementation of DPDK, an application cannot match on
IPv6 packets, based on the existing extension headers, in a simple way.

Field 'Next Header' in IPv6 header indicates type of the first extension
header only. Following extension headers can't be identified by
inspecting the IPv6 header.
As a result, the existence or absence of specific extension headers
can't be used for packet matching.

For example, fragmented IPv6 packets contain a dedicated extension header
(which is implemented in a later patch of this series).
Non-fragmented packets don't contain the fragment extension header.
For an application to match on non-fragmented IPv6 packets, the current
implementation doesn't provide a suitable solution.
Matching on the Next Header field is not sufficient, since additional
extension headers might be present in the same packet.
To match on fragmented IPv6 packets, the same difficulty exists.

This patch implements the update as detailed in RFC [1].
A set of additional values will be added to IPv6 header struct.
These values will indicate the existence of every defined extension
header type, providing simple means for identification of existing
extensions in the packet header.
Continuing the above example, fragmented packets can be identified using
the specific value indicating existence of fragment extension header.
To match on non-fragmented IPv6 packets, need to use frag_ext_exist 0.
To match on fragmented IPv6 packets, need to use frag_ext_exist 1.
To match on any IPv6 packets, the frag_ext_exist field should
not be specified for match.

[1] https://mails.dpdk.org/archives/dev/2020-August/177257.html

Signed-off-by: Dekel Peled 
Acked-by: Ori Kam 
---
 doc/guides/prog_guide/rte_flow.rst | 22 +++---
 lib/librte_ethdev/rte_flow.h   | 25 +++--
 2 files changed, 42 insertions(+), 5 deletions(-)

diff --git a/doc/guides/prog_guide/rte_flow.rst 
b/doc/guides/prog_guide/rte_flow.rst
index 119b128..ae090db 100644
--- a/doc/guides/prog_guide/rte_flow.rst
+++ b/doc/guides/prog_guide/rte_flow.rst
@@ -946,11 +946,27 @@ Item: ``IPV6``
 
 Matches an IPv6 header.
 
-Note: IPv6 options are handled by dedicated pattern items, see `Item:
-IPV6_EXT`_.
+Dedicated flags indicate existence of specific extension headers.
+Every type of extension header can use a dedicated pattern item, or
+the generic `Item: IPV6_EXT`_.
+To match on packets containing a specific extension header, an application
+should match on the dedicated flag set to 1.
+To match on packets not containing a specific extension header, an application
+should match on the dedicated flag clear to 0.
+In case application doesn't care about the existence of a specific extension
+header, it should not specify the dedicated flag for matching.
 
 - ``hdr``: IPv6 header definition (``rte_ip.h``).
-- Default ``mask`` matches source and destination addresses only.
+- ``hop_ext_exist``: Hop-by-Hop Options extension header exists.
+- ``rout_ext_exist``: Routing extension header exists.
+- ``frag_ext_exist``: Fragment extension header exists.
+- ``auth_ext_exist``: Authentication extension header exists.
+- ``esp_ext_exist``: Encapsulation Security Payload extension header exists.
+- ``dest_ext_exist``: Destination Options extension header exists.
+- ``mobil_ext_exist``: Mobility extension header exists.
+- ``hip_ext_exist``: Host Identity Protocol extension header exists.
+- ``shim6_ext_exist``: Shim6 Protocol extension header exists.
+- Default ``mask`` matches ``hdr`` source and destination addresses only.
 
 Item: ``ICMP``
 ^^
diff --git a/lib/librte_ethdev/rte_flow.h b/lib/librte_ethdev/rte_flow.h
index da8bfa5..5b5bed2 100644
--- a/lib/librte_ethdev/rte_flow.h
+++ b/lib/librte_ethdev/rte_flow.h
@@ -792,11 +792,32 @@ struct rte_flow_item_ipv4 {
  *
  * Matches an IPv6 header.
  *
- * Note: IPv6 options are handled by dedicated pattern items, see
- * RTE_FLOW_ITEM_TYPE_IPV6_EXT.
+ * Dedicated flags indicate existence of specific extension headers.
+ * Every type of extension header can use a dedicated pattern item, or
+ * the generic item RTE_FLOW_ITEM_TYPE_IPV6_EXT.
  */
 struct rte_flow_item_ipv6 {
struct rte_ipv6_hdr hdr; /**< IPv6 header definition. */
+   uint32_t hop_ext_exist:1;
+   /**< Hop-by-Hop Options extension header exists. */
+   uint32_t rout_ext_exist:1;
+   /**< Routing extension header exists. */
+   uint32_t frag_ext_exist:1;
+   /**< Fragment extension header exists. */
+   uint32_t auth_ext_exist:1;
+   /**< Authentication extension header exists. */
+   uint32_t esp_ext_exist:1;
+   /**< Encapsulation Security Payload extension header exists. */
+   uint32_t dest_ext_exist:1;
+   /**< Destination Options extension header exists. */
+   uint32_t mobil_ext_exist:1;
+   /**< Mobility extension header exists. */
+   uint32_t hip_ext_exist:1;
+   /**< Host Identity Protocol extension header exists. */
+   

[dpdk-dev] [PATCH v4 04/11] app/testpmd: support IPv6 fragments

2020-10-07 Thread Dekel Peled
rte_flow update, following RFC [1], introduced frag_ext_exist field for
IPv6 header item, used to indicate match on fragmented/non-fragmented
packets.
This patch updates testpmd CLI to support the new field.

To match on non-fragmented IPv6 packets, need to use pattern:
... ipv6 frag_ext_exist spec 0 frag_ext_exist mask 1 ...
To match on fragmented IPv6 packets, need to use pattern:
... ipv6 frag_ext_exist spec 1 frag_ext_exist mask 1 ...
To match on any IPv6 packets, the frag_ext_exist field should
not be specified for match.

[1] https://mails.dpdk.org/archives/dev/2020-August/177257.html

Signed-off-by: Dekel Peled 
Acked-by: Ori Kam 
---
 app/test-pmd/cmdline_flow.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index a9bf29f..b078095 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -141,6 +141,7 @@ enum index {
ITEM_IPV6_HOP,
ITEM_IPV6_SRC,
ITEM_IPV6_DST,
+   ITEM_IPV6_FRAG_EXT_EXIST,
ITEM_ICMP,
ITEM_ICMP_TYPE,
ITEM_ICMP_CODE,
@@ -890,6 +891,7 @@ struct parse_action_priv {
ITEM_IPV6_HOP,
ITEM_IPV6_SRC,
ITEM_IPV6_DST,
+   ITEM_IPV6_FRAG_EXT_EXIST,
ITEM_NEXT,
ZERO,
 };
@@ -2185,6 +2187,13 @@ static int comp_set_raw_index(struct context *, const 
struct token *,
.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_ipv6,
 hdr.dst_addr)),
},
+   [ITEM_IPV6_FRAG_EXT_EXIST] = {
+   .name = "frag_ext_exist",
+   .help = "fragment packet attribute",
+   .next = NEXT(item_ipv6, NEXT_ENTRY(UNSIGNED), item_param),
+   .args = ARGS(ARGS_ENTRY_BF(struct rte_flow_item_ipv6,
+  frag_ext_exist, 1)),
+   },
[ITEM_ICMP] = {
.name = "icmp",
.help = "match ICMP header",
-- 
1.8.3.1



[dpdk-dev] [PATCH v4 07/11] net/mlx5: support match on IPv4 fragment packets

2020-10-07 Thread Dekel Peled
This patch adds to MLX5 PMD the support of matching on IPv4
fragmented and non-fragmented packets, using the IPv4 header
fragment_offset field.

Signed-off-by: Dekel Peled 
Acked-by: Ori Kam 
---
 drivers/net/mlx5/mlx5_flow.c   |  48 
 drivers/net/mlx5/mlx5_flow.h   |  10 +++
 drivers/net/mlx5/mlx5_flow_dv.c| 156 +++--
 drivers/net/mlx5/mlx5_flow_verbs.c |   9 ++-
 4 files changed, 178 insertions(+), 45 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c
index a94f630..bce7c18 100644
--- a/drivers/net/mlx5/mlx5_flow.c
+++ b/drivers/net/mlx5/mlx5_flow.c
@@ -474,6 +474,8 @@ struct mlx5_flow_tunnel_info {
  *   Bit-masks covering supported fields by the NIC to compare with user mask.
  * @param[in] size
  *   Bit-masks size in bytes.
+ * @param[in] range_accepted
+ *   True if range of values is accepted for specific fields, false otherwise.
  * @param[out] error
  *   Pointer to error structure.
  *
@@ -485,6 +487,7 @@ struct mlx5_flow_tunnel_info {
  const uint8_t *mask,
  const uint8_t *nic_mask,
  unsigned int size,
+ bool range_accepted,
  struct rte_flow_error *error)
 {
unsigned int i;
@@ -502,7 +505,7 @@ struct mlx5_flow_tunnel_info {
  RTE_FLOW_ERROR_TYPE_ITEM, item,
  "mask/last without a spec is not"
  " supported");
-   if (item->spec && item->last) {
+   if (item->spec && item->last && !range_accepted) {
uint8_t spec[size];
uint8_t last[size];
unsigned int i;
@@ -1277,7 +1280,8 @@ struct mlx5_flow_tunnel_info {
ret = mlx5_flow_item_acceptable
(item, (const uint8_t *)mask,
 (const uint8_t *)&rte_flow_item_icmp6_mask,
-sizeof(struct rte_flow_item_icmp6), error);
+sizeof(struct rte_flow_item_icmp6),
+MLX5_ITEM_RANGE_NOT_ACCEPTED, error);
if (ret < 0)
return ret;
return 0;
@@ -1329,7 +1333,8 @@ struct mlx5_flow_tunnel_info {
ret = mlx5_flow_item_acceptable
(item, (const uint8_t *)mask,
 (const uint8_t *)&rte_flow_item_icmp_mask,
-sizeof(struct rte_flow_item_icmp), error);
+sizeof(struct rte_flow_item_icmp),
+MLX5_ITEM_RANGE_NOT_ACCEPTED, error);
if (ret < 0)
return ret;
return 0;
@@ -1384,7 +1389,7 @@ struct mlx5_flow_tunnel_info {
ret = mlx5_flow_item_acceptable(item, (const uint8_t *)mask,
(const uint8_t *)&nic_mask,
sizeof(struct rte_flow_item_eth),
-   error);
+   MLX5_ITEM_RANGE_NOT_ACCEPTED, error);
return ret;
 }
 
@@ -1438,7 +1443,7 @@ struct mlx5_flow_tunnel_info {
ret = mlx5_flow_item_acceptable(item, (const uint8_t *)mask,
(const uint8_t *)&nic_mask,
sizeof(struct rte_flow_item_vlan),
-   error);
+   MLX5_ITEM_RANGE_NOT_ACCEPTED, error);
if (ret)
return ret;
if (!tunnel && mask->tci != RTE_BE16(0x0fff)) {
@@ -1502,6 +1507,7 @@ struct mlx5_flow_tunnel_info {
 uint64_t last_item,
 uint16_t ether_type,
 const struct rte_flow_item_ipv4 *acc_mask,
+bool range_accepted,
 struct rte_flow_error *error)
 {
const struct rte_flow_item_ipv4 *mask = item->mask;
@@ -1572,7 +1578,7 @@ struct mlx5_flow_tunnel_info {
acc_mask ? (const uint8_t *)acc_mask
 : (const uint8_t *)&nic_mask,
sizeof(struct rte_flow_item_ipv4),
-   error);
+   range_accepted, error);
if (ret < 0)
return ret;
return 0;
@@ -1592,6 +1598,8 @@ struct mlx5_flow_tunnel_info {
  * @param[in] acc_mask
  *   Acceptable mask, if NULL default internal default mask
  *   will be used to check whether item fields are supported.
+ * @param[in] range_accepted
+ *   True if range of values is accepted for specific fields, false otherwise.
  * @param[out] error
  *   Pointer to error structure.
  *
@@ -1671,7 +1679,7 @@ struct mlx5_flow_tunnel_info {
acc_mask ? (const uint8_t *)acc_mask
 : (const uint8_t *)&nic_mas

[dpdk-dev] [PATCH v4 05/11] app/testpmd: support IPv6 fragment extension item

2020-10-07 Thread Dekel Peled
rte_flow update, following RFC [1], added to ethdev the rte_flow item
ipv6_frag_ext.
This patch updates testpmd CLI to support the new item and its fields.

To match on fragmented IPv6 packets, this item is added to pattern:
... ipv6 / ipv6_frag_ext ...

[1] http://mails.dpdk.org/archives/dev/2020-March/160255.html

Signed-off-by: Dekel Peled 
Acked-by: Ori Kam 
---
 app/test-pmd/cmdline_flow.c | 35 +++
 1 file changed, 35 insertions(+)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index b078095..1f800eb 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -195,6 +195,9 @@ enum index {
ITEM_ARP_ETH_IPV4_TPA,
ITEM_IPV6_EXT,
ITEM_IPV6_EXT_NEXT_HDR,
+   ITEM_IPV6_FRAG_EXT,
+   ITEM_IPV6_FRAG_EXT_NEXT_HDR,
+   ITEM_IPV6_FRAG_EXT_FRAG_DATA,
ITEM_ICMP6,
ITEM_ICMP6_TYPE,
ITEM_ICMP6_CODE,
@@ -786,6 +789,7 @@ struct parse_action_priv {
ITEM_VXLAN_GPE,
ITEM_ARP_ETH_IPV4,
ITEM_IPV6_EXT,
+   ITEM_IPV6_FRAG_EXT,
ITEM_ICMP6,
ITEM_ICMP6_ND_NS,
ITEM_ICMP6_ND_NA,
@@ -1007,6 +1011,13 @@ struct parse_action_priv {
ZERO,
 };
 
+static const enum index item_ipv6_frag_ext[] = {
+   ITEM_IPV6_FRAG_EXT_NEXT_HDR,
+   ITEM_IPV6_FRAG_EXT_FRAG_DATA,
+   ITEM_NEXT,
+   ZERO,
+};
+
 static const enum index item_icmp6[] = {
ITEM_ICMP6_TYPE,
ITEM_ICMP6_CODE,
@@ -2578,6 +2589,30 @@ static int comp_set_raw_index(struct context *, const 
struct token *,
.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_ipv6_ext,
 next_hdr)),
},
+   [ITEM_IPV6_FRAG_EXT] = {
+   .name = "ipv6_frag_ext",
+   .help = "match presence of IPv6 fragment extension header",
+   .priv = PRIV_ITEM(IPV6_FRAG_EXT,
+   sizeof(struct rte_flow_item_ipv6_frag_ext)),
+   .next = NEXT(item_ipv6_frag_ext),
+   .call = parse_vc,
+   },
+   [ITEM_IPV6_FRAG_EXT_NEXT_HDR] = {
+   .name = "next_hdr",
+   .help = "next header",
+   .next = NEXT(item_ipv6_frag_ext, NEXT_ENTRY(UNSIGNED),
+item_param),
+   .args = ARGS(ARGS_ENTRY(struct rte_flow_item_ipv6_frag_ext,
+   hdr.next_header)),
+   },
+   [ITEM_IPV6_FRAG_EXT_FRAG_DATA] = {
+   .name = "frag_data",
+   .help = "Fragment flags and offset",
+   .next = NEXT(item_ipv6_frag_ext, NEXT_ENTRY(UNSIGNED),
+item_param),
+   .args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_ipv6_frag_ext,
+hdr.frag_data)),
+   },
[ITEM_ICMP6] = {
.name = "icmp6",
.help = "match any ICMPv6 header",
-- 
1.8.3.1



[dpdk-dev] [PATCH v4 06/11] net/mlx5: remove handling of ICMP fragmented packets

2020-10-07 Thread Dekel Peled
Commit [1] forced setting of match on 'frag' bit mask 1 and value 0.
Previous patch in this series added support of match on fragmented and
non-fragmented packets on L3 items, so this setting is now redundant.

This patch removes the changes done in [1].

[1] commit 85407db9f60d ("net/mlx5: fix matching for ICMP fragments")

Signed-off-by: Dekel Peled 
Acked-by: Ori Kam 
---
 drivers/net/mlx5/mlx5_flow_dv.c | 12 
 1 file changed, 12 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_flow_dv.c b/drivers/net/mlx5/mlx5_flow_dv.c
index 79fdf34..0a0a5a4 100644
--- a/drivers/net/mlx5/mlx5_flow_dv.c
+++ b/drivers/net/mlx5/mlx5_flow_dv.c
@@ -7345,12 +7345,6 @@ struct field_modify_info modify_tcp[] = {
return;
if (!icmp6_m)
icmp6_m = &rte_flow_item_icmp6_mask;
-   /*
-* Force flow only to match the non-fragmented IPv6 ICMPv6 packets.
-* If only the protocol is specified, no need to match the frag.
-*/
-   MLX5_SET(fte_match_set_lyr_2_4, headers_m, frag, 1);
-   MLX5_SET(fte_match_set_lyr_2_4, headers_v, frag, 0);
MLX5_SET(fte_match_set_misc3, misc3_m, icmpv6_type, icmp6_m->type);
MLX5_SET(fte_match_set_misc3, misc3_v, icmpv6_type,
 icmp6_v->type & icmp6_m->type);
@@ -7398,12 +7392,6 @@ struct field_modify_info modify_tcp[] = {
return;
if (!icmp_m)
icmp_m = &rte_flow_item_icmp_mask;
-   /*
-* Force flow only to match the non-fragmented IPv4 ICMP packets.
-* If only the protocol is specified, no need to match the frag.
-*/
-   MLX5_SET(fte_match_set_lyr_2_4, headers_m, frag, 1);
-   MLX5_SET(fte_match_set_lyr_2_4, headers_v, frag, 0);
MLX5_SET(fte_match_set_misc3, misc3_m, icmp_type,
 icmp_m->hdr.icmp_type);
MLX5_SET(fte_match_set_misc3, misc3_v, icmp_type,
-- 
1.8.3.1



[dpdk-dev] [PATCH v4 08/11] net/mlx5: support match on IPv6 fragment packets

2020-10-07 Thread Dekel Peled
This patch adds to MLX5 PMD the support of matching on IPv6
fragmented and non-fragmented packets, using the new field
frag_ext_exist, added to rte_flow following RFC [1].

[1] https://mails.dpdk.org/archives/dev/2020-August/177257.html

Signed-off-by: Dekel Peled 
---
 drivers/net/mlx5/mlx5_flow_dv.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/drivers/net/mlx5/mlx5_flow_dv.c b/drivers/net/mlx5/mlx5_flow_dv.c
index 3379caf..4403abc 100644
--- a/drivers/net/mlx5/mlx5_flow_dv.c
+++ b/drivers/net/mlx5/mlx5_flow_dv.c
@@ -5211,6 +5211,7 @@ struct field_modify_info modify_tcp[] = {
.proto = 0xff,
.hop_limits = 0xff,
},
+   .frag_ext_exist = 1,
};
const struct rte_flow_item_ecpri nic_ecpri_mask = {
.hdr = {
@@ -6519,6 +6520,10 @@ struct field_modify_info modify_tcp[] = {
 ipv6_m->hdr.hop_limits);
MLX5_SET(fte_match_set_lyr_2_4, headers_v, ip_ttl_hoplimit,
 ipv6_v->hdr.hop_limits & ipv6_m->hdr.hop_limits);
+   MLX5_SET(fte_match_set_lyr_2_4, headers_m, frag,
+!!(ipv6_m->frag_ext_exist));
+   MLX5_SET(fte_match_set_lyr_2_4, headers_v, frag,
+!!(ipv6_v->frag_ext_exist & ipv6_m->frag_ext_exist));
 }
 
 /**
-- 
1.8.3.1



[dpdk-dev] [PATCH v4 09/11] net/mlx5: support match on IPv6 fragment ext. item

2020-10-07 Thread Dekel Peled
rte_flow update, following RFC [1], added to ethdev the rte_flow item
ipv6_frag_ext.
This patch adds to MLX5 PMD the option to match on this item type.

[1] http://mails.dpdk.org/archives/dev/2020-March/160255.html

Signed-off-by: Dekel Peled 
Acked-by: Ori Kam 
---
 drivers/net/mlx5/mlx5_flow.h|   4 +
 drivers/net/mlx5/mlx5_flow_dv.c | 209 
 2 files changed, 213 insertions(+)

diff --git a/drivers/net/mlx5/mlx5_flow.h b/drivers/net/mlx5/mlx5_flow.h
index 1e30c93..376519f 100644
--- a/drivers/net/mlx5/mlx5_flow.h
+++ b/drivers/net/mlx5/mlx5_flow.h
@@ -122,6 +122,10 @@ enum mlx5_feature_name {
 /* Pattern eCPRI Layer bit. */
 #define MLX5_FLOW_LAYER_ECPRI (UINT64_C(1) << 29)
 
+/* IPv6 Fragment Extension Header bit. */
+#define MLX5_FLOW_LAYER_OUTER_L3_IPV6_FRAG_EXT (1u << 30)
+#define MLX5_FLOW_LAYER_INNER_L3_IPV6_FRAG_EXT (1u << 31)
+
 /* Outer Masks. */
 #define MLX5_FLOW_LAYER_OUTER_L3 \
(MLX5_FLOW_LAYER_OUTER_L3_IPV4 | MLX5_FLOW_LAYER_OUTER_L3_IPV6)
diff --git a/drivers/net/mlx5/mlx5_flow_dv.c b/drivers/net/mlx5/mlx5_flow_dv.c
index 4403abc..eb1db12 100644
--- a/drivers/net/mlx5/mlx5_flow_dv.c
+++ b/drivers/net/mlx5/mlx5_flow_dv.c
@@ -1901,6 +1901,120 @@ struct field_modify_info modify_tcp[] = {
 }
 
 /**
+ * Validate IPV6 fragment extension item.
+ *
+ * @param[in] item
+ *   Item specification.
+ * @param[in] item_flags
+ *   Bit-fields that holds the items detected until now.
+ * @param[out] error
+ *   Pointer to error structure.
+ *
+ * @return
+ *   0 on success, a negative errno value otherwise and rte_errno is set.
+ */
+static int
+flow_dv_validate_item_ipv6_frag_ext(const struct rte_flow_item *item,
+   uint64_t item_flags,
+   struct rte_flow_error *error)
+{
+   const struct rte_flow_item_ipv6_frag_ext *spec = item->spec;
+   const struct rte_flow_item_ipv6_frag_ext *last = item->last;
+   const struct rte_flow_item_ipv6_frag_ext *mask = item->mask;
+   rte_be16_t frag_data_spec = 0;
+   rte_be16_t frag_data_last = 0;
+   const int tunnel = !!(item_flags & MLX5_FLOW_LAYER_TUNNEL);
+   const uint64_t l4m = tunnel ? MLX5_FLOW_LAYER_INNER_L4 :
+ MLX5_FLOW_LAYER_OUTER_L4;
+   int ret = 0;
+   struct rte_flow_item_ipv6_frag_ext nic_mask = {
+   .hdr = {
+   .next_header = 0xff,
+   .frag_data = RTE_BE16(0x),
+   },
+   };
+
+   if (item_flags & l4m)
+   return rte_flow_error_set(error, EINVAL,
+ RTE_FLOW_ERROR_TYPE_ITEM, item,
+ "ipv6 fragment extension item cannot "
+ "follow L4 item.");
+   if ((tunnel && !(item_flags & MLX5_FLOW_LAYER_INNER_L3_IPV6)) ||
+   (!tunnel && !(item_flags & MLX5_FLOW_LAYER_OUTER_L3_IPV6)))
+   return rte_flow_error_set(error, EINVAL,
+ RTE_FLOW_ERROR_TYPE_ITEM, item,
+ "ipv6 fragment extension item must "
+ "follow ipv6 item");
+   if (spec && mask)
+   frag_data_spec = spec->hdr.frag_data & mask->hdr.frag_data;
+   if (!frag_data_spec)
+   return 0;
+   /*
+* spec and mask are valid, enforce using full mask to make sure the
+* complete value is used correctly.
+*/
+   if ((mask->hdr.frag_data & RTE_BE16(RTE_IPV6_FRAG_USED_MASK)) !=
+   RTE_BE16(RTE_IPV6_FRAG_USED_MASK))
+   return rte_flow_error_set(error, EINVAL,
+ RTE_FLOW_ERROR_TYPE_ITEM_MASK,
+ item, "must use full mask for"
+ " frag_data");
+   /*
+* Match on frag_data 0x1 means M is 1 and frag-offset is 0.
+* This is 1st fragment of fragmented packet.
+*/
+   if (frag_data_spec == RTE_BE16(RTE_IPV6_EHDR_MF_MASK))
+   return rte_flow_error_set(error, ENOTSUP,
+ RTE_FLOW_ERROR_TYPE_ITEM, item,
+ "match on first fragment not "
+ "supported");
+   if (frag_data_spec && !last)
+   return rte_flow_error_set(error, EINVAL,
+ RTE_FLOW_ERROR_TYPE_ITEM, item,
+ "specified value not supported");
+   ret = mlx5_flow_item_acceptable
+   (item, (const uint8_t *)mask,
+(const uint8_t *)&nic_mask,
+sizeof(struct rte_flow_item_ipv6_frag_ext),
+MLX5_ITEM_RANGE_ACCEPTED, error);
+   if (ret

[dpdk-dev] [PATCH v4 10/11] doc: update release notes for MLX5 L3 frag support

2020-10-07 Thread Dekel Peled
This patch updates 20.11 release notes with the changes included in
patches of this series:
1) MLX5 support of matching on IPv4/IPv6 fragmented/non-fragmented
   packets.
2) ABI change in ethdev struct rte_flow_item_ipv6.

Signed-off-by: Dekel Peled 
Acked-by: Ori Kam 
---
 doc/guides/rel_notes/release_20_11.rst | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/doc/guides/rel_notes/release_20_11.rst 
b/doc/guides/rel_notes/release_20_11.rst
index 0b2a370..f39f13b 100644
--- a/doc/guides/rel_notes/release_20_11.rst
+++ b/doc/guides/rel_notes/release_20_11.rst
@@ -109,6 +109,11 @@ New Features
   * Extern objects and functions can be plugged into the pipeline.
   * Transaction-oriented table updates.
 
+* **Updated Mellanox mlx5 driver.**
+
+  Updated Mellanox mlx5 driver with new features and improvements, including:
+
+  * Added support for matching on fragmented/non-fragmented IPv4/IPv6 packets.
 
 Removed Items
 -
@@ -240,6 +245,11 @@ ABI Changes
 
   * ``ethdev`` internal functions are marked with ``__rte_internal`` tag.
 
+  * Added extensions' attributes to struct ``rte_flow_item_ipv6``.
+A set of additional values added to struct, indicating the existence of
+every defined extension header type.
+Applications should use the new values for identification of existing
+extensions in the packet header.
 
 Known Issues
 
-- 
1.8.3.1



[dpdk-dev] [PATCH v4 11/11] net/mlx5: enforce limitation on IPv6 next proto

2020-10-07 Thread Dekel Peled
Due to PRM requirement, the IPv6 header item 'proto' field, indicating
the next header protocol, should not be set as extension header.
This patch adds the relevant validation, and documents the limitation.

Signed-off-by: Dekel Peled 
Acked-by: Ori Kam 
---
 doc/guides/nics/mlx5.rst |  7 +++
 drivers/net/mlx5/mlx5_flow.c | 14 --
 2 files changed, 19 insertions(+), 2 deletions(-)

diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst
index b0614ae..3c7b998 100644
--- a/doc/guides/nics/mlx5.rst
+++ b/doc/guides/nics/mlx5.rst
@@ -311,6 +311,13 @@ Limitations
 for some NICs (such as ConnectX-6 Dx and BlueField 2).
 The capability bit ``scatter_fcs_w_decap_disable`` shows NIC support.
 
+- IPv6 header item 'proto' field, indicating the next header protocol, should
+  not be set as extension header.
+  In case the next header is an extension header, it should not be specified in
+  IPv6 header item 'proto' field.
+  The last extension header item 'next header' field can specify the following
+  header protocol type.
+
 Statistics
 --
 
diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c
index bce7c18..e411e42 100644
--- a/drivers/net/mlx5/mlx5_flow.c
+++ b/drivers/net/mlx5/mlx5_flow.c
@@ -1645,9 +1645,9 @@ struct mlx5_flow_tunnel_info {
  RTE_FLOW_ERROR_TYPE_ITEM, item,
  "IPv6 cannot follow L2/VLAN layer "
  "which ether type is not IPv6");
+   if (mask && spec)
+   next_proto = mask->hdr.proto & spec->hdr.proto;
if (item_flags & MLX5_FLOW_LAYER_IPV6_ENCAP) {
-   if (mask && spec)
-   next_proto = mask->hdr.proto & spec->hdr.proto;
if (next_proto == IPPROTO_IPIP || next_proto == IPPROTO_IPV6)
return rte_flow_error_set(error, EINVAL,
  RTE_FLOW_ERROR_TYPE_ITEM,
@@ -1655,6 +1655,16 @@ struct mlx5_flow_tunnel_info {
  "multiple tunnel "
  "not supported");
}
+   if (next_proto == IPPROTO_HOPOPTS  ||
+   next_proto == IPPROTO_ROUTING  ||
+   next_proto == IPPROTO_FRAGMENT ||
+   next_proto == IPPROTO_ESP  ||
+   next_proto == IPPROTO_AH   ||
+   next_proto == IPPROTO_DSTOPTS)
+   return rte_flow_error_set(error, EINVAL,
+ RTE_FLOW_ERROR_TYPE_ITEM, item,
+ "IPv6 proto (next header) should "
+ "not be set as extension header");
if (item_flags & MLX5_FLOW_LAYER_IPIP)
return rte_flow_error_set(error, EINVAL,
  RTE_FLOW_ERROR_TYPE_ITEM, item,
-- 
1.8.3.1



Re: [dpdk-dev] [PATCH v3 01/18] eal: add max SIMD bitwidth

2020-10-07 Thread Power, Ciara
Hi Maxime,

Thanks for reviewing, some comments below.

 
>-Original Message-
>From: Maxime Coquelin 
>Sent: Tuesday 6 October 2020 12:50
>To: Power, Ciara ; dev@dpdk.org
>Cc: Ray Kinsella ; Neil Horman 
>Subject: Re: [dpdk-dev] [PATCH v3 01/18] eal: add max SIMD bitwidth
>
>
>
>On 9/30/20 3:03 PM, Ciara Power wrote:
>> This patch adds a max SIMD bitwidth EAL configuration. The API allows
>> for an app to set this value. It can also be set using EAL argument
>> --force-max-simd-bitwidth, which will lock the value and override any
>> modifications made by the app.
>>
>> Signed-off-by: Ciara Power 
>>
>> ---
>> v3:
>>   - Added enum value to essentially disable using max SIMD to choose
>> paths, intended for use by ARM SVE.
>>   - Fixed parsing bitwidth argument to return an error for values
>> greater than uint16_t.
>> v2: Added to Doxygen comment for API.
>> ---

>> @@ -1903,6 +1939,33 @@ eal_check_common_options(struct
>internal_config *internal_cfg)
>>  return 0;
>>  }
>>
>> +uint16_t
>
>shouldn't it return an enum rte_max_simd_t?

I kept the return value as uint16_t to allow for arbitrary values,
and will allow it be more flexible for future additions as new enums won't need 
to be added.
The enums are really used for readability when checking the bitwidth limit in 
drivers/libs, so
essentially #defines in enum form.

>
>> +rte_get_max_simd_bitwidth(void)
>> +{
>> +const struct internal_config *internal_conf =
>> +eal_get_internal_configuration();
>> +return internal_conf->max_simd_bitwidth.bitwidth;
>
>What is the default value if not set?
>

The default value for max_simd_bitwidth is set depending on the architecture, 
256 for x86/ppc,
and UINT16_MAX for ARM. So for example the default on x86 allows for AVX2 and 
under.
The defaults can be seen in patch 2: https://patchwork.dpdk.org/patch/79339/ 



Thanks,
Ciara



Re: [dpdk-dev] [PATCH v5 3/3] app/testpmd: reduce tunnel parsing code duplication

2020-10-07 Thread Ophir Munk

> -Original Message-
> From: Ferruh Yigit 
> Sent: Tuesday, October 6, 2020 5:30 PM
...
> Subject: Re: [PATCH v5 3/3] app/testpmd: reduce tunnel parsing code
> duplication
...
> 'RTE_VXLAN_DEFAULT_PORT' seems defined but '4789' still used as
> hardcoded in this file, since you are touching these, can you please update
> that usage too?

Done in v6


Re: [dpdk-dev] [PATCH 0/2] baseband: fix drivers API

2020-10-07 Thread Thomas Monjalon
07/10/2020 12:52, Maxime Coquelin:
> On 10/7/20 12:29 PM, Thomas Monjalon wrote:
> > 07/10/2020 12:12, Maxime Coquelin:
> >> On 10/7/20 12:09 PM, Thomas Monjalon wrote:
> >>> 07/10/2020 10:22, Maxime Coquelin:
>  On 10/6/20 9:42 PM, Akhil Goyal wrote:
> >>
> >> The series prefixes drivers APIs with rte__ in
> >> order to avoid namespace pollution.
> >>
> >> These APIs are experimental, so no need to follow the
> >> deprecation process.
> >>
> > Added Fixes commit in patch description.
> 
>  Thanks for applying it to your tree.
> 
>  I did not add Fixes tag and Cc'ed sta...@dpdk.org on purpose,
>  because I thought it is was not a good idea to backport API changes,
>  even if this is experimental it might be annoying for the user.
> 
>  Thomas, do you confirm?
> >>>
> >>> Absolutely: API must not change in a stable branch.
> >>>
> >>> If an API is changed, it must be in the release notes.
> >>
> >> Ok, even for experimental APIs? I thought not.
> > 
> > Yes, experimental means it can change in the main branch
> > without prior notice. But it must be noted when it's changed.
> 
> Ok, do you want me to send add-on patches that you will squash when
> pulling Akhil's branch?

Yes please





Re: [dpdk-dev] [PATCH 0/2] baseband: fix drivers API

2020-10-07 Thread Thomas Monjalon
07/10/2020 13:06, Thomas Monjalon:
> 07/10/2020 12:52, Maxime Coquelin:
> > On 10/7/20 12:29 PM, Thomas Monjalon wrote:
> > > 07/10/2020 12:12, Maxime Coquelin:
> > >> On 10/7/20 12:09 PM, Thomas Monjalon wrote:
> > >>> 07/10/2020 10:22, Maxime Coquelin:
> >  On 10/6/20 9:42 PM, Akhil Goyal wrote:
> > >>
> > >> The series prefixes drivers APIs with rte__ in
> > >> order to avoid namespace pollution.
> > >>
> > >> These APIs are experimental, so no need to follow the
> > >> deprecation process.
> > >>
> > > Added Fixes commit in patch description.
> > 
> >  Thanks for applying it to your tree.
> > 
> >  I did not add Fixes tag and Cc'ed sta...@dpdk.org on purpose,
> >  because I thought it is was not a good idea to backport API changes,
> >  even if this is experimental it might be annoying for the user.
> > 
> >  Thomas, do you confirm?
> > >>>
> > >>> Absolutely: API must not change in a stable branch.
> > >>>
> > >>> If an API is changed, it must be in the release notes.
> > >>
> > >> Ok, even for experimental APIs? I thought not.
> > > 
> > > Yes, experimental means it can change in the main branch
> > > without prior notice. But it must be noted when it's changed.
> > 
> > Ok, do you want me to send add-on patches that you will squash when
> > pulling Akhil's branch?
> 
> Yes please

Better: Akhil can squash in his tree and remove the Cc: stable tag.





Re: [dpdk-dev] [PATCH 0/2] baseband: fix drivers API

2020-10-07 Thread Maxime Coquelin



On 10/7/20 1:06 PM, Thomas Monjalon wrote:
> 07/10/2020 12:52, Maxime Coquelin:
>> On 10/7/20 12:29 PM, Thomas Monjalon wrote:
>>> 07/10/2020 12:12, Maxime Coquelin:
 On 10/7/20 12:09 PM, Thomas Monjalon wrote:
> 07/10/2020 10:22, Maxime Coquelin:
>> On 10/6/20 9:42 PM, Akhil Goyal wrote:

 The series prefixes drivers APIs with rte__ in
 order to avoid namespace pollution.

 These APIs are experimental, so no need to follow the
 deprecation process.

>>> Added Fixes commit in patch description.
>>
>> Thanks for applying it to your tree.
>>
>> I did not add Fixes tag and Cc'ed sta...@dpdk.org on purpose,
>> because I thought it is was not a good idea to backport API changes,
>> even if this is experimental it might be annoying for the user.
>>
>> Thomas, do you confirm?
>
> Absolutely: API must not change in a stable branch.
>
> If an API is changed, it must be in the release notes.

 Ok, even for experimental APIs? I thought not.
>>>
>>> Yes, experimental means it can change in the main branch
>>> without prior notice. But it must be noted when it's changed.
>>
>> Ok, do you want me to send add-on patches that you will squash when
>> pulling Akhil's branch?
> 
> Yes please

OK, will do that later today.

Thanks,
Maxime



Re: [dpdk-dev] [PATCH v3 01/18] eal: add max SIMD bitwidth

2020-10-07 Thread Power, Ciara
Hi Bruce,

>-Original Message-
>From: Bruce Richardson 
>Sent: Wednesday 7 October 2020 11:52
>To: Power, Ciara 
>Cc: Olivier Matz ; dev@dpdk.org; Ray Kinsella
>; Neil Horman 
>Subject: Re: [dpdk-dev] [PATCH v3 01/18] eal: add max SIMD bitwidth
>
>On Wed, Oct 07, 2020 at 11:47:34AM +0100, Power, Ciara wrote:
>> Hi Olivier,
>>
>> Thanks for reviewing, some comments below.
>>
>>
>> >-Original Message-
>> >From: Olivier Matz 
>> >Sent: Tuesday 6 October 2020 10:32
>> >To: Power, Ciara 
>> >Cc: dev@dpdk.org; Ray Kinsella ; Neil Horman
>> >
>> >Subject: Re: [dpdk-dev] [PATCH v3 01/18] eal: add max SIMD bitwidth
>> >
>> >Hi Ciara,
>> >
>> >Please find some comments below.
>> >
>> >On Wed, Sep 30, 2020 at 02:03:57PM +0100, Ciara Power wrote:
>> >> This patch adds a max SIMD bitwidth EAL configuration. The API
>> >> allows for an app to set this value. It can also be set using EAL
>> >> argument --force-max-simd-bitwidth, which will lock the value and
>> >> override any modifications made by the app.
>> >>
>> >> Signed-off-by: Ciara Power 
>> >>
>> >> ---
>> >> v3:
>> >>   - Added enum value to essentially disable using max SIMD to choose
>> >> paths, intended for use by ARM SVE.
>> >>   - Fixed parsing bitwidth argument to return an error for values
>> >> greater than uint16_t.
>> >> v2: Added to Doxygen comment for API.
>> >> ---
>>
>> 
>>
>> >>
>> >> +uint16_t
>> >> +rte_get_max_simd_bitwidth(void)
>> >> +{
>> >> +const struct internal_config *internal_conf =
>> >> +eal_get_internal_configuration();
>> >> +return internal_conf->max_simd_bitwidth.bitwidth;
>> >> +}
>> >
>> >Should the return value be enum rte_max_simd_t?
>> >If not, do we really need the enum definition?
>> >
>>
>> I kept the return value and param value below as uint16_t to allow for
>> arbitrary values, and will allow it be more flexible for future additions as
>new enums won't need to be added.
>> For the set function below, this is used when a user passes the EAL
>> command line flag, which passes an integer value rather than an enum one.
>> The enums are useful when checking the max_simd_bitwidth in
>> drivers/libs, for example using "RTE_MAX_256_SIMD" instead of "256" in
>the condition checks.
>>
>So basically these enum values are #defines for readability, just in enum
>form, right?

Yes, that's exactly right.

Thanks,
Ciara


Re: [dpdk-dev] [PATCH v4 00/11] support match on L3 fragmented packets

2020-10-07 Thread Ori Kam
Hi Dekel,

> -Original Message-
> From: Dekel Peled 
> Sent: Wednesday, October 7, 2020 1:54 PM
> Subject: [PATCH v4 00/11] support match on L3 fragmented packets
> 
> This series implements support of matching on packets based on the
> fragmentation attribute of the packet, i.e. if packet is a fragment
> of a larger packet, or the opposite - packet is not a fragment.
> 
> In ethdev, add API to support IPv6 extension headers, and specifically
> the IPv6 fragment extension header item.
> In MLX5 PMD, support match on IPv4 fragmented packets, IPv6 fragmented
> packets, and IPv6 fragment extension header item.
> Testpmd CLI is updated accordingly.
> Documentation is updated accordingly.
> 
> ---
> v2: add patch 'net/mlx5: enforce limitation on IPv6 next proto'
> v3: update patch 'ethdev: add IPv6 fragment extension header item' to avoid
> ABI breakage.
> v4: update rte_flow documentation to clarify use of IPv6 extension header
> flags.
> ---
> 
> Dekel Peled (11):
>   ethdev: add extensions attributes to IPv6 item
>   ethdev: add IPv6 fragment extension header item
>   app/testpmd: support IPv4 fragments
>   app/testpmd: support IPv6 fragments
>   app/testpmd: support IPv6 fragment extension item
>   net/mlx5: remove handling of ICMP fragmented packets
>   net/mlx5: support match on IPv4 fragment packets
>   net/mlx5: support match on IPv6 fragment packets
>   net/mlx5: support match on IPv6 fragment ext. item
>   doc: update release notes for MLX5 L3 frag support
>   net/mlx5: enforce limitation on IPv6 next proto
> 
>  app/test-pmd/cmdline_flow.c|  53 +
>  doc/guides/nics/mlx5.rst   |   7 +
>  doc/guides/prog_guide/rte_flow.rst |  34 ++-
>  doc/guides/rel_notes/release_20_11.rst |  10 +
>  drivers/net/mlx5/mlx5_flow.c   |  62 --
>  drivers/net/mlx5/mlx5_flow.h   |  14 ++
>  drivers/net/mlx5/mlx5_flow_dv.c| 382
> +
>  drivers/net/mlx5/mlx5_flow_verbs.c |   9 +-
>  lib/librte_ethdev/rte_flow.c   |   1 +
>  lib/librte_ethdev/rte_flow.h   |  45 +++-
>  lib/librte_ip_frag/rte_ip_frag.h   |  26 +--
>  lib/librte_net/rte_ip.h|  26 ++-
>  12 files changed, 579 insertions(+), 90 deletions(-)
> 
> --
> 1.8.3.1

Series-acked-by:  Ori Kam 
Thanks,
Ori



Re: [dpdk-dev] [PATCH v3 17/18] net: add checks for max SIMD bitwidth

2020-10-07 Thread Power, Ciara
Hi Olivier,

 
>-Original Message-
>From: Olivier Matz 
>Sent: Tuesday 6 October 2020 11:01
>To: Power, Ciara 
>Cc: Coyle, David ; Singh, Jasvinder
>; dev@dpdk.org; O'loingsigh, Mairtin
>; Ryan, Brendan ;
>Richardson, Bruce 
>Subject: Re: [dpdk-dev] [PATCH v3 17/18] net: add checks for max SIMD
>bitwidth
>
>Hi,
>
>On Thu, Oct 01, 2020 at 02:19:37PM +, Power, Ciara wrote:
>> Hi David,
>>
>> Thanks for reviewing,
>>
>> >-Original Message-
>> >From: Coyle, David 
>> >Sent: Thursday 1 October 2020 15:17
>> >To: Singh, Jasvinder ; Power, Ciara
>> >; dev@dpdk.org
>> >Cc: Power, Ciara ; Olivier Matz
>> >; O'loingsigh, Mairtin
>> >; Ryan, Brendan
>> >; Richardson, Bruce
>> >
>> >Subject: RE: [dpdk-dev] [PATCH v3 17/18] net: add checks for max SIMD
>> >bitwidth
>> >
>> >Hi Jasvinder/Ciara
>> >
>> >> -Original Message-
>> >> From: Singh, Jasvinder 
>> >> >
>> >> > > From: dev  On Behalf Of Ciara Power When
>> >> > > choosing a vector path to take, an extra condition must be
>> >> > > satisfied to ensure the max SIMD bitwidth allows for the CPU
>> >> > > enabled
>> >path.
>> >> > >
>> >> > > The vector path was initially chosen in RTE_INIT, however this
>> >> > > is no longer suitable as we cannot check the max SIMD bitwidth
>> >> > > at that
>> >time.
>> >> > > The default chosen in RTE_INIT is now scalar. For best
>> >> > > performance and to use vector paths, apps must explicitly call
>> >> > > the set algorithm function before using other functions from
>> >> > > this library, as this is where vector handlers are now chosen.
>> >> >
>> >> > [DC] Has it been decided that it is ok to now require
>> >> > applications to pick the CRC algorithm they want to use?
>> >> >
>> >> > An application which previously automatically got SSE4.2 CRC, for
>> >> > example, will now automatically only get scalar.
>> >> >
>> >> > If this is ok, this should probably be called out explicitly in
>> >> > release notes as it may not be Immediately noticeable to users
>> >> > that they now need to select the CRC algo.
>> >> >
>> >> > Actually, in general, the release notes need to be updated for
>> >> > this
>> >> patchset.
>> >>
>> >> The decision to move rte_set_alg() out of RTE_INIT was taken to
>> >> avoid check on max_simd_bitwidth in data path for every single time
>> >> when
>> >> crc_calc() api is invoked. Based on my understanding,
>> >> max_simd_bitwidth is set after eal init, and when used in
>> >> crc_calc(), it might override the default crc algo set during
>> >> RTE_INIT. Therefore, to avoid extra check on max_simd_bitwidth in
>> >> data path,  better option will be to use this static configuration
>> >> one time after eal init in the set_algo
>> >API.
>> >
>> >[DC] Yes that is a good change to have made to avoid extra datapath
>checks.
>> >
>> >Based on off-list discussion, I now also know the reason behind now
>> >defaulting to scalar CRC in RTE_INIT. If a higher bitwidth CRC was
>> >chosen by RTE_INIT (e.g.
>> >SSE4.2 CRC) but the max_simd_bitwidth was then set to RTE_NO_SIMD
>> >(64) through the EAL parameter or call to
>> >rte_set_max_simd_bitwidth(), then there is a mismatch if
>> >rte_net_crc_set_alg() is not then called to reconfigure the CRC.
>> >Defaulting to scalar avoids this mismatch and works on all archs
>> >
>> >As I mentioned before, I think this needs to be called out in release
>> >notes, as it's an under-the-hood change which could cause app
>> >performance to drop if app developers aren't aware of it - the API
>> >itself hasn't changed, so they may not read the doxygen :)
>> >
>>
>> Yes that is a good point, I can add to the release notes for this to call it 
>> out.
>
>I don't think it is a good idea to have the scalar crc by default.
>To me, the fastest available CRC has to be enabled by default.
>
>I understand the technical reason why you did it like this however: the SIMD
>bitwidth may not be known at the time the
>RTE_INIT(rte_net_crc_init) function is called.
>
>A simple approach to solve this issue would be to initialize the
>rte_net_crc_handler pointer to a handlers_default. The first time a crc is
>called, the rte_crc32_*_default_handler() function would check the
>configured SIMD bitwidth, and set the handler to the correct one, to avoid to
>do the test for next time.

Thanks for this suggestion, will try this for the next version, it seems it 
will work quite well, thanks.

>This approach still does not solve the case where the SIMD bitwidth is
>modified during the life of the application. In this case, a callback would 
>have
>to be registered to notify SIMD bitwidth changes... but I don't think it is 
>worth
>to do it. Instead, it can be documented that
>rte_set_max_simd_bitwidth() has to be called early, before rte_eal_init().
>

Yes, It is documented in the Doxygen comment for the 
rte_set_max_simd_bitwidth() function
 that it should be called early, as you mentioned.



Thanks,
Ciara


Re: [dpdk-dev] [PATCH v3 01/18] eal: add max SIMD bitwidth

2020-10-07 Thread Olivier Matz
Hi Ciara,

On Wed, Oct 07, 2020 at 10:47:34AM +, Power, Ciara wrote:
> Hi Olivier,
> 
> Thanks for reviewing, some comments below.
> 
> 
> >-Original Message-
> >From: Olivier Matz 
> >Sent: Tuesday 6 October 2020 10:32
> >To: Power, Ciara 
> >Cc: dev@dpdk.org; Ray Kinsella ; Neil Horman
> >
> >Subject: Re: [dpdk-dev] [PATCH v3 01/18] eal: add max SIMD bitwidth
> >
> >Hi Ciara,
> >
> >Please find some comments below.
> >
> >On Wed, Sep 30, 2020 at 02:03:57PM +0100, Ciara Power wrote:
> >> This patch adds a max SIMD bitwidth EAL configuration. The API allows
> >> for an app to set this value. It can also be set using EAL argument
> >> --force-max-simd-bitwidth, which will lock the value and override any
> >> modifications made by the app.
> >>
> >> Signed-off-by: Ciara Power 
> >>
> >> ---
> >> v3:
> >>   - Added enum value to essentially disable using max SIMD to choose
> >> paths, intended for use by ARM SVE.
> >>   - Fixed parsing bitwidth argument to return an error for values
> >> greater than uint16_t.
> >> v2: Added to Doxygen comment for API.
> >> ---
> 
> 
> 
> >>
> >> +uint16_t
> >> +rte_get_max_simd_bitwidth(void)
> >> +{
> >> +  const struct internal_config *internal_conf =
> >> +  eal_get_internal_configuration();
> >> +  return internal_conf->max_simd_bitwidth.bitwidth;
> >> +}
> >
> >Should the return value be enum rte_max_simd_t?
> >If not, do we really need the enum definition?
> >
> 
> I kept the return value and param value below as uint16_t to allow for 
> arbitrary values,
> and will allow it be more flexible for future additions as new enums won't 
> need to be added.
> For the set function below, this is used when a user passes the EAL command 
> line flag, which
> passes an integer value rather than an enum one.
> The enums are useful when checking the max_simd_bitwidth in drivers/libs, for 
> example using
> "RTE_MAX_256_SIMD" instead of "256" in the condition checks.
> 
> >> +
> >> +int
> >> +rte_set_max_simd_bitwidth(uint16_t bitwidth) {
> >> +  struct internal_config *internal_conf =
> >> +  eal_get_internal_configuration();
> >> +  if (internal_conf->max_simd_bitwidth.locked) {
> >> +  RTE_LOG(NOTICE, EAL, "Cannot set max SIMD bitwidth - user
> >runtime override enabled");
> >> +  return -EPERM;
> >> +  }
> >> +
> >> +  if (bitwidth != RTE_MAX_SIMD_DISABLE && (bitwidth <
> >RTE_NO_SIMD ||
> >> +  !rte_is_power_of_2(bitwidth))) {
> >> +  RTE_LOG(ERR, EAL, "Invalid bitwidth value!\n");
> >> +  return -EINVAL;
> >> +  }
> >> +  internal_conf->max_simd_bitwidth.bitwidth = bitwidth;
> >> +  return 0;
> >> +}
> >
> >Same question, should the parameter be enum rte_max_simd_t?
> >
> 
> 
> 
> >> +enum rte_max_simd_t {
> >> +  RTE_NO_SIMD = 64,
> >> +  RTE_MAX_128_SIMD = 128,
> >> +  RTE_MAX_256_SIMD = 256,
> >> +  RTE_MAX_512_SIMD = 512,
> >> +  RTE_MAX_SIMD_DISABLE = UINT16_MAX,
> >> +};
> >
> >What is the difference between RTE_NO_SIMD and
> >RTE_MAX_SIMD_DISABLE?
> 
> RTE_NO_SIMD has value 64 to limit paths to scalar only.
> RTE_MAX_SIMD_DISABLE sets the highest value possible,
> so essentially disables the limit affecting which vector paths are taken.
> This disable option was added to allow for ARM SVE which will be later added,
> Discussed with Honnappa on a previous version: 
> https://patchwork.dpdk.org/patch/76097/ 

Ok, so RTE_MAX_SIMD_DISABLE means "disable the max limit", right?

I feel the name is a bit confusing. What about something like this:

enum rte_simd {
RTE_SIMD_DISABLED = 0,
RTE_SIMD_128 = 128,
RTE_SIMD_256 = 256,
RTE_SIMD_512 = 512,
RTE_SIMD_MAX = UINT16_MAX,
};


> 
> >The default value in internal_config is 0, so in my understanding
> >rte_get_max_simd_bitwidth() will return 0 if --force-max-simd-bitwidth is
> >not passed. Is it expected?
> >
> >Maybe I'm missing something, but I don't understand why the value in
> >internal_config is not set to the maximum supported SIMD bitwidth by
> >default, and optionally overriden by the command line argument, or by the
> >API.
> >
> 
> The default value for max_simd_bitwidth is set depending on the architecture, 
> 256 for x86/ppc,
> and UINT16_MAX for ARM. So for example the default on x86 allows for AVX2 and 
> under.
> The defaults can be seen in patch 2: https://patchwork.dpdk.org/patch/79339/ 

Ok, I was expecting to have a runtime check for this. For instance, on
intel architecture, it is not known at compilation, it depends on the
target which can support up to AVX, AVX2, or AVX512.

> 
> 
> 
> Thanks,
> Ciara


Re: [dpdk-dev] [PATCH 1/4] ethdev: add hairpin bind and unbind APIs

2020-10-07 Thread Bing Zhao
Hi Ori,

> -Original Message-
> From: Ori Kam 
> Sent: Sunday, October 4, 2020 5:20 PM
> To: Bing Zhao ; NBU-Contact-Thomas Monjalon
> ; ferruh.yi...@intel.com;
> arybche...@solarflare.com; m...@ashroe.eu; nhor...@tuxdriver.com;
> bernard.iremon...@intel.com; beilei.x...@intel.com;
> wenzhuo...@intel.com
> Cc: dev@dpdk.org
> Subject: RE: [PATCH 1/4] ethdev: add hairpin bind and unbind APIs
> 
> Hi Bing,
> 
> PSB,
> 
> Thanks,
> Ori
> > -Original Message-
> > From: Bing Zhao 
> > Sent: Thursday, October 1, 2020 3:26 AM
> > Cc: dev@dpdk.org
> > Subject: [PATCH 1/4] ethdev: add hairpin bind and unbind APIs
> >
> > In single port hairpin mode, all the hairpin TX and RX queues
> belong
> > to the same device. After the queues are set up properly, there is
> no
> > other dependency between the TX queue and its RX peer queue. The
> > binding process that connected the TX and RX queues together from
> > hardware level will be done automatically during the device start
> > procedure. Everything required is configured and initialized
> already
> > for the binding process.
> >
> > But in two ports hairpin mode, there will be some cross-
> dependences
> > between two different ports. Usually, the ports will be
> initialized
> > serially by the main thread but not in parallel. The earlier port
> will
> > not be able to enable the bind if the following peer port is not
> yet
> > configured with HW resources. What's more, if one port is detached
> /
> > attached dynamically, it would introduce more trouble for the
> hairpin
> > binding.
> >
> > To overcome these, new APIs for binding and unbinding are added.
> > During startup, only the hairpin TX and RX peer queues will be set
> up.
> > Nothing will be done when starting the device if the queues are
> > without auto-bind attribute. Only after the required ports pair
> > started, the `rte_eth_hairpin_bind()` API can be called to bind
> the
> > all TX queues of the egress port to the RX queues of the peer port.
> > Then the connection between the egress and ingress ports pair will
> be
> > established.
> >
> > The `rte_eth_hairpin_unbind()` API could be used to disconnect the
> > egress and the peer ingress ports. This should only be called
> before
> > the device is closed if needed. When doing the clean up, all the
> > egress and ingress pairs related to a single port should be taken
> into
> > consideration.
> >
> > Signed-off-by: Bing Zhao 
> > ---
> >  lib/librte_ethdev/rte_ethdev.c   | 107
> > +++
> >  lib/librte_ethdev/rte_ethdev.h   |  51 +++
> >  lib/librte_ethdev/rte_ethdev_driver.h|  52 +++
> >  lib/librte_ethdev/rte_ethdev_version.map |   2 +
> >  4 files changed, 212 insertions(+)
> >
> > diff --git a/lib/librte_ethdev/rte_ethdev.c
> > b/lib/librte_ethdev/rte_ethdev.c index dfe5c1b..72f567b 100644
> > --- a/lib/librte_ethdev/rte_ethdev.c
> > +++ b/lib/librte_ethdev/rte_ethdev.c
> > @@ -2175,6 +2175,113 @@ rte_eth_tx_hairpin_queue_setup(uint16_t
> > port_id, uint16_t tx_queue_id,
> > return eth_err(port_id, ret);
> >  }
> >
> > +int
> > +rte_eth_hairpin_bind(uint16_t tx_port, uint16_t rx_port) {
> > +   struct rte_eth_dev *dev;
> > +   struct rte_eth_dev *rdev;
> > +   uint16_t p;
> > +   uint16_t rp;
> > +   int ret = 0;
> > +
> > +   RTE_ETH_VALID_PORTID_OR_ERR_RET(tx_port, -EINVAL);
> > +   dev = &rte_eth_devices[tx_port];
> > +   if (!dev->data->dev_started) {
> > +   RTE_ETHDEV_LOG(ERR, "TX port %d is not started",
> tx_port);
> > +   return -EBUSY;
> > +   }
> > +
> > +   /*
> > +* If the all the ports probed belong to two or more separate
> NICs, it
> > +* is recommended that each pair is bound independently but
> not in the
> > +* loop to bind all ports.
> > +*/
> 
> I don't understand your comment.

I mean, in the following loops, if the application wants to bind one TX port to 
all the other RX ports, it would not be supported between different PMDs or 
different NICs. Then the roll-back actions will be done and no hairpin port 
peers will work successfully.

> 
> > +   if (rx_port == RTE_MAX_ETHPORTS) {
> 
> I think maybe this should be done in the tx queue. Since if the bind
> don't need some port why do we care if it is started?

Do you mean the checking in the RX ports loop below? At first, I intended to 
make sure this API would be called only after all the ports are started.
But yes, there is no need if some port is not being used.

> So either add a new function to get all peer ports from the tx port,
> or move this logic to the Target PMD.

There would be one more API to be introduced. To my understanding, it would be 
more appropriate if we add a new API here.
But it would keep RTE simpler if we move such logic into the PMD.

> 
> > +   RTE_ETH_FOREACH_DEV(p) {
> > +   rdev = &rte_eth_devices[p];
> > +   if (!rdev->data->dev_started) {
> > +   RTE_ETHDEV_LO

Re: [dpdk-dev] [PATCH 2/4] ethdev: add new attributes to hairpin config

2020-10-07 Thread Bing Zhao
Hi Ori,

> -Original Message-
> From: Ori Kam 
> Sent: Sunday, October 4, 2020 5:22 PM
> To: Bing Zhao ; NBU-Contact-Thomas Monjalon
> ; ferruh.yi...@intel.com;
> arybche...@solarflare.com; m...@ashroe.eu; nhor...@tuxdriver.com;
> bernard.iremon...@intel.com; beilei.x...@intel.com;
> wenzhuo...@intel.com
> Cc: dev@dpdk.org
> Subject: RE: [PATCH 2/4] ethdev: add new attributes to hairpin
> config
> 
> Hi Bing,
> 
> PSB,
> Best,
> Ori
> 
> > -Original Message-
> > From: Bing Zhao 
> > Sent: Thursday, October 1, 2020 3:26 AM
> > Subject: [PATCH 2/4] ethdev: add new attributes to hairpin config
> >
> > To support two ports hairpin mode and keep the backward
> compatibility
> > for the application, two new attribute members of hairpin queue
> config
> > structure are added.
> >
> > `tx_explicit` means if the application itself will insert the TX
> part
> > flow rules. If not set, PMD will insert the rules implicitly.
> > `manual_bind` means if the hairpin TX queue and peer RX queue will
> be
> > bound automatically during device start stage.
> >
> > Different TX and RX queue pairs could have different values, but
> it is
> > highly recommend that all paired queues between one egress and its
> > peer ingress ports have the same values, in order not to bring any
> > chaos to the system. The actual support of these attribute
> parameters
> > will be checked and decided by the PMD driver.
> >
> > In a single port hairpin, if both are zero without any setting,
> the
> > behavior will remain the same as before. It means no bind API
> needs to
> > be called and no TX flow rules need to be inserted manually by the
> > application.
> >
> > Signed-off-by: Bing Zhao 
> > ---
> >  lib/librte_ethdev/rte_ethdev.h | 18 ++
> >  1 file changed, 18 insertions(+)
> >
> > diff --git a/lib/librte_ethdev/rte_ethdev.h
> > b/lib/librte_ethdev/rte_ethdev.h index c3fb684..0cabff0 100644
> > --- a/lib/librte_ethdev/rte_ethdev.h
> > +++ b/lib/librte_ethdev/rte_ethdev.h
> > @@ -1027,6 +1027,21 @@ struct rte_eth_hairpin_cap {
> >
> >  #define RTE_ETH_MAX_HAIRPIN_PEERS 32
> >
> > +/*
> > + * Hairpin queue attribute parameters.
> > + * Each TX queue and peer RX queue should have the same value.
> > + * Default value 0 is for backward-compatibility, the same
> behaviors
> > +should
> > + * remain if the value is not set (0).
> > + */
> > +/**< Hairpin queues will be bound automatically */
> > +#define RTE_ETH_HAIRPIN_BIND_AUTO  (0)
> > +/**< Hairpin queues will be bound manually with bind API */
> > +#define RTE_ETH_HAIRPIN_BIND_MANUAL(1)
> > +/**< Hairpin TX part flow rule will be inserted implicitly by PMD
> */
> > +#define RTE_ETH_HAIRPIN_TXRULE_IMPLICIT(0)
> > +/**< Hairpin TX part flow rule will be inserted explicitly by APP
> */
> > +#define RTE_ETH_HAIRPIN_TXRULE_EXPLICIT(1)
> > +
> 
> Why do you need those defines if you are using bit fields?

I will remove this and add the description of the modes in the document.

> 
> >  /**
> >   * @warning
> >   * @b EXPERIMENTAL: this API may change, or be removed, without
> prior
> > notice @@ -1046,6 +1061,9 @@ struct rte_eth_hairpin_peer {
> >   */
> >  struct rte_eth_hairpin_conf {
> > uint16_t peer_count; /**< The number of peers. */
> > +   uint32_t reserved : 30; /**< Reserved bits. */
> > +   uint32_t tx_explicit : 1; /**< Explicit TX flow rule mode. */
> > +   uint32_t manual_bind : 1; /**< Manually bind hairpin queues.
> */
> 
> Why not place the new bits at the end?

By using uint16_t bit fields, there will be some warnings by the compiler and 
it is not standard.
I prefer to change the "uint16_t peer_count" into "uint32_t peer_count:16" to 
use the gap before the next structure.
Or yes, I can move it to the end of this current structure.

> Also why do you place the reserved first?

Thanks, I will move it to the end of the u32 like other structure definitions.

> 
> > struct rte_eth_hairpin_peer
> peers[RTE_ETH_MAX_HAIRPIN_PEERS];  };
> >
> > --
> > 2.5.5

Thanks



Re: [dpdk-dev] [PATCH v2] eal/windows: export all built functions for clang

2020-10-07 Thread John Alexander
> -Original Message-
> From: dev  On Behalf Of Tal Shnaiderman
> Sent: 07 October 2020 08:20
> To: dev@dpdk.org
> Cc: tho...@monjalon.net; pallavi.ka...@intel.com;
> dmitry.kozl...@gmail.com; ranjit.me...@intel.com;
> navas...@linux.microsoft.com; dmit...@microsoft.com;
> ophi...@nvidia.com
> Subject: [dpdk-dev] [PATCH v2] eal/windows: export all built functions for
> clang
> 
> CAUTION: This email originated from outside of the organization. Do not click
> links or open attachments unless you recognize the sender and know the
> content is safe.
> 
> export for clang build all the functions currently built on Windows and listed
> in rte_eal_version.map by adding them to rte_eal_exports.def.
> 
> Signed-off-by: Tal Shnaiderman 
> Acked-by: Ranjit Menon 
> ---
> v2: rebase to master
> ---
> ---
>  lib/librte_eal/rte_eal_exports.def | 156
> +++--
>  1 file changed, 151 insertions(+), 5 deletions(-)
> 
> diff --git a/lib/librte_eal/rte_eal_exports.def
> b/lib/librte_eal/rte_eal_exports.def
> index 7b35beb702..d7a47d0929 100644
> --- a/lib/librte_eal/rte_eal_exports.def
> +++ b/lib/librte_eal/rte_eal_exports.def
> @@ -3,34 +3,83 @@ EXPORTS
> per_lcore__lcore_id
> per_lcore__rte_errno
> per_lcore__thread_id
> +   rte_bus_dump
> +   rte_bus_find
> +   rte_bus_find_by_device
> +   rte_bus_find_by_name
> +   rte_bus_get_iommu_class
> +   rte_bus_probe
> rte_bus_register
> +   rte_bus_scan
> +   rte_bus_unregister
> rte_calloc
> rte_calloc_socket
> rte_cpu_get_flag_enabled
> +   rte_cpu_get_flag_name
> +   rte_ctrl_thread_create
> +   rte_delay_us
> +   rte_delay_us_block
> +   rte_delay_us_callback_register
> rte_dev_is_probed
> +   rte_dev_probe
> +   rte_dev_remove
> +   rte_devargs_add
> +   rte_devargs_dump
> rte_devargs_insert
> rte_devargs_next
> rte_devargs_parse
> +   rte_devargs_parsef
> rte_devargs_remove
> +   rte_devargs_type_count
> +   rte_dump_physmem_layout
> +   rte_dump_registers
> +   rte_dump_stack
> +   rte_dump_tailq
> +   rte_eal_cleanup
> +   rte_eal_get_lcore_state
> +   rte_eal_get_physmem_size
> +   rte_eal_get_runtime_dir
> rte_eal_has_hugepages
> rte_eal_has_pci
> +   rte_eal_hotplug_add
> +   rte_eal_hotplug_remove
> rte_eal_init
> rte_eal_iova_mode
> +   rte_eal_lcore_role
> rte_eal_mbuf_user_pool_ops
> rte_eal_mp_remote_launch
> rte_eal_mp_wait_lcore
> rte_eal_process_type
> rte_eal_remote_launch
> -   rte_log
> rte_eal_tailq_lookup
> rte_eal_tailq_register
> rte_eal_using_phys_addrs
> +   rte_eal_wait_lcore
> +   rte_exit
> rte_free
> +   rte_get_master_lcore
> +   rte_get_next_lcore
> rte_get_tsc_hz
> rte_hexdump
> +   rte_hypervisor_get
> rte_intr_rx_ctl
> +   rte_lcore_count
> +   rte_lcore_has_role
> +   rte_lcore_index
> +   rte_lcore_is_enabled
> +   rte_lcore_to_socket_id
> +   rte_log
> +   rte_log_cur_msg_loglevel
> +   rte_log_cur_msg_logtype
> +   rte_log_dump
> +   rte_log_get_global_level
> +   rte_log_get_level
> +   rte_log_get_stream
> rte_log_register
> +   rte_log_set_global_level
> rte_log_set_level
> +   rte_log_set_level_pattern
> +   rte_log_set_level_regexp
> rte_malloc
> rte_malloc_dump_stats
> rte_malloc_get_socket_stats
> @@ -53,6 +102,7 @@ EXPORTS
> rte_mem_lock_page
> rte_mem_virt2iova
> rte_mem_virt2phy
> +   rte_memdump
> rte_memory_get_nchannel
> rte_memory_get_nrank
> rte_memzone_dump
> @@ -62,16 +112,53 @@ EXPORTS
> rte_memzone_reserve_aligned
> rte_memzone_reserve_bounded
> rte_memzone_walk
> +   rte_openlog_stream
> +   rte_realloc
> +   rte_rtm_supported
> +   rte_service_attr_get
> +   rte_service_attr_reset_all
> +   rte_service_component_register
> +   rte_service_component_runstate_set
> +   rte_service_component_unregister
> +   rte_service_dump
> +   rte_service_finalize
> +   rte_service_get_by_name
> +   rte_service_get_count
> +   rte_service_get_name
> +   rte_service_lcore_add
> +   rte_service_lcore_attr_get
> +   rte_service_lcore_attr_reset_all
> +   rte_service_lcore_count
> +   rte_service_lcore_count_services
> +   rte_service_lcore_del
> +   rte_service_lcore_list
> +   rte_service_lcore_reset_all
> +   rte_service_lcore_start
> +   rte_service_lcore_stop
> +   rte_service_map_lcore_get
> +   rte_service_map_lcore_set
> +   rte_service_may_be_active
> +   rte_service_probe_capability
> +   rte_service_run_iter_on_app_lcore
> +   rte_service_ru

Re: [dpdk-dev] [PATCH 3/4] ethdev: add APIs for hairpin queue operation

2020-10-07 Thread Bing Zhao
Hi Ori,

> -Original Message-
> From: Ori Kam 
> Sent: Sunday, October 4, 2020 5:35 PM
> To: Bing Zhao ; NBU-Contact-Thomas Monjalon
> ; ferruh.yi...@intel.com;
> arybche...@solarflare.com; m...@ashroe.eu; nhor...@tuxdriver.com;
> bernard.iremon...@intel.com; beilei.x...@intel.com;
> wenzhuo...@intel.com
> Cc: dev@dpdk.org
> Subject: RE: [PATCH 3/4] ethdev: add APIs for hairpin queue
> operation
> 
> Hi Bing,
> 
> PSB
> Best,
> Ori
> 
> > -Original Message-
> > From: Bing Zhao 
> > Sent: Thursday, October 1, 2020 3:26 AM
> > Subject: [PATCH 3/4] ethdev: add APIs for hairpin queue operation
> >
> > Every hairpin queue pair should be configured properly and the
> > connection between TX and RX queues should be established, before
> > hairpin function works. In single port hairpin mode, the queues of
> > each pair belong to the same device. It is easy to get the
> hardware
> > and software information of each queue and configure the hairpin
> > connection with such information. In two ports hairpin mode, it is
> not
> > easy or inappropriate to access one queue's information from
> another
> > device.
> >
> > Since hairpin is configured per queue pair, three new APIs are
> > introduced and they are internal for the PMD using.
> >
> > The peer update API helps to pass one queue's information to the
> peer
> > queue and get the peer's information back for the next step.
> > The peer bind API configures the current queue with the peer's
> > information. For each hairpin queue pair, this API may need to be
> > called twice to configure the TX, RX queues separately.
> > The peer unbind API resets the current queue configuraion and
> state to
> > disconnect it from the peer queue. Also, it may need to be called
> > twice to disconnect TX, RX queues from each other.
> >
> > Some parameter of the above APIs might not be mandatory, and it
> > depends on the PMD implementation.
> >
> > The structure of `rte_hairpin_peer_info` is only a declaration and
> the
> > actual members will be defined in each PMD when being used.
> >
> > Signed-off-by: Bing Zhao 
> > ---
> >  lib/librte_ethdev/rte_ethdev.c   |  55 
> >  lib/librte_ethdev/rte_ethdev_driver.h| 108
> > +++
> >  lib/librte_ethdev/rte_ethdev_version.map |   3 +
> >  3 files changed, 166 insertions(+)
> >
> > diff --git a/lib/librte_ethdev/rte_ethdev.c
> > b/lib/librte_ethdev/rte_ethdev.c index 72f567b..4bfc26e 100644
> > --- a/lib/librte_ethdev/rte_ethdev.c
> > +++ b/lib/librte_ethdev/rte_ethdev.c
> > @@ -5515,6 +5515,61 @@ handle_port_link_status(const char *cmd
> > __rte_unused,
> > return 0;
> >  }
> >
> > +int
> > +rte_eth_hairpin_queue_peer_update(uint16_t peer_port, uint16_t
> > peer_queue,
> > + struct rte_hairpin_peer_info *cur_info,
> > + struct rte_hairpin_peer_info *peer_info,
> > + bool direction)
> > +{
> > +   struct rte_eth_dev *dev;
> > +
> > +   /* Current queue information is not mandatory. */
> > +   if (peer_info == NULL)
> > +   return -EINVAL;
> > +
> > +   /* No need to check the validity again. */
> > +   dev = &rte_eth_devices[peer_port];
> > +   RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops-
> > >hairpin_queue_peer_update,
> > +   -ENOTSUP);
> > +
> > +   return (*dev->dev_ops->hairpin_queue_peer_update)(dev,
> > peer_queue,
> > +   cur_info, peer_info, direction); }
> > +
> > +int
> > +rte_eth_hairpin_queue_peer_bind(uint16_t cur_port, uint16_t
> cur_queue,
> > +   struct rte_hairpin_peer_info *peer_info,
> > +   bool direction)
> > +{
> > +   struct rte_eth_dev *dev;
> > +
> > +   if (peer_info == NULL)
> > +   return -EINVAL;
> > +
> > +   /* No need to check the validity again. */
> > +   dev = &rte_eth_devices[cur_port];
> > +   RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops-
> > >hairpin_queue_peer_bind,
> > +   -ENOTSUP);
> > +
> > +   return (*dev->dev_ops->hairpin_queue_peer_bind)(dev, cur_queue,
> > +   peer_info, direction);
> > +}
> > +
> > +int
> > +rte_eth_hairpin_queue_peer_unbind(uint16_t cur_port, uint16_t
> cur_queue,
> > + bool direction)
> > +{
> > +   struct rte_eth_dev *dev;
> > +
> > +   /* No need to check the validity again. */
> > +   dev = &rte_eth_devices[cur_port];
> > +   RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops-
> > >hairpin_queue_peer_bind,
> > +   -ENOTSUP);
> > +
> > +   return (*dev->dev_ops->hairpin_queue_peer_unbind)(dev,
> cur_queue,
> > + direction);
> > +}
> > +
> >  RTE_LOG_REGISTER(rte_eth_dev_logtype, lib.ethdev, INFO);
> >
> >  RTE_INIT(ethdev_init_telemetry)
> > diff --git a/lib/librte_ethdev/rte_ethdev_driver.h
> > b/lib/librte_ethdev/rte_ethdev_driver.h
> > index 9104

Re: [dpdk-dev] [PATCH 4/4] app/testpmd: change hairpin queues setup

2020-10-07 Thread Bing Zhao
Hi Ori,
Thanks for your comments.
I will update the commit logs as well as the documents.

BR. Bing

> -Original Message-
> From: Ori Kam 
> Sent: Sunday, October 4, 2020 5:40 PM
> To: Bing Zhao ; NBU-Contact-Thomas Monjalon
> ; ferruh.yi...@intel.com;
> arybche...@solarflare.com; m...@ashroe.eu; nhor...@tuxdriver.com;
> bernard.iremon...@intel.com; beilei.x...@intel.com;
> wenzhuo...@intel.com
> Cc: dev@dpdk.org
> Subject: RE: [PATCH 4/4] app/testpmd: change hairpin queues setup
> 
> Hi Bing,
> 
> PSB,
> 
> Best,
> Ori
> 
> > -Original Message-
> > From: Bing Zhao 
> > Sent: Thursday, October 1, 2020 3:26 AM
> > Subject: [PATCH 4/4] app/testpmd: change hairpin queues setup
> >
> > A new parameter `hairpin-mode` is introduced to the testpmd
> command
> > line. Bitmask value is used to provide more flexible configuration.
> >
> > Bit 0 in the LSB indicates the hairpin will use the loop mode. The
> > previous port RX queue will be connected to the current port TX
> queue.
> > Bit 1 in the LSB indicates the hairpin will use pair port mode.
> The
> > even index port will be paired with the next odd index port. If
> the
> > total number of probed port is odd, then the last one will be
> paired
> > to itself.
> > If this byte is zero, then each port will be paired to itself.
> > Bit 0 takes a higher priority in the checking.
> >
> > Bit 4 in the second bytes indicate if the hairpin will use
> explicit TX
> > flow mode.
> >
> I think some basic example will be great here.
> 
> > If not set, default value zero will be used and the behavior will
> try
> > to get align with the previous single port mode. If the ports
> belong
> > to different vendors' NICs, it is suggested to use the `self`
> hairpin
> > mode only.
> >
> > Since hairpin configures the hardware resources, the port mask of
> > packets forwarding engine will not be used here.
> >
> > Signed-off-by: Bing Zhao 
> > ---
> >  app/test-pmd/parameters.c | 15 +++
> >  app/test-pmd/testpmd.c| 68
> > ---
> >  app/test-pmd/testpmd.h|  2 ++
> >  3 files changed, 81 insertions(+), 4 deletions(-)
> >
> > diff --git a/app/test-pmd/parameters.c b/app/test-pmd/parameters.c
> > index 1ead595..991029d 100644
> > --- a/app/test-pmd/parameters.c
> > +++ b/app/test-pmd/parameters.c
> > @@ -221,6 +221,9 @@ usage(char* progname)
> >"enabled\n");
> > printf("  --record-core-cycles: enable measurement of CPU
> cycles.\n");
> > printf("  --record-burst-stats: enable display of RX and TX
> > bursts.\n");
> > +   printf("  --hairpin-mode=0xXX: bitmask set the hairpin port
> mode.\n "
> > +  "0x10 - explicit tx rule, 0x02 - hairpin ports
> paired\n"
> > +  "0x01 - hairpin ports loop, 0x00 - hairpin port
> self\n");
> >  }
> >
> >  #ifdef RTE_LIBRTE_CMDLINE
> > @@ -644,6 +647,7 @@ launch_args_parse(int argc, char** argv)
> > { "rxd",1, 0, 0 },
> > { "txd",1, 0, 0 },
> > { "hairpinq",   1, 0, 0 },
> > +   { "hairpin-mode",   1, 0, 0 },
> > { "burst",  1, 0, 0 },
> > { "mbcache",1, 0, 0 },
> > { "txpt",   1, 0, 0 },
> > @@ -,6 +1115,17 @@ launch_args_parse(int argc, char** argv)
> > rte_exit(EXIT_FAILURE, "Either rx or tx
> queues should "
> > "be non-zero\n");
> > }
> > +   if (!strcmp(lgopts[opt_idx].name, "hairpin-mode"))
> {
> > +   char *end = NULL;
> > +   unsigned int n;
> > +
> > +   errno = 0;
> > +   n = strtoul(optarg, &end, 0);
> > +   if (errno != 0 || end == optarg)
> > +   rte_exit(EXIT_FAILURE, "hairpin mode
> > invalid\n");
> > +   else
> > +   hairpin_mode = (uint16_t)n;
> > +   }
> > if (!strcmp(lgopts[opt_idx].name, "burst")) {
> > n = atoi(optarg);
> > if (n == 0) {
> > diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c index
> > fe6450c..c604045 100644
> > --- a/app/test-pmd/testpmd.c
> > +++ b/app/test-pmd/testpmd.c
> > @@ -367,6 +367,9 @@ bool setup_on_probe_event = true;
> >  /* Clear ptypes on port initialization. */  uint8_t clear_ptypes
> =
> > true;
> >
> > +/* Hairpin ports configuration mode. */ uint16_t hairpin_mode;
> > +
> >  /* Pretty printing of ethdev events */  static const char * const
> > eth_event_desc[] = {
> > [RTE_ETH_EVENT_UNKNOWN] = "unknown", @@ -2345,7 +2348,7 @@
> > port_is_started(portid_t port_id)
> >
> >  /* Configure the Rx and Tx hairpin queues for the selected port.
> */
> > static 

[dpdk-dev] [PATCH v3 0/2] support query of AGE action

2020-10-07 Thread Dekel Peled
This series includes change in ethdev to support a new query
response format, and change in testpmd CLI using it.

---
v2: enclose 2 separate patches in series.
(no change in patch contents). 
v3: rename struct rte_flow_query_age field for clarity.
---

Dekel Peled (2):
  ethdev: support query of AGE action
  app/testpmd: support query of AGE action

 app/test-pmd/config.c  | 12 
 doc/guides/prog_guide/rte_flow.rst | 17 +
 doc/guides/rel_notes/release_20_11.rst |  3 +++
 lib/librte_ethdev/rte_flow.h   | 14 ++
 4 files changed, 46 insertions(+)

-- 
1.8.3.1



Re: [dpdk-dev] [PATCH] doc: update pdump documentation

2020-10-07 Thread Pattan, Reshma



> -Original Message-
> From: dev  On Behalf Of Pattan, Reshma
> > -Original Message-
> > From: Pattan, Reshma 
> 
> Gentle remainder for review and ack on this .

Gentle remainder to ack and apply


[dpdk-dev] [PATCH v3 1/2] ethdev: support query of AGE action

2020-10-07 Thread Dekel Peled
Existing API supports AGE action to monitor the aging of a flow.
This patch implements RFC [1], introducing the response format for query
of an AGE action.
Application will be able to query the AGE action state.
The response will be returned in the format implemented here.

[1] https://mails.dpdk.org/archives/dev/2020-September/180061.html

Signed-off-by: Dekel Peled 
Acked-by: Matan Azrad 
---
 doc/guides/prog_guide/rte_flow.rst | 17 +
 doc/guides/rel_notes/release_20_11.rst |  3 +++
 lib/librte_ethdev/rte_flow.h   | 14 ++
 3 files changed, 34 insertions(+)

diff --git a/doc/guides/prog_guide/rte_flow.rst 
b/doc/guides/prog_guide/rte_flow.rst
index 119b128..341e5ed 100644
--- a/doc/guides/prog_guide/rte_flow.rst
+++ b/doc/guides/prog_guide/rte_flow.rst
@@ -2666,6 +2666,23 @@ timeout passed without any matching on the flow.
| ``context``  | user input flow context |
+--+-+
 
+Query structure to retrieve ageing status information of a
+shared AGE action, or a flow rule using the AGE action:
+
+.. _table_rte_flow_query_age:
+
+.. table:: AGE query
+
+   +-+-++
+   | Field   | I/O | Value  |
+   +=+=++
+   | ``aged``| out | Aging timeout expired  |
+   +-+-++
+   | ``last_hit_time_valid`` | out | ``sec_since_last_hit`` field is valid  |
+   +-+-++
+   | ``sec_since_last_hit``  | out | Seconds since last traffic hit |
+   +-+-++
+
 Negative types
 ~~
 
diff --git a/doc/guides/rel_notes/release_20_11.rst 
b/doc/guides/rel_notes/release_20_11.rst
index 0b2a370..7e093f7 100644
--- a/doc/guides/rel_notes/release_20_11.rst
+++ b/doc/guides/rel_notes/release_20_11.rst
@@ -191,6 +191,9 @@ API Changes
 
 * ethdev: ``rte_eth_rx_descriptor_done()`` API has been deprecated.
 
+* ethdev: Added struct ``rte_flow_query_age`` for use in response to query
+  of AGE action.
+
 * Renamed internal ethdev APIs:
 
   * ``_rte_eth_dev_callback_process()`` -> ``rte_eth_dev_callback_process()``
diff --git a/lib/librte_ethdev/rte_flow.h b/lib/librte_ethdev/rte_flow.h
index da8bfa5..129e4ab 100644
--- a/lib/librte_ethdev/rte_flow.h
+++ b/lib/librte_ethdev/rte_flow.h
@@ -2130,6 +2130,7 @@ enum rte_flow_action_type {
 * See struct rte_flow_action_age.
 * See function rte_flow_get_aged_flows
 * see enum RTE_ETH_EVENT_FLOW_AGED
+* See struct rte_flow_query_age
 */
RTE_FLOW_ACTION_TYPE_AGE,
 };
@@ -2194,6 +2195,19 @@ struct rte_flow_action_age {
 };
 
 /**
+ * RTE_FLOW_ACTION_TYPE_AGE (query)
+ *
+ * Query structure to retrieve the aging status information of a
+ * shared AGE action, or a flow rule using the AGE action.
+ */
+struct rte_flow_query_age {
+   uint32_t reserved:6; /**< Reserved, must be zero. */
+   uint32_t aged:1; /**< 1 if aging timeout expired, 0 otherwise. */
+   uint32_t last_hit_time_valid:1; /**< sec_since_last_hit value valid. */
+   uint32_t sec_since_last_hit:24; /**< Seconds since last traffic hit. */
+};
+
+/**
  * @warning
  * @b EXPERIMENTAL: this structure may change without prior notice
  *
-- 
1.8.3.1



[dpdk-dev] [PATCH v3 2/2] app/testpmd: support query of AGE action

2020-10-07 Thread Dekel Peled
Following ethdev update [1], this patch adds CLI support to query
information related to AGE action.

[1] https://mails.dpdk.org/archives/dev/2020-September/183699.html

Signed-off-by: Dekel Peled 
Acked-by: Matan Azrad 
---
 app/test-pmd/config.c | 12 
 1 file changed, 12 insertions(+)

diff --git a/app/test-pmd/config.c b/app/test-pmd/config.c
index 418ea6d..83ef332 100644
--- a/app/test-pmd/config.c
+++ b/app/test-pmd/config.c
@@ -1770,6 +1770,7 @@ void print_valid_ports(void)
union {
struct rte_flow_query_count count;
struct rte_flow_action_rss rss_conf;
+   struct rte_flow_query_age age;
} query;
int ret;
 
@@ -1792,6 +1793,7 @@ void print_valid_ports(void)
switch (action->type) {
case RTE_FLOW_ACTION_TYPE_COUNT:
case RTE_FLOW_ACTION_TYPE_RSS:
+   case RTE_FLOW_ACTION_TYPE_AGE:
break;
default:
printf("Cannot query action type %d (%s)\n",
@@ -1819,6 +1821,16 @@ void print_valid_ports(void)
case RTE_FLOW_ACTION_TYPE_RSS:
rss_config_display(&query.rss_conf);
break;
+   case RTE_FLOW_ACTION_TYPE_AGE:
+   printf("%s:\n"
+  " aged: %u\n"
+  " last_hit_time_valid: %u\n"
+  " sec_since_last_hit: %" PRIu32 "\n",
+  name,
+  query.age.aged,
+  query.age.last_hit_time_valid,
+  query.age.sec_since_last_hit);
+   break;
default:
printf("Cannot display result for action type %d (%s)\n",
   action->type, name);
-- 
1.8.3.1



Re: [dpdk-dev] [PATCH 1/4] ethdev: add hairpin bind and unbind APIs

2020-10-07 Thread Ori Kam
Hi Bing,

> -Original Message-
> From: Bing Zhao 
> Sent: Wednesday, October 7, 2020 2:22 PM
> Subject: RE: [PATCH 1/4] ethdev: add hairpin bind and unbind APIs
> 
> Hi Ori,
> 
> > -Original Message-
> > From: Ori Kam 
> > Sent: Sunday, October 4, 2020 5:20 PM
> > Subject: RE: [PATCH 1/4] ethdev: add hairpin bind and unbind APIs
> >
> > Hi Bing,
> >
> > PSB,
> >
> > Thanks,
> > Ori
> > > -Original Message-
> > > From: Bing Zhao 
> > > Sent: Thursday, October 1, 2020 3:26 AM
> > > Cc: dev@dpdk.org
> > > Subject: [PATCH 1/4] ethdev: add hairpin bind and unbind APIs
> > >
> > > In single port hairpin mode, all the hairpin TX and RX queues
> > belong
> > > to the same device. After the queues are set up properly, there is
> > no
> > > other dependency between the TX queue and its RX peer queue. The
> > > binding process that connected the TX and RX queues together from
> > > hardware level will be done automatically during the device start
> > > procedure. Everything required is configured and initialized
> > already
> > > for the binding process.
> > >
> > > But in two ports hairpin mode, there will be some cross-
> > dependences
> > > between two different ports. Usually, the ports will be
> > initialized
> > > serially by the main thread but not in parallel. The earlier port
> > will
> > > not be able to enable the bind if the following peer port is not
> > yet
> > > configured with HW resources. What's more, if one port is detached
> > /
> > > attached dynamically, it would introduce more trouble for the
> > hairpin
> > > binding.
> > >
> > > To overcome these, new APIs for binding and unbinding are added.
> > > During startup, only the hairpin TX and RX peer queues will be set
> > up.
> > > Nothing will be done when starting the device if the queues are
> > > without auto-bind attribute. Only after the required ports pair
> > > started, the `rte_eth_hairpin_bind()` API can be called to bind
> > the
> > > all TX queues of the egress port to the RX queues of the peer port.
> > > Then the connection between the egress and ingress ports pair will
> > be
> > > established.
> > >
> > > The `rte_eth_hairpin_unbind()` API could be used to disconnect the
> > > egress and the peer ingress ports. This should only be called
> > before
> > > the device is closed if needed. When doing the clean up, all the
> > > egress and ingress pairs related to a single port should be taken
> > into
> > > consideration.
> > >
> > > Signed-off-by: Bing Zhao 
> > > ---
> > >  lib/librte_ethdev/rte_ethdev.c   | 107
> > > +++
> > >  lib/librte_ethdev/rte_ethdev.h   |  51 +++
> > >  lib/librte_ethdev/rte_ethdev_driver.h|  52 +++
> > >  lib/librte_ethdev/rte_ethdev_version.map |   2 +
> > >  4 files changed, 212 insertions(+)
> > >
> > > diff --git a/lib/librte_ethdev/rte_ethdev.c
> > > b/lib/librte_ethdev/rte_ethdev.c index dfe5c1b..72f567b 100644
> > > --- a/lib/librte_ethdev/rte_ethdev.c
> > > +++ b/lib/librte_ethdev/rte_ethdev.c
> > > @@ -2175,6 +2175,113 @@ rte_eth_tx_hairpin_queue_setup(uint16_t
> > > port_id, uint16_t tx_queue_id,
> > >   return eth_err(port_id, ret);
> > >  }
> > >
> > > +int
> > > +rte_eth_hairpin_bind(uint16_t tx_port, uint16_t rx_port) {
> > > + struct rte_eth_dev *dev;
> > > + struct rte_eth_dev *rdev;
> > > + uint16_t p;
> > > + uint16_t rp;
> > > + int ret = 0;
> > > +
> > > + RTE_ETH_VALID_PORTID_OR_ERR_RET(tx_port, -EINVAL);
> > > + dev = &rte_eth_devices[tx_port];
> > > + if (!dev->data->dev_started) {
> > > + RTE_ETHDEV_LOG(ERR, "TX port %d is not started",
> > tx_port);
> > > + return -EBUSY;
> > > + }
> > > +
> > > + /*
> > > +  * If the all the ports probed belong to two or more separate
> > NICs, it
> > > +  * is recommended that each pair is bound independently but
> > not in the
> > > +  * loop to bind all ports.
> > > +  */
> >
> > I don't understand your comment.
> 
> I mean, in the following loops, if the application wants to bind one TX port 
> to all
> the other RX ports, it would not be supported between different PMDs or
> different NICs. Then the roll-back actions will be done and no hairpin port 
> peers
> will work successfully.
> 

O.K.

> >
> > > + if (rx_port == RTE_MAX_ETHPORTS) {
> >
> > I think maybe this should be done in the tx queue. Since if the bind
> > don't need some port why do we care if it is started?
> 
> Do you mean the checking in the RX ports loop below? At first, I intended to
> make sure this API would be called only after all the ports are started.
> But yes, there is no need if some port is not being used.
> 
I mean move the check to the tx pmd, if for some reason (peer device not 
started)
then it should fail.

> > So either add a new function to get all peer ports from the tx port,
> > or move this logic to the Target PMD.
> 
> There would be one more API to be introduced. To my understanding, it would
> be more appropriate if we add a new API here.
> B

Re: [dpdk-dev] [PATCH] net/af_xdp: use snprintf instead of strncpy

2020-10-07 Thread Ferruh Yigit

On 10/7/2020 11:28 AM, Bruce Richardson wrote:

On Wed, Oct 07, 2020 at 11:26:38AM +0100, Bruce Richardson wrote:

On Wed, Oct 07, 2020 at 11:51:31AM +0200, Olivier Matz wrote:

On Wed, Oct 07, 2020 at 10:40:32AM +0100, Ferruh Yigit wrote:

On 10/7/2020 10:01 AM, Ciara Loftus wrote:

strncpy may leave the destination buffer not NULL terminated so use
snprintf instead.


What do you think using 'strlcpy'?


Or even better, rte_strscpy()
https://git.dpdk.org/dpdk/commit/?id=b0236c7cf761


I think this is largely a matter of preference, and unless there is a good
reason not to, I tend towards strlcpy as the older and more common (till
now) interface. The main thing is just to use a function that will
guarantee dest is null-terminated here, and both strlcpy and strscpy meet
that criteria.


I'd also add that strlcpy is more likely to be recognised by tools like
coverity, compared to rte_strscpy which is DPDK-specific.



+1 to 'strlcpy'


Re: [dpdk-dev] [PATCH v3 1/2] ethdev: support query of AGE action

2020-10-07 Thread Ori Kam
Hi Dekel,

> -Original Message-
> From: Dekel Peled 
> Sent: Wednesday, October 7, 2020 2:38 PM
> Subject: [PATCH v3 1/2] ethdev: support query of AGE action
> 
> Existing API supports AGE action to monitor the aging of a flow.
> This patch implements RFC [1], introducing the response format for query
> of an AGE action.
> Application will be able to query the AGE action state.
> The response will be returned in the format implemented here.
> 
> [1]
> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fmails.dp
> dk.org%2Farchives%2Fdev%2F2020-
> September%2F180061.html&data=02%7C01%7Corika%40nvidia.com%7Cd
> 97faa2a57154e3ca5dd08d86ab58131%7C43083d15727340c1b7db39efd9ccc17
> a%7C0%7C0%7C637376675069240107&sdata=SufcP3eY5NHVPKneg3jx06S
> EGhTl45DCMeUual5H7n8%3D&reserved=0
> 
> Signed-off-by: Dekel Peled 
> Acked-by: Matan Azrad 
> ---
>  doc/guides/prog_guide/rte_flow.rst | 17 +
>  doc/guides/rel_notes/release_20_11.rst |  3 +++
>  lib/librte_ethdev/rte_flow.h   | 14 ++
>  3 files changed, 34 insertions(+)
> 
> diff --git a/doc/guides/prog_guide/rte_flow.rst
> b/doc/guides/prog_guide/rte_flow.rst
> index 119b128..341e5ed 100644
> --- a/doc/guides/prog_guide/rte_flow.rst
> +++ b/doc/guides/prog_guide/rte_flow.rst
> @@ -2666,6 +2666,23 @@ timeout passed without any matching on the flow.
> | ``context``  | user input flow context |
> +--+-+
> 
> +Query structure to retrieve ageing status information of a
> +shared AGE action, or a flow rule using the AGE action:
> +
> +.. _table_rte_flow_query_age:
> +
> +.. table:: AGE query
> +
> +   +-+-++
> +   | Field   | I/O | Value  |
> +
> +=+=+===
> =+
> +   | ``aged``| out | Aging timeout expired  |
> +   +-+-++
> +   | ``last_hit_time_valid`` | out | ``sec_since_last_hit`` field is valid  |
> +   +-+-++
> +   | ``sec_since_last_hit``  | out | Seconds since last traffic hit |
> +   +-+-++
> +

I think the last_hit_time_valid name should also be changed.

>  Negative types
>  ~~
> 
> diff --git a/doc/guides/rel_notes/release_20_11.rst
> b/doc/guides/rel_notes/release_20_11.rst
> index 0b2a370..7e093f7 100644
> --- a/doc/guides/rel_notes/release_20_11.rst
> +++ b/doc/guides/rel_notes/release_20_11.rst
> @@ -191,6 +191,9 @@ API Changes
> 
>  * ethdev: ``rte_eth_rx_descriptor_done()`` API has been deprecated.
> 
> +* ethdev: Added struct ``rte_flow_query_age`` for use in response to query
> +  of AGE action.
> +
>  * Renamed internal ethdev APIs:
> 
>* ``_rte_eth_dev_callback_process()`` -> ``rte_eth_dev_callback_process()``
> diff --git a/lib/librte_ethdev/rte_flow.h b/lib/librte_ethdev/rte_flow.h
> index da8bfa5..129e4ab 100644
> --- a/lib/librte_ethdev/rte_flow.h
> +++ b/lib/librte_ethdev/rte_flow.h
> @@ -2130,6 +2130,7 @@ enum rte_flow_action_type {
>* See struct rte_flow_action_age.
>* See function rte_flow_get_aged_flows
>* see enum RTE_ETH_EVENT_FLOW_AGED
> +  * See struct rte_flow_query_age
>*/
>   RTE_FLOW_ACTION_TYPE_AGE,
>  };
> @@ -2194,6 +2195,19 @@ struct rte_flow_action_age {
>  };
> 
>  /**
> + * RTE_FLOW_ACTION_TYPE_AGE (query)
> + *
> + * Query structure to retrieve the aging status information of a
> + * shared AGE action, or a flow rule using the AGE action.
> + */
> +struct rte_flow_query_age {
> + uint32_t reserved:6; /**< Reserved, must be zero. */
> + uint32_t aged:1; /**< 1 if aging timeout expired, 0 otherwise. */
> + uint32_t last_hit_time_valid:1; /**< sec_since_last_hit value valid. */
> + uint32_t sec_since_last_hit:24; /**< Seconds since last traffic hit. */
> +};
> +
> +/**
>   * @warning
>   * @b EXPERIMENTAL: this structure may change without prior notice
>   *
> --
> 1.8.3.1



Re: [dpdk-dev] [PATCH] ethdev: add VLAN attributes to ETH and VLAN items

2020-10-07 Thread Maxime Leroy
On Mon, Oct 5, 2020 at 11:37 AM Dekel Peled  wrote:
>
> Thanks, PSB.
>
> > -Original Message-
> > From: Maxime Leroy 
> > Sent: Friday, October 2, 2020 3:39 PM
> > To: Dekel Peled 
> > Cc: Ori Kam ; NBU-Contact-Thomas Monjalon
> > ; ferruh.yi...@intel.com;
> > arybche...@solarflare.com; dev@dpdk.org; Dekel Peled
> > 
> > Subject: Re: [PATCH] ethdev: add VLAN attributes to ETH and VLAN items
> >
> > Hi Dekel,
> >
> > On Thu, Oct 1, 2020 at 8:49 PM Dekel Peled  wrote:
> > >
> > > From: Dekel Peled 
> > >
> > > This patch implements the change proposes in RFC [1], adding dedicated
> > > fields to ETH and VLAN items structs, to clearly define the required
> > > characteristic of a packet, and enable precise match criteria.
> > >
> > > [1]
> > >
> > https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fmail
> > > s.dpdk.org%2Farchives%2Fdev%2F2020-
> > August%2F177536.html&data=02%7C
> > >
> > 01%7Cdekelp%40nvidia.com%7Cc12bfd3f662747f7b7c408d866d0376f%7C430
> > 83d15
> > >
> > 727340c1b7db39efd9ccc17a%7C0%7C0%7C637372391779092411&sdata=
> > yeOKvc
> > > 4r0dL09UZ65%2Bt4qWJqJmcp21VyPSK%2FhbablKI%3D&reserved=0
> > >
> > > Signed-off-by: Dekel Peled 
> > > ---
> > >  doc/guides/rel_notes/release_20_11.rst |  7 +++
> > >  lib/librte_ethdev/rte_flow.h   | 16 +---
> > >  2 files changed, 20 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/doc/guides/rel_notes/release_20_11.rst
> > > b/doc/guides/rel_notes/release_20_11.rst
> > > index 7f9d0dd..199c60b 100644
> > > --- a/doc/guides/rel_notes/release_20_11.rst
> > > +++ b/doc/guides/rel_notes/release_20_11.rst
> > > @@ -173,6 +173,13 @@ API Changes
> > >* ``_rte_eth_dev_callback_process()`` ->
> > ``rte_eth_dev_callback_process()``
> > >* ``_rte_eth_dev_reset`` -> ``rte_eth_dev_internal_reset()``
> > >
> > > +* ethdev: Added new field ``vlan_exist`` to structure
> > > +``rte_flow_item_eth``,
> > > +  indicating that at least one VLAN exists in the packet header.
> > > +
> > > +* ethdev: Added new field ``more_vlans_exist`` to structure
> > > +  ``rte_flow_item_vlan``, indicating that at least one more VLAN
> > > +exists in
> > > +  packet header, following this VLAN.
> > > +
> > >  * rawdev: Added a structure size parameter to the functions
> > >``rte_rawdev_queue_setup()``, ``rte_rawdev_queue_conf_get()``,
> > >``rte_rawdev_info_get()`` and ``rte_rawdev_configure()``, diff
> > > --git a/lib/librte_ethdev/rte_flow.h b/lib/librte_ethdev/rte_flow.h
> > > index da8bfa5..39d04ef 100644
> > > --- a/lib/librte_ethdev/rte_flow.h
> > > +++ b/lib/librte_ethdev/rte_flow.h
> > > @@ -723,14 +723,18 @@ struct rte_flow_item_raw {
> > >   * If the @p type field contains a TPID value, then only tagged packets 
> > > with
> > the
> > >   * specified TPID will match the pattern.
> > >   * Otherwise, only untagged packets will match the pattern.
> > > - * If the @p ETH item is the only item in the pattern, and the @p
> > > type field
> > > - * is not specified, then both tagged and untagged packets will match
> > > the
> > > - * pattern.
> > > + * The field @p vlan_exist can be used to match specific packet
> > > + types, instead
> > > + * of using the @p type field.
> > > + * This can be used to match any type of tagged packets.
> > > + * If the @p type and @p vlan_exist fields are not specified, then
> > > + both tagged
> > > + * and untagged packets will match the pattern.
> > >   */
> > >  struct rte_flow_item_eth {
> > > struct rte_ether_addr dst; /**< Destination MAC. */
> > > struct rte_ether_addr src; /**< Source MAC. */
> > > rte_be16_t type; /**< EtherType or TPID. */
> > > +   uint32_t vlan_exist:1; /**< At least one VLAN exist in header. */
> > > +   uint32_t reserved:31; /**< Reserved, must be zero. */
> > >  };
> >
> > To resume:
> > - type and vlan_exists fields not specified:  tag and untagged matched
> > - with vlan_exists, match only tag or untagged
> > - with type matching specific ethernet type
> > - vlan_exists and type should not setted at the same time ?
>
> PMD should validate they don't conflict.
>
> >
> > With this new specification, I think you address all the use cases.
> > That's great !
> >
>
> Glad to see we agree on this.
>
> > >
> > >  /** Default mask for RTE_FLOW_ITEM_TYPE_ETH. */ @@ -752,10 +756,16
> > @@
> > > struct rte_flow_item_eth {
> > >   * the preceding pattern item.
> > >   * If a @p VLAN item is present in the pattern, then only tagged packets
> > will
> > >   * match the pattern.
> > > + * The field @p more_vlans_exist can be used to match specific packet
> > > + types,
> > > + * instead of using the @p inner_type field.
> > > + * This can be used to match any type of tagged packets.
> > >   */
> >
> > Could you please specify what the expected behavior when inner_type and
> > more_vlans_exist are not specified .
> > What is the default behavior ?
> >
>
> You wrote above for the eth item, if the user didn't specify it means 
> don't-care.
Could

Re: [dpdk-dev] [PATCH 0/2] baseband: fix drivers API

2020-10-07 Thread Akhil Goyal
> 07/10/2020 13:06, Thomas Monjalon:
> > 07/10/2020 12:52, Maxime Coquelin:
> > > On 10/7/20 12:29 PM, Thomas Monjalon wrote:
> > > > 07/10/2020 12:12, Maxime Coquelin:
> > > >> On 10/7/20 12:09 PM, Thomas Monjalon wrote:
> > > >>> 07/10/2020 10:22, Maxime Coquelin:
> > >  On 10/6/20 9:42 PM, Akhil Goyal wrote:
> > > >>
> > > >> The series prefixes drivers APIs with rte__ in
> > > >> order to avoid namespace pollution.
> > > >>
> > > >> These APIs are experimental, so no need to follow the
> > > >> deprecation process.
> > > >>
> > > > Added Fixes commit in patch description.
> > > 
> > >  Thanks for applying it to your tree.
> > > 
> > >  I did not add Fixes tag and Cc'ed sta...@dpdk.org on purpose,
> > >  because I thought it is was not a good idea to backport API changes,
> > >  even if this is experimental it might be annoying for the user.
> > > 
> > >  Thomas, do you confirm?
> > > >>>
> > > >>> Absolutely: API must not change in a stable branch.
> > > >>>
> > > >>> If an API is changed, it must be in the release notes.
> > > >>
> > > >> Ok, even for experimental APIs? I thought not.
> > > >
> > > > Yes, experimental means it can change in the main branch
> > > > without prior notice. But it must be noted when it's changed.
> > >
> > > Ok, do you want me to send add-on patches that you will squash when
> > > pulling Akhil's branch?
> >
> > Yes please
> 
> Better: Akhil can squash in his tree and remove the Cc: stable tag.

Sure will do that.


Re: [dpdk-dev] [PATCH] cryptodev: revert ABI compatibility for ChaCha20-Poly1305

2020-10-07 Thread David Marchand
On Wed, Oct 7, 2020 at 12:41 PM Doherty, Declan
 wrote:
> >> @@ -789,33 +781,9 @@ rte_cryptodev_stats_reset(uint8_t dev_id);
> >>* the last valid element has it's op field set to
> >>* RTE_CRYPTO_OP_TYPE_UNDEFINED.
> >>*/
> >> -
> >> -void
> >> +extern void
> > Nit: no need for extern.
> Hey David, I think the cryptodev API consistently uses extern on nearly
> all it's function declarations. I'd proposed we do a separate patchset
> which removes extern on all function declarations to make it more
> consistent with the rest of DPDKs libraries.

Ok for me.


-- 
David Marchand



Re: [dpdk-dev] [PATCH] ethdev: add VLAN attributes to ETH and VLAN items

2020-10-07 Thread Ori Kam
Sorry for jumping late,


> -Original Message-
> From: Maxime Leroy 
> Sent: Wednesday, October 7, 2020 2:52 PM
> Subject: Re: [PATCH] ethdev: add VLAN attributes to ETH and VLAN items
> 
> On Mon, Oct 5, 2020 at 11:37 AM Dekel Peled  wrote:
> >
> > Thanks, PSB.
> >
> > > -Original Message-
> > > From: Maxime Leroy 
> > > Sent: Friday, October 2, 2020 3:39 PM
> > > To: Dekel Peled 
> > > Cc: Ori Kam ; NBU-Contact-Thomas Monjalon
> > > ; ferruh.yi...@intel.com;
> > > arybche...@solarflare.com; dev@dpdk.org; Dekel Peled
> > > 
> > > Subject: Re: [PATCH] ethdev: add VLAN attributes to ETH and VLAN items
> > >
> > > Hi Dekel,
> > >
> > > On Thu, Oct 1, 2020 at 8:49 PM Dekel Peled  wrote:
> > > >
> > > > From: Dekel Peled 
> > > >
> > > > This patch implements the change proposes in RFC [1], adding dedicated
> > > > fields to ETH and VLAN items structs, to clearly define the required
> > > > characteristic of a packet, and enable precise match criteria.
> > > >
> > > > [1]
> > > >
> > > https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fmail
> > > > s.dpdk.org%2Farchives%2Fdev%2F2020-
> > > August%2F177536.html&data=02%7C
> > > >
> > >
> 01%7Cdekelp%40nvidia.com%7Cc12bfd3f662747f7b7c408d866d0376f%7C430
> > > 83d15
> > > >
> > >
> 727340c1b7db39efd9ccc17a%7C0%7C0%7C637372391779092411&sdata=
> > > yeOKvc
> > > > 4r0dL09UZ65%2Bt4qWJqJmcp21VyPSK%2FhbablKI%3D&reserved=0
> > > >
> > > > Signed-off-by: Dekel Peled 
> > > > ---
> > > >  doc/guides/rel_notes/release_20_11.rst |  7 +++
> > > >  lib/librte_ethdev/rte_flow.h   | 16 +---
> > > >  2 files changed, 20 insertions(+), 3 deletions(-)
> > > >
> > > > diff --git a/doc/guides/rel_notes/release_20_11.rst
> > > > b/doc/guides/rel_notes/release_20_11.rst
> > > > index 7f9d0dd..199c60b 100644
> > > > --- a/doc/guides/rel_notes/release_20_11.rst
> > > > +++ b/doc/guides/rel_notes/release_20_11.rst
> > > > @@ -173,6 +173,13 @@ API Changes
> > > >* ``_rte_eth_dev_callback_process()`` ->
> > > ``rte_eth_dev_callback_process()``
> > > >* ``_rte_eth_dev_reset`` -> ``rte_eth_dev_internal_reset()``
> > > >
> > > > +* ethdev: Added new field ``vlan_exist`` to structure
> > > > +``rte_flow_item_eth``,
> > > > +  indicating that at least one VLAN exists in the packet header.
> > > > +
> > > > +* ethdev: Added new field ``more_vlans_exist`` to structure
> > > > +  ``rte_flow_item_vlan``, indicating that at least one more VLAN
> > > > +exists in
> > > > +  packet header, following this VLAN.
> > > > +
> > > >  * rawdev: Added a structure size parameter to the functions
> > > >``rte_rawdev_queue_setup()``, ``rte_rawdev_queue_conf_get()``,
> > > >``rte_rawdev_info_get()`` and ``rte_rawdev_configure()``, diff
> > > > --git a/lib/librte_ethdev/rte_flow.h b/lib/librte_ethdev/rte_flow.h
> > > > index da8bfa5..39d04ef 100644
> > > > --- a/lib/librte_ethdev/rte_flow.h
> > > > +++ b/lib/librte_ethdev/rte_flow.h
> > > > @@ -723,14 +723,18 @@ struct rte_flow_item_raw {
> > > >   * If the @p type field contains a TPID value, then only tagged packets
> with
> > > the
> > > >   * specified TPID will match the pattern.
> > > >   * Otherwise, only untagged packets will match the pattern.
> > > > - * If the @p ETH item is the only item in the pattern, and the @p
> > > > type field
> > > > - * is not specified, then both tagged and untagged packets will match
> > > > the
> > > > - * pattern.
> > > > + * The field @p vlan_exist can be used to match specific packet
> > > > + types, instead
> > > > + * of using the @p type field.
> > > > + * This can be used to match any type of tagged packets.
> > > > + * If the @p type and @p vlan_exist fields are not specified, then
> > > > + both tagged
> > > > + * and untagged packets will match the pattern.
> > > >   */
> > > >  struct rte_flow_item_eth {
> > > > struct rte_ether_addr dst; /**< Destination MAC. */
> > > > struct rte_ether_addr src; /**< Source MAC. */
> > > > rte_be16_t type; /**< EtherType or TPID. */
> > > > +   uint32_t vlan_exist:1; /**< At least one VLAN exist in header. 
> > > > */
> > > > +   uint32_t reserved:31; /**< Reserved, must be zero. */
> > > >  };
> > >
> > > To resume:
> > > - type and vlan_exists fields not specified:  tag and untagged matched
> > > - with vlan_exists, match only tag or untagged
> > > - with type matching specific ethernet type
> > > - vlan_exists and type should not setted at the same time ?
> >
> > PMD should validate they don't conflict.
> >
> > >
> > > With this new specification, I think you address all the use cases.
> > > That's great !
> > >
> >
> > Glad to see we agree on this.
> >
> > > >
> > > >  /** Default mask for RTE_FLOW_ITEM_TYPE_ETH. */ @@ -752,10
> +756,16
> > > @@
> > > > struct rte_flow_item_eth {
> > > >   * the preceding pattern item.
> > > >   * If a @p VLAN item is present in the pattern, then only tagged 
> > > > packets
> > > will
> > > >   * match the pattern.
> > > > + * The field 

[dpdk-dev] [PATCH 0/2] Add missing API change in release note

2020-10-07 Thread Maxime Coquelin
Add missing experimental API change in release note.

Maxime Coquelin (2):
  baseband/fpga_5gnr_fec: add API change in release note
  baseband/fpga_lte_fec: add API change in release note

 doc/guides/rel_notes/release_20_11.rst | 8 
 1 file changed, 8 insertions(+)

-- 
2.26.2



[dpdk-dev] [PATCH 2/2] baseband/fpga_lte_fec: add API change in release note

2020-10-07 Thread Maxime Coquelin
Fixes: 717edebcd394 ("baseband/fpga_lte_fec: fix API naming")

Signed-off-by: Maxime Coquelin 
---
 doc/guides/rel_notes/release_20_11.rst | 4 
 1 file changed, 4 insertions(+)

diff --git a/doc/guides/rel_notes/release_20_11.rst 
b/doc/guides/rel_notes/release_20_11.rst
index b3dd9a3646..98ae729a27 100644
--- a/doc/guides/rel_notes/release_20_11.rst
+++ b/doc/guides/rel_notes/release_20_11.rst
@@ -233,6 +233,10 @@ API Changes
   ``rte_fpga_5gnr_fec_configure`` and structure ``fpga_5gnr_fec_conf`` to
   ``rte_fpga_5gnr_fec_conf``.
 
+* baseband/fpga_lte_fec: Renamed function ``fpga_lte_fec_configure`` to
+  ``rte_fpga_lte_fec_configure`` and structure ``fpga_lte_fec_conf`` to
+  ``rte_fpga_lte_fec_conf``.
+
 
 ABI Changes
 ---
-- 
2.26.2



[dpdk-dev] [PATCH 1/2] baseband/fpga_5gnr_fec: add API change in release note

2020-10-07 Thread Maxime Coquelin
Fixes: 7fd60723065e ("baseband/fpga_5gnr_fec: fix API naming")

Signed-off-by: Maxime Coquelin 
---
 doc/guides/rel_notes/release_20_11.rst | 4 
 1 file changed, 4 insertions(+)

diff --git a/doc/guides/rel_notes/release_20_11.rst 
b/doc/guides/rel_notes/release_20_11.rst
index c0a3d76005..b3dd9a3646 100644
--- a/doc/guides/rel_notes/release_20_11.rst
+++ b/doc/guides/rel_notes/release_20_11.rst
@@ -229,6 +229,10 @@ API Changes
 
 * ipsec: ``RTE_SATP_LOG2_NUM`` has been dropped from ``enum``
 
+* baseband/fpga_5gnr_fec: Renamed function ``fpga_5gnr_fec_configure`` to
+  ``rte_fpga_5gnr_fec_configure`` and structure ``fpga_5gnr_fec_conf`` to
+  ``rte_fpga_5gnr_fec_conf``.
+
 
 ABI Changes
 ---
-- 
2.26.2



Re: [dpdk-dev] [PATCH 0/2] baseband: fix drivers API

2020-10-07 Thread Maxime Coquelin



On 10/7/20 2:05 PM, Akhil Goyal wrote:
>> 07/10/2020 13:06, Thomas Monjalon:
>>> 07/10/2020 12:52, Maxime Coquelin:
 On 10/7/20 12:29 PM, Thomas Monjalon wrote:
> 07/10/2020 12:12, Maxime Coquelin:
>> On 10/7/20 12:09 PM, Thomas Monjalon wrote:
>>> 07/10/2020 10:22, Maxime Coquelin:
 On 10/6/20 9:42 PM, Akhil Goyal wrote:
>>
>> The series prefixes drivers APIs with rte__ in
>> order to avoid namespace pollution.
>>
>> These APIs are experimental, so no need to follow the
>> deprecation process.
>>
> Added Fixes commit in patch description.

 Thanks for applying it to your tree.

 I did not add Fixes tag and Cc'ed sta...@dpdk.org on purpose,
 because I thought it is was not a good idea to backport API changes,
 even if this is experimental it might be annoying for the user.

 Thomas, do you confirm?
>>>
>>> Absolutely: API must not change in a stable branch.
>>>
>>> If an API is changed, it must be in the release notes.
>>
>> Ok, even for experimental APIs? I thought not.
>
> Yes, experimental means it can change in the main branch
> without prior notice. But it must be noted when it's changed.

 Ok, do you want me to send add-on patches that you will squash when
 pulling Akhil's branch?
>>>
>>> Yes please
>>
>> Better: Akhil can squash in his tree and remove the Cc: stable tag.
> 
> Sure will do that.
> 

Thanks Akhil for handling it, I just posted the release note patch.

Maxime



[dpdk-dev] 19.11 ABI changes

2020-10-07 Thread Денис Коновалов
Hello!
In 17.05 I get thread_id like that:
worker_threads.push_back(worker);
pthread_setname_np(lcore_config[lcore_id].thread_id, worker_name.c_str());
But in 19.11 lcore_config now private and I don't see any method in
rte_lcore.h for getting thread_id. How can I do that? Thank you.


Re: [dpdk-dev] [EXT] Re: [PATCH 1/1] eal: increase TRACE CTF SIZE to recommended size

2020-10-07 Thread David Marchand
On Wed, Oct 7, 2020 at 12:07 PM Sunil Kumar Kori  wrote:
>> it is on a slow path. 448 is OK.
> >
> >Ah yes, this is for the ctf description.
> >Could it be changed to rely on dynamic allocations and we simply remove this
> >limit?
> Changing it to dynamic allocation is difficult because if we do this then 
> every time memory is to reallocated.
> So IMO, It is okay to keep it static with enough size.

Ok.
I will take this as a ack from both Jerin and you and push with the
suggested commitlog.


-- 
David Marchand



Re: [dpdk-dev] [PATCH v4 2/4] ethdev: tunnel offload model

2020-10-07 Thread Gregory Etelson
Hello Harsha,

> -Original Message-

[snip]
> 
> Tunnel vport is an internal construct used by one specific
> application: OVS. So, shouldn't the rte APIs also be application
> agnostic apart from being vendor agnostic ? For OVS, the match fields
> in the existing datapath flow rules contain enough information to
> identify the tunnel type.

Tunnel offload model was inspired by OVS vport, but it is not part of the 
existing API.
It looks like the API documentation should not use that term to avoid confusion.

[snip]

[snip]
> 
> Wouldn't it be better if the APIs do not refer to vports and avoid
> percolating it down to the PMD ? My point here is to avoid bringing in
> the knowledge of an application specific virtual object (vport) to the
> PMD.
> 

As I have mentioned above, the API description should not mention vport.
I'll post updated documents. 

> Here's some other issues that I see with the helper APIs and
> vendor-specific variable actions.
> 1) The application needs some kind of validation (or understanding) of
> the actions returned by the PMD. The application can't just blindly
> use the actions specified by the PMD. That is, the decision to pick
> the set of actions can't be left entirely to the PMD.
> 2) The application needs to learn a PMD-specific way of action
> processing for each vendor. For example, how should the application
> handle flow-miss, given a different set of actions between two vendors
> (if one vendor has already popped the tunnel header while the other
> one hasn't).
> 3) The end-users/customers won't have a common interface (as in,
> consistent actions) to perform tunnel decap action. This becomes a
> manageability/maintenance issue for the application while working with
> different vendors.
> 
> IMO, the API shouldn't expect the PMD to understand the notion of
> vport. The goal here is to offload a flow rule to decap the tunnel
> header and forward the packet to a HW endpoint.  The problem is that
> we don't have a way to express the "tnl_pop" datapath action to the HW
> (decap flow #1, in the context of br-phy in OVS-DPDK) and also we may
> not want the HW to really pop the tunnel header at that stage. If this
> cannot be expressed with existing rte action types, maybe we should
> introduce a new action that clearly defines what is expected to the
> PMD.

Tunnel Offload API provides a common interface for all HW vendors:
Rule #1: define a tunneled traffic and steer / group traffic related to
that tunnel
Rule #2: within the tunnel selection, run matchers on all packet headers,
outer and inner, and perform actions on inner headers in case of a match.
For the rule #1 application provides tunnel matchers and traffic selection 
actions
and for rule #2 application provides full header matchers and actions for inner 
parts.
The rest is supplied by PMD according to HW and rule type. Application does not
need to understand exact PMD elements implementation.
Helper return value notifies application whether it received requested PMD 
elements or not.
If helper completed successfully, it means that application received required 
elements
and can complete flow rule compilation.
As the result, a packet will be fully offloaded or returned to application with 
enough
information to continue processing in SW.

[snip]

[snip]

> > Miss handling
> > -
> > Packets going through multiple rte_flow groups are exposed to hw
> > misses due to partial packet processing. In such cases, the software
> > should continue the packet's processing from the point where the
> > hardware missed.
> 
> Whether the packet goes through multiple groups or not for tunnel
> decap processing, should be left to the PMD/HW.  These assumptions
> shouldn't be built into the APIs. The encapsulated packet (i,e with
> outer headers) should be provided to the application, rather than
> making SW understand that there was a miss in stage-1, or stage-n in
> HW. That is, HW either processes it entirely, or punts the whole
> packet to SW if there's a miss. And the packet should take the normal
> processing path in SW (no action offload).
> 
> Thanks,
> -Harsha

The packet is provided to the application via the standard rte_eth_rx_burst API.
Additional information about the HW packet processing state is provided to
the application by the suggested rte_flow_get_restore_info API. It is up to the
application if to use such provided info, or even if to call this API at all.

[snip]

Regards,
Gregory


Re: [dpdk-dev] [PATCH 2/2] trace: add size_t as a generic trace point

2020-10-07 Thread Thomas Monjalon
> >From: Pavan Nikhilesh 
> >
> >Add size_t as a generic trace point. Also, update
> >test_generic_trace_point() to validate size_t emitter.
> >
> >Signed-off-by: Pavan Nikhilesh 
> 
> Acked-by: Sunil Kumar Kori 

Applied, thanks




Re: [dpdk-dev] [PATCH v3 1/1] vfio: modify spapr iommu support to use static window sizing

2020-10-07 Thread Thomas Monjalon
Hi David,
Do you plan to send a v4?

17/09/2020 13:13, Burakov, Anatoly:
> On 10-Aug-20 10:07 PM, David Christensen wrote:
> > The SPAPR IOMMU requires that a DMA window size be defined before memory
> > can be mapped for DMA. Current code dynamically modifies the DMA window
> > size in response to every new memory allocation which is potentially
> > dangerous because all existing mappings need to be unmapped/remapped in
> > order to resize the DMA window, leaving hardware holding IOVA addresses
> > that are temporarily unmapped.  The new SPAPR code statically assigns
> > the DMA window size on first use, using the largest physical memory
> > memory address when IOVA=PA and the highest existing memseg virtual
> > address when IOVA=VA.
> > 
> > Signed-off-by: David Christensen 
> > ---
> 
> 
> 
> > +struct spapr_size_walk_param {
> > +   uint64_t max_va;
> > +   uint64_t page_sz;
> > +   int external;
> > +};
> > +
> > +/*
> > + * In order to set the DMA window size required for the SPAPR IOMMU
> > + * we need to walk the existing virtual memory allocations as well as
> > + * find the hugepage size used.
> > + */
> >   static int
> > -vfio_spapr_unmap_walk(const struct rte_memseg_list *msl,
> > -   const struct rte_memseg *ms, void *arg)
> > +vfio_spapr_size_walk(const struct rte_memseg_list *msl, void *arg)
> >   {
> > -   int *vfio_container_fd = arg;
> > +   struct spapr_size_walk_param *param = arg;
> > +   uint64_t max = (uint64_t) msl->base_va + (uint64_t) msl->len;
> >   
> > -   /* skip external memory that isn't a heap */
> > -   if (msl->external && !msl->heap)
> > -   return 0;
> > +   if (msl->external) {
> > +   param->external++;
> > +   if (!msl->heap)
> > +   return 0;
> > +   }
> 
> It would be nice to have some comments in the code explaining what we're 
> skipping and why.
> 
> Also, seems that you're using param->external as bool? This is a 
> non-public API so using stdbool is not an issue here, perhaps replace it 
> with bool param->has_external?
> 
> >   
> > -   /* skip any segments with invalid IOVA addresses */
> > -   if (ms->iova == RTE_BAD_IOVA)
> > -   return 0;
> > +   if (max > param->max_va) {
> > +   param->page_sz = msl->page_sz;
> > +   param->max_va = max;
> > +   }
> >   
> > -   return vfio_spapr_dma_do_map(*vfio_container_fd, ms->addr_64, ms->iova,
> > -   ms->len, 0);
> > +   return 0;
> >   }
> >   
> > -struct spapr_walk_param {
> > -   uint64_t window_size;
> > -   uint64_t hugepage_sz;
> > -};
> > -
> > +/*
> > + * The SPAPRv2 IOMMU supports 2 DMA windows with starting
> > + * address at 0 or 1<<59.  By default, a DMA window is set
> > + * at address 0, 2GB long, with a 4KB page.  For DPDK we
> > + * must remove the default window and setup a new DMA window
> > + * based on the hugepage size and memory requirements of
> > + * the application before we can map memory for DMA.
> > + */
> >   static int
> > -vfio_spapr_window_size_walk(const struct rte_memseg_list *msl,
> > -   const struct rte_memseg *ms, void *arg)
> > +spapr_dma_win_size(void)
> >   {
> > -   struct spapr_walk_param *param = arg;
> > -   uint64_t max = ms->iova + ms->len;
> > +   struct spapr_size_walk_param param;
> >   
> > -   /* skip external memory that isn't a heap */
> > -   if (msl->external && !msl->heap)
> > +   /* only create DMA window once */
> > +   if (spapr_dma_win_len > 0)
> > return 0;
> >   
> > -   /* skip any segments with invalid IOVA addresses */
> > -   if (ms->iova == RTE_BAD_IOVA)
> > -   return 0;
> > +   /* walk the memseg list to find the page size/max VA address */
> > +   memset(¶m, 0, sizeof(param));
> > +   if (rte_memseg_list_walk(vfio_spapr_size_walk, ¶m) < 0) {
> > +   RTE_LOG(ERR, EAL, "Failed to walk memseg list for DMA "
> > +   "window size\n");
> > +   return -1;
> > +   }
> > +
> > +   /* We can't be sure if DMA window covers external memory */
> > +   if (param.external > 0)
> > +   RTE_LOG(WARNING, EAL, "Detected external memory which may "
> > +   "not be managed by the IOMMU\n");
> > +
> > +   /* find the maximum IOVA address for setting the DMA window size */
> > +   if (rte_eal_iova_mode() == RTE_IOVA_PA) {
> > +   static const char proc_iomem[] = "/proc/iomem";
> > +   static const char str_sysram[] = "System RAM";
> > +   uint64_t start, end, max = 0;
> > +   char *line = NULL;
> > +   char *dash, *space;
> > +   size_t line_len;
> > +
> > +   /*
> > +* Example "System RAM" in /proc/iomem:
> > +* -1f : System RAM
> > +* 2000-201f : System RAM
> > +*/
> > +   FILE *fd = fopen(proc_iomem, "r");
> > +   if (fd == NULL) {
> > +   RTE_LOG(ERR, EAL, "Cannot open %s\n", proc_iomem);
> > +   return -1;
> >

Re: [dpdk-dev] [DPDK_KMODS v3] linux/igb_uio: add Makefile to build the kernel module

2020-10-07 Thread Burakov, Anatoly

On 05-Oct-20 6:57 PM, Hariprasad Govindharajan wrote:

With DPDK 20.11 release, the igb_uio module is no more part of DPDK.
There are use cases where this module is required, for example while
testing the virtual ports in OvS, the virtual ports are bound to
igb_uio module inside a VM. So, this patch provides a Makefile
which can be used to build this module and use as needed.

Before building this module, the user is expected to build the
DPDK using meson build system and make sure that the required
libraries are installed in the path /usr/local

Signed-off-by: Hariprasad Govindharajan 
---
This patch will be part of dpdk-kmods repo
https://git.dpdk.org/dpdk-kmods/
---
v3:
Edited the commit message and corrected the mistakes in the variable
definition
v2:
Added more information to the commit message
---
  linux/igb_uio/Makefile | 8 
  1 file changed, 8 insertions(+)
  create mode 100644 linux/igb_uio/Makefile

diff --git a/linux/igb_uio/Makefile b/linux/igb_uio/Makefile
new file mode 100644
index 000..8be32f0
--- /dev/null
+++ b/linux/igb_uio/Makefile
@@ -0,0 +1,8 @@
+DPDK_HEADERS ?= /usr/local/include
+RTE_KERNELDIR ?= /lib/modules/`uname -r`/build
+
+all:
+   make EXTRA_CFLAGS="-I $(DPDK_HEADERS)" -C $(RTE_KERNELDIR)/ 
M=$(PWD)
+
+clean:
+   make -C $(RTE_KERNELDIR)/ M=$(PWD) clean



It seems that indentation is inconsistent - tabs in "all" and spaces in 
"clean". Please enable visible whitespace in your editor :)


--
Thanks,
Anatoly


[dpdk-dev] [PATCH] vhost: fix external mbuf creation

2020-10-07 Thread Olivier Matz
In virtio_dev_extbuf_alloc(), the shinfo structure used to store
the reference counter and the free callback of the external buffer
is by default stored inside the mbuf data.

This is wrong because the mbuf (and its data) can be freed before
the external buffer, for instance in the following situation:

  pkt2 = rte_pktmbuf_alloc(mp);
  rte_pktmbuf_attach(pkt2, pkt);
  rte_pktmbuf_free(pkt);

After this, pkt is freed, but it still contains shinfo, which is
referenced by pkt2.

Fix this by always storing the shinfo beside the external buffer.

Fixes: c3ff0ac70acb ("vhost: improve performance by supporting large buffer")
Cc: sta...@dpdk.org

Signed-off-by: Olivier Matz 
---

Hi,

I found this issue by code review while discussing about this
patchset [1]. I did not tested the patch, but as I'm only removing
one path among the two, I don't expect that it breaks anything.

[1] http://inbox.dpdk.org/dev/20200730120900.108232-1-yang_y...@163.com/

Regards,
Olivier

 lib/librte_vhost/virtio_net.c | 30 --
 1 file changed, 8 insertions(+), 22 deletions(-)

diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost/virtio_net.c
index 0a0bea1a5a..008f5ceb04 100644
--- a/lib/librte_vhost/virtio_net.c
+++ b/lib/librte_vhost/virtio_net.c
@@ -2098,16 +2098,8 @@ virtio_dev_extbuf_alloc(struct rte_mbuf *pkt, uint32_t 
size)
rte_iova_t iova;
void *buf;
 
-   /* Try to use pkt buffer to store shinfo to reduce the amount of memory
-* required, otherwise store shinfo in the new buffer.
-*/
-   if (rte_pktmbuf_tailroom(pkt) >= sizeof(*shinfo))
-   shinfo = rte_pktmbuf_mtod(pkt,
- struct rte_mbuf_ext_shared_info *);
-   else {
-   total_len += sizeof(*shinfo) + sizeof(uintptr_t);
-   total_len = RTE_ALIGN_CEIL(total_len, sizeof(uintptr_t));
-   }
+   total_len += sizeof(*shinfo) + sizeof(uintptr_t);
+   total_len = RTE_ALIGN_CEIL(total_len, sizeof(uintptr_t));
 
if (unlikely(total_len > UINT16_MAX))
return -ENOSPC;
@@ -2118,18 +2110,12 @@ virtio_dev_extbuf_alloc(struct rte_mbuf *pkt, uint32_t 
size)
return -ENOMEM;
 
/* Initialize shinfo */
-   if (shinfo) {
-   shinfo->free_cb = virtio_dev_extbuf_free;
-   shinfo->fcb_opaque = buf;
-   rte_mbuf_ext_refcnt_set(shinfo, 1);
-   } else {
-   shinfo = rte_pktmbuf_ext_shinfo_init_helper(buf, &buf_len,
- virtio_dev_extbuf_free, buf);
-   if (unlikely(shinfo == NULL)) {
-   rte_free(buf);
-   VHOST_LOG_DATA(ERR, "Failed to init shinfo\n");
-   return -1;
-   }
+   shinfo = rte_pktmbuf_ext_shinfo_init_helper(buf, &buf_len,
+   virtio_dev_extbuf_free, buf);
+   if (unlikely(shinfo == NULL)) {
+   rte_free(buf);
+   VHOST_LOG_DATA(ERR, "Failed to init shinfo\n");
+   return -1;
}
 
iova = rte_malloc_virt2iova(buf);
-- 
2.25.1



Re: [dpdk-dev] [PATCH] power: fix current frequency index

2020-10-07 Thread Thomas Monjalon
31/07/2020 11:32, Liang, Ma:
> Hi Reshma, 
> 
>I'm OK with the change. That's the missing part about init process.
> 
> Reviewed-by Liang Ma 

Because of a missing colon, this review was not counted in patchwork.

I guess Dave is OK with this patch but he was not Cc'ed.
Please use --cc-cmd devtools/get-maintainer.sh when sending patches.

Dave, you can poll the backlog with this request:
https://patches.dpdk.org/project/dpdk/list/?q=power

Patch applied, thanks





[dpdk-dev] [PATCH v5 1/2] ethdev: add flow shared action API

2020-10-07 Thread Andrey Vesnovaty
This commit introduces extension of DPDK flow action API enabling
sharing of single rte_flow_action in multiple flows. The API intended for
PMDs, where multiple HW offloaded flows can reuse the same HW
essence/object representing flow action and modification of such an
essence/object affects all the rules using it.

Motivation and example
===
Adding or removing one or more queues to RSS used by multiple flow rules
imposes per rule toll for current DPDK flow API; the scenario requires
for each flow sharing cloned RSS action:
- call `rte_flow_destroy()`
- call `rte_flow_create()` with modified RSS action

API for sharing action and its in-place update benefits:
- reduce the overhead of multiple RSS flow rules reconfiguration
- optimize resource utilization by sharing action across multiple
  flows

Change description
===

Shared action
===
In order to represent flow action shared by multiple flows new action
type RTE_FLOW_ACTION_TYPE_SHARED is introduced (see `enum
rte_flow_action_type`).
Actually the introduced API decouples action from any specific flow and
enables sharing of single action by its handle across multiple flows.

Shared action create/use/destroy
===
Shared action may be reused by some or none flow rules at any given
moment, i.e. shared action reside outside of the context of any flow.
Shared action represent HW resources/objects used for action offloading
implementation.
API for shared action create (see `rte_flow_shared_action_create()`):
- should allocate HW resources and make related initializations required
  for shared action implementation.
- make necessary preparations to maintain shared access to
  the action resources, configuration and state.
API for shared action destroy (see `rte_flow_shared_action_destroy()`)
should release HW resources and make related cleanups required for shared
action implementation.

In order to share some flow action reuse the handle of type
`struct rte_flow_shared_action` returned by
rte_flow_shared_action_create() as a `conf` field of
`struct rte_flow_action` (see "example" section).

If some shared action not used by any flow rule all resources allocated
by the shared action can be released by rte_flow_shared_action_destroy()
(see "example" section). The shared action handle passed as argument to
destroy API should not be used any further i.e. result of the usage is
undefined.

Shared action re-configuration
===
Shared action behavior defined by its configuration can be updated via
rte_flow_shared_action_update() (see "example" section). The shared
action update operation modifies HW related resources/objects allocated
on the action creation. The number of operations performed by the update
operation should not depend on the number of flows sharing the related
action. On return of shared action update API action behavior should be
according to updated configuration for all flows sharing the action.

Shared action query
===
Provide separate API to query shared action state (see
rte_flow_shared_action_update()). Taking a counter as an example: query
returns value aggregating all counter increments across all flow rules
sharing the counter. This API doesn't query shared action configuration
since it is controlled by rte_flow_shared_action_create() and
rte_flow_shared_action_update() APIs and no supposed to change by other
means.

PMD support
===
The support of introduced API is pure PMD specific design and
responsibility for each action type (see struct rte_flow_ops).

testpmd
===
In order to utilize introduced API testpmd cli may implement following
extension
create/update/destroy/query shared action accordingly

flow shared_action (port) create {action_id (id)} (action) / end
flow shared_action (port) update (id) (action) / end
flow shared_action (port) destroy action_id (id) {action_id (id) [...]}
flow shared_action (port) query (id)

testpmd example
===

configure rss to queues 1 & 2

> flow shared_action 0 create action_id 100 rss queues 1 2 end / end

create flow rule utilizing shared action

> flow create 0 ingress \
pattern eth dst is 0c:42:a1:15:fd:ac / ipv6 / tcp / end \
  actions shared 100 / end

add 2 more queues

> flow shared_action 0 modify 100 rss queues 1 2 3 4 end / end

example
===

struct rte_flow_action actions[2];
struct rte_flow_action action;
/* skipped: initialize action */
struct rte_flow_shared_action *handle = rte_flow_shared_action_create(
port_id, &action, &error);
actions[0].type = RTE_FLOW_ACTION_TYPE_SHARED;
actions[0].conf = handle;
actions[1].type = RTE_FLOW_ACTION_TYPE_END;
/* skipped: init attr0 & pattern0 args */
struct rte_flow *flow0 = rte_flow_create(port_id, &attr0, pattern0,
actions, error);
/* create more rules reusing shared action */
struct rte_flow *flow1 = rte_flow_create(port_id, &attr1, pattern1,
actions, error);
/* skipped: for flows 2 till N */
struct rte_flow *flowN = rte_flow_creat

[dpdk-dev] [PATCH v5 0/2] RTE flow shared action

2020-10-07 Thread Andrey Vesnovaty
This patchset introduces shared action API for RTE flow.

v5 changes:
* updated RTE flow programmers guide
* misc elaborations in code comments

Andrey Vesnovaty (2):
  ethdev: add flow shared action API
  app/testpmd: support shared action

 app/test-pmd/cmdline_flow.c | 262 +++-
 app/test-pmd/config.c   | 217 
 app/test-pmd/testpmd.h  |  19 ++
 doc/guides/prog_guide/rte_flow.rst  |  19 ++
 doc/guides/rel_notes/release_20_11.rst  |   9 +
 doc/guides/testpmd_app_ug/testpmd_funcs.rst | 111 +
 lib/librte_ethdev/rte_ethdev_version.map|   4 +
 lib/librte_ethdev/rte_flow.c|  84 +++
 lib/librte_ethdev/rte_flow.h| 163 +++-
 lib/librte_ethdev/rte_flow_driver.h |  23 ++
 10 files changed, 909 insertions(+), 2 deletions(-)

-- 
2.26.2



[dpdk-dev] [PATCH v5 2/2] app/testpmd: support shared action

2020-10-07 Thread Andrey Vesnovaty
This patch adds shared action support to testpmd CLI.

All shared actions created via testpmd CLI assigned ID for further
reference in other CLI commands. Shared action ID supplied as CLI
argument or assigned by testpmd is similar to flow ID & limited to
scope of testpdm CLI.

Create shared action syntax:
flow shared_action (port) create {action_id (shared_action_id)} \
(action) / end

Create shared action examples:
flow shared_action 0 create action_id 100 \
rss queues 1 2 end / end
This creates shared rss action with id 100 on port 0.

flow shared_action 0 create action_id \
rss queues 0 1 end / end
This creates shared rss action with id assigned by tetspmd
on port 0.

Update shared action syntax:
flow shared_action (port) update (shared_action_id) (action) / end

Update shared action example:
flow shared_action 0 update 100 rss queues 0 3 end / end
This updates shared rss action having id 100 on port 0
with rss to queues 0 3 (in create example rss queues were
1 & 2).

Destroy shared action syntax:
flow shared_action (port) destroy action_id (shared_action_id) \
{ action_id (shared_action_id) [...]}

Update shared action example:
flow shared_action 0 destroy action_id 100 action_id 101
This destroys shared actions having id 100 & 101

Query shared action syntax:
flow shared_action (port) query (shared_action_id)

Query shared action example:
flow shared_action 0 query 100
This queries shared actions having id 100

Use shared action as flow action syntax:
flow create (port) ... / end actions {action / [...]} \
shared (action_id) / {action / [...]} end

Use shared action as flow action example:
flow create 0 ingress pattern ... / end \
actions shared 100 / end
This creates flow rule where rss action is shared rss action
having id 100.

All shared action CLIs report status of the command.
Shared action query CLI output depends on action type.

Signed-off-by: Andrey Vesnovaty 
Acked-by: Ori Kam 
---
 app/test-pmd/cmdline_flow.c | 262 +++-
 app/test-pmd/config.c   | 217 
 app/test-pmd/testpmd.h  |  19 ++
 doc/guides/testpmd_app_ug/testpmd_funcs.rst | 111 +
 4 files changed, 608 insertions(+), 1 deletion(-)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 6e04d538ea..402ce69aa3 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -49,6 +49,7 @@ enum index {
PORT_ID,
GROUP_ID,
PRIORITY_LEVEL,
+   SHARED_ACTION_ID,
 
/* Top-level command. */
SET,
@@ -60,6 +61,7 @@ enum index {
/* Top-level command. */
FLOW,
/* Sub-level commands. */
+   SHARED_ACTION,
VALIDATE,
CREATE,
DESTROY,
@@ -89,6 +91,18 @@ enum index {
EGRESS,
TRANSFER,
 
+   /* Shared action arguments */
+   SHARED_ACTION_CREATE,
+   SHARED_ACTION_UPDATE,
+   SHARED_ACTION_DESTROY,
+   SHARED_ACTION_QUERY,
+
+   /* Shared action create arguments */
+   SHARED_ACTION_CREATE_ID,
+
+   /* Shared action destroy arguments */
+   SHARED_ACTION_DESTROY_ID,
+
/* Validate/create pattern. */
PATTERN,
ITEM_PARAM_IS,
@@ -360,6 +374,8 @@ enum index {
ACTION_SET_IPV6_DSCP_VALUE,
ACTION_AGE,
ACTION_AGE_TIMEOUT,
+   ACTION_SHARED,
+   SHARED_ACTION_ID2PTR,
 };
 
 /** Maximum size for pattern in struct rte_flow_item_raw. */
@@ -653,6 +669,13 @@ struct buffer {
enum index command; /**< Flow command. */
portid_t port; /**< Affected port ID. */
union {
+   struct {
+   uint32_t *action_id;
+   uint32_t action_id_n;
+   } sa_destroy; /**< Shared action destroy arguments. */
+   struct {
+   uint32_t action_id;
+   } sa; /* Shared action query arguments */
struct {
struct rte_flow_attr attr;
struct rte_flow_item *pattern;
@@ -709,6 +732,19 @@ struct parse_action_priv {
.size = s, \
})
 
+static const enum index next_sa_create_attr[] = {
+   SHARED_ACTION_CREATE_ID,
+   ACTION_RSS,
+   ZERO,
+};
+
+static const enum index next_sa_subcmd[] = {
+   SHARED_ACTION_CREATE,
+   SHARED_ACTION_UPDATE,
+   SHARED_ACTION_DESTROY,
+   SHARED_ACTION_QUERY,
+};
+
 static const enum index next_vc_attr[] = {
GROUP,
PRIORITY,
@@ -743,6 +779,12 @@ static const enum index next_aged_attr[] = {
ZERO,
 };
 
+static const enum index next_sa_destroy_attr[] = {
+   SHARED_ACTION_DESTROY_ID,
+   END,
+   ZERO,
+};
+
 static const enum index item_param[] = {
ITEM_PARAM_IS,

Re: [dpdk-dev] [PATCH v5 1/2] ethdev: add flow shared action API

2020-10-07 Thread Ori Kam
Hi Andrey,

> -Original Message-
> From: dev  On Behalf Of Andrey Vesnovaty
> Sent: Wednesday, October 7, 2020 3:56 PM
> Subject: [dpdk-dev] [PATCH v5 1/2] ethdev: add flow shared action API
> 
> This commit introduces extension of DPDK flow action API enabling
> sharing of single rte_flow_action in multiple flows. The API intended for
> PMDs, where multiple HW offloaded flows can reuse the same HW
> essence/object representing flow action and modification of such an
> essence/object affects all the rules using it.
> 
> Motivation and example
> ===
> Adding or removing one or more queues to RSS used by multiple flow rules
> imposes per rule toll for current DPDK flow API; the scenario requires
> for each flow sharing cloned RSS action:
> - call `rte_flow_destroy()`
> - call `rte_flow_create()` with modified RSS action
> 
> API for sharing action and its in-place update benefits:
> - reduce the overhead of multiple RSS flow rules reconfiguration
> - optimize resource utilization by sharing action across multiple
>   flows
> 
> Change description
> ===
> 
> Shared action
> ===
> In order to represent flow action shared by multiple flows new action
> type RTE_FLOW_ACTION_TYPE_SHARED is introduced (see `enum
> rte_flow_action_type`).
> Actually the introduced API decouples action from any specific flow and
> enables sharing of single action by its handle across multiple flows.
> 
> Shared action create/use/destroy
> ===
> Shared action may be reused by some or none flow rules at any given
> moment, i.e. shared action reside outside of the context of any flow.
> Shared action represent HW resources/objects used for action offloading
> implementation.
> API for shared action create (see `rte_flow_shared_action_create()`):
> - should allocate HW resources and make related initializations required
>   for shared action implementation.
> - make necessary preparations to maintain shared access to
>   the action resources, configuration and state.
> API for shared action destroy (see `rte_flow_shared_action_destroy()`)
> should release HW resources and make related cleanups required for shared
> action implementation.
> 
> In order to share some flow action reuse the handle of type
> `struct rte_flow_shared_action` returned by
> rte_flow_shared_action_create() as a `conf` field of
> `struct rte_flow_action` (see "example" section).
> 
> If some shared action not used by any flow rule all resources allocated
> by the shared action can be released by rte_flow_shared_action_destroy()
> (see "example" section). The shared action handle passed as argument to
> destroy API should not be used any further i.e. result of the usage is
> undefined.
> 
> Shared action re-configuration
> ===
> Shared action behavior defined by its configuration can be updated via
> rte_flow_shared_action_update() (see "example" section). The shared
> action update operation modifies HW related resources/objects allocated
> on the action creation. The number of operations performed by the update
> operation should not depend on the number of flows sharing the related
> action. On return of shared action update API action behavior should be
> according to updated configuration for all flows sharing the action.
> 
> Shared action query
> ===
> Provide separate API to query shared action state (see
> rte_flow_shared_action_update()). Taking a counter as an example: query
> returns value aggregating all counter increments across all flow rules
> sharing the counter. This API doesn't query shared action configuration
> since it is controlled by rte_flow_shared_action_create() and
> rte_flow_shared_action_update() APIs and no supposed to change by other
> means.
> 
> PMD support
> ===
> The support of introduced API is pure PMD specific design and
> responsibility for each action type (see struct rte_flow_ops).
> 
> testpmd
> ===
> In order to utilize introduced API testpmd cli may implement following
> extension
> create/update/destroy/query shared action accordingly
> 
> flow shared_action (port) create {action_id (id)} (action) / end
> flow shared_action (port) update (id) (action) / end
> flow shared_action (port) destroy action_id (id) {action_id (id) [...]}
> flow shared_action (port) query (id)
> 
> testpmd example
> ===
> 
> configure rss to queues 1 & 2
> 
> > flow shared_action 0 create action_id 100 rss queues 1 2 end / end
> 
> create flow rule utilizing shared action
> 
> > flow create 0 ingress \
> pattern eth dst is 0c:42:a1:15:fd:ac / ipv6 / tcp / end \
>   actions shared 100 / end
> 
> add 2 more queues
> 
> > flow shared_action 0 modify 100 rss queues 1 2 3 4 end / end
> 
> example
> ===
> 
> struct rte_flow_action actions[2];
> struct rte_flow_action action;
> /* skipped: initialize action */
> struct rte_flow_shared_action *handle = rte_flow_shared_action_create(
>   port_id, &action, &error);
> actions[0].type = RTE_FLOW_ACTION_TYPE_SHARED;
> actions[0].conf = handle;
> actions[1].type

Re: [dpdk-dev] [PATCH v2 2/2] doc: add l3fwd-regex application user guide

2020-10-07 Thread Ori Kam
Hi Guy,

> -Original Message-
> From: g...@marvell.com 
> Sent: Wednesday, September 9, 2020 3:09 PM
> Subject: [PATCH v2 2/2] doc: add l3fwd-regex application user guide
> 
> From: Guy Kaneti 
> 
> Adding the user guide for l3fwd regex application.
> 
> Signed-off-by: Guy Kaneti 
> ---

Acked-by: Ori Kam 
Thanks,
Ori



[dpdk-dev] [PATCH v4 1/2] ethdev: support query of AGE action

2020-10-07 Thread Dekel Peled
Existing API supports AGE action to monitor the aging of a flow.
This patch implements RFC [1], introducing the response format for query
of an AGE action.
Application will be able to query the AGE action state.
The response will be returned in the format implemented here.

[1] https://mails.dpdk.org/archives/dev/2020-September/180061.html

Signed-off-by: Dekel Peled 
Acked-by: Matan Azrad 
---
 doc/guides/prog_guide/rte_flow.rst | 17 +
 doc/guides/rel_notes/release_20_11.rst |  3 +++
 lib/librte_ethdev/rte_flow.h   | 15 +++
 3 files changed, 35 insertions(+)

diff --git a/doc/guides/prog_guide/rte_flow.rst 
b/doc/guides/prog_guide/rte_flow.rst
index 119b128..a672fc5 100644
--- a/doc/guides/prog_guide/rte_flow.rst
+++ b/doc/guides/prog_guide/rte_flow.rst
@@ -2666,6 +2666,23 @@ timeout passed without any matching on the flow.
| ``context``  | user input flow context |
+--+-+
 
+Query structure to retrieve ageing status information of a
+shared AGE action, or a flow rule using the AGE action:
+
+.. _table_rte_flow_query_age:
+
+.. table:: AGE query
+
+   
+--+-++
+   | Field| I/O | Value
  |
+   
+==+=++
+   | ``aged`` | out | Aging timeout expired
  |
+   
+--+-++
+   | ``sec_since_last_hit_valid`` | out | ``sec_since_last_hit`` value is 
valid  |
+   
+--+-++
+   | ``sec_since_last_hit``   | out | Seconds since last traffic hit   
  |
+   
+--+-++
+
 Negative types
 ~~
 
diff --git a/doc/guides/rel_notes/release_20_11.rst 
b/doc/guides/rel_notes/release_20_11.rst
index 0b2a370..7e093f7 100644
--- a/doc/guides/rel_notes/release_20_11.rst
+++ b/doc/guides/rel_notes/release_20_11.rst
@@ -191,6 +191,9 @@ API Changes
 
 * ethdev: ``rte_eth_rx_descriptor_done()`` API has been deprecated.
 
+* ethdev: Added struct ``rte_flow_query_age`` for use in response to query
+  of AGE action.
+
 * Renamed internal ethdev APIs:
 
   * ``_rte_eth_dev_callback_process()`` -> ``rte_eth_dev_callback_process()``
diff --git a/lib/librte_ethdev/rte_flow.h b/lib/librte_ethdev/rte_flow.h
index da8bfa5..42a315a 100644
--- a/lib/librte_ethdev/rte_flow.h
+++ b/lib/librte_ethdev/rte_flow.h
@@ -2130,6 +2130,7 @@ enum rte_flow_action_type {
 * See struct rte_flow_action_age.
 * See function rte_flow_get_aged_flows
 * see enum RTE_ETH_EVENT_FLOW_AGED
+* See struct rte_flow_query_age
 */
RTE_FLOW_ACTION_TYPE_AGE,
 };
@@ -2194,6 +2195,20 @@ struct rte_flow_action_age {
 };
 
 /**
+ * RTE_FLOW_ACTION_TYPE_AGE (query)
+ *
+ * Query structure to retrieve the aging status information of a
+ * shared AGE action, or a flow rule using the AGE action.
+ */
+struct rte_flow_query_age {
+   uint32_t reserved:6; /**< Reserved, must be zero. */
+   uint32_t aged:1; /**< 1 if aging timeout expired, 0 otherwise. */
+   uint32_t sec_since_last_hit_valid:1;
+   /**< sec_since_last_hit value is valid. */
+   uint32_t sec_since_last_hit:24; /**< Seconds since last traffic hit. */
+};
+
+/**
  * @warning
  * @b EXPERIMENTAL: this structure may change without prior notice
  *
-- 
1.8.3.1



  1   2   3   >