Re: [dpdk-dev] [PATCH] doc: abstract the behaviour of rte_ctrl_thread_create
On Tue, Aug 3, 2021 at 11:24 AM Ruifeng Wang wrote:
>
> > -----Original Message-----
> > From: Honnappa Nagarahalli
> > Sent: Saturday, July 31, 2021 5:45 AM
> > To: dev@dpdk.org; Honnappa Nagarahalli; olivier.m...@6wind.com;
> > lucp.at.w...@gmail.com; david.march...@redhat.com;
> > tho...@monjalon.net
> > Cc: Ruifeng Wang; nd
> > Subject: [PATCH] doc: abstract the behaviour of rte_ctrl_thread_create
> >
> > The current expected behaviour of the function rte_ctrl_thread_create is
> > rigid which makes the implementation of the function complex.
> > Make the expected behaviour abstract to allow for simplified
> > implementation.
> >
> > With this change, the calls to pthread_setaffinity_np can be moved to the
> > control thread. This will avoid the use of pthread_barrier_wait and simplify
> > the synchronization mechanism between rte_ctrl_thread_create and the
> > calling thread.
> >
> > Signed-off-by: Honnappa Nagarahalli
> > ---
> > Possible patch is at:
> > http://patches.dpdk.org/project/dpdk/patch/20210730213709.19400-1-honnappa.nagaraha...@arm.com/
> >
> >  doc/guides/rel_notes/deprecation.rst | 7 +++
> >  1 file changed, 7 insertions(+)
> >
> > diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
> > index 9584d6bfd7..1960e3c8bf 100644
> > --- a/doc/guides/rel_notes/deprecation.rst
> > +++ b/doc/guides/rel_notes/deprecation.rst
> > @@ -11,6 +11,13 @@ here.
> >  Deprecation Notices
> >  ---
> >
> > +* eal: The expected behaviour of the function ``rte_ctrl_thread_create``
> > +  abstracted to allow for simplified implementation. The new behaviour is
> > +  as follows:
> > +  Creates a control thread with the given name. The affinity of the new
> > +  thread is based on the CPU affinity retrieved at the time rte_eal_init()
> > +  was called, the dataplane and service lcores are then excluded.
> > +
> >  * kvargs: The function ``rte_kvargs_process`` will get a new parameter
> >    for returning key match count. It will ease handling of no-match case.
> >
> > --
> > 2.17.1
> Acked-by: Ruifeng Wang

Acked-by: Jerin Jacob
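The affinity rule in the proposed notice can be sketched in a few lines of C. This is only an illustration, not the EAL implementation: the single 64-bit mask (one bit per CPU) and the names ctrl_thread_affinity and affinity_example are assumptions made here, and real code would operate on CPU sets rather than one machine word.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, not a DPDK API. Each bit stands for one CPU.
 *   init_mask:  CPU affinity captured when rte_eal_init() was called
 *   lcore_mask: CPUs used by dataplane and service lcores
 * Returns the CPUs left for a control thread.
 */
static uint64_t
ctrl_thread_affinity(uint64_t init_mask, uint64_t lcore_mask)
{
	return init_mask & ~lcore_mask;
}

/* Usage example: CPUs 0-7 at init time, CPUs 2 and 3 run lcores. */
static void
affinity_example(void)
{
	assert(ctrl_thread_affinity(0xFF, 0x0C) == 0xF3);
}
```

Because the result depends only on state that is already fixed by the time rte_eal_init() returns, the new thread could compute and apply it by itself, which is what makes the barrier between creator and control thread avoidable.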
[dpdk-dev] [PATCH v2] net/ice/base: support L2 and L3 FDIR field for IP fragment packets
Add L2 and L3 FDIR field support for IPv6 fragment packets.

Signed-off-by: Wenjun Wu
---
v2: remove redundant IPv6 protocol field, because for IPv6 fragment
packets, this value should be fixed.
---
 drivers/net/ice/base/ice_fdir.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/drivers/net/ice/base/ice_fdir.c b/drivers/net/ice/base/ice_fdir.c
index 2e4770061d..43209263d3 100644
--- a/drivers/net/ice/base/ice_fdir.c
+++ b/drivers/net/ice/base/ice_fdir.c
@@ -1958,6 +1958,13 @@ ice_fdir_get_gen_prgm_pkt(struct ice_hw *hw, struct ice_fdir_fltr *input,
 		ice_pkt_insert_mac_addr(loc, input->ext_data.dst_mac);
 		break;
 	case ICE_FLTR_PTYPE_FRAG_IPV6:
+		ice_pkt_insert_ipv6_addr(loc, ICE_IPV6_DST_ADDR_OFFSET,
+					 input->ip.v6.src_ip);
+		ice_pkt_insert_ipv6_addr(loc, ICE_IPV6_SRC_ADDR_OFFSET,
+					 input->ip.v6.dst_ip);
+		ice_pkt_insert_u8_tc(loc, ICE_IPV6_TC_OFFSET, input->ip.v6.tc);
+		ice_pkt_insert_u8(loc, ICE_IPV6_HLIM_OFFSET, input->ip.v6.hlim);
+		ice_pkt_insert_mac_addr(loc, input->ext_data.dst_mac);
 		ice_pkt_insert_u32(loc, ICE_IPV6_ID_OFFSET,
 				   input->ip.v6.packet_id);
 		break;
--
2.25.1
Re: [dpdk-dev] [PATCH] vhost: announce experimental tag removal of vhost APIs
> -----Original Message-----
> From: dev On Behalf Of Chenbo Xia
> Sent: Friday, July 30, 2021 9:19 AM
> To: dev@dpdk.org; maxime.coque...@redhat.com; amore...@redhat.com;
> step...@networkplumber.org; tho...@monjalon.net; Yigit, Ferruh;
> Richardson, Bruce; Ananyev, Konstantin;
> ktray...@redhat.com; jerinjac...@gmail.com
> Subject: [dpdk-dev] [PATCH] vhost: announce experimental tag removal of vhost APIs
>
> This patch announces the experimental tag removal of 10 vhost APIs,
> which have been experimental for more than 2 years. All APIs could
> be made stable in DPDK 21.11.
>
> Signed-off-by: Chenbo Xia
> Acked-by: Maxime Coquelin
> ---
>  doc/guides/rel_notes/deprecation.rst | 8
>  1 file changed, 8 insertions(+)
>
> diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
> index 9584d6bfd7..f97a9d0058 100644
> --- a/doc/guides/rel_notes/deprecation.rst
> +++ b/doc/guides/rel_notes/deprecation.rst
> @@ -147,3 +147,11 @@ Deprecation Notices
>  * cmdline: ``cmdline`` structure will be made opaque to hide platform-specific
>    content. On Linux and FreeBSD, supported prior to DPDK 20.11,
>    original structure will be kept until DPDK 21.11.
> +
> +* vhost: The experimental tags of ``rte_vhost_driver_get_protocol_features``,
> +  ``rte_vhost_driver_get_queue_num``, ``rte_vhost_crypto_create``,
> +  ``rte_vhost_crypto_free``, ``rte_vhost_crypto_fetch_requests``,
> +  ``rte_vhost_crypto_finalize_requests``, ``rte_vhost_crypto_set_zero_copy``,
> +  ``rte_vhost_va_from_guest_pa``, ``rte_vhost_extern_callback_register``,
> +  and ``rte_vhost_driver_set_protocol_features`` APIs will be removed and the
> +  APIs will be made stable in DPDK 21.11.
> \ No newline at end of file
> --
> 2.17.1

As for the rte_vhost_crypto related APIs:

Acked-by: Fan Zhang
Re: [dpdk-dev] [PATCH v2] vhost: announce experimental tag removal of vhost APIs
> -----Original Message-----
> From: dev On Behalf Of Chenbo Xia
> Sent: Friday, July 30, 2021 9:24 AM
> To: dev@dpdk.org; maxime.coque...@redhat.com; amore...@redhat.com;
> step...@networkplumber.org; tho...@monjalon.net; Yigit, Ferruh;
> Richardson, Bruce; Ananyev, Konstantin;
> ktray...@redhat.com; jerinjac...@gmail.com
> Subject: [dpdk-dev] [PATCH v2] vhost: announce experimental tag removal of vhost APIs
>
> This patch announces the experimental tag removal of 10 vhost APIs,
> which have been experimental for more than 2 years. All APIs could
> be made stable in DPDK 21.11.
>
> Signed-off-by: Chenbo Xia
> Acked-by: Maxime Coquelin
> ---
>  doc/guides/rel_notes/deprecation.rst | 8
>  1 file changed, 8 insertions(+)
>
> diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
> index 9584d6bfd7..5d5b7884d7 100644
> --- a/doc/guides/rel_notes/deprecation.rst
> +++ b/doc/guides/rel_notes/deprecation.rst
> @@ -147,3 +147,11 @@ Deprecation Notices
>  * cmdline: ``cmdline`` structure will be made opaque to hide platform-specific
>    content. On Linux and FreeBSD, supported prior to DPDK 20.11,
>    original structure will be kept until DPDK 21.11.
> +
> +* vhost: The experimental tags of ``rte_vhost_driver_get_protocol_features``,
> +  ``rte_vhost_driver_get_queue_num``, ``rte_vhost_crypto_create``,
> +  ``rte_vhost_crypto_free``, ``rte_vhost_crypto_fetch_requests``,
> +  ``rte_vhost_crypto_finalize_requests``, ``rte_vhost_crypto_set_zero_copy``,
> +  ``rte_vhost_va_from_guest_pa``, ``rte_vhost_extern_callback_register``,
> +  and ``rte_vhost_driver_set_protocol_features`` APIs will be removed and the
> +  APIs will be made stable in DPDK 21.11.
> --
> 2.17.1

As for the rte_vhost_crypto related APIs:

Acked-by: Fan Zhang
[dpdk-dev] [v3, 0/3] common/cnxk: enable npa telemetry
This patch series enables telemetry in NPA LF of cnxk.

v3:
 - fixed format specifier for uintptr_t

Gowrishankar Muthukrishnan (3):
  telemetry: enable storing pointer value
  test/telemetry: add unit tests for pointer value
  common/cnxk: add telemetry endpoints to npa

 app/test/test_telemetry_data.c           | 125 +
 app/test/test_telemetry_json.c           |  29 ++-
 drivers/common/cnxk/cnxk_telemetry.h     |  26 +++
 drivers/common/cnxk/cnxk_telemetry_npa.c | 227 +++
 drivers/common/cnxk/meson.build          |   4 +
 drivers/common/cnxk/roc_platform.h       |   8 +
 lib/telemetry/rte_telemetry.h            |  37 +++-
 lib/telemetry/telemetry.c                |  21 ++-
 lib/telemetry/telemetry_data.c           |  40 +++-
 lib/telemetry/telemetry_data.h           |   2 +
 lib/telemetry/telemetry_json.h           |  32
 lib/telemetry/version.map                |   2 +
 12 files changed, 539 insertions(+), 14 deletions(-)
 create mode 100644 drivers/common/cnxk/cnxk_telemetry.h
 create mode 100644 drivers/common/cnxk/cnxk_telemetry_npa.c

--
2.25.1
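For readers wondering what the uintptr_t format-specifier fix in v3 is about, here is a minimal sketch of printing a pointer as a JSON number in a way that works on both 32-bit and 64-bit targets. The helper name ptr_to_json is hypothetical and is not part of lib/telemetry; only PRIuPTR and uintptr_t come from the C standard headers.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, not a telemetry API: writes {"name":<address>}
 * into buf and returns what snprintf() returns. PRIuPTR expands to the
 * right conversion for uintptr_t on the target, so no cast to a
 * fixed-width 64-bit type is needed.
 */
static int
ptr_to_json(char *buf, size_t len, const char *name, const void *ptr)
{
	assert(buf != NULL && name != NULL);
	return snprintf(buf, len, "{\"%s\":%" PRIuPTR "}", name,
			(uintptr_t)ptr);
}
```

Calling ptr_to_json(buf, sizeof(buf), "npa", lf) would yield the address as a decimal number, which is the encoding the series describes until JSON5 hex values are usable by clients.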
[dpdk-dev] [v3, 1/3] telemetry: enable storing pointer value
At present, value of pointer variable or address is stored in u64 type which may not properly work in non64 bit arch. Hence, this patch adds new API to store it in void*. JSON encoding is after converting to uintptr_t so, address value is correctly casted as per arch as well. Once JSON5 support is available at JSON clients, hex value of address can be encoded instead of uintptr_t. Signed-off-by: Gowrishankar Muthukrishnan --- lib/telemetry/rte_telemetry.h | 37 ++- lib/telemetry/telemetry.c | 21 -- lib/telemetry/telemetry_data.c | 40 ++ lib/telemetry/telemetry_data.h | 2 ++ lib/telemetry/telemetry_json.h | 32 +++ lib/telemetry/version.map | 2 ++ 6 files changed, 127 insertions(+), 7 deletions(-) diff --git a/lib/telemetry/rte_telemetry.h b/lib/telemetry/rte_telemetry.h index 8776998b54..6a420f918c 100644 --- a/lib/telemetry/rte_telemetry.h +++ b/lib/telemetry/rte_telemetry.h @@ -46,7 +46,8 @@ enum rte_tel_value_type { RTE_TEL_STRING_VAL, /** a string value */ RTE_TEL_INT_VAL,/** a signed 32-bit int value */ RTE_TEL_U64_VAL,/** an unsigned 64-bit int value */ - RTE_TEL_CONTAINER, /** a container struct */ + RTE_TEL_CONTAINER, /** a container struct */ + RTE_TEL_PTR_VAL,/** a pointer value */ }; /** @@ -137,6 +138,22 @@ __rte_experimental int rte_tel_data_add_array_u64(struct rte_tel_data *d, uint64_t x); +/** + * Add a pointer value to an array. + * The array must have been started by rte_tel_data_start_array() with + * RTE_TEL_PTR_VAL as the type parameter. + * + * @param d + * The data structure passed to the callback + * @param x + * The pointer value to be returned in the array + * @return + * 0 on success, negative errno on error + */ +__rte_experimental +int +rte_tel_data_add_array_ptr(struct rte_tel_data *d, void *x); + /** * Add a container to an array. A container is an existing telemetry data * array. 
The array the container is to be added to must have been started by @@ -213,6 +230,24 @@ int rte_tel_data_add_dict_u64(struct rte_tel_data *d, const char *name, uint64_t val); +/** + * Add a pointer value to a dictionary. + * The dict must have been started by rte_tel_data_start_dict(). + * + * @param d + * The data structure passed to the callback + * @param name + * The name the value is to be stored under in the dict + * @param ptr + * The pointer value to be stored in the dict + * @return + * 0 on success, negative errno on error, E2BIG on string truncation of name. + */ +__rte_experimental +int +rte_tel_data_add_dict_ptr(struct rte_tel_data *d, + const char *name, void *ptr); + /** * Add a container to a dictionary. A container is an existing telemetry data * array. The dict the container is to be added to must have been started by diff --git a/lib/telemetry/telemetry.c b/lib/telemetry/telemetry.c index 8665db8d03..5842b28740 100644 --- a/lib/telemetry/telemetry.c +++ b/lib/telemetry/telemetry.c @@ -157,8 +157,10 @@ container_to_json(const struct rte_tel_data *d, char *out_buf, size_t buf_len) size_t used = 0; unsigned int i; - if (d->type != RTE_TEL_ARRAY_U64 && d->type != RTE_TEL_ARRAY_INT - && d->type != RTE_TEL_ARRAY_STRING) + if (d->type != RTE_TEL_ARRAY_U64 + && d->type != RTE_TEL_ARRAY_INT + && d->type != RTE_TEL_ARRAY_PTR + && d->type != RTE_TEL_ARRAY_STRING) return snprintf(out_buf, buf_len, "null"); used = rte_tel_json_empty_array(out_buf, buf_len, 0); @@ -167,6 +169,11 @@ container_to_json(const struct rte_tel_data *d, char *out_buf, size_t buf_len) used = rte_tel_json_add_array_u64(out_buf, buf_len, used, d->data.array[i].u64val); + if (d->type == RTE_TEL_ARRAY_PTR) + for (i = 0; i < d->data_len; i++) + used = rte_tel_json_add_array_ptr(out_buf, + buf_len, used, + d->data.array[i].ptrval); if (d->type == RTE_TEL_ARRAY_INT) for (i = 0; i < d->data_len; i++) used = rte_tel_json_add_array_int(out_buf, @@ -226,6 +233,11 @@ output_json(const char *cmd, 
const struct rte_tel_data *d, int s) buf_len, used, v->name, v->value.u64val); break; + case RTE_TEL_PTR_VAL: + used = rte_tel_json_add_obj_ptr(cb_data_buf, + buf_len, used, + v->name, v->value.ptrval); + break; case RTE_TEL_CONT
[dpdk-dev] [v3, 2/3] test/telemetry: add unit tests for pointer value
Adding tests to evaluate pointer value in array and dict. Signed-off-by: Gowrishankar Muthukrishnan --- app/test/test_telemetry_data.c | 125 + app/test/test_telemetry_json.c | 29 ++-- 2 files changed, 147 insertions(+), 7 deletions(-) diff --git a/app/test/test_telemetry_data.c b/app/test/test_telemetry_data.c index f34d691265..2351ae5193 100644 --- a/app/test/test_telemetry_data.c +++ b/app/test/test_telemetry_data.c @@ -301,6 +301,127 @@ test_array_with_array_u64_values(void) return TEST_OUTPUT("{\"/test\":[[0,1,2,3,4],[0,1,2,3,4]]}"); } +static int +test_case_array_ptr(void) +{ + int *p, i, j, a[] = {1, 2, 3, 4, 5}; + char exp[120]; + + memset(&response_data, 0, sizeof(response_data)); + memset(exp, 0, sizeof(exp)); + rte_tel_data_start_array(&response_data, RTE_TEL_PTR_VAL); + + i = sprintf(exp, "{\"/test\":["); + for (j = 0; j < 5; j++) { + p = &a[j]; + i += sprintf(exp + i, "%" PRIuPTR ",", (uintptr_t)p); + rte_tel_data_add_array_ptr(&response_data, p); + } + + sprintf(exp + i - 1, "]}"); + return TEST_OUTPUT(exp); +} + +static int +test_case_add_dict_ptr(void) +{ + int *p, i, j, a[] = {1, 2, 3, 4, 5}; + char name[8], exp[160]; + + memset(&response_data, 0, sizeof(response_data)); + memset(exp, 0, sizeof(exp)); + rte_tel_data_start_dict(&response_data); + + i = sprintf(exp, "{\"/test\":{"); + for (j = 0; j < 5; j++) { + p = &a[j]; + sprintf(name, "dict_%d", j); + i += sprintf(exp + i, "\"%s\":%" PRIuPTR ",", name, +(uintptr_t)p); + rte_tel_data_add_dict_ptr(&response_data, name, p); + } + + sprintf(exp + i - 1, "}}"); + return TEST_OUTPUT(exp); +} + +static int +test_dict_with_array_ptr_values(void) +{ + int *p, i, j, a[] = {1, 2, 3, 4, 5}; + char exp[256]; + + struct rte_tel_data *child_data = rte_tel_data_alloc(); + rte_tel_data_start_array(child_data, RTE_TEL_PTR_VAL); + + struct rte_tel_data *child_data2 = rte_tel_data_alloc(); + rte_tel_data_start_array(child_data2, RTE_TEL_PTR_VAL); + + memset(&response_data, 0, sizeof(response_data)); + memset(exp, 0, 
sizeof(exp)); + rte_tel_data_start_dict(&response_data); + + i = sprintf(exp, "{\"/test\":{\"dict_0\":["); + for (j = 0; j < 5; j++) { + p = &a[j]; + i += sprintf(exp + i, "%" PRIuPTR ",", (uintptr_t)p); + rte_tel_data_add_array_ptr(child_data, p); + } + + i += sprintf(exp + i - 1, "],\"dict_1\":["); + for (j = 5; j > 0; j--) { + p = &a[j - 1]; + i += sprintf(exp + i - 1, "%" PRIuPTR ",", (uintptr_t)p); + rte_tel_data_add_array_ptr(child_data2, p); + } + + sprintf(exp + i - 2, "]}}"); + rte_tel_data_add_dict_container(&response_data, "dict_0", + child_data, 0); + rte_tel_data_add_dict_container(&response_data, "dict_1", + child_data2, 0); + + return TEST_OUTPUT(exp); +} + +static int +test_array_with_array_ptr_values(void) +{ + int *p, i, j, a[] = {1, 2, 3, 4, 5}; + char exp[256]; + + struct rte_tel_data *child_data = rte_tel_data_alloc(); + rte_tel_data_start_array(child_data, RTE_TEL_PTR_VAL); + + struct rte_tel_data *child_data2 = rte_tel_data_alloc(); + rte_tel_data_start_array(child_data2, RTE_TEL_PTR_VAL); + + memset(&response_data, 0, sizeof(response_data)); + memset(exp, 0, sizeof(exp)); + rte_tel_data_start_array(&response_data, RTE_TEL_CONTAINER); + + i = sprintf(exp, "{\"/test\":[["); + for (j = 0; j < 5; j++) { + p = &a[j]; + i += sprintf(exp + i, "%" PRIuPTR ",", (uintptr_t)p); + rte_tel_data_add_array_ptr(child_data, p); + } + + i += sprintf(exp + i - 1, "],["); + for (j = 5; j > 0; j--) { + p = &a[j - 1]; + i += sprintf(exp + i - 1, "%" PRIuPTR ",", (uintptr_t)p); + rte_tel_data_add_array_ptr(child_data2, p); + } + + sprintf(exp + i - 2, "]]}"); + + rte_tel_data_add_array_container(&response_data, child_data, 0); + rte_tel_data_add_array_container(&response_data, child_data2, 0); + + return TEST_OUTPUT(exp); +} + static int connect_to_socket(void) { @@ -350,13 +471,17 @@ test_telemetry_data(void) test_case test_cases[] = {test_case_array_string, test_case_array_int, test_case_array_u64, + test_case_array_ptr, test_case_add_dict_int, 
test_case_add_dict_u64, + test_case_add_dict_ptr, test_ca
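The tests above build the expected JSON by appending "value," for every element and then overwriting the last comma with the closing bracket. A stripped-down sketch of that idiom, using plain ints instead of pointers so the output is predictable; format_int_array is a hypothetical name and not part of the test suite:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: formats vals[0..n-1] as "[v0,v1,...]" into out.
 * Assumes n > 0 and that out is large enough for the result.
 */
static void
format_int_array(char *out, const int *vals, int n)
{
	int used, j;

	assert(n > 0);
	used = sprintf(out, "[");
	for (j = 0; j < n; j++)
		used += sprintf(out + used, "%d,", vals[j]);
	/* out[used - 1] is the trailing comma; overwrite it with "]". */
	sprintf(out + used - 1, "]");
}
```

The nested-array tests follow the same pattern, except that after the first overwrite their running index stays one past the real length, which is why later writes there use "exp + i - 1" and the final one "exp + i - 2".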
[dpdk-dev] [v3, 3/3] common/cnxk: add telemetry endpoints to npa
Add telemetry endpoints to npa. Signed-off-by: Gowrishankar Muthukrishnan --- drivers/common/cnxk/cnxk_telemetry.h | 26 +++ drivers/common/cnxk/cnxk_telemetry_npa.c | 227 +++ drivers/common/cnxk/meson.build | 4 + drivers/common/cnxk/roc_platform.h | 8 + 4 files changed, 265 insertions(+) create mode 100644 drivers/common/cnxk/cnxk_telemetry.h create mode 100644 drivers/common/cnxk/cnxk_telemetry_npa.c diff --git a/drivers/common/cnxk/cnxk_telemetry.h b/drivers/common/cnxk/cnxk_telemetry.h new file mode 100644 index 00..1461fd893f --- /dev/null +++ b/drivers/common/cnxk/cnxk_telemetry.h @@ -0,0 +1,26 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2021 Marvell. + */ + +#ifndef __CNXK_TELEMETRY_H_ +#define __CNXK_TELEMETRY_H_ + +#define CNXK_TEL_STR(s) #s +#define CNXK_TEL_STR_PREFIX(s, p) CNXK_TEL_STR(p##s) +#define CNXK_TEL_DICT_INT(d, p, s, ...) \ + plt_tel_data_add_dict_int(d, CNXK_TEL_STR_PREFIX(s, __VA_ARGS__), \ + (p)->s) +#define CNXK_TEL_DICT_PTR(d, p, s, ...) \ + plt_tel_data_add_dict_ptr(d, CNXK_TEL_STR_PREFIX(s, __VA_ARGS__), \ + (void *)(p)->s) +#define CNXK_TEL_DICT_BF_PTR(d, p, s, ...) \ + plt_tel_data_add_dict_ptr(d, CNXK_TEL_STR_PREFIX(s, __VA_ARGS__), \ + (void *)(uint64_t)(p)->s) +#define CNXK_TEL_DICT_U64(d, p, s, ...) \ + plt_tel_data_add_dict_u64(d, CNXK_TEL_STR_PREFIX(s, __VA_ARGS__), \ + (p)->s) +#define CNXK_TEL_DICT_STR(d, p, s, ...) \ + plt_tel_data_add_dict_string(d, CNXK_TEL_STR_PREFIX(s, __VA_ARGS__), \ +(p)->s) + +#endif /* __CNXK_TELEMETRY_H_ */ diff --git a/drivers/common/cnxk/cnxk_telemetry_npa.c b/drivers/common/cnxk/cnxk_telemetry_npa.c new file mode 100644 index 00..1c2c2cd106 --- /dev/null +++ b/drivers/common/cnxk/cnxk_telemetry_npa.c @@ -0,0 +1,227 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(C) 2021 Marvell. 
+ */ + +#include "cnxk_telemetry.h" +#include "roc_api.h" +#include "roc_priv.h" + +#include + +static int +cnxk_tel_npa(struct plt_tel_data *d) +{ + struct npa_lf *lf; + int aura_cnt = 0; + uint32_t i; + + lf = idev_npa_obj_get(); + if (lf == NULL) + return NPA_ERR_DEVICE_NOT_BOUNDED; + + for (i = 0; i < lf->nr_pools; i++) { + if (plt_bitmap_get(lf->npa_bmp, i)) + continue; + aura_cnt++; + } + + plt_tel_data_add_dict_ptr(d, "npa", lf); + plt_tel_data_add_dict_int(d, "pf", dev_get_pf(lf->pf_func)); + plt_tel_data_add_dict_int(d, "vf", dev_get_vf(lf->pf_func)); + plt_tel_data_add_dict_int(d, "aura_cnt", aura_cnt); + + CNXK_TEL_DICT_PTR(d, lf, pci_dev); + CNXK_TEL_DICT_PTR(d, lf, npa_bmp); + CNXK_TEL_DICT_PTR(d, lf, npa_bmp_mem); + CNXK_TEL_DICT_PTR(d, lf, npa_qint_mem); + CNXK_TEL_DICT_PTR(d, lf, intr_handle); + CNXK_TEL_DICT_PTR(d, lf, mbox); + CNXK_TEL_DICT_PTR(d, lf, base); + CNXK_TEL_DICT_INT(d, lf, stack_pg_ptrs); + CNXK_TEL_DICT_INT(d, lf, stack_pg_bytes); + CNXK_TEL_DICT_INT(d, lf, npa_msixoff); + CNXK_TEL_DICT_INT(d, lf, nr_pools); + CNXK_TEL_DICT_INT(d, lf, pf_func); + CNXK_TEL_DICT_INT(d, lf, aura_sz); + CNXK_TEL_DICT_INT(d, lf, qints); + + return 0; +} + +static int +cnxk_tel_npa_aura(int aura_id, struct plt_tel_data *d) +{ + __io struct npa_aura_s *aura; + struct npa_aq_enq_req *req; + struct npa_aq_enq_rsp *rsp; + struct npa_lf *lf; + int rc; + + lf = idev_npa_obj_get(); + if (lf == NULL) + return NPA_ERR_DEVICE_NOT_BOUNDED; + + if (rte_bitmap_get(lf->npa_bmp, aura_id)) + return -1; + + req = mbox_alloc_msg_npa_aq_enq(lf->mbox); + if (!req) { + plt_err("Failed to alloc aq enq for npa"); + return -1; + } + + req->aura_id = aura_id; + req->ctype = NPA_AQ_CTYPE_AURA; + req->op = NPA_AQ_INSTOP_READ; + + rc = mbox_process_msg(lf->mbox, (void *)&rsp); + if (rc) { + plt_err("Failed to get pool(%d) context", aura_id); + return rc; + } + + aura = &rsp->aura; + CNXK_TEL_DICT_PTR(d, aura, pool_addr, w0_); + CNXK_TEL_DICT_INT(d, aura, ena, w1_); + 
CNXK_TEL_DICT_INT(d, aura, pool_caching, w1_); + CNXK_TEL_DICT_INT(d, aura, pool_way_mask, w1_); + CNXK_TEL_DICT_INT(d, aura, avg_con, w1_); + CNXK_TEL_DICT_INT(d, aura, pool_drop_ena, w1_); + CNXK_TEL_DICT_INT(d, aura, aura_drop_ena, w1_); + CNXK_T
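The optional trailing argument of the CNXK_TEL_DICT_* macros becomes a prefix of the dictionary key through token pasting followed by stringification. The sketch below repeats the two helper macros from cnxk_telemetry.h and shows the resulting key strings; the functions key_with_prefix, key_without_prefix and key_example are hypothetical and exist only to make the expansion visible.

```c
#include <assert.h>
#include <string.h>

/* Same two helper macros as in cnxk_telemetry.h in the patch. */
#define CNXK_TEL_STR(s) #s
#define CNXK_TEL_STR_PREFIX(s, p) CNXK_TEL_STR(p##s)

/* Prefix given: p##s pastes w1_ and ena, then #s turns it into a string. */
static const char *
key_with_prefix(void)
{
	return CNXK_TEL_STR_PREFIX(ena, w1_);
}

/* Empty prefix argument: the paste leaves the field name unchanged. */
static const char *
key_without_prefix(void)
{
	return CNXK_TEL_STR_PREFIX(nr_pools, );
}

static void
key_example(void)
{
	assert(strcmp(key_with_prefix(), "w1_ena") == 0);
	assert(strcmp(key_without_prefix(), "nr_pools") == 0);
}
```

So CNXK_TEL_DICT_INT(d, aura, ena, w1_) would store aura->ena under the key "w1_ena", while CNXK_TEL_DICT_INT(d, lf, nr_pools) uses the bare field name.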
Re: [dpdk-dev] [PATCH] doc: announce removal of ABIs in PCI bus driver
03/08/2021 03:52, Xia, Chenbo: > Hi Thomas, > > From: Thomas Monjalon > > 27/07/2021 10:44, Bruce Richardson: > > > On Mon, Jul 26, 2021 at 05:56:17AM +, Xia, Chenbo wrote: > > > > From: Yigit, Ferruh > > > > > On 7/23/2021 8:39 AM, Xia, Chenbo wrote: > > > > > > From: dev On Behalf Of Chenbo Xia > > > > > >> +* pci: To reduce unnecessary ABIs exposed by DPDK bus driver, > > > > > "rte_bus_pci.h" > > > > > >> + will be made internal in 21.11 and macros/data > > structures/functions > > > > > defined > > > > > >> + in the header will not be considered as ABI anymore. This change > > is > > > > > >> inspired > > > > > >> + by the RFC > > > > > https://patchwork.dpdk.org/project/dpdk/list/?series=17176. > > > > > > > > > > > > I see there's some ABI improvement work on-going and I think it > > > > > > could > > be > > > > > part of > > > > > > the work. If it makes sense to you, I'd like some ACKs. > > > > > > > > > > > > > > > > Acked-by: Ferruh Yigit > > > > > > > > > > I am for reducing the public ABI as much as possible. How big will the > > > > > change > > > > > be? Is the 'rte_bus_pci.h' used other than './drivers/bus/pci/'? > > > > > > > > I don't see big change here. And I am not sure if I understand your > > > > second > > > > question. The rte_bus_pci.h will still be used by drivers (maybe remove > > the > > > > rte prefix and change the file name). > > > > > > > The file itself will still be exported in some cases, where the end-user > > > has their own drivers which need to be compiled, so I'd recommend keeping > > > the rte_ prefix. However, I think making all bus APIs internal-only to > > > DPDK > > > is a good idea. > > > > I don't understand how it can exported _and_ internal. > > I think we can use the meson option 'enable_driver_sdk'. The first use case > is in > lib ethdev for exporting internal APIs for out-of-tree drivers. 
> For pci bus, I think the use case is similar: users who want to build
> out-of-tree drivers can set the option true to export pci header but the
> structs/functions are marked internal. Make sense to you?

I understand the intent.
You are saying an out-of-tree driver is considered internal.
Let's see how it works for real.
Re: [dpdk-dev] [PATCH v2] doc: announce changes to eventdev library
On 2021-08-03 06:12, Jerin Jacob wrote:
> On Tue, Aug 3, 2021 at 2:46 AM wrote:
>> From: Pavan Nikhilesh
>>
>> Make driver layer as internal, remove unnecessary rte_ prefix for
>> structures and functions that are not a part of public API.
>> Promote experimental trace and vector APIs to stable.
>> Add reserved field to `rte_event_timer` structure.
>>
>> Signed-off-by: Pavan Nikhilesh
> Acked-by: Jerin Jacob
>
>
> ++ Eventdev driver Maintainers.
>
> This list is based on items identified for 21.11 ABI improvement at
> https://protect2.fireeye.com/v1/url?k=bb3a87ff-e4a1bf2d-bb3ac764-866132fe445e-d427d33ed389149e&q=1&e=db41f48a-6628-48aa-93d1-3190b8a53257&u=https%3A%2F%2Fdocs.google.com%2Fspreadsheets%2Fd%2F1betlC000ua5SsSiJIcC54mCCCJnW6voH5Dqv9UxeyfE%2Fedit%23gid%3D0
>

Acked-by: Mattias Rönnblom

>> ---
>> v2 Changes:
>> - Fix build issues.
>>
>>  doc/guides/rel_notes/deprecation.rst | 11 +++
>>  1 file changed, 11 insertions(+)
>>
>> diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
>> index d9c0e65921..6ac321eb1e 100644
>> --- a/doc/guides/rel_notes/deprecation.rst
>> +++ b/doc/guides/rel_notes/deprecation.rst
>> @@ -158,3 +158,14 @@ Deprecation Notices
>>  * security: The functions ``rte_security_set_pkt_metadata`` and
>>    ``rte_security_get_userdata`` will be made inline functions and additional
>>    flags will be added in structure ``rte_security_ctx`` in DPDK 21.11.
>> +
>> +* eventdev: The file ``rte_eventdev_pmd.h`` will be renamed to ``eventdev_driver.h``
>> +  to make the driver interface as internal and the structures ``rte_eventdev_data``,
>> +  ``rte_eventdev`` and ``rte_eventdevs`` will be moved to a new file named
>> +  ``rte_eventdev_core.h`` in DPDK 21.11.
>> +  The ``rte_`` prefix for internal structures and functions will be removed across the
>> +  library.
>> +  The experimental eventdev trace APIs and ``rte_event_vector_pool_create``,
>> +  ``rte_event_eth_rx_adapter_vector_limits_get`` will be promoted to stable.
>> +  An 8byte reserved field will be added to the structure ``rte_event_timer`` to
>> +  support future extensions.
>> --
>> 2.17.1
>>
[dpdk-dev] nvgre inner rss problem in mlx5
Hi NVIDIA team,

I tested vxlan encap offload on upstream DPDK with dpdk-testpmd.

# lspci | grep Ether
19:00.0 Ethernet controller: Mellanox Technologies MT27800 Family [ConnectX-5]
19:00.1 Ethernet controller: Mellanox Technologies MT27800 Family [ConnectX-5]

The firmware version is 16.31.1014:

# ethtool -i net2
driver: mlx5_core
version: 5.13.0-rc3+
firmware-version: 16.31.1014 (MT_80)
expansion-rom-version:
bus-info: :19:00.0

Start the eswitch:

echo 0 > /sys/class/net/net2/device/sriov_numvfs
echo 1 > /sys/class/net/net2/device/sriov_numvfs
echo :19:00.2 > /sys/bus/pci/drivers/mlx5_core/unbind
devlink dev eswitch set pci/:19:00.0 mode switchdev
echo :19:00.2 > /sys/bus/pci/drivers/mlx5_core/bind

ip link shows:

4: net2: mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 1c:34:da:77:fb:d8 brd ff:ff:ff:ff:ff:ff
    vf 0 MAC 4e:41:8f:92:41:44, spoof checking off, link-state disable, trust off, query_rss off
    vf 1 MAC 00:00:00:00:00:00, spoof checking off, link-state disable, trust off, query_rss off
8: pf0vf0: mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 4e:41:8f:92:41:44 brd ff:ff:ff:ff:ff:ff
10: eth0: mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 46:87:9e:9e:c8:23 brd ff:ff:ff:ff:ff:ff

net2 is the PF, pf0vf0 is the VF representor, and eth0 is the VF.

Start the PMD:

./dpdk-testpmd -c 0x1f -n 4 -m 4096 --file-prefix=ovs -a ":19:00.0,representor=pf0vf0,dv_flow_en=1,dv_esw_en=1,dv_xmeta_en=1" --huge-dir=/mnt/ovsdpdk -- -i --flow-isolate-all --forward-mode=rxonly --rxq=4 --txq=4 --auto-start --nb-cores=4

testpmd> set vxlan ip-version ipv4 vni 1000 udp-src 0 udp-dst 4789 ip-src 172.168.152.50 ip-dst 172.168.152.73 eth-src 1c:34:da:77:fb:d8 eth-dst 3c:fd:fe:bb:1c:0c
testpmd> flow create 1 ingress priority 0 group 0 transfer pattern eth src is 46:87:9e:9e:c8:23 dst is 5a:9e:0f:74:6c:5e type is 0x0800 / ipv4 tos spec 0x0 tos mask 0x3 / end actions count / vxlan_encap / port_id original 0 id 0 / end
port_flow_complain(): Caught PMD error type 16 (specific action): port does not belong to E-Switch being configured: Invalid argument

Adding the rule fails with "port does not belong to E-Switch being configured".

I checked the DPDK code. In the function flow_dv_validate_action_port_id:

	if (act_priv->domain_id != dev_priv->domain_id)
		return rte_flow_error_set
				(error, EINVAL,
				 RTE_FLOW_ERROR_TYPE_ACTION, NULL,
				 "port does not belong to"
				 " E-Switch being configured");

The domain_id of the VF representor is not the same as the domain_id of the PF.

Looking at mlx5_dev_spawn, the domain_id values of the VF representor and the PF will always be different:

mlx5_dev_spawn
	/*
	 * Look for sibling devices in order to reuse their switch domain
	 * if any, otherwise allocate one.
	 */
	MLX5_ETH_FOREACH_DEV(port_id, NULL) {
		const struct mlx5_priv *opriv =
			rte_eth_devices[port_id].data->dev_private;

		if (!opriv ||
		    opriv->sh != priv->sh ||
		    opriv->domain_id ==
		    RTE_ETH_DEV_SWITCH_DOMAIN_ID_INVALID)
			continue;
		priv->domain_id = opriv->domain_id;
		break;
	}
	if (priv->domain_id == RTE_ETH_DEV_SWITCH_DOMAIN_ID_INVALID) {
		err = rte_eth_switch_domain_alloc(&priv->domain_id);

MLX5_ETH_FOREACH_DEV will never return the PF eth_dev:

mlx5_eth_find_next(uint16_t port_id, struct rte_device *odev)
{
	while (port_id < RTE_MAX_ETHPORTS) {
		struct rte_eth_dev *dev = &rte_eth_devices[port_id];

		if (dev->state != RTE_ETH_DEV_UNUSED &&
		    dev->device &&
		    (dev->device == odev ||
		     (dev->device->driver &&
		      dev->device->driver->name &&
		      ((strcmp(dev->device->driver->name,
			       MLX5_PCI_DRIVER_NAME) == 0) ||
		       (strcmp(dev->device->driver->name,
			       MLX5_AUXILIARY_DRIVER_NAME) == 0)

Although the state of the PF eth_dev is ATTACHED, its driver is not set yet. The driver is only set in rte_pci_probe_one_driver, once all ports on the same device have been probed. So at this moment the VF representor will never find the PF, which leads the VF representor to choose another domain_id.

So in this case the pci_driver should be passed down to mlx5_driver_probe (mlx5_os_pci_probe).

BR
wenxu
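The failure mode described above can be reproduced in a small model. This is a simplified sketch, not mlx5 code: struct port, pick_domain, DOMAIN_INVALID and next_domain are hypothetical stand-ins for the private data, the sibling lookup, RTE_ETH_DEV_SWITCH_DOMAIN_ID_INVALID and rte_eth_switch_domain_alloc(), and the "visible" flag models whether the iterator returns the port at all.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DOMAIN_INVALID UINT16_MAX /* stand-in for the invalid domain id */

/* Hypothetical, simplified port record; not the mlx5 private data. */
struct port {
	int shared_ctx;     /* stands in for priv->sh */
	int visible;        /* nonzero if the iterator returns this port */
	uint16_t domain_id; /* switch domain of the port */
};

static uint16_t next_domain; /* stands in for the domain allocator */

/* Reuse a visible sibling's domain, otherwise allocate a new one. */
static uint16_t
pick_domain(const struct port *ports, size_t n, int shared_ctx)
{
	size_t i;

	assert(ports != NULL);
	for (i = 0; i < n; i++) {
		if (!ports[i].visible ||
		    ports[i].shared_ctx != shared_ctx ||
		    ports[i].domain_id == DOMAIN_INVALID)
			continue;
		return ports[i].domain_id;
	}
	return next_domain++;
}
```

With a PF that holds domain 0 but is skipped by the iterator (visible == 0), the representor gets a freshly allocated domain, so a later comparison of the two domain_id values fails. Once the PF is visible, the representor reuses domain 0.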
[dpdk-dev] [PATCH 00/22] backport feature support to DPDK 20.11
The patches below are backports of features from DPDK 21.02 and DPDK 21.05.
They are not for LTS upstream, just for customers to cherry-pick.

The features include:
1. support RSS hash for IP fragment.
2. enable QinQ filter for switch.

Haiyue Wang (4):
  net/ice/base: do not set VLAN mode in DCF mode
  net/ice: fix VLAN strip for double VLAN
  net/ice: fix VLAN 0 adding based on VLAN mode
  net/ice: update QinQ switch filter handling

Junfeng Guo (1):
  net/ice: enable QinQ filter for switch

Qi Zhang (13):
  net/ice/base: align add VSI and update VSI AQ command buffer
  net/ice/base: add interface to support configuring VLAN mode
  net/ice/base: fix outer VLAN related macro
  net/ice/base: add VLAN TPID for VLAN filters
  net/ice/base: support checking double VLAN mode
  net/ice/base: support configuring device in double VLAN mode
  net/ice/base: update boost TCAM for DVM
  net/ice/base: change protocol ID for VLAN in DVM
  net/ice/base: refactor post DDP download VLAN mode config
  net/ice/base: log if DDP/FW do not support QinQ
  net/ice/base: add inner VLAN protocol type for QinQ filter
  net/ice/base: fix QinQ PPPoE dummy packet selection
  net/ice/base: add priority check of matching recipe

Ting Xu (1):
  net/ice/base: fix wrong ptype bitmap for IP fragment

Wenjun Wu (1):
  net/ice: support RSS hash for IP fragment

Yuying Zhang (2):
  net/ice/base: add ethertype offset for QinQ dummy packet
  net/ice: support flow priority for DCF switch filter

 drivers/net/ice/base/ice_adminq_cmd.h    | 268 -
 drivers/net/ice/base/ice_bitops.h        |  45 +++
 drivers/net/ice/base/ice_common.c        |  38 ++
 drivers/net/ice/base/ice_common.h        |   4 +
 drivers/net/ice/base/ice_flex_pipe.c     | 302 +--
 drivers/net/ice/base/ice_flex_pipe.h     |  12 +
 drivers/net/ice/base/ice_flex_type.h     |  39 ++
 drivers/net/ice/base/ice_flow.c          |  87 -
 drivers/net/ice/base/ice_flow.h          |   5 +-
 drivers/net/ice/base/ice_protocol_type.h |   1 +
 drivers/net/ice/base/ice_switch.c        | 133 ++-
 drivers/net/ice/base/ice_switch.h        |  15 +
 drivers/net/ice/base/ice_type.h          |   4 +
 drivers/net/ice/base/ice_vlan_mode.c     | 451 ++
 drivers/net/ice/base/ice_vlan_mode.h     |  16 +
 drivers/net/ice/base/meson.build         |   1 +
 drivers/net/ice/ice_acl_filter.c         |   1 +
 drivers/net/ice/ice_ethdev.c             | 455 +--
 drivers/net/ice/ice_ethdev.h             |  10 +-
 drivers/net/ice/ice_fdir_filter.c        |   1 +
 drivers/net/ice/ice_generic_flow.c       |  51 ++-
 drivers/net/ice/ice_generic_flow.h       |   9 +
 drivers/net/ice/ice_hash.c               |  39 +-
 drivers/net/ice/ice_switch_filter.c      | 128 ++-
 24 files changed, 1714 insertions(+), 401 deletions(-)
 create mode 100644 drivers/net/ice/base/ice_vlan_mode.c
 create mode 100644 drivers/net/ice/base/ice_vlan_mode.h

--
2.25.1
[dpdk-dev] [PATCH 01/22] net/ice: support RSS hash for IP fragment
[ upstream commit f1ea76eb63944a65e9e0bbc32244bc7c8b4fbd1d ] [ upstream commit 664b8eb745b9b6249231cea2f2bc6ff4d4b6bc40 ] [ upstream commit 8434528175614f4cc8ab25fd28560848d8999605 ] New pattern and RSS hash flow parsing are added to handle fragmented IPv4/IPv6 packet. This patch is not for LTS upstream, just for customer to cherry-pick. Signed-off-by: Jeff Guo Signed-off-by: Ting Xu Signed-off-by: Qi Zhang Signed-off-by: Wenjun Wu Acked-by: Qi Zhang --- drivers/net/ice/base/ice_flow.c| 51 +- drivers/net/ice/base/ice_flow.h| 5 ++- drivers/net/ice/ice_generic_flow.c | 25 +++ drivers/net/ice/ice_generic_flow.h | 7 drivers/net/ice/ice_hash.c | 37 -- 5 files changed, 120 insertions(+), 5 deletions(-) diff --git a/drivers/net/ice/base/ice_flow.c b/drivers/net/ice/base/ice_flow.c index c75f58659c..049e2f0c26 100644 --- a/drivers/net/ice/base/ice_flow.c +++ b/drivers/net/ice/base/ice_flow.c @@ -13,6 +13,8 @@ #define ICE_FLOW_FLD_SZ_IPV6_PRE32_ADDR4 #define ICE_FLOW_FLD_SZ_IPV6_PRE48_ADDR6 #define ICE_FLOW_FLD_SZ_IPV6_PRE64_ADDR8 +#define ICE_FLOW_FLD_SZ_IPV4_ID2 +#define ICE_FLOW_FLD_SZ_IPV6_ID4 #define ICE_FLOW_FLD_SZ_IP_DSCP1 #define ICE_FLOW_FLD_SZ_IP_TTL 1 #define ICE_FLOW_FLD_SZ_IP_PROT1 @@ -94,6 +96,12 @@ struct ice_flow_field_info ice_flds_info[ICE_FLOW_FIELD_IDX_MAX] = { ICE_FLOW_FLD_INFO(ICE_FLOW_SEG_HDR_IPV6, 8, ICE_FLOW_FLD_SZ_IPV6_ADDR), /* ICE_FLOW_FIELD_IDX_IPV6_DA */ ICE_FLOW_FLD_INFO(ICE_FLOW_SEG_HDR_IPV6, 24, ICE_FLOW_FLD_SZ_IPV6_ADDR), + /* ICE_FLOW_FIELD_IDX_IPV4_FRAG */ + ICE_FLOW_FLD_INFO(ICE_FLOW_SEG_HDR_IPV_FRAG, 4, + ICE_FLOW_FLD_SZ_IPV4_ID), + /* ICE_FLOW_FIELD_IDX_IPV6_FRAG */ + ICE_FLOW_FLD_INFO(ICE_FLOW_SEG_HDR_IPV_FRAG, 4, + ICE_FLOW_FLD_SZ_IPV6_ID), /* ICE_FLOW_FIELD_IDX_IPV6_PRE32_SA */ ICE_FLOW_FLD_INFO(ICE_FLOW_SEG_HDR_IPV6, 8, ICE_FLOW_FLD_SZ_IPV6_PRE32_ADDR), @@ -468,6 +476,28 @@ static const u32 ice_ptypes_gtpc_tid[] = { 0x, 0x, 0x, 0x, }; +static const u32 ice_ptypes_ipv4_frag[] = { + 0x0040, 0x, 0x, 0x, + 0x, 0x, 0x, 0x, + 0x, 0x, 0x, 
0x, + 0x, 0x, 0x, 0x, + 0x, 0x, 0x, 0x, + 0x, 0x, 0x, 0x, + 0x, 0x, 0x, 0x, + 0x, 0x, 0x, 0x, +}; + +static const u32 ice_ptypes_ipv6_frag[] = { + 0x, 0x, 0x0100, 0x, + 0x, 0x, 0x, 0x, + 0x, 0x, 0x, 0x, + 0x, 0x, 0x, 0x, + 0x, 0x, 0x, 0x, + 0x, 0x, 0x, 0x, + 0x, 0x, 0x, 0x, + 0x, 0x, 0x, 0x, +}; + /* Packet types for GTPU */ static const struct ice_ptype_attributes ice_attr_gtpu_session[] = { { ICE_MAC_IPV4_GTPU_IPV4_FRAG,ICE_PTYPE_ATTR_GTP_SESSION }, @@ -851,6 +881,16 @@ ice_flow_proc_seg_hdrs(struct ice_flow_prof_params *params) (const ice_bitmap_t *)ice_ptypes_ipv6_ofos_all; ice_and_bitmap(params->ptypes, params->ptypes, src, ICE_FLOW_PTYPE_MAX); + } else if ((hdrs & ICE_FLOW_SEG_HDR_IPV4) && + (hdrs & ICE_FLOW_SEG_HDR_IPV_FRAG)) { + src = (const ice_bitmap_t *)ice_ptypes_ipv4_frag; + ice_and_bitmap(params->ptypes, params->ptypes, src, + ICE_FLOW_PTYPE_MAX); + } else if ((hdrs & ICE_FLOW_SEG_HDR_IPV6) && + (hdrs & ICE_FLOW_SEG_HDR_IPV_FRAG)) { + src = (const ice_bitmap_t *)ice_ptypes_ipv6_frag; + ice_and_bitmap(params->ptypes, params->ptypes, src, + ICE_FLOW_PTYPE_MAX); } else if ((hdrs & ICE_FLOW_SEG_HDR_IPV4) && !(hdrs & ICE_FLOW_SEG_HDRS_L4_MASK_NO_OTHER)) { src = !i ? (const ice_bitmap_t *)ice_ptypes_ipv4_ofos_no_l4 : @@ -1121,6 +1161,9 @@ ice_flow_xtract_fld(struct ice_hw *hw, struct ice_flow_prof_params *params, case ICE_FLOW_FIELD_IDX_IPV4_DA: prot_id = seg == 0 ? ICE_PROT_IPV4_OF_OR_S : ICE_PROT_IPV4_IL; break; + case ICE_FLOW_FIELD_IDX_IPV4_ID: + prot_id = ICE_PROT_IPV4_OF_OR_S; + break;
[dpdk-dev] [PATCH 03/22] net/ice/base: add interface to support configuring VLAN mode
From: Qi Zhang [ upstream commit 4e4dc21e450b5860da650d255abb6e17c3e637a9 ] The VLAN mode of the device has to be configured while the global configuration lock is held while downloading the DDP, specifically after the DDP has been downloaded. In order to support this a VLAN mode interface was added. By default the device will stay in single VLAN mode (SVM), which is the current implementation. However, this can be changed by implementing the .set_dvm op. Signed-off-by: Brett Creeley Signed-off-by: Qi Zhang Acked-by: Qiming Yang --- drivers/net/ice/base/ice_adminq_cmd.h | 23 ++ drivers/net/ice/base/ice_common.c | 40 drivers/net/ice/base/ice_common.h | 4 +++ drivers/net/ice/base/ice_flex_pipe.c | 7 + drivers/net/ice/base/ice_type.h | 2 ++ drivers/net/ice/base/ice_vlan_mode.c | 44 +++ drivers/net/ice/base/ice_vlan_mode.h | 32 +++ drivers/net/ice/base/meson.build | 1 + 8 files changed, 153 insertions(+) create mode 100644 drivers/net/ice/base/ice_vlan_mode.c create mode 100644 drivers/net/ice/base/ice_vlan_mode.h diff --git a/drivers/net/ice/base/ice_adminq_cmd.h b/drivers/net/ice/base/ice_adminq_cmd.h index 91d360be62..eebafee7c7 100644 --- a/drivers/net/ice/base/ice_adminq_cmd.h +++ b/drivers/net/ice/base/ice_adminq_cmd.h @@ -227,6 +227,27 @@ struct ice_aqc_get_sw_cfg_resp_elem { #define ICE_AQC_GET_SW_CONF_RESP_IS_VF BIT(15) }; +/* Set Port parameters, (direct, 0x0203) */ +struct ice_aqc_set_port_params { + __le16 cmd_flags; +#define ICE_AQC_SET_P_PARAMS_SAVE_BAD_PACKETS BIT(0) +#define ICE_AQC_SET_P_PARAMS_PAD_SHORT_PACKETS BIT(1) +#define ICE_AQC_SET_P_PARAMS_DOUBLE_VLAN_ENA BIT(2) + __le16 bad_frame_vsi; +#define ICE_AQC_SET_P_PARAMS_VSI_S 0 +#define ICE_AQC_SET_P_PARAMS_VSI_M (0x3FF << ICE_AQC_SET_P_PARAMS_VSI_S) +#define ICE_AQC_SET_P_PARAMS_VSI_VALID BIT(15) + __le16 swid; +#define ICE_AQC_SET_P_PARAMS_SWID_S0 +#define ICE_AQC_SET_P_PARAMS_SWID_M(0xFF << ICE_AQC_SET_P_PARAMS_SWID_S) +#define ICE_AQC_SET_P_PARAMS_LOGI_PORT_ID_S8 +#define 
ICE_AQC_SET_P_PARAMS_LOGI_PORT_ID_M\ + (0x3F << ICE_AQC_SET_P_PARAMS_LOGI_PORT_ID_S) +#define ICE_AQC_SET_P_PARAMS_IS_LOGI_PORT BIT(14) +#define ICE_AQC_SET_P_PARAMS_SWID_VALIDBIT(15) + u8 reserved[10]; +}; + /* These resource type defines are used for all switch resource * commands where a resource type is required, such as: * Get Resource Allocation command (indirect 0x0204) @@ -2713,6 +2734,7 @@ struct ice_aq_desc { struct ice_aqc_sff_eeprom read_write_sff_param; struct ice_aqc_set_port_id_led set_port_id_led; struct ice_aqc_get_sw_cfg get_sw_conf; + struct ice_aqc_set_port_params set_port_params; struct ice_aqc_sw_rules sw_rules; struct ice_aqc_storm_cfg storm_conf; struct ice_aqc_add_get_recipe add_get_recipe; @@ -2876,6 +2898,7 @@ enum ice_adminq_opc { /* internal switch commands */ ice_aqc_opc_get_sw_cfg = 0x0200, + ice_aqc_opc_set_port_params = 0x0203, /* Alloc/Free/Get Resources */ ice_aqc_opc_get_res_alloc = 0x0204, diff --git a/drivers/net/ice/base/ice_common.c b/drivers/net/ice/base/ice_common.c index 304e55e210..b5f1d8cce5 100644 --- a/drivers/net/ice/base/ice_common.c +++ b/drivers/net/ice/base/ice_common.c @@ -830,6 +830,9 @@ enum ice_status ice_init_hw(struct ice_hw *hw) if (status) goto err_unroll_fltr_mgmt_struct; ice_init_lock(&hw->tnl_lock); + + ice_init_vlan_mode_ops(hw); + return ICE_SUCCESS; err_unroll_fltr_mgmt_struct: @@ -2370,6 +2373,43 @@ void ice_clear_pxe_mode(struct ice_hw *hw) ice_aq_clear_pxe_mode(hw); } +/** + * ice_aq_set_port_params - set physical port parameters. 
+ * @pi: pointer to the port info struct + * @bad_frame_vsi: defines the VSI to which bad frames are forwarded + * @save_bad_pac: if set packets with errors are forwarded to the bad frames VSI + * @pad_short_pac: if set transmit packets smaller than 60 bytes are padded + * @double_vlan: if set double VLAN is enabled + * @cd: pointer to command details structure or NULL + * + * Set Physical port parameters (0x0203) + */ +enum ice_status +ice_aq_set_port_params(struct ice_port_info *pi, u16 bad_frame_vsi, + bool save_bad_pac, bool pad_short_pac, bool double_vlan, + struct ice_sq_cd *cd) + +{ + struct ice_aqc_set_port_params *cmd; + struct ice_hw *hw = pi->hw; + struct ice_aq_desc desc; + u16 cmd_flags = 0; + + cmd = &desc.params.set_port_params; + + ice_fill_dflt_direct_cmd_desc(&desc, ice_aqc_opc_set_port_params); + cmd->bad_frame_vsi = CPU_TO_LE16(bad_frame_vsi); +
[dpdk-dev] [PATCH 02/22] net/ice/base: align add VSI and update VSI AQ command buffer
From: Qi Zhang

[ upstream commit 9ea028123a0bef9f6bbf5dd1a5250b9bfa63c1ea ]

Aligned the buffers of the following admin commands to their new
definitions:
* 0x210 = add_vsi
* 0x211 = update_vsi

Signed-off-by: Shay Amir
Signed-off-by: Qi Zhang
---
 drivers/net/ice/base/ice_adminq_cmd.h | 209 +-
 drivers/net/ice/ice_ethdev.c          |  88 +--
 2 files changed, 152 insertions(+), 145 deletions(-)

diff --git a/drivers/net/ice/base/ice_adminq_cmd.h b/drivers/net/ice/base/ice_adminq_cmd.h
index f715fb0910..91d360be62 100644
--- a/drivers/net/ice/base/ice_adminq_cmd.h
+++ b/drivers/net/ice/base/ice_adminq_cmd.h
@@ -411,144 +411,151 @@ struct ice_aqc_vsi_props {
 #define ICE_AQ_VSI_SW_FLAG_SRC_PRUNE	BIT(7)
 	u8 sw_flags2;
 #define ICE_AQ_VSI_SW_FLAG_RX_PRUNE_EN_S	0
-#define ICE_AQ_VSI_SW_FLAG_RX_PRUNE_EN_M	\
-	(0xF << ICE_AQ_VSI_SW_FLAG_RX_PRUNE_EN_S)
+#define ICE_AQ_VSI_SW_FLAG_RX_PRUNE_EN_M	(0xF << ICE_AQ_VSI_SW_FLAG_RX_PRUNE_EN_S)
 #define ICE_AQ_VSI_SW_FLAG_RX_VLAN_PRUNE_ENA	BIT(0)
 #define ICE_AQ_VSI_SW_FLAG_LAN_ENA	BIT(4)
 	u8 veb_stat_id;
 #define ICE_AQ_VSI_SW_VEB_STAT_ID_S	0
-#define ICE_AQ_VSI_SW_VEB_STAT_ID_M	(0x1F << ICE_AQ_VSI_SW_VEB_STAT_ID_S)
+#define ICE_AQ_VSI_SW_VEB_STAT_ID_M	(0x1F << ICE_AQ_VSI_SW_VEB_STAT_ID_S)
 #define ICE_AQ_VSI_SW_VEB_STAT_ID_VALID	BIT(5)
 	/* security section */
 	u8 sec_flags;
 #define ICE_AQ_VSI_SEC_FLAG_ALLOW_DEST_OVRD	BIT(0)
 #define ICE_AQ_VSI_SEC_FLAG_ENA_MAC_ANTI_SPOOF	BIT(2)
-#define ICE_AQ_VSI_SEC_TX_PRUNE_ENA_S	4
-#define ICE_AQ_VSI_SEC_TX_PRUNE_ENA_M	(0xF << ICE_AQ_VSI_SEC_TX_PRUNE_ENA_S)
+#define ICE_AQ_VSI_SEC_TX_PRUNE_ENA_S	4
+#define ICE_AQ_VSI_SEC_TX_PRUNE_ENA_M	(0xF << ICE_AQ_VSI_SEC_TX_PRUNE_ENA_S)
 #define ICE_AQ_VSI_SEC_TX_VLAN_PRUNE_ENA	BIT(0)
 	u8 sec_reserved;
 	/* VLAN section */
-	__le16 pvid; /* VLANS include priority bits */
-	u8 pvlan_reserved[2];
-	u8 vlan_flags;
-#define ICE_AQ_VSI_VLAN_MODE_S	0
-#define ICE_AQ_VSI_VLAN_MODE_M	(0x3 << ICE_AQ_VSI_VLAN_MODE_S)
-#define ICE_AQ_VSI_VLAN_MODE_UNTAGGED	0x1
-#define ICE_AQ_VSI_VLAN_MODE_TAGGED	0x2
-#define ICE_AQ_VSI_VLAN_MODE_ALL 0x3 -#define ICE_AQ_VSI_PVLAN_INSERT_PVID BIT(2) -#define ICE_AQ_VSI_VLAN_EMOD_S 3 -#define ICE_AQ_VSI_VLAN_EMOD_M (0x3 << ICE_AQ_VSI_VLAN_EMOD_S) -#define ICE_AQ_VSI_VLAN_EMOD_STR_BOTH (0x0 << ICE_AQ_VSI_VLAN_EMOD_S) -#define ICE_AQ_VSI_VLAN_EMOD_STR_UP(0x1 << ICE_AQ_VSI_VLAN_EMOD_S) -#define ICE_AQ_VSI_VLAN_EMOD_STR (0x2 << ICE_AQ_VSI_VLAN_EMOD_S) -#define ICE_AQ_VSI_VLAN_EMOD_NOTHING (0x3 << ICE_AQ_VSI_VLAN_EMOD_S) - u8 pvlan_reserved2[3]; + __le16 port_based_inner_vlan; /* VLANS include priority bits */ + u8 inner_vlan_reserved[2]; + u8 inner_vlan_flags; +#define ICE_AQ_VSI_INNER_VLAN_TX_MODE_S0 +#define ICE_AQ_VSI_INNER_VLAN_TX_MODE_M(0x3 << ICE_AQ_VSI_INNER_VLAN_TX_MODE_S) +#define ICE_AQ_VSI_INNER_VLAN_TX_MODE_ACCEPTUNTAGGED 0x1 +#define ICE_AQ_VSI_INNER_VLAN_TX_MODE_ACCEPTTAGGED 0x2 +#define ICE_AQ_VSI_INNER_VLAN_TX_MODE_ALL 0x3 +#define ICE_AQ_VSI_INNER_VLAN_INSERT_PVID BIT(2) +#define ICE_AQ_VSI_INNER_VLAN_EMODE_S 3 +#define ICE_AQ_VSI_INNER_VLAN_EMODE_M (0x3 << ICE_AQ_VSI_INNER_VLAN_EMODE_S) +#define ICE_AQ_VSI_INNER_VLAN_EMODE_STR_BOTH (0x0 << ICE_AQ_VSI_INNER_VLAN_EMODE_S) +#define ICE_AQ_VSI_INNER_VLAN_EMODE_STR_UP (0x1 << ICE_AQ_VSI_INNER_VLAN_EMODE_S) +#define ICE_AQ_VSI_INNER_VLAN_EMODE_STR(0x2 << ICE_AQ_VSI_INNER_VLAN_EMODE_S) +#define ICE_AQ_VSI_INNER_VLAN_EMODE_NOTHING(0x3 << ICE_AQ_VSI_INNER_VLAN_EMODE_S) +#define ICE_AQ_VSI_INNER_VLAN_BLOCK_TX_DESCBIT(5) + u8 inner_vlan_reserved2[3]; /* ingress egress up sections */ __le32 ingress_table; /* bitmap, 3 bits per up */ -#define ICE_AQ_VSI_UP_TABLE_UP0_S 0 -#define ICE_AQ_VSI_UP_TABLE_UP0_M (0x7 << ICE_AQ_VSI_UP_TABLE_UP0_S) -#define ICE_AQ_VSI_UP_TABLE_UP1_S 3 -#define ICE_AQ_VSI_UP_TABLE_UP1_M (0x7 << ICE_AQ_VSI_UP_TABLE_UP1_S) -#define ICE_AQ_VSI_UP_TABLE_UP2_S 6 -#define ICE_AQ_VSI_UP_TABLE_UP2_M (0x7 << ICE_AQ_VSI_UP_TABLE_UP2_S) -#define ICE_AQ_VSI_UP_TABLE_UP3_S 9 -#define ICE_AQ_VSI_UP_TABLE_UP3_M (0x7 << ICE_AQ_VSI_UP_TABLE_UP3_S) -#define 
ICE_AQ_VSI_UP_TABLE_UP4_S 12 -#define ICE_AQ_VSI_UP_TABLE_UP4_M (0x7 << ICE_AQ_VSI_UP_TABLE_UP4_S) -#define ICE_AQ_VSI_UP_TABLE_UP5_S 15 -#define ICE_AQ_VSI_UP_TABLE_UP5_M (0x7 << ICE_AQ_VSI_UP_TABLE_UP5_S) -#define ICE_AQ_VSI_UP_TABLE_UP6_S 18 -#define ICE_AQ_VSI_UP_TABLE_UP6_M (0x7 << ICE_AQ_VSI_UP_TABLE_UP6_S) -#define ICE_AQ_VSI_UP_TABLE_UP7_S 21 -#define ICE_AQ_VSI_UP_TABLE_UP7_M (0x7 << ICE_AQ_VSI_UP_TABLE_UP7_S) +#define ICE_AQ_VSI_UP_TABLE_UP0_S 0 +#
[dpdk-dev] [PATCH 04/22] net/ice/base: fix outer VLAN related macro
From: Qi Zhang

[ upstream commit 25aa214490814d14e5f8f69121c23c0b91d2aeb9 ]

Fix the wrong value of ICE_AQ_VSI_OUTER_VLAN_PORT_BASED_ACCEPT_HOST.

Fixes: 9ea028123a0b ("net/ice/base: align add VSI and update VSI AQ command buffer")

Signed-off-by: Qi Zhang
Acked-by: Qiming Yang
---
 drivers/net/ice/base/ice_adminq_cmd.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ice/base/ice_adminq_cmd.h b/drivers/net/ice/base/ice_adminq_cmd.h
index eebafee7c7..c0617093d4 100644
--- a/drivers/net/ice/base/ice_adminq_cmd.h
+++ b/drivers/net/ice/base/ice_adminq_cmd.h
@@ -500,7 +500,7 @@ struct ice_aqc_vsi_props {
 #define ICE_AQ_VSI_OUTER_TAG_VLAN_8100	0x2
 #define ICE_AQ_VSI_OUTER_TAG_VLAN_9100	0x3
 #define ICE_AQ_VSI_OUTER_VLAN_PORT_BASED_INSERT	BIT(4)
-#define ICE_AQ_VSI_OUTER_VLAN_PORT_BASED_ACCEPT_HOST	BIT(4)
+#define ICE_AQ_VSI_OUTER_VLAN_PORT_BASED_ACCEPT_HOST	BIT(6)
 #define ICE_AQ_VSI_OUTER_VLAN_TX_MODE_S	5
 #define ICE_AQ_VSI_OUTER_VLAN_TX_MODE_M	(0x3 << ICE_AQ_VSI_OUTER_VLAN_TX_MODE_S)
 #define ICE_AQ_VSI_OUTER_VLAN_TX_MODE_ACCEPTUNTAGGED	0x1
-- 
2.25.1
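As a side note on why the old value mattered: with both macros defined as BIT(4), setting one flag could not be told apart from setting the other. The following is only a minimal sketch under assumptions. The BIT() helper is an assumed equivalent of the one in the base code, and the shortened macro names and flags_overlap() are hypothetical. Whether BIT(6) is the right bit for the hardware is taken from the patch, not checked here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed equivalent of the base code's BIT() helper. */
#define BIT(n) (1U << (n))

/* Hypothetical short names; the values mirror the hunk in this patch. */
#define PORT_BASED_INSERT      BIT(4)
#define OLD_PORT_BASED_ACCEPT  BIT(4) /* before the fix: same bit as INSERT */
#define NEW_PORT_BASED_ACCEPT  BIT(6) /* after the fix */

/* The old pair aliased each other; the new pair does not. */
static_assert(PORT_BASED_INSERT == OLD_PORT_BASED_ACCEPT,
	      "old definitions share a bit");
static_assert((PORT_BASED_INSERT & NEW_PORT_BASED_ACCEPT) == 0,
	      "new definitions are disjoint");

/* Returns true when two flag masks have at least one bit in common. */
static bool flags_overlap(uint8_t a, uint8_t b)
{
	return (a & b) != 0;
}
```

With the old value, a caller testing for "accept host" would also see it set whenever "port based insert" was requested.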
[dpdk-dev] [PATCH 05/22] net/ice/base: add VLAN TPID for VLAN filters
From: Qi Zhang

[ upstream commit a6b975d23c10756083357355372c4f545ddc1ebe ]

Currently VLAN filters via RID4 are only based on VLAN ID. However, with
incoming support for Double VLAN Mode (DVM), the driver needs to be able
to support filtering on VLAN ID + VLAN TPID (e.g. 0x8100, 0x88a8).

Add support for this by adding two fields to the ice_fltr_info
structure. First, add the tpid_valid field so the code can determine
whether to keep the default 0x8100 value when programming the dummy
packet or to use the tpid field instead. Second, add the tpid field
itself, which holds the value to program when tpid_valid is set.

Signed-off-by: Brett Creeley
Signed-off-by: Qi Zhang
Acked-by: Qiming Yang
---
 drivers/net/ice/base/ice_switch.c | 6 ++++++
 drivers/net/ice/base/ice_switch.h | 2 ++
 2 files changed, 8 insertions(+)

diff --git a/drivers/net/ice/base/ice_switch.c b/drivers/net/ice/base/ice_switch.c
index e6ea04183f..8d455f5995 100644
--- a/drivers/net/ice/base/ice_switch.c
+++ b/drivers/net/ice/base/ice_switch.c
@@ -14,6 +14,7 @@
 #define ICE_PPP_IPV6_PROTO_ID	0x0057
 #define ICE_IPV6_ETHER_ID	0x86DD
 #define ICE_TCP_PROTO_ID	0x06
+#define ICE_ETH_P_8021Q	0x8100
 
 /* Dummy ethernet header needed in the ice_aqc_sw_rules_elem
  * struct to configure any switch filter rules.
@@ -3034,6 +3035,7 @@ ice_fill_sw_rule(struct ice_hw *hw, struct ice_fltr_info *f_info,
 		 struct ice_aqc_sw_rules_elem *s_rule, enum ice_adminq_opc opc)
 {
 	u16 vlan_id = ICE_MAX_VLAN_ID + 1;
+	u16 vlan_tpid = ICE_ETH_P_8021Q;
 	void *daddr = NULL;
 	u16 eth_hdr_sz;
 	u8 *eth_hdr;
@@ -3106,6 +3108,8 @@ ice_fill_sw_rule(struct ice_hw *hw, struct ice_fltr_info *f_info,
 		break;
 	case ICE_SW_LKUP_VLAN:
 		vlan_id = f_info->l_data.vlan.vlan_id;
+		if (f_info->l_data.vlan.tpid_valid)
+			vlan_tpid = f_info->l_data.vlan.tpid;
 		if (f_info->fltr_act == ICE_FWD_TO_VSI ||
 		    f_info->fltr_act == ICE_FWD_TO_VSI_LIST) {
 			act |= ICE_SINGLE_ACT_PRUNE;
@@ -3149,6 +3153,8 @@ ice_fill_sw_rule(struct ice_hw *hw, struct ice_fltr_info *f_info,
 	if (!(vlan_id > ICE_MAX_VLAN_ID)) {
 		off = (_FORCE_ __be16 *)(eth_hdr + ICE_ETH_VLAN_TCI_OFFSET);
 		*off = CPU_TO_BE16(vlan_id);
+		off = (_FORCE_ __be16 *)(eth_hdr + ICE_ETH_ETHTYPE_OFFSET);
+		*off = CPU_TO_BE16(vlan_tpid);
 	}
 
 	/* Create the switch rule with the final dummy Ethernet header */
diff --git a/drivers/net/ice/base/ice_switch.h b/drivers/net/ice/base/ice_switch.h
index be9b74fd4c..392eb369c7 100644
--- a/drivers/net/ice/base/ice_switch.h
+++ b/drivers/net/ice/base/ice_switch.h
@@ -160,6 +160,8 @@ struct ice_fltr_info {
 		} mac_vlan;
 		struct {
 			u16 vlan_id;
+			u16 tpid;
+			u8 tpid_valid;
 		} vlan;
 		/* Set lkup_type as ICE_SW_LKUP_ETHERTYPE
 		 * if just using ethertype as filter. Set lkup_type as
-- 
2.25.1
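For readers following the TPID handling in this patch, the selection logic can be sketched in isolation. This is a simplified illustration under assumptions, not the driver code. struct vlan_filter, select_tpid() and put_be16() are hypothetical names, and offset 12 used in the usage note is the EtherType position in a plain Ethernet header rather than a value taken from the driver's offset macros.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEFAULT_TPID 0x8100u /* mirrors ICE_ETH_P_8021Q from the patch */

/* Hypothetical, cut-down stand-in for the vlan member of ice_fltr_info. */
struct vlan_filter {
	uint16_t vlan_id;
	uint16_t tpid;
	uint8_t tpid_valid;
};

/* Use the caller's TPID only when it is marked valid, else fall back
 * to 0x8100, which matches the default chosen in ice_fill_sw_rule().
 */
static uint16_t select_tpid(const struct vlan_filter *f)
{
	assert(f != NULL);
	return f->tpid_valid ? f->tpid : DEFAULT_TPID;
}

/* Store a 16-bit value in network byte order at buf[off] and buf[off + 1]. */
static void put_be16(uint8_t *buf, size_t off, uint16_t val)
{
	buf[off] = (uint8_t)(val >> 8);
	buf[off + 1] = (uint8_t)(val & 0xff);
}
```

A caller would fill a dummy header with something like put_be16(hdr, 12, select_tpid(&f)), so a filter with tpid = 0x88a8 and tpid_valid = 1 matches 802.1ad tagged frames, while an older caller that leaves tpid_valid at 0 keeps the 0x8100 behaviour.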
[dpdk-dev] [PATCH 06/22] net/ice/base: support checking double VLAN mode
From: Qi Zhang

[ upstream commit 67285599c9f413c59118379d1f7162031ea6acdc ]

If a driver wants to configure double VLAN mode (DVM) it needs to first
check if the DDP supports DVM. To do this the driver needs to read the
package metadata section via the upload section AQ (0x0C41).

If the DDP doesn't support configuring double VLAN mode (DVM), then
there is nothing to do regarding configuring the VLAN mode of the
device.

The set_svm() or set_dvm() ops should only be called if the current
configuration supports configuring the VLAN mode of the device.

Suggested-by: Jacob Keller
Signed-off-by: Dan Nowlin
Signed-off-by: Brett Creeley
Signed-off-by: Qi Zhang
Acked-by: Qiming Yang
---
 drivers/net/ice/base/ice_bitops.h    | 45
 drivers/net/ice/base/ice_flex_pipe.c | 63 +-
 drivers/net/ice/base/ice_flex_pipe.h | 11
 drivers/net/ice/base/ice_flex_type.h | 26 ++
 drivers/net/ice/base/ice_vlan_mode.c | 78
 5 files changed, 221 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ice/base/ice_bitops.h b/drivers/net/ice/base/ice_bitops.h
index 39548967cc..b786bf7a18 100644
--- a/drivers/net/ice/base/ice_bitops.h
+++ b/drivers/net/ice/base/ice_bitops.h
@@ -449,4 +449,49 @@ ice_cmp_bitmap(ice_bitmap_t *bmp1, ice_bitmap_t *bmp2, u16 size)
 	return true;
 }
 
+/**
+ * ice_bitmap_from_array32 - copies u32 array source into bitmap destination
+ * @dst: the destination bitmap
+ * @src: the source u32 array
+ * @size: size of the bitmap (in bits)
+ *
+ * This function copies the src bitmap stored in an u32 array into the dst
+ * bitmap stored as an ice_bitmap_t.
+ */ +static inline void +ice_bitmap_from_array32(ice_bitmap_t *dst, u32 *src, u16 size) +{ + u32 remaining_bits, i; + +#define BITS_PER_U32 (sizeof(u32) * BITS_PER_BYTE) + /* clear bitmap so we only have to set when iterating */ + ice_zero_bitmap(dst, size); + + for (i = 0; i < (u32)(size / BITS_PER_U32); i++) { + u32 bit_offset = i * BITS_PER_U32; + u32 entry = src[i]; + u32 j; + + for (j = 0; j < BITS_PER_U32; j++) { + if (entry & BIT(j)) + ice_set_bit((u16)(j + bit_offset), dst); + } + } + + /* still need to check the leftover bits (i.e. if size isn't evenly +* divisible by BITS_PER_U32 +**/ + remaining_bits = size % BITS_PER_U32; + if (remaining_bits) { + u32 bit_offset = i * BITS_PER_U32; + u32 entry = src[i]; + u32 j; + + for (j = 0; j < remaining_bits; j++) { + if (entry & BIT(j)) + ice_set_bit((u16)(j + bit_offset), dst); + } + } +} + #endif /* _ICE_BITOPS_H_ */ diff --git a/drivers/net/ice/base/ice_flex_pipe.c b/drivers/net/ice/base/ice_flex_pipe.c index 6c7f83899d..e511b50a00 100644 --- a/drivers/net/ice/base/ice_flex_pipe.c +++ b/drivers/net/ice/base/ice_flex_pipe.c @@ -807,6 +807,28 @@ ice_aq_download_pkg(struct ice_hw *hw, struct ice_buf_hdr *pkg_buf, return status; } +/** + * ice_aq_upload_section + * @hw: pointer to the hardware structure + * @pkg_buf: the package buffer which will receive the section + * @buf_size: the size of the package buffer + * @cd: pointer to command details structure or NULL + * + * Upload Section (0x0C41) + */ +enum ice_status +ice_aq_upload_section(struct ice_hw *hw, struct ice_buf_hdr *pkg_buf, + u16 buf_size, struct ice_sq_cd *cd) +{ + struct ice_aq_desc desc; + + ice_debug(hw, ICE_DBG_TRACE, "%s\n", __func__); + ice_fill_dflt_direct_cmd_desc(&desc, ice_aqc_opc_upload_section); + desc.flags |= CPU_TO_LE16(ICE_AQ_FLAG_RD); + + return ice_aq_send_cmd(hw, &desc, pkg_buf, buf_size, cd); +} + /** * ice_aq_update_pkg * @hw: pointer to the hardware structure @@ -1800,7 +1822,7 @@ void ice_init_prof_result_bm(struct ice_hw *hw) 
* * Frees a package buffer */ -static void ice_pkg_buf_free(struct ice_hw *hw, struct ice_buf_build *bld) +void ice_pkg_buf_free(struct ice_hw *hw, struct ice_buf_build *bld) { ice_free(hw, bld); } @@ -1899,6 +1921,43 @@ ice_pkg_buf_alloc_section(struct ice_buf_build *bld, u32 type, u16 size) return NULL; } +/** + * ice_pkg_buf_alloc_single_section + * @hw: pointer to the HW structure + * @type: the section type value + * @size: the size of the section to reserve (in bytes) + * @section: returns pointer to the section + * + * Allocates a package buffer with a single section. + * Note: all package contents must be in Little Endian form. + */ +struct ice_buf_build * +ice_pkg_buf_alloc_single_section(struct ice_hw *hw, u32 type, u16 size, +void **section) +{ + struct ice_buf_build *buf; + + if (!section) + return NULL; + + buf = ice_pkg_buf_alloc
[dpdk-dev] [PATCH 08/22] net/ice/base: do not set VLAN mode in DCF mode
From: Haiyue Wang

[ upstream commit 70f4e156ea52e3d8278acff30d06447eab623a15 ]

The PF sets the VLAN mode globally; DCF only needs to query the VLAN
mode.

Signed-off-by: Haiyue Wang
Acked-by: Qiming Yang
---
 drivers/net/ice/base/ice_vlan_mode.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/net/ice/base/ice_vlan_mode.c b/drivers/net/ice/base/ice_vlan_mode.c
index c86e803c52..42bb108928 100644
--- a/drivers/net/ice/base/ice_vlan_mode.c
+++ b/drivers/net/ice/base/ice_vlan_mode.c
@@ -354,6 +354,12 @@ static enum ice_status ice_set_svm(struct ice_hw *hw)
  */
 enum ice_status ice_set_vlan_mode(struct ice_hw *hw)
 {
+	/* DCF only has the ability to query the VLAN mode. Setting the VLAN
+	 * mode is done by the PF.
+	 */
+	if (hw->dcf_enabled)
+		return ICE_SUCCESS;
+
 	if (!ice_is_dvm_supported(hw))
 		return ICE_SUCCESS;
 
-- 
2.25.1
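The effect of the early return can be sketched outside the driver. This is an illustration under assumptions only: struct hw_sketch, its fields and set_vlan_mode_sketch() are hypothetical, and the real function goes on to program the device, which is reduced here to setting a flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum sketch_status { SKETCH_SUCCESS = 0 };

/* Hypothetical stand-in for the few ice_hw fields this decision needs. */
struct hw_sketch {
	bool dcf_enabled;     /* running as DCF rather than PF */
	bool dvm_supported;   /* stands in for ice_is_dvm_supported() */
	bool mode_programmed; /* set when the sketch "writes" the mode */
};

/* Mirrors the order of checks in the patched ice_set_vlan_mode():
 * DCF returns before anything is queried or programmed.
 */
static enum sketch_status set_vlan_mode_sketch(struct hw_sketch *hw)
{
	assert(hw != NULL);

	if (hw->dcf_enabled)
		return SKETCH_SUCCESS;

	if (!hw->dvm_supported)
		return SKETCH_SUCCESS;

	hw->mode_programmed = true;
	return SKETCH_SUCCESS;
}
```

In this sketch all three paths report success, so a caller cannot tell from the return value whether anything was programmed. The same holds for the two early returns visible in the hunk.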
[dpdk-dev] [PATCH 07/22] net/ice/base: support configuring device in double VLAN mode
From: Qi Zhang [ upstream commit 14e7a4b37b4f2f765b4da08019ffc9098d99a076 ] In order to support configuring the device in Double VLAN Mode (DVM), the DDP and FW have to support DVM. If both support DVM, the PF that downloads the package needs to update the default recipes and set the VLAN mode. This is done in ice_set_dvm(). In order to support updating the default recipes in DVM add support for updating an existing switch recipe's lkup_idx and mask. This is done by first calling the get recipe AQ (0x0292) with the desired recipe ID. Then, if that is successful update one of the lookup indices (lkup_idx) and its associated mask if the mask is valid otherwise the already existing mask will be used. Signed-off-by: Brett Creeley Signed-off-by: Qi Zhang Acked-by: Qiming Yang --- drivers/net/ice/base/ice_adminq_cmd.h | 36 drivers/net/ice/base/ice_common.c | 2 - drivers/net/ice/base/ice_flex_pipe.c | 9 +- drivers/net/ice/base/ice_switch.c | 58 ++ drivers/net/ice/base/ice_switch.h | 12 ++ drivers/net/ice/base/ice_type.h | 2 +- drivers/net/ice/base/ice_vlan_mode.c | 284 -- drivers/net/ice/base/ice_vlan_mode.h | 24 +-- 8 files changed, 381 insertions(+), 46 deletions(-) diff --git a/drivers/net/ice/base/ice_adminq_cmd.h b/drivers/net/ice/base/ice_adminq_cmd.h index c0617093d4..1bd4f2fc8d 100644 --- a/drivers/net/ice/base/ice_adminq_cmd.h +++ b/drivers/net/ice/base/ice_adminq_cmd.h @@ -358,6 +358,40 @@ struct ice_aqc_get_allocd_res_desc { __le32 addr_low; }; +/* Request buffer for Set VLAN Mode AQ command (indirect 0x020C) */ +struct ice_aqc_set_vlan_mode { + u8 reserved; + u8 l2tag_prio_tagging; +#define ICE_AQ_VLAN_PRIO_TAG_S 0 +#define ICE_AQ_VLAN_PRIO_TAG_M (0x7 << ICE_AQ_VLAN_PRIO_TAG_S) +#define ICE_AQ_VLAN_PRIO_TAG_NOT_SUPPORTED 0x0 +#define ICE_AQ_VLAN_PRIO_TAG_STAG 0x1 +#define ICE_AQ_VLAN_PRIO_TAG_OUTER_CTAG0x2 +#define ICE_AQ_VLAN_PRIO_TAG_OUTER_VLAN0x3 +#define ICE_AQ_VLAN_PRIO_TAG_INNER_CTAG0x4 +#define ICE_AQ_VLAN_PRIO_TAG_MAX 0x4 +#define 
ICE_AQ_VLAN_PRIO_TAG_ERROR 0x7 + u8 l2tag_reserved[64]; + u8 rdma_packet; +#define ICE_AQ_VLAN_RDMA_TAG_S 0 +#define ICE_AQ_VLAN_RDMA_TAG_M (0x3F << ICE_AQ_VLAN_RDMA_TAG_S) +#define ICE_AQ_SVM_VLAN_RDMA_PKT_FLAG_SETTING 0x10 +#define ICE_AQ_DVM_VLAN_RDMA_PKT_FLAG_SETTING 0x1A + u8 rdma_reserved[2]; + u8 mng_vlan_prot_id; +#define ICE_AQ_VLAN_MNG_PROTOCOL_ID_OUTER 0x10 +#define ICE_AQ_VLAN_MNG_PROTOCOL_ID_INNER 0x11 + u8 prot_id_reserved[30]; +}; + +/* Response buffer for Get VLAN Mode AQ command (indirect 0x020D) */ +struct ice_aqc_get_vlan_mode { + u8 vlan_mode; +#define ICE_AQ_VLAN_MODE_DVM_ENA BIT(0) + u8 l2tag_prio_tagging; + u8 reserved[98]; +}; + /* Add VSI (indirect 0x0210) * Update VSI (indirect 0x0211) * Get VSI (indirect 0x0212) @@ -2905,6 +2939,8 @@ enum ice_adminq_opc { ice_aqc_opc_alloc_res = 0x0208, ice_aqc_opc_free_res= 0x0209, ice_aqc_opc_get_allocd_res_desc = 0x020A, + ice_aqc_opc_set_vlan_mode_parameters= 0x020C, + ice_aqc_opc_get_vlan_mode_parameters= 0x020D, /* VSI commands */ ice_aqc_opc_add_vsi = 0x0210, diff --git a/drivers/net/ice/base/ice_common.c b/drivers/net/ice/base/ice_common.c index b5f1d8cce5..0f120ec6e0 100644 --- a/drivers/net/ice/base/ice_common.c +++ b/drivers/net/ice/base/ice_common.c @@ -831,8 +831,6 @@ enum ice_status ice_init_hw(struct ice_hw *hw) goto err_unroll_fltr_mgmt_struct; ice_init_lock(&hw->tnl_lock); - ice_init_vlan_mode_ops(hw); - return ICE_SUCCESS; err_unroll_fltr_mgmt_struct: diff --git a/drivers/net/ice/base/ice_flex_pipe.c b/drivers/net/ice/base/ice_flex_pipe.c index e511b50a00..cced7b6352 100644 --- a/drivers/net/ice/base/ice_flex_pipe.c +++ b/drivers/net/ice/base/ice_flex_pipe.c @@ -1073,6 +1073,7 @@ static enum ice_status ice_download_pkg(struct ice_hw *hw, struct ice_seg *ice_seg) { struct ice_buf_table *ice_buf_tbl; + enum ice_status status; ice_debug(hw, ICE_DBG_TRACE, "%s\n", __func__); ice_debug(hw, ICE_DBG_PKG, "Segment format version: %d.%d.%d.%d\n", @@ -1090,8 +1091,12 @@ ice_download_pkg(struct 
ice_hw *hw, struct ice_seg *ice_seg) ice_debug(hw, ICE_DBG_PKG, "Seg buf count: %d\n", LE32_TO_CPU(ice_buf_tbl->buf_count)); - return ice_dwnld_cfg_bufs(hw, ice_buf_tbl->buf_array, - LE32_TO_CPU(ice_buf_tbl->buf_count)); + status = ice_dwnld_cfg_bufs(hw, ice_buf_tbl->buf_array, + LE32_TO_CPU(ice_buf_tbl->buf_count)); + + ic
[dpdk-dev] [PATCH 09/22] net/ice/base: update boost TCAM for DVM
From: Qi Zhang

[ upstream commit f977165db0ba8435269a5e19e0e9239a4b22d140 ]

Add code to update boost TCAM entries to enable DVM. This requires
enabling DVM entries and disabling SVM entries.

Signed-off-by: Dan Nowlin
Signed-off-by: Qi Zhang
Acked-by: Qiming Yang
---
 drivers/net/ice/base/ice_flex_pipe.c | 223 +++
 drivers/net/ice/base/ice_flex_pipe.h |   1 +
 drivers/net/ice/base/ice_flex_type.h |  13 ++
 drivers/net/ice/base/ice_type.h      |   2 +
 drivers/net/ice/base/ice_vlan_mode.c |   7 +
 5 files changed, 213 insertions(+), 33 deletions(-)

diff --git a/drivers/net/ice/base/ice_flex_pipe.c b/drivers/net/ice/base/ice_flex_pipe.c
index cced7b6352..058694653a 100644
--- a/drivers/net/ice/base/ice_flex_pipe.c
+++ b/drivers/net/ice/base/ice_flex_pipe.c
@@ -7,9 +7,17 @@
 #include "ice_protocol_type.h"
 #include "ice_flow.h"
 
+/* For supporting double VLAN mode, it is necessary to enable or disable certain
+ * boost tcam entries. The metadata labels names that match the following
+ * prefixes will be saved to allow enabling double VLAN mode.
+ */
+#define ICE_DVM_PRE	"BOOST_MAC_VLAN_DVM"	/* enable these entries */
+#define ICE_SVM_PRE	"BOOST_MAC_VLAN_SVM"	/* disable these entries */
+
 /* To support tunneling entries by PF, the package will append the PF number to
  * the label; for example TNL_VXLAN_PF0, TNL_VXLAN_PF1, TNL_VXLAN_PF2, etc.
*/ +#define ICE_TNL_PRE"TNL_" static const struct ice_tunnel_type_scan tnls[] = { { TNL_VXLAN,"TNL_VXLAN_PF" }, { TNL_GENEVE, "TNL_GENEVE_PF" }, @@ -452,6 +460,57 @@ ice_enum_labels(struct ice_seg *ice_seg, u32 type, struct ice_pkg_enum *state, return label->name; } +/** + * ice_add_tunnel_hint + * @hw: pointer to the HW structure + * @label_name: label text + * @val: value of the tunnel port boost entry + */ +static void ice_add_tunnel_hint(struct ice_hw *hw, char *label_name, u16 val) +{ + if (hw->tnl.count < ICE_TUNNEL_MAX_ENTRIES) { + u16 i; + + for (i = 0; tnls[i].type != TNL_LAST; i++) { + size_t len = strlen(tnls[i].label_prefix); + + /* Look for matching label start, before continuing */ + if (strncmp(label_name, tnls[i].label_prefix, len)) + continue; + + /* Make sure this label matches our PF. Note that the PF +* character ('0' - '7') will be located where our +* prefix string's null terminator is located. +*/ + if ((label_name[len] - '0') == hw->pf_id) { + hw->tnl.tbl[hw->tnl.count].type = tnls[i].type; + hw->tnl.tbl[hw->tnl.count].valid = false; + hw->tnl.tbl[hw->tnl.count].in_use = false; + hw->tnl.tbl[hw->tnl.count].marked = false; + hw->tnl.tbl[hw->tnl.count].boost_addr = val; + hw->tnl.tbl[hw->tnl.count].port = 0; + hw->tnl.count++; + break; + } + } + } +} + +/** + * ice_add_dvm_hint + * @hw: pointer to the HW structure + * @val: value of the boost entry + * @enable: true if entry needs to be enabled, or false if needs to be disabled + */ +static void ice_add_dvm_hint(struct ice_hw *hw, u16 val, bool enable) +{ + if (hw->dvm_upd.count < ICE_DVM_MAX_ENTRIES) { + hw->dvm_upd.tbl[hw->dvm_upd.count].boost_addr = val; + hw->dvm_upd.tbl[hw->dvm_upd.count].enable = enable; + hw->dvm_upd.count++; + } +} + /** * ice_init_pkg_hints * @hw: pointer to the HW structure @@ -478,40 +537,34 @@ static void ice_init_pkg_hints(struct ice_hw *hw, struct ice_seg *ice_seg) label_name = ice_enum_labels(ice_seg, ICE_SID_LBL_RXPARSER_TMEM, &state, &val); - while (label_name 
&& hw->tnl.count < ICE_TUNNEL_MAX_ENTRIES) { - for (i = 0; tnls[i].type != TNL_LAST; i++) { - size_t len = strlen(tnls[i].label_prefix); + while (label_name) { + if (!strncmp(label_name, ICE_TNL_PRE, strlen(ICE_TNL_PRE))) + /* check for a tunnel entry */ + ice_add_tunnel_hint(hw, label_name, val); - /* Look for matching label start, before continuing */ - if (strncmp(label_name, tnls[i].label_prefix, len)) - continue; + /* check for a dvm mode entry */ + else if (!strncmp(label_name, ICE_DVM_PRE, strlen(ICE_DVM_PRE))) + ice_add_dvm_hint(hw, val, true); - /* Make sure this label matches our PF. Note that the PF -
[dpdk-dev] [PATCH 10/22] net/ice/base: change protocol ID for VLAN in DVM
From: Qi Zhang

[ upstream commit 8d7bb8d500b1ccdeb30668516064337faa20b364 ]

The protocol ID for the first VLAN in Double VLAN Mode (DVM) should be
ICE_VLAN_OF_HW = 16, but for Single VLAN Mode (SVM) it should be
ICE_VLAN_OL_HW = 17. Change the protocol ID for the outer VLAN in the
type-to-ID translation array to 16 (ICE_VLAN_OF_HW) when DVM is enabled,
which means the driver, package, and firmware all support DVM.

Signed-off-by: Michal Swiatkowski
Signed-off-by: Haiyue Wang
Signed-off-by: Qi Zhang
Acked-by: Qiming Yang
---
 drivers/net/ice/base/ice_flex_pipe.c |  3 +++
 drivers/net/ice/base/ice_switch.c    | 19 ++++++++++++++++++-
 drivers/net/ice/base/ice_switch.h    |  1 +
 3 files changed, 22 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ice/base/ice_flex_pipe.c b/drivers/net/ice/base/ice_flex_pipe.c
index 058694653a..a92c2b8494 100644
--- a/drivers/net/ice/base/ice_flex_pipe.c
+++ b/drivers/net/ice/base/ice_flex_pipe.c
@@ -1166,6 +1166,9 @@ ice_download_pkg(struct ice_hw *hw, struct ice_seg *ice_seg)
 
 	ice_cache_vlan_mode(hw);
 
+	if (ice_is_dvm_ena(hw))
+		ice_change_proto_id_to_dvm();
+
 	return status;
 }
 
diff --git a/drivers/net/ice/base/ice_switch.c b/drivers/net/ice/base/ice_switch.c
index df8319a7e7..55bfc3e6c5 100644
--- a/drivers/net/ice/base/ice_switch.c
+++ b/drivers/net/ice/base/ice_switch.c
@@ -6127,7 +6127,7 @@ static const struct ice_prot_ext_tbl_entry ice_prot_ext[ICE_PROTOCOL_LAST] = {
  * following policy.
  */
 
-static const struct ice_protocol_entry ice_prot_id_tbl[ICE_PROTOCOL_LAST] = {
+static struct ice_protocol_entry ice_prot_id_tbl[ICE_PROTOCOL_LAST] = {
 	{ ICE_MAC_OFOS,		ICE_MAC_OFOS_HW },
 	{ ICE_MAC_IL,		ICE_MAC_IL_HW },
 	{ ICE_ETYPE_OL,		ICE_ETYPE_OL_HW },
@@ -6232,6 +6232,23 @@ static u16 ice_find_recp(struct ice_hw *hw, struct ice_prot_lkup_ext *lkup_exts,
 	return ICE_MAX_NUM_RECIPES;
 }
 
+/**
+ * ice_change_proto_id_to_dvm - change proto id in prot_id_tbl
+ *
+ * As protocol id for outer vlan is different in dvm and svm, if dvm is
+ * supported protocol array record for outer vlan has to be modified to
+ * reflect the value proper for DVM.
+ */
+void ice_change_proto_id_to_dvm(void)
+{
+	u8 i;
+
+	for (i = 0; i < ARRAY_SIZE(ice_prot_id_tbl); i++)
+		if (ice_prot_id_tbl[i].type == ICE_VLAN_OFOS &&
+		    ice_prot_id_tbl[i].protocol_id != ICE_VLAN_OF_HW)
+			ice_prot_id_tbl[i].protocol_id = ICE_VLAN_OF_HW;
+}
+
 /**
  * ice_prot_type_to_id - get protocol ID from protocol type
  * @type: protocol type
diff --git a/drivers/net/ice/base/ice_switch.h b/drivers/net/ice/base/ice_switch.h
index 78e6be35a9..dd50820430 100644
--- a/drivers/net/ice/base/ice_switch.h
+++ b/drivers/net/ice/base/ice_switch.h
@@ -536,4 +536,5 @@ bool ice_is_prof_rule(enum ice_sw_tunnel_type type);
 enum ice_status
 ice_update_recipe_lkup_idx(struct ice_hw *hw,
 			   struct ice_update_recipe_lkup_idx_params *params);
+void ice_change_proto_id_to_dvm(void);
 #endif /* _ICE_SWITCH_H_ */
-- 
2.25.1
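The table patch-up in this change can be tried out in a stand-alone form. What follows is a sketch under assumptions rather than the driver's table. The enum, struct and function names are hypothetical, the table has two entries instead of the full list, MAC_OFOS_HW is a placeholder value, and only the two VLAN protocol IDs (16 and 17) come from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

#define VLAN_OF_HW  16 /* outer VLAN protocol ID in DVM */
#define VLAN_OL_HW  17 /* outer VLAN protocol ID in SVM */
#define MAC_OFOS_HW 1  /* placeholder, not taken from the patch */

/* Hypothetical, shortened set of protocol types. */
enum proto_type { PROTO_MAC_OFOS, PROTO_VLAN_OFOS };

struct proto_entry {
	enum proto_type type;
	uint8_t protocol_id;
};

/* Not const: the DVM switch below rewrites one entry in place. */
static struct proto_entry prot_id_tbl[] = {
	{ PROTO_MAC_OFOS,  MAC_OFOS_HW },
	{ PROTO_VLAN_OFOS, VLAN_OL_HW },
};

/* Rewrite every outer-VLAN entry to the DVM protocol ID. */
static void change_proto_id_to_dvm(void)
{
	size_t i;

	for (i = 0; i < ARRAY_SIZE(prot_id_tbl); i++)
		if (prot_id_tbl[i].type == PROTO_VLAN_OFOS)
			prot_id_tbl[i].protocol_id = VLAN_OF_HW;
}

/* Look up the protocol ID for a type; returns false if it is absent. */
static bool prot_type_to_id(enum proto_type type, uint8_t *id)
{
	size_t i;

	assert(id != NULL);
	for (i = 0; i < ARRAY_SIZE(prot_id_tbl); i++) {
		if (prot_id_tbl[i].type == type) {
			*id = prot_id_tbl[i].protocol_id;
			return true;
		}
	}
	return false;
}
```

One consequence of this design, visible in the sketch as well as in the patch, is that the table is file-scope state: once it is switched to the DVM value, every later lookup in the same process sees 16, and nothing in the code shown switches it back.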
[dpdk-dev] [PATCH 12/22] net/ice/base: log if DDP/FW do not support QinQ
From: Qi Zhang

[ upstream commit daa2ca4217ec6bf4fafb84f78985014b20cf5444 ]

Currently, if the driver supports QinQ, nothing is logged when the DDP
and/or FW don't support it. Add a message that is printed when the
driver attempts to configure DVM and the DDP and/or FW lack QinQ
support. This will make it more obvious to users in the field that they
need to update their DDP and/or FW.

This required a small refactor so some of the existing code could be
shared and used by this new print functionality.

Signed-off-by: Brett Creeley
Signed-off-by: Qi Zhang
Acked-by: Qiming Yang
---
 drivers/net/ice/base/ice_vlan_mode.c | 77 +++-
 1 file changed, 65 insertions(+), 12 deletions(-)

diff --git a/drivers/net/ice/base/ice_vlan_mode.c b/drivers/net/ice/base/ice_vlan_mode.c
index 4340189355..775ebb2d6e 100644
--- a/drivers/net/ice/base/ice_vlan_mode.c
+++ b/drivers/net/ice/base/ice_vlan_mode.c
@@ -131,18 +131,11 @@ static void ice_cache_vlan_mode(struct ice_hw *hw)
 }
 
 /**
- * ice_is_dvm_supported - check if Double VLAN Mode is supported
- * @hw: pointer to the hardware structure
- *
- * Returns true if Double VLAN Mode (DVM) is supported and false if only Single
- * VLAN Mode (SVM) is supported. In order for DVM to be supported the DDP and
- * firmware must support it, otherwise only SVM is supported. This function
- * should only be called while the global config lock is held and after the
- * package has been successfully downloaded.
+ * ice_pkg_supports_dvm - find out if DDP supports DVM + * @hw: pointer to the HW structure */ -static bool ice_is_dvm_supported(struct ice_hw *hw) +static bool ice_pkg_supports_dvm(struct ice_hw *hw) { - struct ice_aqc_get_vlan_mode get_vlan_mode = { 0 }; enum ice_status status; bool pkg_supports_dvm; @@ -153,8 +146,17 @@ static bool ice_is_dvm_supported(struct ice_hw *hw) return false; } - if (!pkg_supports_dvm) - return false; + return pkg_supports_dvm; +} + +/** + * ice_fw_supports_dvm - find out if FW supports DVM + * @hw: pointer to the HW structure + */ +static bool ice_fw_supports_dvm(struct ice_hw *hw) +{ + struct ice_aqc_get_vlan_mode get_vlan_mode = { 0 }; + enum ice_status status; /* If firmware returns success, then it supports DVM, else it only * supports SVM @@ -169,6 +171,31 @@ static bool ice_is_dvm_supported(struct ice_hw *hw) return true; } +/** + * ice_is_dvm_supported - check if Double VLAN Mode is supported + * @hw: pointer to the hardware structure + * + * Returns true if Double VLAN Mode (DVM) is supported and false if only Single + * VLAN Mode (SVM) is supported. In order for DVM to be supported the DDP and + * firmware must support it, otherwise only SVM is supported. This function + * should only be called while the global config lock is held and after the + * package has been successfully downloaded. 
+ */ +static bool ice_is_dvm_supported(struct ice_hw *hw) +{ + if (!ice_pkg_supports_dvm(hw)) { + ice_debug(hw, ICE_DBG_PKG, "DDP doesn't support DVM\n"); + return false; + } + + if (!ice_fw_supports_dvm(hw)) { + ice_debug(hw, ICE_DBG_PKG, "FW doesn't support DVM\n"); + return false; + } + + return true; +} + #define ICE_EXTERNAL_VLAN_ID_FV_IDX11 #define ICE_SW_LKUP_VLAN_LOC_LKUP_IDX 1 #define ICE_SW_LKUP_VLAN_PKT_FLAGS_LKUP_IDX2 @@ -376,6 +403,30 @@ enum ice_status ice_set_vlan_mode(struct ice_hw *hw) return ice_set_svm(hw); } +/** + * ice_print_dvm_not_supported - print if DDP and/or FW doesn't support DVM + * @hw: pointer to the HW structure + * + * The purpose of this function is to print that QinQ is not supported due to + * incompatibilty from the DDP and/or FW. This will give a hint to the user to + * update one and/or both components if they expect QinQ functionality. + */ +static void ice_print_dvm_not_supported(struct ice_hw *hw) +{ + bool pkg_supports_dvm = ice_pkg_supports_dvm(hw); + bool fw_supports_dvm = ice_fw_supports_dvm(hw); + + if (!fw_supports_dvm && !pkg_supports_dvm) + ice_info(hw, "QinQ functionality cannot be enabled on this device. " +"Update your DDP package and NVM to versions that support QinQ.\n"); + else if (!pkg_supports_dvm) + ice_info(hw, "QinQ functionality cannot be enabled on this device. " +"Update your DDP package to a version that supports QinQ.\n"); + else if (!fw_supports_dvm) + ice_info(hw, "QinQ functionality cannot be enabled on this device. " +"Update your NVM to a version that supports QinQ.\n"); +} + /** * ice_post_pkg_dwnld_vlan_mode_cfg - configure VLAN mode after DDP download * @hw: pointer to the HW structure @@ -395,4 +446,6 @@ void ic
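The split into a package check and a firmware check makes the three user-facing hints easy to derive. A rough sketch of that decision, with the two capability results passed in as plain booleans instead of being queried from hardware (the function names here are hypothetical, the message texts follow the hunk above):

```c
#include <stdbool.h>
#include <stddef.h>

/* DVM is usable only when both the DDP package and the firmware support it. */
static bool sk_is_dvm_supported(bool pkg_supports_dvm, bool fw_supports_dvm)
{
	return pkg_supports_dvm && fw_supports_dvm;
}

/*
 * Return the update hint for the user, or NULL when nothing needs updating.
 * The combined case is tested first so that it is not shadowed by the
 * single-component cases.
 */
static const char *sk_dvm_update_hint(bool pkg_supports_dvm,
				      bool fw_supports_dvm)
{
	if (!fw_supports_dvm && !pkg_supports_dvm)
		return "Update your DDP package and NVM to versions that support QinQ.";
	if (!pkg_supports_dvm)
		return "Update your DDP package to a version that supports QinQ.";
	if (!fw_supports_dvm)
		return "Update your NVM to a version that supports QinQ.";
	return NULL;
}
```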
[dpdk-dev] [PATCH 11/22] net/ice/base: refactor post DDP download VLAN mode config
From: Qi Zhang [ upstream commit 5ade55ab43e6c07a904c03ebe2d796fdea94e7e0 ] Currently it's not clear that only the first PF downloads the package and configures the VLAN mode. When this is happening all other PFs are blocked on the global configuration lock. Once the package is successfully downloaded and the global configuration lock has been released then all PFs resume initialization. This includes some post package download VLAN mode configuration. To make this more obvious add the new function ice_post_pkg_dwnld_vlan_mode_cfg() so any/all post download VLAN mode configuration code can be put in here. This also makes it more clear that all PFs will call this new function. Signed-off-by: Brett Creeley Signed-off-by: Qi Zhang Acked-by: Qiming Yang --- drivers/net/ice/base/ice_flex_pipe.c | 5 + drivers/net/ice/base/ice_vlan_mode.c | 23 ++- drivers/net/ice/base/ice_vlan_mode.h | 2 +- 3 files changed, 24 insertions(+), 6 deletions(-) diff --git a/drivers/net/ice/base/ice_flex_pipe.c b/drivers/net/ice/base/ice_flex_pipe.c index a92c2b8494..e04b863de3 100644 --- a/drivers/net/ice/base/ice_flex_pipe.c +++ b/drivers/net/ice/base/ice_flex_pipe.c @@ -1164,10 +1164,7 @@ ice_download_pkg(struct ice_hw *hw, struct ice_seg *ice_seg) status = ice_dwnld_cfg_bufs(hw, ice_buf_tbl->buf_array, LE32_TO_CPU(ice_buf_tbl->buf_count)); - ice_cache_vlan_mode(hw); - - if (ice_is_dvm_ena(hw)) - ice_change_proto_id_to_dvm(); + ice_post_pkg_dwnld_vlan_mode_cfg(hw); return status; } diff --git a/drivers/net/ice/base/ice_vlan_mode.c b/drivers/net/ice/base/ice_vlan_mode.c index 4a749cb9f1..4340189355 100644 --- a/drivers/net/ice/base/ice_vlan_mode.c +++ b/drivers/net/ice/base/ice_vlan_mode.c @@ -125,7 +125,7 @@ bool ice_is_dvm_ena(struct ice_hw *hw) * configuration lock has been released because all ports on a device need to * cache the VLAN mode. */ -void ice_cache_vlan_mode(struct ice_hw *hw) +static void ice_cache_vlan_mode(struct ice_hw *hw) { hw->dvm_ena = ice_aq_is_dvm_ena(hw) ? 
true : false; } @@ -375,3 +375,24 @@ enum ice_status ice_set_vlan_mode(struct ice_hw *hw) return ice_set_svm(hw); } + +/** + * ice_post_pkg_dwnld_vlan_mode_cfg - configure VLAN mode after DDP download + * @hw: pointer to the HW structure + * + * This function is meant to configure any VLAN mode specific functionality + * after the global configuration lock has been released and the DDP has been + * downloaded. + * + * Since only one PF downloads the DDP and configures the VLAN mode there needs + * to be a way to configure the other PFs after the DDP has been downloaded and + * the global configuration lock has been released. All such code should go in + * this function. + */ +void ice_post_pkg_dwnld_vlan_mode_cfg(struct ice_hw *hw) +{ + ice_cache_vlan_mode(hw); + + if (ice_is_dvm_ena(hw)) + ice_change_proto_id_to_dvm(); +} diff --git a/drivers/net/ice/base/ice_vlan_mode.h b/drivers/net/ice/base/ice_vlan_mode.h index 134bd41635..c22d6c2a01 100644 --- a/drivers/net/ice/base/ice_vlan_mode.h +++ b/drivers/net/ice/base/ice_vlan_mode.h @@ -10,7 +10,7 @@ struct ice_hw; bool ice_is_dvm_ena(struct ice_hw *hw); -void ice_cache_vlan_mode(struct ice_hw *hw); enum ice_status ice_set_vlan_mode(struct ice_hw *hw); +void ice_post_pkg_dwnld_vlan_mode_cfg(struct ice_hw *hw); #endif /* _ICE_VLAN_MODE_H */ -- 2.25.1
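The point of the refactor is easier to see in a toy model: only the first PF downloads the package, but every PF runs the post-download step and caches the resulting mode. The sketch below assumes a single device-wide flag in place of the admin queue query and leaves out the global configuration lock; every name in it is hypothetical.

```c
#include <stdbool.h>

/* Per-PF state, reduced to the one field this sketch needs. */
struct sk_hw {
	bool dvm_ena;
};

static bool sk_device_dvm_mode; /* device-wide mode, set by the download */
static int sk_download_count;   /* how many times the package was downloaded */

/* Done by the first PF only, while it holds the global configuration lock. */
static void sk_download_pkg(bool want_dvm)
{
	sk_download_count++;
	sk_device_dvm_mode = want_dvm;
}

/* Done by every PF once the lock has been released. */
static void sk_post_pkg_dwnld_vlan_mode_cfg(struct sk_hw *hw)
{
	hw->dvm_ena = sk_device_dvm_mode;
}

static void sk_init_pf(struct sk_hw *hw, bool first_pf, bool want_dvm)
{
	if (first_pf)
		sk_download_pkg(want_dvm);
	sk_post_pkg_dwnld_vlan_mode_cfg(hw);
}
```

With two PFs initialised in sequence, the package is downloaded once and both end up with the same cached mode.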
[dpdk-dev] [PATCH 15/22] net/ice/base: fix QinQ PPPoE dummy packet selection
From: Qi Zhang [ upstream commit 03697c24b7cafbd6c536204ba6470698fcf8c5e0 ] The dummy packet should be the QinQ PPPoE IPv6 packet when the PPP protocol is IPv6. Fixes: bb3386f348dd ("net/ice: enable QinQ filter for switch") Cc: sta...@dpdk.org Signed-off-by: Yuying Zhang Signed-off-by: Qi Zhang Acked-by: Qiming Yang --- drivers/net/ice/base/ice_switch.c | 5 + 1 file changed, 5 insertions(+) diff --git a/drivers/net/ice/base/ice_switch.c b/drivers/net/ice/base/ice_switch.c index 9d8dc88844..d5576267b8 100644 --- a/drivers/net/ice/base/ice_switch.c +++ b/drivers/net/ice/base/ice_switch.c @@ -7403,6 +7403,11 @@ ice_find_dummy_packet(struct ice_adv_lkup_elem *lkups, u16 lkups_cnt, *pkt_len = sizeof(dummy_qinq_pppoe_ipv4_pkt); *offsets = dummy_qinq_pppoe_ipv4_packet_offsets; return; + } else if (tun_type == ICE_SW_TUN_PPPOE_QINQ && ipv6) { + *pkt = dummy_qinq_pppoe_ipv6_packet; + *pkt_len = sizeof(dummy_qinq_pppoe_ipv6_packet); + *offsets = dummy_qinq_pppoe_packet_offsets; + return; } else if (tun_type == ICE_SW_TUN_PPPOE_QINQ || tun_type == ICE_SW_TUN_PPPOE_PAY_QINQ) { *pkt = dummy_qinq_pppoe_ipv4_pkt; -- 2.25.1
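The fix depends on branch order: the IPv6 case has to be tested before the generic QinQ PPPoE fallback, which otherwise catches it and hands back the IPv4 dummy packet. A reduced sketch of just these two branches, with hypothetical enum names and the packet selection collapsed to an enum value:

```c
#include <stdbool.h>

enum sk_tun { SK_TUN_PPPOE_QINQ, SK_TUN_PPPOE_PAY_QINQ, SK_TUN_OTHER };
enum sk_pkt { SK_PKT_QINQ_PPPOE_IPV4, SK_PKT_QINQ_PPPOE_IPV6, SK_PKT_NONE };

static enum sk_pkt sk_find_dummy_packet(enum sk_tun tun_type, bool ipv6)
{
	/* Specific case first, as added by the patch. */
	if (tun_type == SK_TUN_PPPOE_QINQ && ipv6)
		return SK_PKT_QINQ_PPPOE_IPV6;

	/* Generic fallback; before the fix it also caught the IPv6 case. */
	if (tun_type == SK_TUN_PPPOE_QINQ ||
	    tun_type == SK_TUN_PPPOE_PAY_QINQ)
		return SK_PKT_QINQ_PPPOE_IPV4;

	return SK_PKT_NONE;
}
```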
[dpdk-dev] [PATCH 13/22] net/ice/base: add ethertype offset for QinQ dummy packet
From: Yuying Zhang [ upstream commit 0c0735ff4fc15e227631cbfe3fd31e33e42b34fc ] Add the ethertype offset for QinQ switch rule dummy packet to allow matching the corresponding field. Signed-off-by: Yuying Zhang Acked-by: Qi Zhang --- drivers/net/ice/base/ice_switch.c | 13 ++--- 1 file changed, 10 insertions(+), 3 deletions(-) diff --git a/drivers/net/ice/base/ice_switch.c b/drivers/net/ice/base/ice_switch.c index 55bfc3e6c5..887b35ac47 100644 --- a/drivers/net/ice/base/ice_switch.c +++ b/drivers/net/ice/base/ice_switch.c @@ -1205,6 +1205,7 @@ static const u8 dummy_ipv6_l2tpv3_pkt[] = { static const struct ice_dummy_pkt_offsets dummy_qinq_ipv4_packet_offsets[] = { { ICE_MAC_OFOS, 0 }, + { ICE_ETYPE_OL, 12 }, { ICE_VLAN_EX, 14 }, { ICE_VLAN_OFOS,18 }, { ICE_IPV4_OFOS,22 }, @@ -1215,7 +1216,8 @@ static const u8 dummy_qinq_ipv4_pkt[] = { 0x00, 0x00, 0x00, 0x00, /* ICE_MAC_OFOS 0 */ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, - 0x91, 0x00, + + 0x91, 0x00, /* ICE_ETYPE_OL 12 */ 0x00, 0x00, 0x81, 0x00, /* ICE_VLAN_EX 14 */ 0x00, 0x00, 0x08, 0x00, /* ICE_VLAN_OFOS 18 */ @@ -1234,6 +1236,7 @@ static const u8 dummy_qinq_ipv4_pkt[] = { static const struct ice_dummy_pkt_offsets dummy_qinq_ipv6_packet_offsets[] = { { ICE_MAC_OFOS, 0 }, + { ICE_ETYPE_OL, 12 }, { ICE_VLAN_EX, 14 }, { ICE_VLAN_OFOS,18 }, { ICE_IPV6_OFOS,22 }, @@ -1244,7 +1247,8 @@ static const u8 dummy_qinq_ipv6_pkt[] = { 0x00, 0x00, 0x00, 0x00, /* ICE_MAC_OFOS 0 */ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, - 0x91, 0x00, + + 0x91, 0x00, /* ICE_ETYPE_OL 12 */ 0x00, 0x00, 0x81, 0x00, /* ICE_VLAN_EX 14 */ 0x00, 0x00, 0x86, 0xDD, /* ICE_VLAN_OFOS 18 */ @@ -1271,6 +1275,7 @@ static const u8 dummy_qinq_ipv6_pkt[] = { static const struct ice_dummy_pkt_offsets dummy_qinq_pppoe_packet_offsets[] = { { ICE_MAC_OFOS, 0 }, + { ICE_ETYPE_OL, 12 }, { ICE_VLAN_EX, 14 }, { ICE_VLAN_OFOS,18 }, { ICE_PPPOE,22 }, @@ -1280,6 +1285,7 @@ static const struct ice_dummy_pkt_offsets dummy_qinq_pppoe_packet_offsets[] = { static 
const struct ice_dummy_pkt_offsets dummy_qinq_pppoe_ipv4_packet_offsets[] = { { ICE_MAC_OFOS, 0 }, + { ICE_ETYPE_OL, 12 }, { ICE_VLAN_EX, 14 }, { ICE_VLAN_OFOS,18 }, { ICE_PPPOE,22 }, @@ -1291,7 +1297,8 @@ static const u8 dummy_qinq_pppoe_ipv4_pkt[] = { 0x00, 0x00, 0x00, 0x00, /* ICE_MAC_OFOS 0 */ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, - 0x91, 0x00, + + 0x91, 0x00, /* ICE_ETYPE_OL 12 */ 0x00, 0x00, 0x81, 0x00, /* ICE_VLAN_EX 14 */ 0x00, 0x00, 0x88, 0x64, /* ICE_VLAN_OFOS 18 */ -- 2.25.1
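To check the new offset against the byte layout, the start of the QinQ IPv4 dummy packet can be copied out and read back at the offsets listed in the table. The array below reproduces only the first 23 bytes from the hunk above; the helper name is an assumption of this sketch.

```c
#include <stddef.h>
#include <stdint.h>

/* First 23 bytes of the QinQ IPv4 dummy packet, as laid out in the patch. */
static const uint8_t sk_qinq_ipv4_head[] = {
	0x00, 0x00, 0x00, 0x00, /* MAC addresses, offset 0 */
	0x00, 0x00, 0x00, 0x00,
	0x00, 0x00, 0x00, 0x00,

	0x91, 0x00,             /* outer ethertype, offset 12 */
	0x00, 0x00, 0x81, 0x00, /* outer VLAN TCI + inner TPID, offset 14 */
	0x00, 0x00, 0x08, 0x00, /* inner VLAN TCI + ethertype, offset 18 */
	0x45,                   /* first byte of the IPv4 header, offset 22 */
};

/* Read a big-endian 16-bit field at the given byte offset. */
static uint16_t sk_read_be16(const uint8_t *pkt, size_t off)
{
	return (uint16_t)((pkt[off] << 8) | pkt[off + 1]);
}
```

Reading at offset 12 yields the 0x9100 ethertype that the new ICE_ETYPE_OL entry makes matchable; the two VLAN headers follow at 14 and 18, four bytes each.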
[dpdk-dev] [PATCH 14/22] net/ice/base: add inner VLAN protocol type for QinQ filter
From: Qi Zhang [ upstream commit 0475c7770502cb4166b2577df3ff446af9d85515 ] Since VLAN protocol type 'ICE_VLAN_OFOS' has been changed to map the hardware VLAN protocol ID to 'ICE_VLAN_OF_HW (16)' when in Double VLAN mode, and to 'ICE_VLAN_OL_HW (17)' when in Single VLAN mode. So 'ICE_VLAN_OFOS' can't be used with 'ICE_VLAN_EX' which is outer VLAN hardware protocol ID 'ICE_VLAN_OF_HW (16)' to do the QinQ VLAN pattern. Introduce the new inner VLAN protocol type 'ICE_VLAN_IN', which is inner VLAN hardware protocol ID 'ICE_VLAN_OL_HW (17)'. Now for QinQ VLAN pattern, the protocol 'ICE_VLAN_EX' and 'ICE_VLAN_IN' should be used to set the related protocol header fields like VLAN ID. Signed-off-by: Haiyue Wang Signed-off-by: Qi Zhang Acked-by: Qiming Yang --- drivers/net/ice/base/ice_protocol_type.h | 1 + drivers/net/ice/base/ice_switch.c| 23 +-- 2 files changed, 14 insertions(+), 10 deletions(-) diff --git a/drivers/net/ice/base/ice_protocol_type.h b/drivers/net/ice/base/ice_protocol_type.h index e8caefd8f9..5dc795868e 100644 --- a/drivers/net/ice/base/ice_protocol_type.h +++ b/drivers/net/ice/base/ice_protocol_type.h @@ -53,6 +53,7 @@ enum ice_protocol_type { ICE_NAT_T, ICE_GTP_NO_PAY, ICE_VLAN_EX, + ICE_VLAN_IN, ICE_PROTOCOL_LAST }; diff --git a/drivers/net/ice/base/ice_switch.c b/drivers/net/ice/base/ice_switch.c index 887b35ac47..9d8dc88844 100644 --- a/drivers/net/ice/base/ice_switch.c +++ b/drivers/net/ice/base/ice_switch.c @@ -1207,7 +1207,7 @@ static const struct ice_dummy_pkt_offsets dummy_qinq_ipv4_packet_offsets[] = { { ICE_MAC_OFOS, 0 }, { ICE_ETYPE_OL, 12 }, { ICE_VLAN_EX, 14 }, - { ICE_VLAN_OFOS,18 }, + { ICE_VLAN_IN, 18 }, { ICE_IPV4_OFOS,22 }, { ICE_PROTOCOL_LAST,0 }, }; @@ -1220,7 +1220,7 @@ static const u8 dummy_qinq_ipv4_pkt[] = { 0x91, 0x00, /* ICE_ETYPE_OL 12 */ 0x00, 0x00, 0x81, 0x00, /* ICE_VLAN_EX 14 */ - 0x00, 0x00, 0x08, 0x00, /* ICE_VLAN_OFOS 18 */ + 0x00, 0x00, 0x08, 0x00, /* ICE_VLAN_IN 18 */ 0x45, 0x00, 0x00, 0x1c, /* ICE_IPV4_OFOS 22 */ 
0x00, 0x01, 0x00, 0x00, @@ -1238,7 +1238,7 @@ static const struct ice_dummy_pkt_offsets dummy_qinq_ipv6_packet_offsets[] = { { ICE_MAC_OFOS, 0 }, { ICE_ETYPE_OL, 12 }, { ICE_VLAN_EX, 14 }, - { ICE_VLAN_OFOS,18 }, + { ICE_VLAN_IN, 18 }, { ICE_IPV6_OFOS,22 }, { ICE_PROTOCOL_LAST,0 }, }; @@ -1251,7 +1251,7 @@ static const u8 dummy_qinq_ipv6_pkt[] = { 0x91, 0x00, /* ICE_ETYPE_OL 12 */ 0x00, 0x00, 0x81, 0x00, /* ICE_VLAN_EX 14 */ - 0x00, 0x00, 0x86, 0xDD, /* ICE_VLAN_OFOS 18 */ + 0x00, 0x00, 0x86, 0xDD, /* ICE_VLAN_IN 18 */ 0x60, 0x00, 0x00, 0x00, /* ICE_IPV6_OFOS 22 */ 0x00, 0x10, 0x11, 0x00, /* Next header UDP */ @@ -1277,7 +1277,7 @@ static const struct ice_dummy_pkt_offsets dummy_qinq_pppoe_packet_offsets[] = { { ICE_MAC_OFOS, 0 }, { ICE_ETYPE_OL, 12 }, { ICE_VLAN_EX, 14 }, - { ICE_VLAN_OFOS,18 }, + { ICE_VLAN_IN, 18 }, { ICE_PPPOE,22 }, { ICE_PROTOCOL_LAST,0 }, }; @@ -1287,7 +1287,7 @@ struct ice_dummy_pkt_offsets dummy_qinq_pppoe_ipv4_packet_offsets[] = { { ICE_MAC_OFOS, 0 }, { ICE_ETYPE_OL, 12 }, { ICE_VLAN_EX, 14 }, - { ICE_VLAN_OFOS,18 }, + { ICE_VLAN_IN, 18 }, { ICE_PPPOE,22 }, { ICE_IPV4_OFOS,30 }, { ICE_PROTOCOL_LAST,0 }, @@ -1301,14 +1301,14 @@ static const u8 dummy_qinq_pppoe_ipv4_pkt[] = { 0x91, 0x00, /* ICE_ETYPE_OL 12 */ 0x00, 0x00, 0x81, 0x00, /* ICE_VLAN_EX 14 */ - 0x00, 0x00, 0x88, 0x64, /* ICE_VLAN_OFOS 18 */ + 0x00, 0x00, 0x88, 0x64, /* ICE_VLAN_IN 18 */ 0x11, 0x00, 0x00, 0x00, /* ICE_PPPOE 22 */ 0x00, 0x16, 0x00, 0x21, /* PPP Link Layer 28 */ - 0x45, 0x00, 0x00, 0x14, /* ICE_IPV4_IL 30 */ + 0x45, 0x00, 0x00, 0x14, /* ICE_IPV4_OFOS 30 */ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, @@ -1322,7 +1322,7 @@ struct ice_dummy_pkt_offsets dummy_qinq_pppoe_packet_ipv6_offsets[] = { { ICE_MAC_OFOS, 0 }, { ICE_ETYPE_OL, 12 }, { ICE_VLAN_EX, 14}, - { ICE_VLAN_OFOS,18 }, + { ICE_VLAN_IN, 18 }, { ICE_PPPOE,22 }, { ICE_IPV6_OFOS,30 }, { ICE_PROTOCOL_LAST,0 }, @@ -1336,7 +1336,7 @@ static const u8 dummy_qinq_pppoe_ipv6_packet[] = { 
0x91, 0x00, /* ICE_ETYPE_OL 12 */ 0x00, 0x00, 0x81, 0x00, /* ICE_VLAN_
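The reason a separate inner VLAN type is needed can be shown with the ID mapping described in the commit message. In the sketch below the enum and function names are hypothetical; the hardware IDs 16 and 17 are the ones quoted in the commit message.

```c
#include <stdbool.h>

enum sk_vlan_type { SK_VLAN_OFOS, SK_VLAN_EX, SK_VLAN_IN };

/* Map an interface VLAN type to a hardware protocol ID for the given mode. */
static int sk_vlan_hw_id(enum sk_vlan_type type, bool dvm_ena)
{
	switch (type) {
	case SK_VLAN_EX:
		return 16; /* outer VLAN */
	case SK_VLAN_IN:
		return 17; /* inner VLAN */
	case SK_VLAN_OFOS:
		return dvm_ena ? 16 : 17; /* follows the VLAN mode */
	}
	return -1;
}
```

In Double VLAN Mode the OFOS and EX types resolve to the same hardware ID, so a QinQ pattern built from that pair could not tell the outer tag from the inner one. The EX and IN pair stays distinct in either mode.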
[dpdk-dev] [PATCH 17/22] net/ice: fix VLAN 0 adding based on VLAN mode
From: Haiyue Wang [ upstream commit 295b34f55b001bceb27d9177b55326ccda49351b ] In Single VLAN Mode, single VLAN filters via ICE_SW_LKUP_VLAN are based on the inner VLAN ID, so the VLAN TPID (i.e. 0x8100 or 0x88a8) doesn't matter. In Double VLAN Mode, outer/single VLAN filters via ICE_SW_LKUP_VLAN are based on the outer/single VLAN ID + VLAN TPID. For both modes, add a VLAN 0 + no VLAN TPID filter to handle untagged traffic when VLAN pruning is enabled. Also, this handles VLAN 0 priority tagged traffic in Single VLAN Mode, since the VLAN TPID is not part of filtering. If Double VLAN Mode is enabled then an explicit VLAN 0 + VLAN TPID filter needs to be added to allow VLAN 0 priority tagged traffic in DVM, since the VLAN TPID is part of filtering. Fixes: 14e7a4b37b4f ("net/ice/base: support configuring device in double VLAN mode") Signed-off-by: Haiyue Wang Acked-by: Qi Zhang --- drivers/net/ice/ice_ethdev.c | 136 +-- drivers/net/ice/ice_ethdev.h | 10 ++- 2 files changed, 123 insertions(+), 23 deletions(-) diff --git a/drivers/net/ice/ice_ethdev.c b/drivers/net/ice/ice_ethdev.c index 1d7e5ffbc4..60793f99c6 100644 --- a/drivers/net/ice/ice_ethdev.c +++ b/drivers/net/ice/ice_ethdev.c @@ -944,12 +944,13 @@ ice_remove_mac_filter(struct ice_vsi *vsi, struct rte_ether_addr *mac_addr) /* Find out specific VLAN filter */ static struct ice_vlan_filter * -ice_find_vlan_filter(struct ice_vsi *vsi, uint16_t vlan_id) +ice_find_vlan_filter(struct ice_vsi *vsi, struct ice_vlan *vlan) { struct ice_vlan_filter *f; TAILQ_FOREACH(f, &vsi->vlan_list, next) { - if (vlan_id == f->vlan_info.vlan_id) + if (vlan->tpid == f->vlan_info.vlan.tpid && + vlan->vid == f->vlan_info.vlan.vid) return f; } @@ -957,7 +958,7 @@ ice_find_vlan_filter(struct ice_vsi *vsi, uint16_t vlan_id) } static int -ice_add_vlan_filter(struct ice_vsi *vsi, uint16_t vlan_id) +ice_add_vlan_filter(struct ice_vsi *vsi, struct ice_vlan *vlan) { struct ice_fltr_list_entry *v_list_itr = NULL; struct ice_vlan_filter *f; @@ 
-965,13 +966,13 @@ ice_add_vlan_filter(struct ice_vsi *vsi, uint16_t vlan_id) struct ice_hw *hw; int ret = 0; - if (!vsi || vlan_id > RTE_ETHER_MAX_VLAN_ID) + if (!vsi || vlan->vid > RTE_ETHER_MAX_VLAN_ID) return -EINVAL; hw = ICE_VSI_TO_HW(vsi); /* If it's added and configured, return. */ - f = ice_find_vlan_filter(vsi, vlan_id); + f = ice_find_vlan_filter(vsi, vlan); if (f) { PMD_DRV_LOG(INFO, "This VLAN filter already exists."); return 0; @@ -988,7 +989,9 @@ ice_add_vlan_filter(struct ice_vsi *vsi, uint16_t vlan_id) ret = -ENOMEM; goto DONE; } - v_list_itr->fltr_info.l_data.vlan.vlan_id = vlan_id; + v_list_itr->fltr_info.l_data.vlan.vlan_id = vlan->vid; + v_list_itr->fltr_info.l_data.vlan.tpid = vlan->tpid; + v_list_itr->fltr_info.l_data.vlan.tpid_valid = true; v_list_itr->fltr_info.src_id = ICE_SRC_ID_VSI; v_list_itr->fltr_info.fltr_act = ICE_FWD_TO_VSI; v_list_itr->fltr_info.lkup_type = ICE_SW_LKUP_VLAN; @@ -1012,7 +1015,8 @@ ice_add_vlan_filter(struct ice_vsi *vsi, uint16_t vlan_id) ret = -ENOMEM; goto DONE; } - f->vlan_info.vlan_id = vlan_id; + f->vlan_info.vlan.tpid = vlan->tpid; + f->vlan_info.vlan.vid = vlan->vid; TAILQ_INSERT_TAIL(&vsi->vlan_list, f, next); vsi->vlan_num++; @@ -1024,7 +1028,7 @@ ice_add_vlan_filter(struct ice_vsi *vsi, uint16_t vlan_id) } static int -ice_remove_vlan_filter(struct ice_vsi *vsi, uint16_t vlan_id) +ice_remove_vlan_filter(struct ice_vsi *vsi, struct ice_vlan *vlan) { struct ice_fltr_list_entry *v_list_itr = NULL; struct ice_vlan_filter *f; @@ -1032,17 +1036,13 @@ ice_remove_vlan_filter(struct ice_vsi *vsi, uint16_t vlan_id) struct ice_hw *hw; int ret = 0; - /** -* Vlan 0 is the generic filter for untagged packets -* and can't be removed. 
-*/ - if (!vsi || vlan_id == 0 || vlan_id > RTE_ETHER_MAX_VLAN_ID) + if (!vsi || vlan->vid > RTE_ETHER_MAX_VLAN_ID) return -EINVAL; hw = ICE_VSI_TO_HW(vsi); /* Can't find it, return an error */ - f = ice_find_vlan_filter(vsi, vlan_id); + f = ice_find_vlan_filter(vsi, vlan); if (!f) return -EINVAL; @@ -1055,7 +1055,9 @@ ice_remove_vlan_filter(struct ice_vsi *vsi, uint16_t vlan_id) goto DONE; } - v_list_itr->fltr_info.l_data.vlan.vlan_id = vlan_id; + v_list_itr->fltr_info.l_data.vlan.vlan_id = vlan->vid; + v_list_itr->fltr_info.l_data.vlan.tpid = vlan->tpid; + v_list_itr->fltr_info.l_data.vlan.tpid_valid = true; v_lis
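Keying the software filter list on the (TPID, VID) pair is what lets a "VLAN 0, no TPID" entry and a "VLAN 0, TPID 0x8100" entry coexist. The following is a simplified sketch of that bookkeeping, using a fixed array where the driver uses a TAILQ and leaving out the hardware programming entirely; all names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

struct sk_vlan {
	uint16_t tpid;
	uint16_t vid;
};

#define SK_MAX_VLAN_ID 4095 /* highest valid 12-bit VLAN ID */
#define SK_MAX_FILTERS 8    /* arbitrary capacity for this sketch */

static struct sk_vlan sk_filters[SK_MAX_FILTERS];
static size_t sk_filter_cnt;

/* A filter matches only when both the TPID and the VLAN ID match. */
static struct sk_vlan *sk_find_vlan_filter(const struct sk_vlan *vlan)
{
	size_t i;

	for (i = 0; i < sk_filter_cnt; i++)
		if (sk_filters[i].tpid == vlan->tpid &&
		    sk_filters[i].vid == vlan->vid)
			return &sk_filters[i];
	return NULL;
}

/* Add a filter; adding an existing one is a no-op that reports success. */
static int sk_add_vlan_filter(const struct sk_vlan *vlan)
{
	if (vlan->vid > SK_MAX_VLAN_ID)
		return -1;
	if (sk_find_vlan_filter(vlan))
		return 0;
	if (sk_filter_cnt == SK_MAX_FILTERS)
		return -1;
	sk_filters[sk_filter_cnt++] = *vlan;
	return 0;
}
```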
[dpdk-dev] [PATCH 16/22] net/ice: fix VLAN strip for double VLAN
From: Haiyue Wang [ upstream commit 8ac4307504bed19ce68b39bc2703975ee0b9ab81 ] VLAN strip was failing for double VLAN because of hardware configuration, resulting mbuf not having the vlan_tci information. Adjusted the strip setting according to current VLAN mode to fix the VLAN strip. Fixes: 14e7a4b37b4f ("net/ice/base: support configuring device in double VLAN mode") Signed-off-by: Haiyue Wang Acked-by: Qiming Yang --- drivers/net/ice/ice_ethdev.c | 297 +++ 1 file changed, 131 insertions(+), 166 deletions(-) diff --git a/drivers/net/ice/ice_ethdev.c b/drivers/net/ice/ice_ethdev.c index 9ce0280726..1d7e5ffbc4 100644 --- a/drivers/net/ice/ice_ethdev.c +++ b/drivers/net/ice/ice_ethdev.c @@ -70,8 +70,6 @@ static struct proto_xtr_ol_flag ice_proto_xtr_ol_flag_params[] = { .ol_flag = &rte_net_ice_dynflag_proto_xtr_ip_offset_mask }, }; -#define ICE_DFLT_OUTER_TAG_TYPE ICE_AQ_VSI_OUTER_TAG_VLAN_9100 - #define ICE_OS_DEFAULT_PKG_NAME"ICE OS Default Package" #define ICE_COMMS_PKG_NAME "ICE COMMS Package" #define ICE_MAX_RES_DESC_NUM1024 @@ -1119,127 +1117,6 @@ ice_remove_all_mac_vlan_filters(struct ice_vsi *vsi) return ret; } -static int -ice_vsi_config_qinq_insertion(struct ice_vsi *vsi, bool on) -{ - struct ice_hw *hw = ICE_VSI_TO_HW(vsi); - struct ice_vsi_ctx ctxt; - uint8_t qinq_flags; - int ret = 0; - - /* Check if it has been already on or off */ - if (vsi->info.valid_sections & - rte_cpu_to_le_16(ICE_AQ_VSI_PROP_OUTER_TAG_VALID)) { - if (on) { - if ((vsi->info.outer_vlan_flags & -ICE_AQ_VSI_OUTER_VLAN_PORT_BASED_ACCEPT_HOST) == - ICE_AQ_VSI_OUTER_VLAN_PORT_BASED_ACCEPT_HOST) - return 0; /* already on */ - } else { - if (!(vsi->info.outer_vlan_flags & - ICE_AQ_VSI_OUTER_VLAN_PORT_BASED_ACCEPT_HOST)) - return 0; /* already off */ - } - } - - if (on) - qinq_flags = ICE_AQ_VSI_OUTER_VLAN_PORT_BASED_ACCEPT_HOST; - else - qinq_flags = 0; - /* clear global insertion and use per packet insertion */ - vsi->info.outer_vlan_flags &= ~(ICE_AQ_VSI_OUTER_VLAN_PORT_BASED_INSERT); - 
vsi->info.outer_vlan_flags &= ~(ICE_AQ_VSI_OUTER_VLAN_PORT_BASED_ACCEPT_HOST); - vsi->info.outer_vlan_flags |= qinq_flags; - /* use default vlan type 0x8100 */ - vsi->info.outer_vlan_flags &= ~(ICE_AQ_VSI_OUTER_TAG_TYPE_M); - vsi->info.outer_vlan_flags |= ICE_DFLT_OUTER_TAG_TYPE << -ICE_AQ_VSI_OUTER_TAG_TYPE_S; - (void)rte_memcpy(&ctxt.info, &vsi->info, sizeof(vsi->info)); - ctxt.info.valid_sections = - rte_cpu_to_le_16(ICE_AQ_VSI_PROP_OUTER_TAG_VALID); - ctxt.vsi_num = vsi->vsi_id; - ret = ice_update_vsi(hw, vsi->idx, &ctxt, NULL); - if (ret) { - PMD_DRV_LOG(INFO, - "Update VSI failed to %s qinq stripping", - on ? "enable" : "disable"); - return -EINVAL; - } - - vsi->info.valid_sections |= - rte_cpu_to_le_16(ICE_AQ_VSI_PROP_OUTER_TAG_VALID); - - return ret; -} - -static int -ice_vsi_config_qinq_stripping(struct ice_vsi *vsi, bool on) -{ - struct ice_hw *hw = ICE_VSI_TO_HW(vsi); - struct ice_vsi_ctx ctxt; - uint8_t qinq_flags; - int ret = 0; - - /* Check if it has been already on or off */ - if (vsi->info.valid_sections & - rte_cpu_to_le_16(ICE_AQ_VSI_PROP_OUTER_TAG_VALID)) { - if (on) { - if ((vsi->info.outer_vlan_flags & -ICE_AQ_VSI_OUTER_VLAN_EMODE_M) == - ICE_AQ_VSI_OUTER_VLAN_EMODE_SHOW) - return 0; /* already on */ - } else { - if ((vsi->info.outer_vlan_flags & -ICE_AQ_VSI_OUTER_VLAN_EMODE_M) == - ICE_AQ_VSI_OUTER_VLAN_EMODE_SHOW_BOTH) - return 0; /* already off */ - } - } - - if (on) - qinq_flags = ICE_AQ_VSI_OUTER_VLAN_EMODE_SHOW; - else - qinq_flags = ICE_AQ_VSI_OUTER_VLAN_EMODE_SHOW_BOTH; - vsi->info.outer_vlan_flags &= ~(ICE_AQ_VSI_OUTER_VLAN_EMODE_M); - vsi->info.outer_vlan_flags |= qinq_flags; - /* use default vlan type 0x8100 */ - vsi->info.outer_vlan_flags &= ~(ICE_AQ_VSI_OUTER_TAG_TYPE_M); - vsi->info.outer_vlan_flags |= ICE_DFLT_OUTER_TAG_TYPE << -ICE_AQ_VSI_OUTER_TAG_TYPE_S; - (void)rte_memcpy(&ctxt.info, &vsi->info, sizeof(vsi->info));
[dpdk-dev] [PATCH 20/22] net/ice/base: fix wrong ptype bitmap for IP fragment
From: Ting Xu IPv4 and IPv6 fragment ptypes are supposed to be separated from IP other ptypes. New bitmaps for IP fragment ptypes were created, but the IP fragment ptypes were not deleted from the previous non-frag bitmaps, which will cause conflicts. This patch removes IP fragment ptypes from the non-frag bitmaps. Signed-off-by: Ting Xu --- drivers/net/ice/base/ice_flow.c | 36 ++--- 1 file changed, 20 insertions(+), 16 deletions(-) diff --git a/drivers/net/ice/base/ice_flow.c b/drivers/net/ice/base/ice_flow.c index 049e2f0c26..e50de9503d 100644 --- a/drivers/net/ice/base/ice_flow.c +++ b/drivers/net/ice/base/ice_flow.c @@ -226,11 +226,11 @@ static const u32 ice_ptypes_macvlan_il[] = { 0x, 0x, 0x, 0x, }; -/* Packet types for packets with an Outer/First/Single IPv4 header, does NOT - * include IPV4 other PTYPEs +/* Packet types for packets with an Outer/First/Single non-frag IPv4 header, + * does NOT include IPV4 other PTYPEs */ static const u32 ice_ptypes_ipv4_ofos[] = { - 0x1DC0, 0x24000800, 0x, 0x, + 0x1D80, 0x24000800, 0x, 0x, 0x, 0x0155, 0x, 0x, 0x, 0x000FC000, 0x02A0, 0x, 0x, 0x, 0x, 0x, @@ -240,11 +240,11 @@ static const u32 ice_ptypes_ipv4_ofos[] = { 0x, 0x, 0x, 0x, }; -/* Packet types for packets with an Outer/First/Single IPv4 header, includes - * IPV4 other PTYPEs +/* Packet types for packets with an Outer/First/Single non-frag IPv4 header, + * includes IPV4 other PTYPEs */ static const u32 ice_ptypes_ipv4_ofos_all[] = { - 0x1DC0, 0x24000800, 0x, 0x, + 0x1D80, 0x24000800, 0x, 0x, 0x, 0x0155, 0x, 0x, 0x, 0x000FC000, 0x83E0FAA0, 0x0101, 0x, 0x, 0x, 0x, @@ -266,11 +266,11 @@ static const u32 ice_ptypes_ipv4_il[] = { 0x, 0x, 0x, 0x, }; -/* Packet types for packets with an Outer/First/Single IPv6 header, does NOT - * include IVP6 other PTYPEs +/* Packet types for packets with an Outer/First/Single non-frag IPv6 header, + * does NOT include IVP6 other PTYPEs */ static const u32 ice_ptypes_ipv6_ofos[] = { - 0x, 0x, 0x7700, 0x10002000, + 0x, 0x, 0x7600, 
0x10002000, 0x, 0x02AA, 0x, 0x, 0x, 0x03F0, 0x0540, 0x, 0x, 0x, 0x, 0x, @@ -280,11 +280,11 @@ static const u32 ice_ptypes_ipv6_ofos[] = { 0x, 0x, 0x, 0x, }; -/* Packet types for packets with an Outer/First/Single IPv6 header, includes - * IPV6 other PTYPEs +/* Packet types for packets with an Outer/First/Single non-frag IPv6 header, + * includes IPV6 other PTYPEs */ static const u32 ice_ptypes_ipv6_ofos_all[] = { - 0x, 0x, 0x7700, 0x10002000, + 0x, 0x, 0x7600, 0x10002000, 0x, 0x02AA, 0x, 0x, 0x, 0x03F0, 0x7C1F0540, 0x0206, 0x, 0x, 0x, 0x, @@ -306,9 +306,11 @@ static const u32 ice_ptypes_ipv6_il[] = { 0x, 0x, 0x, 0x, }; -/* Packet types for packets with an Outer/First/Single IPv4 header - no L4 */ +/* Packet types for packets with an Outer/First/Single + * non-frag IPv4 header - no L4 + */ static const u32 ice_ptypes_ipv4_ofos_no_l4[] = { - 0x10C0, 0x04000800, 0x, 0x, + 0x1080, 0x04000800, 0x, 0x, 0x, 0x, 0x, 0x, 0x, 0x000cc000, 0x02A0, 0x, 0x, 0x, 0x, 0x, @@ -330,9 +332,11 @@ static const u32 ice_ptypes_ipv4_il_no_l4[] = { 0x, 0x, 0x, 0x, }; -/* Packet types for packets with an Outer/First/Single IPv6 header - no L4 */ +/* Packet types for packets with an Outer/First/Single + * non-frag IPv6 header - no L4 + */ static const u32 ice_ptypes_ipv6_ofos_no_l4[] = { - 0x, 0x, 0x4300, 0x10002000, + 0x, 0x, 0x4200, 0x10002000, 0x, 0x, 0x, 0x, 0x, 0x0230, 0x0540, 0x, 0x, 0x, 0x, 0x, -- 2.25.1
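As a rough illustration of the conflict being fixed: if each ptype is one bit in an array of 32-bit words, a fragment ptype left set in a non-frag bitmap makes the two bitmaps overlap. The sketch assumes that ptype n lives at bit n % 32 of word n / 32, and it uses a made-up ptype number and made-up word values; none of these are the driver's real ones.

```c
#include <stdbool.h>
#include <stdint.h>

#define SK_PTYPE_WORDS 4 /* shortened bitmap for this sketch */

/* Remove one packet type from a bitmap. */
static void sk_clear_ptype(uint32_t *bitmap, unsigned int ptype)
{
	bitmap[ptype / 32] &= ~(UINT32_C(1) << (ptype % 32));
}

/* Report whether two ptype bitmaps share any packet type. */
static bool sk_bitmaps_overlap(const uint32_t *a, const uint32_t *b)
{
	unsigned int i;

	for (i = 0; i < SK_PTYPE_WORDS; i++)
		if (a[i] & b[i])
			return true;
	return false;
}
```

Clearing the fragment bit from the non-frag bitmap removes the overlap, which is the effect of the changed constants in the hunk above.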
[dpdk-dev] [PATCH 18/22] net/ice: enable QinQ filter for switch
From: Junfeng Guo [ upstream commit bb3386f348ddf1a32b752ca371146e6be5c56a8b ] Enable the double VLAN support for switch QinQ filtering. Signed-off-by: Wei Zhao Signed-off-by: Haiyue Wang Signed-off-by: Junfeng Guo Acked-by: Qi Zhang --- drivers/net/ice/ice_generic_flow.c | 8 +++ drivers/net/ice/ice_generic_flow.h | 1 + drivers/net/ice/ice_switch_filter.c | 106 +--- 3 files changed, 104 insertions(+), 11 deletions(-) diff --git a/drivers/net/ice/ice_generic_flow.c b/drivers/net/ice/ice_generic_flow.c index bc4e0a5704..c9910a65d1 100644 --- a/drivers/net/ice/ice_generic_flow.c +++ b/drivers/net/ice/ice_generic_flow.c @@ -1480,6 +1480,14 @@ enum rte_flow_item_type pattern_eth_qinq_pppoes[] = { RTE_FLOW_ITEM_TYPE_PPPOES, RTE_FLOW_ITEM_TYPE_END, }; +enum rte_flow_item_type pattern_eth_qinq_pppoes_proto[] = { + RTE_FLOW_ITEM_TYPE_ETH, + RTE_FLOW_ITEM_TYPE_VLAN, + RTE_FLOW_ITEM_TYPE_VLAN, + RTE_FLOW_ITEM_TYPE_PPPOES, + RTE_FLOW_ITEM_TYPE_PPPOE_PROTO_ID, + RTE_FLOW_ITEM_TYPE_END, +}; enum rte_flow_item_type pattern_eth_pppoes_ipv4[] = { RTE_FLOW_ITEM_TYPE_ETH, RTE_FLOW_ITEM_TYPE_PPPOES, diff --git a/drivers/net/ice/ice_generic_flow.h b/drivers/net/ice/ice_generic_flow.h index eb0368e280..1b2cdf7e4c 100644 --- a/drivers/net/ice/ice_generic_flow.h +++ b/drivers/net/ice/ice_generic_flow.h @@ -433,6 +433,7 @@ extern enum rte_flow_item_type pattern_eth_pppoes_proto[]; extern enum rte_flow_item_type pattern_eth_vlan_pppoes[]; extern enum rte_flow_item_type pattern_eth_vlan_pppoes_proto[]; extern enum rte_flow_item_type pattern_eth_qinq_pppoes[]; +extern enum rte_flow_item_type pattern_eth_qinq_pppoes_proto[]; extern enum rte_flow_item_type pattern_eth_pppoes_ipv4[]; extern enum rte_flow_item_type pattern_eth_vlan_pppoes_ipv4[]; extern enum rte_flow_item_type pattern_eth_qinq_pppoes_ipv4[]; diff --git a/drivers/net/ice/ice_switch_filter.c b/drivers/net/ice/ice_switch_filter.c index c383cade65..7e4f041a68 100644 --- a/drivers/net/ice/ice_switch_filter.c +++ 
b/drivers/net/ice/ice_switch_filter.c @@ -35,11 +35,15 @@ #define ICE_SW_INSET_ETHER ( \ ICE_INSET_DMAC | ICE_INSET_SMAC | ICE_INSET_ETHERTYPE) #define ICE_SW_INSET_MAC_VLAN ( \ - ICE_INSET_DMAC | ICE_INSET_SMAC | ICE_INSET_ETHERTYPE | \ - ICE_INSET_VLAN_OUTER) + ICE_INSET_DMAC | ICE_INSET_SMAC | ICE_INSET_ETHERTYPE | \ + ICE_INSET_VLAN_INNER) +#define ICE_SW_INSET_MAC_QINQ ( \ + ICE_SW_INSET_MAC_VLAN | ICE_INSET_VLAN_OUTER) #define ICE_SW_INSET_MAC_IPV4 ( \ ICE_INSET_DMAC | ICE_INSET_IPV4_DST | ICE_INSET_IPV4_SRC | \ ICE_INSET_IPV4_PROTO | ICE_INSET_IPV4_TTL | ICE_INSET_IPV4_TOS) +#define ICE_SW_INSET_MAC_QINQ_IPV4 ( \ + ICE_SW_INSET_MAC_QINQ | ICE_SW_INSET_MAC_IPV4) #define ICE_SW_INSET_MAC_IPV4_TCP ( \ ICE_INSET_DMAC | ICE_INSET_IPV4_DST | ICE_INSET_IPV4_SRC | \ ICE_INSET_IPV4_TTL | ICE_INSET_IPV4_TOS | \ @@ -52,6 +56,8 @@ ICE_INSET_DMAC | ICE_INSET_IPV6_DST | ICE_INSET_IPV6_SRC | \ ICE_INSET_IPV6_TC | ICE_INSET_IPV6_HOP_LIMIT | \ ICE_INSET_IPV6_NEXT_HDR) +#define ICE_SW_INSET_MAC_QINQ_IPV6 ( \ + ICE_SW_INSET_MAC_QINQ | ICE_SW_INSET_MAC_IPV6) #define ICE_SW_INSET_MAC_IPV6_TCP ( \ ICE_INSET_DMAC | ICE_INSET_IPV6_DST | ICE_INSET_IPV6_SRC | \ ICE_INSET_IPV6_HOP_LIMIT | ICE_INSET_IPV6_TC | \ @@ -148,6 +154,8 @@ ice_pattern_match_item ice_switch_pattern_dist_os[] = { ICE_SW_INSET_ETHER, ICE_INSET_NONE}, {pattern_ethertype_vlan, ICE_SW_INSET_MAC_VLAN, ICE_INSET_NONE}, + {pattern_ethertype_qinq, + ICE_SW_INSET_MAC_QINQ, ICE_INSET_NONE}, {pattern_eth_arp, ICE_INSET_NONE, ICE_INSET_NONE}, {pattern_eth_ipv4, @@ -182,6 +190,8 @@ ice_pattern_match_item ice_switch_pattern_dist_comms[] = { ICE_SW_INSET_ETHER, ICE_INSET_NONE}, {pattern_ethertype_vlan, ICE_SW_INSET_MAC_VLAN, ICE_INSET_NONE}, + {pattern_ethertype_qinq, + ICE_SW_INSET_MAC_QINQ, ICE_INSET_NONE}, {pattern_eth_arp, ICE_INSET_NONE, ICE_INSET_NONE}, {pattern_eth_ipv4, @@ -262,6 +272,18 @@ ice_pattern_match_item ice_switch_pattern_dist_comms[] = { ICE_INSET_NONE, ICE_INSET_NONE}, {pattern_eth_ipv6_pfcp, ICE_INSET_NONE, 
ICE_INSET_NONE}, + {pattern_eth_qinq_ipv4, + ICE_SW_INSET_MAC_QINQ_IPV4, ICE_INSET_NONE}, + {pattern_eth_qinq_ipv6, + ICE_SW_INSET_MAC_QINQ_IPV6, ICE_INSET_NONE}, + {pattern_eth_qinq_pppoes, + ICE_SW_INSET_MAC_PPPOE, ICE_INSET_NONE}, + {pattern_eth_qinq_pppoes_proto, + ICE_SW_INSET_MAC_PPPOE_PROTO, ICE_INSET_NONE}, + {pattern_eth_qi
[dpdk-dev] [PATCH 19/22] net/ice: update QinQ switch filter handling
From: Haiyue Wang [ upstream commit 23ea199b732bf54861aaea49e52c1089334b29ae ] The hardware outer/inner VLAN protocol types are now updated to map to new interface VLAN protocol types, so update the application to use new VLAN protocol types when the rte_flow is QinQ filter type. Signed-off-by: Haiyue Wang Acked-by: Qi Zhang --- drivers/net/ice/ice_switch_filter.c | 36 ++--- 1 file changed, 22 insertions(+), 14 deletions(-) diff --git a/drivers/net/ice/ice_switch_filter.c b/drivers/net/ice/ice_switch_filter.c index 7e4f041a68..455123ee06 100644 --- a/drivers/net/ice/ice_switch_filter.c +++ b/drivers/net/ice/ice_switch_filter.c @@ -558,12 +558,17 @@ ice_switch_inset_get(const struct rte_flow_item pattern[], bool profile_rule = 0; bool nvgre_valid = 0; bool vxlan_valid = 0; + bool qinq_valid = 0; bool ipv6_valid = 0; bool ipv4_valid = 0; bool udp_valid = 0; bool tcp_valid = 0; uint16_t j, t = 0; + if (*tun_type == ICE_SW_TUN_AND_NON_TUN_QINQ || + *tun_type == ICE_NON_TUN_QINQ) + qinq_valid = 1; + for (item = pattern; item->type != RTE_FLOW_ITEM_TYPE_END; item++) { if (item->last) { @@ -1101,22 +1106,25 @@ ice_switch_inset_get(const struct rte_flow_item pattern[], return 0; } - if (!outer_vlan_valid && - (*tun_type == ICE_SW_TUN_AND_NON_TUN_QINQ || -*tun_type == ICE_NON_TUN_QINQ)) - outer_vlan_valid = 1; - else if (!inner_vlan_valid && -(*tun_type == ICE_SW_TUN_AND_NON_TUN_QINQ || - *tun_type == ICE_NON_TUN_QINQ)) - inner_vlan_valid = 1; - else if (!inner_vlan_valid) - inner_vlan_valid = 1; + if (qinq_valid) { + if (!outer_vlan_valid) + outer_vlan_valid = 1; + else + inner_vlan_valid = 1; + } if (vlan_spec && vlan_mask) { - if (outer_vlan_valid && !inner_vlan_valid) { - list[t].type = ICE_VLAN_EX; - input_set |= ICE_INSET_VLAN_OUTER; - } else if (inner_vlan_valid) { + if (qinq_valid) { + if (!inner_vlan_valid) { + list[t].type = ICE_VLAN_EX; + input_set |= + ICE_INSET_VLAN_OUTER; + } else { + list[t].type = ICE_VLAN_IN; + input_set |= + ICE_INSET_VLAN_INNER; + } + } 
else { list[t].type = ICE_VLAN_OFOS; input_set |= ICE_INSET_VLAN_INNER; } -- 2.25.1
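
For readers following the hunk above, the new per-VLAN-item decision can be sketched in isolation. This is only an illustrative reduction, not the driver code: `vlan_lkup`, `vlan_state` and `classify_vlan()` are hypothetical names, and the sketch leaves out the `vlan_spec && vlan_mask` check that guards the type assignment in the patch.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for ICE_VLAN_OFOS / ICE_VLAN_EX / ICE_VLAN_IN. */
enum vlan_lkup { VLAN_OFOS, VLAN_EX, VLAN_IN };

struct vlan_state {
	bool qinq_valid;       /* rule was classified as QinQ up front */
	bool outer_vlan_valid; /* first VLAN item already seen */
	bool inner_vlan_valid; /* second VLAN item already seen */
};

/*
 * Classify the next VLAN item of a pattern, following the branches of the
 * patched code: in QinQ mode the first VLAN is the outer one, any later
 * VLAN is the inner one; outside QinQ mode every VLAN uses the OFOS type.
 */
static enum vlan_lkup
classify_vlan(struct vlan_state *st)
{
	if (st->qinq_valid) {
		if (!st->outer_vlan_valid)
			st->outer_vlan_valid = true;
		else
			st->inner_vlan_valid = true;
		return st->inner_vlan_valid ? VLAN_IN : VLAN_EX;
	}
	return VLAN_OFOS;
}
```

Calling `classify_vlan()` twice on a state with `qinq_valid` set gives the outer type first and the inner type second, which is the ordering the commit message describes.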
[dpdk-dev] [PATCH 21/22] net/ice: support flow priority for DCF switch filter
From: Yuying Zhang [ upstream commit 2321e34c23b386c46e4a644682e40214cf59ee4f ] This patch is not for LTS upstream, just for customer to cherry-pick. Support rte flow priority attribute for DCF switch filter. When a packet is matched by two rules, the behavior of it is not defined. This patch supports flow priority to create different recipes for this situation. Only priority 0 and 1 are supported and higher value denotes higher priority. for example: 1. flow create 0 priority 0 ingress pattern eth / vlan tci is 2 / vlan tci is 2 / end actions vf id 2 / end 2. flow create 0 priority 1 ingress pattern eth / vlan / vlan / ipv4 dst is 192.168.0.1 / end actions vf id 1 / end These two rules can be created at the same time in DCF switch filter and priority of rule 2 is higher. Packet hits rule 2 when two conditions of rules are satisfied. Signed-off-by: Yuying Zhang Acked-by: Qi Zhang --- drivers/net/ice/ice_acl_filter.c| 1 + drivers/net/ice/ice_fdir_filter.c | 1 + drivers/net/ice/ice_generic_flow.c | 18 ++ drivers/net/ice/ice_generic_flow.h | 1 + drivers/net/ice/ice_hash.c | 2 ++ drivers/net/ice/ice_switch_filter.c | 14 +- 6 files changed, 24 insertions(+), 13 deletions(-) diff --git a/drivers/net/ice/ice_acl_filter.c b/drivers/net/ice/ice_acl_filter.c index f7dbe53574..14e36aa9f6 100644 --- a/drivers/net/ice/ice_acl_filter.c +++ b/drivers/net/ice/ice_acl_filter.c @@ -904,6 +904,7 @@ ice_acl_parse(struct ice_adapter *ad, uint32_t array_len, const struct rte_flow_item pattern[], const struct rte_flow_action actions[], + uint32_t priority __rte_unused, void **meta, struct rte_flow_error *error) { diff --git a/drivers/net/ice/ice_fdir_filter.c b/drivers/net/ice/ice_fdir_filter.c index 4a071254ce..d2bc882720 100644 --- a/drivers/net/ice/ice_fdir_filter.c +++ b/drivers/net/ice/ice_fdir_filter.c @@ -2029,6 +2029,7 @@ ice_fdir_parse(struct ice_adapter *ad, uint32_t array_len, const struct rte_flow_item pattern[], const struct rte_flow_action actions[], + uint32_t priority 
__rte_unused, void **meta, struct rte_flow_error *error) { diff --git a/drivers/net/ice/ice_generic_flow.c b/drivers/net/ice/ice_generic_flow.c index c9910a65d1..ec141e8fa0 100644 --- a/drivers/net/ice/ice_generic_flow.c +++ b/drivers/net/ice/ice_generic_flow.c @@ -1799,6 +1799,7 @@ enum rte_flow_item_type pattern_eth_ipv6_pfcp[] = { typedef struct ice_flow_engine * (*parse_engine_t)(struct ice_adapter *ad, struct rte_flow *flow, struct ice_parser_list *parser_list, + uint32_t priority, const struct rte_flow_item pattern[], const struct rte_flow_action actions[], struct rte_flow_error *error); @@ -1990,11 +1991,10 @@ ice_flow_valid_attr(struct ice_adapter *ad, } else { *ice_pipeline_stage = ICE_FLOW_CLASSIFY_STAGE_DISTRIBUTOR_ONLY; - /* Not supported */ - if (attr->priority) { + if (attr->priority > 1) { rte_flow_error_set(error, EINVAL, RTE_FLOW_ERROR_TYPE_ATTR_PRIORITY, - attr, "Not support priority."); + attr, "Only support priority 0 and 1."); return -rte_errno; } } @@ -2139,6 +2139,7 @@ static struct ice_flow_engine * ice_parse_engine_create(struct ice_adapter *ad, struct rte_flow *flow, struct ice_parser_list *parser_list, + uint32_t priority, const struct rte_flow_item pattern[], const struct rte_flow_action actions[], struct rte_flow_error *error) @@ -2154,7 +2155,7 @@ ice_parse_engine_create(struct ice_adapter *ad, if (parser_node->parser->parse_pattern_action(ad, parser_node->parser->array, parser_node->parser->array_len, - pattern, actions, &meta, error) < 0) + pattern, actions, priority, &meta, error) < 0) continue; engine = parser_node->parser->engine; @@ -2172,6 +2173,7 @@ static struct ice_flow_engine * ice_parse_engine_validate(struct ice_adapter *ad, struct rte_flow *flow __rte_unused, struct ice_parser_list *parser_list, + uint32_t priority, const struct rte_flow_item pattern[], const struct rte_flow_action actions[], struct rte_flow_error *error) @@ -2184,7 +2186,7 @@ ice_parse_engine_validate(struct ice_adapter *ad, if (parser_node->par
[dpdk-dev] [PATCH 22/22] net/ice/base: add priority check of matching recipe
From: Qi Zhang [ upstream commit 2e6228787d91967a775fb1d99cd887d3f11ad5c6 ] This patch is not for LTS upstream, just for customer to cherry-pick. Check priority when look for a recipe which matches our request to enable flow priority for switch filter. Signed-off-by: Yuying Zhang Signed-off-by: Qi Zhang Acked-by: Qiming Yang --- drivers/net/ice/base/ice_switch.c | 9 + 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/drivers/net/ice/base/ice_switch.c b/drivers/net/ice/base/ice_switch.c index d5576267b8..6c4dfec062 100644 --- a/drivers/net/ice/base/ice_switch.c +++ b/drivers/net/ice/base/ice_switch.c @@ -6172,7 +6172,7 @@ static struct ice_protocol_entry ice_prot_id_tbl[ICE_PROTOCOL_LAST] = { * Returns index of matching recipe, or ICE_MAX_NUM_RECIPES if not found. */ static u16 ice_find_recp(struct ice_hw *hw, struct ice_prot_lkup_ext *lkup_exts, -enum ice_sw_tunnel_type tun_type) +enum ice_sw_tunnel_type tun_type, u32 priority) { bool refresh_required = true; struct ice_sw_recipe *recp; @@ -6234,7 +6234,8 @@ static u16 ice_find_recp(struct ice_hw *hw, struct ice_prot_lkup_ext *lkup_exts, /* If for "i"th recipe the found was never set to false * then it means we found our match */ - if (tun_type == recp[i].tun_type && found) + if (tun_type == recp[i].tun_type && found && + priority == recp[i].priority) return i; /* Return the recipe ID */ } } @@ -7248,7 +7249,7 @@ ice_add_adv_recipe(struct ice_hw *hw, struct ice_adv_lkup_elem *lkups, } /* Look for a recipe which matches our requested fv / mask list */ - *rid = ice_find_recp(hw, lkup_exts, rinfo->tun_type); + *rid = ice_find_recp(hw, lkup_exts, rinfo->tun_type, rinfo->priority); if (*rid < ICE_MAX_NUM_RECIPES) /* Success if found a recipe that match the existing criteria */ goto err_unroll; @@ -8377,7 +8378,7 @@ ice_rem_adv_rule(struct ice_hw *hw, struct ice_adv_lkup_elem *lkups, if (status) return status; - rid = ice_find_recp(hw, &lkup_exts, rinfo->tun_type); + rid = ice_find_recp(hw, &lkup_exts, 
rinfo->tun_type, rinfo->priority); /* If did not find a recipe that match the existing criteria */ if (rid == ICE_MAX_NUM_RECIPES) return ICE_ERR_PARAM; -- 2.25.1
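
A minimal sketch of the matching rule this patch adds, assuming a simplified recipe table: a recipe is reused only when the lookup words, the tunnel type and now also the priority agree. The names `toy_recipe`, `toy_find_recp()` and `TOY_MAX_NUM_RECIPES` are hypothetical, and the `fv_match` flag stands in for the field-vector comparison that the real `ice_find_recp()` performs.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_NUM_RECIPES 64 /* hypothetical; plays the role of ICE_MAX_NUM_RECIPES */

struct toy_recipe {
	bool fv_match;     /* stand-in for the "found" result of the word comparison */
	int tun_type;      /* tunnel type the recipe was created for */
	uint32_t priority; /* priority the recipe was created with */
};

/* Return the index of the first matching recipe, or TOY_MAX_NUM_RECIPES. */
static uint16_t
toy_find_recp(const struct toy_recipe *recp, size_t n,
	      int tun_type, uint32_t priority)
{
	for (size_t i = 0; i < n && i < TOY_MAX_NUM_RECIPES; i++) {
		if (recp[i].fv_match && recp[i].tun_type == tun_type &&
		    recp[i].priority == priority)
			return (uint16_t)i;
	}
	return TOY_MAX_NUM_RECIPES;
}
```

With this rule, two rules that share a pattern and tunnel type but differ in priority end up on different recipes, which is what the flow priority support in the previous patch relies on.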
Re: [dpdk-dev] [PATCH 2/2] vhost: notice Vhost ops struct renaming
On 7/29/21 4:42 PM, Maxime Coquelin wrote:
> This patch announce the renaming of struct vhost_device_ops
> to rte_vhost_device_ops in DPDK v21.11.
>
> Signed-off-by: Maxime Coquelin
> ---
>  doc/guides/rel_notes/deprecation.rst | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
> index b34bed61a6..76ebf162bd 100644
> --- a/doc/guides/rel_notes/deprecation.rst
> +++ b/doc/guides/rel_notes/deprecation.rst
> @@ -151,3 +151,6 @@ Deprecation Notices
>  * vhost: ``rte_vdpa_register_device``, ``rte_vdpa_unregister_device``,
>    ``rte_vhost_host_notifier_ctrl`` and ``rte_vdpa_relay_vring_used`` vDPA
>    driver API will be marked as internal in DPDK v21.11.
> +
> +* vhost: rename ``struct vhost_device_ops`` to ``struct rte_vhost_device_ops``
> +  int DPDK v21.11.
>

With nits spotted by Chenbo,

Acked-by: Adrian Moreno
Re: [dpdk-dev] [dpdk-announce] release candidate 21.08-rc3
Hi Thomas,

The testing with dpdk 21.08-rc3 from Broadcom looks good.

The following is a list of tests executed with 21.08-rc3:
- Basic functionality: Send and receive multiple types of traffic.
- testpmd xstats counter test.
- RSS tests.
- VLAN filtering tests.
- Rx Checksum tests
- TSO tests.
- MTU and Jumbo frame tests
- Changing/checking link status through testpmd.
- Unicast/multicast MAC filtering tests
- VXLAN/Geneve Rx CSO, TSO, RSS tests

We don't see any critical issues blocking the release.

Regards,
Kalesh

On Sun, Aug 1, 2021 at 2:49 AM Thomas Monjalon wrote:
> A new DPDK release candidate is ready for testing:
> https://git.dpdk.org/dpdk/tag/?id=v21.08-rc3
>
> There are 70 new patches in this snapshot.
>
> Release notes:
> https://doc.dpdk.org/guides/rel_notes/release_21_08.html
>
> Please check all pending announces of deprecations for 21.11.
> https://patches.dpdk.org/project/dpdk/list/?q=announce
>
> You may share some release validation results
> by replying to this message at dev@dpdk.org.
>
> DPDK 21.08-rc4 is expected on Wednesday.
>
> Thank you everyone
>

--
Regards,
Kalesh A P
[dpdk-dev] [PATCH 0/2] Use macro to print MAC address
Added macros to simplify printing of MAC addresses. This avoids the
other method of first formatting the MAC address into a string and then
printing that string.

Aman Singh (2):
  net: macro for MAC address print
  net: macro to extract MAC address bytes

 app/pdump/main.c | 5 +---
 app/test-pmd/cmdline.c | 6 ++--
 app/test-pmd/config.c | 6 ++--
 app/test-pmd/testpmd.c | 9 ++
 app/test/test_event_eth_rx_adapter.c | 5 +---
 app/test/test_event_eth_tx_adapter.c | 5 +---
 drivers/bus/dpaa/base/fman/netcfg_layer.c | 9 ++
 drivers/common/mlx5/linux/mlx5_nl.c | 6 ++--
 drivers/net/bnx2x/bnx2x.c | 4 +--
 drivers/net/bnx2x/bnx2x_vfpf.c | 10 ++-
 drivers/net/bnx2x/ecore_sp.c | 14 -
 drivers/net/bnxt/bnxt_ethdev.c | 2 +-
 drivers/net/bonding/rte_eth_bond_8023ad.c | 4 +--
 drivers/net/bonding/rte_eth_bond_pmd.c | 12 +++-
 drivers/net/dpaa/dpaa_ethdev.c | 10 ++-
 drivers/net/e1000/igb_ethdev.c | 9 ++
 drivers/net/enic/base/vnic_dev.c | 4 +--
 drivers/net/enic/enic_res.c | 2 +-
 drivers/net/failsafe/failsafe.c | 6 ++--
 drivers/net/hinic/hinic_pmd_ethdev.c | 6 ++--
 drivers/net/i40e/i40e_ethdev_vf.c | 21 --
 drivers/net/iavf/iavf_ethdev.c | 18 +++-
 drivers/net/iavf/iavf_vchnl.c | 15 +++---
 drivers/net/ice/ice_dcf.c | 6 ++--
 drivers/net/ixgbe/ixgbe_ethdev.c | 29 ---
 drivers/net/mlx4/mlx4.c | 7 ++---
 drivers/net/mlx5/linux/mlx5_os.c | 7 ++---
 drivers/net/mlx5/windows/mlx5_os.c | 7 ++---
 drivers/net/mvpp2/mrvl_flow.c | 4 +--
 drivers/net/netvsc/hn_rndis.c | 2 +-
 drivers/net/nfp/nfp_net.c | 2 +-
 drivers/net/qede/base/ecore_mcp.c | 2 +-
 drivers/net/qede/base/ecore_sriov.c | 2 +-
 drivers/net/qede/qede_ethdev.c | 9 ++
 drivers/net/thunderx/nicvf_ethdev.c | 2 +-
 drivers/net/txgbe/txgbe_ethdev_vf.c | 29 ---
 drivers/net/virtio/virtio_ethdev.c | 4 +--
 drivers/net/vmxnet3/vmxnet3_ethdev.c | 4 +--
 examples/bbdev_app/main.c | 9 ++
 examples/bond/main.c | 3 +-
 examples/distributor/main.c | 5 +---
 examples/ethtool/ethtool-app/ethapp.c | 10 ++-
 .../pipeline_worker_generic.c | 5 +---
 .../eventdev_pipeline/pipeline_worker_tx.c | 5 +---
 examples/flow_classify/flow_classify.c | 5 +---
 examples/ioat/ioatfwd.c | 9 ++
 examples/ip_pipeline/cli.c | 11 ++-
 examples/ipsec-secgw/ipsec-secgw.c | 5 +---
 examples/l2fwd-cat/l2fwd-cat.c | 5 +---
 examples/l2fwd-crypto/main.c | 11 ++-
 examples/l2fwd-event/l2fwd_common.c | 9 ++
 examples/l2fwd-jobstats/main.c | 11 ++-
 examples/l2fwd-keepalive/main.c | 9 ++
 examples/l2fwd/main.c | 11 ++-
 examples/link_status_interrupt/main.c | 9 ++
 examples/packet_ordering/main.c | 5 +---
 examples/pipeline/cli.c | 6 ++--
 examples/rxtx_callbacks/main.c | 4 +--
 examples/server_node_efd/server/main.c | 6 ++--
 examples/skeleton/basicfwd.c | 5 +---
 examples/vhost/main.c | 17 +++
 examples/vm_power_manager/channel_monitor.c | 4 +--
 .../guest_cli/vm_power_cli_guest.c | 5 +---
 examples/vm_power_manager/main.c | 5 +---
 examples/vmdq/main.c | 14 ++---
 examples/vmdq_dcb/main.c | 14 ++---
 lib/net/rte_ether.h | 14 +
 lib/vhost/vhost_user.c | 2 +-
 68 files changed, 156 insertions(+), 381 deletions(-)

--
2.17.1
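
To illustrate the idea of this series outside of DPDK, here is a small self-contained sketch of a format-string macro paired with a byte-extraction macro. `TOY_MAC_PRT_FMT`, `TOY_MAC_BYTES`, `struct toy_mac_addr` and `toy_format_mac()` are hypothetical stand-ins written for this example only; they are not the definitions the patches add to lib/net/rte_ether.h. The lower-case hex format is an assumption taken from the description of patch 1/2.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for an Ethernet address structure. */
struct toy_mac_addr {
	uint8_t addr_bytes[6];
};

/* Format string for one MAC address (lower-case hex, colon separated). */
#define TOY_MAC_PRT_FMT "%02x:%02x:%02x:%02x:%02x:%02x"

/* Expands to the six byte arguments that TOY_MAC_PRT_FMT consumes. */
#define TOY_MAC_BYTES(m) \
	(m)->addr_bytes[0], (m)->addr_bytes[1], (m)->addr_bytes[2], \
	(m)->addr_bytes[3], (m)->addr_bytes[4], (m)->addr_bytes[5]

/* Format "Port MAC: xx:xx:..." into buf without an intermediate string. */
static int
toy_format_mac(char *buf, size_t len, const struct toy_mac_addr *m)
{
	return snprintf(buf, len, "Port MAC: " TOY_MAC_PRT_FMT,
			TOY_MAC_BYTES(m));
}
```

The point of the pairing is that a call site shrinks from six explicit `addr_bytes[i]` arguments to one macro invocation, which is where most of the 381 deleted lines in the diffstat come from.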
[dpdk-dev] [PATCH 1/2] net: macro for MAC address print
Added macro to print six bytes of MAC address. The MAC addresses will be printed in lower case hexdecimal format. In case there is a specific check for upper case MAC address, the user may need to make a change in such test case after this patch. Signed-off-by: Aman Singh --- app/test-pmd/cmdline.c| 2 +- app/test-pmd/config.c | 2 +- app/test-pmd/testpmd.c| 2 +- drivers/bus/dpaa/base/fman/netcfg_layer.c | 2 +- drivers/common/mlx5/linux/mlx5_nl.c | 2 +- drivers/net/bnx2x/bnx2x.c | 4 ++-- drivers/net/bnx2x/bnx2x_vfpf.c| 3 ++- drivers/net/bnx2x/ecore_sp.c | 14 +++--- drivers/net/bnxt/bnxt_ethdev.c| 2 +- drivers/net/bonding/rte_eth_bond_8023ad.c | 4 ++-- drivers/net/bonding/rte_eth_bond_pmd.c| 4 ++-- drivers/net/dpaa/dpaa_ethdev.c| 2 +- drivers/net/e1000/igb_ethdev.c| 2 +- drivers/net/enic/base/vnic_dev.c | 4 ++-- drivers/net/enic/enic_res.c | 2 +- drivers/net/failsafe/failsafe.c | 2 +- drivers/net/hinic/hinic_pmd_ethdev.c | 2 +- drivers/net/i40e/i40e_ethdev_vf.c | 6 +++--- drivers/net/iavf/iavf_ethdev.c| 4 ++-- drivers/net/iavf/iavf_vchnl.c | 4 ++-- drivers/net/ice/ice_dcf.c | 2 +- drivers/net/ixgbe/ixgbe_ethdev.c | 6 +++--- drivers/net/mlx4/mlx4.c | 2 +- drivers/net/mlx5/linux/mlx5_os.c | 2 +- drivers/net/mlx5/windows/mlx5_os.c| 2 +- drivers/net/mvpp2/mrvl_flow.c | 4 ++-- drivers/net/netvsc/hn_rndis.c | 2 +- drivers/net/nfp/nfp_net.c | 2 +- drivers/net/qede/base/ecore_mcp.c | 2 +- drivers/net/qede/base/ecore_sriov.c | 2 +- drivers/net/qede/qede_ethdev.c| 2 +- drivers/net/thunderx/nicvf_ethdev.c | 2 +- drivers/net/txgbe/txgbe_ethdev_vf.c | 6 +++--- drivers/net/virtio/virtio_ethdev.c| 4 ++-- drivers/net/vmxnet3/vmxnet3_ethdev.c | 4 ++-- examples/bbdev_app/main.c | 2 +- examples/ethtool/ethtool-app/ethapp.c | 2 +- examples/ioat/ioatfwd.c | 2 +- examples/ip_pipeline/cli.c| 4 ++-- examples/l2fwd-crypto/main.c | 2 +- examples/l2fwd-event/l2fwd_common.c | 2 +- examples/l2fwd-jobstats/main.c| 2 +- examples/l2fwd-keepalive/main.c | 2 +- examples/l2fwd/main.c | 2 +- 
examples/link_status_interrupt/main.c | 2 +- examples/pipeline/cli.c | 2 +- examples/server_node_efd/server/main.c| 2 +- examples/vhost/main.c | 2 +- examples/vmdq/main.c | 2 +- examples/vmdq_dcb/main.c | 2 +- lib/net/rte_ether.h | 5 + lib/vhost/vhost_user.c| 2 +- 52 files changed, 79 insertions(+), 73 deletions(-) diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c index 82253bc751..d4186eb9b2 100644 --- a/app/test-pmd/cmdline.c +++ b/app/test-pmd/cmdline.c @@ -10899,7 +10899,7 @@ static void cmd_mcast_addr_parsed(void *parsed_result, if (!rte_is_multicast_ether_addr(&res->mc_addr)) { fprintf(stderr, - "Invalid multicast addr %02X:%02X:%02X:%02X:%02X:%02X\n", + "Invalid multicast addr " RTE_ETHER_ADDR_PRT_FMT "\n", res->mc_addr.addr_bytes[0], res->mc_addr.addr_bytes[1], res->mc_addr.addr_bytes[2], res->mc_addr.addr_bytes[3], res->mc_addr.addr_bytes[4], res->mc_addr.addr_bytes[5]); diff --git a/app/test-pmd/config.c b/app/test-pmd/config.c index 31d8ba1b91..21d5db5297 100644 --- a/app/test-pmd/config.c +++ b/app/test-pmd/config.c @@ -782,7 +782,7 @@ port_summary_display(portid_t port_id) if (ret != 0) return; - printf("%-4d %02X:%02X:%02X:%02X:%02X:%02X %-12s %-14s %-8s %s\n", + printf("%-4d " RTE_ETHER_ADDR_PRT_FMT " %-12s %-14s %-8s %s\n", port_id, mac_addr.addr_bytes[0], mac_addr.addr_bytes[1], mac_addr.addr_bytes[2], mac_addr.addr_bytes[3], mac_addr.addr_bytes[4], mac_addr.addr_bytes[5], name, diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c index 6cbe9ba3c8..d0ede963ea 100644 --- a/app/test-pmd/testpmd.c +++ b/app/test-pmd/testpmd.c @@ -2622,7 +2622,7 @@ start_port(portid_t pid) pi); if (eth_macaddr_get_print_err(pi, &port->eth_addr) == 0) - printf("Port %d: %02X:%02X:%02X:%02X:%02X:%02X\n", pi, + printf("Port %d: " RTE_ETHER_ADDR_PRT_FMT "\n", pi, port->eth_addr.addr_bytes[0], port->eth_addr.addr_bytes[1],
[dpdk-dev] [PATCH 2/2] net: macro to extract MAC address bytes
Added macros to simplyfy print of MAC address. The other method of first formatting mac address into a string and string printed, is avoided. Signed-off-by: Aman Singh --- The change in the document will be done in seperate patch. To ensure document has direct reference of the code. --- app/pdump/main.c | 5 +--- app/test-pmd/cmdline.c| 4 +--- app/test-pmd/config.c | 4 +--- app/test-pmd/testpmd.c| 7 +- app/test/test_event_eth_rx_adapter.c | 5 +--- app/test/test_event_eth_tx_adapter.c | 5 +--- drivers/bus/dpaa/base/fman/netcfg_layer.c | 7 +- drivers/common/mlx5/linux/mlx5_nl.c | 4 +--- drivers/net/bnx2x/bnx2x_vfpf.c| 7 +- drivers/net/bonding/rte_eth_bond_pmd.c| 8 ++- drivers/net/dpaa/dpaa_ethdev.c| 8 +-- drivers/net/e1000/igb_ethdev.c| 7 +- drivers/net/failsafe/failsafe.c | 4 +--- drivers/net/hinic/hinic_pmd_ethdev.c | 4 +--- drivers/net/i40e/i40e_ethdev_vf.c | 15 +++- drivers/net/iavf/iavf_ethdev.c| 14 ++- drivers/net/iavf/iavf_vchnl.c | 11 ++--- drivers/net/ice/ice_dcf.c | 4 +--- drivers/net/ixgbe/ixgbe_ethdev.c | 23 +++ drivers/net/mlx4/mlx4.c | 5 +--- drivers/net/mlx5/linux/mlx5_os.c | 5 +--- drivers/net/mlx5/windows/mlx5_os.c| 5 +--- drivers/net/qede/qede_ethdev.c| 7 +- drivers/net/txgbe/txgbe_ethdev_vf.c | 23 +++ examples/bbdev_app/main.c | 7 +- examples/bond/main.c | 3 +-- examples/distributor/main.c | 5 +--- examples/ethtool/ethtool-app/ethapp.c | 8 +-- .../pipeline_worker_generic.c | 5 +--- .../eventdev_pipeline/pipeline_worker_tx.c| 5 +--- examples/flow_classify/flow_classify.c| 5 +--- examples/ioat/ioatfwd.c | 7 +- examples/ip_pipeline/cli.c| 9 ++-- examples/ipsec-secgw/ipsec-secgw.c| 5 +--- examples/l2fwd-cat/l2fwd-cat.c| 5 +--- examples/l2fwd-crypto/main.c | 9 ++-- examples/l2fwd-event/l2fwd_common.c | 7 +- examples/l2fwd-jobstats/main.c| 9 ++-- examples/l2fwd-keepalive/main.c | 7 +- examples/l2fwd/main.c | 9 ++-- examples/link_status_interrupt/main.c | 7 +- examples/packet_ordering/main.c | 5 +--- examples/pipeline/cli.c | 4 +--- 
examples/rxtx_callbacks/main.c| 4 +--- examples/server_node_efd/server/main.c| 4 +--- examples/skeleton/basicfwd.c | 5 +--- examples/vhost/main.c | 15 +++- examples/vm_power_manager/channel_monitor.c | 4 +--- .../guest_cli/vm_power_cli_guest.c| 5 +--- examples/vm_power_manager/main.c | 5 +--- examples/vmdq/main.c | 12 ++ examples/vmdq_dcb/main.c | 12 ++ lib/net/rte_ether.h | 9 53 files changed, 78 insertions(+), 309 deletions(-) diff --git a/app/pdump/main.c b/app/pdump/main.c index 63bbe65cd8..46f9d25db0 100644 --- a/app/pdump/main.c +++ b/app/pdump/main.c @@ -612,10 +612,7 @@ configure_vdev(uint16_t port_id) printf("Port %u MAC: %02"PRIx8" %02"PRIx8" %02"PRIx8 " %02"PRIx8" %02"PRIx8" %02"PRIx8"\n", - port_id, - addr.addr_bytes[0], addr.addr_bytes[1], - addr.addr_bytes[2], addr.addr_bytes[3], - addr.addr_bytes[4], addr.addr_bytes[5]); + port_id, RTE_ETHER_ADDR_BYTES(&addr)); ret = rte_eth_promiscuous_enable(port_id); if (ret != 0) { diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c index d4186eb9b2..a5d6c20be1 100644 --- a/app/test-pmd/cmdline.c +++ b/app/test-pmd/cmdline.c @@ -10900,9 +10900,7 @@ static void cmd_mcast_addr_parsed(void *parsed_result, if (!rte_is_multicast_ether_addr(&res->mc_addr)) { fprintf(stderr, "Invalid multicast addr " RTE_ETHER_ADDR_PRT_FMT "\n", - res->mc_addr.addr_bytes[0], res->mc_addr.addr_bytes[1], - res->mc_addr.addr_bytes[2], res->mc_addr.addr_bytes[3], - res->mc_addr.addr_bytes[4], res->mc_addr.addr_bytes[5]); + RTE_ETHER_ADDR_BYTES(&res->mc_addr)); return; } if (strcmp(res->what, "add") == 0) diff --git a/app/test-pmd/config.c b/app/test-pmd/conf
Re: [dpdk-dev] [EXT] Re: [RFC v2 1/3] eventdev: allow for event devices requiring maintenance
> -----Original Message-----
> From: Mattias Rönnblom
> Sent: Tuesday, August 3, 2021 1:57 PM
> To: Jerin Jacob
> Cc: Jerin Jacob Kollanukkaran ; dpdk-dev ; Richard Eklycke ; Liron Himi
> Subject: [EXT] Re: [dpdk-dev] [RFC v2 1/3] eventdev: allow for event devices
> requiring maintenance
>
> External Email
>
> ----------------------------------------------------------------------
> On 2021-08-03 06:39, Jerin Jacob wrote:
> > On Mon, Aug 2, 2021 at 9:45 PM Mattias Rönnblom wrote:
> >>
> >> Extend Eventdev API to allow for event devices which require various
> >> forms of internal processing to happen, even when events are not
> >> enqueued to or dequeued from a port.
> >>
> >> RFC v2:
> >>   - Change rte_event_maintain() return type to be consistent
> >>     with the documentation.
> >>   - Remove unused typedef from eventdev_pmd.h.
> >>
> >> Signed-off-by: Mattias Rönnblom
> >> Tested-by: Richard Eklycke
> >> Tested-by: Liron Himi
> >> ---
> >>  lib/eventdev/rte_eventdev.h | 62 +
> >>  1 file changed, 62 insertions(+)
> >>
> >> +/**
> >> + * Maintain an event device.
> >> + *
> >> + * This function is only relevant for event devices which has the
> >> + * RTE_EVENT_DEV_CAP_REQUIRES_MAINT flag set. Such devices requires
> >> + * the application to call rte_event_maintain() on a port during periods
> >> + * which it is neither enqueuing nor dequeuing events from this
> >> + * port. No port may be left unattended.
> >> + *
> >> + * An event device's rte_event_maintain() is a low overhead function. In
> >> + * situations when rte_event_maintain() must be called, the application
> >> + * should do so often.
> >
> > See rte_service_component_register() scheme, If a driver needs
> > additional house keeping it can use DPDK's service core scheme to
> > abstract different driver requirements. We may not need any public API
> > for this.
> >
>
> What DSW requires, and indeed any event device that does software-level
> event buffering, is a way to schedule the execution of some function to
> some time later, on the lcore that currently "owns" that port.
>
> Put differently; it's not that the driver "needs some cycles at time T",
> but "it needs some cycles at time T on the lcore thread that currently is
> the user of eventdev port X".
>
> The DSW output buffers and other per-port data structures aren't, for
> simplicity and performance, MT safe. That's one of the reasons the
> processing can't be done by a random service lcore.
>
> Pushing output buffering into the application (or whatever is accessing
> the event device) is not a solution to the DSW<->adapter integration
> issue, since DSW also requires per-port deferred work for the flow
> migration machinery. In addition, if you have a look at the RX adapter,
> for example, you'll see that the buffering logic adds complexity to the
> "application".
>
> The services cores are a rather coarse-grained deferred work construct.
> A more elaborate one might well have been the basis of a better solution
> than the proposed rte_event_maintain(), user-driven API.
>
> rte_event_maintain() is a crude way to make the Ethernet/Crypto/Timer
> adapters work with DSW. I would argue it still puts us in a better
> position than we are today, where the DSW+adapter combo doesn't work at
> all.

+ Adapter maintainers

- My only concern is making this a public API when the application does
  not know at what interval and when to call it.
- We can create an internal API which can be used by the adapter APIs.
  No need to expose a public eventdev API for this.

>
> If/when a more fancy DPDK deferred work framework comes along,
> rte_event_maintain() may be deprecated. Something like work queues in
> Linux could work, run as a DPDK service. In such a case, you might also
> need to require a service-cores-only deployment, and thus disallow the
> use of user-launched lcore threads.
>
> That, however, is not a couple of tiny patches.
Re: [dpdk-dev] [PATCH v5] build: optional NUMA and cpu counts detection
> -----Original Message-----
> From: David Christensen
> Sent: Tuesday, August 3, 2021 1:29 AM
> To: Juraj Linkeš ; tho...@monjalon.net;
> david.march...@redhat.com; bruce.richard...@intel.com;
> honnappa.nagaraha...@arm.com; ruifeng.w...@arm.com;
> ferruh.yi...@intel.com; jerinjac...@gmail.com; jer...@marvell.com;
> step...@networkplumber.org; Piotr Kubaj
> Cc: dev@dpdk.org
> Subject: Re: [PATCH v5] build: optional NUMA and cpu counts detection
>
>
>
> On 8/2/21 5:44 AM, Juraj Linkeš wrote:
> >> +if os.name == 'posix':
> >> +    if os.path.isdir('/sys/devices/system/node'):
> >> +        numa_nodes = glob.glob('/sys/devices/system/node/node*')
> >> +        numa_nodes.sort()
> >> +        print(int(os.path.basename(numa_nodes[-1])[4:]) + 1)
> >> +    else:
> >> +        subprocess.run(['sysctl', '-n', 'vm.ndomains'], check=False)
> >> +
> >
> > Bruce, David, Thomas,
> >
> > Is DPDK actually supported on Power9 FreeBSD? Is anyone using this
> > combination? How can we address the open question of what exactly does
> > sysctl -n vm.ndomains return on a Power9 FreeBSD system? Or should we
> > just leave it as is? Or maybe add 1 to the output (as we do in other
> > cases)?
>
> Not supported within IBM, but you can buy OpenPOWER boxes from 3rd parties
> such as Raptor Computing Systems so there may be customers using DPDK on
> POWER with FreeBSD that I don't track. Adding Piotr Kubaj who has
> commented on POWER/FreeBSD issues in the past.
>
> Dave

Thanks, David.

Piotr, to provide more context, we're trying to figure out what the
highest NUMA node on a system is. On P9 systems, here's what the NUMA
nodes look like in Linux:

NUMA node0 CPU(s):   0-63
NUMA node8 CPU(s):   64-127
NUMA node252 CPU(s):
NUMA node253 CPU(s):
NUMA node254 CPU(s):
NUMA node255 CPU(s):

The highest NUMA node with CPUs is node8.

We're trying to get the highest NUMA node with CPUs on P9 FreeBSD
systems, but we don't know whether the FreeBSD NUMA layout looks the same
(does FreeBSD report non-contiguous NUMA nodes as Linux does above, or
does it renumber them?) and what "sysctl -n vm.ndomains" returns. Could
you check these for us?
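
To make the question concrete, here is a small sketch of one possible reading of the goal stated above: the node count is the highest node id that has CPUs, plus one. This is an assumption about the intended result, not what the quoted script does (the script takes the last entry of a sorted directory listing). `struct toy_numa_node` and `toy_numa_node_count()` are hypothetical names used only for this illustration.

```c
#include <stddef.h>

/* Hypothetical description of one NUMA node as reported by the OS. */
struct toy_numa_node {
	int id;    /* node number, possibly non-contiguous */
	int ncpus; /* number of CPUs attached to the node, 0 if memory-only */
};

/*
 * Return (highest node id that has CPUs) + 1.
 * With no CPU-bearing node in the list, fall back to 1 (single node).
 */
static int
toy_numa_node_count(const struct toy_numa_node *nodes, size_t n)
{
	int highest = 0;

	for (size_t i = 0; i < n; i++) {
		if (nodes[i].ncpus > 0 && nodes[i].id > highest)
			highest = nodes[i].id;
	}
	return highest + 1;
}
```

For the P9 Linux layout listed above (node0 and node8 with CPUs, node252 to node255 without), this reading gives 9, whereas counting every reported node id would give 256.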
Re: [dpdk-dev] vxlan encap offload probelm in mlx5
Hi Asaf,

I tested mlnx-dpdk-20.11.0 with ofed-5.4, and this problem is fixed.

But there is still another problem:

testpmd> set vxlan ip-version ipv4 vni 1000 udp-src 0 udp-dst 4789 ip-src 172.168.152.50 ip-dst 172.168.152.73 eth-src 1c:34:da:77:fb:d8 eth-dst 3c:fd:fe:bb:1c:0c
testpmd>
testpmd>
testpmd> flow create 1 ingress priority 0 group 0 transfer pattern eth src is 46:87:9e:9e:c8:23 dst is 5a:9e:0f:74:6c:5e type is 0x0800 / ipv4 tos spec 0x0 tos mask 0x3 / end actions count / vxlan_encap / port_id original 0 id 0 / end
mlx5_net: Failed to init cache list mlx5_0_port_id_action_cache entry (nil).
port_flow_complain(): Caught PMD error type 1 (cause unspecified): cannot create action: Cannot allocate memory

Could this be an rdma-core problem?

# rpm -aq | grep rdma
rdma-core-54mlnx1-1.54103.x86_64
rdma-core-devel-54mlnx1-1.54103.x86_64
librdmacm-54mlnx1-1.54103.x86_64
librdmacm-utils-54mlnx1-1.54103.x86_64

BR
wenxu

From: wenxu
Date: 2021-08-03 16:44:53
To: as...@nvidia.com
Cc: dev@dpdk.org
Subject: nvgre inner rss problem in mlx5

Hi nvidia teams,

I tested upstream dpdk for vxlan encap offload with dpdk-testpmd.

# lspci | grep Ether
19:00.0 Ethernet controller: Mellanox Technologies MT27800 Family [ConnectX-5]
19:00.1 Ethernet controller: Mellanox Technologies MT27800 Family [ConnectX-5]

Fw version is 16.31.1014

#ethtool -i net2
driver: mlx5_core
version: 5.13.0-rc3+
firmware-version: 16.31.1014 (MT_80)
expansion-rom-version:
bus-info: :19:00.0

Start the eswitch:

echo 0 > /sys/class/net/net2/device/sriov_numvfs
echo 1 > /sys/class/net/net2/device/sriov_numvfs
echo :19:00.2 > /sys/bus/pci/drivers/mlx5_core/unbind
devlink dev eswitch set pci/:19:00.0 mode switchdev
echo :19:00.2 > /sys/bus/pci/drivers/mlx5_core/bind

ip link shows:

4: net2: mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 1c:34:da:77:fb:d8 brd ff:ff:ff:ff:ff:ff
    vf 0 MAC 4e:41:8f:92:41:44, spoof checking off, link-state disable, trust off, query_rss off
    vf 1 MAC 00:00:00:00:00:00, spoof checking off, link-state disable, trust off, query_rss off
8: pf0vf0: mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 4e:41:8f:92:41:44 brd ff:ff:ff:ff:ff:ff
10: eth0: mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 46:87:9e:9e:c8:23 brd ff:ff:ff:ff:ff:ff

net2 is the PF, pf0vf0 is the VF representor, eth0 is the VF.

Start the pmd:

./dpdk-testpmd -c 0x1f -n 4 -m 4096 --file-prefix=ovs -a ":19:00.0,representor=pf0vf0,dv_flow_en=1,dv_esw_en=1,dv_xmeta_en=1" --huge-dir=/mnt/ovsdpdk -- -i --flow-isolate-all --forward-mode=rxonly --rxq=4 --txq=4 --auto-start --nb-cores=4

testpmd> set vxlan ip-version ipv4 vni 1000 udp-src 0 udp-dst 4789 ip-src 172.168.152.50 ip-dst 172.168.152.73 eth-src 1c:34:da:77:fb:d8 eth-dst 3c:fd:fe:bb:1c:0c
testpmd> flow create 1 ingress priority 0 group 0 transfer pattern eth src is 46:87:9e:9e:c8:23 dst is 5a:9e:0f:74:6c:5e type is 0x0800 / ipv4 tos spec 0x0 tos mask 0x3 / end actions count / vxlan_encap / port_id original 0 id 0 / end
port_flow_complain(): Caught PMD error type 16 (specific action): port does not belong to E-Switch being configured: Invalid argument

Adding the rule fails with "port does not belong to E-Switch being
configured".

I checked the DPDK code. In the function flow_dv_validate_action_port_id:

	if (act_priv->domain_id != dev_priv->domain_id)
		return rte_flow_error_set
			(error, EINVAL,
			 RTE_FLOW_ERROR_TYPE_ACTION, NULL,
			 "port does not belong to"
			 " E-Switch being configured");

The domain_id of the VF representor is not the same as the domain_id of
the PF.

Checking mlx5_dev_spawn, the value of domain_id for the VF representor
and the PF will always be different.

mlx5_dev_spawn

	/*
	 * Look for sibling devices in order to reuse their switch domain
	 * if any, otherwise allocate one.
	 */
	MLX5_ETH_FOREACH_DEV(port_id, NULL) {
		const struct mlx5_priv *opriv =
			rte_eth_devices[port_id].data->dev_private;

		if (!opriv ||
		    opriv->sh != priv->sh ||
		    opriv->domain_id == RTE_ETH_DEV_SWITCH_DOMAIN_ID_INVALID)
			continue;
		priv->domain_id = opriv->domain_id;
		break;
	}
	if (priv->domain_id == RTE_ETH_DEV_SWITCH_DOMAIN_ID_INVALID) {
		err = rte_eth_switch_domain_alloc(&priv->domain_id);

The MLX5_ETH_FOREACH_DEV loop will never visit the PF eth_dev.

mlx5_eth_find_next(uint16_t port_id, struct rte_device *odev)
{
	while (port_id < RTE_MAX_ETHPORTS) {
		struct rte_eth_dev *dev = &rte_eth_devices[port_id];

		if (dev->state != RTE_ETH_DEV_UNUSED && d
Re: [dpdk-dev] [PATCH v2] net/mlx5: fix vni matching with non-std port at ConnectX-5
> -Original Message- > From: Rongwei Liu > Sent: Monday, August 2, 2021 15:21 > To: Matan Azrad ; Slava Ovsiienko > ; Ori Kam ; NBU-Contact- > Thomas Monjalon ; Shahaf Shuler > ; Raslan Darawsheh > Cc: dev@dpdk.org; sta...@dpdk.org > Subject: [PATCH v2] net/mlx5: fix vni matching with non-std port at > ConnectX-5 > > In the recent update, the misc5 matcher was introduced to match VxLAN > header extra fields. However, ConnectX-5 doesn't support misc5 for the UDP > ports different from VXLAN's standard one (4789). > > Need to fall back to the previous approach and use legacy misc matcher if > non-standard UDP port is recognized in VxLAN flow. > > Fixes: 630a587bfb37 ("net/mlx5: support matching on VXLAN reserved field") > Cc: sta...@dpdk.org > > Signed-off-by: Rongwei Liu Signed-off-by: Viacheslav Ovsiienko
Re: [dpdk-dev] [PATCH v2] net/mlx5: fix vni matching with non-std port at ConnectX-5
> -Original Message- > From: Rongwei Liu > Sent: Monday, August 2, 2021 15:21 > To: Matan Azrad ; Slava Ovsiienko > ; Ori Kam ; NBU-Contact- > Thomas Monjalon ; Shahaf Shuler > ; Raslan Darawsheh > Cc: dev@dpdk.org; sta...@dpdk.org > Subject: [PATCH v2] net/mlx5: fix vni matching with non-std port at > ConnectX-5 > > In the recent update, the misc5 matcher was introduced to match VxLAN > header extra fields. However, ConnectX-5 doesn't support misc5 for the UDP > ports different from VXLAN's standard one (4789). > > Need to fall back to the previous approach and use legacy misc matcher if > non-standard UDP port is recognized in VxLAN flow. > > Fixes: 630a587bfb37 ("net/mlx5: support matching on VXLAN reserved field") > Cc: sta...@dpdk.org > > Signed-off-by: Rongwei Liu Acked-by: Viacheslav Ovsiienko
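
The fallback described in the commit message can be sketched as a small decision function. This is only an illustration under stated assumptions: `toy_vxlan_matcher`, `toy_pick_vxlan_matcher()` and the two capability flags are hypothetical and do not correspond to names in the mlx5 PMD; only the standard VXLAN UDP port (4789) and the misc5/legacy misc distinction come from the text above.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_VXLAN_STD_UDP_PORT 4789 /* VXLAN's standard UDP port */

/* Hypothetical labels for the two matching approaches in the message. */
enum toy_vxlan_matcher {
	TOY_MATCHER_LEGACY_MISC,
	TOY_MATCHER_MISC5,
};

/*
 * Pick the matcher for a VXLAN flow.
 * misc5_supported:     the device can use the misc5 matcher at all.
 * misc5_std_port_only: the device (ConnectX-5 in the message above) only
 *                      supports misc5 for the standard VXLAN UDP port.
 */
static enum toy_vxlan_matcher
toy_pick_vxlan_matcher(bool misc5_supported, bool misc5_std_port_only,
		       uint16_t udp_dport)
{
	if (!misc5_supported)
		return TOY_MATCHER_LEGACY_MISC;
	if (misc5_std_port_only && udp_dport != TOY_VXLAN_STD_UDP_PORT)
		return TOY_MATCHER_LEGACY_MISC;
	return TOY_MATCHER_MISC5;
}
```

In this sketch a flow on UDP port 4789 keeps the misc5 matcher, while a flow on any other port on a restricted device falls back to the legacy misc matcher.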
[dpdk-dev] [PATCH v13 6/6] maintainers: add for dmadev
This patch adds Chengwen Feng as dmadev's maintainer.

Signed-off-by: Chengwen Feng
---
 MAINTAINERS                            | 5 +
 doc/guides/rel_notes/release_21_08.rst | 6 ++
 2 files changed, 11 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 8013ba1..84cfb1a 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -496,6 +496,11 @@ F: drivers/raw/skeleton/
 F: app/test/test_rawdev.c
 F: doc/guides/prog_guide/rawdev.rst

+DMA device API - EXPERIMENTAL
+M: Chengwen Feng
+F: lib/dmadev/
+F: doc/guides/prog_guide/dmadev.rst
+
 Memory Pool Drivers
 ---

diff --git a/doc/guides/rel_notes/release_21_08.rst b/doc/guides/rel_notes/release_21_08.rst
index 16bb9ce..93068a2 100644
--- a/doc/guides/rel_notes/release_21_08.rst
+++ b/doc/guides/rel_notes/release_21_08.rst
@@ -175,6 +175,12 @@ New Features
   Updated testpmd application to log errors and warnings to stderr
   instead of stdout used before.

+* **Added dmadev library support.**
+
+  The dmadev library provides a DMA device framework for management and
+  provisioning of hardware and software DMA poll mode drivers, defining generic
+  APIs which support a number of different DMA operations.
+

 Removed Items
 -
--
2.8.1
[dpdk-dev] [PATCH v13 3/6] dmadev: introduce DMA device library PMD header
This patch introduces the DMA device library PMD header, which provides the driver-facing APIs for a DMA device. Signed-off-by: Chengwen Feng Acked-by: Bruce Richardson Acked-by: Morten Brørup --- lib/dmadev/meson.build | 1 + lib/dmadev/rte_dmadev.h | 2 ++ lib/dmadev/rte_dmadev_pmd.h | 72 + lib/dmadev/version.map | 10 +++ 4 files changed, 85 insertions(+) create mode 100644 lib/dmadev/rte_dmadev_pmd.h diff --git a/lib/dmadev/meson.build b/lib/dmadev/meson.build index f421ec1..833baf7 100644 --- a/lib/dmadev/meson.build +++ b/lib/dmadev/meson.build @@ -3,3 +3,4 @@ headers = files('rte_dmadev.h') indirect_headers += files('rte_dmadev_core.h') +driver_sdk_headers += files('rte_dmadev_pmd.h') diff --git a/lib/dmadev/rte_dmadev.h b/lib/dmadev/rte_dmadev.h index 1090b06..439ad95 100644 --- a/lib/dmadev/rte_dmadev.h +++ b/lib/dmadev/rte_dmadev.h @@ -743,6 +743,8 @@ struct rte_dmadev_sge { uint32_t length; /**< The DMA operation length. */ }; +#include "rte_dmadev_core.h" + /* DMA flags to augment operation preparation. */ #define RTE_DMA_OP_FLAG_FENCE (1ull << 0) /**< DMA fence flag. diff --git a/lib/dmadev/rte_dmadev_pmd.h b/lib/dmadev/rte_dmadev_pmd.h new file mode 100644 index 000..45141f9 --- /dev/null +++ b/lib/dmadev/rte_dmadev_pmd.h @@ -0,0 +1,72 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2021 HiSilicon Limited. + */ + +#ifndef _RTE_DMADEV_PMD_H_ +#define _RTE_DMADEV_PMD_H_ + +/** + * @file + * + * RTE DMA Device PMD APIs + * + * Driver facing APIs for a DMA device. These are not to be called directly by + * any application. + */ + +#include "rte_dmadev.h" + +#ifdef __cplusplus +extern "C" { +#endif + +/** + * @internal + * Allocates a new dmadev slot for an DMA device and returns the pointer + * to that slot for the driver to use. + * + * @param name + * DMA device name. + * + * @return + * A pointer to the DMA device slot case of success, + * NULL otherwise. 
+ */ +__rte_internal +struct rte_dmadev * +rte_dmadev_pmd_allocate(const char *name); + +/** + * @internal + * Release the specified dmadev. + * + * @param dev + * Device to be released. + * + * @return + * - 0 on success, negative on error + */ +__rte_internal +int +rte_dmadev_pmd_release(struct rte_dmadev *dev); + +/** + * @internal + * Return the DMA device based on the device name. + * + * @param name + * DMA device name. + * + * @return + * A pointer to the DMA device slot case of success, + * NULL otherwise. + */ +__rte_internal +struct rte_dmadev * +rte_dmadev_get_device_by_name(const char *name); + +#ifdef __cplusplus +} +#endif + +#endif /* _RTE_DMADEV_PMD_H_ */ diff --git a/lib/dmadev/version.map b/lib/dmadev/version.map index 02fffe3..408b93c 100644 --- a/lib/dmadev/version.map +++ b/lib/dmadev/version.map @@ -23,3 +23,13 @@ EXPERIMENTAL { local: *; }; + +INTERNAL { +global: + + rte_dmadev_get_device_by_name; + rte_dmadev_pmd_allocate; + rte_dmadev_pmd_release; + + local: *; +}; -- 2.8.1
[dpdk-dev] [PATCH v13 2/6] dmadev: introduce DMA device library internal header
This patch introduces the DMA device library internal header, which contains internal data types that are used by the DMA devices in order to expose their ops to the class. Signed-off-by: Chengwen Feng Acked-by: Bruce Richardson Acked-by: Morten Brørup --- lib/dmadev/meson.build | 1 + lib/dmadev/rte_dmadev_core.h | 180 +++ 2 files changed, 181 insertions(+) create mode 100644 lib/dmadev/rte_dmadev_core.h diff --git a/lib/dmadev/meson.build b/lib/dmadev/meson.build index 6d5bd85..f421ec1 100644 --- a/lib/dmadev/meson.build +++ b/lib/dmadev/meson.build @@ -2,3 +2,4 @@ # Copyright(c) 2021 HiSilicon Limited. headers = files('rte_dmadev.h') +indirect_headers += files('rte_dmadev_core.h') diff --git a/lib/dmadev/rte_dmadev_core.h b/lib/dmadev/rte_dmadev_core.h new file mode 100644 index 000..599ab15 --- /dev/null +++ b/lib/dmadev/rte_dmadev_core.h @@ -0,0 +1,180 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2021 HiSilicon Limited. + * Copyright(c) 2021 Intel Corporation. + */ + +#ifndef _RTE_DMADEV_CORE_H_ +#define _RTE_DMADEV_CORE_H_ + +/** + * @file + * + * RTE DMA Device internal header. + * + * This header contains internal data types, that are used by the DMA devices + * in order to expose their ops to the class. + * + * Applications should not use these API directly. + * + */ + +struct rte_dmadev; + +typedef int (*rte_dmadev_info_get_t)(const struct rte_dmadev *dev, +struct rte_dmadev_info *dev_info, +uint32_t info_sz); +/**< @internal Used to get device information of a device. */ + +typedef int (*rte_dmadev_configure_t)(struct rte_dmadev *dev, + const struct rte_dmadev_conf *dev_conf); +/**< @internal Used to configure a device. */ + +typedef int (*rte_dmadev_start_t)(struct rte_dmadev *dev); +/**< @internal Used to start a configured device. */ + +typedef int (*rte_dmadev_stop_t)(struct rte_dmadev *dev); +/**< @internal Used to stop a configured device. 
*/ + +typedef int (*rte_dmadev_close_t)(struct rte_dmadev *dev); +/**< @internal Used to close a configured device. */ + +typedef int (*rte_dmadev_vchan_setup_t)(struct rte_dmadev *dev, + const struct rte_dmadev_vchan_conf *conf); +/**< @internal Used to allocate and set up a virtual DMA channel. */ + +typedef int (*rte_dmadev_stats_get_t)(const struct rte_dmadev *dev, + uint16_t vchan, struct rte_dmadev_stats *stats, + uint32_t stats_sz); +/**< @internal Used to retrieve basic statistics. */ + +typedef int (*rte_dmadev_stats_reset_t)(struct rte_dmadev *dev, uint16_t vchan); +/**< @internal Used to reset basic statistics. */ + +typedef int (*rte_dmadev_dump_t)(const struct rte_dmadev *dev, FILE *f); +/**< @internal Used to dump internal information. */ + +typedef int (*rte_dmadev_selftest_t)(uint16_t dev_id); +/**< @internal Used to start dmadev selftest. */ + +typedef int (*rte_dmadev_copy_t)(struct rte_dmadev *dev, uint16_t vchan, +rte_iova_t src, rte_iova_t dst, +uint32_t length, uint64_t flags); +/**< @internal Used to enqueue a copy operation. */ + +typedef int (*rte_dmadev_copy_sg_t)(struct rte_dmadev *dev, uint16_t vchan, + const struct rte_dmadev_sge *src, + const struct rte_dmadev_sge *dst, + uint16_t nb_src, uint16_t nb_dst, + uint64_t flags); +/**< @internal Used to enqueue a scatter-gather list copy operation. */ + +typedef int (*rte_dmadev_fill_t)(struct rte_dmadev *dev, uint16_t vchan, +uint64_t pattern, rte_iova_t dst, +uint32_t length, uint64_t flags); +/**< @internal Used to enqueue a fill operation. */ + +typedef int (*rte_dmadev_submit_t)(struct rte_dmadev *dev, uint16_t vchan); +/**< @internal Used to trigger hardware to begin working. */ + +typedef uint16_t (*rte_dmadev_completed_t)(struct rte_dmadev *dev, + uint16_t vchan, const uint16_t nb_cpls, + uint16_t *last_idx, bool *has_error); +/**< @internal Used to return number of successful completed operations. 
*/ + +typedef uint16_t (*rte_dmadev_completed_status_t)(struct rte_dmadev *dev, + uint16_t vchan, const uint16_t nb_cpls, + uint16_t *last_idx, enum rte_dma_status_code *status); +/**< @internal Used to return number of completed operations. */ + +/** + * Possible states of a DMA device. + */ +enum rte_dmadev_state { + RTE_DMADEV_UNUSED = 0, + /**< Device is unused before being probed. */ + RTE_DMADEV_ATTACHED, + /**< Device is attached when allocated in probing. */ +}; + +/** + * DMA device operations function pointer table + */ +struct rt
[dpdk-dev] [PATCH v13 4/6] dmadev: introduce DMA device library implementation
This patch introduces the DMA device library implementation, which includes configuration and I/O with the DMA devices. Signed-off-by: Chengwen Feng Acked-by: Bruce Richardson Acked-by: Morten Brørup --- config/rte_config.h | 3 + lib/dmadev/meson.build | 1 + lib/dmadev/rte_dmadev.c | 563 +++ lib/dmadev/rte_dmadev.h | 118 - lib/dmadev/rte_dmadev_core.h | 2 + lib/dmadev/version.map | 1 + 6 files changed, 676 insertions(+), 12 deletions(-) create mode 100644 lib/dmadev/rte_dmadev.c diff --git a/config/rte_config.h b/config/rte_config.h index 590903c..331a431 100644 --- a/config/rte_config.h +++ b/config/rte_config.h @@ -81,6 +81,9 @@ /* rawdev defines */ #define RTE_RAWDEV_MAX_DEVS 64 +/* dmadev defines */ +#define RTE_DMADEV_MAX_DEVS 64 + /* ip_fragmentation defines */ #define RTE_LIBRTE_IP_FRAG_MAX_FRAG 4 #undef RTE_LIBRTE_IP_FRAG_TBL_STAT diff --git a/lib/dmadev/meson.build b/lib/dmadev/meson.build index 833baf7..d2fc85e 100644 --- a/lib/dmadev/meson.build +++ b/lib/dmadev/meson.build @@ -1,6 +1,7 @@ # SPDX-License-Identifier: BSD-3-Clause # Copyright(c) 2021 HiSilicon Limited. +sources = files('rte_dmadev.c') headers = files('rte_dmadev.h') indirect_headers += files('rte_dmadev_core.h') driver_sdk_headers += files('rte_dmadev_pmd.h') diff --git a/lib/dmadev/rte_dmadev.c b/lib/dmadev/rte_dmadev.c new file mode 100644 index 000..b4f5498 --- /dev/null +++ b/lib/dmadev/rte_dmadev.c @@ -0,0 +1,563 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2021 HiSilicon Limited. + * Copyright(c) 2021 Intel Corporation. + */ + +#include +#include +#include +#include +#include +#include + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include "rte_dmadev.h" +#include "rte_dmadev_pmd.h" + +struct rte_dmadev rte_dmadevices[RTE_DMADEV_MAX_DEVS]; + +static const char *mz_rte_dmadev_data = "rte_dmadev_data"; +/* Shared memory between primary and secondary processes. 
*/ +static struct { + struct rte_dmadev_data data[RTE_DMADEV_MAX_DEVS]; +} *dmadev_shared_data; + +RTE_LOG_REGISTER_DEFAULT(rte_dmadev_logtype, INFO); +#define RTE_DMADEV_LOG(level, ...) \ + rte_log(RTE_LOG_ ## level, rte_dmadev_logtype, "" __VA_ARGS__) + +/* Macros to check for valid device id */ +#define RTE_DMADEV_VALID_DEV_ID_OR_ERR_RET(dev_id, retval) do { \ + if (!rte_dmadev_is_valid_dev(dev_id)) { \ + RTE_DMADEV_LOG(ERR, "Invalid dev_id=%u\n", dev_id); \ + return retval; \ + } \ +} while (0) + +static int +dmadev_check_name(const char *name) +{ + size_t name_len; + + if (name == NULL) { + RTE_DMADEV_LOG(ERR, "Name can't be NULL\n"); + return -EINVAL; + } + + name_len = strnlen(name, RTE_DMADEV_NAME_MAX_LEN); + if (name_len == 0) { + RTE_DMADEV_LOG(ERR, "Zero length DMA device name\n"); + return -EINVAL; + } + if (name_len >= RTE_DMADEV_NAME_MAX_LEN) { + RTE_DMADEV_LOG(ERR, "DMA device name is too long\n"); + return -EINVAL; + } + + return 0; +} + +static uint16_t +dmadev_find_free_dev(void) +{ + uint16_t i; + + for (i = 0; i < RTE_DMADEV_MAX_DEVS; i++) { + if (dmadev_shared_data->data[i].dev_name[0] == '\0') + return i; + } + + return RTE_DMADEV_MAX_DEVS; +} + +static struct rte_dmadev* +dmadev_find(const char *name) +{ + uint16_t i; + + for (i = 0; i < RTE_DMADEV_MAX_DEVS; i++) { + if ((rte_dmadevices[i].state == RTE_DMADEV_ATTACHED) && + (!strcmp(name, rte_dmadevices[i].data->dev_name))) + return &rte_dmadevices[i]; + } + + return NULL; +} + +static int +dmadev_shared_data_prepare(void) +{ + const struct rte_memzone *mz; + + if (dmadev_shared_data == NULL) { + if (rte_eal_process_type() == RTE_PROC_PRIMARY) { + /* Allocate port data and ownership shared memory. 
*/ + mz = rte_memzone_reserve(mz_rte_dmadev_data, +sizeof(*dmadev_shared_data), +rte_socket_id(), 0); + } else + mz = rte_memzone_lookup(mz_rte_dmadev_data); + if (mz == NULL) + return -ENOMEM; + + dmadev_shared_data = mz->addr; + if (rte_eal_process_type() == RTE_PROC_PRIMARY) + memset(dmadev_shared_data->data, 0, + sizeof(dmadev_shared_data->data)); + } + + return 0; +} + +static struct rte_dmadev * +dmadev_allocate(const char *name) +{ + struct rte_dmadev *dev; + uint16_t dev_id; + + dev = dmadev_find(name); + if (dev != NULL) { + RTE_D
[dpdk-dev] [PATCH v13 1/6] dmadev: introduce DMA device library public APIs
The 'dmadevice' is a generic type of DMA device. This patch introduces the 'dmadevice' public APIs, which expose generic operations that enable configuration of and I/O with the DMA devices. Signed-off-by: Chengwen Feng Acked-by: Bruce Richardson Acked-by: Morten Brørup Acked-by: Jerin Jacob --- doc/api/doxy-api-index.md | 1 + doc/api/doxy-api.conf.in | 1 + lib/dmadev/meson.build| 4 + lib/dmadev/rte_dmadev.h | 962 ++ lib/dmadev/version.map| 25 ++ lib/meson.build | 1 + 6 files changed, 994 insertions(+) create mode 100644 lib/dmadev/meson.build create mode 100644 lib/dmadev/rte_dmadev.h create mode 100644 lib/dmadev/version.map diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md index 1992107..ce08250 100644 --- a/doc/api/doxy-api-index.md +++ b/doc/api/doxy-api-index.md @@ -27,6 +27,7 @@ The public API headers are grouped by topics: [event_timer_adapter](@ref rte_event_timer_adapter.h), [event_crypto_adapter] (@ref rte_event_crypto_adapter.h), [rawdev] (@ref rte_rawdev.h), + [dmadev] (@ref rte_dmadev.h), [metrics](@ref rte_metrics.h), [bitrate](@ref rte_bitrate.h), [latency](@ref rte_latencystats.h), diff --git a/doc/api/doxy-api.conf.in b/doc/api/doxy-api.conf.in index 325a019..a44a92b 100644 --- a/doc/api/doxy-api.conf.in +++ b/doc/api/doxy-api.conf.in @@ -34,6 +34,7 @@ INPUT = @TOPDIR@/doc/api/doxy-api-index.md \ @TOPDIR@/lib/cmdline \ @TOPDIR@/lib/compressdev \ @TOPDIR@/lib/cryptodev \ + @TOPDIR@/lib/dmadev \ @TOPDIR@/lib/distributor \ @TOPDIR@/lib/efd \ @TOPDIR@/lib/ethdev \ diff --git a/lib/dmadev/meson.build b/lib/dmadev/meson.build new file mode 100644 index 000..6d5bd85 --- /dev/null +++ b/lib/dmadev/meson.build @@ -0,0 +1,4 @@ +# SPDX-License-Identifier: BSD-3-Clause +# Copyright(c) 2021 HiSilicon Limited. 
+ +headers = files('rte_dmadev.h') diff --git a/lib/dmadev/rte_dmadev.h b/lib/dmadev/rte_dmadev.h new file mode 100644 index 000..1090b06 --- /dev/null +++ b/lib/dmadev/rte_dmadev.h @@ -0,0 +1,962 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2021 HiSilicon Limited. + * Copyright(c) 2021 Intel Corporation. + * Copyright(c) 2021 Marvell International Ltd. + * Copyright(c) 2021 SmartShare Systems. + */ + +#ifndef _RTE_DMADEV_H_ +#define _RTE_DMADEV_H_ + +/** + * @file rte_dmadev.h + * + * RTE DMA (Direct Memory Access) device APIs. + * + * The DMA framework is built on the following model: + * + * --- --- --- + * | virtual DMA | | virtual DMA | | virtual DMA | + * | channel | | channel | | channel | + * --- --- --- + *|| | + *-- | + * | | + * + * | dmadev || dmadev | + * + * | | + *-- -- + *| HW-DMA-channel | | HW-DMA-channel | + *-- -- + * | | + * + * | + * - + * | HW-DMA-Controller | + * - + * + * The DMA controller could have multiple HW-DMA-channels (aka. HW-DMA-queues), + * each HW-DMA-channel should be represented by a dmadev. + * + * The dmadev could create multiple virtual DMA channels, each virtual DMA + * channel represents a different transfer context. The DMA operation request + * must be submitted to the virtual DMA channel. e.g. Application could create + * virtual DMA channel 0 for memory-to-memory transfer scenario, and create + * virtual DMA channel 1 for memory-to-device transfer scenario. + * + * The dmadev are dynamically allocated by rte_dmadev_pmd_allocate() during the + * PCI/SoC device probing phase performed at EAL initialization time. And could + * be released by rte_dmadev_pmd_release() during the PCI/SoC device removing + * phase. + * + * This framework uses 'uint16_t dev_id' as the device identifier of a dmadev, + * and 'uint16_t vchan' as the virtual DMA channel identifier in one dmadev. + * + * The functions exported by the dmadev API to setup a device designated by its + * device identifie
[dpdk-dev] [PATCH v13 0/6] support dmadev
This patch set contains six patches that add the new dmadev library. Chengwen Feng (6): dmadev: introduce DMA device library public APIs dmadev: introduce DMA device library internal header dmadev: introduce DMA device library PMD header dmadev: introduce DMA device library implementation doc: add DMA device library guide maintainers: add for dmadev --- v13: * add dmadev_i1.svg. * delete one unnecessary comment line of rte_dmadev_info_get. v12: * add max_sges field for struct rte_dmadev_info. * add more description to dmadev.rst. * replace scatter with scatter gather in code comment. * split into six patches. * fix typo. v11: * rename RTE_DMA_STATUS_UNKNOWN to RTE_DMA_STATUS_ERROR_UNKNOWN. * add RTE_DMA_STATUS_INVALID_ADDR macro. * update release-note. * add acked-by for 1/2 patch. * add dmadev programming guide which is 2/2 patch. v10: * fix rte_dmadev_completed_status comment. MAINTAINERS |5 + config/rte_config.h |3 + doc/api/doxy-api-index.md |1 + doc/api/doxy-api.conf.in|1 + doc/guides/prog_guide/dmadev.rst| 126 doc/guides/prog_guide/img/dmadev_i1.svg | 278 doc/guides/prog_guide/index.rst |1 + doc/guides/rel_notes/release_21_08.rst |6 + lib/dmadev/meson.build |7 + lib/dmadev/rte_dmadev.c | 563 lib/dmadev/rte_dmadev.h | 1058 +++ lib/dmadev/rte_dmadev_core.h| 182 ++ lib/dmadev/rte_dmadev_pmd.h | 72 +++ lib/dmadev/version.map | 36 ++ lib/meson.build |1 + 15 files changed, 2340 insertions(+) create mode 100644 doc/guides/prog_guide/dmadev.rst create mode 100644 doc/guides/prog_guide/img/dmadev_i1.svg create mode 100644 lib/dmadev/meson.build create mode 100644 lib/dmadev/rte_dmadev.c create mode 100644 lib/dmadev/rte_dmadev.h create mode 100644 lib/dmadev/rte_dmadev_core.h create mode 100644 lib/dmadev/rte_dmadev_pmd.h create mode 100644 lib/dmadev/version.map -- 2.8.1
[dpdk-dev] [PATCH v13 5/6] doc: add DMA device library guide
This patch adds dmadev library guide. Signed-off-by: Chengwen Feng --- doc/guides/prog_guide/dmadev.rst| 126 +++ doc/guides/prog_guide/img/dmadev_i1.svg | 278 doc/guides/prog_guide/index.rst | 1 + 3 files changed, 405 insertions(+) create mode 100644 doc/guides/prog_guide/dmadev.rst create mode 100644 doc/guides/prog_guide/img/dmadev_i1.svg diff --git a/doc/guides/prog_guide/dmadev.rst b/doc/guides/prog_guide/dmadev.rst new file mode 100644 index 000..c6327db --- /dev/null +++ b/doc/guides/prog_guide/dmadev.rst @@ -0,0 +1,126 @@ +.. SPDX-License-Identifier: BSD-3-Clause + Copyright 2021 HiSilicon Limited + +DMA Device Library + + +The DMA library provides a DMA device framework for management and provisioning +of hardware and software DMA poll mode drivers, defining generic APIs which +support a number of different DMA operations. + + +Design Principles +- + +The DMA library follows the same basic principles as those used in DPDK's +Ethernet Device framework and the RegEx framework. The DMA framework provides +a generic DMA device framework which supports both physical (hardware) +and virtual (software) DMA devices as well as a generic DMA API which allows +DMA devices to be managed and configured and supports DMA operations to be +provisioned on DMA poll mode driver. + +.. _figure_dmadev_i1: + +.. figure:: img/dmadev_i1.* + + The model of the DMA framework built on + + * The DMA controller could have multiple hardware DMA channels (aka. hardware + DMA queues), each hardware DMA channel should be represented by a dmadev. + * The dmadev could create multiple virtual DMA channels, each virtual DMA + channel represents a different transfer context. The DMA operation request + must be submitted to the virtual DMA channel. e.g. Application could create + virtual DMA channel 0 for memory-to-memory transfer scenario, and create + virtual DMA channel 1 for memory-to-device transfer scenario. 
+ + +Device Management +- + +Device Creation +~~~ + +Physical DMA controller is discovered during the PCI probe/enumeration of the +EAL function which is executed at DPDK initialization, based on their PCI +device identifier, each unique PCI BDF (bus/bridge, device, function). Specific +physical DMA controller, like other physical devices in DPDK can be listed using +the EAL command line options. + +And then dmadevs are dynamically allocated by rte_dmadev_pmd_allocate() based on +the number of hardware DMA channels. + + +Device Identification +~ + +Each DMA device, whether physical or virtual is uniquely designated by two +identifiers: + +- A unique device index used to designate the DMA device in all functions + exported by the DMA API. + +- A device name used to designate the DMA device in console messages, for + administration or debugging purposes. + + +Device Configuration + + +The rte_dmadev_configure API is used to configure a DMA device. + +.. code-block:: c + + int rte_dmadev_configure(uint16_t dev_id, +const struct rte_dmadev_conf *dev_conf); + +The ``rte_dmadev_conf`` structure is used to pass the configuration parameters +for the DMA device for example maximum number of virtual DMA channels, +indication of whether to enable silent mode. + + +Configuration of Virtual DMA Channels +~ + +The rte_dmadev_vchan_setup API is used to configure a virtual DMA channel. + +.. code-block:: c + + int rte_dmadev_vchan_setup(uint16_t dev_id, + const struct rte_dmadev_vchan_conf *conf); + +The ``rte_dmadev_vchan_conf`` structure is used to pass the configuration +parameters for the virtual DMA channel for example transfer direction, number of +descriptor for the virtual DMA channel, source device access port parameter, +destination device access port parameter. + + +Device Features and Capabilities + + +DMA devices may support different feature set. 
To query the supported PMD +features, use the ``rte_dmadev_info_get`` API, which returns the info of the device and +its supported features. + +A special device capability is silent mode, in which the application is not required to +invoke dequeue APIs. + + +Enqueue / Dequeue APIs +~~ + +The enqueue APIs include ``rte_dmadev_copy`` and ``rte_dmadev_fill``. If the +enqueue is successful, a uint16_t ring_idx is returned. This ring_idx can be used +by applications to track per-operation metadata in an application-defined +circular ring. + +The ``rte_dmadev_submit`` API is used to issue a doorbell to hardware; alternatively, +the ``RTE_DMA_OP_FLAG_SUBMIT`` flag of the enqueue APIs +does the same work. + +There are two dequeue APIs (``rte_dmadev_completed`` and +``rte_dmadev_completed_status``) that can be used to obtain the result of a request.
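The ring_idx bookkeeping described in the guide text above can be illustrated with a small self-contained model. This is only a sketch and does not call the dmadev API: `mock_dev`, `mock_copy`, `mock_completed`, `meta_lookup`, `RING_SIZE` and `struct op_meta` are hypothetical names, and the assumption that ring_idx increases by one per enqueue and wraps at 16 bits is made purely for the illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define RING_SIZE 8u               /* hypothetical; power of two so masking works */
#define RING_MASK (RING_SIZE - 1u)

/* Per-operation metadata the application wants back on completion. */
struct op_meta {
	void *user_ctx;
};

/* Mock device: NOT the dmadev API, just enough state to show the idea. */
struct mock_dev {
	uint16_t next_idx;   /* ring_idx handed out by the next enqueue */
	uint16_t done_idx;   /* ring_idx of the next operation to complete */
};

static struct op_meta meta_ring[RING_SIZE];

/* Enqueue a copy; returns the ring_idx (>= 0) or -1 when the ring is full. */
static int
mock_copy(struct mock_dev *dev, void *dst, const void *src, size_t len,
	  void *user_ctx)
{
	uint16_t in_flight = (uint16_t)(dev->next_idx - dev->done_idx);
	uint16_t idx;

	if (in_flight >= RING_SIZE)
		return -1;
	memcpy(dst, src, len);     /* the mock "hardware" copies immediately */
	idx = dev->next_idx++;
	meta_ring[idx & RING_MASK].user_ctx = user_ctx;
	return idx;
}

/* Dequeue up to nb_cpls completions; *last_idx gets the last completed idx. */
static uint16_t
mock_completed(struct mock_dev *dev, uint16_t nb_cpls, uint16_t *last_idx)
{
	uint16_t avail = (uint16_t)(dev->next_idx - dev->done_idx);
	uint16_t n = (uint16_t)(avail < nb_cpls ? avail : nb_cpls);

	assert(last_idx != NULL);
	if (n > 0) {
		dev->done_idx = (uint16_t)(dev->done_idx + n);
		*last_idx = (uint16_t)(dev->done_idx - 1);
	}
	return n;
}

/* Look up the metadata recorded for a given ring_idx. */
static void *
meta_lookup(uint16_t ring_idx)
{
	return meta_ring[ring_idx & RING_MASK].user_ctx;
}
```

Masking the index with a power-of-two ring size keeps the lookup valid across the 16-bit wrap, which is why the model fixes `RING_SIZE` to 8 rather than an arbitrary value.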
[dpdk-dev] [PATCH] doc: announce cryptodev-PMD interface as internal
The APIs that are internal to the PMD and the cryptodev library can be marked as internal so that ABI checking does not flag changes in APIs which are internal to DPDK. Signed-off-by: Akhil Goyal --- doc/guides/rel_notes/deprecation.rst | 3 +++ 1 file changed, 3 insertions(+) diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst index 6a35c7649a..f81bd87f10 100644 --- a/doc/guides/rel_notes/deprecation.rst +++ b/doc/guides/rel_notes/deprecation.rst @@ -148,6 +148,9 @@ Deprecation Notices content. On Linux and FreeBSD, supported prior to DPDK 20.11, original structure will be kept until DPDK 21.11. +* cryptodev: The APIs for interfacing between library and PMD will be marked + as internal APIs in DPDK 21.11. + * security: The functions ``rte_security_set_pkt_metadata`` and ``rte_security_get_userdata`` will be made inline functions and additional flags will be added in structure ``rte_security_ctx`` in DPDK 21.11. -- 2.25.1
Re: [dpdk-dev] [PATCH v13 0/6] support dmadev
@Bruce @Jerin @Morten Could you please review the 'doc: add DMA device library guide' patch? PS: the other patches are well reviewed. Thanks On 2021/8/3 19:29, Chengwen Feng wrote: > This patch set contains six patch for new add dmadev. > > Chengwen Feng (6): > dmadev: introduce DMA device library public APIs > dmadev: introduce DMA device library internal header > dmadev: introduce DMA device library PMD header > dmadev: introduce DMA device library implementation > doc: add DMA device library guide > maintainers: add for dmadev > > --- > v13: > * add dmadev_i1.svg. > * delete one unnecessary comment line of rte_dmadev_info_get. > v12: > * add max_sges filed for struct rte_dmadev_info. > * add more descriptor of dmadev.rst. > * replace scatter with scatter gather in code comment. > * split to six patch. > * fix typo. > v11: > * rename RTE_DMA_STATUS_UNKNOWN to RTE_DMA_STATUS_ERROR_UNKNOWN. > * add RTE_DMA_STATUS_INVALID_ADDR marco. > * update release-note. > * add acked-by for 1/2 patch. > * add dmadev programming guide which is 2/2 patch. > v10: > * fix rte_dmadev_completed_status comment. 
> > MAINTAINERS |5 + > config/rte_config.h |3 + > doc/api/doxy-api-index.md |1 + > doc/api/doxy-api.conf.in|1 + > doc/guides/prog_guide/dmadev.rst| 126 > doc/guides/prog_guide/img/dmadev_i1.svg | 278 > doc/guides/prog_guide/index.rst |1 + > doc/guides/rel_notes/release_21_08.rst |6 + > lib/dmadev/meson.build |7 + > lib/dmadev/rte_dmadev.c | 563 > lib/dmadev/rte_dmadev.h | 1058 > +++ > lib/dmadev/rte_dmadev_core.h| 182 ++ > lib/dmadev/rte_dmadev_pmd.h | 72 +++ > lib/dmadev/version.map | 36 ++ > lib/meson.build |1 + > 15 files changed, 2340 insertions(+) > create mode 100644 doc/guides/prog_guide/dmadev.rst > create mode 100644 doc/guides/prog_guide/img/dmadev_i1.svg > create mode 100644 lib/dmadev/meson.build > create mode 100644 lib/dmadev/rte_dmadev.c > create mode 100644 lib/dmadev/rte_dmadev.h > create mode 100644 lib/dmadev/rte_dmadev_core.h > create mode 100644 lib/dmadev/rte_dmadev_pmd.h > create mode 100644 lib/dmadev/version.map >
Re: [dpdk-dev] [PATCH 2/2] lib/security: add SA lifetime configuration
Hi Anoob, > > > Now that we have an agreement on bitfields (hoping no one else has an > > > objection), I would like to discuss one more topic. It is more related to > > checksum offload, but it's better that we discuss along with other similar > > items (like soft expiry). > > > > > > L3 & L4 checksum can be tristate (CSUM_OK, CSUM_ERROR, > > CSUM_UNKOWN) > > > > > > 1. Application didn't request. Nothing computed. > > > 2. Application requested. Checksum verification success. > > > 3. Application requested. Checksum verification failed. > > > 4. Application requested. Checksum could not be computed (PMD > > limitations etc). > > > > > > How would we indicate each case? > > > > > > My proposal would be, let's call the field that we called "warning" as > > "aux_flags" (auxiliary or secondary information from the operation). > > > > > > Sequence in the application would be, > > > > > > if (op.status != SUCCESS) { > > > /* handle errors */ > > > } > > > > > > #define RTE_SEC_IPSEC_AUX_FLAGS_L4_CHECKSUM_COMPUTED (1 << 0) > > #define > > > RTE_SEC_IPSEC_AUX_FLAGS_L4_CHECSUM_GOOD (1 << 1) > > > > > > if (op.aux_flags & > > RTE_SEC_IPSEC_AUX_FLAGS_L4_CHECKSUM_COMPUTED) { > > > if (op.aux_flags & > > RTE_SEC_IPSEC_AUX_FLAGS_L4_CHECSUM_GOOD) > > > mbuf->l4_checksum_good = 1; > > > else > > > mbuf->l4_checksum_good = 0; > > > } else { > > > if (verify_l4_checksum(mbuf) == SUCCESS) { > > > mbuf->l4_checksum_good = 1; > > > else > > > mbuf->l4_checksum_good = 0; > > > } > > > > > > For an application not worried about aux_flags (ex: ipsec-secgw), > > > additional checks are not required. For applications not interested in > > > checksum, a blind check on op.aux_flags would be enough to bail out early. > > For applications interested in checksum, it can follow above sequence > > (kinds, > > for demonstration purpose only). > > > > > > Would something like above fine? 
Or if we want to restrict additional > > > fields for just warnings, (L4_CHECKSUM_ERROR), how would application > > > differentiate between checksum good & checksum not computed? In that > > case, what should be PMDs treatment of "could not compute" v/s > > "computed and wrong". > > > > I am ok with what you suggest. > > My only thought - we already have CSUM flags in mbuf itself, so why not to > > use them instead to pass this information from crypto PMD to user? > > That way it would be compliant with ethdev CSUM approach and no need to > > spend > > 2 bits in 'aux_flags'. > > Konstantin > > [Anoob] You are right. We do have CSUM flags in mbuf and that would fully > suite our requirement here. > > Our problem was, it's called PKT_RX_ and the description text refers to RX. > > /** > * Mask of bits used to determine the status of RX IP checksum. > * - PKT_RX_IP_CKSUM_UNKNOWN: no information about the RX IP checksum > * - PKT_RX_IP_CKSUM_BAD: the IP checksum in the packet is wrong > * - PKT_RX_IP_CKSUM_GOOD: the IP checksum in the packet is valid > * - PKT_RX_IP_CKSUM_NONE: the IP checksum is not correct in the packet > * data, but the integrity of the IP header is verified. > */ > > But if we overlook that (& may be update documentation), it's a rather great > idea. We could use similar PKT_TX_* flags for requesting > checksum generation with outbound operations (checksum generation for plain > packet before IPsec processing). > > /** > * Offload the IP checksum in the hardware. The flag PKT_TX_IPV4 should > * also be set by the application, although a PMD will only check > * PKT_TX_IP_CKSUM. > * - fill the mbuf offload information: l2_len, l3_len > */ > #define PKT_TX_IP_CKSUM (1ULL << 54) > > /** > * Packet is IPv4. This flag must be set when using any offload feature > * (TSO, L3 or L4 checksum) to tell the NIC that the packet is an IPv4 > * packet. If the packet is a tunneled packet, this flag is related to > * the inner headers. 
> */ > #define PKT_TX_IPV4 (1ULL << 55) > > Do you think above might require some modifications to document behavior with > lookaside IPsec? > > Also, these flags are probably the best way for checksum for inner packet > with inline IPsec. So this looks like overall better idea. Do you > agree? I am not sure I fully understand your proposal. Yes, right now the mbuf has two different sets of flags for checksum offloads: RX and TX. The RX flags indicate whether the checksum was calculated/checked for an incoming packet and what the result was, while the TX flags define which CSUM calculations have to be done by HW. Yes, I suppose the same flags can be reused by crypto-dev, if it is capable of implementing these HW offloads. Though I am not sure what changes you think will be required inside mbuf? Konstantin
[dpdk-dev] [PATCH] doc: announce restructuring of crypto session structs
The structures rte_cryptodev_sym_session and rte_cryptodev_asym_session are not used by the application directly. The application just needs an opaque pointer which it can attach to rte_crypto_op at enqueue time. Hence, these structures can be internal to the library and hidden from the user. Signed-off-by: Akhil Goyal --- doc/guides/rel_notes/deprecation.rst | 5 + 1 file changed, 5 insertions(+) diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst index f81bd87f10..7140e345b6 100644 --- a/doc/guides/rel_notes/deprecation.rst +++ b/doc/guides/rel_notes/deprecation.rst @@ -151,6 +151,11 @@ Deprecation Notices * cryptodev: The APIs for interfacing between library and PMD will be marked as internal APIs in DPDK 21.11. +* cryptodev: Hide structures ``rte_cryptodev_sym_session`` and + ``rte_cryptodev_asym_session`` to remove unnecessary indirection between + session and the private data of session. An opaque pointer can be exposed + directly to application which can be attached to the ``rte_crypto_op``. + * security: The functions ``rte_security_set_pkt_metadata`` and ``rte_security_get_userdata`` will be made inline functions and additional flags will be added in structure ``rte_security_ctx`` in DPDK 21.11. -- 2.25.1
[dpdk-dev] [PATCH v2] doc: announce restructuring of crypto session structs
The structures rte_cryptodev_sym_session and rte_cryptodev_asym_session are not used by the application directly. The application just needs an opaque pointer which it can attach to rte_crypto_op at enqueue time. Hence, these structures can be internal to the library and hidden from the user. Signed-off-by: Akhil Goyal --- v2: fixed trailing whitespace. doc/guides/rel_notes/deprecation.rst | 5 + 1 file changed, 5 insertions(+) diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst index f81bd87f10..c540c90f8e 100644 --- a/doc/guides/rel_notes/deprecation.rst +++ b/doc/guides/rel_notes/deprecation.rst @@ -151,6 +151,11 @@ Deprecation Notices * cryptodev: The APIs for interfacing between library and PMD will be marked as internal APIs in DPDK 21.11. +* cryptodev: Hide structures ``rte_cryptodev_sym_session`` and + ``rte_cryptodev_asym_session`` to remove unnecessary indirection between + session and the private data of session. An opaque pointer can be exposed + directly to application which can be attached to the ``rte_crypto_op``. + * security: The functions ``rte_security_set_pkt_metadata`` and ``rte_security_get_userdata`` will be made inline functions and additional flags will be added in structure ``rte_security_ctx`` in DPDK 21.11. -- 2.25.1
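The opaque-pointer arrangement the notice describes can be sketched in plain C as follows. This is a generic illustration of the pattern, not the planned cryptodev interface: `struct sym_session`, `struct crypto_op`, `session_create`, `session_free` and `session_key_id` are hypothetical names, and in a real library the two halves would live in a public header and a private source file respectively.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* ---- what the application would see (hypothetical names) ---- */
struct sym_session;                 /* opaque: the layout is not visible */

struct crypto_op {
	struct sym_session *sess;   /* the application only stores the pointer */
};

struct sym_session *session_create(uint32_t key_id);
void session_free(struct sym_session *s);
uint32_t session_key_id(const struct sym_session *s);

/* ---- what stays inside the library ---- */
struct sym_session {
	uint32_t key_id;            /* private data sits directly in the session */
};

struct sym_session *
session_create(uint32_t key_id)
{
	struct sym_session *s = malloc(sizeof(*s));

	if (s != NULL)
		s->key_id = key_id;
	return s;
}

void
session_free(struct sym_session *s)
{
	free(s);
}

uint32_t
session_key_id(const struct sym_session *s)
{
	assert(s != NULL);
	return s->key_id;
}
```

Because the application never sees the structure definition, the library can change its layout without an ABI break, which is the motivation given in the commit message.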
Re: [dpdk-dev] [PATCH 2/2] lib/security: add SA lifetime configuration
Hi Konstantin,

> Subject: [EXT] RE: [PATCH 2/2] lib/security: add SA lifetime configuration
>
> Hi Anoob,
>
> > > > Now that we have an agreement on bitfields (hoping no one else has
> > > > an objection), I would like to discuss one more topic. It is more
> > > > related to checksum offload, but it's better that we discuss along
> > > > with other similar items (like soft expiry).
> > > >
> > > > L3 & L4 checksum can be tristate (CSUM_OK, CSUM_ERROR, CSUM_UNKNOWN)
> > > >
> > > > 1. Application didn't request. Nothing computed.
> > > > 2. Application requested. Checksum verification success.
> > > > 3. Application requested. Checksum verification failed.
> > > > 4. Application requested. Checksum could not be computed (PMD
> > > >    limitations etc).
> > > >
> > > > How would we indicate each case?
> > > >
> > > > My proposal would be, let's call the field that we called "warning"
> > > > as "aux_flags" (auxiliary or secondary information from the
> > > > operation).
> > > >
> > > > Sequence in the application would be,
> > > >
> > > > if (op.status != SUCCESS) {
> > > >     /* handle errors */
> > > > }
> > > >
> > > > #define RTE_SEC_IPSEC_AUX_FLAGS_L4_CHECKSUM_COMPUTED (1 << 0)
> > > > #define RTE_SEC_IPSEC_AUX_FLAGS_L4_CHECKSUM_GOOD (1 << 1)
> > > >
> > > > if (op.aux_flags & RTE_SEC_IPSEC_AUX_FLAGS_L4_CHECKSUM_COMPUTED) {
> > > >     if (op.aux_flags & RTE_SEC_IPSEC_AUX_FLAGS_L4_CHECKSUM_GOOD)
> > > >         mbuf->l4_checksum_good = 1;
> > > >     else
> > > >         mbuf->l4_checksum_good = 0;
> > > > } else {
> > > >     if (verify_l4_checksum(mbuf) == SUCCESS)
> > > >         mbuf->l4_checksum_good = 1;
> > > >     else
> > > >         mbuf->l4_checksum_good = 0;
> > > > }
> > > >
> > > > For an application not worried about aux_flags (ex: ipsec-secgw),
> > > > additional checks are not required. For applications not
> > > > interested in checksum, a blind check on op.aux_flags would be
> > > > enough to bail out early. For applications interested in checksum,
> > > > it can follow above sequence (kind of; for demonstration purposes
> > > > only).
> > > >
> > > > Would something like the above be fine? Or if we want to restrict
> > > > additional fields for just warnings (L4_CHECKSUM_ERROR), how
> > > > would the application differentiate between checksum good &
> > > > checksum not computed? In that case, what should be the PMD's
> > > > treatment of "could not compute" v/s "computed and wrong"?
> > >
> > > I am ok with what you suggest.
> > > My only thought - we already have CSUM flags in mbuf itself, so why
> > > not use them instead to pass this information from crypto PMD to user?
> > > That way it would be compliant with ethdev CSUM approach and no need
> > > to spend 2 bits in 'aux_flags'.
> > > Konstantin
> >
> > [Anoob] You are right. We do have CSUM flags in mbuf and that would fully
> > suit our requirement here.
> >
> > Our problem was, it's called PKT_RX_ and the description text refers to RX.
> >
> > /**
> >  * Mask of bits used to determine the status of RX IP checksum.
> >  * - PKT_RX_IP_CKSUM_UNKNOWN: no information about the RX IP checksum
> >  * - PKT_RX_IP_CKSUM_BAD: the IP checksum in the packet is wrong
> >  * - PKT_RX_IP_CKSUM_GOOD: the IP checksum in the packet is valid
> >  * - PKT_RX_IP_CKSUM_NONE: the IP checksum is not correct in the packet
> >  *   data, but the integrity of the IP header is verified.
> >  */
> >
> > But if we overlook that (& maybe update documentation), it's a rather
> > great idea. We could use similar PKT_TX_* flags for requesting checksum
> > generation with outbound operations (checksum generation for plain packet
> > before IPsec processing).
> >
> > /**
> >  * Offload the IP checksum in the hardware. The flag PKT_TX_IPV4 should
> >  * also be set by the application, although a PMD will only check
> >  * PKT_TX_IP_CKSUM.
> >  * - fill the mbuf offload information: l2_len, l3_len
> >  */
> > #define PKT_TX_IP_CKSUM (1ULL << 54)
> >
> > /**
> >  * Packet is IPv4. This flag must be set when using any offload feature
> >  * (TSO, L3 or L4 checksum) to tell the NIC that the packet is an IPv4
> >  * packet. If the packet is a tunneled packet, this flag is related to
> >  * the inner headers.
> >  */
> > #define PKT_TX_IPV4 (1ULL << 55)
> >
> > Do you think above might require some modifications to document
> > behavior with lookaside IPsec?
> >
> > Also, these flags are probably the best way for checksum for inner
> > packet with inline IPsec. So this looks like overall better idea. Do you
> > agree?
>
> Not sure I understand your proposal fully.
> Yes, right now inside mbuf we have different set of flags for checksum
> offloads: RX and TX.
> RX flags - indicate was checksum calculated/checked for incoming packet and
> what was the result, while TX flags define which CSUM calculations have to
> be done by HW.
> Yes, I suppose same flags can be reused by crypto-dev, if it
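As an illustration of the alternative raised in this thread (reporting the result through mbuf-style RX checksum flags rather than dedicated aux_flags bits), here is a minimal sketch of how an application might act on the flag states. The SK_* macros, struct sk_mbuf and the action enum are local stand-ins invented for this example; they imitate the shape of the PKT_RX_L4_CKSUM_* flags but are not taken from the DPDK headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-ins for L4 RX checksum status bits (illustrative values). */
#define SK_RX_L4_CKSUM_BAD     (1ULL << 3)
#define SK_RX_L4_CKSUM_GOOD    (1ULL << 8)
#define SK_RX_L4_CKSUM_MASK    (SK_RX_L4_CKSUM_BAD | SK_RX_L4_CKSUM_GOOD)

/* Hypothetical, reduced packet descriptor: only the offload flags. */
struct sk_mbuf {
	uint64_t ol_flags;
};

enum sk_l4_csum_action {
	SK_L4_CSUM_ACCEPT,       /* device verified the checksum: good */
	SK_L4_CSUM_DROP,         /* device verified the checksum: bad */
	SK_L4_CSUM_VERIFY_IN_SW  /* no verdict from the device */
};

/* Map the checksum status bits to what the application should do next. */
static enum sk_l4_csum_action sk_classify_l4_csum(const struct sk_mbuf *m)
{
	assert(m != NULL);

	switch (m->ol_flags & SK_RX_L4_CKSUM_MASK) {
	case SK_RX_L4_CKSUM_GOOD:
		return SK_L4_CSUM_ACCEPT;
	case SK_RX_L4_CKSUM_BAD:
		return SK_L4_CSUM_DROP;
	default:
		/* Neither bit or both bits set: fall back to software. */
		return SK_L4_CSUM_VERIFY_IN_SW;
	}
}
```

The point of the sketch is that "not computed" is distinguishable from both "good" and "bad", which is the property the aux_flags proposal was trying to obtain with two extra bits.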
Re: [dpdk-dev] [PATCH v2] doc/guides: add details for new test structure
>-Original Message- >From: Thomas Monjalon >Sent: Saturday 31 July 2021 18:42 >To: Power, Ciara >Cc: dev@dpdk.org; Zhang, Roy Fan ; Doherty, >Declan ; acon...@redhat.com >Subject: Re: [dpdk-dev] [PATCH v2] doc/guides: add details for new test >structure > >16/07/2021 15:40, Ciara Power: >> The testing guide is now updated to include details about using >> sub-testsuites. Some example code is given to demonstrate how they can >> be used. > >The trend is to avoid adding code in the doc, but include some existing code >with literalinclude instead. >Can it be applied here? > Hi Thomas, thanks for the review. I considered this when creating the patch, but chose to follow the style of the example in Aaron's doc patch for consistency. To include existing code, it would need to be functions from the cryptodev autotest, but I feel there would be extra code that isn't needed to demonstrate using this framework. I tried to keep the example as simple and short as possible to help with understanding. Thanks, Ciara
Re: [dpdk-dev] [dpdk-stable] [PATCH] compress/mlx5: fix level translation in xform API
01/08/2021 08:13, Matan Azrad: > From: Raja Zidane > > Compression Level is interpreted by each PMD differently. > > However, lower numbers give faster compression at the expense of > > compression ratio, while higher numbers may give better compression ratios > > but are likely slower. > > The level affects the block size, which affects performance, the bigger the > > block, the faster the compression is. > > > > The problem was that higher levels caused bigger blocks: > > size = min_block_size - 1 + level. > > > > the solution is to reverse the above: > > size = max_block_size + 1 - level. > > > > Fixes: 39a2c8715f8f ("compress/mlx5: add transformation operations") > > Cc: ma...@nvidia.com > > Cc: sta...@dpdk.org > > > > Signed-off-by: Raja Zidane > > Congrats on your first patch, Raja! The explanation above is very clear, thank you and congratulations! > Acked-by: Matan Azrad Applied, thanks.
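The two formulas in the commit message can be compared side by side with a small sketch. The block-size limits below are hypothetical placeholders chosen for illustration, not the values used by the mlx5 driver, and the function names are invented.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical limits for illustration only; the driver's values differ. */
#define SK_MIN_BLOCK_SIZE 4u
#define SK_MAX_BLOCK_SIZE 15u
#define SK_MAX_LEVEL (SK_MAX_BLOCK_SIZE - SK_MIN_BLOCK_SIZE + 1u)

/* Before the fix: a higher level selected a bigger (faster) block. */
static uint32_t sk_block_size_before_fix(uint32_t level)
{
	assert(level >= 1u && level <= SK_MAX_LEVEL);
	return SK_MIN_BLOCK_SIZE - 1u + level;
}

/* After the fix: level 1 (fastest) selects the biggest block. */
static uint32_t sk_block_size_after_fix(uint32_t level)
{
	assert(level >= 1u && level <= SK_MAX_LEVEL);
	return SK_MAX_BLOCK_SIZE + 1u - level;
}
```

Under these assumed limits, level 1 maps to the smallest block before the fix and to the largest block after it, which matches the stated intent that lower levels favour speed.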
[dpdk-dev] [PATCH] doc: announce changes in security session struct
The structure rte_security_session is not directly used by the
application. The application just needs an opaque pointer to attach to
the mbuf or rte_crypto_op while enqueuing. Hence, it can be hidden
inside the library, which would prevent unnecessary indirection to the
private session data in the fast path.

Signed-off-by: Akhil Goyal
---
 doc/guides/rel_notes/deprecation.rst | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
index c540c90f8e..8da1c2648c 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -159,3 +159,7 @@ Deprecation Notices
 * security: The functions ``rte_security_set_pkt_metadata`` and
   ``rte_security_get_userdata`` will be made inline functions and additional
   flags will be added in structure ``rte_security_ctx`` in DPDK 21.11.
+
+* security: Hide structure ``rte_security_session`` and expose an opaque
+  pointer for the private data to the application which can be attached
+  to the packet while enqueuing.
--
2.25.1
Re: [dpdk-dev] [RFC v2 1/3] eventdev: allow for event devices requiring maintenance
On 2021-08-03 06:39, Jerin Jacob wrote:
> On Mon, Aug 2, 2021 at 9:45 PM Mattias Rönnblom wrote:
> > Extend Eventdev API to allow for event devices which require various
> > forms of internal processing to happen, even when events are not
> > enqueued to or dequeued from a port.
> >
> > RFC v2:
> > - Change rte_event_maintain() return type to be consistent with the
> >   documentation.
> > - Remove unused typedef from eventdev_pmd.h.
> >
> > Signed-off-by: Mattias Rönnblom
> > Tested-by: Richard Eklycke
> > Tested-by: Liron Himi
> > ---
> > lib/eventdev/rte_eventdev.h | 62 +
> > 1 file changed, 62 insertions(+)
> >
> > +/**
> > + * Maintain an event device.
> > + *
> > + * This function is only relevant for event devices which has the
> > + * RTE_EVENT_DEV_CAP_REQUIRES_MAINT flag set. Such devices requires
> > + * the application to call rte_event_maintain() on a port during periods
> > + * which it is neither enqueuing nor dequeuing events from this
> > + * port. No port may be left unattended.
> > + *
> > + * An event device's rte_event_maintain() is a low overhead function. In
> > + * situations when rte_event_maintain() must be called, the application
> > + * should do so often.
>
> See rte_service_component_register() scheme. If a driver needs additional
> housekeeping it can use DPDK's service core scheme to abstract different
> driver requirements. We may not need any public API for this.

What DSW requires, and indeed any event device that does software-level
event buffering, is a way to schedule the execution of some function at
some later time, on the lcore that currently "owns" that port.

Put differently: it's not that the driver "needs some cycles at time T",
but "it needs some cycles at time T on the lcore thread that currently is
the user of eventdev port X".

The DSW output buffers and other per-port data structures aren't, for
simplicity and performance, MT safe. That's one of the reasons the
processing can't be done by a random service lcore.

Pushing output buffering into the application (or whatever is accessing
the event device) is not a solution to the DSW<->adapter integration
issue, since DSW also requires per-port deferred work for the flow
migration machinery. In addition, if you have a look at the RX adapter,
for example, you'll see that the buffering logic adds complexity to the
"application".

The service cores are a rather coarse-grained deferred work construct. A
more elaborate one might well have been the basis of a better solution
than the proposed, user-driven rte_event_maintain() API.

rte_event_maintain() is a crude way to make the Ethernet/Crypto/Timer
adapters work with DSW. I would argue it still puts us in a better
position than we are today, where the DSW+adapter combo doesn't work at
all.

If/when a fancier DPDK deferred work framework comes along,
rte_event_maintain() may be deprecated. Something like work queues in
Linux could work, run as a DPDK service. In such a case, you might also
need to require a service-cores-only deployment, and thus disallow the
use of user-launched lcore threads. That, however, is not a couple of
tiny patches.
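To make the argument about per-port deferred work concrete, here is a minimal, self-contained sketch of a port with a non-MT-safe output buffer that is flushed either when it fills up or when its owning thread calls a maintenance hook. It does not use the eventdev API: struct sk_port and the sk_port_* functions are hypothetical, and sk_port_maintain() only stands in for the role the proposed rte_event_maintain() would play.

```c
#include <assert.h>
#include <stddef.h>

#define SK_OUT_BUF_CAP 8

/* Hypothetical per-port state. Not MT safe: only the owning lcore touches it. */
struct sk_port {
	int out_buf[SK_OUT_BUF_CAP];
	size_t out_len;   /* events currently buffered */
	size_t flushed;   /* events handed over to the device so far */
};

/* Hand all buffered events to the (imaginary) device. */
static void sk_port_flush(struct sk_port *p)
{
	p->flushed += p->out_len;
	p->out_len = 0;
}

/* Enqueue buffers the event; a flush happens only when the buffer is full. */
static void sk_port_enqueue(struct sk_port *p, int ev)
{
	assert(p != NULL);
	if (p->out_len == SK_OUT_BUF_CAP)
		sk_port_flush(p);
	p->out_buf[p->out_len++] = ev;
}

/*
 * Maintenance hook: called by the owning thread while it is neither
 * enqueuing nor dequeuing, so buffered events do not linger forever.
 */
static void sk_port_maintain(struct sk_port *p)
{
	assert(p != NULL);
	if (p->out_len > 0)
		sk_port_flush(p);
}
```

Because the buffer has no locking, the flush has to run on the thread that owns the port. That is the constraint that rules out handing the work to an arbitrary service lcore in this sketch.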
Re: [dpdk-dev] [PATCH v2] net: prepare the outer ipv4 hdr for checksum
Hi Thomas, Thanks for the review. I did the git grep rte_net_intel_cksum_prepare and git grep PKT_TX_OUTER_UDP_CKSUM. Following are the two drivers that use the function to prepare headers for checksum which also uses the outer_udp_checksum offload within drivers. 1) Hisilicon hns3 2) Wangxun txgbe 1) has implemented its own version of functions to prepare for outer header checksum. It may benefit/impact from the change. The function "rte_net_intel_cksum_prepare" is intel specific and intel cards do not support outer l4 checksum offload. DPDK may provide a generic version of the same function which can be used in different drivers. -br Mohsin On Thu, Jul 22, 2021 at 8:53 PM Thomas Monjalon wrote: > 07/07/2021 11:14, Mohsin Kazmi: > > On Wed, Jun 30, 2021 at 3:09 PM Olivier Matz > wrote: > > > > + if (ol_flags & (PKT_TX_OUTER_IPV4 | PKT_TX_OUTER_IPV6)) { > > > > inner_l3_offset += m->outer_l2_len + m->outer_l3_len; > > > > + /* > > > > + * prepare outer ipv4 header checksum by setting it to > 0, > > > > + * in order to be computed by hardware NICs. > > > > + */ > > > > + if (ol_flags & PKT_TX_OUTER_IP_CKSUM) { > > > > + ipv4_hdr = rte_pktmbuf_mtod_offset(m, > > > > + struct rte_ipv4_hdr *, > > > m->outer_l2_len); > > > > + ipv4_hdr->hdr_checksum = 0; > > > > + } > > > > + } > > > > > > What about outer L4 checksum? Does it requires the same than inner? > > > > > I am using XL710 for my testing with i40e dpdk driver. AFAIK, It doesn't > > support outer l4 checksum. I am not sure if other Intel NICs support it. > > This function is used by a lot of drivers. > Try git grep rte_net_intel_cksum_prepare > > I think we need more reviews on the v3. > Given it is far from being a new bug, I suggest to wait the next release > in order to have more feedbacks. > > >
Re: [dpdk-dev] [PATCH] net/mlx5: fix port domain_id initialization
Title proposal: net/mlx5: fix port initialization of switch domain 02/08/2021 16:55, Gregory Etelson: > All active ports that belong to the same E-switch share domain_id > value. > Port initialization procedure searches through a database for existing > port with matching properties. New domain_id allocated if match was > not located. Otherwise, new port inherits existing domain_id. > > Port initialization did not pass enough info to search procedure to > find existing matches. Therefore, each port was created with a private > domain_id value. As the result, port_id flow action failed because it > could not match ports in a rule to E-switch. > > The patch adds dpdk_dev with port properties to device search. > > Fixes: 56bb3c84e982 ("net/mlx5: reduce PCI dependency") > > Signed-off-by: Gregory Etelson > Acked-by: Viacheslav Ovsiienko Applied, thanks.
Re: [dpdk-dev] [PATCH] doc: announce changes in security session struct
> The structure rte_security_session is not directly used
> by the application. The application just needs an opaque
> pointer to attach to the mbuf or rte_crypto_op while
> enqueuing. Hence, it can be hidden inside the library
> and would prevent unnecessary indirection to the priv
> session data in fastpath.
>
> Signed-off-by: Akhil Goyal
> ---
>  doc/guides/rel_notes/deprecation.rst | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
> index c540c90f8e..8da1c2648c 100644
> --- a/doc/guides/rel_notes/deprecation.rst
> +++ b/doc/guides/rel_notes/deprecation.rst
> @@ -159,3 +159,7 @@ Deprecation Notices
>  * security: The functions ``rte_security_set_pkt_metadata`` and
>    ``rte_security_get_userdata`` will be made inline functions and additional
>    flags will be added in structure ``rte_security_ctx`` in DPDK 21.11.
> +
> +* security: Hide structure ``rte_security_session`` and expose an opaque
> +  pointer for the private data to the application which can be attached
> +  to the packet while enqueuing.
> --

Acked-by: Konstantin Ananyev

> 2.25.1
Re: [dpdk-dev] [PATCH v2] net/mlx5: fix vni matching with non-std port at ConnectX-5
> > In the recent update, the misc5 matcher was introduced to match VxLAN > > header extra fields. However, ConnectX-5 doesn't support misc5 for the UDP > > ports different from VXLAN's standard one (4789). > > > > Need to fall back to the previous approach and use legacy misc matcher if > > non-standard UDP port is recognized in VxLAN flow. > > > > Fixes: 630a587bfb37 ("net/mlx5: support matching on VXLAN reserved field") > > Cc: sta...@dpdk.org > > > > Signed-off-by: Rongwei Liu > Acked-by: Viacheslav Ovsiienko new title: net/mlx5: fix VXLAN VNI matching on ConnectX-5 Applied, thanks.
Re: [dpdk-dev] [PATCH v3] net: fix Intel-specific Prepare the outer ipv4 hdr for checksum
On Sat, Jul 31, 2021 at 1:49 PM Andrew Rybchenko < andrew.rybche...@oktetlabs.ru> wrote: > On 7/30/21 2:11 PM, Olivier Matz wrote: > > On Wed, Jul 28, 2021 at 06:46:53PM +0300, Andrew Rybchenko wrote: > >> On 7/7/21 12:40 PM, Mohsin Kazmi wrote: > >>> Preparation the headers for the hardware offload > >>> misses the outer ipv4 checksum offload. > >>> It results in bad checksum computed by hardware NIC. > >>> > >>> This patch fixes the issue by setting the outer ipv4 > >>> checksum field to 0. > >>> > >>> Fixes: 4fb7e803eb1a ("ethdev: add Tx preparation") > >>> Cc: sta...@dpdk.org > >>> > >>> Signed-off-by: Mohsin Kazmi > >>> Acked-by: Qi Zhang > >>> --- > >>> v3: > >>> * Update the conditional test with PKT_TX_OUTER_IP_CKSUM. > >>> * Update the commit title with "Intel-specific". > >>> > >>> v2: > >>> * Update the commit message with Fixes. > >>> > >>>lib/net/rte_net.h | 15 +-- > >>>1 file changed, 13 insertions(+), 2 deletions(-) > >>> > >>> diff --git a/lib/net/rte_net.h b/lib/net/rte_net.h > >>> index 434435ffa2..3f4c8c58b9 100644 > >>> --- a/lib/net/rte_net.h > >>> +++ b/lib/net/rte_net.h > >>> @@ -125,11 +125,22 @@ rte_net_intel_cksum_flags_prepare(struct > rte_mbuf *m, uint64_t ol_flags) > >>> * Mainly it is required to avoid fragmented headers check if > >>> * no offloads are requested. > >>> */ > >>> - if (!(ol_flags & (PKT_TX_IP_CKSUM | PKT_TX_L4_MASK | > PKT_TX_TCP_SEG))) > >>> + if (!(ol_flags & (PKT_TX_IP_CKSUM | PKT_TX_L4_MASK | > PKT_TX_TCP_SEG | > >>> + PKT_TX_OUTER_IP_CKSUM))) > >>> return 0; > >>> - if (ol_flags & (PKT_TX_OUTER_IPV4 | PKT_TX_OUTER_IPV6)) > >>> + if (ol_flags & (PKT_TX_OUTER_IPV4 | PKT_TX_OUTER_IPV6)) { > >>> inner_l3_offset += m->outer_l2_len + m->outer_l3_len; > >>> + /* > >>> +* prepare outer ipv4 header checksum by setting it to 0, > >>> +* in order to be computed by hardware NICs. 
> >>> +*/
> >>> + if (ol_flags & PKT_TX_OUTER_IP_CKSUM) {
> >>> + ipv4_hdr = rte_pktmbuf_mtod_offset(m,
> >>> + struct rte_ipv4_hdr *, m->outer_l2_len);
> >>> + ipv4_hdr->hdr_checksum = 0;
> >>
> >> Here we assume that the field is located in the first segment.
> >> Unlikely but it still could be false. We must handle it properly.
> >
> > This is specified in the API comment, so I think it has to be checked
> > by the caller.
>
> If no, what's the point to spoil memory here if stricter check is
> done few lines below.

We have two possibilities:

1) Move the whole block of code above to after the strict check. The
   strict check would then use m->outer_l2_len + m->outer_l3_len directly,
   without any condition, and we would be at the mercy of drivers to
   initialize these to 0 if outer headers are not used. Drivers usually
   don't set the fields they are not interested in, for performance
   reasons, as setting these values per packet costs them additional
   cycles.

2) Move just the PKT_TX_OUTER_IP_CKSUM conditional check to after the
   strict fragmented check. In that case, each packet hits an extra
   conditional check without benefiting from it, again with a performance
   penalty.

I am more inclined towards solution 1. But I also welcome other
suggestions/comments.

>
> >>> + }
> >>> + }
> >>> /*
> >>> * Check if headers are fragmented.
> >>>
> >>
> >
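The ordering concern in this exchange (write to the header only after confirming it lies in the first segment) can be shown with a small sketch. It deliberately avoids the mbuf API: struct sk_seg, the SK_F_* flags and the return codes are hypothetical stand-ins, and only the IPv4 checksum offset (bytes 10 and 11 of the header) is taken from the protocol itself.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for PKT_TX_OUTER_IPV4 and PKT_TX_OUTER_IP_CKSUM. */
#define SK_F_OUTER_IPV4      (1u << 0)
#define SK_F_OUTER_IP_CKSUM  (1u << 1)

#define SK_IPV4_MIN_HDR_LEN  20u
#define SK_IPV4_CKSUM_OFFSET 10u  /* header checksum: bytes 10..11 */

/* Hypothetical, reduced view of the first segment of a packet. */
struct sk_seg {
	uint8_t *data;         /* start of the outer L2 header */
	uint16_t data_len;     /* bytes available in this segment */
	uint16_t outer_l2_len;
	uint16_t outer_l3_len;
	uint32_t flags;
};

/*
 * Returns 1 if the outer IPv4 checksum was zeroed, 0 if nothing was
 * requested, -1 if the outer headers do not fit in the first segment.
 * The bounds check comes first, so no memory is touched on failure.
 */
static int sk_prepare_outer_ipv4_cksum(struct sk_seg *s)
{
	assert(s != NULL && s->data != NULL);

	if (!(s->flags & SK_F_OUTER_IP_CKSUM))
		return 0;

	if (s->outer_l3_len < SK_IPV4_MIN_HDR_LEN ||
	    (uint32_t)s->outer_l2_len + s->outer_l3_len > s->data_len)
		return -1;

	memset(s->data + s->outer_l2_len + SK_IPV4_CKSUM_OFFSET, 0, 2);
	return 1;
}
```

This sketch performs its own bounds check right before the write, which is only one way to address the review comment; the two options discussed in the email differ in where the existing strict check is placed relative to the write.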
[dpdk-dev] [PATCH v2 0/2] Use macro to print MAC address
Added macros to simplyfy print of MAC address. The other method of first formatting mac address into a string and string printed, is avoided. Aman Singh (2): net: macro for MAC address print net: macro to extract MAC address bytes app/pdump/main.c | 5 +--- app/test-pmd/cmdline.c| 6 ++-- app/test-pmd/config.c | 6 ++-- app/test-pmd/testpmd.c| 9 ++ app/test/test_event_eth_rx_adapter.c | 5 +--- app/test/test_event_eth_tx_adapter.c | 5 +--- drivers/bus/dpaa/base/fman/netcfg_layer.c | 9 ++ drivers/common/mlx5/linux/mlx5_nl.c | 6 ++-- drivers/net/bnx2x/bnx2x.c | 4 +-- drivers/net/bnx2x/bnx2x_vfpf.c| 10 ++- drivers/net/bnx2x/ecore_sp.c | 14 - drivers/net/bnxt/bnxt_ethdev.c| 2 +- drivers/net/bonding/rte_eth_bond_8023ad.c | 4 +-- drivers/net/bonding/rte_eth_bond_pmd.c| 12 +++- drivers/net/dpaa/dpaa_ethdev.c| 10 ++- drivers/net/e1000/igb_ethdev.c| 9 ++ drivers/net/enic/base/vnic_dev.c | 4 +-- drivers/net/enic/enic_res.c | 2 +- drivers/net/failsafe/failsafe.c | 6 ++-- drivers/net/hinic/hinic_pmd_ethdev.c | 6 ++-- drivers/net/i40e/i40e_ethdev_vf.c | 21 -- drivers/net/iavf/iavf_ethdev.c| 18 +++- drivers/net/iavf/iavf_vchnl.c | 15 +++--- drivers/net/ice/ice_dcf.c | 6 ++-- drivers/net/ixgbe/ixgbe_ethdev.c | 29 --- drivers/net/mlx4/mlx4.c | 7 ++--- drivers/net/mlx5/linux/mlx5_os.c | 7 ++--- drivers/net/mlx5/windows/mlx5_os.c| 7 ++--- drivers/net/mvpp2/mrvl_flow.c | 4 +-- drivers/net/netvsc/hn_rndis.c | 2 +- drivers/net/nfp/nfp_net.c | 2 +- drivers/net/qede/base/ecore_mcp.c | 2 +- drivers/net/qede/base/ecore_sriov.c | 2 +- drivers/net/qede/qede_ethdev.c| 9 ++ drivers/net/thunderx/nicvf_ethdev.c | 2 +- drivers/net/txgbe/txgbe_ethdev_vf.c | 29 --- drivers/net/virtio/virtio_ethdev.c| 4 +-- drivers/net/vmxnet3/vmxnet3_ethdev.c | 4 +-- examples/bbdev_app/main.c | 9 ++ examples/bond/main.c | 3 +- examples/distributor/main.c | 5 +--- examples/ethtool/ethtool-app/ethapp.c | 10 ++- .../pipeline_worker_generic.c | 5 +--- .../eventdev_pipeline/pipeline_worker_tx.c| 5 +--- 
examples/flow_classify/flow_classify.c| 5 +--- examples/ioat/ioatfwd.c | 9 ++ examples/ip_pipeline/cli.c| 11 ++- examples/l2fwd-cat/l2fwd-cat.c| 5 +--- examples/l2fwd-crypto/main.c | 11 ++- examples/l2fwd-event/l2fwd_common.c | 9 ++ examples/l2fwd-jobstats/main.c| 11 ++- examples/l2fwd-keepalive/main.c | 9 ++ examples/l2fwd/main.c | 11 ++- examples/link_status_interrupt/main.c | 9 ++ examples/packet_ordering/main.c | 5 +--- examples/pipeline/cli.c | 6 ++-- examples/rxtx_callbacks/main.c| 4 +-- examples/server_node_efd/server/main.c| 6 ++-- examples/skeleton/basicfwd.c | 5 +--- examples/vhost/main.c | 17 +++ examples/vm_power_manager/channel_monitor.c | 4 +-- .../guest_cli/vm_power_cli_guest.c| 5 +--- examples/vm_power_manager/main.c | 5 +--- examples/vmdq/main.c | 14 ++--- examples/vmdq_dcb/main.c | 14 ++--- lib/net/rte_ether.h | 14 + lib/vhost/vhost_user.c| 2 +- 67 files changed, 155 insertions(+), 377 deletions(-) -- 2.17.1
[dpdk-dev] [PATCH v2 1/2] net: macro for MAC address print
Added macro to print six bytes of MAC address. The MAC addresses will be printed in lower case hexdecimal format. In case there is a specific check for upper case MAC address, the user may need to make a change in such test case after this patch. Signed-off-by: Aman Singh --- app/test-pmd/cmdline.c| 2 +- app/test-pmd/config.c | 2 +- app/test-pmd/testpmd.c| 2 +- drivers/bus/dpaa/base/fman/netcfg_layer.c | 2 +- drivers/common/mlx5/linux/mlx5_nl.c | 2 +- drivers/net/bnx2x/bnx2x.c | 4 ++-- drivers/net/bnx2x/bnx2x_vfpf.c| 3 ++- drivers/net/bnx2x/ecore_sp.c | 14 +++--- drivers/net/bnxt/bnxt_ethdev.c| 2 +- drivers/net/bonding/rte_eth_bond_8023ad.c | 4 ++-- drivers/net/bonding/rte_eth_bond_pmd.c| 4 ++-- drivers/net/dpaa/dpaa_ethdev.c| 2 +- drivers/net/e1000/igb_ethdev.c| 2 +- drivers/net/enic/base/vnic_dev.c | 4 ++-- drivers/net/enic/enic_res.c | 2 +- drivers/net/failsafe/failsafe.c | 2 +- drivers/net/hinic/hinic_pmd_ethdev.c | 2 +- drivers/net/i40e/i40e_ethdev_vf.c | 6 +++--- drivers/net/iavf/iavf_ethdev.c| 4 ++-- drivers/net/iavf/iavf_vchnl.c | 4 ++-- drivers/net/ice/ice_dcf.c | 2 +- drivers/net/ixgbe/ixgbe_ethdev.c | 6 +++--- drivers/net/mlx4/mlx4.c | 2 +- drivers/net/mlx5/linux/mlx5_os.c | 2 +- drivers/net/mlx5/windows/mlx5_os.c| 2 +- drivers/net/mvpp2/mrvl_flow.c | 4 ++-- drivers/net/netvsc/hn_rndis.c | 2 +- drivers/net/nfp/nfp_net.c | 2 +- drivers/net/qede/base/ecore_mcp.c | 2 +- drivers/net/qede/base/ecore_sriov.c | 2 +- drivers/net/qede/qede_ethdev.c| 2 +- drivers/net/thunderx/nicvf_ethdev.c | 2 +- drivers/net/txgbe/txgbe_ethdev_vf.c | 6 +++--- drivers/net/virtio/virtio_ethdev.c| 4 ++-- drivers/net/vmxnet3/vmxnet3_ethdev.c | 4 ++-- examples/bbdev_app/main.c | 2 +- examples/ethtool/ethtool-app/ethapp.c | 2 +- examples/ioat/ioatfwd.c | 2 +- examples/ip_pipeline/cli.c| 4 ++-- examples/l2fwd-crypto/main.c | 2 +- examples/l2fwd-event/l2fwd_common.c | 2 +- examples/l2fwd-jobstats/main.c| 2 +- examples/l2fwd-keepalive/main.c | 2 +- examples/l2fwd/main.c | 2 +- 
examples/link_status_interrupt/main.c | 2 +- examples/pipeline/cli.c | 2 +- examples/server_node_efd/server/main.c| 2 +- examples/vhost/main.c | 2 +- examples/vmdq/main.c | 2 +- examples/vmdq_dcb/main.c | 2 +- lib/net/rte_ether.h | 5 + lib/vhost/vhost_user.c| 2 +- 52 files changed, 79 insertions(+), 73 deletions(-) diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c index 82253bc751..d4186eb9b2 100644 --- a/app/test-pmd/cmdline.c +++ b/app/test-pmd/cmdline.c @@ -10899,7 +10899,7 @@ static void cmd_mcast_addr_parsed(void *parsed_result, if (!rte_is_multicast_ether_addr(&res->mc_addr)) { fprintf(stderr, - "Invalid multicast addr %02X:%02X:%02X:%02X:%02X:%02X\n", + "Invalid multicast addr " RTE_ETHER_ADDR_PRT_FMT "\n", res->mc_addr.addr_bytes[0], res->mc_addr.addr_bytes[1], res->mc_addr.addr_bytes[2], res->mc_addr.addr_bytes[3], res->mc_addr.addr_bytes[4], res->mc_addr.addr_bytes[5]); diff --git a/app/test-pmd/config.c b/app/test-pmd/config.c index 31d8ba1b91..21d5db5297 100644 --- a/app/test-pmd/config.c +++ b/app/test-pmd/config.c @@ -782,7 +782,7 @@ port_summary_display(portid_t port_id) if (ret != 0) return; - printf("%-4d %02X:%02X:%02X:%02X:%02X:%02X %-12s %-14s %-8s %s\n", + printf("%-4d " RTE_ETHER_ADDR_PRT_FMT " %-12s %-14s %-8s %s\n", port_id, mac_addr.addr_bytes[0], mac_addr.addr_bytes[1], mac_addr.addr_bytes[2], mac_addr.addr_bytes[3], mac_addr.addr_bytes[4], mac_addr.addr_bytes[5], name, diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c index 6cbe9ba3c8..d0ede963ea 100644 --- a/app/test-pmd/testpmd.c +++ b/app/test-pmd/testpmd.c @@ -2622,7 +2622,7 @@ start_port(portid_t pid) pi); if (eth_macaddr_get_print_err(pi, &port->eth_addr) == 0) - printf("Port %d: %02X:%02X:%02X:%02X:%02X:%02X\n", pi, + printf("Port %d: " RTE_ETHER_ADDR_PRT_FMT "\n", pi, port->eth_addr.addr_bytes[0], port->eth_addr.addr_bytes[1],
[dpdk-dev] [PATCH v2 2/2] net: macro to extract MAC address bytes
Added macros to simplyfy print of MAC address. The other method of first formatting mac address into a string and string printed, is avoided. Signed-off-by: Aman Singh --- The change in the document will be done in seperate patch. To ensure document has direct reference of the code. V2: Fix build issue in examples code --- app/pdump/main.c | 5 +--- app/test-pmd/cmdline.c| 4 +--- app/test-pmd/config.c | 4 +--- app/test-pmd/testpmd.c| 7 +- app/test/test_event_eth_rx_adapter.c | 5 +--- app/test/test_event_eth_tx_adapter.c | 5 +--- drivers/bus/dpaa/base/fman/netcfg_layer.c | 7 +- drivers/common/mlx5/linux/mlx5_nl.c | 4 +--- drivers/net/bnx2x/bnx2x_vfpf.c| 7 +- drivers/net/bonding/rte_eth_bond_pmd.c| 8 ++- drivers/net/dpaa/dpaa_ethdev.c| 8 +-- drivers/net/e1000/igb_ethdev.c| 7 +- drivers/net/failsafe/failsafe.c | 4 +--- drivers/net/hinic/hinic_pmd_ethdev.c | 4 +--- drivers/net/i40e/i40e_ethdev_vf.c | 15 +++- drivers/net/iavf/iavf_ethdev.c| 14 ++- drivers/net/iavf/iavf_vchnl.c | 11 ++--- drivers/net/ice/ice_dcf.c | 4 +--- drivers/net/ixgbe/ixgbe_ethdev.c | 23 +++ drivers/net/mlx4/mlx4.c | 5 +--- drivers/net/mlx5/linux/mlx5_os.c | 5 +--- drivers/net/mlx5/windows/mlx5_os.c| 5 +--- drivers/net/qede/qede_ethdev.c| 7 +- drivers/net/txgbe/txgbe_ethdev_vf.c | 23 +++ examples/bbdev_app/main.c | 7 +- examples/bond/main.c | 3 +-- examples/distributor/main.c | 5 +--- examples/ethtool/ethtool-app/ethapp.c | 8 +-- .../pipeline_worker_generic.c | 5 +--- .../eventdev_pipeline/pipeline_worker_tx.c| 5 +--- examples/flow_classify/flow_classify.c| 5 +--- examples/ioat/ioatfwd.c | 7 +- examples/ip_pipeline/cli.c| 9 ++-- examples/l2fwd-cat/l2fwd-cat.c| 5 +--- examples/l2fwd-crypto/main.c | 9 ++-- examples/l2fwd-event/l2fwd_common.c | 7 +- examples/l2fwd-jobstats/main.c| 9 ++-- examples/l2fwd-keepalive/main.c | 7 +- examples/l2fwd/main.c | 9 ++-- examples/link_status_interrupt/main.c | 7 +- examples/packet_ordering/main.c | 5 +--- examples/pipeline/cli.c | 4 +--- 
examples/rxtx_callbacks/main.c| 4 +--- examples/server_node_efd/server/main.c| 4 +--- examples/skeleton/basicfwd.c | 5 +--- examples/vhost/main.c | 15 +++- examples/vm_power_manager/channel_monitor.c | 4 +--- .../guest_cli/vm_power_cli_guest.c| 5 +--- examples/vm_power_manager/main.c | 5 +--- examples/vmdq/main.c | 12 ++ examples/vmdq_dcb/main.c | 12 ++ lib/net/rte_ether.h | 9 52 files changed, 77 insertions(+), 305 deletions(-) diff --git a/app/pdump/main.c b/app/pdump/main.c index 63bbe65cd8..46f9d25db0 100644 --- a/app/pdump/main.c +++ b/app/pdump/main.c @@ -612,10 +612,7 @@ configure_vdev(uint16_t port_id) printf("Port %u MAC: %02"PRIx8" %02"PRIx8" %02"PRIx8 " %02"PRIx8" %02"PRIx8" %02"PRIx8"\n", - port_id, - addr.addr_bytes[0], addr.addr_bytes[1], - addr.addr_bytes[2], addr.addr_bytes[3], - addr.addr_bytes[4], addr.addr_bytes[5]); + port_id, RTE_ETHER_ADDR_BYTES(&addr)); ret = rte_eth_promiscuous_enable(port_id); if (ret != 0) { diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c index d4186eb9b2..a5d6c20be1 100644 --- a/app/test-pmd/cmdline.c +++ b/app/test-pmd/cmdline.c @@ -10900,9 +10900,7 @@ static void cmd_mcast_addr_parsed(void *parsed_result, if (!rte_is_multicast_ether_addr(&res->mc_addr)) { fprintf(stderr, "Invalid multicast addr " RTE_ETHER_ADDR_PRT_FMT "\n", - res->mc_addr.addr_bytes[0], res->mc_addr.addr_bytes[1], - res->mc_addr.addr_bytes[2], res->mc_addr.addr_bytes[3], - res->mc_addr.addr_bytes[4], res->mc_addr.addr_bytes[5]); + RTE_ETHER_ADDR_BYTES(&res->mc_addr)); return; } if (strcmp(res->what, "add") == 0) diff --git a/app/test-pmd/config.c b/app/test-pmd/config.c index 21d5db52
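The effect of the two macros introduced by this series can be sketched without the DPDK headers. The SK_* macros and struct sk_ether_addr below are local stand-ins written for this example; they imitate RTE_ETHER_ADDR_PRT_FMT and RTE_ETHER_ADDR_BYTES as used in the diffs above, and the lower-case format follows the commit message of patch 1/2 rather than any released header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Local stand-in for struct rte_ether_addr. */
struct sk_ether_addr {
	uint8_t addr_bytes[6];
};

/* Stand-in for RTE_ETHER_ADDR_PRT_FMT (lower case, per the commit message). */
#define SK_ETHER_ADDR_PRT_FMT "%02x:%02x:%02x:%02x:%02x:%02x"

/* Stand-in for RTE_ETHER_ADDR_BYTES: expands to the six byte arguments. */
#define SK_ETHER_ADDR_BYTES(a) \
	(a)->addr_bytes[0], (a)->addr_bytes[1], (a)->addr_bytes[2], \
	(a)->addr_bytes[3], (a)->addr_bytes[4], (a)->addr_bytes[5]

/* Format a MAC address into buf; returns what snprintf() returns. */
static int sk_format_mac(char *buf, size_t len, const struct sk_ether_addr *a)
{
	assert(buf != NULL && a != NULL);
	return snprintf(buf, len, SK_ETHER_ADDR_PRT_FMT, SK_ETHER_ADDR_BYTES(a));
}
```

The format macro and the byte-list macro are meant to be used as a pair in a single printf-style call, which is what lets the call sites in the diffs drop the six hand-written addr_bytes[] arguments.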
[dpdk-dev] Fwd: [dpdk] Patch notification: 6 patches updated
Hi, Thomas Why the dmadev patchset v12/13 both deferred ? Does it have anything to do with the completion of 21.08? Thanks Forwarded Message Subject: [dpdk] Patch notification: 6 patches updated Date: Tue, 3 Aug 2021 12:20:03 + From: DPDK patchwork To: fengcheng...@huawei.com Hello, The following patches (submitted by you) have been updated in Patchwork: * dpdk: [v13,3/6] dmadev: introduce DMA device library PMD header - http://patches.dpdk.org/project/dpdk/patch/1627990189-36531-4-git-send-email-fengcheng...@huawei.com/ - for: DPDK was: New now: Deferred * dpdk: [v13,4/6] dmadev: introduce DMA device library implementation - http://patches.dpdk.org/project/dpdk/patch/1627990189-36531-5-git-send-email-fengcheng...@huawei.com/ - for: DPDK was: New now: Deferred * dpdk: [v13,6/6] maintainers: add for dmadev - http://patches.dpdk.org/project/dpdk/patch/1627990189-36531-7-git-send-email-fengcheng...@huawei.com/ - for: DPDK was: New now: Deferred * dpdk: [v13,5/6] doc: add DMA device library guide - http://patches.dpdk.org/project/dpdk/patch/1627990189-36531-6-git-send-email-fengcheng...@huawei.com/ - for: DPDK was: New now: Deferred * dpdk: [v13,1/6] dmadev: introduce DMA device library public APIs - http://patches.dpdk.org/project/dpdk/patch/1627990189-36531-2-git-send-email-fengcheng...@huawei.com/ - for: DPDK was: New now: Deferred * dpdk: [v13,2/6] dmadev: introduce DMA device library internal header - http://patches.dpdk.org/project/dpdk/patch/1627990189-36531-3-git-send-email-fengcheng...@huawei.com/ - for: DPDK was: New now: Deferred This email is a notification only - you do not need to respond. Happy patchworking. -- This is an automated mail sent by the Patchwork system at patches.dpdk.org. To stop receiving these notifications, edit your mail settings at: http://patches.dpdk.org/mail/ .
Re: [dpdk-dev] Fwd: [dpdk] Patch notification: 6 patches updated
03/08/2021 14:54, fengchengwen: > Hi, Thomas > > Why the dmadev patchset v12/13 both deferred ? Does it have anything to do > with > the completion of 21.08? We are fixing the last critical bugs to close 21.08 this week. We don't accept new features. What did you expect? Do you understand that "Deferred" means *not* for this release?
[dpdk-dev] [PATCH v2 2/2] ethdev: announce moving to general modify function
Currently there is a dedicated modify function for each field that the application wants to change. For example: rte_flow_action_type_set_tp_port to modify the destination port of UDP/TCP. rte_flow_action_type_set_ipv4_dst to modify the destination of IPv4. The new function rte_flow_action_modify_field adds the ability to use one function to modify any field, and to take the new value from a different field rather than only from an immediate value. Signed-off-by: Ori Kam Acked-by: Matan Azrad --- V2: Fix typo. --- doc/guides/rel_notes/deprecation.rst | 3 +++ 1 file changed, 3 insertions(+) diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst index b530616281..77491c322f 100644 --- a/doc/guides/rel_notes/deprecation.rst +++ b/doc/guides/rel_notes/deprecation.rst @@ -162,3 +162,6 @@ Deprecation Notices * ethdev: The struct ``rte_flow_action_modify_data`` will be modified to support modifying larger fields than 64 bits. In addition, documentation will be updated to clarify byte order. + +* ethdev: Announce moving from dedicated modify function for each field, + to using the general ``rte_flow_modify_field`` action. -- 2.25.1
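For readers who want to see the difference between per-field setters and a single general modify entry point, here is a rough, self-contained sketch. Every name in it (sketch_hdr, sketch_modify, sketch_modify_field and so on) is hypothetical and only mimics the idea; none of it is the rte_flow API, and the real action operates on flow rules rather than on a header struct.

```c
#include <stdint.h>

/* Hypothetical, simplified header fields; these are not DPDK types. */
struct sketch_hdr {
	uint32_t ipv4_dst;
	uint16_t tp_dst;
	uint16_t tp_src;
};

/* Dedicated style: one setter per field, immediate values only. */
static void
sketch_set_ipv4_dst(struct sketch_hdr *h, uint32_t addr)
{
	h->ipv4_dst = addr;
}

static void
sketch_set_tp_dst(struct sketch_hdr *h, uint16_t port)
{
	h->tp_dst = port;
}

/* General style: the field to touch is data, not part of the function name. */
enum sketch_field {
	SKETCH_FIELD_IPV4_DST,
	SKETCH_FIELD_TP_DST,
	SKETCH_FIELD_TP_SRC,
	SKETCH_FIELD_VALUE,	/* source only: take the immediate value */
};

struct sketch_modify {
	enum sketch_field dst;
	enum sketch_field src;
	uint64_t value;		/* read only when src == SKETCH_FIELD_VALUE */
};

/* Fetch the source operand: either another field or the immediate. */
static int
sketch_read(const struct sketch_hdr *h, const struct sketch_modify *m,
		uint64_t *out)
{
	switch (m->src) {
	case SKETCH_FIELD_IPV4_DST: *out = h->ipv4_dst; return 0;
	case SKETCH_FIELD_TP_DST: *out = h->tp_dst; return 0;
	case SKETCH_FIELD_TP_SRC: *out = h->tp_src; return 0;
	case SKETCH_FIELD_VALUE: *out = m->value; return 0;
	}
	return -1;
}

/* One function covers every destination field and both source kinds. */
static int
sketch_modify_field(struct sketch_hdr *h, const struct sketch_modify *m)
{
	uint64_t v;

	if (sketch_read(h, m, &v) != 0)
		return -1;
	switch (m->dst) {
	case SKETCH_FIELD_IPV4_DST: h->ipv4_dst = (uint32_t)v; return 0;
	case SKETCH_FIELD_TP_DST: h->tp_dst = (uint16_t)v; return 0;
	case SKETCH_FIELD_TP_SRC: h->tp_src = (uint16_t)v; return 0;
	case SKETCH_FIELD_VALUE: break;	/* not a valid destination */
	}
	return -1;
}
```

With the general form, copying the source port into the destination port is just another sketch_modify value, whereas the dedicated setters cannot express it at all.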
[dpdk-dev] [PATCH v2 1/2] ethdev: announce change to action modify data
In the current implementation, the action rte_flow_action_modify_field is not well defined for fields larger than 64 bits (for example IPv6 source). In addition, the byte order is also not well defined. Both of these issues should be fixed. Signed-off-by: Ori Kam Acked-by: Matan Azrad --- V2: Fix typo. --- doc/guides/rel_notes/deprecation.rst | 4 1 file changed, 4 insertions(+) diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst index d9c0e65921..b530616281 100644 --- a/doc/guides/rel_notes/deprecation.rst +++ b/doc/guides/rel_notes/deprecation.rst @@ -158,3 +158,7 @@ Deprecation Notices * security: The functions ``rte_security_set_pkt_metadata`` and ``rte_security_get_userdata`` will be made inline functions and additional flags will be added in structure ``rte_security_ctx`` in DPDK 21.11. + +* ethdev: The struct ``rte_flow_action_modify_data`` will be modified + to support modifying larger fields than 64 bits. + In addition, documentation will be updated to clarify byte order. -- 2.25.1
Re: [dpdk-dev] [PATCH] net/mlx5: workaround not supported drop action on the root table
02/08/2021 16:30, Suanming Mou: > Currently, there are two types of drop action implementation > in the PMD. One is the DR(Direct Rules) dummy placeholder drop > action and another is the dedicated dummy queue drop action. > When creates flow on the root table with DR drop action, the > action will be converted to MLX5_IB_ATTR_CREATE_FLOW_FLAGS_DROP > Verbs attribute in rdma-core. > > In some inbox systems, MLX5_IB_ATTR_CREATE_FLOW_FLAGS_DROP Verbs > attribute may not be supported in the kernel driver. Create flow > with drop action on the root table will be failed as it is not > supported. In this case, the dummy queue drop action should be > used instead of DR dummy placeholder drop action. > > This commit adds the DR drop action support detect on the root > table. If MLX5_IB_ATTR_CREATE_FLOW_FLAGS_DROP Verbs is not > supported in the system, a dummy queue will be used as drop > action. > > Fixes: da845ae9d7c1 ("net/mlx5: fix drop action for Direct Rules/Verbs") > Cc: sta...@dpdk.org > > Signed-off-by: Suanming Mou > Acked-by: Matan Azrad Applied, thanks.
Re: [dpdk-dev] Fwd: [dpdk] Patch notification: 6 patches updated
On 2021/8/3 20:59, Thomas Monjalon wrote: > 03/08/2021 14:54, fengchengwen: >> Hi, Thomas >> >> Why the dmadev patchset v12/13 both deferred ? Does it have anything to do >> with >> the completion of 21.08? > > We are fixing the last critical bugs to close 21.08 this week. > We don't accept new features. Got it, thanks > > What did you expect? > Do you understand that "Deferred" means *not* for this release? I think we could merge dmadev in 21.08 because most of the patchset are well-reviewed (except 5/6 add guide patch), so we can all do specific driver development in 21.11. > > > > . >
Re: [dpdk-dev] [PATCH v3] doc: policy on the promotion of experimental APIs
On 11/07/2021 08:22, Jerin Jacob wrote: > On Sat, Jul 10, 2021 at 12:46 AM Tyler Retzlaff > wrote: >> >> On Fri, Jul 09, 2021 at 11:46:54AM +0530, Jerin Jacob wrote: + +Promotion to stable +~~~ + +Ordinarily APIs marked as ``experimental`` will be promoted to the stable ABI +once a maintainer and/or the original contributor is satisfied that the API is +reasonably mature. In exceptional circumstances, should an API still be >>> >>> Is this line with git commit message? >>> Why making an exceptional case? why not make it stable after two years >>> or remove it. >>> My worry is if we make an exception case, it will be difficult to >>> enumerate the exception case. >> >> i think the intent here is to indicate that an api/abi doesn't just >> automatically become stable after a period of time. there also has to >> be an evaluation by the maintainer / community before making it stable. >> >> so i guess the timer is something that should force that evaluation. as >> a part of that evaluation one would imagine there is justification for >> keeping the api as experimental for longer and if so a rationale as to >> why. > > I think, we need to have a deadline. Probably one year timer for evaluation > and > two year for max time for decision to make it as stable or remove. > Tyler is correct here (sorry for the delay, I was out on vacation). In my usage of the word exception - I was conveying that an API aging or timing out should be an exceptional event. What I am hoping will happen in the 90%-ile of cases is conveyed in the previous line. "Ordinarily APIs marked as ``experimental`` will be promoted to the stable ABI once a maintainer and/or the original contributor is satisfied that the API is reasonably mature." i.e. that the symbol has been proactively managed, with the maintainer and original author deciding when to promote. I will add a line to indicate that experimental APIs should be reviewed after one year.
Re: [dpdk-dev] [PATCH v13 5/6] doc: add DMA device library guide
On Tue, Aug 3, 2021 at 5:03 PM Chengwen Feng wrote: > > This patch adds dmadev library guide. > > Signed-off-by: Chengwen Feng > --- > doc/guides/prog_guide/dmadev.rst| 126 +++ doc build has following warning in my machine ninja: Entering directory `build' [2789/2813] Generating html_guides with a custom command /export/dpdk.org/doc/guides/prog_guide/dmadev.rst:24: WARNING: Figure caption must be a paragraph or empty comment. .. figure:: img/dmadev_i1.* The model of the DMA framework built on * The DMA controller could have multiple hardware DMA channels (aka. hardware DMA queues), each hardware DMA channel should be represented by a dmadev. * The dmadev could create multiple virtual DMA channels, each virtual DMA channel represents a different transfer context. The DMA operation request must be submitted to the virtual DMA channel. e.g. Application could create virtual DMA channel 0 for memory-to-memory transfer scenario, and create virtual DMA channel 1 for memory-to-device transfer scenario. [2813/2813] Linking target app/dpdk-test-pipeline > new file mode 100644 > index 000..b305beb > --- /dev/null > +++ b/doc/guides/prog_guide/img/dmadev_i1.svg why _i1 in the name? > @@ -0,0 +1,278 @@ > + > + You could add an SPDX license and your company copyright as well. See other .svg files. Rest looks good to me. 
> [snip: rest of the quoted dmadev_i1.svg source]
Re: [dpdk-dev] Fwd: [dpdk] Patch notification: 6 patches updated
03/08/2021 15:19, fengchengwen: > On 2021/8/3 20:59, Thomas Monjalon wrote: > > 03/08/2021 14:54, fengchengwen: > >> Hi, Thomas > >> > >> Why the dmadev patchset v12/13 both deferred ? Does it have anything to do > >> with > >> the completion of 21.08? > > > > We are fixing the last critical bugs to close 21.08 this week. > > We don't accept new features. > > Got it, thanks > > > > > What did you expect? > > Do you understand that "Deferred" means *not* for this release? > > I think we could merge dmadev in 21.08 because most of the patchset are > well-reviewed > (except 5/6 add guide patch), so we can all do specific driver development in > 21.11. No way we merge a feature after -rc2.
Re: [dpdk-dev] [dpdk-ci] [PATCH v12 01/10] eal: add basic threading functions
It seems like meson encountered an error when building app/test/meson.build:472:11: ERROR: Index 1 out of bounds of array of size > 1. > > A full log can be found at > /home-local/jenkins-local/jenkins-agent/workspace/Apply-Custom-Patch-Set/dpdk/build/meson-logs/meson-log.txt > ninja: error: loading 'build.ninja': No such file or directory > I can also reproduce the issue building locally (Meson version: 0.58.1): 1. Get the DPDK main branch (f12b844b54f4ea7908ecb08ade04c7366ede031d) 2. apply all patches 3. meson $BUILD_DIR Meson doesn't really give any extra information aside from that, but looking at the build file, it looks like one of the fast_tests has no arguments and at least one is expected. On Mon, Aug 2, 2021 at 5:37 PM Dmitry Kozlyuk wrote: > + c...@dpdk.org > > 2021-08-02 14:08 (UTC-0700), Narcisa Ana Maria Vasile: > > On Mon, Aug 02, 2021 at 10:32:17AM -0700, Narcisa Ana Maria Vasile wrote: > > > From: Narcisa Vasile > > > > > > Use a portable, type-safe representation for the thread identifier. > > > Add functions for comparing thread ids and obtaining the thread id > > > for the current thread. > > > > > > Signed-off-by: Narcisa Vasile > > > --- > > > lib/eal/common/meson.build| 1 + > > > lib/eal/{unix => common}/rte_thread.c | 57 --- > > > lib/eal/include/rte_thread.h | 48 +- > > > lib/eal/unix/meson.build | 1 - > > > lib/eal/version.map | 3 ++ > > > lib/eal/windows/rte_thread.c | 17 > > > 6 files changed, 95 insertions(+), 32 deletions(-) > > > rename lib/eal/{unix => common}/rte_thread.c (66%) > > > > > > diff --git a/lib/eal/common/meson.build b/lib/eal/common/meson.build > > > > Hello, > > > > I see the following error on this patch: > > > > ninja: error: loading 'build.ninja': No such file or directory > > > > https://lab.dpdk.org/results/dashboard/patchsets/18090/ > > > > Locally, the build succeedes. > > How can I see more information about this build error? >
Re: [dpdk-dev] [dpdk-ci] [PATCH v12 01/10] eal: add basic threading functions
2021-08-03 11:11 (UTC-0400), Owen Hilyard: > It seems like meson encountered an error when building > > app/test/meson.build:472:11: ERROR: Index 1 out of bounds of array of size > > 1. > > > > A full log can be found at > > /home-local/jenkins-local/jenkins-agent/workspace/Apply-Custom-Patch-Set/dpdk/build/meson-logs/meson-log.txt > > ninja: error: loading 'build.ninja': No such file or directory > > > > I can also reproduce the issue building locally (Meson version: 0.58.1): Meson 0.58+ has a known issue on Windows: https://github.com/mesonbuild/meson/issues/8981 The last known good version is 0.57.2. > 1. Get the DPDK main branch (f12b844b54f4ea7908ecb08ade04c7366ede031d) > 2. apply all patches > 3. meson $BUILD_DIR > > Meson doesn't really give any extra information aside from that, but > looking at the build file, it looks like one of the fast_tests has no > arguments and at least one is expected. > > On Mon, Aug 2, 2021 at 5:37 PM Dmitry Kozlyuk > wrote: > > > + c...@dpdk.org > > > > 2021-08-02 14:08 (UTC-0700), Narcisa Ana Maria Vasile: > > > On Mon, Aug 02, 2021 at 10:32:17AM -0700, Narcisa Ana Maria Vasile wrote: > > > > > > > From: Narcisa Vasile > > > > > > > > Use a portable, type-safe representation for the thread identifier. > > > > Add functions for comparing thread ids and obtaining the thread id > > > > for the current thread. 
> > > > > > > > Signed-off-by: Narcisa Vasile > > > > --- > > > > lib/eal/common/meson.build| 1 + > > > > lib/eal/{unix => common}/rte_thread.c | 57 --- > > > > lib/eal/include/rte_thread.h | 48 +- > > > > lib/eal/unix/meson.build | 1 - > > > > lib/eal/version.map | 3 ++ > > > > lib/eal/windows/rte_thread.c | 17 > > > > 6 files changed, 95 insertions(+), 32 deletions(-) > > > > rename lib/eal/{unix => common}/rte_thread.c (66%) > > > > > > > > diff --git a/lib/eal/common/meson.build b/lib/eal/common/meson.build > > > > > > Hello, > > > > > > I see the following error on this patch: > > > > > > ninja: error: loading 'build.ninja': No such file or directory > > > > > > https://lab.dpdk.org/results/dashboard/patchsets/18090/ > > > > > > Locally, the build succeedes. > > > How can I see more information about this build error? > >
Re: [dpdk-dev] [PATCH] doc: abstract the behaviour of rte_ctrl_thread_create
Hi Olivier, Any comments on this? Thanks, Honnappa > > > > > > The current expected behaviour of the function > > > rte_ctrl_thread_create is rigid which makes the implementation of the > function complex. > > > Make the expected behaviour abstract to allow for simplified > > > implementation. > > > > > > With this change, the calls to pthread_setaffinity_np can be moved > > > to the control thread. This will avoid the use of > > > pthread_barrier_wait and simplify the synchronization mechanism > > > between rte_ctrl_thread_create and the calling thread. > > > > > > Signed-off-by: Honnappa Nagarahalli > > > --- > > > Possible patch is at: > > > http://patches.dpdk.org/project/dpdk/patch/20210730213709.19400-1- > > > honnappa.nagaraha...@arm.com/ > > > > > > doc/guides/rel_notes/deprecation.rst | 7 +++ > > > 1 file changed, 7 insertions(+) > > > > > > diff --git a/doc/guides/rel_notes/deprecation.rst > > > b/doc/guides/rel_notes/deprecation.rst > > > index 9584d6bfd7..1960e3c8bf 100644 > > > --- a/doc/guides/rel_notes/deprecation.rst > > > +++ b/doc/guides/rel_notes/deprecation.rst > > > @@ -11,6 +11,13 @@ here. > > > Deprecation Notices > > > --- > > > > > > +* eal: The expected behaviour of the function > > > +``rte_ctrl_thread_create`` > > > + abstracted to allow for simplified implementation. The new > > > +behaviour is > > > + as follows: > > > + Creates a control thread with the given name. The affinity of the > > > +new > > > + thread is based on the CPU affinity retrieved at the time > > > +rte_eal_init() > > > + was called, the dataplane and service lcores are then excluded. > > > + > > > * kvargs: The function ``rte_kvargs_process`` will get a new parameter > > >for returning key match count. It will ease handling of no-match case. > > > > > > -- > > > 2.17.1 > > Acked-by: Ruifeng Wang > > Acked-by: Jerin Jacob
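To make the proposed wording concrete, the affinity rule in the notice can be read as simple set arithmetic. The following is only a sketch under that reading: it uses a 64-bit mask as a stand-in for rte_cpuset_t, all names are made up for illustration, and it does not show what EAL does when the resulting set is empty, since the notice does not say.

```c
#include <assert.h>
#include <stdint.h>

/* One bit per CPU; a simplified stand-in for rte_cpuset_t (assumption). */
typedef uint64_t sketch_cpumask_t;

/* Mask with only the given CPU set. */
static sketch_cpumask_t
sketch_cpu_bit(unsigned int cpu)
{
	assert(cpu < 64);
	return (sketch_cpumask_t)1 << cpu;
}

/*
 * CPUs a control thread may run on: the affinity the process had when
 * rte_eal_init() was called, minus the CPUs used by dataplane and
 * service lcores.
 */
static sketch_cpumask_t
sketch_ctrl_thread_mask(sketch_cpumask_t init_affinity,
		sketch_cpumask_t dataplane, sketch_cpumask_t service)
{
	return init_affinity & ~(dataplane | service);
}
```

For example, with the process allowed on CPUs 0-7 at init, dataplane lcores on CPUs 1-3 and a service lcore on CPU 4, the control thread would be left with CPUs 0, 5, 6 and 7.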
Re: [dpdk-dev] [dpdk-ci] [PATCH v12 01/10] eal: add basic threading functions
Our windows servers are both running 0.57.1, but all of the *nix hosts are running 0.58.1. This issue also happens on 0.57.1 and 0.57.2, with the exact same steps to reproduce. On Tue, Aug 3, 2021 at 11:38 AM Dmitry Kozlyuk wrote: > 2021-08-03 11:11 (UTC-0400), Owen Hilyard: > > It seems like meson encountered an error when building > > > > app/test/meson.build:472:11: ERROR: Index 1 out of bounds of array of > size > > > 1. > > > > > > A full log can be found at > > > > /home-local/jenkins-local/jenkins-agent/workspace/Apply-Custom-Patch-Set/dpdk/build/meson-logs/meson-log.txt > > > ninja: error: loading 'build.ninja': No such file or directory > > > > > > > I can also reproduce the issue building locally (Meson version: 0.58.1): > > Meson 0.58+ has a known issue on Windows: > https://github.com/mesonbuild/meson/issues/8981 > The last known good version is 0.57.2. > > > 1. Get the DPDK main branch (f12b844b54f4ea7908ecb08ade04c7366ede031d) > > 2. apply all patches > > 3. meson $BUILD_DIR > > > > Meson doesn't really give any extra information aside from that, but > > looking at the build file, it looks like one of the fast_tests has no > > arguments and at least one is expected. > > > > On Mon, Aug 2, 2021 at 5:37 PM Dmitry Kozlyuk > > wrote: > > > > > + c...@dpdk.org > > > > > > 2021-08-02 14:08 (UTC-0700), Narcisa Ana Maria Vasile: > > > > On Mon, Aug 02, 2021 at 10:32:17AM -0700, Narcisa Ana Maria Vasile > wrote: > > > > > From: Narcisa Vasile > > > > > > > > > > Use a portable, type-safe representation for the thread identifier. > > > > > Add functions for comparing thread ids and obtaining the thread id > > > > > for the current thread. 
> > > > > > > > > > Signed-off-by: Narcisa Vasile > > > > > --- > > > > > lib/eal/common/meson.build| 1 + > > > > > lib/eal/{unix => common}/rte_thread.c | 57 > --- > > > > > lib/eal/include/rte_thread.h | 48 +- > > > > > lib/eal/unix/meson.build | 1 - > > > > > lib/eal/version.map | 3 ++ > > > > > lib/eal/windows/rte_thread.c | 17 > > > > > 6 files changed, 95 insertions(+), 32 deletions(-) > > > > > rename lib/eal/{unix => common}/rte_thread.c (66%) > > > > > > > > > > diff --git a/lib/eal/common/meson.build > b/lib/eal/common/meson.build > > > > > > > > Hello, > > > > > > > > I see the following error on this patch: > > > > > > > > ninja: error: loading 'build.ninja': No such file or directory > > > > > > > > https://lab.dpdk.org/results/dashboard/patchsets/18090/ > > > > > > > > Locally, the build succeedes. > > > > How can I see more information about this build error? > > > > >
[dpdk-dev] [PATCH v4] doc: policy on the promotion of experimental APIs
Clarifying the ABI policy on the promotion of experimental APIs to stable. We have a fair number of APIs that have been experimental for more than 2 years. This policy amendment indicates that these APIs should be promoted or removed, or should at least form a conversation between the maintainer and original contributor. Signed-off-by: Ray Kinsella Acked-by: Tyler Retzlaff --- v2: addressing comments on abi expiry from Tyler Retzlaff. v3: addressing typos in the git commit message v4: addressing typos and comments by Jerin Jacob doc/guides/contributing/abi_policy.rst | 25 ++--- 1 file changed, 22 insertions(+), 3 deletions(-) diff --git a/doc/guides/contributing/abi_policy.rst b/doc/guides/contributing/abi_policy.rst index 4ad87dbfed..1acd12cbf4 100644 --- a/doc/guides/contributing/abi_policy.rst +++ b/doc/guides/contributing/abi_policy.rst @@ -26,9 +26,10 @@ General Guidelines symbols is managed with :ref:`ABI Versioning `. #. The removal of symbols is considered an :ref:`ABI breakage `, once approved these will form part of the next ABI version. -#. Libraries or APIs marked as :ref:`experimental ` may - be changed or removed without prior notice, as they are not considered part - of an ABI version. +#. Libraries or APIs marked as :ref:`experimental ` may be + changed or removed without prior notice, as they are not considered part of + an ABI version. The :ref:`experimental ` status of an API + is not an indefinite state. #. Updates to the :ref:`minimum hardware requirements `, which drop support for hardware which was previously supported, should be treated as an ABI change. @@ -358,3 +359,21 @@ Libraries Libraries marked as ``experimental`` are entirely not considered part of an ABI version. All functions in such libraries may be changed or removed without prior notice. + +Promotion to stable +~~~ + +An API's ``experimental`` status should be reviewed annually, by both the +maintainer and/or the original contributor. 
Ordinarily APIs marked as +``experimental`` will be promoted to the stable ABI once a maintainer has become +satisfied that the API is mature and is unlikely to change. + +In exceptional circumstances, should an API still be classified as +``experimental`` after two years and is without any prospect of becoming part of +the stable API. The API will then become a candidate for removal, to avoid the +accumulation of abandoned symbols. + +Should an API's Binary Interface change, usually due to a direct change to the +API's signature, it is reasonable for the review and expiry clocks to reset. The +promotion or removal of symbols will typically form part of a conversation +between the maintainer and the original contributor. -- 2.26.2
[dpdk-dev] [PATCH 2/2] doc: update the offload information for Metering Hierarchy
Updates the Minimal SW and HW Version offload support information for Metering hierarchy. Signed-off-by: Jiawei Wang --- doc/guides/nics/mlx5.rst | 5 + 1 file changed, 5 insertions(+) diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst index 42559cf261..b6b8ecb3a0 100644 --- a/doc/guides/nics/mlx5.rst +++ b/doc/guides/nics/mlx5.rst @@ -1744,6 +1744,11 @@ Supported hardware offloads | | | rdma-core 33 | | rdma-core 33 | | | | ConnectX-6 Dx| | ConnectX-6 Dx | +---+-+-+ + | Metering Hierarchy| | DPDK 21.08 | | DPDK 21.08| + | | | OFED 5.3 | | OFED 5.3 | + | | | N/A | | N/A | + | | | ConnectX-6 Dx| | ConnectX-6 Dx | + +---+-+-+ | Sampling | | DPDK 20.11 | | DPDK 20.11| | | | OFED 5.1-2 | | OFED 5.1-2| | | | rdma-core 32 | | N/A | -- 2.18.1
[dpdk-dev] [PATCH 1/2] doc: update the offload information for ASO Metering
Updates the Minimal SW and HW Version offload support information for ASO metering. Signed-off-by: Jiawei Wang --- doc/guides/nics/mlx5.rst | 5 + 1 file changed, 5 insertions(+) diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst index 3e9c736cae..42559cf261 100644 --- a/doc/guides/nics/mlx5.rst +++ b/doc/guides/nics/mlx5.rst @@ -1739,6 +1739,11 @@ Supported hardware offloads | | | rdma-core 26 | | rdma-core 26 | | | | ConnectX-5 | | ConnectX-5| +---+-+-+ + | ASO Metering | | DPDK 21.05 | | DPDK 21.05| + | | | OFED 5.3 | | OFED 5.3 | + | | | rdma-core 33 | | rdma-core 33 | + | | | ConnectX-6 Dx| | ConnectX-6 Dx | + +---+-+-+ | Sampling | | DPDK 20.11 | | DPDK 20.11| | | | OFED 5.1-2 | | OFED 5.1-2| | | | rdma-core 32 | | N/A | -- 2.18.1
Re: [dpdk-dev] [PATCH 1/2] doc: update the offload information for ASO Metering
>-Original Message- >From: Jiawei(Jonny) Wang >Sent: Tuesday, August 3, 2021 4:03 PM >To: Slava Ovsiienko ; Matan Azrad >; Asaf Penso ; NBU-Contact- >Thomas Monjalon ; Shahaf Shuler > >Cc: dev@dpdk.org; Raslan Darawsheh >Subject: [PATCH 1/2] doc: update the offload information for ASO Metering > >Updates the Minimal SW and HW Version offload support information for ASO >metering. > >Signed-off-by: Jiawei Wang Acked-by: Asaf Penso >--- > doc/guides/nics/mlx5.rst | 5 + > 1 file changed, 5 insertions(+) > >diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst index >3e9c736cae..42559cf261 100644 >--- a/doc/guides/nics/mlx5.rst >+++ b/doc/guides/nics/mlx5.rst >@@ -1739,6 +1739,11 @@ Supported hardware offloads >| | | rdma-core 26 | | rdma-core 26 | >| | | ConnectX-5 | | ConnectX-5| >+---+-+-+ >+ | ASO Metering | | DPDK 21.05 | | DPDK 21.05| >+ | | | OFED 5.3 | | OFED 5.3 | >+ | | | rdma-core 33 | | rdma-core 33 | >+ | | | ConnectX-6 Dx| | ConnectX-6 Dx | >+ +---+-+-+ >| Sampling | | DPDK 20.11 | | DPDK 20.11| >| | | OFED 5.1-2 | | OFED 5.1-2| >| | | rdma-core 32 | | N/A | >-- >2.18.1
Re: [dpdk-dev] [PATCH 2/2] doc: update the offload information for Metering Hierarchy
>-Original Message- >From: Jiawei(Jonny) Wang >Sent: Tuesday, August 3, 2021 4:03 PM >To: Slava Ovsiienko ; Matan Azrad >; Asaf Penso ; NBU-Contact- >Thomas Monjalon ; Shahaf Shuler > >Cc: dev@dpdk.org; Raslan Darawsheh >Subject: [PATCH 2/2] doc: update the offload information for Metering >Hierarchy > >Updates the Minimal SW and HW Version offload support information for >Metering hierarchy. > >Signed-off-by: Jiawei Wang Acked-by: Asaf Penso >--- > doc/guides/nics/mlx5.rst | 5 + > 1 file changed, 5 insertions(+) > >diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst index >42559cf261..b6b8ecb3a0 100644 >--- a/doc/guides/nics/mlx5.rst >+++ b/doc/guides/nics/mlx5.rst >@@ -1744,6 +1744,11 @@ Supported hardware offloads >| | | rdma-core 33 | | rdma-core 33 | >| | | ConnectX-6 Dx| | ConnectX-6 Dx | >+---+-+-+ >+ | Metering Hierarchy| | DPDK 21.08 | | DPDK 21.08| >+ | | | OFED 5.3 | | OFED 5.3 | >+ | | | N/A | | N/A | >+ | | | ConnectX-6 Dx| | ConnectX-6 Dx | >+ +---+-+-+ >| Sampling | | DPDK 20.11 | | DPDK 20.11| >| | | OFED 5.1-2 | | OFED 5.1-2| >| | | rdma-core 32 | | N/A | >-- >2.18.1
Re: [dpdk-dev] [dpdk-announce] release candidate 21.08-rc3
Hi IBM - Power Systems DPDK 21.08-rc3 * Basic PF on Mellanox: No new issues or regressions were seen. * Performance: not tested. Systems tested: - IBM Power9 PowerNV 9006-22P OS: RHEL 8.3 GCC: version 8.3.1 20191121 (Red Hat 8.3.1-5) NICs: - Mellanox Technologies MT28800 Family [ConnectX-5 Ex] - firmware version: 16.29.1017 - MLNX_OFED_LINUX-5.2-1.0.4.1 (OFED-5.2-1.0.4) - LPARs on IBM Power10 CHRP IBM,9105-42B OS: RHEL 8.4 GCC: gcc version 8.4.1 20200928 (Red Hat 8.4.1-1) NICs: - Mellanox Technologies MT28800 Family [ConnectX-5 Ex] - firmware version: 16.30.1004 - MLNX_OFED_LINUX-5.3-1.0.0.2 Regards, Thinh Tran On 7/31/2021 4:19 PM, Thomas Monjalon wrote: A new DPDK release candidate is ready for testing: https://git.dpdk.org/dpdk/tag/?id=v21.08-rc3 There are 70 new patches in this snapshot. Release notes: https://doc.dpdk.org/guides/rel_notes/release_21_08.html Please check all pending announces of deprecations for 21.11. https://patches.dpdk.org/project/dpdk/list/?q=announce You may share some release validation results by replying to this message at dev@dpdk.org. DPDK 21.08-rc4 is expected on Wednesday. Thank you everyone
[dpdk-dev] [PATCH v13 00/10] eal: Add EAL API for threading
From: Narcisa Vasile EAL thread API **Problem Statement** DPDK currently uses the pthread interface to create and manage threads. Windows does not support the POSIX thread programming model, so it currently relies on a header file that hides the Windows calls under pthread matched interfaces. Given that EAL should isolate the environment specifics from the applications and libraries and mediate all the communication with the operating systems, a new EAL interface is needed for thread management. **Goals** * Introduce a generic EAL API for threading support that will remove the current Windows pthread.h shim. * Replace references to pthread_* across the DPDK codebase with the new RTE_THREAD_* API. * Allow users to choose between using the RTE_THREAD_* API or a 3rd party thread library through a configuration option. **Design plan** New API main files: * rte_thread.h (librte_eal/include) * rte_thread.c (librte_eal/windows) * rte_thread.c (librte_eal/common) **A schematic example of the design** -- lib/librte_eal/include/rte_thread.h int rte_thread_create(); lib/librte_eal/common/rte_thread.c int rte_thread_create() { return pthread_create(); } lib/librte_eal/windows/rte_thread.c int rte_thread_create() { return CreateThread(); } - **Thread attributes** When or after a thread is created, specific characteristics of the thread can be adjusted. Given that the thread characteristics that are of interest for DPDK applications are affinity and priority, the following structure that represents thread attributes has been defined: typedef struct { enum rte_thread_priority priority; rte_cpuset_t cpuset; } rte_thread_attr_t; The *rte_thread_create()* function can optionally receive an rte_thread_attr_t object that will cause the thread to be created with the affinity and priority described by the attributes object. If no rte_thread_attr_t is passed (parameter is NULL), the default affinity and priority are used. 
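As an illustration of the schematic above, here is a minimal POSIX-only sketch of what such a wrapper layer could look like. All sketch_* names are hypothetical and merely mirror the shape described in this cover letter; they are not the rte_thread API from the patches. The attribute struct carries only a priority (affinity is left out to keep the sketch portable), and anything other than the default priority is rejected rather than mapped to a scheduler policy.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

enum sketch_thread_priority {
	SKETCH_PRIORITY_NORMAL,
	SKETCH_PRIORITY_REALTIME_CRITICAL,
};

typedef struct {
	enum sketch_thread_priority priority;	/* affinity omitted here */
} sketch_thread_attr_t;

/* Opaque, platform-neutral thread identifier. */
typedef struct {
	uintptr_t opaque_id;
} sketch_thread_t;

static void
sketch_thread_attr_init(sketch_thread_attr_t *attr)
{
	attr->priority = SKETCH_PRIORITY_NORMAL;
}

/* POSIX backend: a NULL attr means "use the defaults". */
static int
sketch_thread_create(sketch_thread_t *id, const sketch_thread_attr_t *attr,
		void *(*fn)(void *), void *arg)
{
	sketch_thread_attr_t defaults;
	pthread_t handle;
	int err;

	assert(id != NULL && fn != NULL);
	if (attr == NULL) {
		sketch_thread_attr_init(&defaults);
		attr = &defaults;
	}
	/* Priority mapping is out of scope for this sketch. */
	if (attr->priority != SKETCH_PRIORITY_NORMAL)
		return ENOTSUP;
	err = pthread_create(&handle, NULL, fn, arg);
	if (err != 0)
		return err;
	id->opaque_id = (uintptr_t)handle;
	return 0;
}

static int
sketch_thread_join(sketch_thread_t id, void **ret)
{
	return pthread_join((pthread_t)id.opaque_id, ret);
}

static sketch_thread_t
sketch_thread_self(void)
{
	sketch_thread_t id;

	id.opaque_id = (uintptr_t)pthread_self();
	return id;
}

static int
sketch_thread_equal(sketch_thread_t a, sketch_thread_t b)
{
	return pthread_equal((pthread_t)a.opaque_id, (pthread_t)b.opaque_id);
}

struct sketch_probe {
	sketch_thread_t parent;
	int same_as_parent;
};

/* Worker: checks whether it runs on the parent's thread (it should not). */
static void *
sketch_worker(void *arg)
{
	struct sketch_probe *p = arg;

	p->same_as_parent = sketch_thread_equal(sketch_thread_self(), p->parent);
	return p;
}

/* Returns 0 when create and join succeed and the worker ran elsewhere. */
static int
sketch_demo(void)
{
	struct sketch_probe probe = { sketch_thread_self(), -1 };
	sketch_thread_t child;
	void *ret = NULL;

	if (sketch_thread_create(&child, NULL, sketch_worker, &probe) != 0)
		return -1;
	if (sketch_thread_join(child, &ret) != 0 || ret != &probe)
		return -1;
	return probe.same_as_parent == 0 ? 0 : -1;
}
```

A Windows backend would keep the same signatures and fill opaque_id from the value returned by the native thread-creation call, which is the point of hiding the handle behind an integer-sized field.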
An rte_thread_attr_t object can also be set to the default values by calling *rte_thread_attr_init()*. *Priority* is represented through an enum that currently advertises two values for priority: - RTE_THREAD_PRIORITY_NORMAL - RTE_THREAD_PRIORITY_REALTIME_CRITICAL The enum can be extended to allow for multiple priority levels. rte_thread_set_priority - sets the priority of a thread rte_thread_attr_set_priority - updates an rte_thread_attr_t object with a new value for priority The user can choose thread priority through an EAL parameter, when starting an application. If the EAL parameter is not used, the per-platform default value for thread priority is used. Otherwise, the administrator can select one of the available options: --thread-prio normal --thread-prio realtime Example: ./dpdk-l2fwd -l 0-3 -n 4 --thread-prio normal -- -q 8 -p *Affinity* is described by the already known “rte_cpuset_t” type. rte_thread_attr_set/get_affinity - sets/gets the affinity field in a rte_thread_attr_t object rte_thread_set/get_affinity - sets/gets the affinity of a thread **Errors** A translation function that maps Windows error codes to errno-style error codes is provided. **Future work** The long term plan is for EAL to provide full threading support: * Add support for condition variables * Add support for pthread_mutex_trylock * Additional functionality offered by pthread_* (such as pthread_setname_np, etc.) v13: - Fix syntax error in unit tests v12: - Fix freebsd warning about initializer in unit tests v11: - Add unit tests for thread API - Rebase v10: - Remove patch no. 10. It will be broken down in subpatches and sent as a different patchset that depends on this one. This is done due to the ABI breaks that would be caused by patch 10. - Replace unix/rte_thread.c with common/rte_thread.c - Remove initializations that may prevent compiler from issuing useful warnings. 
- Remove rte_thread_types.h and rte_windows_thread_types.h - Remove unneeded priority macros (EAL_THREAD_PRIORITY*) - Remove functions that retrieves thread handle from process handle - Remove rte_thread_cancel() until same behavior is obtained on all platforms. - Fix rte_thread_detach() function description, return value and remove empty line. - Reimplement mutex functions. Add compatible representation for mutex identifier. Add macro to replace static mutex initialization instances. - Fix commit messages (lines too long, remove unicode symbols) v9: - Sign patches v8: - Rebase - Add rte_thread_detach() API - Set default priority, when user did not specify a value v7: Based on DmitryK's review: - Change thread id representation - Change mutex id representation - Implement static mutex inititalizer for Windows - Change barrier identifier representation - Improve comm
[dpdk-dev] [PATCH v13 01/10] eal: add basic threading functions
From: Narcisa Vasile Use a portable, type-safe representation for the thread identifier. Add functions for comparing thread ids and obtaining the thread id for the current thread. Signed-off-by: Narcisa Vasile --- lib/eal/common/meson.build| 1 + lib/eal/{unix => common}/rte_thread.c | 57 --- lib/eal/include/rte_thread.h | 48 +- lib/eal/unix/meson.build | 1 - lib/eal/version.map | 3 ++ lib/eal/windows/rte_thread.c | 17 6 files changed, 95 insertions(+), 32 deletions(-) rename lib/eal/{unix => common}/rte_thread.c (66%) diff --git a/lib/eal/common/meson.build b/lib/eal/common/meson.build index edfca9..eda250247b 100644 --- a/lib/eal/common/meson.build +++ b/lib/eal/common/meson.build @@ -80,6 +80,7 @@ sources += files( 'rte_random.c', 'rte_reciprocal.c', 'rte_service.c', +'rte_thread.c', 'rte_version.c', ) diff --git a/lib/eal/unix/rte_thread.c b/lib/eal/common/rte_thread.c similarity index 66% rename from lib/eal/unix/rte_thread.c rename to lib/eal/common/rte_thread.c index c72d619ec1..92a7451b0a 100644 --- a/lib/eal/unix/rte_thread.c +++ b/lib/eal/common/rte_thread.c @@ -1,5 +1,6 @@ /* SPDX-License-Identifier: BSD-3-Clause * Copyright 2021 Mellanox Technologies, Ltd + * Copyright(c) 2021 Microsoft Corporation */ #include @@ -16,25 +17,41 @@ struct eal_tls_key { pthread_key_t thread_index; }; +rte_thread_t +rte_thread_self(void) +{ + rte_thread_t thread_id; + + thread_id.opaque_id = (uintptr_t)pthread_self(); + + return thread_id; +} + +int +rte_thread_equal(rte_thread_t t1, rte_thread_t t2) +{ + return pthread_equal((pthread_t)t1.opaque_id, (pthread_t)t2.opaque_id); +} + int rte_thread_key_create(rte_thread_key *key, void (*destructor)(void *)) { int err; + rte_thread_key k; - *key = malloc(sizeof(**key)); - if ((*key) == NULL) { + k = malloc(sizeof(*k)); + if (k == NULL) { RTE_LOG(DEBUG, EAL, "Cannot allocate TLS key.\n"); - rte_errno = ENOMEM; - return -1; + return EINVAL; } - err = pthread_key_create(&((*key)->thread_index), destructor); - if (err) { + err = 
pthread_key_create(&(k->thread_index), destructor); + if (err != 0) { RTE_LOG(DEBUG, EAL, "pthread_key_create failed: %s\n", strerror(err)); - free(*key); - rte_errno = ENOEXEC; - return -1; + free(k); + return err; } + *key = k; return 0; } @@ -43,18 +60,16 @@ rte_thread_key_delete(rte_thread_key key) { int err; - if (!key) { + if (key == NULL) { RTE_LOG(DEBUG, EAL, "Invalid TLS key.\n"); - rte_errno = EINVAL; - return -1; + return EINVAL; } err = pthread_key_delete(key->thread_index); - if (err) { + if (err != 0) { RTE_LOG(DEBUG, EAL, "pthread_key_delete failed: %s\n", strerror(err)); free(key); - rte_errno = ENOEXEC; - return -1; + return err; } free(key); return 0; @@ -65,17 +80,15 @@ rte_thread_value_set(rte_thread_key key, const void *value) { int err; - if (!key) { + if (key == NULL) { RTE_LOG(DEBUG, EAL, "Invalid TLS key.\n"); - rte_errno = EINVAL; - return -1; + return EINVAL; } err = pthread_setspecific(key->thread_index, value); - if (err) { + if (err != 0) { RTE_LOG(DEBUG, EAL, "pthread_setspecific failed: %s\n", strerror(err)); - rte_errno = ENOEXEC; - return -1; + return err; } return 0; } @@ -83,7 +96,7 @@ rte_thread_value_set(rte_thread_key key, const void *value) void * rte_thread_value_get(rte_thread_key key) { - if (!key) { + if (key == NULL) { RTE_LOG(DEBUG, EAL, "Invalid TLS key.\n"); rte_errno = EINVAL; return NULL; diff --git a/lib/eal/include/rte_thread.h b/lib/eal/include/rte_thread.h index 8be8ed8f36..748f64d230 100644 --- a/lib/eal/include/rte_thread.h +++ b/lib/eal/include/rte_thread.h @@ -1,6 +1,8 @@ /* SPDX-License-Identifier: BSD-3-Clause * Copyright(c) 2021 Mellanox Technologies, Ltd + * Copyright(c) 2021 Microsoft Corporation */ +#include #include #include @@ -20,11 +22,45 @@ extern "C" { #endif +#include + +/** + * Thread id descriptor. + */ +typedef struct rte_thread_tag { + uintptr_t opaque_id; /**< thread identifier */ +} rte_thread_t; + /** * TLS key type, an opaque pointer. */ typedef struct eal_tls_key *
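The opaque thread identifier introduced by this patch can be illustrated outside of DPDK with plain pthreads. The following is only a minimal sketch under stated assumptions: every `demo_*` name is hypothetical and is not part of the patch, and the cast between `pthread_t` and `uintptr_t` assumes a platform where `pthread_t` is an integer or pointer type (as on Linux).

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* Hypothetical stand-in for the patch's rte_thread_t: an opaque integer id. */
typedef struct demo_thread_tag {
	uintptr_t opaque_id;
} demo_thread_t;

/* Wrap the calling thread's pthread_t in the opaque id. */
demo_thread_t
demo_thread_self(void)
{
	demo_thread_t id;

	id.opaque_id = (uintptr_t)pthread_self();
	return id;
}

/* Non-zero when both ids name the same thread, 0 otherwise. */
int
demo_thread_equal(demo_thread_t t1, demo_thread_t t2)
{
	return pthread_equal((pthread_t)t1.opaque_id, (pthread_t)t2.opaque_id);
}

/* Shared state between the parent and the probing child thread. */
struct demo_probe {
	demo_thread_t parent;
	int same_as_parent;
};

/* Child thread body: compare its own id with the parent's id. */
static void *
demo_probe_main(void *arg)
{
	struct demo_probe *probe = arg;

	assert(probe != NULL);
	probe->same_as_parent =
		demo_thread_equal(probe->parent, demo_thread_self()) != 0;
	return NULL;
}

/* Returns 1 if a child thread saw the parent's id as its own, 0 if not, -1 on error. */
int
demo_child_matches_parent(void)
{
	struct demo_probe probe = { demo_thread_self(), -1 };
	pthread_t child;

	if (pthread_create(&child, NULL, demo_probe_main, &probe) != 0)
		return -1;
	if (pthread_join(child, NULL) != 0)
		return -1;
	return probe.same_as_parent;
}
```

The comparison is done inside the child while both threads are alive, since comparing the id of an already joined thread is not guaranteed to be meaningful.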
[dpdk-dev] [PATCH v13 02/10] eal: add thread attributes
From: Narcisa Vasile Implement thread attributes for: * thread affinity * thread priority Implement functions for managing thread attributes. Priority is represented through an enum that allows for two levels: - RTE_THREAD_PRIORITY_NORMAL - RTE_THREAD_PRIORITY_REALTIME_CRITICAL Affinity is described by the rte_cpuset_t type. An rte_thread_attr_t object can be set to the default values by calling rte_thread_attr_init(). Signed-off-by: Narcisa Vasile --- lib/eal/common/rte_thread.c | 46 ++ lib/eal/include/rte_thread.h | 93 lib/eal/version.map | 4 ++ lib/eal/windows/rte_thread.c | 44 + 4 files changed, 187 insertions(+) diff --git a/lib/eal/common/rte_thread.c b/lib/eal/common/rte_thread.c index 92a7451b0a..e1a4d7eae4 100644 --- a/lib/eal/common/rte_thread.c +++ b/lib/eal/common/rte_thread.c @@ -9,6 +9,7 @@ #include #include +#include #include #include #include @@ -33,6 +34,51 @@ rte_thread_equal(rte_thread_t t1, rte_thread_t t2) return pthread_equal((pthread_t)t1.opaque_id, (pthread_t)t2.opaque_id); } +int +rte_thread_attr_init(rte_thread_attr_t *attr) +{ + RTE_VERIFY(attr != NULL); + + CPU_ZERO(&attr->cpuset); + attr->priority = RTE_THREAD_PRIORITY_NORMAL; + + return 0; +} + +int +rte_thread_attr_set_affinity(rte_thread_attr_t *thread_attr, +rte_cpuset_t *cpuset) +{ + RTE_VERIFY(thread_attr != NULL); + RTE_VERIFY(cpuset != NULL); + + thread_attr->cpuset = *cpuset; + + return 0; +} + +int +rte_thread_attr_get_affinity(rte_thread_attr_t *thread_attr, +rte_cpuset_t *cpuset) +{ + RTE_VERIFY(thread_attr != NULL); + RTE_VERIFY(cpuset != NULL); + + *cpuset = thread_attr->cpuset; + + return 0; +} + +int +rte_thread_attr_set_priority(rte_thread_attr_t *thread_attr, +enum rte_thread_priority priority) +{ + RTE_VERIFY(thread_attr != NULL); + + thread_attr->priority = priority; + return 0; +} + int rte_thread_key_create(rte_thread_key *key, void (*destructor)(void *)) { diff --git a/lib/eal/include/rte_thread.h b/lib/eal/include/rte_thread.h index 748f64d230..032ff73b36 100644 
--- a/lib/eal/include/rte_thread.h +++ b/lib/eal/include/rte_thread.h @@ -31,6 +31,30 @@ typedef struct rte_thread_tag { uintptr_t opaque_id; /**< thread identifier */ } rte_thread_t; +/** + * Thread priority values. + */ +enum rte_thread_priority { + RTE_THREAD_PRIORITY_UNDEFINED = 0, + /**< priority hasn't been defined */ + RTE_THREAD_PRIORITY_NORMAL= 1, + /**< normal thread priority, the default */ + RTE_THREAD_PRIORITY_REALTIME_CRITICAL = 2, + /**< highest thread priority allowed */ +}; + +#ifdef RTE_HAS_CPUSET + +/** + * Representation for thread attributes. + */ +typedef struct { + enum rte_thread_priority priority; /**< thread priority */ + rte_cpuset_t cpuset; /**< thread affinity */ +} rte_thread_attr_t; + +#endif /* RTE_HAS_CPUSET */ + /** * TLS key type, an opaque pointer. */ @@ -63,6 +87,75 @@ int rte_thread_equal(rte_thread_t t1, rte_thread_t t2); #ifdef RTE_HAS_CPUSET +/** + * Initialize the attributes of a thread. + * These attributes can be passed to the rte_thread_create() function + * that will create a new thread and set its attributes according to attr. + * + * @param attr + * Thread attributes to initialize. + * + * @return + * On success, return 0. + * On failure, return a positive errno-style error number. + */ +__rte_experimental +int rte_thread_attr_init(rte_thread_attr_t *attr); + +/** + * Set the CPU affinity value in the thread attributes pointed to + * by 'thread_attr'. + * + * @param thread_attr + * Points to the thread attributes in which affinity will be updated. + * + * @param cpuset + * Points to the value of the affinity to be set. + * + * @return + * On success, return 0. + * On failure, return a positive errno-style error number. + */ +__rte_experimental +int rte_thread_attr_set_affinity(rte_thread_attr_t *thread_attr, + rte_cpuset_t *cpuset); + +/** + * Get the value of CPU affinity that is set in the thread attributes pointed + * to by 'thread_attr'. 
+ * + * @param thread_attr + * Points to the thread attributes from which affinity will be retrieved. + * + * @param cpuset + * Pointer to the memory that will store the affinity. + * + * @return + * On success, return 0. + * On failure, return a positive errno-style error number. + */ +__rte_experimental +int rte_thread_attr_get_affinity(rte_thread_attr_t *thread_attr, + rte_cpuset_t *cpuset); + +/** + * Set the thread priority value in the thread attributes pointed to + * by 'thread_attr'. + * + * @param thread_attr + * Points to the thread attributes in which priority will be updated. + * + * @param priority + * Po
[dpdk-dev] [PATCH v13 03/10] eal/windows: translate Windows errors to errno-style errors
From: Narcisa Vasile Add function to translate Windows error codes to errno-style error codes. The possible return values are chosen so that we have as much semantical compatibility between platforms as possible. Signed-off-by: Narcisa Vasile --- lib/eal/common/rte_thread.c | 6 +-- lib/eal/include/rte_thread.h | 5 +- lib/eal/windows/rte_thread.c | 95 +++- 3 files changed, 76 insertions(+), 30 deletions(-) diff --git a/lib/eal/common/rte_thread.c b/lib/eal/common/rte_thread.c index e1a4d7eae4..27ad1c7eb0 100644 --- a/lib/eal/common/rte_thread.c +++ b/lib/eal/common/rte_thread.c @@ -47,7 +47,7 @@ rte_thread_attr_init(rte_thread_attr_t *attr) int rte_thread_attr_set_affinity(rte_thread_attr_t *thread_attr, -rte_cpuset_t *cpuset) + rte_cpuset_t *cpuset) { RTE_VERIFY(thread_attr != NULL); RTE_VERIFY(cpuset != NULL); @@ -59,7 +59,7 @@ rte_thread_attr_set_affinity(rte_thread_attr_t *thread_attr, int rte_thread_attr_get_affinity(rte_thread_attr_t *thread_attr, -rte_cpuset_t *cpuset) + rte_cpuset_t *cpuset) { RTE_VERIFY(thread_attr != NULL); RTE_VERIFY(cpuset != NULL); @@ -71,7 +71,7 @@ rte_thread_attr_get_affinity(rte_thread_attr_t *thread_attr, int rte_thread_attr_set_priority(rte_thread_attr_t *thread_attr, -enum rte_thread_priority priority) + enum rte_thread_priority priority) { RTE_VERIFY(thread_attr != NULL); diff --git a/lib/eal/include/rte_thread.h b/lib/eal/include/rte_thread.h index 032ff73b36..bf649c2fe6 100644 --- a/lib/eal/include/rte_thread.h +++ b/lib/eal/include/rte_thread.h @@ -235,9 +235,8 @@ int rte_thread_value_set(rte_thread_key key, const void *value); * * @return * On success, value data pointer (can also be NULL). - * On failure, NULL and an error number is set in rte_errno. - * rte_errno can be: EINVAL - Invalid parameter passed. - * ENOEXEC - Specific OS error. + * On failure, NULL and a positive error number is set in rte_errno. 
+ * */ __rte_experimental void *rte_thread_value_get(rte_thread_key key); diff --git a/lib/eal/windows/rte_thread.c b/lib/eal/windows/rte_thread.c index 01966e7745..c1ecfbd6ae 100644 --- a/lib/eal/windows/rte_thread.c +++ b/lib/eal/windows/rte_thread.c @@ -13,6 +13,54 @@ struct eal_tls_key { DWORD thread_index; }; +/* Translates the most common error codes related to threads */ +static int +thread_translate_win32_error(DWORD error) +{ + switch (error) { + case ERROR_SUCCESS: + return 0; + + case ERROR_INVALID_PARAMETER: + return EINVAL; + + case ERROR_INVALID_HANDLE: + return EFAULT; + + case ERROR_NOT_ENOUGH_MEMORY: + /* FALLTHROUGH */ + case ERROR_NO_SYSTEM_RESOURCES: + return ENOMEM; + + case ERROR_PRIVILEGE_NOT_HELD: + /* FALLTHROUGH */ + case ERROR_ACCESS_DENIED: + return EACCES; + + case ERROR_ALREADY_EXISTS: + return EEXIST; + + case ERROR_POSSIBLE_DEADLOCK: + return EDEADLK; + + case ERROR_INVALID_FUNCTION: + /* FALLTHROUGH */ + case ERROR_CALL_NOT_IMPLEMENTED: + return ENOSYS; + } + + return EINVAL; +} + +static int +thread_log_last_error(const char *message) +{ + DWORD error = GetLastError(); + RTE_LOG(DEBUG, EAL, "GetLastError()=%lu: %s\n", error, message); + + return thread_translate_win32_error(error); +} + rte_thread_t rte_thread_self(void) { @@ -42,7 +90,7 @@ rte_thread_attr_init(rte_thread_attr_t *attr) int rte_thread_attr_set_affinity(rte_thread_attr_t *thread_attr, -rte_cpuset_t *cpuset) + rte_cpuset_t *cpuset) { RTE_VERIFY(thread_attr != NULL); thread_attr->cpuset = *cpuset; @@ -52,7 +100,7 @@ rte_thread_attr_set_affinity(rte_thread_attr_t *thread_attr, int rte_thread_attr_get_affinity(rte_thread_attr_t *thread_attr, -rte_cpuset_t *cpuset) + rte_cpuset_t *cpuset) { RTE_VERIFY(thread_attr != NULL); @@ -63,7 +111,7 @@ rte_thread_attr_get_affinity(rte_thread_attr_t *thread_attr, int rte_thread_attr_set_priority(rte_thread_attr_t *thread_attr, -enum rte_thread_priority priority) + enum rte_thread_priority priority) { RTE_VERIFY(thread_attr != NULL); 
@@ -76,18 +124,18 @@ int rte_thread_key_create(rte_thread_key *key, __rte_unused void (*destructor)(void *)) { + int ret; + *key = malloc(sizeof(**key)); if ((*key) == NULL) { RTE_LOG(DEBUG, EAL, "Cannot allocate TLS key.\n"); - rte_errno = ENOMEM; - return -1; + return ENOMEM; } (*key)->thread_index = TlsAlloc();
[dpdk-dev] [PATCH v13 04/10] eal: implement functions for thread affinity management
From: Narcisa Vasile Implement functions for getting/setting thread affinity. Threads can be pinned to specific cores by setting their affinity attribute. Signed-off-by: Narcisa Vasile Signed-off-by: Dmitry Malloy --- lib/eal/common/rte_thread.c | 16 lib/eal/include/rte_thread.h | 36 +++ lib/eal/version.map | 2 + lib/eal/windows/eal_lcore.c | 176 +- lib/eal/windows/eal_windows.h | 10 ++ lib/eal/windows/rte_thread.c | 125 +++- 6 files changed, 319 insertions(+), 46 deletions(-) diff --git a/lib/eal/common/rte_thread.c b/lib/eal/common/rte_thread.c index 27ad1c7eb0..73b7b3141c 100644 --- a/lib/eal/common/rte_thread.c +++ b/lib/eal/common/rte_thread.c @@ -34,6 +34,22 @@ rte_thread_equal(rte_thread_t t1, rte_thread_t t2) return pthread_equal((pthread_t)t1.opaque_id, (pthread_t)t2.opaque_id); } +int +rte_thread_set_affinity_by_id(rte_thread_t thread_id, + const rte_cpuset_t *cpuset) +{ + return pthread_setaffinity_np((pthread_t)thread_id.opaque_id, + sizeof(*cpuset), cpuset); +} + +int +rte_thread_get_affinity_by_id(rte_thread_t thread_id, + rte_cpuset_t *cpuset) +{ + return pthread_getaffinity_np((pthread_t)thread_id.opaque_id, + sizeof(*cpuset), cpuset); +} + int rte_thread_attr_init(rte_thread_attr_t *attr) { diff --git a/lib/eal/include/rte_thread.h b/lib/eal/include/rte_thread.h index bf649c2fe6..ca4ade60e2 100644 --- a/lib/eal/include/rte_thread.h +++ b/lib/eal/include/rte_thread.h @@ -87,6 +87,42 @@ int rte_thread_equal(rte_thread_t t1, rte_thread_t t2); #ifdef RTE_HAS_CPUSET +/** + * Set the affinity of thread 'thread_id' to the cpu set + * specified by 'cpuset'. + * + * @param thread_id + *Id of the thread for which to set the affinity. + * + * @param cpuset + * Pointer to CPU affinity to set. + * + * @return + * On success, return 0. + * On failure, return a positive errno-style error number. 
+ */ +__rte_experimental +int rte_thread_set_affinity_by_id(rte_thread_t thread_id, + const rte_cpuset_t *cpuset); + +/** + * Get the affinity of thread 'thread_id' and store it + * in 'cpuset'. + * + * @param thread_id + *Id of the thread for which to get the affinity. + * + * @param cpuset + * Pointer for storing the affinity value. + * + * @return + * On success, return 0. + * On failure, return a positive errno-style error number. + */ +__rte_experimental +int rte_thread_get_affinity_by_id(rte_thread_t thread_id, + rte_cpuset_t *cpuset); + /** * Initialize the attributes of a thread. * These attributes can be passed to the rte_thread_create() function diff --git a/lib/eal/version.map b/lib/eal/version.map index 9ffa5eb15e..7ed4cd779e 100644 --- a/lib/eal/version.map +++ b/lib/eal/version.map @@ -433,6 +433,8 @@ EXPERIMENTAL { rte_thread_attr_get_affinity; rte_thread_attr_set_affinity; rte_thread_attr_set_priority; + rte_thread_get_affinity_by_id; + rte_thread_set_affinity_by_id; }; INTERNAL { diff --git a/lib/eal/windows/eal_lcore.c b/lib/eal/windows/eal_lcore.c index 476c2d2bdf..295af50698 100644 --- a/lib/eal/windows/eal_lcore.c +++ b/lib/eal/windows/eal_lcore.c @@ -2,7 +2,6 @@ * Copyright(c) 2019 Intel Corporation */ -#include #include #include @@ -27,13 +26,15 @@ struct socket_map { }; struct cpu_map { - unsigned int socket_count; unsigned int lcore_count; + unsigned int socket_count; + unsigned int cpu_count; struct lcore_map lcores[RTE_MAX_LCORE]; struct socket_map sockets[RTE_MAX_NUMA_NODES]; + GROUP_AFFINITY cpus[CPU_SETSIZE]; }; -static struct cpu_map cpu_map = { 0 }; +static struct cpu_map cpu_map; /* eal_create_cpu_map() is called before logging is initialized */ static void @@ -47,13 +48,118 @@ log_early(const char *format, ...) 
va_end(va); } +static int +eal_query_group_affinity(void) +{ + SYSTEM_LOGICAL_PROCESSOR_INFORMATION_EX *infos = NULL; + unsigned int *cpu_count = &cpu_map.cpu_count; + DWORD infos_size = 0; + int ret = 0; + USHORT group_count; + KAFFINITY affinity; + USHORT group_no; + unsigned int i; + + if (!GetLogicalProcessorInformationEx(RelationGroup, NULL, + &infos_size)) { + DWORD error = GetLastError(); + if (error != ERROR_INSUFFICIENT_BUFFER) { + log_early("Cannot get group information size, " + "error %lu\n", error); + rte_errno = EINVAL; + ret = -1; + goto cleanup; + } + } + + infos = malloc(infos_size); + if (infos == NULL) { + log_early("Cannot allocate memory for NUMA node information\n"); + rte_errno = ENOMEM; +
[dpdk-dev] [PATCH v13 06/10] eal: add thread lifetime management
From: Narcisa Vasile

Add functions for thread creation, joining, detaching.

The *rte_thread_create()* function can optionally receive
an rte_thread_attr_t object that will cause the thread to be created
with the affinity and priority described by the attributes object.
If no rte_thread_attr_t is passed (parameter is NULL),
the default affinity and priority are used.

On Windows, the function executed by a thread when the thread starts is
represented by a function pointer of type DWORD (*func) (void*).
On other platforms, the function pointer is a void* (*func) (void*).

Performing a cast between these two types of function pointers to
uniformize the API on all platforms may result in undefined behavior.
To fix this issue, a wrapper that respects the signature required by
CreateThread() has been created on Windows.

Signed-off-by: Narcisa Vasile
---
 lib/eal/common/rte_thread.c     | 107 +
 lib/eal/include/rte_thread.h    |  55 +
 lib/eal/version.map             |   3 +
 lib/eal/windows/include/sched.h |   2 +-
 lib/eal/windows/rte_thread.c    | 138 
 5 files changed, 304 insertions(+), 1 deletion(-)

diff --git a/lib/eal/common/rte_thread.c b/lib/eal/common/rte_thread.c
index fcebf7097c..a0a51bc190 100644
--- a/lib/eal/common/rte_thread.c
+++ b/lib/eal/common/rte_thread.c
@@ -144,6 +144,113 @@ rte_thread_attr_set_priority(rte_thread_attr_t *thread_attr,
 	return 0;
 }
 
+int
+rte_thread_create(rte_thread_t *thread_id,
+		  const rte_thread_attr_t *thread_attr,
+		  rte_thread_func thread_func, void *args)
+{
+	int ret = 0;
+	pthread_attr_t attr;
+	pthread_attr_t *attrp = NULL;
+	struct sched_param param = {
+		.sched_priority = 0,
+	};
+	int policy = SCHED_OTHER;
+
+	if (thread_attr != NULL) {
+		ret = pthread_attr_init(&attr);
+		if (ret != 0) {
+			RTE_LOG(DEBUG, EAL, "pthread_attr_init failed\n");
+			goto cleanup;
+		}
+
+		attrp = &attr;
+
+		if (thread_attr->priority != RTE_THREAD_PRIORITY_UNDEFINED) {
+			/*
+			 * Set the inherit scheduler parameter to explicit,
+			 * otherwise the priority attribute is ignored.
+			 */
+			ret = pthread_attr_setinheritsched(attrp,
+					PTHREAD_EXPLICIT_SCHED);
+			if (ret != 0) {
+				RTE_LOG(DEBUG, EAL, "pthread_attr_setinheritsched failed\n");
+				goto cleanup;
+			}
+
+			ret = thread_map_priority_to_os_value(
+					thread_attr->priority,
+					&param.sched_priority, &policy
+					);
+			if (ret != 0)
+				goto cleanup;
+
+			ret = pthread_attr_setschedpolicy(attrp, policy);
+			if (ret != 0) {
+				RTE_LOG(DEBUG, EAL, "pthread_attr_setschedpolicy failed\n");
+				goto cleanup;
+			}
+
+			ret = pthread_attr_setschedparam(attrp, &param);
+			if (ret != 0) {
+				RTE_LOG(DEBUG, EAL, "pthread_attr_setschedparam failed\n");
+				goto cleanup;
+			}
+		}
+
+		if (CPU_COUNT(&thread_attr->cpuset) > 0) {
+			ret = pthread_attr_setaffinity_np(attrp,
+					sizeof(thread_attr->cpuset),
+					&thread_attr->cpuset);
+			if (ret != 0) {
+				RTE_LOG(DEBUG, EAL, "pthread_attr_setaffinity_np failed\n");
+				goto cleanup;
+			}
+		}
+	}
+
+	ret = pthread_create((pthread_t *)&thread_id->opaque_id, attrp,
+			thread_func, args);
+	if (ret != 0) {
+		RTE_LOG(DEBUG, EAL, "pthread_create failed\n");
+		goto cleanup;
+	}
+
+cleanup:
+	if (attrp != NULL)
+		pthread_attr_destroy(&attr);
+
+	return ret;
+}
+
+int
+rte_thread_join(rte_thread_t thread_id, unsigned long *value_ptr)
+{
+	int ret = 0;
+	void *res = NULL;
+	void **pres = NULL;
+
+	if (value_ptr != NULL)
+		pres = &res;
+
+	ret = pthread_join((pthread_t)thread_id.opaque_id, pres);
+	if (ret != 0) {
+		RTE_LOG(DEBUG, EAL, "pthread_join failed\n");
+		return ret;
+	}
+
+	if (pres != NULL)
+		*value_ptr = *(unsigned long *)(*pres);
+
+	return 0;
+}
+
+i
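The commit message above mentions a wrapper that matches the start-routine signature required by the OS instead of casting between function pointer types. A rough, pthread-based sketch of that trampoline idea follows. It is not the Windows code from this patch: all `demo_*` names are hypothetical, and the portable entry-point type returning `unsigned long` is an assumption made for the illustration.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical portable entry-point type: returns an integer exit value. */
typedef unsigned long (*demo_thread_func)(void *);

/* Heap context carrying the user routine, its argument and its result. */
struct demo_thread_ctx {
	demo_thread_func func;
	void *arg;
	unsigned long result;
};

/*
 * Has exactly the signature pthread_create() expects, so the user routine
 * is called through its own type and no function-pointer cast is needed.
 */
static void *
demo_trampoline(void *p)
{
	struct demo_thread_ctx *ctx = p;

	ctx->result = ctx->func(ctx->arg);
	return ctx;
}

/* Returns 0 on success or a positive errno-style value on failure. */
int
demo_thread_create(pthread_t *tid, demo_thread_func func, void *arg)
{
	struct demo_thread_ctx *ctx;
	int ret;

	assert(tid != NULL && func != NULL);

	ctx = malloc(sizeof(*ctx));
	if (ctx == NULL)
		return ENOMEM;
	ctx->func = func;
	ctx->arg = arg;
	ctx->result = 0;

	ret = pthread_create(tid, NULL, demo_trampoline, ctx);
	if (ret != 0)
		free(ctx);
	return ret;
}

/* Joins the thread, copies its result out and releases the context. */
int
demo_thread_join(pthread_t tid, unsigned long *value)
{
	void *res = NULL;
	struct demo_thread_ctx *ctx;
	int ret;

	ret = pthread_join(tid, &res);
	if (ret != 0)
		return ret;

	/* This sketch never cancels threads, so res is always the context. */
	ctx = res;
	if (value != NULL)
		*value = ctx->result;
	free(ctx);
	return 0;
}

/* Example user routine: doubles the value it is given. */
unsigned long
demo_double(void *arg)
{
	return 2UL * *(unsigned long *)arg;
}

/* Runs demo_double() on its own thread; returns 0 if anything fails. */
unsigned long
demo_run_double(unsigned long in)
{
	pthread_t tid;
	unsigned long out = 0;

	if (demo_thread_create(&tid, demo_double, &in) != 0)
		return 0;
	if (demo_thread_join(tid, &out) != 0)
		return 0;
	return out;
}
```

Passing the result back inside the context, rather than squeezing it into the `void *` return value, keeps the value's type intact on platforms where pointers and `unsigned long` differ in size.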
[dpdk-dev] [PATCH v13 05/10] eal: implement thread priority management functions
From: Narcisa Vasile

Add function for setting the priority for a thread.
Priorities on multiple platforms are similarly determined by
a priority value and a priority class/policy.

On Linux, the following mapping is created:
RTE_THREAD_PRIORITY_NORMAL corresponds to
* policy SCHED_OTHER
* priority value: (sched_get_priority_min(SCHED_OTHER) +
                   sched_get_priority_max(SCHED_OTHER))/2;
RTE_THREAD_PRIORITY_REALTIME_CRITICAL corresponds to
* policy SCHED_RR
* priority value: sched_get_priority_max(SCHED_RR);

On Windows, the following mapping is created:
RTE_THREAD_PRIORITY_NORMAL corresponds to
* class NORMAL_PRIORITY_CLASS
* priority THREAD_PRIORITY_NORMAL
RTE_THREAD_PRIORITY_REALTIME_CRITICAL corresponds to
* class REALTIME_PRIORITY_CLASS
* priority THREAD_PRIORITY_TIME_CRITICAL

Signed-off-by: Narcisa Vasile
---
 lib/eal/common/rte_thread.c  | 49 ++
 lib/eal/include/rte_thread.h | 17 ++
 lib/eal/version.map          |  1 +
 lib/eal/windows/rte_thread.c | 66 
 4 files changed, 133 insertions(+)

diff --git a/lib/eal/common/rte_thread.c b/lib/eal/common/rte_thread.c
index 73b7b3141c..fcebf7097c 100644
--- a/lib/eal/common/rte_thread.c
+++ b/lib/eal/common/rte_thread.c
@@ -50,6 +50,55 @@ rte_thread_get_affinity_by_id(rte_thread_t thread_id,
 			sizeof(*cpuset), cpuset);
 }
 
+static int
+thread_map_priority_to_os_value(enum rte_thread_priority eal_pri,
+		int *os_pri, int *pol)
+{
+	/* Clear the output parameters */
+	*os_pri = sched_get_priority_min(SCHED_OTHER) - 1;
+	*pol = -1;
+
+	switch (eal_pri) {
+	case RTE_THREAD_PRIORITY_NORMAL:
+		*pol = SCHED_OTHER;
+
+		/*
+		 * Choose the middle of the range to represent
+		 * the priority 'normal'.
+		 * On Linux, this should be 0, since both
+		 * sched_get_priority_min/_max return 0 for SCHED_OTHER.
+		 */
+		*os_pri = (sched_get_priority_min(SCHED_OTHER) +
+			sched_get_priority_max(SCHED_OTHER))/2;
+		break;
+	case RTE_THREAD_PRIORITY_REALTIME_CRITICAL:
+		*pol = SCHED_RR;
+		*os_pri = sched_get_priority_max(SCHED_RR);
+		break;
+	default:
+		RTE_LOG(DEBUG, EAL, "The requested priority value is invalid.\n");
+		return EINVAL;
+	}
+	return 0;
+}
+
+int
+rte_thread_set_priority(rte_thread_t thread_id,
+		enum rte_thread_priority priority)
+{
+	int ret;
+	int policy;
+	struct sched_param param;
+
+	ret = thread_map_priority_to_os_value(priority, &param.sched_priority,
+			&policy);
+	if (ret != 0)
+		return ret;
+
+	return pthread_setschedparam((pthread_t)thread_id.opaque_id,
+			policy, &param);
+}
+
 int
 rte_thread_attr_init(rte_thread_attr_t *attr)
 {
diff --git a/lib/eal/include/rte_thread.h b/lib/eal/include/rte_thread.h
index ca4ade60e2..5514b2f57f 100644
--- a/lib/eal/include/rte_thread.h
+++ b/lib/eal/include/rte_thread.h
@@ -215,6 +215,23 @@ void rte_thread_get_affinity(rte_cpuset_t *cpusetp);
 
 #endif /* RTE_HAS_CPUSET */
 
+/**
+ * Set the priority of a thread.
+ *
+ * @param thread_id
+ *   Id of the thread for which to set priority.
+ *
+ * @param priority
+ *   Priority value to be set.
+ *
+ * @return
+ *   On success, return 0.
+ *   On failure, return a positive errno-style error number.
+ */
+__rte_experimental
+int rte_thread_set_priority(rte_thread_t thread_id,
+		enum rte_thread_priority priority);
+
 /**
  * Create a TLS data key visible to all threads in the process.
  * the created key is later used to get/set a value.
diff --git a/lib/eal/version.map b/lib/eal/version.map
index 7ed4cd779e..df01e4 100644
--- a/lib/eal/version.map
+++ b/lib/eal/version.map
@@ -435,6 +435,7 @@ EXPERIMENTAL {
 	rte_thread_attr_set_priority;
 	rte_thread_get_affinity_by_id;
 	rte_thread_set_affinity_by_id;
+	rte_thread_set_priority;
 };
 
 INTERNAL {
diff --git a/lib/eal/windows/rte_thread.c b/lib/eal/windows/rte_thread.c
index 0127119f49..fb04718f58 100644
--- a/lib/eal/windows/rte_thread.c
+++ b/lib/eal/windows/rte_thread.c
@@ -200,6 +200,72 @@ rte_thread_get_affinity_by_id(rte_thread_t thread_id,
 	return ret;
 }
 
+static int
+thread_map_priority_to_os_value(enum rte_thread_priority eal_pri,
+		int *os_pri, int *pri_class)
+{
+	/* Clear the output parameters */
+	*os_pri = -1;
+	*pri_class = -1;
+
+	switch (eal_pri) {
+	case RTE_THREAD_PRIORITY_NORMAL:
+		*pri_class = NORMAL_PRIORITY_CLASS;
+		*os_pri = THREAD_PRIORITY_NORMAL;
+		break;
+	case RTE_THREAD_PRIORITY_REALTIME_CRITICAL:
+		*pri_class = REALTIME_PRIORITY_CLASS;
+		*os_pri = THREAD_PRIORITY
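The POSIX side of the priority mapping described in this commit message can be tried in isolation with a small standalone program. This is a sketch under assumptions, not the patch's code: the `demo_*` names and the `DEMO_PRIORITY_*` enum are hypothetical stand-ins, and only the policy/priority lookup is shown, since actually applying SCHED_RR usually requires elevated privileges.

```c
#include <assert.h>
#include <errno.h>
#include <sched.h>
#include <stddef.h>

/* Hypothetical stand-in for the patch's enum rte_thread_priority. */
enum demo_thread_priority {
	DEMO_PRIORITY_NORMAL = 1,
	DEMO_PRIORITY_REALTIME_CRITICAL = 2,
};

/*
 * Map a portable priority level to a POSIX scheduling policy and value.
 * Returns 0 on success or EINVAL for an unknown level.
 */
int
demo_map_priority(enum demo_thread_priority pri, int *os_pri, int *policy)
{
	assert(os_pri != NULL && policy != NULL);

	switch (pri) {
	case DEMO_PRIORITY_NORMAL:
		/* Middle of the SCHED_OTHER range. */
		*policy = SCHED_OTHER;
		*os_pri = (sched_get_priority_min(SCHED_OTHER) +
			sched_get_priority_max(SCHED_OTHER)) / 2;
		return 0;
	case DEMO_PRIORITY_REALTIME_CRITICAL:
		/* Top of the SCHED_RR range. */
		*policy = SCHED_RR;
		*os_pri = sched_get_priority_max(SCHED_RR);
		return 0;
	default:
		return EINVAL;
	}
}

/*
 * Returns 1 when the level maps to a value inside the range that the OS
 * reports for the chosen policy, 0 when the level is rejected or the
 * value is out of range.
 */
int
demo_priority_in_range(enum demo_thread_priority pri)
{
	int os_pri;
	int policy;

	if (demo_map_priority(pri, &os_pri, &policy) != 0)
		return 0;

	return os_pri >= sched_get_priority_min(policy) &&
		os_pri <= sched_get_priority_max(policy);
}
```

Querying the range at run time, instead of hard-coding numbers, is what lets the same mapping work on systems whose SCHED_OTHER range is not the single value 0.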
[dpdk-dev] [PATCH v13 07/10] eal: implement functions for mutex management
From: Narcisa Vasile Add functions for mutex init, destroy, lock, unlock. Add RTE_STATIC_MUTEX macro to replace static initialization of mutexes. Windows does not have a static initializer. Initialization is only done through InitializeCriticalSection(). The RTE_STATIC_MUTEX calls into the rte_thread_mutex_init() function that performs the actual mutex initialization. Signed-off-by: Narcisa Vasile --- lib/eal/common/rte_thread.c | 61 +++ lib/eal/include/rte_thread.h | 94 lib/eal/version.map | 4 ++ lib/eal/windows/rte_thread.c | 53 4 files changed, 212 insertions(+) diff --git a/lib/eal/common/rte_thread.c b/lib/eal/common/rte_thread.c index a0a51bc190..ebae4a8af1 100644 --- a/lib/eal/common/rte_thread.c +++ b/lib/eal/common/rte_thread.c @@ -251,6 +251,67 @@ rte_thread_detach(rte_thread_t thread_id) return pthread_detach((pthread_t)thread_id.opaque_id); } +int +rte_thread_mutex_init(rte_thread_mutex *mutex) +{ + int ret = 0; + pthread_mutex_t *m = NULL; + + RTE_VERIFY(mutex != NULL); + + m = calloc(1, sizeof(*m)); + if (m == NULL) { + RTE_LOG(DEBUG, EAL, "Unable to initialize mutex. Insufficient memory!\n"); + ret = ENOMEM; + goto cleanup; + } + + ret = pthread_mutex_init(m, NULL); + if (ret != 0) { + RTE_LOG(DEBUG, EAL, "Failed to init mutex. 
ret = %d\n", ret); + goto cleanup; + } + + mutex->mutex_id = m; + m = NULL; + +cleanup: + free(m); + return ret; +} + +int +rte_thread_mutex_lock(rte_thread_mutex *mutex) +{ + RTE_VERIFY(mutex != NULL); + + return pthread_mutex_lock((pthread_mutex_t *)mutex->mutex_id); +} + +int +rte_thread_mutex_unlock(rte_thread_mutex *mutex) +{ + RTE_VERIFY(mutex != NULL); + + return pthread_mutex_unlock((pthread_mutex_t *)mutex->mutex_id); +} + +int +rte_thread_mutex_destroy(rte_thread_mutex *mutex) +{ + int ret = 0; + RTE_VERIFY(mutex != NULL); + + ret = pthread_mutex_destroy((pthread_mutex_t *)mutex->mutex_id); + if (ret != 0) + RTE_LOG(DEBUG, EAL, "Unable to destroy mutex, ret = %d\n", ret); + + free(mutex->mutex_id); + mutex->mutex_id = NULL; + + return ret; +} + int rte_thread_key_create(rte_thread_key *key, void (*destructor)(void *)) { diff --git a/lib/eal/include/rte_thread.h b/lib/eal/include/rte_thread.h index 098c3ba343..7e813b573d 100644 --- a/lib/eal/include/rte_thread.h +++ b/lib/eal/include/rte_thread.h @@ -56,6 +56,26 @@ typedef struct { #endif /* RTE_HAS_CPUSET */ +#define RTE_DECLARE_MUTEX(private_lock) rte_thread_mutex private_lock + +#define RTE_DEFINE_MUTEX(private_lock)\ +RTE_INIT(__rte_ ## private_lock ## _init)\ +{\ + RTE_VERIFY(rte_thread_mutex_init(&private_lock) == 0);\ +} + +#define RTE_STATIC_MUTEX(private_lock)\ +static RTE_DECLARE_MUTEX(private_lock);\ +RTE_DEFINE_MUTEX(private_lock) + + +/** + * Thread mutex representation. + */ +typedef struct rte_thread_mutex_tag { + void *mutex_id; /**< mutex identifier */ +} rte_thread_mutex; + /** * TLS key type, an opaque pointer. */ @@ -268,6 +288,28 @@ int rte_thread_join(rte_thread_t thread_id, unsigned long *value_ptr); __rte_experimental int rte_thread_detach(rte_thread_t thread_id); +/** + * Set core affinity of the current thread. + * Support both EAL and non-EAL thread and update TLS. + * + * @param cpusetp + * Pointer to CPU affinity to set. 
+ * + * @return + * On success, return 0; otherwise return -1; + */ +int rte_thread_set_affinity(rte_cpuset_t *cpusetp); + +/** + * Get core affinity of the current thread. + * + * @param cpusetp + * Pointer to CPU affinity of current thread. + * It presumes input is not NULL, otherwise it causes panic. + * + */ +void rte_thread_get_affinity(rte_cpuset_t *cpusetp); + #endif /* RTE_HAS_CPUSET */ /** @@ -287,6 +329,58 @@ __rte_experimental int rte_thread_set_priority(rte_thread_t thread_id, enum rte_thread_priority priority); +/** + * Initializes a mutex. + * + * @param mutex + *The mutex to be initialized. + * + * @return + * On success, return 0. + * On failure, return a positive errno-style error number. + */ +__rte_experimental +int rte_thread_mutex_init(rte_thread_mutex *mutex); + +/** + * Locks a mutex. + * + * @param mutex + *The mutex to be locked. + * + * @return + * On success, return 0. + * On failure, return a positive errno-style error number. + */ +__rte_experimental +int rte_thread_mutex_lock(rte_thread_mutex *mutex); + +/** + * Unlocks a mutex. + * + * @param mutex + *The mutex to be unlocked. + * + * @return + * On success, return 0. + * On failure, return a positive errno-style error number. + */ +__rte_experimental +int rte_thread_mutex_unlock(rte_thread_mutex *mutex); + +/** + * Releases all resources associated with a
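The RTE_STATIC_MUTEX idea from this patch (declare the lock at file scope and initialize it from a constructor that runs before main()) can be sketched without DPDK as shown here. The `DEMO_*` and `demo_*` names are hypothetical; the sketch uses the GCC/Clang `constructor` attribute directly, which is the compiler mechanism DPDK's RTE_INIT is built on, and it aborts on failure where the patch uses RTE_VERIFY.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for the patch's rte_thread_mutex. */
typedef struct demo_mutex_tag {
	void *mutex_id; /* points to a heap-allocated pthread_mutex_t */
} demo_mutex;

/* Allocate and initialize the underlying mutex; 0 or errno-style value. */
int
demo_mutex_init(demo_mutex *mutex)
{
	pthread_mutex_t *m;
	int ret;

	assert(mutex != NULL);

	m = calloc(1, sizeof(*m));
	if (m == NULL)
		return ENOMEM;

	ret = pthread_mutex_init(m, NULL);
	if (ret != 0) {
		free(m);
		return ret;
	}

	mutex->mutex_id = m;
	return 0;
}

/*
 * Define a file-scope mutex plus a constructor that initializes it
 * before main() starts, so no static initializer is required.
 */
#define DEMO_STATIC_MUTEX(name) \
static demo_mutex name; \
static void __attribute__((constructor)) demo_ ## name ## _init(void) \
{ \
	if (demo_mutex_init(&name) != 0) \
		abort(); \
}

DEMO_STATIC_MUTEX(demo_lock)

static int demo_counter;

/* Increment a shared counter under the statically defined lock. */
int
demo_increment(void)
{
	int value;

	pthread_mutex_lock(demo_lock.mutex_id);
	value = ++demo_counter;
	pthread_mutex_unlock(demo_lock.mutex_id);

	return value;
}
```

Because the constructor has already run by the time main() is entered, callers can use the lock without an explicit init call, which is the property a static initializer normally provides.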
[dpdk-dev] [PATCH v13 08/10] eal: implement functions for thread barrier management
From: Narcisa Vasile

Add functions for barrier init, destroy, wait.

A portable type is used to represent a barrier identifier.
The rte_thread_barrier_wait() function returns the same value
on all platforms.

Signed-off-by: Narcisa Vasile
---
 lib/eal/common/rte_thread.c  | 61
 lib/eal/include/rte_thread.h | 58 ++
 lib/eal/version.map          | 3 ++
 lib/eal/windows/rte_thread.c | 56 +
 4 files changed, 178 insertions(+)

diff --git a/lib/eal/common/rte_thread.c b/lib/eal/common/rte_thread.c
index ebae4a8af1..3fdb267337 100644
--- a/lib/eal/common/rte_thread.c
+++ b/lib/eal/common/rte_thread.c
@@ -312,6 +312,67 @@ rte_thread_mutex_destroy(rte_thread_mutex *mutex)
 	return ret;
 }
 
+int
+rte_thread_barrier_init(rte_thread_barrier *barrier, int count)
+{
+	int ret = 0;
+	pthread_barrier_t *pthread_barrier = NULL;
+
+	RTE_VERIFY(barrier != NULL);
+	RTE_VERIFY(count > 0);
+
+	pthread_barrier = calloc(1, sizeof(*pthread_barrier));
+	if (pthread_barrier == NULL) {
+		RTE_LOG(DEBUG, EAL, "Unable to initialize barrier. Insufficient memory!\n");
+		ret = ENOMEM;
+		goto cleanup;
+	}
+	ret = pthread_barrier_init(pthread_barrier, NULL, count);
+	if (ret != 0) {
+		RTE_LOG(DEBUG, EAL, "Failed to init barrier, ret = %d\n", ret);
+		goto cleanup;
+	}
+
+	barrier->barrier_id = pthread_barrier;
+	pthread_barrier = NULL;
+
+cleanup:
+	free(pthread_barrier);
+	return ret;
+}
+
+int
+rte_thread_barrier_wait(rte_thread_barrier *barrier)
+{
+	int ret = 0;
+
+	RTE_VERIFY(barrier != NULL);
+	RTE_VERIFY(barrier->barrier_id != NULL);
+
+	ret = pthread_barrier_wait(barrier->barrier_id);
+	if (ret == PTHREAD_BARRIER_SERIAL_THREAD)
+		ret = RTE_THREAD_BARRIER_SERIAL_THREAD;
+
+	return ret;
+}
+
+int
+rte_thread_barrier_destroy(rte_thread_barrier *barrier)
+{
+	int ret = 0;
+
+	RTE_VERIFY(barrier != NULL);
+
+	ret = pthread_barrier_destroy(barrier->barrier_id);
+	if (ret != 0)
+		RTE_LOG(DEBUG, EAL, "Failed to destroy barrier: %d\n", ret);
+
+	free(barrier->barrier_id);
+	barrier->barrier_id = NULL;
+
+	return ret;
+}
+
 int
 rte_thread_key_create(rte_thread_key *key, void (*destructor)(void *))
 {
diff --git a/lib/eal/include/rte_thread.h b/lib/eal/include/rte_thread.h
index 7e813b573d..40da83467b 100644
--- a/lib/eal/include/rte_thread.h
+++ b/lib/eal/include/rte_thread.h
@@ -76,6 +76,18 @@ typedef struct rte_thread_mutex_tag {
 	void *mutex_id; /**< mutex identifier */
 } rte_thread_mutex;
 
+/**
+ * Returned by rte_thread_barrier_wait() when call is successful.
+ */
+#define RTE_THREAD_BARRIER_SERIAL_THREAD -1
+
+/**
+ * Thread barrier representation.
+ */
+typedef struct rte_thread_barrier_tag {
+	void *barrier_id; /**< barrier identifier */
+} rte_thread_barrier;
+
 /**
  * TLS key type, an opaque pointer.
  */
@@ -381,6 +393,52 @@ int rte_thread_mutex_unlock(rte_thread_mutex *mutex);
 __rte_experimental
 int rte_thread_mutex_destroy(rte_thread_mutex *mutex);
 
+/**
+ * Initializes a synchronization barrier.
+ *
+ * @param barrier
+ *    A pointer that references the newly created 'barrier' object.
+ *
+ * @param count
+ *    The number of threads that must enter the barrier before
+ *    the threads can continue execution.
+ *
+ * @return
+ *   On success, return 0.
+ *   On failure, return a positive errno-style error number.
+ */
+__rte_experimental
+int rte_thread_barrier_init(rte_thread_barrier *barrier, int count);
+
+/**
+ * Causes the calling thread to wait at the synchronization barrier 'barrier'.
+ *
+ * @param barrier
+ *    The barrier used for synchronizing the threads.
+ *
+ * @return
+ *   Return RTE_THREAD_BARRIER_SERIAL_THREAD for the thread synchronized
+ *      at the barrier.
+ *   Return 0 for all other threads.
+ *   Return a positive errno-style error number, in case of failure.
+ */
+__rte_experimental
+int rte_thread_barrier_wait(rte_thread_barrier *barrier);
+
+/**
+ * Releases all resources used by a synchronization barrier
+ * and uninitializes it.
+ *
+ * @param barrier
+ *    The barrier to be destroyed.
+ *
+ * @return
+ *   On success, return 0.
+ *   On failure, return a positive errno-style error number.
+ */
+__rte_experimental
+int rte_thread_barrier_destroy(rte_thread_barrier *barrier);
+
 /**
  * Create a TLS data key visible to all threads in the process.
  * the created key is later used to get/set a value.
diff --git a/lib/eal/version.map b/lib/eal/version.map
index a1c7a8e87d..c081fdd96c 100644
--- a/lib/eal/version.map
+++ b/lib/eal/version.map
@@ -443,6 +443,9 @@ EXPERIMENTAL {
 	rte_thread_mutex_lock;
 	rte_thread_mutex_unlock;
 	rte_thread_mutex_destroy;
+	rte_thread_barrier_init;
+
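For readers following the barrier patch, here is a minimal standalone sketch of the return-value contract it documents: exactly one waiter gets the fixed "serial thread" value, the rest get 0, and failures are positive errno-style numbers. It uses plain POSIX barriers and does not depend on DPDK. All `demo_*` and `DEMO_*` names are hypothetical stand-ins, not symbols from the patch, and on older toolchains it may need to be built with `-pthread`.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for RTE_THREAD_BARRIER_SERIAL_THREAD. */
#define DEMO_BARRIER_SERIAL_THREAD -1
#define DEMO_MAX_THREADS 16

/* Opaque handle: barrier_id points to a heap-allocated pthread_barrier_t. */
struct demo_barrier {
	void *barrier_id;
};

static int
demo_barrier_init(struct demo_barrier *b, int count)
{
	pthread_barrier_t *pb;
	int ret;

	assert(b != NULL && count > 0);
	pb = calloc(1, sizeof(*pb));
	if (pb == NULL)
		return ENOMEM;
	ret = pthread_barrier_init(pb, NULL, (unsigned int)count);
	if (ret != 0) {
		free(pb);
		return ret;
	}
	b->barrier_id = pb;
	return 0;
}

static int
demo_barrier_wait(struct demo_barrier *b)
{
	int ret;

	assert(b != NULL && b->barrier_id != NULL);
	ret = pthread_barrier_wait(b->barrier_id);
	/* POSIX does not fix the value of PTHREAD_BARRIER_SERIAL_THREAD,
	 * so map it to one constant that is the same on every platform. */
	if (ret == PTHREAD_BARRIER_SERIAL_THREAD)
		ret = DEMO_BARRIER_SERIAL_THREAD;
	return ret;
}

static int
demo_barrier_destroy(struct demo_barrier *b)
{
	int ret;

	assert(b != NULL);
	ret = pthread_barrier_destroy(b->barrier_id);
	free(b->barrier_id);
	b->barrier_id = NULL;
	return ret;
}

struct demo_worker_arg {
	struct demo_barrier *barrier;
	int result; /* value returned by demo_barrier_wait() */
};

static void *
demo_worker(void *arg)
{
	struct demo_worker_arg *w = arg;

	w->result = demo_barrier_wait(w->barrier);
	return NULL;
}

/* Run nthreads waiters and return how many of them saw the serial value,
 * or -1 on a setup or teardown error. */
static int
demo_count_serial(int nthreads)
{
	pthread_t tids[DEMO_MAX_THREADS];
	struct demo_worker_arg args[DEMO_MAX_THREADS];
	struct demo_barrier barrier;
	int serial = 0;
	int i;

	if (nthreads < 1 || nthreads > DEMO_MAX_THREADS)
		return -1;
	if (demo_barrier_init(&barrier, nthreads) != 0)
		return -1;
	for (i = 0; i < nthreads; i++) {
		args[i].barrier = &barrier;
		args[i].result = 0;
		/* A partial set of waiters would block forever, so give up. */
		if (pthread_create(&tids[i], NULL, demo_worker, &args[i]) != 0)
			abort();
	}
	for (i = 0; i < nthreads; i++) {
		pthread_join(tids[i], NULL);
		if (args[i].result == DEMO_BARRIER_SERIAL_THREAD)
			serial++;
	}
	if (demo_barrier_destroy(&barrier) != 0)
		return -1;
	return serial;
}
```

Whatever the thread count, `demo_count_serial()` should report exactly one serial thread. The negative constant keeps the success marker distinct from 0 and from positive error numbers, which is why a caller can test for failure with `ret > 0`.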
[dpdk-dev] [PATCH v13 10/10] Add unit tests for thread API
From: Narcisa Vasile

As a new API for threading is introduced, a set of unit tests
has been added to test the new interface.

Signed-off-by: Narcisa Vasile
---
 app/test/meson.build    | 2 +
 app/test/test_threads.c | 419
 2 files changed, 421 insertions(+)
 create mode 100644 app/test/test_threads.c

diff --git a/app/test/meson.build b/app/test/meson.build
index a7611686ad..57e61ce601 100644
--- a/app/test/meson.build
+++ b/app/test/meson.build
@@ -140,6 +140,7 @@ test_sources = files(
         'test_table_tables.c',
         'test_tailq.c',
         'test_thash.c',
+        'test_threads.c',
         'test_timer.c',
         'test_timer_perf.c',
         'test_timer_racecond.c',
@@ -276,6 +277,7 @@ fast_tests = [
         ['reorder_autotest', true],
         ['service_autotest', true],
         ['thash_autotest', true],
+        ['threads_autotest', true],
         ['trace_autotest', true],
 ]
 
diff --git a/app/test/test_threads.c b/app/test/test_threads.c
new file mode 100644
index 00..beaa303506
--- /dev/null
+++ b/app/test/test_threads.c
@@ -0,0 +1,419 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright (c) 2021 Microsoft.
+ */
+
+#include
+
+#include
+
+#include "test.h"
+
+#define THREADS_COUNT 20
+
+#define TEST_THREADS_LOG(func) \
+	printf("Error at line %d. %s failed!\n", __LINE__, func)
+
+static void *
+thread_loop_self(void *arg)
+{
+	rte_thread_t *id = arg;
+
+	*id = rte_thread_self();
+
+	return NULL;
+}
+
+static int
+test_thread_self(void)
+{
+	rte_thread_t threads_ids[THREADS_COUNT];
+	rte_thread_t self_ids[THREADS_COUNT] = {};
+	size_t i;
+	size_t j;
+	int ret = 0;
+
+	for (i = 0; i < THREADS_COUNT; ++i) {
+		if (rte_thread_create(&threads_ids[i], NULL, thread_loop_self,
+				&self_ids[i]) != 0) {
+			printf("Error, Only %zu threads created.\n", i);
+			break;
+		}
+	}
+
+	for (j = 0; j < i; ++j) {
+		ret = rte_thread_join(threads_ids[j], NULL);
+		if (ret != 0) {
+			TEST_THREADS_LOG("rte_thread_join()");
+			return -1;
+		}
+
+		if (rte_thread_equal(threads_ids[j], self_ids[j]) == 0)
+			ret = -1;
+	}
+
+	return ret;
+}
+
+struct thread_context {
+	rte_thread_barrier *barrier;
+	size_t *thread_count;
+};
+
+static void *
+thread_loop_barrier(void *arg)
+{
+
+	struct thread_context *ctx = arg;
+
+	(void)__atomic_add_fetch(ctx->thread_count, 1, __ATOMIC_RELAXED);
+
+	if (rte_thread_barrier_wait(ctx->barrier) > 0)
+		TEST_THREADS_LOG("rte_thread_barrier_wait()");
+
+	return NULL;
+}
+
+static int
+test_thread_barrier(void)
+{
+	rte_thread_t threads_ids[THREADS_COUNT];
+	struct thread_context ctx[THREADS_COUNT] = {};
+	rte_thread_barrier barrier;
+	size_t count = 0;
+	size_t i;
+	size_t j;
+	int ret = 0;
+
+	ret = rte_thread_barrier_init(&barrier, THREADS_COUNT + 1);
+	if (ret != 0) {
+		TEST_THREADS_LOG("rte_thread_barrier_init()");
+		return -1;
+	}
+
+	for (i = 0; i < THREADS_COUNT; ++i) {
+		ctx[i].thread_count = &count;
+		ctx[i].barrier = &barrier;
+		if (rte_thread_create(&threads_ids[i], NULL,
+				thread_loop_barrier, &ctx[i]) != 0) {
+			printf("Error, Only %zu threads created.\n", i);
+			ret = -1;
+			goto error;
+		}
+	}
+
+	ret = rte_thread_barrier_wait(ctx->barrier);
+	if (ret > 0) {
+		TEST_THREADS_LOG("rte_thread_barrier_wait()");
+		ret = -1;
+		goto error;
+	}
+
+	if (count != i) {
+		ret = -1;
+		printf("Error, expected thread count(%zu) to be equal "
+			"to the number of threads that wait at the barrier(%zu)\n",
+			count, i);
+		goto error;
+	}
+
+error:
+	for (j = 0; j < i; ++j) {
+		ret = rte_thread_join(threads_ids[j], NULL);
+		if (ret != 0) {
+			TEST_THREADS_LOG("rte_thread_join()");
+			ret = -1;
+			break;
+		}
+	}
+
+	ret = rte_thread_barrier_destroy(&barrier);
+	if (ret != 0) {
+		TEST_THREADS_LOG("rte_thread_barrier_destroy()");
+		ret = -1;
+	}
+
+	return ret;
+}
+
+static size_t val;
+
+static void *
+thread_loop_mutex(void *arg)
+{
+	rte_thread_mutex *mutex = arg;
+
+	rte_thread_mutex_lock(mutex);
+	val++;
+	rte_thread_mutex_unlock(mutex);
+
+	return NULL;
+}
+
+static int
+test_threa
[dpdk-dev] [PATCH v13 09/10] eal: add EAL argument for setting thread priority
From: Narcisa Vasile

Allow the user to choose the thread priority through an EAL
command line argument when starting an application.

If the EAL parameter is not used, the per-platform default value
for thread priority is used. Otherwise, one of the following
options can be set:

 --thread-prio normal
 --thread-prio realtime

Example:
./dpdk-l2fwd -l 0-3 -n 4 --thread-prio normal -- -q 8 -p

Signed-off-by: Narcisa Vasile
---
 lib/eal/common/eal_common_options.c | 28 +++-
 lib/eal/common/eal_internal_cfg.h   | 2 ++
 lib/eal/common/eal_options.h        | 2 ++
 3 files changed, 31 insertions(+), 1 deletion(-)

diff --git a/lib/eal/common/eal_common_options.c b/lib/eal/common/eal_common_options.c
index ff5861b5f3..9d29696b84 100644
--- a/lib/eal/common/eal_common_options.c
+++ b/lib/eal/common/eal_common_options.c
@@ -107,6 +107,7 @@ eal_long_options[] = {
 	{OPT_TELEMETRY, 0, NULL, OPT_TELEMETRY_NUM},
 	{OPT_NO_TELEMETRY, 0, NULL, OPT_NO_TELEMETRY_NUM },
 	{OPT_FORCE_MAX_SIMD_BITWIDTH, 1, NULL, OPT_FORCE_MAX_SIMD_BITWIDTH_NUM},
+	{OPT_THREAD_PRIORITY, 1, NULL, OPT_THREAD_PRIORITY_NUM},
 
 	/* legacy options that will be removed in future */
 	{OPT_PCI_BLACKLIST, 1, NULL, OPT_PCI_BLACKLIST_NUM},
@@ -1412,6 +1413,24 @@ eal_parse_simd_bitwidth(const char *arg)
 	return 0;
 }
 
+static int
+eal_parse_thread_priority(const char *arg)
+{
+	struct internal_config *internal_conf =
+		eal_get_internal_configuration();
+	enum rte_thread_priority priority;
+
+	if (!strncmp("normal", arg, sizeof("normal")))
+		priority = RTE_THREAD_PRIORITY_NORMAL;
+	else if (!strncmp("realtime", arg, sizeof("realtime")))
+		priority = RTE_THREAD_PRIORITY_REALTIME_CRITICAL;
+	else
+		return -1;
+
+	internal_conf->thread_priority = priority;
+	return 0;
+}
+
 static int
 eal_parse_base_virtaddr(const char *arg)
 {
@@ -1825,7 +1844,13 @@ eal_parse_common_option(int opt, const char *optarg,
 			return -1;
 		}
 		break;
-
+	case OPT_THREAD_PRIORITY_NUM:
+		if (eal_parse_thread_priority(optarg) < 0) {
+			RTE_LOG(ERR, EAL, "invalid parameter for --"
+				OPT_THREAD_PRIORITY "\n");
+			return -1;
+		}
+		break;
 	/* don't know what to do, leave this to caller */
 	default:
 		return 1;
@@ -2088,6 +2113,7 @@ eal_common_usage(void)
 	       " (can be used multiple times)\n"
 	       " --"OPT_VMWARE_TSC_MAP"Use VMware TSC map instead of native RDTSC\n"
 	       " --"OPT_PROC_TYPE" Type of this process (primary|secondary|auto)\n"
+	       " --"OPT_THREAD_PRIORITY" Set threads priority (normal|realtime)\n"
 #ifndef RTE_EXEC_ENV_WINDOWS
 	       " --"OPT_SYSLOG"Set syslog facility\n"
 #endif
diff --git a/lib/eal/common/eal_internal_cfg.h b/lib/eal/common/eal_internal_cfg.h
index d6c0470eb8..b2996cd65b 100644
--- a/lib/eal/common/eal_internal_cfg.h
+++ b/lib/eal/common/eal_internal_cfg.h
@@ -94,6 +94,8 @@ struct internal_config {
 	unsigned int no_telemetry; /**< true to disable Telemetry */
 	struct simd_bitwidth max_simd_bitwidth;
 	/**< max simd bitwidth path to use */
+	enum rte_thread_priority thread_priority;
+	/**< thread priority to configure */
 };
 
 void eal_reset_internal_config(struct internal_config *internal_cfg);
diff --git a/lib/eal/common/eal_options.h b/lib/eal/common/eal_options.h
index 7b348e707f..9f5b209f64 100644
--- a/lib/eal/common/eal_options.h
+++ b/lib/eal/common/eal_options.h
@@ -93,6 +93,8 @@ enum {
 	OPT_NO_TELEMETRY_NUM,
 #define OPT_FORCE_MAX_SIMD_BITWIDTH "force-max-simd-bitwidth"
 	OPT_FORCE_MAX_SIMD_BITWIDTH_NUM,
+#define OPT_THREAD_PRIORITY "thread-prio"
+	OPT_THREAD_PRIORITY_NUM,
 
 	/* legacy option that will be removed in future */
 #define OPT_PCI_BLACKLIST "pci-blacklist"
--
2.31.0.vfs.0.1
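As an illustration of the option parsing in this patch, a minimal standalone sketch follows. The patch compares with strncmp() and a length of sizeof("normal"), which includes the terminating NUL and therefore amounts to an exact match, so the sketch uses strcmp() for the same effect. The enum and function names are hypothetical and do not exist in DPDK. The NULL check is an addition of the sketch, not something the patch does.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for enum rte_thread_priority. */
enum demo_thread_priority {
	DEMO_THREAD_PRIORITY_NORMAL,
	DEMO_THREAD_PRIORITY_REALTIME
};

/* Parse the value given to --thread-prio.
 * Returns 0 and stores the priority in *out on success,
 * -1 if the value is missing or not one of the accepted words. */
static int
demo_parse_thread_priority(const char *arg, enum demo_thread_priority *out)
{
	assert(out != NULL);

	if (arg == NULL)
		return -1;

	/* Exact, case-sensitive match: "normalx" or "Normal" are rejected. */
	if (strcmp(arg, "normal") == 0)
		*out = DEMO_THREAD_PRIORITY_NORMAL;
	else if (strcmp(arg, "realtime") == 0)
		*out = DEMO_THREAD_PRIORITY_REALTIME;
	else
		return -1;

	return 0;
}
```

The output parameter is left untouched on failure, so a caller can keep its per-platform default when the argument is rejected or absent.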
[dpdk-dev] [PATCH] net/mlx5: fix find sibling devices
The routine mlx5_eth_find_next() and the related iterating macro
MLX5_ETH_FOREACH_DEV are used to iterate through sibling devices
(all representors share the same configuration and switching domain)
on top of the specified root device.

The root device parameter was specified as NULL, so siblings were
missed in the iteration during representor device probing, causing:

1. allocation of a new domain_id for the device being probed.
2. discrepancies in representor configurations and potential
   overall driver malfunctions.

Fixes: 56bb3c84e982 ("net/mlx5: reduce PCI dependency")

Signed-off-by: Gregory Etelson
Acked-by: Viacheslav Ovsiienko
---
 drivers/net/mlx5/linux/mlx5_os.c   | 2 +-
 drivers/net/mlx5/mlx5.c            | 5 +++--
 drivers/net/mlx5/mlx5.h            | 3 ++-
 drivers/net/mlx5/windows/mlx5_os.c | 2 +-
 4 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/drivers/net/mlx5/linux/mlx5_os.c b/drivers/net/mlx5/linux/mlx5_os.c
index eeeca27ac2..5f8766aa48 100644
--- a/drivers/net/mlx5/linux/mlx5_os.c
+++ b/drivers/net/mlx5/linux/mlx5_os.c
@@ -1314,7 +1314,7 @@ mlx5_dev_spawn(struct rte_device *dpdk_dev,
 	}
 	/* Override some values set by hardware configuration. */
 	mlx5_args(config, dpdk_dev->devargs);
-	err = mlx5_dev_check_sibling_config(priv, config);
+	err = mlx5_dev_check_sibling_config(priv, config, dpdk_dev);
 	if (err)
 		goto error;
 	config->hw_csum = !!(sh->device_attr.device_cap_flags_ex &
diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 90990ffdc2..f84e061fe7 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -2297,7 +2297,8 @@ rte_pmd_mlx5_get_dyn_flag_names(char *names[], unsigned int n)
  */
 int
 mlx5_dev_check_sibling_config(struct mlx5_priv *priv,
-			      struct mlx5_dev_config *config)
+			      struct mlx5_dev_config *config,
+			      struct rte_device *dpdk_dev)
 {
 	struct mlx5_dev_ctx_shared *sh = priv->sh;
 	struct mlx5_dev_config *sh_conf = NULL;
@@ -2308,7 +2309,7 @@ mlx5_dev_check_sibling_config(struct mlx5_priv *priv,
 	if (sh->refcnt == 1)
 		return 0;
 	/* Find the device with shared context. */
-	MLX5_ETH_FOREACH_DEV(port_id, NULL) {
+	MLX5_ETH_FOREACH_DEV(port_id, dpdk_dev) {
 		struct mlx5_priv *opriv =
 			rte_eth_devices[port_id].data->dev_private;
 
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index 34d66e93ad..e02714e231 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -1503,7 +1503,8 @@ void mlx5_set_min_inline(struct mlx5_dev_spawn_data *spawn,
 			 struct mlx5_dev_config *config);
 void mlx5_set_metadata_mask(struct rte_eth_dev *dev);
 int mlx5_dev_check_sibling_config(struct mlx5_priv *priv,
-				  struct mlx5_dev_config *config);
+				  struct mlx5_dev_config *config,
+				  struct rte_device *dpdk_dev);
 int mlx5_dev_configure(struct rte_eth_dev *dev);
 int mlx5_dev_infos_get(struct rte_eth_dev *dev, struct rte_eth_dev_info *info);
 int mlx5_fw_version_get(struct rte_eth_dev *dev, char *fw_ver, size_t fw_size);
diff --git a/drivers/net/mlx5/windows/mlx5_os.c b/drivers/net/mlx5/windows/mlx5_os.c
index 5a18f538bc..5518bc3e76 100644
--- a/drivers/net/mlx5/windows/mlx5_os.c
+++ b/drivers/net/mlx5/windows/mlx5_os.c
@@ -430,7 +430,7 @@ mlx5_dev_spawn(struct rte_device *dpdk_dev,
 	 * Look for sibling devices in order to reuse their switch domain
 	 * if any, otherwise allocate one.
 	 */
-	MLX5_ETH_FOREACH_DEV(port_id, NULL) {
+	MLX5_ETH_FOREACH_DEV(port_id, dpdk_dev) {
 		const struct mlx5_priv *opriv =
 			rte_eth_devices[port_id].data->dev_private;
 
--
2.32.0
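To make the failure mode in the commit message concrete, here is a toy model of a sibling lookup that is keyed on a root device. It is not the mlx5 implementation. The `demo_*` types and functions are hypothetical, and the matching rule (plain pointer equality with the root device) is an assumption made for illustration only; the real mlx5_eth_find_next() applies its own checks. Under that assumption, a NULL root matches no port, so no sibling is found and the caller falls back to allocating a new domain.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for rte_device and a port entry. */
struct demo_device {
	const char *name;
};

struct demo_port {
	const struct demo_device *device; /* parent (root) device */
	int domain_id;                    /* switch domain of this port */
};

/* Return the first index >= start whose parent device is 'root',
 * or nports when there is no such port. */
static size_t
demo_find_next(const struct demo_port *ports, size_t nports, size_t start,
	       const struct demo_device *root)
{
	size_t i;

	for (i = start; i < nports; i++) {
		if (ports[i].device != NULL && ports[i].device == root)
			return i;
	}
	return nports;
}

/* Reuse the domain of the first sibling on 'root'.
 * Returns -1 when no sibling exists, meaning "allocate a new domain". */
static int
demo_sibling_domain(const struct demo_port *ports, size_t nports,
		    const struct demo_device *root)
{
	size_t i = demo_find_next(ports, nports, 0, root);

	if (i == nports)
		return -1;
	return ports[i].domain_id;
}
```

In this model, passing the probed device as the root finds the existing sibling and its domain, while passing NULL reports no sibling at all, which corresponds to the first symptom listed in the commit message.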