RE: [RFC] ethdev: introduce entropy calculation
Hi

> -----Original Message-----
> From: Thomas Monjalon
> Sent: Thursday, January 4, 2024 8:19 PM
>
> 04/01/2024 15:33, Ori Kam:
> > Hi Cristian,
> >
> > > From: Dumitrescu, Cristian
> > > Sent: Thursday, January 4, 2024 2:57 PM
> > > > > >>
> > > > > >> And unless this is specifically defined as 'entropy' in the spec,
> > > > > >> I am too for a rename.
> > > > > >>
> > > > > >> At least in the VXLAN spec, it is mentioned that this field is to
> > > > > >> "enable a level of entropy", but it is not exactly named entropy.
> > > > > >
> > > > > > Exactly my thought about the naming.
> > > > > > Good to see I am not alone thinking this naming is disturbing :)
> > > > >
> > > > > I'd avoid usage of the term "entropy" in this patch. It is very
> > > > > confusing.
> > > >
> > > > What about rte_flow_calc_encap_hash?
> > > >
> > > How about simply rte_flow_calc_hash? My understanding is this is a
> > > general-purpose hash that is not limited to encapsulation work.
> >
> > Unfortunately, this is not a general-purpose hash. HW may implement a
> > different hash for each use case. Also, the hash result length differs
> > depending on the feature and even the target field.
> >
> > We can take your naming idea and change the parameters a bit:
> > rte_flow_calc_hash(port, feature, *attribute, pattern, hash_len, *hash)
> >
> > For the feature we will have at this point:
> > NVGRE_HASH,
> > SPORT_HASH
> >
> > The attribute parameter will be empty for now, but it may be used later
> > to add extra information for the hash if more information is required,
> > for example, some key.
> > In addition, we will also be able to merge the current function
> > rte_flow_calc_table_hash, if we pass the missing parameters (table id,
> > template id) in the attribute field.
> >
> > What do you think?
>
> I like the idea of having a single function for HW hashes.
> Is there an impact on performance? How much is it sensitive?

It is sensitive, since we expect this function to be called for each packet.
This may not be such a great idea. So, back to square one and keep
rte_flow_calc_encap_hash.
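For illustration, a hypothetical prototype matching the parameters proposed
in this thread (port, feature, attribute, pattern, hash_len, hash) might
look as below. All names here are illustrative only; as the thread
concludes, none of this is a committed DPDK API.

#include <stdint.h>
#include <rte_flow.h>

/* Hypothetical sketch of the generic hash API floated above. */
enum rte_flow_hash_feature {
	RTE_FLOW_HASH_FEATURE_NVGRE,  /* hash placed in the NVGRE flow ID */
	RTE_FLOW_HASH_FEATURE_SPORT,  /* hash placed in the outer UDP source port */
};

int
rte_flow_calc_hash(uint16_t port_id,
		   enum rte_flow_hash_feature feature,
		   const void *attribute,                /* reserved, NULL for now */
		   const struct rte_flow_item pattern[], /* fields feeding the hash */
		   uint8_t hash_len,                     /* result size in bytes */
		   uint8_t *hash,                        /* output buffer */
		   struct rte_flow_error *error);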
[PATCH v10] gro: fix reordering of packets in GRO layer
In the current implementation, when a packet is received with special TCP
flag(s) set, only that packet is delivered out of order. There could be
already-coalesced packets in the GRO table belonging to the same flow but
not yet delivered. This fix makes sure that the entire segment is delivered
with the special flag(s) set, which is how Linux GRO is also implemented.

Signed-off-by: Kumara Parameshwaran
Co-authored-by: Kumara Parameshwaran
---
If the received packet is not a pure ACK packet, we check if there are any
previous packets in the flow. If present, we include the received packet in
the coalescing logic as well and apply the flags of the last received
packet to the entire segment, which avoids re-ordering.

Take a case where P1(PSH), P2(ACK), P3(ACK) are received in burst mode. P1
contains the PSH flag, and since there are no prior packets in the flow, it
is copied to unprocess_packets, while P2(ACK) and P3(ACK) are merged
together. In the existing code, P2 and P3 would be delivered as a single
segment first and the unprocess_packets copied later, which causes
reordering. With this patch, the unprocessed packets are copied first, and
then the packets from the GRO table.

Testing done:
The csum test-pmd was modified to support the following: a GET request of
10MB from client to server via test-pmd (static ARP entries added in client
and server). GRO and TSO were enabled in test-pmd, where packets received
from the client MAC are sent to the server MAC and vice versa. In the above
testing, without the patch the client observed re-ordering of 25 packets,
and with the patch no packet re-ordering was observed.

v2:
- Fix warnings in commit and comment.
- Do not consider packet as candidate to merge if it contains SYN/RST flag.
v3:
- Fix warnings.
v4:
- Rebase with master.
v5:
- Adding co-author email.
v6:
- Address review comments from the maintainer to restructure the code and
  handle only special flags PSH, FIN.
v7:
- Fix warnings and errors.
v8:
- Fix warnings and errors.
v9:
- Fix commit message.
v10:
- Update tcp header flags and address review comments.

 lib/gro/gro_tcp.h          |  9
 lib/gro/gro_tcp4.c         | 46 --
 lib/gro/gro_tcp_internal.h |  2 +-
 lib/gro/gro_vxlan_tcp4.c   |  5 +++--
 4 files changed, 47 insertions(+), 15 deletions(-)

diff --git a/lib/gro/gro_tcp.h b/lib/gro/gro_tcp.h
index d926c4b8cc..2c68b5f23e 100644
--- a/lib/gro/gro_tcp.h
+++ b/lib/gro/gro_tcp.h
@@ -19,6 +19,8 @@
 #define INVALID_TCP_HDRLEN(len) \
 	(((len) < sizeof(struct rte_tcp_hdr)) || ((len) > MAX_TCP_HLEN))
 
+#define VALID_GRO_TCP_FLAGS (RTE_TCP_ACK_FLAG | RTE_TCP_PSH_FLAG | RTE_TCP_FIN_FLAG)
+
 struct cmn_tcp_key {
 	struct rte_ether_addr eth_saddr;
 	struct rte_ether_addr eth_daddr;
@@ -81,11 +83,13 @@ merge_two_tcp_packets(struct gro_tcp_item *item,
 		struct rte_mbuf *pkt,
 		int cmp,
 		uint32_t sent_seq,
+		uint8_t tcp_flags,
 		uint16_t ip_id,
 		uint16_t l2_offset)
 {
 	struct rte_mbuf *pkt_head, *pkt_tail, *lastseg;
 	uint16_t hdr_len, l2_len;
+	struct rte_tcp_hdr *tcp_hdr;
 
 	if (cmp > 0) {
 		pkt_head = item->firstseg;
@@ -128,6 +132,11 @@ merge_two_tcp_packets(struct gro_tcp_item *item,
 	/* update MBUF metadata for the merged packet */
 	pkt_head->nb_segs += pkt_tail->nb_segs;
 	pkt_head->pkt_len += pkt_tail->pkt_len;
+	if (tcp_flags != RTE_TCP_ACK_FLAG) {
+		tcp_hdr = rte_pktmbuf_mtod_offset(pkt, struct rte_tcp_hdr *,
+				l2_offset + pkt_head->l2_len + pkt_head->l3_len);
+		tcp_hdr->tcp_flags |= tcp_flags;
+	}
 
 	return 1;
 }
diff --git a/lib/gro/gro_tcp4.c b/lib/gro/gro_tcp4.c
index 6645de592b..707cd050da 100644
--- a/lib/gro/gro_tcp4.c
+++ b/lib/gro/gro_tcp4.c
@@ -126,6 +126,7 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
 	uint32_t item_idx;
 	uint32_t i, max_flow_num, remaining_flow_num;
 	uint8_t find;
+	uint32_t item_start_idx;
 
 	/*
 	 * Don't process the packet whose TCP header length is greater
@@ -139,11 +140,8 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
 	tcp_hdr = (struct rte_tcp_hdr *)((char *)ipv4_hdr + pkt->l3_len);
 	hdr_len = pkt->l2_len + pkt->l3_len + pkt->l4_len;
 
-	/*
-	 * Don't process the packet which has FIN, SYN, RST, PSH, URG, ECE
-	 * or CWR set.
-	 */
-	if (tcp_hdr->tcp_flags != RTE_TCP_ACK_FLAG)
+	/* Return early if the TCP flags are not handled in GRO layer */
+	if (tcp_hdr->tcp_flags & (~(VALID_GRO_TCP_FLAGS)))
 		return -1;
 
 	/* trim the tail padding
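To illustrate the delivery order this patch establishes, here is a sketch
(not the committed code) of the burst path: packets GRO could not process
(e.g. P1 carrying PSH) are copied to the output array before the coalesced
segments flushed from the GRO table, so the wire order P1, P2+P3 is kept.
gro_tcp4_tbl_timeout_flush() is the library's internal flush routine; the
wrapper name and layout here are illustrative.

#include <string.h>
#include <rte_mbuf.h>
#include <rte_gro.h>

/* Declaration of the GRO library's internal flush routine (gro_tcp4.h). */
uint16_t gro_tcp4_tbl_timeout_flush(void *tbl, uint64_t flush_timestamp,
				    struct rte_mbuf **out, uint16_t nb_out);

static uint16_t
deliver_in_order(struct rte_mbuf **out, uint16_t nb_out,
		 struct rte_mbuf **unprocess_pkts, uint16_t unprocess_num,
		 void *tcp_tbl, uint64_t flush_timestamp)
{
	uint16_t num;

	/* unprocessed packets (special TCP flags) go out first */
	memcpy(out, unprocess_pkts, sizeof(out[0]) * unprocess_num);
	num = unprocess_num;

	/* then the segments coalesced in the GRO table */
	num += gro_tcp4_tbl_timeout_flush(tcp_tbl, flush_timestamp,
					  &out[num], nb_out - num);
	return num;
}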
[PATCH v11] gro: fix reordering of packets in GRO layer
In the current implementation, when a packet is received with special TCP
flag(s) set, only that packet is delivered out of order. There could be
already-coalesced packets in the GRO table belonging to the same flow but
not yet delivered. This fix makes sure that the entire segment is delivered
with the special flag(s) set, which is how Linux GRO is also implemented.

Signed-off-by: Kumara Parameshwaran
---
If the received packet is not a pure ACK packet, we check if there are any
previous packets in the flow. If present, we include the received packet in
the coalescing logic as well and apply the flags of the last received
packet to the entire segment, which avoids re-ordering.

Take a case where P1(PSH), P2(ACK), P3(ACK) are received in burst mode. P1
contains the PSH flag, and since there are no prior packets in the flow, it
is copied to unprocess_packets, while P2(ACK) and P3(ACK) are merged
together. In the existing code, P2 and P3 would be delivered as a single
segment first and the unprocess_packets copied later, which causes
reordering. With this patch, the unprocessed packets are copied first, and
then the packets from the GRO table.

Testing done:
The csum test-pmd was modified to support the following: a GET request of
10MB from client to server via test-pmd (static ARP entries added in client
and server). GRO and TSO were enabled in test-pmd, where packets received
from the client MAC are sent to the server MAC and vice versa. In the above
testing, without the patch the client observed re-ordering of 25 packets,
and with the patch no packet re-ordering was observed.

v2:
- Fix warnings in commit and comment.
- Do not consider packet as candidate to merge if it contains SYN/RST flag.
v3:
- Fix warnings.
v4:
- Rebase with master.
v5:
- Adding co-author email.
v6:
- Address review comments from the maintainer to restructure the code and
  handle only special flags PSH, FIN.
v7:
- Fix warnings and errors.
v8:
- Fix warnings and errors.
v9:
- Fix commit message.
v10:
- Update tcp header flags and address review comments.
v11:
- Fix warnings.

 lib/gro/gro_tcp.h          |  9
 lib/gro/gro_tcp4.c         | 46 --
 lib/gro/gro_tcp_internal.h |  2 +-
 lib/gro/gro_vxlan_tcp4.c   |  5 +++--
 4 files changed, 47 insertions(+), 15 deletions(-)

diff --git a/lib/gro/gro_tcp.h b/lib/gro/gro_tcp.h
index d926c4b8cc..2c68b5f23e 100644
--- a/lib/gro/gro_tcp.h
+++ b/lib/gro/gro_tcp.h
@@ -19,6 +19,8 @@
 #define INVALID_TCP_HDRLEN(len) \
 	(((len) < sizeof(struct rte_tcp_hdr)) || ((len) > MAX_TCP_HLEN))
 
+#define VALID_GRO_TCP_FLAGS (RTE_TCP_ACK_FLAG | RTE_TCP_PSH_FLAG | RTE_TCP_FIN_FLAG)
+
 struct cmn_tcp_key {
 	struct rte_ether_addr eth_saddr;
 	struct rte_ether_addr eth_daddr;
@@ -81,11 +83,13 @@ merge_two_tcp_packets(struct gro_tcp_item *item,
 		struct rte_mbuf *pkt,
 		int cmp,
 		uint32_t sent_seq,
+		uint8_t tcp_flags,
 		uint16_t ip_id,
 		uint16_t l2_offset)
 {
 	struct rte_mbuf *pkt_head, *pkt_tail, *lastseg;
 	uint16_t hdr_len, l2_len;
+	struct rte_tcp_hdr *tcp_hdr;
 
 	if (cmp > 0) {
 		pkt_head = item->firstseg;
@@ -128,6 +132,11 @@ merge_two_tcp_packets(struct gro_tcp_item *item,
 	/* update MBUF metadata for the merged packet */
 	pkt_head->nb_segs += pkt_tail->nb_segs;
 	pkt_head->pkt_len += pkt_tail->pkt_len;
+	if (tcp_flags != RTE_TCP_ACK_FLAG) {
+		tcp_hdr = rte_pktmbuf_mtod_offset(pkt, struct rte_tcp_hdr *,
+				l2_offset + pkt_head->l2_len + pkt_head->l3_len);
+		tcp_hdr->tcp_flags |= tcp_flags;
+	}
 
 	return 1;
 }
diff --git a/lib/gro/gro_tcp4.c b/lib/gro/gro_tcp4.c
index 6645de592b..d426127dbd 100644
--- a/lib/gro/gro_tcp4.c
+++ b/lib/gro/gro_tcp4.c
@@ -126,6 +126,7 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
 	uint32_t item_idx;
 	uint32_t i, max_flow_num, remaining_flow_num;
 	uint8_t find;
+	uint32_t item_start_idx;
 
 	/*
 	 * Don't process the packet whose TCP header length is greater
@@ -139,11 +140,8 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
 	tcp_hdr = (struct rte_tcp_hdr *)((char *)ipv4_hdr + pkt->l3_len);
 	hdr_len = pkt->l2_len + pkt->l3_len + pkt->l4_len;
 
-	/*
-	 * Don't process the packet which has FIN, SYN, RST, PSH, URG, ECE
-	 * or CWR set.
-	 */
-	if (tcp_hdr->tcp_flags != RTE_TCP_ACK_FLAG)
+	/* Return early if the TCP flags are not handled in GRO layer */
+	if (tcp_hdr->tcp_flags & (~(VALID_GRO_TCP_FLAGS)))
 		return -1;
 
 	/* trim the tail padding bytes */
@@ -18
[PATCH] net/ice: fix memory leak
Free memory for the AQ buffer at ice_move_recfg_lan_txq.
Free memory for the profile list at ice_tm_conf_uninit.

Fixes: 8c481c3bb65b ("net/ice: support queue and queue group bandwidth limit")
Cc: sta...@dpdk.org

Signed-off-by: Qi Zhang
---
 drivers/net/ice/ice_tm.c | 12
 1 file changed, 12 insertions(+)

diff --git a/drivers/net/ice/ice_tm.c b/drivers/net/ice/ice_tm.c
index b570798f07..c00ecb6a97 100644
--- a/drivers/net/ice/ice_tm.c
+++ b/drivers/net/ice/ice_tm.c
@@ -59,8 +59,15 @@ void
 ice_tm_conf_uninit(struct rte_eth_dev *dev)
 {
 	struct ice_pf *pf = ICE_DEV_PRIVATE_TO_PF(dev->data->dev_private);
+	struct ice_tm_shaper_profile *shaper_profile;
 	struct ice_tm_node *tm_node;
 
+	/* clear profile */
+	while ((shaper_profile = TAILQ_FIRST(&pf->tm_conf.shaper_profile_list))) {
+		TAILQ_REMOVE(&pf->tm_conf.shaper_profile_list, shaper_profile, node);
+		rte_free(shaper_profile);
+	}
+
 	/* clear node configuration */
 	while ((tm_node = TAILQ_FIRST(&pf->tm_conf.queue_list))) {
 		TAILQ_REMOVE(&pf->tm_conf.queue_list, tm_node, node);
@@ -636,6 +643,8 @@ static int ice_move_recfg_lan_txq(struct rte_eth_dev *dev,
 	uint16_t buf_size = ice_struct_size(buf, txqs, 1);
 
 	buf = (struct ice_aqc_move_txqs_data *)ice_malloc(hw, sizeof(*buf));
+	if (buf == NULL)
+		return -ENOMEM;
 
 	queue_parent_node = queue_sched_node->parent;
 	buf->src_teid = queue_parent_node->info.node_teid;
@@ -647,6 +656,7 @@ static int ice_move_recfg_lan_txq(struct rte_eth_dev *dev,
 			NULL, buf, buf_size, &txqs_moved, NULL);
 	if (ret || txqs_moved == 0) {
 		PMD_DRV_LOG(ERR, "move lan queue %u failed", queue_id);
+		rte_free(buf);
 		return ICE_ERR_PARAM;
 	}
 
@@ -656,12 +666,14 @@ static int ice_move_recfg_lan_txq(struct rte_eth_dev *dev,
 	} else {
 		PMD_DRV_LOG(ERR, "invalid children number %d for queue %u",
 			    queue_parent_node->num_children, queue_id);
+		rte_free(buf);
 		return ICE_ERR_PARAM;
 	}
 
 	dst_node->children[dst_node->num_children++] = queue_sched_node;
 	queue_sched_node->parent = dst_node;
 	ice_sched_query_elem(hw, queue_sched_node->info.node_teid,
 			     &queue_sched_node->info);
 
+	rte_free(buf);
 	return ret;
 }
-- 
2.31.1
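The fix above frees the buffer on each early-return path. An equivalent
single-exit variant, sketched below with generic names rather than the
actual ice PMD symbols, funnels every error path through one cleanup label
so the allocation cannot leak; some maintainers prefer this shape when the
number of error paths grows.

#include <errno.h>
#include <stdlib.h>

static int do_step(void *buf) { (void)buf; return 0; } /* stand-in for AQ work */

static int
move_txq_sketch(void)
{
	void *buf = malloc(64);  /* stands in for the AQ buffer */
	int ret = 0;

	if (buf == NULL)
		return -ENOMEM;
	if (do_step(buf) != 0) {
		ret = -EIO;  /* error path still reaches the free below */
		goto out;
	}
out:
	free(buf);
	return ret;
}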
RE: [PATCH v1] doc/mlx5: update IPv6 routing extension matching limitation
Hi,

> -----Original Message-----
> From: Rongwei Liu
> Sent: Thursday, November 23, 2023 4:15 AM
> To: dev@dpdk.org; Matan Azrad; Slava Ovsiienko; Ori Kam; Suanming Mou;
> NBU-Contact-Thomas Monjalon (EXTERNAL)
> Subject: [PATCH v1] doc/mlx5: update IPv6 routing extension matching
> limitation
>
> Due to hardware limitations, the relaxed mode is not supported; otherwise,
> packets may mismatch.
>
> Signed-off-by: Rongwei Liu
> Acked-by: Suanming Mou
> Acked-by: Thomas Monjalon

Patch applied to next-net-mlx,

Kindest regards
Raslan Darawsheh
RE: [PATCH v2] net/mlx5: fix jump action validation
Hi,

> -----Original Message-----
> From: Michael Baum
> Sent: Monday, November 27, 2023 2:43 PM
> To: dev@dpdk.org
> Cc: Matan Azrad; Raslan Darawsheh; Slava Ovsiienko; Ori Kam;
> Suanming Mou; dek...@mellanox.com; sta...@dpdk.org
> Subject: [PATCH v2] net/mlx5: fix jump action validation
>
> Currently the PMD doesn't allow jumping to the same group, in order to
> avoid a dead loop. But this also prevents experienced users from creating
> flows with fewer hops for better performance.
>
> For example, the rules in [1] should have better performance than [2].
>
> Furthermore, this protection does not really prevent dead loops, e.g. [3].
> So just remove this protection, and the user should take the
> responsibility to avoid dead loops.
>
> This patch enables jumping to the same group.
>
> [1]:
> flow create 0 group 1 priority 1 pattern eth / ipv4 / udp / gtp / end
>   actions raw_decap / raw_encap / jump group 1 / end
> flow create 0 group 1 priority 0 pattern eth / ipv4 src is 1.0.0.1 / tcp
>   / end actions queue index 1 / end
>
> [2]:
> flow create 0 group 1 priority 0 pattern eth / ipv4 / udp / gtp / end
>   actions raw_decap / raw_encap / jump group 2 / end
> flow create 0 group 2 priority 0 pattern eth / ipv4 src is 1.0.0.1 / tcp
>   / end actions queue index 1 / end
>
> [3]:
> flow create 0 group 1 pattern eth / end actions jump group 2 / end
> flow create 0 group 2 pattern eth / end actions jump group 1 / end
>
> Fixes: f78f747f41d0 ("net/mlx5: allow jump to group lower than current")
> Cc: dek...@mellanox.com
> Cc: sta...@dpdk.org
>
> Signed-off-by: Michael Baum
> Acked-by: Matan Azrad
> ---
>
> V2: change commit message to fix template.

Patch applied to next-net-mlx,

Kindest regards
Raslan Darawsheh
RE: [PATCH v2] net/mlx5: fix index choosing in TAG modification
Hi,

> -----Original Message-----
> From: Michael Baum
> Sent: Monday, November 27, 2023 6:01 PM
> To: dev@dpdk.org
> Cc: Matan Azrad; Raslan Darawsheh; Slava Ovsiienko; Ori Kam;
> Suanming Mou; Gregory Etelson; sta...@dpdk.org
> Subject: [PATCH v2] net/mlx5: fix index choosing in TAG modification
>
> When MPLS modification support was added [1], the "tag_index" field was
> added to the "rte_flow_action_modify_data" structure.
> As part of this change, the "RTE_FLOW_FIELD_TAG" type moved to use it for
> the tag array instead of using the "level" field.
> Using "level" is still supported for backward compatibility when the
> "tag_index" field is zero.
>
> The "mlx5_flow_field_id_to_modify_info()" function calls the
> "flow_hw_get_reg_id()" function with "level" without first checking
> whether the "tag_index" field is valid.
>
> This patch first calls the "flow_tag_index_get()" function to get the
> index before sending it to the "flow_hw_get_reg_id()" function.
>
> [1] commit c23626f27b09 ("ethdev: add MPLS header modification")
>
> Fixes: 04e740e69512 ("net/mlx5: separate registers usage per port")
> Cc: getel...@nvidia.com
> Cc: sta...@dpdk.org
>
> Signed-off-by: Michael Baum
> Acked-by: Ori Kam
> ---
> v2: fix the commit reference format.

Patch applied to next-net-mlx,

Kindest regards
Raslan Darawsheh
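The precedence the commit message describes can be sketched as below: prefer
the new "tag_index" field and fall back to "level" for backward
compatibility when it is zero. The helper name follows the commit message;
the exact mlx5 implementation may differ.

#include <rte_flow.h>

static inline uint8_t
flow_tag_index_get_sketch(const struct rte_flow_action_modify_data *data)
{
	/* tag_index takes precedence; a zero value means legacy "level" use */
	return data->tag_index ? data->tag_index : data->level;
}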
RE: [PATCH 0/4] net/mlx5: add modify field ADD fields support
Hi,

> -----Original Message-----
> From: Suanming Mou
> Sent: Thursday, December 14, 2023 5:04 AM
> Cc: dev@dpdk.org; Raslan Darawsheh
> Subject: [PATCH 0/4] net/mlx5: add modify field ADD fields support
>
> Before this series, the modify_field ADD operation in the mlx5 PMD only
> allowed adding an immediate value to a field.
>
> The ADD_FIELD operation allows the user to add the src field value to the
> dst field. The dst field then holds the sum of the src field value and
> the original dst field value.
>
> Suanming Mou (4):
>   net/mlx5: add TCP/IP length modify field
>   net/mlx5: rename modify copy destination to destination
>   net/mlx5: add modify field action ADD fields support
>   net/mlx5: add modify field action ADD fields validation
>
>  doc/guides/rel_notes/release_24_03.rst |  4 ++
>  drivers/common/mlx5/mlx5_prm.h         |  4 ++
>  drivers/net/mlx5/mlx5_flow.h           |  2 +-
>  drivers/net/mlx5/mlx5_flow_dv.c        | 83 --
>  drivers/net/mlx5/mlx5_flow_hw.c        | 42 -
>  5 files changed, 112 insertions(+), 23 deletions(-)
>
> --
> 2.34.1

Series applied to next-net-mlx,

Kindest regards
Raslan Darawsheh
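For readers unfamiliar with the rte_flow side of this, an ADD-between-fields
action (dst = dst + src) can be expressed as below. The particular field
pair is an arbitrary illustration, not taken from the series.

#include <rte_flow.h>

/* Illustrative ADD_FIELD configuration: add the META value to the TTL. */
static const struct rte_flow_action_modify_field add_field_conf = {
	.operation = RTE_FLOW_MODIFY_ADD,
	.dst = { .field = RTE_FLOW_FIELD_IPV4_TTL },
	.src = { .field = RTE_FLOW_FIELD_META },
	.width = 8,   /* number of bits to add */
};

static const struct rte_flow_action add_field_action = {
	.type = RTE_FLOW_ACTION_TYPE_MODIFY_FIELD,
	.conf = &add_field_conf,
};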
net/ena: roadmap for 24.03
ENA 24.03 roadmap:

* Add support for uio_pci_generic uio module
* Add support for wide LLQ recommendation from the device
* Add support for sub-optimal configuration notifications from the device
* Restructure rx_drop basic stat to include rx_overruns
* Restructure the metrics multi-process functions
* AWS new instance types support:
  - Negotiate recommended Tx queue depth with the device
  - Report new link speed capabilities
[PATCH 1/1] ml/cnxk: exclude caching run stats from xstats
From: Anup Prabhu

Exclude the hardware and firmware latency of the model data caching run
from xstats calculation.

Fixes: 9cfad6c334f2 ("ml/cnxk: update device and model xstats functions")
Cc: sta...@dpdk.org

Signed-off-by: Anup Prabhu
Acked-by: Srikanth Yalavarthi
---
 drivers/ml/cnxk/cn10k_ml_ops.c | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/drivers/ml/cnxk/cn10k_ml_ops.c b/drivers/ml/cnxk/cn10k_ml_ops.c
index 7f7e5efceac..53700387335 100644
--- a/drivers/ml/cnxk/cn10k_ml_ops.c
+++ b/drivers/ml/cnxk/cn10k_ml_ops.c
@@ -288,6 +288,7 @@ cn10k_ml_model_xstat_get(struct cnxk_ml_dev *cnxk_mldev, struct cnxk_ml_layer *l
 static int
 cn10k_ml_cache_model_data(struct cnxk_ml_dev *cnxk_mldev, struct cnxk_ml_layer *layer)
 {
+	struct cn10k_ml_layer_xstats *xstats;
 	char str[RTE_MEMZONE_NAMESIZE];
 	const struct plt_memzone *mz;
 	uint64_t isize = 0;
@@ -309,6 +310,16 @@ cn10k_ml_cache_model_data(struct cnxk_ml_dev *cnxk_mldev, struct cnxk_ml_layer *
 			  PLT_PTR_ADD(mz->addr, isize), 1);
 	plt_memzone_free(mz);
 
+	/* Reset sync xstats. */
+	xstats = layer->glow.sync_xstats;
+	xstats->hw_latency_tot = 0;
+	xstats->hw_latency_min = UINT64_MAX;
+	xstats->hw_latency_max = 0;
+	xstats->fw_latency_tot = 0;
+	xstats->fw_latency_min = UINT64_MAX;
+	xstats->fw_latency_max = 0;
+	xstats->dequeued_count = 0;
+
 	return ret;
 }
-- 
2.42.0
[PATCH 1/1] ml/cnxk: enable data caching for TVM models
Enabled data caching for TVM models with MRVL-only layers.

Signed-off-by: Srikanth Yalavarthi
---
 drivers/ml/cnxk/cn10k_ml_ops.c | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/ml/cnxk/cn10k_ml_ops.c b/drivers/ml/cnxk/cn10k_ml_ops.c
index 53700387335..834e55e88e9 100644
--- a/drivers/ml/cnxk/cn10k_ml_ops.c
+++ b/drivers/ml/cnxk/cn10k_ml_ops.c
@@ -996,8 +996,13 @@ cn10k_ml_layer_start(void *device, uint16_t model_id, const char *layer_name)
 	if (ret < 0) {
 		cn10k_ml_layer_stop(device, model_id, layer_name);
 	} else {
-		if (cn10k_mldev->cache_model_data && model->type == ML_CNXK_MODEL_TYPE_GLOW)
-			ret = cn10k_ml_cache_model_data(cnxk_mldev, layer);
+		if (cn10k_mldev->cache_model_data) {
+			if ((model->type == ML_CNXK_MODEL_TYPE_GLOW &&
+			     model->subtype == ML_CNXK_MODEL_SUBTYPE_GLOW_MRVL) ||
+			    (model->type == ML_CNXK_MODEL_TYPE_TVM &&
+			     model->subtype == ML_CNXK_MODEL_SUBTYPE_TVM_MRVL))
+				ret = cn10k_ml_cache_model_data(cnxk_mldev, layer);
+		}
 	}
 
 	return ret;
-- 
2.42.0
[PATCH 0/3] add support for additional data types
Added support for 64-bit integer data types for inference input and output.
Extended support for quantization of 32-bit and 64-bit integer data types.

Srikanth Yalavarthi (3):
  mldev: add conversion routines for 32-bit integers
  mldev: add support for 64-bit integer data type
  ml/cnxk: add support for additional integer types

 drivers/ml/cnxk/cnxk_ml_io.c     |  24 ++
 drivers/ml/cnxk/mvtvm_ml_model.c |   4 +
 lib/mldev/mldev_utils.c          |   4 +
 lib/mldev/mldev_utils.h          | 184 ++
 lib/mldev/mldev_utils_neon.c     | 566 +++
 lib/mldev/mldev_utils_scalar.c   | 196 +++
 lib/mldev/rte_mldev.h            |   4 +
 lib/mldev/version.map            |   8 +
 8 files changed, 990 insertions(+)
-- 
2.42.0
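For reference, the conversion semantics this series adds can be sketched in
scalar C as below: scale, then round to nearest with ties away from zero
(matching the NEON vcvta paths in the patches). This is an illustration of
the behaviour, not the exact library source.

#include <errno.h>
#include <math.h>
#include <stddef.h>
#include <stdint.h>

/* Minimal scalar sketch of float32 -> INT32 quantization. */
static int
float32_to_int32_sketch(float scale, uint64_t nb_elements,
			const float *input, int32_t *output)
{
	uint64_t i;

	if (scale == 0.0f || nb_elements == 0 || input == NULL || output == NULL)
		return -EINVAL;

	for (i = 0; i < nb_elements; i++)
		output[i] = (int32_t)roundf(scale * input[i]); /* ties away from zero */

	return 0;
}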
[PATCH 3/3] ml/cnxk: add support for additional integer types
Added support for quantization and dequantization of 32-bit and 64-bit
integer types.

Signed-off-by: Srikanth Yalavarthi
---
 drivers/ml/cnxk/cnxk_ml_io.c     | 24
 drivers/ml/cnxk/mvtvm_ml_model.c |  4
 2 files changed, 28 insertions(+)

diff --git a/drivers/ml/cnxk/cnxk_ml_io.c b/drivers/ml/cnxk/cnxk_ml_io.c
index c78009ab0cd..4b0adc2ae47 100644
--- a/drivers/ml/cnxk/cnxk_ml_io.c
+++ b/drivers/ml/cnxk/cnxk_ml_io.c
@@ -40,6 +40,18 @@ cnxk_ml_io_quantize_single(struct cnxk_ml_io *input, uint8_t *dbuffer, uint8_t *
 	case RTE_ML_IO_TYPE_UINT16:
 		ret = rte_ml_io_float32_to_uint16(qscale, nb_elements, dbuffer, qbuffer);
 		break;
+	case RTE_ML_IO_TYPE_INT32:
+		ret = rte_ml_io_float32_to_int32(qscale, nb_elements, dbuffer, qbuffer);
+		break;
+	case RTE_ML_IO_TYPE_UINT32:
+		ret = rte_ml_io_float32_to_uint32(qscale, nb_elements, dbuffer, qbuffer);
+		break;
+	case RTE_ML_IO_TYPE_INT64:
+		ret = rte_ml_io_float32_to_int64(qscale, nb_elements, dbuffer, qbuffer);
+		break;
+	case RTE_ML_IO_TYPE_UINT64:
+		ret = rte_ml_io_float32_to_uint64(qscale, nb_elements, dbuffer, qbuffer);
+		break;
 	case RTE_ML_IO_TYPE_FP16:
 		ret = rte_ml_io_float32_to_float16(nb_elements, dbuffer, qbuffer);
 		break;
@@ -82,6 +94,18 @@ cnxk_ml_io_dequantize_single(struct cnxk_ml_io *output, uint8_t *qbuffer, uint8_
 	case RTE_ML_IO_TYPE_UINT16:
 		ret = rte_ml_io_uint16_to_float32(dscale, nb_elements, qbuffer, dbuffer);
 		break;
+	case RTE_ML_IO_TYPE_INT32:
+		ret = rte_ml_io_int32_to_float32(dscale, nb_elements, qbuffer, dbuffer);
+		break;
+	case RTE_ML_IO_TYPE_UINT32:
+		ret = rte_ml_io_uint32_to_float32(dscale, nb_elements, qbuffer, dbuffer);
+		break;
+	case RTE_ML_IO_TYPE_INT64:
+		ret = rte_ml_io_int64_to_float32(dscale, nb_elements, qbuffer, dbuffer);
+		break;
+	case RTE_ML_IO_TYPE_UINT64:
+		ret = rte_ml_io_uint64_to_float32(dscale, nb_elements, qbuffer, dbuffer);
+		break;
 	case RTE_ML_IO_TYPE_FP16:
 		ret = rte_ml_io_float16_to_float32(nb_elements, qbuffer, dbuffer);
 		break;
diff --git a/drivers/ml/cnxk/mvtvm_ml_model.c b/drivers/ml/cnxk/mvtvm_ml_model.c
index 0dbe08e9889..e3234ae4422 100644
--- a/drivers/ml/cnxk/mvtvm_ml_model.c
+++ b/drivers/ml/cnxk/mvtvm_ml_model.c
@@ -150,6 +150,8 @@ mvtvm_ml_io_type_map(DLDataType dltype)
 			return RTE_ML_IO_TYPE_INT16;
 		else if (dltype.bits == 32)
 			return RTE_ML_IO_TYPE_INT32;
+		else if (dltype.bits == 64)
+			return RTE_ML_IO_TYPE_INT64;
 		break;
 	case kDLUInt:
 		if (dltype.bits == 8)
@@ -158,6 +160,8 @@ mvtvm_ml_io_type_map(DLDataType dltype)
 			return RTE_ML_IO_TYPE_UINT16;
 		else if (dltype.bits == 32)
 			return RTE_ML_IO_TYPE_UINT32;
+		else if (dltype.bits == 64)
+			return RTE_ML_IO_TYPE_UINT64;
 		break;
 	case kDLFloat:
 		if (dltype.bits == 8)
-- 
2.42.0
[PATCH 2/3] mldev: add support for 64-bit integer data type
Added support in mldev spec for 64-bit integer types. Added routines to
convert data from 64-bit integer type to float32_t and vice-versa.

Signed-off-by: Srikanth Yalavarthi
---
 lib/mldev/mldev_utils.c        |   4 +
 lib/mldev/mldev_utils.h        |  92 ++
 lib/mldev/mldev_utils_neon.c   | 324 +
 lib/mldev/mldev_utils_scalar.c |  98 ++
 lib/mldev/rte_mldev.h          |   4 +
 lib/mldev/version.map          |   4 +
 6 files changed, 526 insertions(+)

diff --git a/lib/mldev/mldev_utils.c b/lib/mldev/mldev_utils.c
index ccd2c39ca89..13ac615e9fc 100644
--- a/lib/mldev/mldev_utils.c
+++ b/lib/mldev/mldev_utils.c
@@ -32,6 +32,10 @@ rte_ml_io_type_size_get(enum rte_ml_io_type type)
 		return sizeof(int32_t);
 	case RTE_ML_IO_TYPE_UINT32:
 		return sizeof(uint32_t);
+	case RTE_ML_IO_TYPE_INT64:
+		return sizeof(int64_t);
+	case RTE_ML_IO_TYPE_UINT64:
+		return sizeof(uint64_t);
 	case RTE_ML_IO_TYPE_FP8:
 		return sizeof(uint8_t);
 	case RTE_ML_IO_TYPE_FP16:
diff --git a/lib/mldev/mldev_utils.h b/lib/mldev/mldev_utils.h
index 1d041531b43..6daae6d0a1c 100644
--- a/lib/mldev/mldev_utils.h
+++ b/lib/mldev/mldev_utils.h
@@ -328,6 +328,98 @@ __rte_internal
 int
 rte_ml_io_uint32_to_float32(float scale, uint64_t nb_elements, void *input, void *output);
 
+/**
+ * @internal
+ *
+ * Convert a buffer containing numbers in single precision floating format (float32) to signed
+ * 64-bit integer format (INT64).
+ *
+ * @param[in] scale
+ *	Scale factor for conversion.
+ * @param[in] nb_elements
+ *	Number of elements in the buffer.
+ * @param[in] input
+ *	Input buffer containing float32 numbers. Size of buffer is equal to (nb_elements * 4) bytes.
+ * @param[out] output
+ *	Output buffer to store INT64 numbers. Size of buffer is equal to (nb_elements * 8) bytes.
+ *
+ * @return
+ *	- 0, Success.
+ *	- < 0, Error code on failure.
+ */
+__rte_internal
+int
+rte_ml_io_float32_to_int64(float scale, uint64_t nb_elements, void *input, void *output);
+
+/**
+ * @internal
+ *
+ * Convert a buffer containing numbers in signed 64-bit integer format (INT64) to single precision
+ * floating format (float32).
+ *
+ * @param[in] scale
+ *	Scale factor for conversion.
+ * @param[in] nb_elements
+ *	Number of elements in the buffer.
+ * @param[in] input
+ *	Input buffer containing INT64 numbers. Size of buffer is equal to (nb_elements * 8) bytes.
+ * @param[out] output
+ *	Output buffer to store float32 numbers. Size of buffer is equal to (nb_elements * 4) bytes.
+ *
+ * @return
+ *	- 0, Success.
+ *	- < 0, Error code on failure.
+ */
+__rte_internal
+int
+rte_ml_io_int64_to_float32(float scale, uint64_t nb_elements, void *input, void *output);
+
+/**
+ * @internal
+ *
+ * Convert a buffer containing numbers in single precision floating format (float32) to unsigned
+ * 64-bit integer format (UINT64).
+ *
+ * @param[in] scale
+ *	Scale factor for conversion.
+ * @param[in] nb_elements
+ *	Number of elements in the buffer.
+ * @param[in] input
+ *	Input buffer containing float32 numbers. Size of buffer is equal to (nb_elements * 4) bytes.
+ * @param[out] output
+ *	Output buffer to store UINT64 numbers. Size of buffer is equal to (nb_elements * 8) bytes.
+ *
+ * @return
+ *	- 0, Success.
+ *	- < 0, Error code on failure.
+ */
+__rte_internal
+int
+rte_ml_io_float32_to_uint64(float scale, uint64_t nb_elements, void *input, void *output);
+
+/**
+ * @internal
+ *
+ * Convert a buffer containing numbers in unsigned 64-bit integer format (UINT64) to single
+ * precision floating format (float32).
+ *
+ * @param[in] scale
+ *	Scale factor for conversion.
+ * @param[in] nb_elements
+ *	Number of elements in the buffer.
+ * @param[in] input
+ *	Input buffer containing UINT64 numbers. Size of buffer is equal to (nb_elements * 8) bytes.
+ * @param[out] output
+ *	Output buffer to store float32 numbers. Size of buffer is equal to (nb_elements * 4) bytes.
+ *
+ * @return
+ *	- 0, Success.
+ *	- < 0, Error code on failure.
+ */
+__rte_internal
+int
+rte_ml_io_uint64_to_float32(float scale, uint64_t nb_elements, void *input, void *output);
+
 /**
  * @internal
  *
diff --git a/lib/mldev/mldev_utils_neon.c b/lib/mldev/mldev_utils_neon.c
index 250fa43fa73..4cde2ebabd3 100644
--- a/lib/mldev/mldev_utils_neon.c
+++ b/lib/mldev/mldev_utils_neon.c
@@ -842,6 +842,330 @@ rte_ml_io_uint32_to_float32(float scale, uint64_t nb_elements, void *input, void
 	return 0;
 }
 
+static inline void
+__float32_to_int64_neon_s64x2(float scale, float *input, int64_t *output)
+{
+	float32x2_t f32x2;
+	float64x2_t f64x2;
+	int64x2_t s64x2;
+
+	/* load 2 x float elements */
+	f32x2 = vld1_f32(input);
+
+	/* scale */
+	f32x2 = vmul_n_f32(f32x2, scale);
+
+	/* convert t
[PATCH 1/3] mldev: add conversion routines for 32-bit integers
Added routines to convert data from 32-bit integer type to float32_t and
vice-versa.

Signed-off-by: Srikanth Yalavarthi
---
 lib/mldev/mldev_utils.h        |  92 +
 lib/mldev/mldev_utils_neon.c   | 242 +
 lib/mldev/mldev_utils_scalar.c |  98 +
 lib/mldev/version.map          |   4 +
 4 files changed, 436 insertions(+)

diff --git a/lib/mldev/mldev_utils.h b/lib/mldev/mldev_utils.h
index 220afb42f0d..1d041531b43 100644
--- a/lib/mldev/mldev_utils.h
+++ b/lib/mldev/mldev_utils.h
@@ -236,6 +236,98 @@ __rte_internal
 int
 rte_ml_io_uint16_to_float32(float scale, uint64_t nb_elements, void *input, void *output);
 
+/**
+ * @internal
+ *
+ * Convert a buffer containing numbers in single precision floating format (float32) to signed
+ * 32-bit integer format (INT32).
+ *
+ * @param[in] scale
+ *	Scale factor for conversion.
+ * @param[in] nb_elements
+ *	Number of elements in the buffer.
+ * @param[in] input
+ *	Input buffer containing float32 numbers. Size of buffer is equal to (nb_elements * 4) bytes.
+ * @param[out] output
+ *	Output buffer to store INT32 numbers. Size of buffer is equal to (nb_elements * 4) bytes.
+ *
+ * @return
+ *	- 0, Success.
+ *	- < 0, Error code on failure.
+ */
+__rte_internal
+int
+rte_ml_io_float32_to_int32(float scale, uint64_t nb_elements, void *input, void *output);
+
+/**
+ * @internal
+ *
+ * Convert a buffer containing numbers in signed 32-bit integer format (INT32) to single precision
+ * floating format (float32).
+ *
+ * @param[in] scale
+ *	Scale factor for conversion.
+ * @param[in] nb_elements
+ *	Number of elements in the buffer.
+ * @param[in] input
+ *	Input buffer containing INT32 numbers. Size of buffer is equal to (nb_elements * 4) bytes.
+ * @param[out] output
+ *	Output buffer to store float32 numbers. Size of buffer is equal to (nb_elements * 4) bytes.
+ *
+ * @return
+ *	- 0, Success.
+ *	- < 0, Error code on failure.
+ */
+__rte_internal
+int
+rte_ml_io_int32_to_float32(float scale, uint64_t nb_elements, void *input, void *output);
+
+/**
+ * @internal
+ *
+ * Convert a buffer containing numbers in single precision floating format (float32) to unsigned
+ * 32-bit integer format (UINT32).
+ *
+ * @param[in] scale
+ *	Scale factor for conversion.
+ * @param[in] nb_elements
+ *	Number of elements in the buffer.
+ * @param[in] input
+ *	Input buffer containing float32 numbers. Size of buffer is equal to (nb_elements * 4) bytes.
+ * @param[out] output
+ *	Output buffer to store UINT32 numbers. Size of buffer is equal to (nb_elements * 4) bytes.
+ *
+ * @return
+ *	- 0, Success.
+ *	- < 0, Error code on failure.
+ */
+__rte_internal
+int
+rte_ml_io_float32_to_uint32(float scale, uint64_t nb_elements, void *input, void *output);
+
+/**
+ * @internal
+ *
+ * Convert a buffer containing numbers in unsigned 32-bit integer format (UINT32) to single
+ * precision floating format (float32).
+ *
+ * @param[in] scale
+ *	Scale factor for conversion.
+ * @param[in] nb_elements
+ *	Number of elements in the buffer.
+ * @param[in] input
+ *	Input buffer containing UINT32 numbers. Size of buffer is equal to (nb_elements * 4) bytes.
+ * @param[out] output
+ *	Output buffer to store float32 numbers. Size of buffer is equal to (nb_elements * 4) bytes.
+ *
+ * @return
+ *	- 0, Success.
+ *	- < 0, Error code on failure.
+ */
+__rte_internal
+int
+rte_ml_io_uint32_to_float32(float scale, uint64_t nb_elements, void *input, void *output);
+
 /**
  * @internal
  *
diff --git a/lib/mldev/mldev_utils_neon.c b/lib/mldev/mldev_utils_neon.c
index c7baec012b8..250fa43fa73 100644
--- a/lib/mldev/mldev_utils_neon.c
+++ b/lib/mldev/mldev_utils_neon.c
@@ -600,6 +600,248 @@ rte_ml_io_uint16_to_float32(float scale, uint64_t nb_elements, void *input, void
 	return 0;
 }
 
+static inline void
+__float32_to_int32_neon_s32x4(float scale, float *input, int32_t *output)
+{
+	float32x4_t f32x4;
+	int32x4_t s32x4;
+
+	/* load 4 x float elements */
+	f32x4 = vld1q_f32(input);
+
+	/* scale */
+	f32x4 = vmulq_n_f32(f32x4, scale);
+
+	/* convert to int32x4_t using round to nearest with ties away rounding mode */
+	s32x4 = vcvtaq_s32_f32(f32x4);
+
+	/* store 4 elements */
+	vst1q_s32(output, s32x4);
+}
+
+static inline void
+__float32_to_int32_neon_s32x1(float scale, float *input, int32_t *output)
+{
+	/* scale and convert, round to nearest with ties away rounding mode */
+	*output = vcvtas_s32_f32(scale * (*input));
+}
+
+int
+rte_ml_io_float32_to_int32(float scale, uint64_t nb_elements, void *input, void *output)
+{
+	float *input_buffer;
+	int32_t *output_buffer;
+	uint64_t nb_iterations;
+	uint32_t vlen;
+	uint64_t i;
+
+	if ((scale == 0) || (nb_elements == 0) || (input == NULL) || (output == NULL))
+		retur
[PATCH 00/11] Introduce Event ML Adapter
Machine learning event adapter library
=======================================

The DPDK Eventdev library provides an event-driven programming model with
features to schedule events. The ML Device library provides an interface to
ML poll mode drivers that support machine learning inference operations.
The Event ML Adapter is intended to bridge between the event device and the
ML device.

Packet flow from the ML device to the event device can be accomplished
using software- and hardware-based transfer mechanisms. The adapter queries
an eventdev PMD to determine which mechanism to use. The adapter uses an
EAL service core function for software-based packet transfer and uses the
eventdev PMD functions to configure hardware-based packet transfer between
the ML device and the event device.

The application can choose to submit an ML operation directly to an ML
device or send it to the ML adapter via eventdev, based on the
RTE_EVENT_ML_ADAPTER_CAP_INTERNAL_PORT_OP_FWD capability. The first mode is
known as the event new (RTE_EVENT_ML_ADAPTER_OP_NEW) mode and the second as
the event forward (RTE_EVENT_ML_ADAPTER_OP_FORWARD) mode. The choice of
mode can be specified while creating the adapter. In the former mode, it is
the application's responsibility to enable ingress packet ordering. In the
latter mode, it is the adapter's responsibility to enable ingress packet
ordering.

Working model of RTE_EVENT_ML_ADAPTER_OP_NEW mode:

        +--------------+         +--------------+
        |              |         |   ML stage   |
        | Application  |---[2]-->| + enqueue to |
        |              |         |    mldev     |
        +--------------+         +--------------+
            ^   ^                       |
            |   |                      [3]
           [6] [1]                      |
            |   |                       |
        +--------------+                |
        |              |                |
        | Event device |                |
        |              |                |
        +--------------+                |
            ^                           |
            |                           |
           [5]                          |
            |                           v
        +--------------+         +--------------+
        |              |         |              |
        |  ML adapter  |<--[4]---|    mldev     |
        |              |         |              |
        +--------------+         +--------------+

        [1] Application dequeues events from the previous stage.
        [2] Application prepares the ML operations.
        [3] ML operations are submitted to mldev by application.
        [4] ML adapter dequeues ML completions from mldev.
        [5] ML adapter enqueues events to the eventdev.
        [6] Application dequeues from eventdev for further processing.

In the RTE_EVENT_ML_ADAPTER_OP_NEW mode, the application submits ML
operations directly to the ML device. The ML adapter then dequeues ML
completions from the ML device and enqueues events to the event device.
This mode does not ensure ingress ordering if the application directly
enqueues to the mldev without going through an ML / atomic stage, i.e.
removing items [1] and [2]. Events dequeued from the adapter will be
treated as new events. In this mode, the application needs to specify event
information (response information) which is needed to enqueue an event
after the ML operation is completed.

Working model of RTE_EVENT_ML_ADAPTER_OP_FORWARD mode:

                +--------------+         +--------------+
        --[1]-->|              |---[2]-->| Application  |
                | Event device |         |      in      |
        <--[8]--|              |<--[3]---| ordered stage|
                +--------------+         +--------------+
                    ^      |
                    |     [4]
                   [7]     |
                    |      v
                +------------+          +--------------+
                |            |---[5]--->|              |
                | ML adapter |          |    mldev     |
                |            |<--[6]----|              |
                +------------+          +--------------+

        [1] Events from the previous stage.
        [2] Application in ordered stage dequeues events from eventdev.
        [3] Application enqueues ML operations as events to eventdev.
        [4] ML adapter dequeues event from eventdev.
        [5] ML adapter submits ML operations to mldev (atomic stage).
        [6] ML adapter dequeues ML completions from mldev.
        [7] ML adapter enqueues events to the eventdev.
        [8] Events to the next stage.

In the event forward (RTE_EVENT_ML_ADAPTER_OP_FORWARD) mode, if the HW
supports the capability RTE_EVENT_ML_ADAPTER_CAP_INTERNAL_PORT_OP_FWD, the
application can directly submit the ML operations to the mldev.
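To make steps [3]-[5] of the forward mode concrete, a submission in that
mode could be sketched as below. The adapter API in this series is new, so
this is only an illustration under the cover letter's description; port and
queue identifiers are placeholders.

#include <rte_eventdev.h>
#include <rte_mldev.h>

/* Sketch: wrap an ML op in an event and enqueue it to the event device;
 * the adapter (or HW with the INTERNAL_PORT_OP_FWD capability) forwards
 * it to the mldev queue pair. */
static inline int
submit_ml_op_forward(uint8_t evdev_id, uint8_t port_id,
		     uint8_t adapter_queue_id, struct rte_ml_op *op)
{
	struct rte_event ev = {
		.queue_id = adapter_queue_id,      /* queue serviced by the adapter */
		.sched_type = RTE_SCHED_TYPE_ATOMIC,
		.event_type = RTE_EVENT_TYPE_CPU,
		.op = RTE_EVENT_OP_NEW,
		.event_ptr = op,                   /* ML op travels as event payload */
	};

	return rte_event_enqueue_burst(evdev_id, port_id, &ev, 1) == 1 ? 0 : -1;
}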
[PATCH 01/11] eventdev: introduce ML event adapter library
Introduce event ML adapter APIs. This patch provides information on adapter
modes and usage. Applications can use this event adapter interface to
transfer packets between an ML device and an event device.

Signed-off-by: Srikanth Yalavarthi
---
 MAINTAINERS                                        |    6 +
 config/rte_config.h                                |    1 +
 doc/api/doxy-api-index.md                          |    1 +
 doc/guides/prog_guide/event_ml_adapter.rst         |  268
 doc/guides/prog_guide/eventdev.rst                 |   10 +-
 .../img/event_ml_adapter_op_forward.svg            | 1086 +
 .../img/event_ml_adapter_op_new.svg                | 1079
 doc/guides/prog_guide/index.rst                    |    1 +
 lib/eventdev/meson.build                           |    4 +-
 lib/eventdev/rte_event_ml_adapter.c                |    6 +
 lib/eventdev/rte_event_ml_adapter.h                |  594 +
 lib/eventdev/rte_eventdev.h                        |   45 +
 lib/meson.build                                    |    2 +-
 lib/mldev/rte_mldev.h                              |    6 +
 14 files changed, 3102 insertions(+), 7 deletions(-)
 create mode 100644 doc/guides/prog_guide/event_ml_adapter.rst
 create mode 100644 doc/guides/prog_guide/img/event_ml_adapter_op_forward.svg
 create mode 100644 doc/guides/prog_guide/img/event_ml_adapter_op_new.svg
 create mode 100644 lib/eventdev/rte_event_ml_adapter.c
 create mode 100644 lib/eventdev/rte_event_ml_adapter.h

diff --git a/MAINTAINERS b/MAINTAINERS
index 0d1c8126e3e..a1125e93621 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -554,6 +554,12 @@ F: drivers/raw/skeleton/
 F: app/test/test_rawdev.c
 F: doc/guides/prog_guide/rawdev.rst
 
+Eventdev ML Adapter API
+M: Srikanth Yalavarthi
+T: git://dpdk.org/next/dpdk-next-eventdev
+F: lib/eventdev/*ml_adapter*
+F: doc/guides/prog_guide/event_ml_adapter.rst
+
 Memory Pool Drivers
 -------------------
 
diff --git a/config/rte_config.h b/config/rte_config.h
index da265d7dd24..29c5aa558e6 100644
--- a/config/rte_config.h
+++ b/config/rte_config.h
@@ -80,6 +80,7 @@
 #define RTE_EVENT_CRYPTO_ADAPTER_MAX_INSTANCE 32
 #define RTE_EVENT_ETH_TX_ADAPTER_MAX_INSTANCE 32
 #define RTE_EVENT_DMA_ADAPTER_MAX_INSTANCE 32
+#define RTE_EVENT_ML_ADAPTER_MAX_INSTANCE 32
 
 /* rawdev defines */
 #define RTE_RAWDEV_MAX_DEVS 64
diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md
index a6a768bd7c6..d8c3d887ade 100644
--- a/doc/api/doxy-api-index.md
+++ b/doc/api/doxy-api-index.md
@@ -30,6 +30,7 @@ The public API headers are grouped by topics:
   [event_timer_adapter](@ref rte_event_timer_adapter.h),
   [event_crypto_adapter](@ref rte_event_crypto_adapter.h),
   [event_dma_adapter](@ref rte_event_dma_adapter.h),
+  [event_ml_adapter](@ref rte_event_ml_adapter.h),
   [rawdev](@ref rte_rawdev.h),
   [metrics](@ref rte_metrics.h),
   [bitrate](@ref rte_bitrate.h),
diff --git a/doc/guides/prog_guide/event_ml_adapter.rst b/doc/guides/prog_guide/event_ml_adapter.rst
new file mode 100644
index 000000000..71f6c4b5974
--- /dev/null
+++ b/doc/guides/prog_guide/event_ml_adapter.rst
@@ -0,0 +1,268 @@
+..  SPDX-License-Identifier: BSD-3-Clause
+    Copyright (c) 2024 Marvell.
+
+Event ML Adapter Library
+========================
+
+DPDK :doc:`Eventdev library <eventdev>` provides event driven programming model with features
+to schedule events. :doc:`ML Device library <mldev>` provides an interface to ML poll mode
+drivers that support Machine Learning inference operations. Event ML Adapter is intended to
+bridge between the event device and the ML device.
+
+Packet flow from ML device to the event device can be accomplished using software and hardware
+based transfer mechanisms. The adapter queries an eventdev PMD to determine which mechanism to
+be used. The adapter uses an EAL service core function for software based packet transfer and
+uses the eventdev PMD functions to configure hardware based packet transfer between ML device
+and the event device.
+
+ML adapter uses a new event type called ``RTE_EVENT_TYPE_MLDEV`` to indicate the source of
+event.
+
+Application can choose to submit an ML operation directly to an ML device or send it to an ML
+adapter via eventdev based on RTE_EVENT_ML_ADAPTER_CAP_INTERNAL_PORT_OP_FWD capability. The
+first mode is known as the event new (RTE_EVENT_ML_ADAPTER_OP_NEW) mode and the second as the
+event forward (RTE_EVENT_ML_ADAPTER_OP_FORWARD) mode. Choice of mode can be specified while
+creating the adapter. In the former mode, it is the application's responsibility to enable
+ingress packet ordering. In the latter mode, it is the adapter's responsibility to enable
+ingress packet ordering.
+
+
+Adapter Modes
+-------------
+
+RTE_EVENT_ML_ADAPTER_OP_NEW mode
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+In the RTE_EVENT_ML_ADAPTER_OP_NEW mode, application submits ML operations directly to an ML
+device. The adapter then dequeues ML completions from the ML device and enqueues them as events
+to the event device. This mode does not ensure ing
[PATCH 02/11] event/ml: add ml adapter capabilities get
Added library function to get ML adapter capabilities.

Signed-off-by: Srikanth Yalavarthi
---
 lib/eventdev/eventdev_pmd.h | 29 +
 lib/eventdev/rte_eventdev.c | 27 +++
 2 files changed, 56 insertions(+)

diff --git a/lib/eventdev/eventdev_pmd.h b/lib/eventdev/eventdev_pmd.h
index 1790587808a..94d505753dc 100644
--- a/lib/eventdev/eventdev_pmd.h
+++ b/lib/eventdev/eventdev_pmd.h
@@ -84,6 +84,8 @@ extern "C" {
 #define RTE_EVENT_TIMER_ADAPTER_SW_CAP \
 	RTE_EVENT_TIMER_ADAPTER_CAP_PERIODIC
 
+#define RTE_EVENT_ML_ADAPTER_SW_CAP 0x0
+
 #define RTE_EVENTDEV_DETACHED  (0)
 #define RTE_EVENTDEV_ATTACHED  (1)
 
@@ -1522,6 +1524,30 @@ typedef int (*eventdev_dma_adapter_stats_get)(const struct rte_eventdev *dev,
 typedef int (*eventdev_dma_adapter_stats_reset)(const struct rte_eventdev *dev,
 						const int16_t dma_dev_id);
 
+struct rte_ml_dev;
+
+/**
+ * Retrieve the event device's ML adapter capabilities for the
+ * specified MLDEV
+ *
+ * @param dev
+ *	Event device pointer
+ *
+ * @param mldev
+ *	ML device pointer
+ *
+ * @param[out] caps
+ *	A pointer to memory filled with event adapter capabilities.
+ *	It is expected to be pre-allocated & initialized by caller.
+ *
+ * @return
+ *	- 0: Success, driver provides event adapter capabilities for the
+ *	  MLDEV.
+ *	- <0: Error code returned by the driver function.
+ *
+ */
+typedef int (*eventdev_ml_adapter_caps_get_t)(const struct rte_eventdev *dev,
+					      const struct rte_ml_dev *mldev, uint32_t *caps);
 
 /** Event device operations function pointer table */
 struct eventdev_ops {
@@ -1662,6 +1688,9 @@ struct eventdev_ops {
 	eventdev_dma_adapter_stats_reset dma_adapter_stats_reset;
 	/**< Reset DMA stats */
 
+	eventdev_ml_adapter_caps_get_t ml_adapter_caps_get;
+	/**< Get ML adapter capabilities */
+
 	eventdev_selftest dev_selftest;
 	/**< Start eventdev Selftest */
 
diff --git a/lib/eventdev/rte_eventdev.c b/lib/eventdev/rte_eventdev.c
index 157752868d5..7fbc6f3d98a 100644
--- a/lib/eventdev/rte_eventdev.c
+++ b/lib/eventdev/rte_eventdev.c
@@ -22,6 +22,7 @@
 #include
 #include
 #include
+#include
 #include
 
 #include "rte_eventdev.h"
@@ -249,6 +250,32 @@ rte_event_dma_adapter_caps_get(uint8_t dev_id, uint8_t dma_dev_id, uint32_t *cap
 	return 0;
 }
 
+int
+rte_event_ml_adapter_caps_get(uint8_t evdev_id, int16_t mldev_id, uint32_t *caps)
+{
+	struct rte_eventdev *dev;
+	struct rte_ml_dev *mldev;
+
+	RTE_EVENTDEV_VALID_DEVID_OR_ERR_RET(evdev_id, -EINVAL);
+	if (!rte_ml_dev_is_valid_dev(mldev_id))
+		return -EINVAL;
+
+	dev = &rte_eventdevs[evdev_id];
+	mldev = rte_ml_dev_pmd_get_dev(mldev_id);
+
+	if (caps == NULL)
+		return -EINVAL;
+
+	if (dev->dev_ops->ml_adapter_caps_get == NULL)
+		*caps = RTE_EVENT_ML_ADAPTER_SW_CAP;
+	else
+		*caps = 0;
+
+	return dev->dev_ops->ml_adapter_caps_get ?
+		       (*dev->dev_ops->ml_adapter_caps_get)(dev, mldev, caps) : 0;
+}
+
 static inline int
 event_dev_queue_config(struct rte_eventdev *dev, uint8_t nb_queues)
 {
-- 
2.42.0
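A typical use of this capability query, per the cover letter, is to choose
between the two adapter modes. The following sketch assumes the proposed
headers from this series; it is not part of the patch.

#include <stdint.h>
#include <rte_eventdev.h>
#include "rte_event_ml_adapter.h"   /* proposed header from patch 01/11 */

static enum rte_event_ml_adapter_mode
choose_mode(uint8_t evdev_id, int16_t mldev_id)
{
	uint32_t caps = 0;

	if (rte_event_ml_adapter_caps_get(evdev_id, mldev_id, &caps) != 0)
		return RTE_EVENT_ML_ADAPTER_OP_NEW;   /* conservative default */

	/* forward mode only when the PMD can forward ops internally */
	return (caps & RTE_EVENT_ML_ADAPTER_CAP_INTERNAL_PORT_OP_FWD) ?
	       RTE_EVENT_ML_ADAPTER_OP_FORWARD : RTE_EVENT_ML_ADAPTER_OP_NEW;
}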
[PATCH 03/11] event/ml: add adapter create and free
Added ML event adapter create and free functions.

Signed-off-by: Srikanth Yalavarthi
---
 lib/eventdev/rte_event_ml_adapter.c | 317
 1 file changed, 317 insertions(+)

diff --git a/lib/eventdev/rte_event_ml_adapter.c b/lib/eventdev/rte_event_ml_adapter.c
index 5b8b02a0130..fed3b67c858 100644
--- a/lib/eventdev/rte_event_ml_adapter.c
+++ b/lib/eventdev/rte_event_ml_adapter.c
@@ -4,3 +4,320 @@
 #include "rte_event_ml_adapter.h"
 #include "rte_eventdev.h"
+#include
+
+#include "eventdev_pmd.h"
+#include "rte_mldev_pmd.h"
+
+#define ML_ADAPTER_NAME_LEN	32
+#define ML_DEFAULT_MAX_NB	128
+#define ML_ADAPTER_BUFFER_SIZE	1024
+
+#define ML_ADAPTER_ARRAY "event_ml_adapter_array"
+
+/* ML ops circular buffer */
+struct ml_ops_circular_buffer {
+	/* Index of head element */
+	uint16_t head;
+
+	/* Index of tail element */
+	uint16_t tail;
+
+	/* Number of elements in buffer */
+	uint16_t count;
+
+	/* Size of circular buffer */
+	uint16_t size;
+
+	/* Pointer to hold rte_ml_op for processing */
+	struct rte_ml_op **op_buffer;
+} __rte_cache_aligned;
+
+/* ML device information */
+struct ml_device_info {
+	/* Pointer to mldev */
+	struct rte_ml_dev *dev;
+} __rte_cache_aligned;
+
+struct event_ml_adapter {
+	/* Event device identifier */
+	uint8_t eventdev_id;
+
+	/* Event port identifier */
+	uint8_t event_port_id;
+
+	/* Adapter mode */
+	enum rte_event_ml_adapter_mode mode;
+
+	/* Memory allocation name */
+	char mem_name[ML_ADAPTER_NAME_LEN];
+
+	/* Socket identifier cached from eventdev */
+	int socket_id;
+
+	/* Lock to serialize config updates with service function */
+	rte_spinlock_t lock;
+
+	/* ML device structure array */
+	struct ml_device_info *mldevs;
+
+	/* Circular buffer for processing ML ops to eventdev */
+	struct ml_ops_circular_buffer ebuf;
+
+	/* Configuration callback for rte_service configuration */
+	rte_event_ml_adapter_conf_cb conf_cb;
+
+	/* Configuration callback argument */
+	void *conf_arg;
+
+	/* Set if default_cb is being used */
+	int default_cb_arg;
+} __rte_cache_aligned;
+
+static struct event_ml_adapter **event_ml_adapter;
+
+static inline int
+emla_valid_id(uint8_t id)
+{
+	return id < RTE_EVENT_ML_ADAPTER_MAX_INSTANCE;
+}
+
+static inline struct event_ml_adapter *
+emla_id_to_adapter(uint8_t id)
+{
+	return event_ml_adapter ? event_ml_adapter[id] : NULL;
+}
+
+static int
+emla_array_init(void)
+{
+	const struct rte_memzone *mz;
+	uint32_t sz;
+
+	mz = rte_memzone_lookup(ML_ADAPTER_ARRAY);
+	if (mz == NULL) {
+		sz = sizeof(struct event_ml_adapter *) * RTE_EVENT_ML_ADAPTER_MAX_INSTANCE;
+		sz = RTE_ALIGN(sz, RTE_CACHE_LINE_SIZE);
+
+		mz = rte_memzone_reserve_aligned(ML_ADAPTER_ARRAY, sz, rte_socket_id(), 0,
+						 RTE_CACHE_LINE_SIZE);
+		if (mz == NULL) {
+			RTE_EDEV_LOG_ERR("Failed to reserve memzone : %s, err = %d",
+					 ML_ADAPTER_ARRAY, rte_errno);
+			return -rte_errno;
+		}
+	}
+
+	event_ml_adapter = mz->addr;
+
+	return 0;
+}
+
+static inline int
+emla_circular_buffer_init(const char *name, struct ml_ops_circular_buffer *buf, uint16_t sz)
+{
+	buf->op_buffer = rte_zmalloc(name, sizeof(struct rte_ml_op *) * sz, 0);
+	if (buf->op_buffer == NULL)
+		return -ENOMEM;
+
+	buf->size = sz;
+
+	return 0;
+}
+
+static inline void
+emla_circular_buffer_free(struct ml_ops_circular_buffer *buf)
+{
+	rte_free(buf->op_buffer);
+}
+
+static int
+emla_default_config_cb(uint8_t id, uint8_t evdev_id, struct rte_event_ml_adapter_conf *conf,
+		       void *arg)
+{
+	struct rte_event_port_conf *port_conf;
+	struct rte_event_dev_config dev_conf;
+	struct event_ml_adapter *adapter;
+	struct rte_eventdev *dev;
+	uint8_t port_id;
+	int started;
+	int ret;
+
+	adapter = emla_id_to_adapter(id);
+	if (adapter == NULL)
+		return -EINVAL;
+
+	dev = &rte_eventdevs[adapter->eventdev_id];
+	dev_conf = dev->data->dev_conf;
+
+	started = dev->data->dev_started;
+	if (started)
+		rte_event_dev_stop(evdev_id);
+
+	port_id = dev_conf.nb_event_ports;
+	dev_conf.nb_event_ports += 1;
+
+	port_conf = arg;
+	if (port_conf->event_port_cfg & RTE_EVENT_PORT_CFG_SINGLE_LINK)
+		dev_conf.nb_single_link_event_port_queues += 1;
+
+	ret = rte_event_dev_configure(evdev_id, &dev_conf);
+	if (ret) {
+		RTE_EDEV_LOG_ERR("Failed to configure event dev %u", evdev_id);
+		if (started) {
+			if (
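Usage of the create path added here can be sketched as below, assuming the
create signature mirrors the existing crypto and DMA event adapters (id,
eventdev id, default-callback argument, mode); the identifiers and port
configuration values are placeholders.

#include <rte_eventdev.h>
#include "rte_event_ml_adapter.h"   /* proposed header from patch 01/11 */

#define ML_ADAPTER_ID 0             /* placeholder adapter instance id */

static int
create_ml_adapter(uint8_t evdev_id)
{
	/* passed to emla_default_config_cb() as the event port config */
	struct rte_event_port_conf port_conf = {
		.new_event_threshold = 4096,
		.dequeue_depth = 32,
		.enqueue_depth = 32,
	};

	return rte_event_ml_adapter_create(ML_ADAPTER_ID, evdev_id, &port_conf,
					   RTE_EVENT_ML_ADAPTER_OP_FORWARD);
}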
[PATCH 04/11] event/ml: add adapter port get
Added ML adapter port get function.

Signed-off-by: Srikanth Yalavarthi
---
 lib/eventdev/rte_event_ml_adapter.c | 19 +++
 1 file changed, 19 insertions(+)

diff --git a/lib/eventdev/rte_event_ml_adapter.c b/lib/eventdev/rte_event_ml_adapter.c
index fed3b67c858..93ba58b3e9e 100644
--- a/lib/eventdev/rte_event_ml_adapter.c
+++ b/lib/eventdev/rte_event_ml_adapter.c
@@ -321,3 +321,22 @@ rte_event_ml_adapter_free(uint8_t id)
 
 	return 0;
 }
+
+int
+rte_event_ml_adapter_event_port_get(uint8_t id, uint8_t *event_port_id)
+{
+	struct event_ml_adapter *adapter;
+
+	if (!emla_valid_id(id)) {
+		RTE_EDEV_LOG_ERR("Invalid ML adapter id = %d", id);
+		return -EINVAL;
+	}
+
+	adapter = emla_id_to_adapter(id);
+	if (adapter == NULL || event_port_id == NULL)
+		return -EINVAL;
+
+	*event_port_id = adapter->event_port_id;
+
+	return 0;
+}
-- 
2.42.0
[PATCH 05/11] event/ml: add adapter queue pair add and delete
Added ML adapter queue-pair add and delete functions.

Signed-off-by: Srikanth Yalavarthi
---
 lib/eventdev/eventdev_pmd.h         |  54
 lib/eventdev/rte_event_ml_adapter.c | 193
 2 files changed, 247 insertions(+)

diff --git a/lib/eventdev/eventdev_pmd.h b/lib/eventdev/eventdev_pmd.h
index 94d505753dc..48e970a5097 100644
--- a/lib/eventdev/eventdev_pmd.h
+++ b/lib/eventdev/eventdev_pmd.h
@@ -1549,6 +1549,56 @@ struct rte_ml_dev;
 typedef int (*eventdev_ml_adapter_caps_get_t)(const struct rte_eventdev *dev,
 					      const struct rte_ml_dev *mldev, uint32_t *caps);
 
+/**
+ * This API may change without prior notice
+ *
+ * Add ML queue pair to event device. This callback is invoked if
+ * the caps returned from rte_event_ml_adapter_caps_get(, mldev_id)
+ * has RTE_EVENT_ML_ADAPTER_CAP_INTERNAL_PORT_* set.
+ *
+ * @param dev
+ *	Event device pointer
+ *
+ * @param mldev
+ *	MLDEV pointer
+ *
+ * @param queue_pair_id
+ *	MLDEV queue pair identifier.
+ *
+ * @param event
+ *	Event information required for binding mldev queue pair to event queue.
+ *	This structure will have a valid value for only those HW PMDs supporting
+ *	@see RTE_EVENT_ML_ADAPTER_CAP_INTERNAL_PORT_QP_EV_BIND capability.
+ *
+ * @return
+ *	- 0: Success, mldev queue pair added successfully.
+ *	- <0: Error code returned by the driver function.
+ *
+ */
+typedef int (*eventdev_ml_adapter_queue_pair_add_t)(const struct rte_eventdev *dev,
+						    const struct rte_ml_dev *mldev,
+						    int32_t queue_pair_id,
+						    const struct rte_event *event);
+
+/**
+ * This API may change without prior notice
+ *
+ * Delete ML queue pair to event device. This callback is invoked if
+ * the caps returned from rte_event_ml_adapter_caps_get(, mldev_id)
+ * has RTE_EVENT_ML_ADAPTER_CAP_INTERNAL_PORT_* set.
+ *
+ * @param queue_pair_id
+ *	mldev queue pair identifier.
+ *
+ * @return
+ *	- 0: Success, mldev queue pair deleted successfully.
+ *	- <0: Error code returned by the driver function.
+ *
+ */
+typedef int (*eventdev_ml_adapter_queue_pair_del_t)(const struct rte_eventdev *dev,
+						    const struct rte_ml_dev *cdev,
+						    int32_t queue_pair_id);
+
 /** Event device operations function pointer table */
 struct eventdev_ops {
 	eventdev_info_get_t dev_infos_get;	/**< Get device info. */
@@ -1690,6 +1740,10 @@ struct eventdev_ops {
 	eventdev_ml_adapter_caps_get_t ml_adapter_caps_get;
 	/**< Get ML adapter capabilities */
 
+	eventdev_ml_adapter_queue_pair_add_t ml_adapter_queue_pair_add;
+	/**< Add queue pair to ML adapter */
+	eventdev_ml_adapter_queue_pair_del_t ml_adapter_queue_pair_del;
+	/**< Delete queue pair from ML adapter */
 
 	eventdev_selftest dev_selftest;
 	/**< Start eventdev Selftest */
 
diff --git a/lib/eventdev/rte_event_ml_adapter.c b/lib/eventdev/rte_event_ml_adapter.c
index 93ba58b3e9e..9d441c5d967 100644
--- a/lib/eventdev/rte_event_ml_adapter.c
+++ b/lib/eventdev/rte_event_ml_adapter.c
@@ -33,10 +33,27 @@ struct ml_ops_circular_buffer {
 	struct rte_ml_op **op_buffer;
 } __rte_cache_aligned;
 
+/* Queue pair information */
+struct ml_queue_pair_info {
+	/* Set to indicate queue pair is enabled */
+	bool qp_enabled;
+
+	/* Circular buffer for batching ML ops to mldev */
+	struct ml_ops_circular_buffer mlbuf;
+} __rte_cache_aligned;
+
 /* ML device information */
 struct ml_device_info {
 	/* Pointer to mldev */
 	struct rte_ml_dev *dev;
+
+	/* Pointer to queue pair info */
+	struct ml_queue_pair_info *qpairs;
+
+	/* If num_qpairs > 0, the start callback will
+	 * be invoked if not already invoked
+	 */
+	uint16_t num_qpairs;
 } __rte_cache_aligned;
 
 struct event_ml_adapter {
@@ -72,6 +89,9 @@ struct event_ml_adapter {
 
 	/* Set if default_cb is being used */
 	int default_cb_arg;
+
+	/* No. of queue pairs configured */
+	uint16_t nb_qps;
 } __rte_cache_aligned;
 
 static struct event_ml_adapter **event_ml_adapter;
@@ -340,3 +360,176 @@ rte_event_ml_adapter_event_port_get(uint8_t id, uint8_t *event_port_id)
 
 	return 0;
 }
+
+static void
+emla_update_qp_info(struct event_ml_adapter *adapter, struct ml_device_info *dev_info,
+		    int32_t queue_pair_id, uint8_t add)
+{
+	struct ml_queue_pair_info *qp_info;
+	int enabled;
+	uint16_t i;
+
+	if (dev_info->qpairs == NULL)
+		return;
+
+	if (queue_pair_id == -1) {
+		for (i = 0; i < dev_info->dev->data->nb_queue_pairs; i++)
+			emla_update_qp_info(adapter, dev_info, i, add);
+	} else {
+		qp_inf
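A usage sketch for the application-facing side of this patch is below. It
assumes the public add function takes the adapter id, mldev id, queue pair
id, and an optional event pointer, following the internal queue_pair_del()
prototype shown above plus the event argument described for
..._QP_EV_BIND-capable HW PMDs; a NULL event is assumed valid for SW
transfer.

#include <stdint.h>
#include "rte_event_ml_adapter.h"   /* proposed header from patch 01/11 */

static int
bind_queue_pair(uint8_t adapter_id, int16_t mldev_id, int32_t qp_id)
{
	/* NULL event: no HW queue-pair-to-event-queue binding requested */
	return rte_event_ml_adapter_queue_pair_add(adapter_id, mldev_id,
						   qp_id, NULL);
}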
[PATCH 06/11] event/ml: add support for service function
Added support for ML adapter service function for software based event devices. Signed-off-by: Srikanth Yalavarthi --- lib/eventdev/rte_event_ml_adapter.c | 538 1 file changed, 538 insertions(+) diff --git a/lib/eventdev/rte_event_ml_adapter.c b/lib/eventdev/rte_event_ml_adapter.c index 9d441c5d967..95f566b1025 100644 --- a/lib/eventdev/rte_event_ml_adapter.c +++ b/lib/eventdev/rte_event_ml_adapter.c @@ -5,6 +5,7 @@ #include "rte_event_ml_adapter.h" #include "rte_eventdev.h" #include +#include #include "eventdev_pmd.h" #include "rte_mldev_pmd.h" @@ -13,6 +14,9 @@ #define ML_DEFAULT_MAX_NB 128 #define ML_ADAPTER_BUFFER_SIZE 1024 +#define ML_BATCH_SIZE 32 +#define ML_ADAPTER_OPS_BUFFER_SIZE (ML_BATCH_SIZE + ML_BATCH_SIZE) + #define ML_ADAPTER_ARRAY "event_ml_adapter_array" /* ML ops circular buffer */ @@ -54,6 +58,9 @@ struct ml_device_info { * be invoked if not already invoked */ uint16_t num_qpairs; + + /* Next queue pair to be processed */ + uint16_t next_queue_pair_id; } __rte_cache_aligned; struct event_ml_adapter { @@ -78,6 +85,9 @@ struct event_ml_adapter { /* ML device structure array */ struct ml_device_info *mldevs; + /* Next ML device to be processed */ + int16_t next_mldev_id; + /* Circular buffer for processing ML ops to eventdev */ struct ml_ops_circular_buffer ebuf; @@ -92,6 +102,26 @@ struct event_ml_adapter { /* No. of queue pairs configured */ uint16_t nb_qps; + + /* Per adapter EAL service ID */ + uint32_t service_id; + + /* Service initialization state */ + uint8_t service_initialized; + + /* Max ML ops processed in any service function invocation */ + uint32_t max_nb; + + /* Store event port's implicit release capability */ + uint8_t implicit_release_disabled; + + /* Flag to indicate backpressure at mldev +* Stop further dequeuing events from eventdev +*/ + bool stop_enq_to_mldev; + + /* Loop counter to flush ml ops */ + uint16_t transmit_loop_count; } __rte_cache_aligned; static struct event_ml_adapter **event_ml_adapter; @@ -133,6 +163,18 @@ emla_array_init(void) return 0; } +static inline bool +emla_circular_buffer_batch_ready(struct ml_ops_circular_buffer *bufp) +{ + return bufp->count >= ML_BATCH_SIZE; +} + +static inline bool +emla_circular_buffer_space_for_batch(struct ml_ops_circular_buffer *bufp) +{ + return (bufp->size - bufp->count) >= ML_BATCH_SIZE; +} + static inline int emla_circular_buffer_init(const char *name, struct ml_ops_circular_buffer *buf, uint16_t sz) { @@ -151,6 +193,49 @@ emla_circular_buffer_free(struct ml_ops_circular_buffer *buf) rte_free(buf->op_buffer); } +static inline int +emla_circular_buffer_add(struct ml_ops_circular_buffer *bufp, struct rte_ml_op *op) +{ + uint16_t *tail = &bufp->tail; + + bufp->op_buffer[*tail] = op; + + /* circular buffer, go round */ + *tail = (*tail + 1) % bufp->size; + bufp->count++; + + return 0; +} + +static inline int +emla_circular_buffer_flush_to_mldev(struct ml_ops_circular_buffer *bufp, uint8_t mldev_id, + uint16_t qp_id, uint16_t *nb_ops_flushed) +{ + uint16_t n = 0; + uint16_t *head = &bufp->head; + uint16_t *tail = &bufp->tail; + struct rte_ml_op **ops = bufp->op_buffer; + + if (*tail > *head) + n = *tail - *head; + else if (*tail < *head) + n = bufp->size - *head; + else { + *nb_ops_flushed = 0; + return 0; /* buffer empty */ + } + + *nb_ops_flushed = rte_ml_enqueue_burst(mldev_id, qp_id, &ops[*head], n); + bufp->count -= *nb_ops_flushed; + if (!bufp->count) { + *head = 0; + *tail = 0; + } else + *head = (*head + *nb_ops_flushed) % bufp->size; + + return *nb_ops_flushed == n ? 
0 : -1; +} + static int emla_default_config_cb(uint8_t id, uint8_t evdev_id, struct rte_event_ml_adapter_conf *conf, void *arg) @@ -361,6 +446,394 @@ rte_event_ml_adapter_event_port_get(uint8_t id, uint8_t *event_port_id) return 0; } +static inline unsigned int +emla_enq_to_mldev(struct event_ml_adapter *adapter, struct rte_event *ev, unsigned int cnt) +{ + union rte_event_ml_metadata *m_data = NULL; + struct ml_queue_pair_info *qp_info = NULL; + struct rte_ml_op *ml_op; + unsigned int i, n; + uint16_t qp_id, nb_enqueued = 0; + int16_t mldev_id; + int ret; + + ret = 0; + n = 0; + + for (i = 0; i < cnt; i++) { + ml_op = ev[i].event_ptr; + if (ml_op == NULL) + continue; + + if (ml_op->private_data_offset) + m_data = (unio
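For illustration, a minimal standalone sketch (not part of the patch) of the head/tail arithmetic used by emla_circular_buffer_flush_to_mldev() above: a single call flushes only the contiguous region starting at head, a wrapped tail is picked up by the next invocation, and, as in the patch, head == tail is treated as empty.

```c
#include <stdint.h>
#include <stdio.h>

/* Standalone illustration of the flush helper's head/tail arithmetic:
 * only the contiguous region starting at head is flushed per call. */
static uint16_t
contiguous_ops(uint16_t head, uint16_t tail, uint16_t size)
{
	if (tail > head)
		return tail - head;   /* no wrap: slots head..tail-1 */
	if (tail < head)
		return size - head;   /* wrapped: slots head..size-1 first */
	return 0;                     /* head == tail: treated as empty */
}

int
main(void)
{
	printf("%u\n", (unsigned)contiguous_ops(10, 30, 64)); /* 20 */
	printf("%u\n", (unsigned)contiguous_ops(60, 4, 64));  /* 4: flush 60..63, wrap on next call */
	return 0;
}
```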
[PATCH 07/11] event/ml: add adapter start and stop
Added ML adapter start and stop functions. Signed-off-by: Srikanth Yalavarthi --- lib/eventdev/eventdev_pmd.h | 42 lib/eventdev/rte_event_ml_adapter.c | 75 + 2 files changed, 117 insertions(+) diff --git a/lib/eventdev/eventdev_pmd.h b/lib/eventdev/eventdev_pmd.h index 48e970a5097..44f26473075 100644 --- a/lib/eventdev/eventdev_pmd.h +++ b/lib/eventdev/eventdev_pmd.h @@ -1599,6 +1599,44 @@ typedef int (*eventdev_ml_adapter_queue_pair_del_t)(const struct rte_eventdev *d const struct rte_ml_dev *cdev, int32_t queue_pair_id); +/** + * Start ML adapter. This callback is invoked if + * the caps returned from rte_event_ml_adapter_caps_get(.., mldev_id) + * has RTE_EVENT_ML_ADAPTER_CAP_INTERNAL_PORT_* set and queue pairs + * from mldev_id have been added to the event device. + * + * @param dev + * Event device pointer + * + * @param mldev + * ML device pointer + * + * @return + * - 0: Success, ML adapter started successfully. + * - <0: Error code returned by the driver function. + */ +typedef int (*eventdev_ml_adapter_start_t)(const struct rte_eventdev *dev, + const struct rte_ml_dev *mldev); + +/** + * Stop ML adapter. This callback is invoked if + * the caps returned from rte_event_ml_adapter_caps_get(.., mldev_id) + * has RTE_EVENT_ML_ADAPTER_CAP_INTERNAL_PORT_* set and queue pairs + * from mldev_id have been added to the event device. + * + * @param dev + * Event device pointer + * + * @param mldev + * ML device pointer + * + * @return + * - 0: Success, ML adapter stopped successfully. + * - <0: Error code returned by the driver function. + */ +typedef int (*eventdev_ml_adapter_stop_t)(const struct rte_eventdev *dev, + const struct rte_ml_dev *mldev); + /** Event device operations function pointer table */ struct eventdev_ops { eventdev_info_get_t dev_infos_get; /**< Get device info. 
*/ @@ -1744,6 +1782,10 @@ struct eventdev_ops { /**< Add queue pair to ML adapter */ eventdev_ml_adapter_queue_pair_del_t ml_adapter_queue_pair_del; /**< Delete queue pair from ML adapter */ + eventdev_ml_adapter_start_t ml_adapter_start; + /**< Start ML adapter */ + eventdev_ml_adapter_stop_t ml_adapter_stop; + /**< Stop ML adapter */ eventdev_selftest dev_selftest; /**< Start eventdev Selftest */ diff --git a/lib/eventdev/rte_event_ml_adapter.c b/lib/eventdev/rte_event_ml_adapter.c index 95f566b1025..60c10caef68 100644 --- a/lib/eventdev/rte_event_ml_adapter.c +++ b/lib/eventdev/rte_event_ml_adapter.c @@ -61,6 +61,14 @@ struct ml_device_info { /* Next queue pair to be processed */ uint16_t next_queue_pair_id; + + /* Set to indicate processing has been started */ + uint8_t dev_started; + + /* Set to indicate mldev->eventdev packet +* transfer uses a hardware mechanism +*/ + uint8_t internal_event_port; } __rte_cache_aligned; struct event_ml_adapter { @@ -1071,3 +1079,70 @@ rte_event_ml_adapter_queue_pair_del(uint8_t id, int16_t mldev_id, int32_t queue_ return ret; } + +static int +emla_adapter_ctrl(uint8_t id, int start) +{ + struct event_ml_adapter *adapter; + struct ml_device_info *dev_info; + struct rte_eventdev *dev; + int stop = !start; + int use_service; + uint32_t i; + + if (!emla_valid_id(id)) { + RTE_EDEV_LOG_ERR("Invalid ML adapter id = %d", id); + return -EINVAL; + } + + adapter = emla_id_to_adapter(id); + if (adapter == NULL) + return -EINVAL; + + dev = &rte_eventdevs[adapter->eventdev_id]; + + use_service = 0; + for (i = 0; i < rte_ml_dev_count(); i++) { + dev_info = &adapter->mldevs[i]; + /* if start check for num queue pairs */ + if (start && !dev_info->num_qpairs) + continue; + /* if stop check if dev has been started */ + if (stop && !dev_info->dev_started) + continue; + use_service |= !dev_info->internal_event_port; + dev_info->dev_started = start; + if (dev_info->internal_event_port == 0) + continue; + start ? (*dev->dev_ops->ml_adapter_start)(dev, &dev_info->dev[i]) : + (*dev->dev_ops->ml_adapter_stop)(dev, &dev_info->dev[i]); + } + + if (use_service) + rte_service_runstate_set(adapter->service_id, start); + + return 0; +} + +int +rte_event_ml_adapter_start(uint8_t id) +{ + struct event_ml_adapter *adapter; + + if (!emla_valid_id(id)) { + RTE_EDEV_LOG_ERR("Invalid ML adapter id = %d", id); + retu
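For context, a minimal usage sketch (not part of the patch) of the lifecycle these functions enable; the adapter id is assumed to come from the create step earlier in the series.

```c
#include <rte_event_ml_adapter.h>

/* Sketch: start the adapter after its queue pairs are added,
 * stop it before teardown. */
static int
ml_adapter_run(uint8_t id)
{
	int ret;

	ret = rte_event_ml_adapter_start(id);
	if (ret < 0)
		return ret;

	/* ... submit inference work while the adapter is running ... */

	return rte_event_ml_adapter_stop(id);
}
```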
[PATCH 08/11] event/ml: add support to get adapter service ID
Added support to get ML adapter service ID. Signed-off-by: Srikanth Yalavarthi --- lib/eventdev/rte_event_ml_adapter.c | 20 1 file changed, 20 insertions(+) diff --git a/lib/eventdev/rte_event_ml_adapter.c b/lib/eventdev/rte_event_ml_adapter.c index 60c10caef68..474aeb6325b 100644 --- a/lib/eventdev/rte_event_ml_adapter.c +++ b/lib/eventdev/rte_event_ml_adapter.c @@ -1080,6 +1080,26 @@ rte_event_ml_adapter_queue_pair_del(uint8_t id, int16_t mldev_id, int32_t queue_ return ret; } +int +rte_event_ml_adapter_service_id_get(uint8_t id, uint32_t *service_id) +{ + struct event_ml_adapter *adapter; + + if (!emla_valid_id(id)) { + RTE_EDEV_LOG_ERR("Invalid ML adapter id = %d", id); + return -EINVAL; + } + + adapter = emla_id_to_adapter(id); + if (adapter == NULL || service_id == NULL) + return -EINVAL; + + if (adapter->service_initialized) + *service_id = adapter->service_id; + + return adapter->service_initialized ? 0 : -ESRCH; +} + static int emla_adapter_ctrl(uint8_t id, int start) { -- 2.42.0
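A usage sketch, assuming the same service-core conventions as the other eventdev adapters: when -ESRCH is returned the adapter has no service function (internal port), otherwise the service must be mapped to a service lcore and run.

```c
#include <errno.h>
#include <rte_service.h>
#include <rte_event_ml_adapter.h>

/* Sketch: map a SW adapter's service to a service lcore and start it. */
static int
ml_adapter_service_setup(uint8_t adapter_id, uint32_t lcore)
{
	uint32_t service_id;
	int ret;

	ret = rte_event_ml_adapter_service_id_get(adapter_id, &service_id);
	if (ret == -ESRCH)
		return 0; /* no service: adapter uses an internal HW port */
	if (ret < 0)
		return ret;

	ret = rte_service_lcore_add(lcore);
	if (ret < 0 && ret != -EALREADY)
		return ret;

	ret = rte_service_map_lcore_set(service_id, lcore, 1);
	if (ret < 0)
		return ret;

	return rte_service_lcore_start(lcore);
}
```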
[PATCH 09/11] event/ml: add support for runtime params
Added support to set and get runtime params for ML adapter. Signed-off-by: Srikanth Yalavarthi --- lib/eventdev/rte_event_ml_adapter.c | 99 + 1 file changed, 99 insertions(+) diff --git a/lib/eventdev/rte_event_ml_adapter.c b/lib/eventdev/rte_event_ml_adapter.c index 474aeb6325b..feb488f730a 100644 --- a/lib/eventdev/rte_event_ml_adapter.c +++ b/lib/eventdev/rte_event_ml_adapter.c @@ -1166,3 +1166,102 @@ rte_event_ml_adapter_stop(uint8_t id) { return emla_adapter_ctrl(id, 0); } + +#define DEFAULT_MAX_NB 128 + +int +rte_event_ml_adapter_runtime_params_init(struct rte_event_ml_adapter_runtime_params *params) +{ + if (params == NULL) + return -EINVAL; + + memset(params, 0, sizeof(*params)); + params->max_nb = DEFAULT_MAX_NB; + + return 0; +} + +static int +ml_adapter_cap_check(struct event_ml_adapter *adapter) +{ + uint32_t caps; + int ret; + + if (!adapter->nb_qps) + return -EINVAL; + + ret = rte_event_ml_adapter_caps_get(adapter->eventdev_id, adapter->next_mldev_id, &caps); + if (ret) { + RTE_EDEV_LOG_ERR("Failed to get adapter caps dev %" PRIu8 " mldev %" PRIu8, +adapter->eventdev_id, adapter->next_mldev_id); + return ret; + } + + if ((caps & RTE_EVENT_ML_ADAPTER_CAP_INTERNAL_PORT_OP_FWD) || + (caps & RTE_EVENT_ML_ADAPTER_CAP_INTERNAL_PORT_OP_NEW)) + return -ENOTSUP; + + return 0; +} + +int +rte_event_ml_adapter_runtime_params_set(uint8_t id, + struct rte_event_ml_adapter_runtime_params *params) +{ + struct event_ml_adapter *adapter; + int ret; + + if (!emla_valid_id(id)) { + RTE_EDEV_LOG_ERR("Invalid ML adapter id = %d", id); + return -EINVAL; + } + + if (params == NULL) { + RTE_EDEV_LOG_ERR("params pointer is NULL"); + return -EINVAL; + } + + adapter = emla_id_to_adapter(id); + if (adapter == NULL) + return -EINVAL; + + ret = ml_adapter_cap_check(adapter); + if (ret) + return ret; + + rte_spinlock_lock(&adapter->lock); + adapter->max_nb = params->max_nb; + rte_spinlock_unlock(&adapter->lock); + + return 0; +} + +int +rte_event_ml_adapter_runtime_params_get(uint8_t id, + struct rte_event_ml_adapter_runtime_params *params) +{ + struct event_ml_adapter *adapter; + int ret; + + if (!emla_valid_id(id)) { + RTE_EDEV_LOG_ERR("Invalid ML adapter id = %d", id); + return -EINVAL; + } + + if (params == NULL) { + RTE_EDEV_LOG_ERR("params pointer is NULL"); + return -EINVAL; + } + + adapter = emla_id_to_adapter(id); + if (adapter == NULL) + return -EINVAL; + + ret = ml_adapter_cap_check(adapter); + if (ret) + return ret; + + params->max_nb = adapter->max_nb; + + return 0; +} -- 2.42.0
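A usage sketch of the new API pair; per the cap check above, tuning applies only to software (service-based) adapters.

```c
#include <rte_event_ml_adapter.h>

/* Sketch: raise the service function's per-invocation processing
 * limit (max_nb) at runtime. */
static int
ml_adapter_tune(uint8_t id, uint32_t max_nb)
{
	struct rte_event_ml_adapter_runtime_params params;
	int ret;

	ret = rte_event_ml_adapter_runtime_params_init(&params);
	if (ret < 0)
		return ret;

	params.max_nb = max_nb;
	return rte_event_ml_adapter_runtime_params_set(id, &params);
}
```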
[PATCH 0/4] Implementation of CNXK ML event adapter driver
This series of patches is an implementation of event ML adapter for Marvell's Octeon platform. Srikanth Yalavarthi (4): event/cnxk: add ML adapter capabilities get event/cnxk: implement queue pair add and delete ml/cnxk: add adapter enqueue function ml/cnxk: add adapter dequeue function drivers/event/cnxk/cn10k_eventdev.c | 121 +++ drivers/event/cnxk/cn10k_worker.h | 3 + drivers/event/cnxk/cnxk_eventdev.h | 4 + drivers/event/cnxk/meson.build | 2 +- drivers/ml/cnxk/cn10k_ml_event_dp.h | 18 drivers/ml/cnxk/cn10k_ml_ops.c | 146 +++- drivers/ml/cnxk/cn10k_ml_ops.h | 3 + drivers/ml/cnxk/cnxk_ml_ops.h | 20 drivers/ml/cnxk/meson.build | 2 +- drivers/ml/cnxk/version.map | 8 ++ 10 files changed, 320 insertions(+), 7 deletions(-) create mode 100644 drivers/ml/cnxk/cn10k_ml_event_dp.h create mode 100644 drivers/ml/cnxk/version.map -- 2.42.0
[PATCH 2/4] event/cnxk: implement queue pair add and delete
Added structures for ML event adapter. Implemented ML event adapter queue-pair add and delete functions. Signed-off-by: Srikanth Yalavarthi --- drivers/event/cnxk/cn10k_eventdev.c | 103 drivers/event/cnxk/cnxk_eventdev.h | 4 ++ drivers/ml/cnxk/cnxk_ml_ops.h | 12 3 files changed, 119 insertions(+) diff --git a/drivers/event/cnxk/cn10k_eventdev.c b/drivers/event/cnxk/cn10k_eventdev.c index 09eff569052..201972cec9e 100644 --- a/drivers/event/cnxk/cn10k_eventdev.c +++ b/drivers/event/cnxk/cn10k_eventdev.c @@ -1033,6 +1033,107 @@ cn10k_ml_adapter_caps_get(const struct rte_eventdev *event_dev, const struct rte return 0; } +static int +ml_adapter_qp_free(struct cnxk_ml_qp *qp) +{ + rte_mempool_free(qp->mla.req_mp); + qp->mla.enabled = false; + + return 0; +} + +static int +ml_adapter_qp_setup(const struct rte_ml_dev *mldev, struct cnxk_ml_qp *qp) +{ + char name[RTE_MEMPOOL_NAMESIZE]; + uint32_t cache_size, nb_req; + unsigned int req_size; + + snprintf(name, RTE_MEMPOOL_NAMESIZE, "cnxk_mla_req_%u_%u", mldev->data->dev_id, qp->id); + req_size = sizeof(struct cn10k_ml_req); + cache_size = RTE_MEMPOOL_CACHE_MAX_SIZE; + nb_req = cache_size * rte_lcore_count(); + qp->mla.req_mp = rte_mempool_create(name, nb_req, req_size, cache_size, 0, NULL, NULL, NULL, + NULL, rte_socket_id(), 0); + if (qp->mla.req_mp == NULL) + return -ENOMEM; + + qp->mla.enabled = true; + + return 0; +} + +static int +cn10k_ml_adapter_qp_del(const struct rte_eventdev *event_dev, const struct rte_ml_dev *mldev, + int32_t queue_pair_id) +{ + struct cnxk_ml_qp *qp; + + CNXK_VALID_DEV_OR_ERR_RET(event_dev->dev, "event_cn10k", EINVAL); + CNXK_VALID_DEV_OR_ERR_RET(mldev->device, "ml_cn10k", EINVAL); + + if (queue_pair_id == -1) { + uint16_t qp_id; + + for (qp_id = 0; qp_id < mldev->data->nb_queue_pairs; qp_id++) { + qp = mldev->data->queue_pairs[qp_id]; + if (qp->mla.enabled) + ml_adapter_qp_free(qp); + } + } else { + qp = mldev->data->queue_pairs[queue_pair_id]; + if (qp->mla.enabled) + ml_adapter_qp_free(qp); + } + + return 0; +} + +static int +cn10k_ml_adapter_qp_add(const struct rte_eventdev *event_dev, const struct rte_ml_dev *mldev, + int32_t queue_pair_id, const struct rte_event *event) +{ + struct cnxk_sso_evdev *sso_evdev = cnxk_sso_pmd_priv(event_dev); + uint32_t adptr_xae_cnt = 0; + struct cnxk_ml_qp *qp; + int ret; + + PLT_SET_USED(event); + + CNXK_VALID_DEV_OR_ERR_RET(event_dev->dev, "event_cn10k", EINVAL); + CNXK_VALID_DEV_OR_ERR_RET(mldev->device, "ml_cn10k", EINVAL); + + sso_evdev->is_mla_internal_port = 1; + cn10k_sso_fp_fns_set((struct rte_eventdev *)(uintptr_t)event_dev); + + if (queue_pair_id == -1) { + uint16_t qp_id; + + for (qp_id = 0; qp_id < mldev->data->nb_queue_pairs; qp_id++) { + qp = mldev->data->queue_pairs[qp_id]; + ret = ml_adapter_qp_setup(mldev, qp); + if (ret != 0) { + cn10k_ml_adapter_qp_del(event_dev, mldev, -1); + return ret; + } + adptr_xae_cnt += qp->mla.req_mp->size; + } + } else { + qp = mldev->data->queue_pairs[queue_pair_id]; + ret = ml_adapter_qp_setup(mldev, qp); + if (ret != 0) + return ret; + + adptr_xae_cnt = qp->mla.req_mp->size; + } + + /* Update ML adapter XAE count */ + sso_evdev->adptr_xae_cnt += adptr_xae_cnt; + cnxk_sso_xae_reconfigure((struct rte_eventdev *)(uintptr_t)event_dev); + + return ret; +} + static struct eventdev_ops cn10k_sso_dev_ops = { .dev_infos_get = cn10k_sso_info_get, .dev_configure = cn10k_sso_dev_configure, @@ -1075,6 +1176,8 @@ static struct eventdev_ops cn10k_sso_dev_ops = { .crypto_adapter_vector_limits_get = cn10k_crypto_adapter_vec_limits, 
.ml_adapter_caps_get = cn10k_ml_adapter_caps_get, + .ml_adapter_queue_pair_add = cn10k_ml_adapter_qp_add, + .ml_adapter_queue_pair_del = cn10k_ml_adapter_qp_del, .xstats_get = cnxk_sso_xstats_get, .xstats_reset = cnxk_sso_xstats_reset, diff --git a/drivers/event/cnxk/cnxk_eventdev.h b/drivers/event/cnxk/cnxk_eventdev.h index d42d1afa1a1..bc51e952c9a 100644 --- a/drivers/event/cnxk/cnxk_eventdev.h +++ b/drivers/event/cnxk/cnxk_eventdev.h @@ -124,6 +124,10 @@ struct cnxk_sso_evdev { uint32_t gw_mode; uint16_t stash_cnt; str
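A hedged application-side sketch; the library-level prototype is assumed to mirror the driver callback above (adapter id, ML device id, queue pair id, event), with queue_pair_id == -1 meaning all queue pairs of the device.

```c
#include <rte_eventdev.h>
#include <rte_event_ml_adapter.h>

/* Sketch: bind every queue pair of an ML device to the adapter in
 * one call; -1 selects all queue pairs in the driver code above. */
static int
bind_all_qps(uint8_t adapter_id, int16_t mldev_id, uint8_t ev_queue)
{
	const struct rte_event ev = {
		.queue_id = ev_queue,
		.sched_type = RTE_SCHED_TYPE_ATOMIC,
		.priority = RTE_EVENT_DEV_PRIORITY_NORMAL,
	};

	return rte_event_ml_adapter_queue_pair_add(adapter_id, mldev_id, -1, &ev);
}
```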
[PATCH 1/4] event/cnxk: add ML adapter capabilities get
Implemented driver function to get ML adapter capabilities. Signed-off-by: Srikanth Yalavarthi --- Depends-on: series-30752 ("Introduce Event ML Adapter") drivers/event/cnxk/cn10k_eventdev.c | 15 +++ drivers/event/cnxk/meson.build | 2 +- drivers/ml/cnxk/cn10k_ml_ops.h | 2 ++ 3 files changed, 18 insertions(+), 1 deletion(-) diff --git a/drivers/event/cnxk/cn10k_eventdev.c b/drivers/event/cnxk/cn10k_eventdev.c index bb0c9105535..09eff569052 100644 --- a/drivers/event/cnxk/cn10k_eventdev.c +++ b/drivers/event/cnxk/cn10k_eventdev.c @@ -6,6 +6,7 @@ #include "cn10k_worker.h" #include "cn10k_ethdev.h" #include "cn10k_cryptodev_ops.h" +#include "cnxk_ml_ops.h" #include "cnxk_eventdev.h" #include "cnxk_worker.h" @@ -1020,6 +1021,18 @@ cn10k_crypto_adapter_vec_limits(const struct rte_eventdev *event_dev, return 0; } +static int +cn10k_ml_adapter_caps_get(const struct rte_eventdev *event_dev, const struct rte_ml_dev *mldev, + uint32_t *caps) +{ + CNXK_VALID_DEV_OR_ERR_RET(event_dev->dev, "event_cn10k", EINVAL); + CNXK_VALID_DEV_OR_ERR_RET(mldev->device, "ml_cn10k", EINVAL); + + *caps = RTE_EVENT_ML_ADAPTER_CAP_INTERNAL_PORT_OP_FWD; + + return 0; +} + static struct eventdev_ops cn10k_sso_dev_ops = { .dev_infos_get = cn10k_sso_info_get, .dev_configure = cn10k_sso_dev_configure, @@ -1061,6 +1074,8 @@ static struct eventdev_ops cn10k_sso_dev_ops = { .crypto_adapter_queue_pair_del = cn10k_crypto_adapter_qp_del, .crypto_adapter_vector_limits_get = cn10k_crypto_adapter_vec_limits, + .ml_adapter_caps_get = cn10k_ml_adapter_caps_get, + .xstats_get = cnxk_sso_xstats_get, .xstats_reset = cnxk_sso_xstats_reset, .xstats_get_names = cnxk_sso_xstats_get_names, diff --git a/drivers/event/cnxk/meson.build b/drivers/event/cnxk/meson.build index 13281d687f7..e09ad97b660 100644 --- a/drivers/event/cnxk/meson.build +++ b/drivers/event/cnxk/meson.build @@ -316,7 +316,7 @@ foreach flag: extra_flags endforeach headers = files('rte_pmd_cnxk_eventdev.h') -deps += ['bus_pci', 'common_cnxk', 'net_cnxk', 'crypto_cnxk'] +deps += ['bus_pci', 'common_cnxk', 'net_cnxk', 'crypto_cnxk', 'ml_cnxk'] require_iova_in_mbuf = false diff --git a/drivers/ml/cnxk/cn10k_ml_ops.h b/drivers/ml/cnxk/cn10k_ml_ops.h index eb3e1c139c7..d225ed2098e 100644 --- a/drivers/ml/cnxk/cn10k_ml_ops.h +++ b/drivers/ml/cnxk/cn10k_ml_ops.h @@ -10,6 +10,8 @@ #include +#include "cnxk_ml_xstats.h" + struct cnxk_ml_dev; struct cnxk_ml_qp; struct cnxk_ml_model; -- 2.42.0
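A usage sketch of how an application might consume this capability before choosing its enqueue path.

```c
#include <stdbool.h>
#include <stdint.h>
#include <rte_event_ml_adapter.h>

/* Sketch: check whether ML ops can be forwarded to the mldev through
 * the adapter's internal port (what cn10k reports above). */
static bool
has_internal_port_fwd(uint8_t evdev_id, int16_t mldev_id)
{
	uint32_t caps = 0;

	if (rte_event_ml_adapter_caps_get(evdev_id, mldev_id, &caps) < 0)
		return false;

	return (caps & RTE_EVENT_ML_ADAPTER_CAP_INTERNAL_PORT_OP_FWD) != 0;
}
```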
[PATCH 3/4] ml/cnxk: add adapter enqueue function
Implemented ML adapter enqueue function. Rename internal fast-path JD preparation function for poll mode. Added JD preparation function for event mode. Updated meson build dependencies for ml/cnxk driver. Signed-off-by: Srikanth Yalavarthi --- drivers/event/cnxk/cn10k_eventdev.c | 3 + drivers/ml/cnxk/cn10k_ml_event_dp.h | 16 drivers/ml/cnxk/cn10k_ml_ops.c | 129 ++-- drivers/ml/cnxk/cn10k_ml_ops.h | 1 + drivers/ml/cnxk/cnxk_ml_ops.h | 8 ++ drivers/ml/cnxk/meson.build | 2 +- drivers/ml/cnxk/version.map | 7 ++ 7 files changed, 160 insertions(+), 6 deletions(-) create mode 100644 drivers/ml/cnxk/cn10k_ml_event_dp.h create mode 100644 drivers/ml/cnxk/version.map diff --git a/drivers/event/cnxk/cn10k_eventdev.c b/drivers/event/cnxk/cn10k_eventdev.c index 201972cec9e..3b5dce23fe9 100644 --- a/drivers/event/cnxk/cn10k_eventdev.c +++ b/drivers/event/cnxk/cn10k_eventdev.c @@ -6,6 +6,7 @@ #include "cn10k_worker.h" #include "cn10k_ethdev.h" #include "cn10k_cryptodev_ops.h" +#include "cn10k_ml_event_dp.h" #include "cnxk_ml_ops.h" #include "cnxk_eventdev.h" #include "cnxk_worker.h" @@ -478,6 +479,8 @@ cn10k_sso_fp_fns_set(struct rte_eventdev *event_dev) else event_dev->ca_enqueue = cn10k_cpt_sg_ver1_crypto_adapter_enqueue; + event_dev->mla_enqueue = cn10k_ml_adapter_enqueue; + if (dev->tx_offloads & NIX_TX_MULTI_SEG_F) CN10K_SET_EVDEV_ENQ_OP(dev, event_dev->txa_enqueue, sso_hws_tx_adptr_enq_seg); else diff --git a/drivers/ml/cnxk/cn10k_ml_event_dp.h b/drivers/ml/cnxk/cn10k_ml_event_dp.h new file mode 100644 index 000..bf7fc57bceb --- /dev/null +++ b/drivers/ml/cnxk/cn10k_ml_event_dp.h @@ -0,0 +1,16 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(C) 2024 Marvell. + */ + +#ifndef _CN10K_ML_EVENT_DP_H_ +#define _CN10K_ML_EVENT_DP_H_ + +#include + +#include +#include + +__rte_internal +__rte_hot uint16_t cn10k_ml_adapter_enqueue(void *ws, struct rte_event ev[], uint16_t nb_events); + +#endif /* _CN10K_ML_EVENT_DP_H_ */ diff --git a/drivers/ml/cnxk/cn10k_ml_ops.c b/drivers/ml/cnxk/cn10k_ml_ops.c index 834e55e88e9..4bc17eaa8c4 100644 --- a/drivers/ml/cnxk/cn10k_ml_ops.c +++ b/drivers/ml/cnxk/cn10k_ml_ops.c @@ -2,11 +2,13 @@ * Copyright (c) 2022 Marvell. 
*/ +#include #include #include #include +#include "cn10k_ml_event_dp.h" #include "cnxk_ml_dev.h" #include "cnxk_ml_model.h" #include "cnxk_ml_ops.h" @@ -144,8 +146,8 @@ cn10k_ml_prep_sp_job_descriptor(struct cnxk_ml_dev *cnxk_mldev, struct cnxk_ml_l } static __rte_always_inline void -cn10k_ml_prep_fp_job_descriptor(struct cnxk_ml_dev *cnxk_mldev, struct cnxk_ml_req *req, - uint16_t index, void *input, void *output, uint16_t nb_batches) +cn10k_ml_prep_fp_job_descriptor_poll(struct cnxk_ml_dev *cnxk_mldev, struct cnxk_ml_req *req, +uint16_t index, void *input, void *output, uint16_t nb_batches) { struct cn10k_ml_dev *cn10k_mldev; @@ -166,6 +168,33 @@ cn10k_ml_prep_fp_job_descriptor(struct cnxk_ml_dev *cnxk_mldev, struct cnxk_ml_r req->cn10k_req.jd.model_run.num_batches = nb_batches; } +static __rte_always_inline void +cn10k_ml_prep_fp_job_descriptor_event(struct cnxk_ml_dev *cnxk_mldev, struct cnxk_ml_req *req, + uint16_t index, void *input, void *output, uint16_t nb_batches + + , + uint64_t *compl_W0) +{ + + struct cn10k_ml_dev *cn10k_mldev; + + cn10k_mldev = &cnxk_mldev->cn10k_mldev; + + req->cn10k_req.jd.hdr.jce.w0.u64 = *compl_W0; + req->cn10k_req.jd.hdr.jce.w1.s.wqp = PLT_U64_CAST(req); + req->cn10k_req.jd.hdr.model_id = index; + req->cn10k_req.jd.hdr.job_type = ML_CN10K_JOB_TYPE_MODEL_RUN; + req->cn10k_req.jd.hdr.fp_flags = ML_FLAGS_SSO_COMPL; + req->cn10k_req.jd.hdr.sp_flags = 0x0; + req->cn10k_req.jd.hdr.result = + roc_ml_addr_ap2mlip(&cn10k_mldev->roc, &req->cn10k_req.result); + req->cn10k_req.jd.model_run.input_ddr_addr = + PLT_U64_CAST(roc_ml_addr_ap2mlip(&cn10k_mldev->roc, input)); + req->cn10k_req.jd.model_run.output_ddr_addr = + PLT_U64_CAST(roc_ml_addr_ap2mlip(&cn10k_mldev->roc, output)); + req->cn10k_req.jd.model_run.num_batches = nb_batches; +} + static void cn10k_ml_xstats_layer_name_update(struct cnxk_ml_dev *cnxk_mldev, uint16_t model_id, uint16_t layer_id) @@ -1305,13 +1334,16 @@ cn10k_ml_enqueue_single(struct cnxk_ml_dev *cnxk_mldev, struct rte_ml_op *op, ui model = cnxk_mldev->mldev->data->models[op->model_id]; model->set_poll_addr(req); - cn10k_ml_prep_fp_job_descriptor(cnxk_mldev, req, model->layer[layer_id].index, - op->input[0]->addr, op-
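A hedged sketch of the application side, assuming the OP_FWD model of this series: the rte_ml_op is wrapped in an event on the application's own event port and the PMD routes it through cn10k_ml_adapter_enqueue(); port and queue ids are placeholders.

```c
#include <rte_eventdev.h>
#include <rte_mldev.h>

/* Sketch: forward one ML op to the mldev via the event device. */
static inline uint16_t
submit_ml_op(uint8_t evdev_id, uint8_t port_id, uint8_t ml_ev_queue,
	     struct rte_ml_op *op)
{
	struct rte_event ev = {
		.queue_id = ml_ev_queue,
		.sched_type = RTE_SCHED_TYPE_ATOMIC,
		.event_type = RTE_EVENT_TYPE_MLDEV,
		.op = RTE_EVENT_OP_FORWARD,
		.event_ptr = op,
	};

	return rte_event_enqueue_burst(evdev_id, port_id, &ev, 1);
}
```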
[PATCH 4/4] ml/cnxk: add adapter dequeue function
Implemented ML adapter dequeue function. Signed-off-by: Srikanth Yalavarthi --- drivers/event/cnxk/cn10k_worker.h | 3 +++ drivers/ml/cnxk/cn10k_ml_event_dp.h | 2 ++ drivers/ml/cnxk/cn10k_ml_ops.c | 17 + drivers/ml/cnxk/version.map | 1 + 4 files changed, 23 insertions(+) diff --git a/drivers/event/cnxk/cn10k_worker.h b/drivers/event/cnxk/cn10k_worker.h index 8aa916fa129..1a0ca7f9493 100644 --- a/drivers/event/cnxk/cn10k_worker.h +++ b/drivers/event/cnxk/cn10k_worker.h @@ -7,6 +7,7 @@ #include #include "cn10k_cryptodev_event_dp.h" +#include "cn10k_ml_event_dp.h" #include "cn10k_rx.h" #include "cnxk_worker.h" #include "cn10k_eventdev.h" @@ -236,6 +237,8 @@ cn10k_sso_hws_post_process(struct cn10k_sso_hws *ws, uint64_t *u64, /* Mark vector mempool object as get */ RTE_MEMPOOL_CHECK_COOKIES(rte_mempool_from_obj((void *)u64[1]), (void **)&u64[1], 1, 1); + } else if (CNXK_EVENT_TYPE_FROM_TAG(u64[0]) == RTE_EVENT_TYPE_MLDEV) { + u64[1] = cn10k_ml_adapter_dequeue(u64[1]); } } diff --git a/drivers/ml/cnxk/cn10k_ml_event_dp.h b/drivers/ml/cnxk/cn10k_ml_event_dp.h index bf7fc57bceb..0ff92091296 100644 --- a/drivers/ml/cnxk/cn10k_ml_event_dp.h +++ b/drivers/ml/cnxk/cn10k_ml_event_dp.h @@ -12,5 +12,7 @@ __rte_internal __rte_hot uint16_t cn10k_ml_adapter_enqueue(void *ws, struct rte_event ev[], uint16_t nb_events); +__rte_internal +__rte_hot uintptr_t cn10k_ml_adapter_dequeue(uintptr_t get_work1); #endif /* _CN10K_ML_EVENT_DP_H_ */ diff --git a/drivers/ml/cnxk/cn10k_ml_ops.c b/drivers/ml/cnxk/cn10k_ml_ops.c index 4bc17eaa8c4..c33a7a85987 100644 --- a/drivers/ml/cnxk/cn10k_ml_ops.c +++ b/drivers/ml/cnxk/cn10k_ml_ops.c @@ -1660,3 +1660,20 @@ cn10k_ml_adapter_enqueue(void *ws, struct rte_event ev[], uint16_t nb_events) return count; } + +__rte_hot uintptr_t +cn10k_ml_adapter_dequeue(uintptr_t get_work1) +{ + struct cnxk_ml_dev *cnxk_mldev; + struct cnxk_ml_req *req; + struct cnxk_ml_qp *qp; + + req = (struct cnxk_ml_req *)(get_work1); + cnxk_mldev = req->cnxk_mldev; + qp = cnxk_mldev->mldev->data->queue_pairs[req->qp_id]; + + cn10k_ml_result_update(cnxk_mldev, req->qp_id, req); + rte_mempool_put(qp->mla.req_mp, req); + + return (uintptr_t)req->op; +} diff --git a/drivers/ml/cnxk/version.map b/drivers/ml/cnxk/version.map index c2cacaf8c65..97c2c149998 100644 --- a/drivers/ml/cnxk/version.map +++ b/drivers/ml/cnxk/version.map @@ -2,6 +2,7 @@ INTERNAL { global: cn10k_ml_adapter_enqueue; + cn10k_ml_adapter_dequeue; local: *; }; -- 2.42.0
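A sketch of the worker side under the same assumptions: the completion arrives as a normal event whose event_ptr is the finished rte_ml_op, restored by the dequeue hook above.

```c
#include <rte_eventdev.h>
#include <rte_mldev.h>

/* Sketch: drain ML completions on a worker event port. */
static void
drain_ml_completions(uint8_t evdev_id, uint8_t port_id)
{
	struct rte_event ev;

	while (rte_event_dequeue_burst(evdev_id, port_id, &ev, 1, 0) != 0) {
		if (ev.event_type != RTE_EVENT_TYPE_MLDEV)
			continue; /* not an ML completion */

		struct rte_ml_op *op = ev.event_ptr;

		if (op->status == RTE_ML_OP_STATUS_SUCCESS) {
			/* consume the op's output buffers here */
		}
	}
}
```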
[PATCH 1/1] buildtools: remove absolute paths from pc file
When linking with non-versioned libraries, the absolute paths of the libraries are added to libdpdk.pc. This patch replaces each absolute path with the corresponding linker flag; for example, '/usr/lib/libarchive.so' becomes '-larchive'. https://github.com/mesonbuild/meson/issues/7766 Signed-off-by: Srikanth Yalavarthi --- buildtools/pkg-config/set-static-linker-flags.py | 2 ++ 1 file changed, 2 insertions(+) diff --git a/buildtools/pkg-config/set-static-linker-flags.py b/buildtools/pkg-config/set-static-linker-flags.py index 2745db34c29..e8804353383 100644 --- a/buildtools/pkg-config/set-static-linker-flags.py +++ b/buildtools/pkg-config/set-static-linker-flags.py @@ -9,6 +9,8 @@ def fix_ldflag(f): +if (f.startswith('/') and (f.endswith('.so') or f.endswith('.a'))): +return f.split('/', -1)[-1].split('.', -1)[0].replace('lib', '-l', 1) if not f.startswith('-lrte_'): return f return '-l:lib' + f[2:] + '.a' -- 2.42.0
[PATCH] tap: check that file is BPF arch before extracting
The script to extract BPF instructions from a compiled ELF file would break if the ELF file was incorrectly built. Add a simple check to give a better error message. Fixes: 4e679a5f1212 ("net/tap: add infrastructure to build BPF filter") Signed-off-by: Stephen Hemminger --- drivers/net/tap/bpf/bpf_extract.py | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/net/tap/bpf/bpf_extract.py b/drivers/net/tap/bpf/bpf_extract.py index b630c42b809f..72c15cf7ad12 100644 --- a/drivers/net/tap/bpf/bpf_extract.py +++ b/drivers/net/tap/bpf/bpf_extract.py @@ -77,6 +77,8 @@ def main(): write_header(out, args.source) for path in args.file: elffile = ELFFile(open_input(path)) +if elffile['e_machine'] != 'EM_BPF': +sys.exit(f'{path} is not BPF') sections = load_sections(elffile) for name, insns in sections: dump_section(name, insns, out) -- 2.43.0
RE: unnecessary rx callbacks when zero packets
> -Original Message- > From: Stephen Hemminger > Sent: Sunday, January 7, 2024 11:37 AM > To: dev@dpdk.org > Subject: unnecessary rx callbacks when zero packets > > I noticed while looking at packet capture that currently the receive callbacks > get called even if there are no packets. This seems unnecessary since if > nb_rx is > zero, then there are no packets to look at. My one concern is that an > application could be using callbacks as some form of scheduling mechanism > which would be broken. Is it possible that the callback functions are maintaining statistics on zero-packet polls? > > The change would be: > > > diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h index > 21e3a21903ec..f64bf977c46e 100644 > --- a/lib/ethdev/rte_ethdev.h > +++ b/lib/ethdev/rte_ethdev.h > @@ -6077,7 +6077,7 @@ rte_eth_rx_burst(uint16_t port_id, uint16_t > queue_id, > nb_rx = p->rx_pkt_burst(qd, rx_pkts, nb_pkts); > > #ifdef RTE_ETHDEV_RXTX_CALLBACKS > - { > + if (nb_rx > 0) { > void *cb; > > /* rte_memory_order_release memory order was used when the
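To make the concern concrete, a sketch of such a callback (standard rte_rx_callback_fn signature); it would silently stop counting empty polls if callbacks were skipped when nb_rx == 0.

```c
#include <rte_ethdev.h>
#include <rte_mbuf.h>

static uint64_t zero_polls;

/* Rx callback that counts polls which returned no packets. */
static uint16_t
count_empty_polls(uint16_t port_id, uint16_t queue_id,
		  struct rte_mbuf *pkts[], uint16_t nb_pkts,
		  uint16_t max_pkts, void *user_param)
{
	RTE_SET_USED(port_id);
	RTE_SET_USED(queue_id);
	RTE_SET_USED(pkts);
	RTE_SET_USED(max_pkts);
	RTE_SET_USED(user_param);

	if (nb_pkts == 0)
		zero_polls++;

	return nb_pkts;
}

/* Registered once at init time:
 * rte_eth_add_rx_callback(port_id, queue_id, count_empty_polls, NULL);
 */
```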
RE: DTS testpmd and SCAPY integration
> -Original Message- > From: Etelson, Gregory > Sent: Tuesday, December 26, 2023 1:32 AM > To: tho...@monjalon.net; Juraj Linkeš ; > Honnappa Nagarahalli ; Paul Szczepanek > ; Luca Vizzarro ; Yoan > Picchi ; Jeremy Spewock > ; Gregory Etelson ; Patrick > Robb ; c...@dpdk.org; dev@dpdk.org > Subject: DTS testpmd and SCAPY integration > > Hello, > > Consider an option to describe a DTS test with plain-text testpmd and SCAPY > commands. > > For example: > > Scenario: > - Configure a UDP packet in SCAPY and a flow in testpmd. > - Send the UDP packet and validate the testpmd output triggered by that packet. > > ```yaml > phase_0: > name: CONFIGURATION > tg: | > udp_pkt = Ether()/IP()/UDP(sport=31)/Raw('== TEST ==') > print('packet is ready') > dut: | > start > set verbose 1 > flow create 0 ingress pattern eth / ipv4 / udp src is 31 / end > actions queue > index 1 / end > result: > dut: 'Flow rule #0 created' > tg: 'packet is ready' > > phase_1: > name: SEND and VALIDATE > tg: sendp(udp_pkt, iface=pf0) > result: > dut: '- RSS queue=0x1 -' > ``` > > A test is described as a sequence of phases. > > Phase definition: > > > ``` > <phase_id>: # unique phase ID > name: # phase name > <APP1>: # application APP1 commands > ... > <APPy>: # application APPy commands > ... > <APPx>: # application APPx commands > > result: # optional phase results verification section > <APPx>: # APPx expected output > ... > <APPy>: # APPy expected output ``` > > - Application commands in a phase are executed sequentially, >in order of application IDs: <APPx> commands are executed >before <APPy> commands. > > - Application results in a phase are validated sequentially, >in order of application IDs: the <APPx> result is validated >before the <APPy> result. > > - An application result is a regular expression. > > > Test application definition: > ~~~ > > ``` > <app_id>: # unique application ID > agent: # mandatory application type identifier: {testpmd|scapy} > cmd: # optional application command template > ``` > > Example: > > ```yaml > > dut: > agent: testpmd > cmd: 'dpdk-testpmd -a pci0 -- -i --rxq=4 --txq=4' > > tg: > agent: scapy > ``` > > Test commands are not bound to a specific setup. > Therefore, the testpmd command line and the SCAPY sendp() function use encoding to > describe the relative interface position in a tested HBA. > > PCI encoding scheme for testpmd: > > > - PF PCI: `pciX` >Example: `pci0: ':08:00.0'` > > > - PCI SR-IOV: `pciXvfY` >Example: `pci0vf0: ':08:00.2'` > > Network devices encoding scheme for SCAPY: > - PF: `pfX` >Example: `pf0: enp8s0f0np0` > > > - PCI SR-IOV: `pfXvfY` >Example: `pf0vf0: enp5s0f0v0` > > > - Network device representor: `pfXrfY` > > Example: `pf0rf0: enp5s0f0npf0vf0` > > > Test execution requires an additional file to describe the tested setup.
> > Setup file format: > ~ > > ``` > <app_id>: # unique application ID > host: # hostname or IPvX address > path: # optional application path > hba: # optional HBA description > pmd: # PMD > hw: # HW type > ``` > > Example: > > ```yaml > dut: > host: 1.2.3.4 > path: /opt/dpdk.org/build/app > hba: > pmd: mlx5 > hw: mt4125 > tg: > host: ::1234 > ``` > > ```yaml > dut: > agent: testpmd > cmd: 'dpdk-testpmd -a pci0 -- -i --rxq=4 --txq=4' > tg: > agent: scapy > > test: > - > phases: [ *ref_phase0 ] > repeat: 1 > - > phases: [ *ref_phase1 ] > repeat: 3 > > phase_0: &ref_phase0 > name: CONFIGURATION > tg: | > udp_pkt = Ether()/IP()/UDP(sport=31)/Raw('== TEST ==') > print('packet is ready') > dut: | > start > set verbose 1 > flow create 0 ingress pattern eth / ipv4 / udp src is 31 / end > actions queue > index 1 / end > result: > dut: 'Flow rule #0 created' > tg: 'packet is ready' > > phase_1: &ref_phase1 > name: SEND and VALIDATE > tg: sendp(udp_pkt, iface=pf0) > result: > dut: '- RSS queue=0x1 -' > ``` > > The plain-text format provides a minimalistic and intuitive framework for DTS > tests. > DTS can use the plain-text testpmd/scapy command format in addition to the Python > framework. Hi Gregory, I do not fully understand your proposal, it will be helpful to join the DTS meetings to discuss this further. YAML has wide support built around it. By using our own text format, we will have to build the parsing support etc. ourselves. However, YAML is supposed to be easy to read and understand. Is it just a matter of getting used to it? Thank you, Honnappa > > Regards, > Gregory
RE: DTS testpmd and SCAPY integration
Hello Honnappa, [snip] Hi Gregory, I do not fully understand your proposal, it will be helpful to join the DTS meetings to discuss this further. Agree, let's discuss the proposal details during the DTS meeting. YAML has wide support built around it. By using our own text format, we will have to build the parsing support etc. ourselves. However, YAML is supposed to be easy to read and understand. Is it just a matter of getting used to it? I selected YAML for 2 reasons: * The plain and intuitive YAML format minimizes test metadata. By metadata I refer to control tags and markup characters that are not test commands. * YAML has a Python parser. Regards, Gregory
RE: [PATCH] net/ice: fix memory leak
> -Original Message- > From: Zhang, Qi Z > Sent: Monday, January 8, 2024 4:49 AM > To: Yang, Qiming ; Wu, Wenjun1 > > Cc: dev@dpdk.org; Zhang, Qi Z ; sta...@dpdk.org > Subject: [PATCH] net/ice: fix memory leak > > Free memory for the AQ buffer in ice_move_recfg_lan_txq(). Free memory for the > profile list in ice_tm_conf_uninit(). > > Fixes: 8c481c3bb65b ("net/ice: support queue and queue group bandwidth > limit") > Cc: sta...@dpdk.org > > Signed-off-by: Qi Zhang > --- > drivers/net/ice/ice_tm.c | 12 > 1 file changed, 12 insertions(+) > > diff --git a/drivers/net/ice/ice_tm.c b/drivers/net/ice/ice_tm.c index > b570798f07..c00ecb6a97 100644 > --- a/drivers/net/ice/ice_tm.c > +++ b/drivers/net/ice/ice_tm.c > @@ -59,8 +59,15 @@ void > ice_tm_conf_uninit(struct rte_eth_dev *dev) { > struct ice_pf *pf = ICE_DEV_PRIVATE_TO_PF(dev->data->dev_private); > + struct ice_tm_shaper_profile *shaper_profile; > struct ice_tm_node *tm_node; > > + /* clear profile */ > + while ((shaper_profile = TAILQ_FIRST(&pf- > >tm_conf.shaper_profile_list))) { > + TAILQ_REMOVE(&pf->tm_conf.shaper_profile_list, > shaper_profile, node); > + rte_free(shaper_profile); > + } > + > /* clear node configuration */ > while ((tm_node = TAILQ_FIRST(&pf->tm_conf.queue_list))) { > TAILQ_REMOVE(&pf->tm_conf.queue_list, tm_node, node); > @@ -636,6 +643,8 @@ static int ice_move_recfg_lan_txq(struct rte_eth_dev > *dev, > uint16_t buf_size = ice_struct_size(buf, txqs, 1); > > buf = (struct ice_aqc_move_txqs_data *)ice_malloc(hw, sizeof(*buf)); > + if (buf == NULL) > + return -ENOMEM; > > queue_parent_node = queue_sched_node->parent; > buf->src_teid = queue_parent_node->info.node_teid; > @@ -647,6 +656,7 @@ static int ice_move_recfg_lan_txq(struct rte_eth_dev > *dev, > NULL, buf, buf_size, &txqs_moved, > NULL); > if (ret || txqs_moved == 0) { > PMD_DRV_LOG(ERR, "move lan queue %u failed", queue_id); > + rte_free(buf); > return ICE_ERR_PARAM; > } > > @@ -656,12 +666,14 @@ static int ice_move_recfg_lan_txq(struct > rte_eth_dev *dev, > } else { > PMD_DRV_LOG(ERR, "invalid children number %d for > queue %u", > queue_parent_node->num_children, queue_id); > + rte_free(buf); > return ICE_ERR_PARAM; > } > dst_node->children[dst_node->num_children++] = > queue_sched_node; > queue_sched_node->parent = dst_node; > ice_sched_query_elem(hw, queue_sched_node->info.node_teid, > &queue_sched_node->info); > > + rte_free(buf); > return ret; > } > > -- > 2.31.1 Acked-by: Wenjun Wu
[PATCH] doc: update command scope information
From: Sunil Kumar Kori Set of CLI commands are classified into following types; - Commands which must be used in script only. - Commands which must be used via telnet session only. - Commands which can be used either in script or via telnet session. Rename "Dynamic" column to "Scope" to provide clear scope of commands. Signed-off-by: Sunil Kumar Kori --- doc/guides/tools/graph.rst | 211 +++-- 1 file changed, 108 insertions(+), 103 deletions(-) diff --git a/doc/guides/tools/graph.rst b/doc/guides/tools/graph.rst index 1855d12891..6c559afe35 100644 --- a/doc/guides/tools/graph.rst +++ b/doc/guides/tools/graph.rst @@ -154,109 +154,114 @@ file to express the requested use case configuration. .. table:: Exposed CLIs :widths: auto - +--+---+-+--+ - | Command| Description | Dynamic | Optional | - +==+===+=+==+ - | | graph [bsz ] | | Command to express the desired | No|No| - | | [tmo ] [coremask ]| | use case. Also enables/disable | | | - | | model pcap_enable| | pcap capturing. | | | - | | <0/1> num_pcap_pkts pcap_file| | | | - | | | | | | - +--+---+-+--+ - | graph start | | Command to start the graph. | No|No| - | | | This command triggers that no | | | - | | | more commands are left to be | | | - | | | parsed and graph initialization | | | - | | | can be started now. It must be | | | - | | | the last command in usecase.cli | | | - +--+---+-+--+ - | graph stats show | | Command to dump current graph | Yes |Yes | - | | | statistics. | | | - +--+---+-+--+ - | help graph | | Command to dump graph help | Yes |Yes | - | | | message. | | | - +--+---+-+--+ - | | mempool size| | Command to create mempool which | No|No| - | | buffers| | will be further associated to | | | - | | | | RxQ to dequeue the packets. | | | - | | cache numa | | | | - +--+---+-+--+ - | help mempool | | Command to dump mempool help | Yes |Yes | - | | | message. | | | - +--+---+-+--+ - | | ethdev rxq | | Command to create DPDK port with| No|No| - | | txq| | given number of Rx and Tx queues| | | - | | | . Also attach RxQ with given | | | - | | | mempool. Each port can have | | | - | | | single mempool only i.e. all | | | - | | | RxQs will share the same mempool| | | - | | | . | | | - +--+---+-+--+ - | ethdev mtu | | Command to configure MTU of DPDK| Yes |Yes | - | | | port. | | | - +--+---+-+--+ - | | ethdev promiscuous | | Command to enable/disable | Yes |Yes | - | | | | promiscuous mode on DPDK port. | | | - +---
Re: [dpdk-dev] [v3] doc: define qualification criteria for external library
On Sat, Jan 6, 2024 at 12:14 AM Stephen Hemminger wrote: > > On Fri, 5 Jan 2024 17:42:15 +0530 > wrote: > > > > I would add a clause about optional dependency. > Something like: > > If the external dependency is not available, then it must be detectable > by the > build process. A missing external library must not impact the core > functionality > of DPDK; only the library or driver in DPDK will not be built. OK. I will send the next version with the following update [main][dpdk.org] $ git diff diff --git a/doc/guides/contributing/library_dependency.rst b/doc/guides/contributing/library_dependency.rst index 367e380a89..7b008d7e8a 100644 --- a/doc/guides/contributing/library_dependency.rst +++ b/doc/guides/contributing/library_dependency.rst @@ -44,3 +44,9 @@ used as dependencies in DPDK drivers or libraries. - Optional dependencies should use stubs to minimize ``ifdef`` clutter, promoting improved code readability. + +#. **Dependency nature:** + + - The external library dependency should be optional, + i.e. a missing external library must not impact the core functionality of DPDK; the specific + library and/or driver will not be built if dependencies are not met.
[dpdk-dev] [v5] doc: define qualification criteria for external library
From: Jerin Jacob Define qualification criteria for external library based on a techboard meeting minutes [1] and past learnings from mailing list discussion. [1] http://mails.dpdk.org/archives/dev/2019-June/135847.html https://mails.dpdk.org/archives/dev/2024-January/284849.html Signed-off-by: Jerin Jacob Acked-by: Thomas Monjalon --- doc/guides/contributing/index.rst | 1 + .../contributing/library_dependency.rst | 52 +++ 2 files changed, 53 insertions(+) create mode 100644 doc/guides/contributing/library_dependency.rst v5: - Added "Dependency nature" section based on Stephen's input v4: - Address Thomas comments from https://patches.dpdk.org/project/dpdk/patch/20240105121215.3950532-1-jer...@marvell.com/ v3: - Updated the content based on TB discussion which is documented at https://mails.dpdk.org/archives/dev/2024-January/284849.html v2: - Added "Meson build integration" and "Code readability" sections. diff --git a/doc/guides/contributing/index.rst b/doc/guides/contributing/index.rst index dcb9b1fbf0..e5a8c2b0a3 100644 --- a/doc/guides/contributing/index.rst +++ b/doc/guides/contributing/index.rst @@ -15,6 +15,7 @@ Contributor's Guidelines documentation unit_test new_library +library_dependency patches vulnerability stable diff --git a/doc/guides/contributing/library_dependency.rst b/doc/guides/contributing/library_dependency.rst new file mode 100644 index 00..94025fdf60 --- /dev/null +++ b/doc/guides/contributing/library_dependency.rst @@ -0,0 +1,52 @@ +.. SPDX-License-Identifier: BSD-3-Clause + Copyright(c) 2024 Marvell. + +External Library dependency +=== + +This document defines the qualification criteria for external libraries that may be +used as dependencies in DPDK drivers or libraries. + +#. **Documentation:** + + - Must have adequate documentation for the steps to build it. + - Must have clear license documentation on the distribution and usage aspects of the external library. + +#. **Free availability:** + + - The library must be freely available to build in either source or binary form. + - It shall be downloadable from a direct link. There shall not be any requirement to explicitly + log in or sign a user agreement. + +#. **Usage License:** + + - Both permissive (e.g., BSD-3 or Apache) and non-permissive (e.g., GPLv3) licenses are acceptable. + - In the case of a permissive license, automatic inclusion in the build process is assumed. + For non-permissive licenses, an additional build configuration option is required. + +#. **Distributions License:** + + - No specific constraints beyond documentation. + +#. **Compiler compatibility:** + + - The library must be able to compile with a DPDK supported compiler for the given execution + environment. + For example, for Linux, the library must be able to compile with GCC and/or clang. + - The library may be limited to a specific OS. + +#. **Meson build integration:** + + - The library must have a standard method like ``pkg-config`` for seamless integration with + DPDK's build environment. + +#. **Code readability:** + + - Optional dependencies should use stubs to minimize ``ifdef`` clutter, promoting improved + code readability. + +#. **Dependency nature:** + + - The external library dependency should be optional, + i.e. a missing external library must not impact the core functionality of DPDK; the specific + library and/or driver will not be built if dependencies are not met. -- 2.43.0
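To illustrate the stub guideline in the "Code readability" section, a hypothetical sketch (RTE_HAS_LIBFOO, libfoo.h and libfoo_process() are made-up names): the conditional lives in one header, so call sites stay unconditional.

```c
#include <errno.h>
#include <stddef.h>

/* One header concentrates the optional dependency; callers simply
 * invoke foo_process() with no #ifdef at the call site. */
#ifdef RTE_HAS_LIBFOO
#include <libfoo.h>

static inline int
foo_process(void *buf, size_t len)
{
	return libfoo_process(buf, len);
}
#else

static inline int
foo_process(void *buf, size_t len)
{
	(void)buf;
	(void)len;
	return -ENOTSUP; /* optional dependency compiled out */
}
#endif
```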