[DPDK/core Bug 1509] Build failure on Ubuntu 24.04 with Link Time Optimization (LTO)
https://bugs.dpdk.org/show_bug.cgi?id=1509

Bug ID: 1509
Summary: Build failure on Ubuntu 24.04 with Link Time Optimization (LTO)
Product: DPDK
Version: 24.07
Hardware: x86
OS: Linux
Status: UNCONFIRMED
Severity: normal
Priority: Normal
Component: core
Assignee: dev@dpdk.org
Reporter: alia...@nvidia.com
Target Milestone: ---

$ meson --werror -Db_lto=true -Dc_args='-O3' -Ddisable_drivers=net/ena,net/tap,*/octeontx,*/cnxk build && ninja -C build
[..]
[2453/2453] Linking target app/dpdk-test
In function ‘memcpy’,
    inlined from ‘test_argparse_copy’ at ../app/test/test_argparse.c:94:3,
    inlined from ‘test_argparse_init_obj’ at ../app/test/test_argparse.c:106:2,
    inlined from ‘test_argparse_invalid_basic_param’ at ../app/test/test_argparse.c:116:8:
/usr/include/x86_64-linux-gnu/bits/string_fortified.h:29:10: warning: ‘__builtin___memcpy_chk’ writing 48 bytes into a region of size 0 overflows the destination [-Wstringop-overflow=]
[..]

OS: Ubuntu 24.04
Meson: 1.3.2
Compiler: gcc 13.2.0

(Note: some drivers are disabled because they report other errors)

-- 
You are receiving this mail because:
You are the assignee for the bug.
[DPDK/ethdev Bug 1510] net/ena build failure on Ubuntu 24.04 with LTO
https://bugs.dpdk.org/show_bug.cgi?id=1510 Bug ID: 1510 Summary: net/ena build failure on Ubuntu 24.04 with LTO Product: DPDK Version: 24.07 Hardware: x86 OS: Linux Status: UNCONFIRMED Severity: normal Priority: Normal Component: ethdev Assignee: dev@dpdk.org Reporter: alia...@nvidia.com Target Milestone: --- $ meson --werror -Db_lto=true build && ninja -C build [..] [1586/2774] Linking target drivers/librte_net_ena.so.24.2 ../lib/eal/x86/include/rte_memcpy.h: In function ‘ena_rss_hash_conf_get’: ../lib/eal/x86/include/rte_memcpy.h:128:9: warning: writing 32 bytes into a region of size 8 [-Wstringop-overflow=] 128 | _mm256_storeu_si256((__m256i *)(void *)dst, ymm0); | ^ ../drivers/net/ena/ena_rss.c:452:17: note: at offset 32 into destination object ‘hw_rss_key’ of size 40 452 | uint8_t hw_rss_key[ENA_HASH_KEY_SIZE]; | ^ ../lib/eal/x86/include/rte_memcpy.h:128:9: warning: writing 32 bytes into a region of size 0 [-Wstringop-overflow=] 128 | _mm256_storeu_si256((__m256i *)(void *)dst, ymm0); | ^ ../drivers/net/ena/ena_rss.c:452:17: note: at offset 64 into destination object ‘hw_rss_key’ of size 40 452 | uint8_t hw_rss_key[ENA_HASH_KEY_SIZE]; | ^ ../lib/eal/x86/include/rte_memcpy.h:128:9: warning: writing 32 bytes into a region of size 0 [-Wstringop-overflow=] 128 | _mm256_storeu_si256((__m256i *)(void *)dst, ymm0); | ^ ../drivers/net/ena/ena_rss.c:452:17: note: at offset 96 into destination object ‘hw_rss_key’ of size 40 452 | uint8_t hw_rss_key[ENA_HASH_KEY_SIZE]; | ^ ../lib/eal/x86/include/rte_memcpy.h:128:9: warning: writing 32 bytes into a region of size 0 [-Wstringop-overflow=] 128 | _mm256_storeu_si256((__m256i *)(void *)dst, ymm0); | ^ ../drivers/net/ena/ena_rss.c:452:17: note: at offset [131, 17179869245] into destination object ‘hw_rss_key’ of size 40 452 | uint8_t hw_rss_key[ENA_HASH_KEY_SIZE]; | ^ ../drivers/net/ena/ena_rss.c:452:17: note: at offset [129, 193] into destination object ‘hw_rss_key’ of size 40 ../drivers/net/ena/ena_rss.c:452:17: note: at 
offset [131, 17179869245] into destination object ‘hw_rss_key’ of size 40 ../drivers/net/ena/ena_rss.c:452:17: note: at offset [129, 193] into destination object ‘hw_rss_key’ of size 40 ../drivers/net/ena/ena_rss.c:452:17: note: at offset [1, 40] into destination object ‘hw_rss_key’ of size 40 ../lib/eal/x86/include/rte_memcpy.h:128:9: warning: writing 32 bytes into a region of size 8 [-Wstringop-overflow=] 128 | _mm256_storeu_si256((__m256i *)(void *)dst, ymm0); | ^ ../drivers/net/ena/ena_rss.c:452:17: note: at offset [32, 40] into destination object ‘hw_rss_key’ of size 40 [..] OS: Ubuntu 24.04 Meson: 1.3.2 Compiler: gcc 13.2.0 -- You are receiving this mail because: You are the assignee for the bug.
[DPDK/ethdev Bug 1511] net/tap build failure on Ubuntu 24.04 with LTO
https://bugs.dpdk.org/show_bug.cgi?id=1511 Bug ID: 1511 Summary: net/tap build failure on Ubuntu 24.04 with LTO Product: DPDK Version: 24.07 Hardware: All OS: All Status: UNCONFIRMED Severity: normal Priority: Normal Component: ethdev Assignee: dev@dpdk.org Reporter: alia...@nvidia.com Target Milestone: --- $ meson --werror -Db_lto=true build && ninja -C build [..] [2109/2774] Linking target drivers/librte_net_tap.so.24.2 ../drivers/net/tap/tap_netlink.c: In function ‘tap_flow_create_ipv6’: ../drivers/net/tap/tap_netlink.c:305:9: warning: writing 16 bytes into a region of size 12 [-Wstringop-overflow=] 305 | memcpy(RTA_DATA(rta), data, data_len); | ^ ../drivers/net/tap/tap_netlink.h:20:25: note: at offset [4, 16] into destination object ‘nh’ of size 16 20 | struct nlmsghdr nh; | ^ ../drivers/net/tap/tap_netlink.c:305:9: warning: writing 16 bytes into a region of size 12 [-Wstringop-overflow=] 305 | memcpy(RTA_DATA(rta), data, data_len); | ^ ../drivers/net/tap/tap_netlink.h:20:25: note: at offset [4, 16] into destination object ‘nh’ of size 16 20 | struct nlmsghdr nh; | ^ ../drivers/net/tap/tap_netlink.c:305:9: warning: writing 16 bytes into a region of size 12 [-Wstringop-overflow=] 305 | memcpy(RTA_DATA(rta), data, data_len); | ^ ../drivers/net/tap/tap_netlink.h:20:25: note: at offset [4, 16] into destination object ‘nh’ of size 16 20 | struct nlmsghdr nh; | ^ ../drivers/net/tap/tap_netlink.c:305:9: warning: writing 16 bytes into a region of size 12 [-Wstringop-overflow=] 305 | memcpy(RTA_DATA(rta), data, data_len); | ^ ../drivers/net/tap/tap_netlink.h:20:25: note: at offset [4, 16] into destination object ‘nh’ of size 16 20 | struct nlmsghdr nh; | ^ [2755/2774] Linking target app/dpdk-test-pipeline ../drivers/net/tap/tap_netlink.c: In function ‘tap_flow_create_ipv6’: ../drivers/net/tap/tap_netlink.c:305:9: warning: writing 16 bytes into a region of size 12 [-Wstringop-overflow=] 305 | memcpy(RTA_DATA(rta), data, data_len); | ^ 
../drivers/net/tap/tap_netlink.h:20:25: note: at offset [4, 16] into destination object ‘nh’ of size 16 20 | struct nlmsghdr nh; | ^ ../drivers/net/tap/tap_netlink.c:305:9: warning: writing 16 bytes into a region of size 12 [-Wstringop-overflow=] 305 | memcpy(RTA_DATA(rta), data, data_len); | ^ ../drivers/net/tap/tap_netlink.h:20:25: note: at offset [4, 16] into destination object ‘nh’ of size 16 20 | struct nlmsghdr nh; | ^ ../drivers/net/tap/tap_netlink.c:305:9: warning: writing 16 bytes into a region of size 12 [-Wstringop-overflow=] 305 | memcpy(RTA_DATA(rta), data, data_len); | ^ ../drivers/net/tap/tap_netlink.h:20:25: note: at offset [4, 16] into destination object ‘nh’ of size 16 20 | struct nlmsghdr nh; | ^ ../drivers/net/tap/tap_netlink.c:305:9: warning: writing 16 bytes into a region of size 12 [-Wstringop-overflow=] 305 | memcpy(RTA_DATA(rta), data, data_len); | ^ ../drivers/net/tap/tap_netlink.h:20:25: note: at offset [4, 16] into destination object ‘nh’ of size 16 [..] OS: Ubuntu 24.04 Meson: 1.3.2 Compiler: gcc 13.2.0 -- You are receiving this mail because: You are the assignee for the bug.
[DPDK/ethdev Bug 1512] event/octeontx build failure on Ubuntu 24.04 with LTO
https://bugs.dpdk.org/show_bug.cgi?id=1512

Bug ID: 1512
Summary: event/octeontx build failure on Ubuntu 24.04 with LTO
Product: DPDK
Version: 24.07
Hardware: All
OS: All
Status: UNCONFIRMED
Severity: normal
Priority: Normal
Component: ethdev
Assignee: dev@dpdk.org
Reporter: alia...@nvidia.com
Target Milestone: ---

$ meson --werror -Db_lto=true build && ninja -C build
[..]
In function ‘ssovf_parsekv’,
    inlined from ‘rte_kvargs_process’ at ../lib/kvargs/rte_kvargs.c:188:9,
    inlined from ‘ssovf_vdev_probe’ at ../drivers/event/octeontx/ssovf_evdev.c:872:10:
../drivers/event/octeontx/ssovf_evdev.c:723:15: warning: writing 4 bytes into a region of size 1 [-Wstringop-overflow=]
  723 |         *flag = !!atoi(value);
      |               ^
../drivers/event/octeontx/ssovf_evdev.c: In function ‘ssovf_vdev_probe’:
../drivers/event/octeontx/ssovf_evdev.c:26:16: note: destination object ‘timvf_enable_stats’ of size 1
[..]

OS: Ubuntu 24.04
Meson: 1.3.2
Compiler: gcc 13.2.0
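
The warning above is consistent with a callback that stores through a pointer wider than the object it was handed. The following is only a minimal sketch of that pattern under stated assumptions, not the driver code and not a proposed fix: the `demo_` handlers are hypothetical, and the callback shape (key, value, opaque pointer) is assumed to resemble the handler invoked by rte_kvargs_process. With LTO the compiler can inline the handler into the caller and see the real size of the destination, which is when the width mismatch becomes visible.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical handler whose store width matches a 1-byte destination.
 * The opaque pointer must refer to a uint8_t.
 */
static int demo_parse_flag_u8(const char *key, const char *value, void *opaque)
{
	uint8_t *flag = opaque;

	(void)key;
	*flag = !!atoi(value); /* 1-byte store */
	return 0;
}

/*
 * Hypothetical handler with the same logic but an int-sized store.
 * It is only valid when the opaque pointer refers to an int; passing
 * the address of a uint8_t would write 4 bytes into a 1-byte object,
 * which is the situation the diagnostic describes.
 */
static int demo_parse_flag_int(const char *key, const char *value, void *opaque)
{
	int *flag = opaque;

	(void)key;
	*flag = !!atoi(value); /* sizeof(int)-byte store */
	return 0;
}
```

Either the destination variable or the pointer type in the handler has to change so that both agree; which one is appropriate for the driver is a decision for its maintainers.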
[DPDK/ethdev Bug 1513] */cnxk build failure on Ubuntu 24.04 with LTO
https://bugs.dpdk.org/show_bug.cgi?id=1513

Bug ID: 1513
Summary: */cnxk build failure on Ubuntu 24.04 with LTO
Product: DPDK
Version: 24.07
Hardware: All
OS: All
Status: UNCONFIRMED
Severity: normal
Priority: Normal
Component: ethdev
Assignee: dev@dpdk.org
Reporter: alia...@nvidia.com
Target Milestone: ---

$ meson --werror -Db_lto=true build && ninja -C build
[..]
../drivers/common/cnxk/roc_irq.c: In function ‘irq_config’:
../drivers/common/cnxk/roc_irq.c:53:14: warning: argument to variable-length array is too large [-Wvla-larger-than=]
   53 |         char irq_set_buf[MSIX_IRQ_SET_BUF_LEN];
      |              ^
../drivers/common/cnxk/roc_irq.c:53:14: note: limit is 9223372036854775807 bytes, but argument is 18446744073709551548
../drivers/common/cnxk/roc_irq.c: In function ‘irq_init’:
../drivers/common/cnxk/roc_irq.c:90:14: warning: argument to variable-length array is too large [-Wvla-larger-than=]
   90 |         char irq_set_buf[MSIX_IRQ_SET_BUF_LEN];
      |              ^
../drivers/common/cnxk/roc_irq.c:90:14: note: limit is 9223372036854775807 bytes, but argument is 18446744073709551548
In function ‘parse_flag’,
    inlined from ‘rte_kvargs_process’ at ../lib/kvargs/rte_kvargs.c:188:9,
    inlined from ‘cnxk_ethdev_parse_devargs’ at ../drivers/net/cnxk/cnxk_ethdev_devargs.c:339:2:
../drivers/net/cnxk/cnxk_ethdev_devargs.c:167:33: warning: writing 2 bytes into a region of size 1 [-Wstringop-overflow=]
  167 |         *(uint16_t *)extra_args = atoi(value);
      |                                 ^
../drivers/net/cnxk/cnxk_ethdev_devargs.c: In function ‘cnxk_ethdev_parse_devargs’:
../drivers/net/cnxk/cnxk_ethdev_devargs.c:310:17: note: destination object ‘lock_rx_ctx’ of size 1
  310 |         uint8_t lock_rx_ctx = 0;
      |                 ^
In function ‘parse_flag’,
    inlined from ‘rte_kvargs_process’ at ../lib/kvargs/rte_kvargs.c:188:9,
    inlined from ‘cnxk_ethdev_parse_devargs’ at ../drivers/net/cnxk/cnxk_ethdev_devargs.c:362:2:
../drivers/net/cnxk/cnxk_ethdev_devargs.c:167:33: warning: writing 2 bytes into a region of size 1 [-Wstringop-overflow=]
  167 |         *(uint16_t *)extra_args = atoi(value);
      |                                 ^
../drivers/net/cnxk/cnxk_ethdev_devargs.c: In function ‘cnxk_ethdev_parse_devargs’:
../drivers/net/cnxk/cnxk_ethdev_devargs.c:311:17: note: destination object ‘rx_inj_ena’ of size 1
  311 |         uint8_t rx_inj_ena = 0;
      |                 ^
[2766/2774] Linking target app/dpdk-test-cmdline
[..]

OS: Ubuntu 24.04
Meson: 1.3.2
Compiler: gcc 13.2.0
Re: [PATCH v5] virtio: optimize stats counters performance
On 2024/8/2 19:27, Morten Brørup wrote:
> From: lihuisong (C) [mailto:lihuis...@huawei.com]
> Sent: Friday, 2 August 2024 04.23
>
>> On 2024/8/2 0:03, Morten Brørup wrote:
>>>  void
>>> -virtio_update_packet_stats(struct virtnet_stats *stats, struct rte_mbuf *mbuf)
>>> +virtio_update_packet_stats(struct virtnet_stats *const stats,
>>> +		const struct rte_mbuf *const mbuf)
>>
>> Are the two const qualifiers also for performance? Is there a gain?
>
> The "const struct rte_mbuf * mbuf" informs the optimizer that the function
> does not modify the mbuf. This is for the benefit of callers of the
> function; they can rely on the mbuf being unchanged when the function
> returns. So, if the optimizer has cached some mbuf field before calling the
> function, it does not need to read the mbuf field again, but can continue
> using the cached value after the function call.
>
> The two "type *const ptr" probably make no performance difference with the
> compilers and function call conventions used by the CPU architectures
> supported by DPDK.

ok

>>> +	RTE_BUILD_BUG_ON(offsetof(struct virtnet_stats, broadcast) !=
>>> +			offsetof(struct virtnet_stats, multicast) + sizeof(uint64_t));
>>> +	if (unlikely(rte_is_multicast_ether_addr(ea)))
>>> +		(&stats->multicast)[rte_is_broadcast_ether_addr(ea)]++;
>>
>> The "rte_is_broadcast_ether_addr(ea)" will be calculated twice if the
>> packet is multicast. How about coding it like this:
>> -->
>> is_multicast = rte_is_multicast_ether_addr(ea);
>> if (unlikely(is_multicast))
>> 	(&stats->multicast)[rte_is_broadcast_ether_addr(ea)]++;
>
> I don't think "rte_is_broadcast_ether_addr(ea)" is calculated twice for
> multicast packets.
>
> My code essentially does this:
> if (mc(ea)) stats[bc(ea)]++;
>
> Changing to this shouldn't make a difference:
> m = mc(ea);
> if (m) stats[bc(ea)]++;

Yeah, you are right.
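
The point settled in this exchange can be checked with a small stand-alone sketch. It is an illustration under stated assumptions, not the virtio driver code: all `demo_` names are hypothetical, and the two counters are placed in a two-element array rather than in adjacent struct fields guarded by a build-time offset check, so that the boolean indexing stays within one object. A call counter on the broadcast test shows that it runs at most once per packet, and only for multicast destinations.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical counters: index 0 is multicast, index 1 is broadcast. */
struct demo_stats {
	uint64_t cast[2];
};

/* Counts how often the broadcast test is evaluated. */
static unsigned int demo_bc_calls;

/* A destination is multicast when the low bit of the first octet is set. */
static bool demo_is_multicast(const uint8_t addr[6])
{
	return (addr[0] & 0x01) != 0;
}

/* Broadcast is the all-ones address, a special case of multicast. */
static bool demo_is_broadcast(const uint8_t addr[6])
{
	demo_bc_calls++;
	for (int i = 0; i < 6; i++)
		if (addr[i] != 0xFF)
			return false;
	return true;
}

/*
 * One branch decides whether to count at all; the boolean result of the
 * broadcast test then selects the counter, replacing an inner if/else.
 * The const qualifier on the address tells callers it is not modified.
 */
static void demo_count(struct demo_stats *const stats, const uint8_t addr[6])
{
	if (demo_is_multicast(addr))
		stats->cast[demo_is_broadcast(addr)]++;
}
```

Storing the multicast result in a local variable first, as suggested in the review, would leave the number of evaluations unchanged.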
Re: [PATCH v5] virtio: optimize stats counters performance
LGTM,
Acked-by: Huisong Li

On 2024/8/2 0:03, Morten Brørup wrote:
> Optimized the performance of updating the virtio statistics counters by
> reducing the number of branches.
>
> Ordered the packet size comparisons according to the probability with
> typical internet traffic mix.
>
> Signed-off-by: Morten Brørup
> ---
> v5:
> * Do not inline the function. (Stephen)
> v4:
> * Consider multicast/broadcast packets unlikely.
> v3:
> * Eliminated a local variable.
> * Note: Substituted sizeof(uint32_t)*4 by 32UL, using unsigned long type
>   to keep optimal offsetting in generated assembler output.
> * Removed unnecessary curly braces.
> v2:
> * Fixed checkpatch warning about line length.
> ---
>  drivers/net/virtio/virtio_rxtx.c | 39
>  drivers/net/virtio/virtio_rxtx.h |  4 ++--
>  2 files changed, 16 insertions(+), 27 deletions(-)
>
> diff --git a/drivers/net/virtio/virtio_rxtx.c b/drivers/net/virtio/virtio_rxtx.c
> index f69b9453a2..b67f063b31 100644
> --- a/drivers/net/virtio/virtio_rxtx.c
> +++ b/drivers/net/virtio/virtio_rxtx.c
> @@ -82,37 +82,26 @@ vq_ring_free_chain(struct virtqueue *vq, uint16_t desc_idx)
>  }
>
>  void
> -virtio_update_packet_stats(struct virtnet_stats *stats, struct rte_mbuf *mbuf)
> +virtio_update_packet_stats(struct virtnet_stats *const stats,
> +		const struct rte_mbuf *const mbuf)
>  {
>  	uint32_t s = mbuf->pkt_len;
> -	struct rte_ether_addr *ea;
> +	const struct rte_ether_addr *const ea =
> +		rte_pktmbuf_mtod(mbuf, const struct rte_ether_addr *);
>
>  	stats->bytes += s;
>
> -	if (s == 64) {
> -		stats->size_bins[1]++;
> -	} else if (s > 64 && s < 1024) {
> -		uint32_t bin;
> -
> -		/* count zeros, and offset into correct bin */
> -		bin = (sizeof(s) * 8) - rte_clz32(s) - 5;
> -		stats->size_bins[bin]++;
> -	} else {
> -		if (s < 64)
> -			stats->size_bins[0]++;
> -		else if (s < 1519)
> -			stats->size_bins[6]++;
> -		else
> -			stats->size_bins[7]++;
> -	}
> +	if (s >= 1024)
> +		stats->size_bins[6 + (s > 1518)]++;
> +	else if (s <= 64)
> +		stats->size_bins[s >> 6]++;
> +	else
> +		stats->size_bins[32UL - rte_clz32(s) - 5]++;
>
> -	ea = rte_pktmbuf_mtod(mbuf, struct rte_ether_addr *);
> -	if (rte_is_multicast_ether_addr(ea)) {
> -		if (rte_is_broadcast_ether_addr(ea))
> -			stats->broadcast++;
> -		else
> -			stats->multicast++;
> -	}
> +	RTE_BUILD_BUG_ON(offsetof(struct virtnet_stats, broadcast) !=
> +			offsetof(struct virtnet_stats, multicast) + sizeof(uint64_t));
> +	if (unlikely(rte_is_multicast_ether_addr(ea)))
> +		(&stats->multicast)[rte_is_broadcast_ether_addr(ea)]++;
>  }
>
>  static inline void
> diff --git a/drivers/net/virtio/virtio_rxtx.h b/drivers/net/virtio/virtio_rxtx.h
> index afc4b74534..68034c914b 100644
> --- a/drivers/net/virtio/virtio_rxtx.h
> +++ b/drivers/net/virtio/virtio_rxtx.h
> @@ -35,7 +35,7 @@ struct virtnet_tx {
>  };
>
>  int virtio_rxq_vec_setup(struct virtnet_rx *rxvq);
> -void virtio_update_packet_stats(struct virtnet_stats *stats,
> -		struct rte_mbuf *mbuf);
> +void virtio_update_packet_stats(struct virtnet_stats *const stats,
> +		const struct rte_mbuf *const mbuf);
>
>  #endif /* _VIRTIO_RXTX_H_ */
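
The size-bin rewrite in this patch can be checked in isolation. The sketch below restates the old and new selection logic as two hypothetical helper functions (`demo_bin_old` and `demo_bin_new` are illustrative names, not driver functions) so the two can be compared over a range of packet lengths. It assumes a GCC or Clang toolchain and uses `__builtin_clz` in place of `rte_clz32`, with `unsigned int` assumed to be 32 bits wide.

```c
#include <assert.h>
#include <stdint.h>

/* Old selection logic, as in the removed lines of the patch. */
static unsigned int demo_bin_old(uint32_t s)
{
	if (s == 64)
		return 1;
	if (s > 64 && s < 1024)
		/* count leading zeros, and offset into the correct bin */
		return (unsigned int)(sizeof(s) * 8) -
			(unsigned int)__builtin_clz(s) - 5;
	if (s < 64)
		return 0;
	if (s < 1519)
		return 6;
	return 7;
}

/* New selection logic, as in the added lines of the patch. */
static unsigned int demo_bin_new(uint32_t s)
{
	if (s >= 1024)
		return 6 + (s > 1518);   /* 1024..1518 -> 6, larger -> 7 */
	else if (s <= 64)
		return s >> 6;           /* 0..63 -> 0, 64 -> 1 */
	else
		/* 65..1023: bit length minus 5 gives bins 2..5 */
		return 32U - (unsigned int)__builtin_clz(s) - 5;
}
```

Both helpers only call `__builtin_clz` for lengths between 65 and 1023, so the undefined zero-argument case is never reached.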
[PATCH v4 00/11] support software live migration
This patch series adds support for the software live migration feature
of the NFP vDPA device.

---
v4:
* Rebase to the newest main branch.
* Add the 'Reviewed-by' tag of the external reviewer.
v3:
* Fix one compile error when using standard atomic.
v2:
* Fix some spelling in the commit message.
* Split out a commit to enable this feature.
---

Xinying Yu (11):
  mailmap: add new contributor
  vdpa/nfp: fix logic in hardware init
  vdpa/nfp: fix the logic of reconfiguration
  vdpa/nfp: refactor the logic of datapath update
  vdpa/nfp: add the live migration logic
  vdpa/nfp: add the interrupt logic of vring relay
  vdpa/nfp: setup the VF configure
  vdpa/nfp: recover the ring index on new host
  vdpa/nfp: setup vring relay thread
  vdpa/nfp: enable feature bits of live migration
  doc: update nfp document

 .mailmap                             |   1 +
 doc/guides/vdpadevs/nfp.rst          |   9 +
 drivers/common/nfp/nfp_common_ctrl.h |  11 +-
 drivers/vdpa/nfp/nfp_vdpa.c          | 441 +--
 drivers/vdpa/nfp/nfp_vdpa_core.c     | 135 ++--
 drivers/vdpa/nfp/nfp_vdpa_core.h     |  14 +
 6 files changed, 565 insertions(+), 46 deletions(-)

-- 
2.39.1
[PATCH v4 01/11] mailmap: add new contributor
From: Xinying Yu

Add new contributor.

Signed-off-by: Xinying Yu
Reviewed-by: Chaoyong He
Reviewed-by: Long Wu
Reviewed-by: Peng Zhang
---
 .mailmap | 1 +
 1 file changed, 1 insertion(+)

diff --git a/.mailmap b/.mailmap
index 4a508bafad..dc3a202a41 100644
--- a/.mailmap
+++ b/.mailmap
@@ -1645,6 +1645,7 @@ Xieming Katty
 Xinfeng Zhao
 Xingguang He
 Xingyou Chen
+Xinying Yu
 Xin Long
 Xi Zhang
 Xuan Ding
-- 
2.39.1
[PATCH v4 02/11] vdpa/nfp: fix logic in hardware init
From: Xinying Yu

Reconfiguring the NIC fails because the initialization logic for the
queue configuration pointer is missing. Fix this by adding the correct
initialization logic.

Fixes: d89f4990c14e ("vdpa/nfp: add hardware init")
Cc: chaoyong...@corigine.com
Cc: sta...@dpdk.org

Signed-off-by: Xinying Yu
Reviewed-by: Chaoyong He
Reviewed-by: Long Wu
Reviewed-by: Peng Zhang
Reviewed-by: Maxime Coquelin
---
 drivers/vdpa/nfp/nfp_vdpa_core.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/drivers/vdpa/nfp/nfp_vdpa_core.c b/drivers/vdpa/nfp/nfp_vdpa_core.c
index 7b877605e4..291798196c 100644
--- a/drivers/vdpa/nfp/nfp_vdpa_core.c
+++ b/drivers/vdpa/nfp/nfp_vdpa_core.c
@@ -55,7 +55,10 @@ nfp_vdpa_hw_init(struct nfp_vdpa_hw *vdpa_hw,
 		struct rte_pci_device *pci_dev)
 {
 	uint32_t queue;
+	uint8_t *tx_bar;
+	uint32_t start_q;
 	struct nfp_hw *hw;
+	uint32_t tx_bar_off;
 	uint8_t *notify_base;

 	hw = &vdpa_hw->super;
@@ -82,6 +85,12 @@ nfp_vdpa_hw_init(struct nfp_vdpa_hw *vdpa_hw,
 				idx + 1, vdpa_hw->notify_addr[idx + 1]);
 	}

+	/* NFP vDPA cfg queue setup */
+	start_q = nn_cfg_readl(hw, NFP_NET_CFG_START_TXQ);
+	tx_bar_off = start_q * NFP_QCP_QUEUE_ADDR_SZ;
+	tx_bar = (uint8_t *)pci_dev->mem_resource[2].addr + tx_bar_off;
+	hw->qcp_cfg = tx_bar + NFP_QCP_QUEUE_ADDR_SZ;
+
 	vdpa_hw->features = (1ULL << VIRTIO_F_VERSION_1) |
 			(1ULL << VIRTIO_F_IN_ORDER) |
 			(1ULL << VHOST_USER_F_PROTOCOL_FEATURES);
-- 
2.39.1
[PATCH v4 03/11] vdpa/nfp: fix the logic of reconfiguration
From: Xinying Yu The ctrl words of vDPA is located on the extended word, so it should use the 'nfp_ext_reconfig()' rather than 'nfp_reconfig()'. Also replace the misuse of 'NFP_NET_CFG_CTRL_SCATTER' macro with 'NFP_NET_CFG_CTRL_VIRTIO'. Fixes: b47a0373903f ("vdpa/nfp: add datapath update") Cc: chaoyong...@corigine.com Cc: sta...@dpdk.org Signed-off-by: Xinying Yu Reviewed-by: Chaoyong He Reviewed-by: Long Wu Reviewed-by: Peng Zhang Reviewed-by: Maxime Coquelin --- drivers/common/nfp/nfp_common_ctrl.h | 1 + drivers/vdpa/nfp/nfp_vdpa_core.c | 16 2 files changed, 13 insertions(+), 4 deletions(-) diff --git a/drivers/common/nfp/nfp_common_ctrl.h b/drivers/common/nfp/nfp_common_ctrl.h index 69596dd6f5..1b30f81fdb 100644 --- a/drivers/common/nfp/nfp_common_ctrl.h +++ b/drivers/common/nfp/nfp_common_ctrl.h @@ -205,6 +205,7 @@ struct nfp_net_fw_ver { #define NFP_NET_CFG_CTRL_IPSEC_LM_LOOKUP (0x1 << 4) /**< SA long match lookup */ #define NFP_NET_CFG_CTRL_MULTI_PF (0x1 << 5) #define NFP_NET_CFG_CTRL_FLOW_STEER (0x1 << 8) /**< Flow Steering */ +#define NFP_NET_CFG_CTRL_VIRTIO (0x1 << 10) /**< Virtio offload */ #define NFP_NET_CFG_CTRL_IN_ORDER (0x1 << 11) /**< Virtio in-order flag */ #define NFP_NET_CFG_CTRL_USO (0x1 << 16) /**< UDP segmentation offload */ diff --git a/drivers/vdpa/nfp/nfp_vdpa_core.c b/drivers/vdpa/nfp/nfp_vdpa_core.c index 291798196c..6d07356581 100644 --- a/drivers/vdpa/nfp/nfp_vdpa_core.c +++ b/drivers/vdpa/nfp/nfp_vdpa_core.c @@ -101,7 +101,7 @@ nfp_vdpa_hw_init(struct nfp_vdpa_hw *vdpa_hw, static uint32_t nfp_vdpa_check_offloads(void) { - return NFP_NET_CFG_CTRL_SCATTER | + return NFP_NET_CFG_CTRL_VIRTIO | NFP_NET_CFG_CTRL_IN_ORDER; } @@ -112,6 +112,7 @@ nfp_vdpa_hw_start(struct nfp_vdpa_hw *vdpa_hw, int ret; uint32_t update; uint32_t new_ctrl; + uint32_t new_ext_ctrl; struct timespec wait_tst; struct nfp_hw *hw = &vdpa_hw->super; uint8_t mac_addr[RTE_ETHER_ADDR_LEN]; @@ -131,8 +132,6 @@ nfp_vdpa_hw_start(struct nfp_vdpa_hw *vdpa_hw, 
nfp_disable_queues(hw); nfp_enable_queues(hw, NFP_VDPA_MAX_QUEUES, NFP_VDPA_MAX_QUEUES); - new_ctrl = nfp_vdpa_check_offloads(); - nn_cfg_writel(hw, NFP_NET_CFG_MTU, 9216); nn_cfg_writel(hw, NFP_NET_CFG_FLBUFSZ, 10240); @@ -147,8 +146,17 @@ nfp_vdpa_hw_start(struct nfp_vdpa_hw *vdpa_hw, /* Writing new MAC to the specific port BAR address */ nfp_write_mac(hw, (uint8_t *)mac_addr); + new_ext_ctrl = nfp_vdpa_check_offloads(); + + update = NFP_NET_CFG_UPDATE_GEN; + ret = nfp_ext_reconfig(hw, new_ext_ctrl, update); + if (ret != 0) + return -EIO; + + hw->ctrl_ext = new_ext_ctrl; + /* Enable device */ - new_ctrl |= NFP_NET_CFG_CTRL_ENABLE; + new_ctrl = NFP_NET_CFG_CTRL_ENABLE; /* Signal the NIC about the change */ update = NFP_NET_CFG_UPDATE_MACADDR | -- 2.39.1
[PATCH v4 04/11] vdpa/nfp: refactor the logic of datapath update
From: Xinying Yu In order to add the new configuration logic of software live migration, split the datapath update logic into two parts, queue configuration and VF configuration. Signed-off-by: Xinying Yu Reviewed-by: Chaoyong He Reviewed-by: Long Wu Reviewed-by: Peng Zhang Reviewed-by: Maxime Coquelin --- drivers/vdpa/nfp/nfp_vdpa_core.c | 54 +--- 1 file changed, 36 insertions(+), 18 deletions(-) diff --git a/drivers/vdpa/nfp/nfp_vdpa_core.c b/drivers/vdpa/nfp/nfp_vdpa_core.c index 6d07356581..79ecd2b4fc 100644 --- a/drivers/vdpa/nfp/nfp_vdpa_core.c +++ b/drivers/vdpa/nfp/nfp_vdpa_core.c @@ -105,8 +105,8 @@ nfp_vdpa_check_offloads(void) NFP_NET_CFG_CTRL_IN_ORDER; } -int -nfp_vdpa_hw_start(struct nfp_vdpa_hw *vdpa_hw, +static int +nfp_vdpa_vf_config(struct nfp_hw *hw, int vid) { int ret; @@ -114,24 +114,8 @@ nfp_vdpa_hw_start(struct nfp_vdpa_hw *vdpa_hw, uint32_t new_ctrl; uint32_t new_ext_ctrl; struct timespec wait_tst; - struct nfp_hw *hw = &vdpa_hw->super; uint8_t mac_addr[RTE_ETHER_ADDR_LEN]; - nn_cfg_writeq(hw, NFP_NET_CFG_TXR_ADDR(0), vdpa_hw->vring[1].desc); - nn_cfg_writeb(hw, NFP_NET_CFG_TXR_SZ(0), rte_log2_u32(vdpa_hw->vring[1].size)); - nn_cfg_writeq(hw, NFP_NET_CFG_TXR_ADDR(1), vdpa_hw->vring[1].avail); - nn_cfg_writeq(hw, NFP_NET_CFG_TXR_ADDR(2), vdpa_hw->vring[1].used); - - nn_cfg_writeq(hw, NFP_NET_CFG_RXR_ADDR(0), vdpa_hw->vring[0].desc); - nn_cfg_writeb(hw, NFP_NET_CFG_RXR_SZ(0), rte_log2_u32(vdpa_hw->vring[0].size)); - nn_cfg_writeq(hw, NFP_NET_CFG_RXR_ADDR(1), vdpa_hw->vring[0].avail); - nn_cfg_writeq(hw, NFP_NET_CFG_RXR_ADDR(2), vdpa_hw->vring[0].used); - - rte_wmb(); - - nfp_disable_queues(hw); - nfp_enable_queues(hw, NFP_VDPA_MAX_QUEUES, NFP_VDPA_MAX_QUEUES); - nn_cfg_writel(hw, NFP_NET_CFG_MTU, 9216); nn_cfg_writel(hw, NFP_NET_CFG_FLBUFSZ, 10240); @@ -177,6 +161,40 @@ nfp_vdpa_hw_start(struct nfp_vdpa_hw *vdpa_hw, return 0; } +static void +nfp_vdpa_queue_config(struct nfp_vdpa_hw *vdpa_hw) +{ + struct nfp_hw *hw = &vdpa_hw->super; + + 
nn_cfg_writeq(hw, NFP_NET_CFG_TXR_ADDR(0), vdpa_hw->vring[1].desc); + nn_cfg_writeb(hw, NFP_NET_CFG_TXR_SZ(0), + rte_log2_u32(vdpa_hw->vring[1].size)); + nn_cfg_writeq(hw, NFP_NET_CFG_TXR_ADDR(1), vdpa_hw->vring[1].avail); + nn_cfg_writeq(hw, NFP_NET_CFG_TXR_ADDR(2), vdpa_hw->vring[1].used); + + nn_cfg_writeq(hw, NFP_NET_CFG_RXR_ADDR(0), vdpa_hw->vring[0].desc); + nn_cfg_writeb(hw, NFP_NET_CFG_RXR_SZ(0), + rte_log2_u32(vdpa_hw->vring[0].size)); + nn_cfg_writeq(hw, NFP_NET_CFG_RXR_ADDR(1), vdpa_hw->vring[0].avail); + nn_cfg_writeq(hw, NFP_NET_CFG_RXR_ADDR(2), vdpa_hw->vring[0].used); + + rte_wmb(); +} + +int +nfp_vdpa_hw_start(struct nfp_vdpa_hw *vdpa_hw, + int vid) +{ + struct nfp_hw *hw = &vdpa_hw->super; + + nfp_vdpa_queue_config(vdpa_hw); + + nfp_disable_queues(hw); + nfp_enable_queues(hw, NFP_VDPA_MAX_QUEUES, NFP_VDPA_MAX_QUEUES); + + return nfp_vdpa_vf_config(hw, vid); +} + void nfp_vdpa_hw_stop(struct nfp_vdpa_hw *vdpa_hw) { -- 2.39.1
[PATCH v4 05/11] vdpa/nfp: add the live migration logic
From: Xinying Yu Add the basic logic of software live migration. Unset the ring notify area to stop the direct IO datapath if the device support, then we can setup the vring relay to help the live migration. Signed-off-by: Xinying Yu Reviewed-by: Chaoyong He Reviewed-by: Long Wu Reviewed-by: Peng Zhang Reviewed-by: Maxime Coquelin --- drivers/vdpa/nfp/nfp_vdpa.c | 66 +++- drivers/vdpa/nfp/nfp_vdpa_core.c | 2 + drivers/vdpa/nfp/nfp_vdpa_core.h | 4 ++ 3 files changed, 70 insertions(+), 2 deletions(-) diff --git a/drivers/vdpa/nfp/nfp_vdpa.c b/drivers/vdpa/nfp/nfp_vdpa.c index cef80b5476..45092cb0af 100644 --- a/drivers/vdpa/nfp/nfp_vdpa.c +++ b/drivers/vdpa/nfp/nfp_vdpa.c @@ -603,6 +603,30 @@ update_datapath(struct nfp_vdpa_dev *device) return ret; } +static int +nfp_vdpa_sw_fallback(struct nfp_vdpa_dev *device) +{ + int ret; + int vid = device->vid; + + /* Stop the direct IO data path */ + nfp_vdpa_unset_notify_relay(device); + nfp_vdpa_disable_vfio_intr(device); + + ret = rte_vhost_host_notifier_ctrl(vid, RTE_VHOST_QUEUE_ALL, false); + if ((ret != 0) && (ret != -ENOTSUP)) { + DRV_VDPA_LOG(ERR, "Unset the host notifier failed."); + goto error; + } + + device->hw.sw_fallback_running = true; + + return 0; + +error: + return ret; +} + static int nfp_vdpa_dev_config(int vid) { @@ -646,8 +670,18 @@ nfp_vdpa_dev_close(int vid) } device = node->device; - rte_atomic_store_explicit(&device->dev_attached, 0, rte_memory_order_relaxed); - update_datapath(device); + if (device->hw.sw_fallback_running) { + device->hw.sw_fallback_running = false; + + rte_atomic_store_explicit(&device->dev_attached, 0, + rte_memory_order_relaxed); + rte_atomic_store_explicit(&device->running, 0, + rte_memory_order_relaxed); + } else { + rte_atomic_store_explicit(&device->dev_attached, 0, + rte_memory_order_relaxed); + update_datapath(device); + } return 0; } @@ -770,7 +804,35 @@ nfp_vdpa_get_protocol_features(struct rte_vdpa_device *vdev __rte_unused, static int nfp_vdpa_set_features(int32_t vid) { 
+ int ret; + uint64_t features = 0; + struct nfp_vdpa_dev *device; + struct rte_vdpa_device *vdev; + struct nfp_vdpa_dev_node *node; + DRV_VDPA_LOG(DEBUG, "Start vid=%d", vid); + + vdev = rte_vhost_get_vdpa_device(vid); + node = nfp_vdpa_find_node_by_vdev(vdev); + if (node == NULL) { + DRV_VDPA_LOG(ERR, "Invalid vDPA device: %p", vdev); + return -ENODEV; + } + + rte_vhost_get_negotiated_features(vid, &features); + + if (RTE_VHOST_NEED_LOG(features) == 0) + return 0; + + device = node->device; + if (device->hw.sw_lm) { + ret = nfp_vdpa_sw_fallback(device); + if (ret != 0) { + DRV_VDPA_LOG(ERR, "Software fallback start failed"); + return -1; + } + } + return 0; } diff --git a/drivers/vdpa/nfp/nfp_vdpa_core.c b/drivers/vdpa/nfp/nfp_vdpa_core.c index 79ecd2b4fc..50eda4cb2c 100644 --- a/drivers/vdpa/nfp/nfp_vdpa_core.c +++ b/drivers/vdpa/nfp/nfp_vdpa_core.c @@ -91,6 +91,8 @@ nfp_vdpa_hw_init(struct nfp_vdpa_hw *vdpa_hw, tx_bar = (uint8_t *)pci_dev->mem_resource[2].addr + tx_bar_off; hw->qcp_cfg = tx_bar + NFP_QCP_QUEUE_ADDR_SZ; + vdpa_hw->sw_lm = true; + vdpa_hw->features = (1ULL << VIRTIO_F_VERSION_1) | (1ULL << VIRTIO_F_IN_ORDER) | (1ULL << VHOST_USER_F_PROTOCOL_FEATURES); diff --git a/drivers/vdpa/nfp/nfp_vdpa_core.h b/drivers/vdpa/nfp/nfp_vdpa_core.h index a8e0d6dd70..0f880fc0c6 100644 --- a/drivers/vdpa/nfp/nfp_vdpa_core.h +++ b/drivers/vdpa/nfp/nfp_vdpa_core.h @@ -36,6 +36,10 @@ struct nfp_vdpa_hw { uint8_t mac_addr[RTE_ETHER_ADDR_LEN]; uint8_t notify_region; uint8_t nr_vring; + + /** Software Live Migration */ + bool sw_lm; + bool sw_fallback_running; }; int nfp_vdpa_hw_init(struct nfp_vdpa_hw *vdpa_hw, struct rte_pci_device *dev); -- 2.39.1
[PATCH v4 06/11] vdpa/nfp: add the interrupt logic of vring relay
From: Xinying Yu Add the interrupt setup logic of vring relay. The epoll fd is provided here so host can get the interrupt from device on Rx direction, all other operations on vring relay are based on this. Signed-off-by: Xinying Yu Reviewed-by: Chaoyong He Reviewed-by: Long Wu Reviewed-by: Peng Zhang Reviewed-by: Maxime Coquelin --- drivers/vdpa/nfp/nfp_vdpa.c | 24 ++-- 1 file changed, 22 insertions(+), 2 deletions(-) diff --git a/drivers/vdpa/nfp/nfp_vdpa.c b/drivers/vdpa/nfp/nfp_vdpa.c index 45092cb0af..1643ebbb8c 100644 --- a/drivers/vdpa/nfp/nfp_vdpa.c +++ b/drivers/vdpa/nfp/nfp_vdpa.c @@ -336,8 +336,10 @@ nfp_vdpa_stop(struct nfp_vdpa_dev *device) } static int -nfp_vdpa_enable_vfio_intr(struct nfp_vdpa_dev *device) +nfp_vdpa_enable_vfio_intr(struct nfp_vdpa_dev *device, + bool relay) { + int fd; int ret; uint16_t i; int *fd_ptr; @@ -366,6 +368,19 @@ nfp_vdpa_enable_vfio_intr(struct nfp_vdpa_dev *device) fd_ptr[RTE_INTR_VEC_RXTX_OFFSET + i] = vring.callfd; } + if (relay) { + for (i = 0; i < nr_vring; i += 2) { + fd = eventfd(0, EFD_NONBLOCK | EFD_CLOEXEC); + if (fd < 0) { + DRV_VDPA_LOG(ERR, "Can't setup eventfd"); + return -EINVAL; + } + + device->intr_fd[i] = fd; + fd_ptr[RTE_INTR_VEC_RXTX_OFFSET + i] = fd; + } + } + ret = ioctl(device->vfio_dev_fd, VFIO_DEVICE_SET_IRQS, irq_set); if (ret != 0) { DRV_VDPA_LOG(ERR, "Error enabling MSI-X interrupts."); @@ -556,7 +571,7 @@ update_datapath(struct nfp_vdpa_dev *device) if (ret != 0) goto unlock_exit; - ret = nfp_vdpa_enable_vfio_intr(device); + ret = nfp_vdpa_enable_vfio_intr(device, false); if (ret != 0) goto dma_map_rollback; @@ -619,6 +634,11 @@ nfp_vdpa_sw_fallback(struct nfp_vdpa_dev *device) goto error; } + /* Setup interrupt for vring relay */ + ret = nfp_vdpa_enable_vfio_intr(device, true); + if (ret != 0) + goto error; + device->hw.sw_fallback_running = true; return 0; -- 2.39.1
[PATCH v4 07/11] vdpa/nfp: setup the VF configure
From: Xinying Yu

Create the relay vring on the host and then set the address of the Rx
used ring in the VF config BAR, so that the device can DMA the used ring
information to the host rather than directly to the VM. Use
'NFP_NET_CFG_CTRL_LM_RELAY' to notify the device side, and enable the
MSI-X interrupt on the device.

The Tx ring address does not need to change, since the relay vring only
assists the Rx ring with dirty page logging.

Signed-off-by: Xinying Yu
Reviewed-by: Chaoyong He
Reviewed-by: Long Wu
Reviewed-by: Peng Zhang
Reviewed-by: Maxime Coquelin
---
 drivers/common/nfp/nfp_common_ctrl.h |   3 +
 drivers/vdpa/nfp/nfp_vdpa.c          | 203 ---
 drivers/vdpa/nfp/nfp_vdpa_core.c     |  55 ++--
 drivers/vdpa/nfp/nfp_vdpa_core.h     |   8 ++
 4 files changed, 239 insertions(+), 30 deletions(-)

diff --git a/drivers/common/nfp/nfp_common_ctrl.h b/drivers/common/nfp/nfp_common_ctrl.h
index 1b30f81fdb..8a760ddb4b 100644
--- a/drivers/common/nfp/nfp_common_ctrl.h
+++ b/drivers/common/nfp/nfp_common_ctrl.h
@@ -207,6 +207,9 @@ struct nfp_net_fw_ver {
 #define NFP_NET_CFG_CTRL_FLOW_STEER (0x1 << 8) /**< Flow Steering */
 #define NFP_NET_CFG_CTRL_VIRTIO (0x1 << 10) /**< Virtio offload */
 #define NFP_NET_CFG_CTRL_IN_ORDER (0x1 << 11) /**< Virtio in-order flag */
+#define NFP_NET_CFG_CTRL_LM_RELAY (0x1 << 12) /**< Virtio live migration relay start */
+#define NFP_NET_CFG_CTRL_NOTIFY_DATA (0x1 << 13) /**< Virtio notification data flag */
+#define NFP_NET_CFG_CTRL_SWLM (0x1 << 14) /**< Virtio SW live migration enable */
 #define NFP_NET_CFG_CTRL_USO (0x1 << 16) /**< UDP segmentation offload */
 
 #define NFP_NET_CFG_CAP_WORD1 0x00a4
diff --git a/drivers/vdpa/nfp/nfp_vdpa.c b/drivers/vdpa/nfp/nfp_vdpa.c
index 1643ebbb8c..983123ba08 100644
--- a/drivers/vdpa/nfp/nfp_vdpa.c
+++ b/drivers/vdpa/nfp/nfp_vdpa.c
@@ -11,6 +11,8 @@
 #include
 #include
 #include
+#include
+#include
 #include
 
 #include "nfp_vdpa_core.h"
@@ -21,6 +23,9 @@
 #define MSIX_IRQ_SET_BUF_LEN (sizeof(struct vfio_irq_set) + \
 		sizeof(int) * (NFP_VDPA_MAX_QUEUES * 2 + 1))
 
+#define NFP_VDPA_USED_RING_LEN(size) \
+		((size) * sizeof(struct vring_used_elem) + sizeof(struct vring_used))
+
 struct nfp_vdpa_dev {
 	struct rte_pci_device *pci_dev;
 	struct rte_vdpa_device *vdev;
@@ -261,15 +266,85 @@ nfp_vdpa_qva_to_gpa(int vid,
 	return gpa;
 }
 
+static void
+nfp_vdpa_relay_vring_free(struct nfp_vdpa_dev *device,
+		uint16_t vring_index)
+{
+	uint16_t i;
+	uint64_t size;
+	struct rte_vhost_vring vring;
+	uint64_t m_vring_iova = NFP_VDPA_RELAY_VRING;
+
+	for (i = 0; i < vring_index; i++) {
+		rte_vhost_get_vhost_vring(device->vid, i, &vring);
+
+		size = RTE_ALIGN_CEIL(vring_size(vring.size, rte_mem_page_size()),
+				rte_mem_page_size());
+		rte_vfio_container_dma_unmap(device->vfio_container_fd,
+				(uint64_t)(uintptr_t)device->hw.m_vring[i].desc,
+				m_vring_iova, size);
+
+		rte_free(device->hw.m_vring[i].desc);
+		m_vring_iova += size;
+	}
+}
+
 static int
-nfp_vdpa_start(struct nfp_vdpa_dev *device)
+nfp_vdpa_relay_vring_alloc(struct nfp_vdpa_dev *device)
+{
+	int ret;
+	uint16_t i;
+	uint64_t size;
+	void *vring_buf;
+	uint64_t page_size;
+	struct rte_vhost_vring vring;
+	struct nfp_vdpa_hw *vdpa_hw = &device->hw;
+	uint64_t m_vring_iova = NFP_VDPA_RELAY_VRING;
+
+	page_size = rte_mem_page_size();
+
+	for (i = 0; i < vdpa_hw->nr_vring; i++) {
+		rte_vhost_get_vhost_vring(device->vid, i, &vring);
+
+		size = RTE_ALIGN_CEIL(vring_size(vring.size, page_size), page_size);
+		vring_buf = rte_zmalloc("nfp_vdpa_relay", size, page_size);
+		if (vring_buf == NULL)
+			goto vring_free_all;
+
+		vring_init(&vdpa_hw->m_vring[i], vring.size, vring_buf, page_size);
+
+		ret = rte_vfio_container_dma_map(device->vfio_container_fd,
+				(uint64_t)(uintptr_t)vring_buf, m_vring_iova, size);
+		if (ret != 0) {
+			DRV_VDPA_LOG(ERR, "vDPA vring relay dma map failed.");
+			goto vring_free_one;
+		}
+
+		m_vring_iova += size;
+	}
+
+	return 0;
+
+vring_free_one:
+	rte_free(device->hw.m_vring[i].desc);
+vring_free_all:
+	nfp_vdpa_relay_vring_free(device, i);
+
+	return -ENOSPC;
+}
+
+static int
+nfp_vdpa_start(struct nfp_vdpa_dev *device,
+		bool relay)
 {
 	int ret;
 	int vid;
 	uint16_t i;
 	uint64_t gpa;
+	uint16_t size;
 	struct rte_vhost_vring vring;
 	struct nfp_vdpa_hw *vdpa_hw = &device->
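The sizing arithmetic in the relay vring allocation above can be sketched in isolation. This is a minimal sketch, not code from the patch: the structures are local mirrors of the virtio split-ring layout, `split_vring_size()` follows the usual split-ring size formula, and `relay_iova()` and the base address passed to it are hypothetical names standing in for the `m_vring_iova` bookkeeping.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local mirrors of the split-ring "used" structures (assumed layout). */
struct used_elem {
	uint32_t id;
	uint32_t len;
};

struct used_hdr {
	uint16_t flags;
	uint16_t idx;
};

/* Same shape as NFP_VDPA_USED_RING_LEN() in the patch. */
static size_t
used_ring_len(uint32_t num)
{
	return num * sizeof(struct used_elem) + sizeof(struct used_hdr);
}

/* Round v up to a multiple of align; align must be a power of two. */
static uint64_t
align_ceil(uint64_t v, uint64_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (v + align - 1) & ~(align - 1);
}

/*
 * Size of a split vring with num entries: descriptor table plus avail
 * ring, padded to align, followed by the used ring.
 */
static uint64_t
split_vring_size(uint32_t num, uint64_t align)
{
	uint64_t desc_avail = 16ULL * num + 2ULL * (3 + num);

	return align_ceil(desc_avail, align) + 2ULL * 3 + 8ULL * num;
}

/*
 * IOVA of relay ring idx when rings are mapped back to back, each
 * rounded up to a whole page (hypothetical helper).
 */
static uint64_t
relay_iova(uint64_t base, const uint32_t *ring_sizes, unsigned int idx,
		uint64_t page)
{
	unsigned int i;
	uint64_t iova = base;

	for (i = 0; i < idx; i++)
		iova += align_ceil(split_vring_size(ring_sizes[i], page), page);

	return iova;
}
```

For a 256-entry ring and 4 KiB pages, each relay ring would occupy three pages under these assumptions, so the second ring starts 12288 bytes after the base. Because the free path recomputes the same sizes in the same order, it arrives at the same IOVAs without storing them.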
[PATCH v4 08/11] vdpa/nfp: recover the ring index on new host
From: Xinying Yu

After migrating to a new host, the vring information is recovered from
the values at offsets 'NFP_NET_CFG_TX_USED_INDEX' and
'NFP_NET_CFG_RX_USED_INDEX'.

Signed-off-by: Xinying Yu
Reviewed-by: Chaoyong He
Reviewed-by: Long Wu
Reviewed-by: Peng Zhang
Reviewed-by: Maxime Coquelin
---
 drivers/common/nfp/nfp_common_ctrl.h |  7 +--
 drivers/vdpa/nfp/nfp_vdpa_core.c     | 13 +
 2 files changed, 18 insertions(+), 2 deletions(-)

diff --git a/drivers/common/nfp/nfp_common_ctrl.h b/drivers/common/nfp/nfp_common_ctrl.h
index 8a760ddb4b..b64bb1dd2d 100644
--- a/drivers/common/nfp/nfp_common_ctrl.h
+++ b/drivers/common/nfp/nfp_common_ctrl.h
@@ -214,8 +214,11 @@ struct nfp_net_fw_ver {
 
 #define NFP_NET_CFG_CAP_WORD1 0x00a4
 
-/* 16B reserved for future use (0x00b0 - 0x00c0). */
-#define NFP_NET_CFG_RESERVED 0x00b0
+#define NFP_NET_CFG_TX_USED_INDEX 0x00b0
+#define NFP_NET_CFG_RX_USED_INDEX 0x00b4
+
+/* 16B reserved for future use (0x00b8 - 0x0010). */
+#define NFP_NET_CFG_RESERVED 0x00b8
 #define NFP_NET_CFG_RESERVED_SZ 0x0010
 
 /*
diff --git a/drivers/vdpa/nfp/nfp_vdpa_core.c b/drivers/vdpa/nfp/nfp_vdpa_core.c
index 2b609dddc2..d7c48e2490 100644
--- a/drivers/vdpa/nfp/nfp_vdpa_core.c
+++ b/drivers/vdpa/nfp/nfp_vdpa_core.c
@@ -100,6 +100,16 @@ nfp_vdpa_hw_init(struct nfp_vdpa_hw *vdpa_hw,
 	return 0;
 }
 
+static void
+nfp_vdpa_hw_queue_init(struct nfp_vdpa_hw *vdpa_hw)
+{
+	/* Distribute ring information to firmware */
+	nn_cfg_writel(&vdpa_hw->super, NFP_NET_CFG_TX_USED_INDEX,
+			vdpa_hw->vring[1].last_used_idx);
+	nn_cfg_writel(&vdpa_hw->super, NFP_NET_CFG_RX_USED_INDEX,
+			vdpa_hw->vring[0].last_used_idx);
+}
+
 static uint32_t
 nfp_vdpa_check_offloads(void)
 {
@@ -198,6 +208,9 @@ nfp_vdpa_queue_config(struct nfp_vdpa_hw *vdpa_hw,
 	nn_cfg_writeq(hw, NFP_NET_CFG_RXR_ADDR(2),
 			vdpa_hw->vring[0].used);
 
+	if (!relay)
+		nfp_vdpa_hw_queue_init(vdpa_hw);
+
 	rte_wmb();
 }
-- 
2.39.1
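To make the register layout in this patch concrete, here is a minimal sketch that models the config BAR as a plain byte array. The offsets are taken from the hunk above. `cfg_writel()`, `cfg_readl()`, `queue_init()` and `CFG_BAR_SIZE` are hypothetical stand-ins, not the driver's `nn_cfg_writel()` accessors, and the real driver writes to mapped device memory rather than a local buffer.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Offsets as defined by the patch. */
#define CFG_TX_USED_INDEX 0x00b0
#define CFG_RX_USED_INDEX 0x00b4
#define CFG_RESERVED      0x00b8
#define CFG_RESERVED_SZ   0x0010

/* Hypothetical size for the simulated config BAR. */
#define CFG_BAR_SIZE 0x100

/* Store a 32-bit value at a byte offset in the simulated BAR. */
static void
cfg_writel(uint8_t *bar, uint32_t off, uint32_t val)
{
	assert(off + sizeof(val) <= CFG_BAR_SIZE);
	memcpy(bar + off, &val, sizeof(val));
}

/* Load a 32-bit value from a byte offset in the simulated BAR. */
static uint32_t
cfg_readl(const uint8_t *bar, uint32_t off)
{
	uint32_t val;

	assert(off + sizeof(val) <= CFG_BAR_SIZE);
	memcpy(&val, bar + off, sizeof(val));
	return val;
}

/*
 * Mirror of the patch's ordering: vring[0] is the Rx ring and vring[1]
 * is the Tx ring, so each last_used_idx goes to the matching register.
 */
static void
queue_init(uint8_t *bar, uint16_t rx_last_used, uint16_t tx_last_used)
{
	cfg_writel(bar, CFG_TX_USED_INDEX, tx_last_used);
	cfg_writel(bar, CFG_RX_USED_INDEX, rx_last_used);
}
```

The two index registers occupy 0x00b0 to 0x00b7. Taking the unchanged `NFP_NET_CFG_RESERVED_SZ` at face value, the reserved block would then run from 0x00b8 up to 0x00c8.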
[PATCH v4 10/11] vdpa/nfp: enable feature bits of live migration
From: Xinying Yu

Add the 'VHOST_F_LOG_ALL' feature bit in order to enable the live
migration function.

Signed-off-by: Xinying Yu
Reviewed-by: Chaoyong He
Reviewed-by: Long Wu
Reviewed-by: Peng Zhang
Reviewed-by: Maxime Coquelin
---
 drivers/vdpa/nfp/nfp_vdpa_core.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/vdpa/nfp/nfp_vdpa_core.c b/drivers/vdpa/nfp/nfp_vdpa_core.c
index 3b3481a99c..70aeb4a3ac 100644
--- a/drivers/vdpa/nfp/nfp_vdpa_core.c
+++ b/drivers/vdpa/nfp/nfp_vdpa_core.c
@@ -95,6 +95,7 @@ nfp_vdpa_hw_init(struct nfp_vdpa_hw *vdpa_hw,
 
 	vdpa_hw->features = (1ULL << VIRTIO_F_VERSION_1) |
 			(1ULL << VIRTIO_F_IN_ORDER) |
+			(1ULL << VHOST_F_LOG_ALL) |
 			(1ULL << VHOST_USER_F_PROTOCOL_FEATURES);
 
 	return 0;
-- 
2.39.1
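As a rough illustration of what the added bit does to the advertised feature mask, the sketch below rebuilds the mask with locally defined constants. The bit numbers are the values commonly defined in the virtio and vhost headers and are restated here as assumptions. The `SK_` names, `sketch_features()` and `has_feature()` are hypothetical and do not exist in the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Feature bit numbers, restated locally (assumed to match the headers). */
#define SK_VHOST_F_LOG_ALL                 26
#define SK_VHOST_USER_F_PROTOCOL_FEATURES  30
#define SK_VIRTIO_F_VERSION_1              32
#define SK_VIRTIO_F_IN_ORDER               35

/* Feature mask as assembled in nfp_vdpa_hw_init() after this patch. */
static uint64_t
sketch_features(void)
{
	return (1ULL << SK_VIRTIO_F_VERSION_1) |
			(1ULL << SK_VIRTIO_F_IN_ORDER) |
			(1ULL << SK_VHOST_F_LOG_ALL) |
			(1ULL << SK_VHOST_USER_F_PROTOCOL_FEATURES);
}

/* True when the given feature bit is set in the mask. */
static bool
has_feature(uint64_t features, unsigned int bit)
{
	assert(bit < 64);
	return (features & (1ULL << bit)) != 0;
}
```

Dirty page logging only becomes usable when the bit survives negotiation, so a caller would typically test `has_feature(device_features & frontend_features, SK_VHOST_F_LOG_ALL)` rather than the device mask alone.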
[PATCH v4 09/11] vdpa/nfp: setup vring relay thread
From: Xinying Yu

Set up the vring relay thread to monitor interrupts from the device, and
do the dirty page logging or notify the device according to the event
data.

Signed-off-by: Xinying Yu
Reviewed-by: Chaoyong He
Reviewed-by: Long Wu
Reviewed-by: Peng Zhang
Reviewed-by: Maxime Coquelin
---
 drivers/vdpa/nfp/nfp_vdpa.c      | 148 +++
 drivers/vdpa/nfp/nfp_vdpa_core.c |   9 ++
 drivers/vdpa/nfp/nfp_vdpa_core.h |   2 +
 3 files changed, 159 insertions(+)

diff --git a/drivers/vdpa/nfp/nfp_vdpa.c b/drivers/vdpa/nfp/nfp_vdpa.c
index 983123ba08..91f1b8d779 100644
--- a/drivers/vdpa/nfp/nfp_vdpa.c
+++ b/drivers/vdpa/nfp/nfp_vdpa.c
@@ -26,6 +26,8 @@
 #define NFP_VDPA_USED_RING_LEN(size) \
 		((size) * sizeof(struct vring_used_elem) + sizeof(struct vring_used))
 
+#define EPOLL_DATA_INTR 1
+
 struct nfp_vdpa_dev {
 	struct rte_pci_device *pci_dev;
 	struct rte_vdpa_device *vdev;
@@ -777,6 +779,139 @@ update_datapath(struct nfp_vdpa_dev *device)
 	return ret;
 }
 
+static int
+nfp_vdpa_vring_epoll_ctl(uint32_t queue_num,
+		struct nfp_vdpa_dev *device)
+{
+	int ret;
+	uint32_t qid;
+	struct epoll_event ev;
+	struct rte_vhost_vring vring;
+
+	for (qid = 0; qid < queue_num; qid++) {
+		ev.events = EPOLLIN | EPOLLPRI;
+		rte_vhost_get_vhost_vring(device->vid, qid, &vring);
+		ev.data.u64 = qid << 1 | (uint64_t)vring.kickfd << 32;
+		ret = epoll_ctl(device->epoll_fd, EPOLL_CTL_ADD, vring.kickfd, &ev);
+		if (ret < 0) {
+			DRV_VDPA_LOG(ERR, "Epoll add error for queue %u", qid);
+			return ret;
+		}
+	}
+
+	/* vDPA driver interrupt */
+	for (qid = 0; qid < queue_num; qid += 2) {
+		ev.events = EPOLLIN | EPOLLPRI;
+		/* Leave a flag to mark it's for interrupt */
+		ev.data.u64 = EPOLL_DATA_INTR | qid << 1 |
+				(uint64_t)device->intr_fd[qid] << 32;
+		ret = epoll_ctl(device->epoll_fd, EPOLL_CTL_ADD,
+				device->intr_fd[qid], &ev);
+		if (ret < 0) {
+			DRV_VDPA_LOG(ERR, "Epoll add error for queue %u", qid);
+			return ret;
+		}
+
+		nfp_vdpa_update_used_ring(device, qid);
+	}
+
+	return 0;
+}
+
+static int
+nfp_vdpa_vring_epoll_wait(uint32_t queue_num,
+		struct nfp_vdpa_dev *device)
+{
+	int i;
+	int fds;
+	int kickfd;
+	uint32_t qid;
+	struct epoll_event events[NFP_VDPA_MAX_QUEUES * 2];
+
+	for (;;) {
+		fds = epoll_wait(device->epoll_fd, events, queue_num * 2, -1);
+		if (fds < 0) {
+			if (errno == EINTR)
+				continue;
+
+			DRV_VDPA_LOG(ERR, "Epoll wait fail");
+			return -EACCES;
+		}
+
+		for (i = 0; i < fds; i++) {
+			qid = events[i].data.u32 >> 1;
+			kickfd = (uint32_t)(events[i].data.u64 >> 32);
+
+			nfp_vdpa_read_kickfd(kickfd);
+			if ((events[i].data.u32 & EPOLL_DATA_INTR) != 0) {
+				nfp_vdpa_update_used_ring(device, qid);
+				nfp_vdpa_irq_unmask(&device->hw);
+			} else {
+				nfp_vdpa_notify_queue(&device->hw, qid);
+			}
+		}
+	}
+
+	return 0;
+}
+
+static uint32_t
+nfp_vdpa_vring_relay(void *arg)
+{
+	int ret;
+	int epoll_fd;
+	uint16_t queue_id;
+	uint32_t queue_num;
+	struct nfp_vdpa_dev *device = arg;
+
+	epoll_fd = epoll_create(NFP_VDPA_MAX_QUEUES * 2);
+	if (epoll_fd < 0) {
+		DRV_VDPA_LOG(ERR, "failed to create epoll instance.");
+		return 1;
+	}
+
+	device->epoll_fd = epoll_fd;
+
+	queue_num = rte_vhost_get_vring_num(device->vid);
+
+	ret = nfp_vdpa_vring_epoll_ctl(queue_num, device);
+	if (ret != 0)
+		goto notify_exit;
+
+	/* Start relay with a first kick */
+	for (queue_id = 0; queue_id < queue_num; queue_id++)
+		nfp_vdpa_notify_queue(&device->hw, queue_id);
+
+	ret = nfp_vdpa_vring_epoll_wait(queue_num, device);
+	if (ret != 0)
+		goto notify_exit;
+
+	return 0;
+
+notify_exit:
+	close(device->epoll_fd);
+	device->epoll_fd = -1;
+
+	return 1;
+}
+
+static int
+nfp_vdpa_setup_vring_relay(struct nfp_vdpa_dev *device)
+{
+	int ret;
+	char name[RTE_THREAD_INTERNAL_NAME_SIZE];
+
+	snprintf(name, sizeof(name), "nfp_vring%d", device->vid);
+	ret = rte_thread_create_internal_control(&device->tid, name,
+			nfp_vdpa_vring_
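The relay thread above packs three things into the 64-bit epoll user data: an interrupt flag in bit 0, the queue ID shifted left by one, and the file descriptor in the upper 32 bits. The sketch below shows that encoding and its decoding on plain integers. The helper names are hypothetical and no epoll calls are made.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit 0 marks a device interrupt fd, as EPOLL_DATA_INTR does in the patch. */
#define SK_DATA_INTR 1u

/* Pack flag, queue ID and fd the way the patch fills ev.data.u64. */
static uint64_t
pack_event_data(uint32_t qid, int fd, bool is_intr)
{
	assert(fd >= 0);
	return (is_intr ? SK_DATA_INTR : 0u) |
			((uint64_t)qid << 1) |
			((uint64_t)(uint32_t)fd << 32);
}

/* Queue ID lives in the low 32 bits, above the flag bit. */
static uint32_t
event_qid(uint64_t data)
{
	return (uint32_t)data >> 1;
}

/* The file descriptor lives in the high 32 bits. */
static int
event_fd(uint64_t data)
{
	return (int)(data >> 32);
}

/* True for the device interrupt fd, false for a guest kick fd. */
static bool
event_is_intr(uint64_t data)
{
	return ((uint32_t)data & SK_DATA_INTR) != 0;
}
```

One design consequence of this layout is that the wait loop can tell a guest kick from a device interrupt without any lookup table: a kick is forwarded to the device as a queue notification, while an interrupt triggers the used ring update and dirty page logging.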
[PATCH v4 11/11] doc: update nfp document
From: Xinying Yu

Add the software assisted vDPA live migration feature into NFP document.

Signed-off-by: Xinying Yu
Reviewed-by: Chaoyong He
Reviewed-by: Long Wu
Reviewed-by: Peng Zhang
Reviewed-by: Maxime Coquelin
---
 doc/guides/vdpadevs/nfp.rst | 9 +
 1 file changed, 9 insertions(+)

diff --git a/doc/guides/vdpadevs/nfp.rst b/doc/guides/vdpadevs/nfp.rst
index dc9e94dbc8..e4736d9f61 100644
--- a/doc/guides/vdpadevs/nfp.rst
+++ b/doc/guides/vdpadevs/nfp.rst
@@ -19,6 +19,15 @@ device will be probed by net/nfp driver and will used as a VF net device.
 
 This PMD uses (common/nfp) code to access the device firmware.
 
+Software Live Migration
+~~~~~~~~~~~~~~~~~~~~~~~
+
+Now the NFP vDPA driver only support software assisted live migration mode.
+In this mode, the driver will setup a software relay thread when live migration
+happens, this thread will help device to log dirty pages. Although this mode
+does not require hardware to implement a dirty page logging function block, it
+will consume percentage of CPU resource depending on the network throughput.
+
 Per-Device Parameters
 ~~~~~~~~~~~~~~~~~~~~~
 
-- 
2.39.1
Re: 22.11.6 patches review and test
Red Hat QE tested the 18 scenarios below on RHEL 9.2 and did not find any new DPDK issues.

- VM with device assignment(PF) throughput testing(1G hugepage size): PASS
- VM with device assignment(PF) throughput testing(2M hugepage size): PASS
- VM with device assignment(VF) throughput testing: PASS
- PVP (host dpdk testpmd as vswitch) 1Q: throughput testing: PASS
- PVP vhost-user 2Q throughput testing: PASS
- PVP vhost-user 1Q - cross numa node throughput testing: PASS
- VM with vhost-user 2 queues throughput testing: PASS
- vhost-user reconnect with dpdk-client, qemu-server qemu reconnect: PASS
- vhost-user reconnect with dpdk-client, qemu-server ovs reconnect: PASS
- PVP reconnect with dpdk-client, qemu-server: PASS
- PVP 1Q live migration testing: PASS
- PVP 1Q cross numa node live migration testing: PASS
- VM with ovs+dpdk+vhost-user 1Q live migration testing: PASS
- VM with ovs+dpdk+vhost-user 1Q live migration testing (2M): PASS
- VM with ovs+dpdk+vhost-user 2Q live migration testing: PASS
- VM with ovs+dpdk+vhost-user 4Q live migration testing: PASS
- Host PF + DPDK testing: PASS
- Host VF + DPDK testing: PASS

Test Versions:
- qemu-kvm-7.2.0
- kernel 5.14
- libvirt 9.0
- openvswitch 3.1
- git log
  commit 2480dbd434234a40e7f999ced4650581fd64a24e
  Author: Luca Boccassi
  Date: Wed Jul 31 20:35:00 2024 +0100

      version: 22.11.6-rc1

      Signed-off-by: Luca Boccassi

- Test device : X540-AT2 NIC(ixgbe, 10G)

Tested-by: Yanghang Liu

On Thu, Aug 1, 2024 at 3:37 AM wrote:
> Hi all,
>
> Here is a list of patches targeted for stable release 22.11.6.
>
> The planned date for the final release is August 20th.
>
> Please help with testing and validation of your use cases and report
> any issues/results with reply-all to this mail. For the final release
> the fixes and reported validations will be added to the release notes.
>
> A release candidate tarball can be found at:
>
>     https://dpdk.org/browse/dpdk-stable/tag/?id=v22.11.6-rc1
>
> These patches are located at branch 22.11 of dpdk-stable repo:
>     https://dpdk.org/browse/dpdk-stable/
>
> Thanks.
>
> Luca Boccassi
>
> ---
> Abdullah Ömer Yamaç (1):
>       hash: fix RCU reclamation size
>
> Akhil Goyal (1):
>       test/crypto: fix enqueue/dequeue callback case
>
> Alex Vesker (1):
>       net/mlx5/hws: fix port ID on root item convert
>
> Alexander Kozyrev (2):
>       net/mlx5: break flow resource release loop
>       app/testpmd: add postpone option to async flow destroy
>
> Anatoly Burakov (7):
>       net/e1000/base: fix link power down
>       fbarray: fix incorrect lookahead behavior
>       fbarray: fix incorrect lookbehind behavior
>       fbarray: fix lookahead ignore mask handling
>       fbarray: fix lookbehind ignore mask handling
>       fbarray: fix finding for unaligned length
>       malloc: fix multi-process wait condition handling
>
> Andrew Boyer (1):
>       net/ionic: fix mbuf double-free when emptying array
>
> Apeksha Gupta (2):
>       bus/dpaa: fix memory leak in bus scan
>       common/dpaax: fix node array overrun
>
> Arkadiusz Kusztal (1):
>       crypto/qat: fix placement of OOP offset
>
> Bing Zhao (3):
>       net/mlx5: fix end condition of reading xstats
>       net/mlx5: fix uplink port probing in bonding mode
>       common/mlx5: remove unneeded field when modify RQ table
>
> Brian Dooley (1):
>       crypto/qat: fix GEN4 write
>
> Bruce Richardson (2):
>       net/ice: fix sizing of filter hash table
>       ethdev: fix device init without socket-local memory
>
> Chaoyong He (5):
>       app/testpmd: fix help string of BPF load command
>       net/nfp: fix IPv6 TTL and DSCP flow action
>       net/nfp: fix allocation of switch domain
>       net/nfp: forbid offload flow rules with empty action list
>       net/nfp: remove redundant function call
>
> Chengwen Feng (2):
>       net/hns3: check Rx DMA address alignmnent
>       dma/hisilicon: remove support for HIP09 platform
>
> Chenming Chang (1):
>       hash: fix return code description in Doxygen
>
> Chinh Cao (1):
>       net/ice/base: fix return type of bitmap hamming weight
>
> Christian Ehrhardt (1):
>       test: force IOVA mode on PPC64 without huge pages
>
> Ciara Loftus (4):
>       net/af_xdp: fix port ID in Rx mbuf
>       net/af_xdp: count mbuf allocation failures
>       net/af_xdp: fix stats reset
>       net/af_xdp: remove unused local statistic
>
> Ciara Power (1):
>       test/crypto: fix vector global buffer overflow
>
> Conor Fogarty (1):
>       hash: check name when creating a hash
>
> Dariusz Sosnowski (2):
>       net/mlx5: fix MTU configuration
>       net/mlx5: fix disabling E-Switch default flow rules
>
> David Marchand (14):
>       bus/pci: fix build with musl 1.2.4 / Alpine 3.19
>       eal/unix: support ZSTD compression for firmware
>       net/ice: fix check for outer UDP checksum offload
>       app/testpmd: fix outer IP checksum offload
>       net: fix outer