[DPDK/core Bug 1509] Build failure on Ubuntu 24.04 with Link Time Optimization (LTO)

2024-08-04 Thread bugzilla
https://bugs.dpdk.org/show_bug.cgi?id=1509

Bug ID: 1509
   Summary: Build failure on Ubuntu 24.04 with Link Time Optimization (LTO)
   Product: DPDK
   Version: 24.07
  Hardware: x86
OS: Linux
Status: UNCONFIRMED
  Severity: normal
  Priority: Normal
 Component: core
  Assignee: dev@dpdk.org
  Reporter: alia...@nvidia.com
  Target Milestone: ---

$ meson --werror -Db_lto=true -Dc_args='-O3'
-Ddisable_drivers=net/ena,net/tap,*/octeontx,*/cnxk build && ninja -C build
[..]
[2453/2453] Linking target app/dpdk-test
In function ‘memcpy’,
inlined from ‘test_argparse_copy’ at ../app/test/test_argparse.c:94:3,
inlined from ‘test_argparse_init_obj’ at ../app/test/test_argparse.c:106:2,
inlined from ‘test_argparse_invalid_basic_param’ at
../app/test/test_argparse.c:116:8:
/usr/include/x86_64-linux-gnu/bits/string_fortified.h:29:10: warning:
‘__builtin___memcpy_chk’ writing 48 bytes into a region of size 0 overflows the
destination [-Wstringop-overflow=]
[..]

OS: Ubuntu 24.04
Meson: 1.3.2
Compiler: gcc 13.2.0

(Note: some drivers are disabled because they report other errors)

-- 
You are receiving this mail because:
You are the assignee for the bug.

[DPDK/ethdev Bug 1510] net/ena build failure on Ubuntu 24.04 with LTO

2024-08-04 Thread bugzilla
https://bugs.dpdk.org/show_bug.cgi?id=1510

Bug ID: 1510
   Summary: net/ena build failure on Ubuntu 24.04 with LTO
   Product: DPDK
   Version: 24.07
  Hardware: x86
OS: Linux
Status: UNCONFIRMED
  Severity: normal
  Priority: Normal
 Component: ethdev
  Assignee: dev@dpdk.org
  Reporter: alia...@nvidia.com
  Target Milestone: ---

$ meson --werror -Db_lto=true build && ninja -C build
[..]
[1586/2774] Linking target drivers/librte_net_ena.so.24.2
../lib/eal/x86/include/rte_memcpy.h: In function ‘ena_rss_hash_conf_get’:
../lib/eal/x86/include/rte_memcpy.h:128:9: warning: writing 32 bytes into a
region of size 8 [-Wstringop-overflow=]
  128 | _mm256_storeu_si256((__m256i *)(void *)dst, ymm0);
  | ^
../drivers/net/ena/ena_rss.c:452:17: note: at offset 32 into destination object
‘hw_rss_key’ of size 40
  452 | uint8_t hw_rss_key[ENA_HASH_KEY_SIZE];
  | ^
../lib/eal/x86/include/rte_memcpy.h:128:9: warning: writing 32 bytes into a
region of size 0 [-Wstringop-overflow=]
  128 | _mm256_storeu_si256((__m256i *)(void *)dst, ymm0);
  | ^
../drivers/net/ena/ena_rss.c:452:17: note: at offset 64 into destination object
‘hw_rss_key’ of size 40
  452 | uint8_t hw_rss_key[ENA_HASH_KEY_SIZE];
  | ^
../lib/eal/x86/include/rte_memcpy.h:128:9: warning: writing 32 bytes into a
region of size 0 [-Wstringop-overflow=]
  128 | _mm256_storeu_si256((__m256i *)(void *)dst, ymm0);
  | ^
../drivers/net/ena/ena_rss.c:452:17: note: at offset 96 into destination object
‘hw_rss_key’ of size 40
  452 | uint8_t hw_rss_key[ENA_HASH_KEY_SIZE];
  | ^
../lib/eal/x86/include/rte_memcpy.h:128:9: warning: writing 32 bytes into a
region of size 0 [-Wstringop-overflow=]
  128 | _mm256_storeu_si256((__m256i *)(void *)dst, ymm0);
  | ^
../drivers/net/ena/ena_rss.c:452:17: note: at offset [131, 17179869245] into
destination object ‘hw_rss_key’ of size 40
  452 | uint8_t hw_rss_key[ENA_HASH_KEY_SIZE];
  | ^
../drivers/net/ena/ena_rss.c:452:17: note: at offset [129, 193] into
destination object ‘hw_rss_key’ of size 40
../drivers/net/ena/ena_rss.c:452:17: note: at offset [131, 17179869245] into
destination object ‘hw_rss_key’ of size 40
../drivers/net/ena/ena_rss.c:452:17: note: at offset [129, 193] into
destination object ‘hw_rss_key’ of size 40
../drivers/net/ena/ena_rss.c:452:17: note: at offset [1, 40] into destination
object ‘hw_rss_key’ of size 40
../lib/eal/x86/include/rte_memcpy.h:128:9: warning: writing 32 bytes into a
region of size 8 [-Wstringop-overflow=]
  128 | _mm256_storeu_si256((__m256i *)(void *)dst, ymm0);
  | ^
../drivers/net/ena/ena_rss.c:452:17: note: at offset [32, 40] into destination
object ‘hw_rss_key’ of size 40
[..]

OS: Ubuntu 24.04
Meson: 1.3.2
Compiler: gcc 13.2.0


[DPDK/ethdev Bug 1511] net/tap build failure on Ubuntu 24.04 with LTO

2024-08-04 Thread bugzilla
https://bugs.dpdk.org/show_bug.cgi?id=1511

Bug ID: 1511
   Summary: net/tap build failure on Ubuntu 24.04 with LTO
   Product: DPDK
   Version: 24.07
  Hardware: All
OS: All
Status: UNCONFIRMED
  Severity: normal
  Priority: Normal
 Component: ethdev
  Assignee: dev@dpdk.org
  Reporter: alia...@nvidia.com
  Target Milestone: ---

$ meson --werror -Db_lto=true build && ninja -C build
[..]
[2109/2774] Linking target drivers/librte_net_tap.so.24.2
../drivers/net/tap/tap_netlink.c: In function ‘tap_flow_create_ipv6’:
../drivers/net/tap/tap_netlink.c:305:9: warning: writing 16 bytes into a region
of size 12 [-Wstringop-overflow=]
  305 | memcpy(RTA_DATA(rta), data, data_len);
  | ^
../drivers/net/tap/tap_netlink.h:20:25: note: at offset [4, 16] into
destination object ‘nh’ of size 16
   20 | struct nlmsghdr nh;
  | ^
../drivers/net/tap/tap_netlink.c:305:9: warning: writing 16 bytes into a region
of size 12 [-Wstringop-overflow=]
  305 | memcpy(RTA_DATA(rta), data, data_len);
  | ^
../drivers/net/tap/tap_netlink.h:20:25: note: at offset [4, 16] into
destination object ‘nh’ of size 16
   20 | struct nlmsghdr nh;
  | ^
../drivers/net/tap/tap_netlink.c:305:9: warning: writing 16 bytes into a region
of size 12 [-Wstringop-overflow=]
  305 | memcpy(RTA_DATA(rta), data, data_len);
  | ^
../drivers/net/tap/tap_netlink.h:20:25: note: at offset [4, 16] into
destination object ‘nh’ of size 16
   20 | struct nlmsghdr nh;
  | ^
../drivers/net/tap/tap_netlink.c:305:9: warning: writing 16 bytes into a region
of size 12 [-Wstringop-overflow=]
  305 | memcpy(RTA_DATA(rta), data, data_len);
  | ^
../drivers/net/tap/tap_netlink.h:20:25: note: at offset [4, 16] into
destination object ‘nh’ of size 16
   20 | struct nlmsghdr nh;
  | ^
[2755/2774] Linking target app/dpdk-test-pipeline
../drivers/net/tap/tap_netlink.c: In function ‘tap_flow_create_ipv6’:
../drivers/net/tap/tap_netlink.c:305:9: warning: writing 16 bytes into a region
of size 12 [-Wstringop-overflow=]
  305 | memcpy(RTA_DATA(rta), data, data_len);
  | ^
../drivers/net/tap/tap_netlink.h:20:25: note: at offset [4, 16] into
destination object ‘nh’ of size 16
   20 | struct nlmsghdr nh;
  | ^
../drivers/net/tap/tap_netlink.c:305:9: warning: writing 16 bytes into a region
of size 12 [-Wstringop-overflow=]
  305 | memcpy(RTA_DATA(rta), data, data_len);
  | ^
../drivers/net/tap/tap_netlink.h:20:25: note: at offset [4, 16] into
destination object ‘nh’ of size 16
   20 | struct nlmsghdr nh;
  | ^
../drivers/net/tap/tap_netlink.c:305:9: warning: writing 16 bytes into a region
of size 12 [-Wstringop-overflow=]
  305 | memcpy(RTA_DATA(rta), data, data_len);
  | ^
../drivers/net/tap/tap_netlink.h:20:25: note: at offset [4, 16] into
destination object ‘nh’ of size 16
   20 | struct nlmsghdr nh;
  | ^
../drivers/net/tap/tap_netlink.c:305:9: warning: writing 16 bytes into a region
of size 12 [-Wstringop-overflow=]
  305 | memcpy(RTA_DATA(rta), data, data_len);
  | ^
../drivers/net/tap/tap_netlink.h:20:25: note: at offset [4, 16] into
destination object ‘nh’ of size 16
[..]

OS: Ubuntu 24.04
Meson: 1.3.2
Compiler: gcc 13.2.0


[DPDK/ethdev Bug 1512] event/octeontx build failure on Ubuntu 24.04 with LTO

2024-08-04 Thread bugzilla
https://bugs.dpdk.org/show_bug.cgi?id=1512

Bug ID: 1512
   Summary: event/octeontx build failure on Ubuntu 24.04 with LTO
   Product: DPDK
   Version: 24.07
  Hardware: All
OS: All
Status: UNCONFIRMED
  Severity: normal
  Priority: Normal
 Component: ethdev
  Assignee: dev@dpdk.org
  Reporter: alia...@nvidia.com
  Target Milestone: ---

$ meson --werror -Db_lto=true build && ninja -C build
[..]
In function ‘ssovf_parsekv’,
inlined from ‘rte_kvargs_process’ at ../lib/kvargs/rte_kvargs.c:188:9,
inlined from ‘ssovf_vdev_probe’ at
../drivers/event/octeontx/ssovf_evdev.c:872:10:
../drivers/event/octeontx/ssovf_evdev.c:723:15: warning: writing 4 bytes into a
region of size 1 [-Wstringop-overflow=]
  723 | *flag = !!atoi(value);
  |   ^
../drivers/event/octeontx/ssovf_evdev.c: In function ‘ssovf_vdev_probe’:
../drivers/event/octeontx/ssovf_evdev.c:26:16: note: destination object
‘timvf_enable_stats’ of size 1
[..]

OS: Ubuntu 24.04
Meson: 1.3.2
Compiler: gcc 13.2.0
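
The warning above describes a store that is wider than its 1-byte destination. A minimal sketch of a width-matched handler follows. It is not the driver's code: `parse_flag_u8` is a hypothetical name, and its signature only imitates a key/value callback that receives the destination through an opaque pointer.

```c
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical key/value handler (illustration only).
 * The opaque pointer is converted to a pointer of the same width as the
 * destination object (uint8_t), so exactly one byte is written.
 */
static int
parse_flag_u8(const char *key, const char *value, void *opaque)
{
	uint8_t *flag = opaque;

	(void)key; /* unused in this sketch */
	*flag = !!atoi(value); /* normalize any non-zero input to 1 */
	return 0;
}
```

Storing through an `int *` in the same handler would write `sizeof(int)` bytes, which matches the "writing 4 bytes into a region of size 1" text in the report.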


[DPDK/ethdev Bug 1513] */cnxk build failure on Ubuntu 24.04 with LTO

2024-08-04 Thread bugzilla
https://bugs.dpdk.org/show_bug.cgi?id=1513

Bug ID: 1513
   Summary: */cnxk build failure on Ubuntu 24.04 with LTO
   Product: DPDK
   Version: 24.07
  Hardware: All
OS: All
Status: UNCONFIRMED
  Severity: normal
  Priority: Normal
 Component: ethdev
  Assignee: dev@dpdk.org
  Reporter: alia...@nvidia.com
  Target Milestone: ---

$ meson --werror -Db_lto=true build && ninja -C build
[..]
../drivers/common/cnxk/roc_irq.c: In function ‘irq_config’:
../drivers/common/cnxk/roc_irq.c:53:14: warning: argument to variable-length
array is too large [-Wvla-larger-than=]
   53 | char irq_set_buf[MSIX_IRQ_SET_BUF_LEN];
  |  ^
../drivers/common/cnxk/roc_irq.c:53:14: note: limit is 9223372036854775807
bytes, but argument is 18446744073709551548
../drivers/common/cnxk/roc_irq.c: In function ‘irq_init’:
../drivers/common/cnxk/roc_irq.c:90:14: warning: argument to variable-length
array is too large [-Wvla-larger-than=]
   90 | char irq_set_buf[MSIX_IRQ_SET_BUF_LEN];
  |  ^
../drivers/common/cnxk/roc_irq.c:90:14: note: limit is 9223372036854775807
bytes, but argument is 18446744073709551548
In function ‘parse_flag’,
inlined from ‘rte_kvargs_process’ at ../lib/kvargs/rte_kvargs.c:188:9,
inlined from ‘cnxk_ethdev_parse_devargs’ at
../drivers/net/cnxk/cnxk_ethdev_devargs.c:339:2:
../drivers/net/cnxk/cnxk_ethdev_devargs.c:167:33: warning: writing 2 bytes into
a region of size 1 [-Wstringop-overflow=]
  167 | *(uint16_t *)extra_args = atoi(value);
  | ^
../drivers/net/cnxk/cnxk_ethdev_devargs.c: In function
‘cnxk_ethdev_parse_devargs’:
../drivers/net/cnxk/cnxk_ethdev_devargs.c:310:17: note: destination object
‘lock_rx_ctx’ of size 1
  310 | uint8_t lock_rx_ctx = 0;
  | ^
In function ‘parse_flag’,
inlined from ‘rte_kvargs_process’ at ../lib/kvargs/rte_kvargs.c:188:9,
inlined from ‘cnxk_ethdev_parse_devargs’ at
../drivers/net/cnxk/cnxk_ethdev_devargs.c:362:2:
../drivers/net/cnxk/cnxk_ethdev_devargs.c:167:33: warning: writing 2 bytes into
a region of size 1 [-Wstringop-overflow=]
  167 | *(uint16_t *)extra_args = atoi(value);
  | ^
../drivers/net/cnxk/cnxk_ethdev_devargs.c: In function
‘cnxk_ethdev_parse_devargs’:
../drivers/net/cnxk/cnxk_ethdev_devargs.c:311:17: note: destination object
‘rx_inj_ena’ of size 1
  311 | uint8_t rx_inj_ena = 0;
  | ^
[2766/2774] Linking target app/dpdk-test-cmdline
[..]

OS: Ubuntu 24.04
Meson: 1.3.2
Compiler: gcc 13.2.0
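
A note on the numbers in the -Wvla-larger-than output: the limit 9223372036854775807 is 2^63 - 1, and the reported argument 18446744073709551548 equals 2^64 - 68. One possible reading, which the report does not confirm, is that the compiler considers a path on which the size expression is negative and wraps when converted to an unsigned size. The helper below (hypothetical name `wrapped_to_signed`) only demonstrates the arithmetic.

```c
#include <stdint.h>

/*
 * Map a 64-bit unsigned size back to the signed value that would wrap
 * to it. Written without an out-of-range conversion, so the result does
 * not depend on implementation-defined behavior.
 */
static int64_t
wrapped_to_signed(uint64_t size)
{
	if (size <= (uint64_t)INT64_MAX)
		return (int64_t)size;
	/* size >= 2^63: distance below 2^64, negated */
	return -(int64_t)(UINT64_MAX - size) - 1;
}
```

Calling `wrapped_to_signed(18446744073709551548ULL)` yields -68.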


Re: [PATCH v5] virtio: optimize stats counters performance

2024-08-04 Thread lihuisong (C)



On 2024/8/2 19:27, Morten Brørup wrote:

From: lihuisong (C) [mailto:lihuis...@huawei.com]
Sent: Friday, 2 August 2024 04.23

On 2024/8/2 0:03, Morten Brørup wrote:

   void
-virtio_update_packet_stats(struct virtnet_stats *stats, struct rte_mbuf *mbuf)
+virtio_update_packet_stats(struct virtnet_stats *const stats,
+   const struct rte_mbuf *const mbuf)

Are the two consts also for performance?  Is there a gain?

The "const struct rte_mbuf * mbuf" informs the optimizer that the function does 
not modify the mbuf.  This is for the benefit of callers of the function; they can rely 
on the mbuf being unchanged when the function returns.
So, if the optimizer has cached some mbuf field before calling the function, it 
does not need to read the mbuf field again, but can continue using the cached 
value after the function call.

The two "type *const ptr" probably make no performance difference with the 
compilers and function call conventions used by the CPU architectures supported by DPDK.

ok
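
To make the two positions of const discussed above concrete, here is a minimal sketch. The types `demo_pkt` and `demo_counters` are hypothetical stand-ins for the mbuf and stats structures, not DPDK definitions.

```c
#include <stdint.h>

struct demo_pkt {            /* hypothetical stand-in for an mbuf */
	uint32_t len;
};

struct demo_counters {       /* hypothetical stand-in for the stats */
	uint64_t bytes;
};

/*
 * "const struct demo_pkt *p": the pointee cannot be modified through p;
 *     this is part of the contract visible to callers.
 * "*const c" / "*const p": the parameter variables themselves cannot be
 *     reassigned inside the body; callers cannot observe this.
 */
static void
demo_account(struct demo_counters *const c, const struct demo_pkt *const p)
{
	c->bytes += p->len;
	/* p->len = 0;  would not compile: pointee is const */
	/* p = 0;       would not compile: pointer itself is const */
}
```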



+   RTE_BUILD_BUG_ON(offsetof(struct virtnet_stats, broadcast) !=
+   offsetof(struct virtnet_stats, multicast) + sizeof(uint64_t));
+   if (unlikely(rte_is_multicast_ether_addr(ea)))
+   (&stats->multicast)[rte_is_broadcast_ether_addr(ea)]++;

The "rte_is_broadcast_ether_addr(ea)" will be calculated twice if
the packet is multicast.
How about coding it like this:
-->
is_multicast = rte_is_multicast_ether_addr(ea);
if (unlikely(is_multicast))
(&stats->multicast)[rte_is_broadcast_ether_addr(ea)]++;

I don't think "rte_is_broadcast_ether_addr(ea)" is calculated twice for 
multicast packets.
My code essentially does this:
if (mc(ea))
stats[bc(ea)]++;

Changing to this shouldn't make a difference:
m = mc(ea);
if (m)
stats[bc(ea)]++;

Yeah, you are right.
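
The "if (mc(ea)) stats[bc(ea)]++" idea from this exchange can be sketched in plain C. Note the substitution: this sketch indexes an explicit two-element array instead of indexing past the multicast field as the patch does, so it stays within one object. All names (`demo_stats`, `demo_is_multicast`, `demo_is_broadcast`, `demo_count_dst`) are hypothetical.

```c
#include <stdint.h>
#include <string.h>

enum { DEMO_MULTICAST = 0, DEMO_BROADCAST = 1 };

struct demo_stats {
	uint64_t mc_bc[2]; /* [0] = multicast, [1] = broadcast */
};

/* Group bit: least significant bit of the first address byte. */
static int
demo_is_multicast(const uint8_t ea[6])
{
	return ea[0] & 0x01;
}

/* Broadcast: all six bytes are 0xff. Returns exactly 0 or 1. */
static int
demo_is_broadcast(const uint8_t ea[6])
{
	static const uint8_t bcast[6] = {0xff, 0xff, 0xff, 0xff, 0xff, 0xff};

	return memcmp(ea, bcast, sizeof(bcast)) == 0;
}

static void
demo_count_dst(struct demo_stats *st, const uint8_t ea[6])
{
	/* One branch; the broadcast test runs once, only for multicast. */
	if (demo_is_multicast(ea))
		st->mc_bc[demo_is_broadcast(ea)]++;
}
```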




Re: [PATCH v5] virtio: optimize stats counters performance

2024-08-04 Thread lihuisong (C)

LGTM,

Acked-by: Huisong Li 

On 2024/8/2 0:03, Morten Brørup wrote:

Optimized the performance of updating the virtio statistics counters by
reducing the number of branches.

Ordered the packet size comparisons according to the probability with
typical internet traffic mix.

Signed-off-by: Morten Brørup 
---
v5:
* Do not inline the function. (Stephen)
v4:
* Consider multicast/broadcast packets unlikely.
v3:
* Eliminated a local variable.
* Note: Substituted sizeof(uint32_t)*4 by 32UL, using unsigned long type
   to keep optimal offsetting in generated assembler output.
* Removed unnecessary curly braces.
v2:
* Fixed checkpatch warning about line length.
---
  drivers/net/virtio/virtio_rxtx.c | 39 
  drivers/net/virtio/virtio_rxtx.h |  4 ++--
  2 files changed, 16 insertions(+), 27 deletions(-)

diff --git a/drivers/net/virtio/virtio_rxtx.c b/drivers/net/virtio/virtio_rxtx.c
index f69b9453a2..b67f063b31 100644
--- a/drivers/net/virtio/virtio_rxtx.c
+++ b/drivers/net/virtio/virtio_rxtx.c
@@ -82,37 +82,26 @@ vq_ring_free_chain(struct virtqueue *vq, uint16_t desc_idx)
  }
  
  void

-virtio_update_packet_stats(struct virtnet_stats *stats, struct rte_mbuf *mbuf)
+virtio_update_packet_stats(struct virtnet_stats *const stats,
+   const struct rte_mbuf *const mbuf)
  {
uint32_t s = mbuf->pkt_len;
-   struct rte_ether_addr *ea;
+   const struct rte_ether_addr *const ea =
+   rte_pktmbuf_mtod(mbuf, const struct rte_ether_addr *);
  
  	stats->bytes += s;
  
-	if (s == 64) {

-   stats->size_bins[1]++;
-   } else if (s > 64 && s < 1024) {
-   uint32_t bin;
-
-   /* count zeros, and offset into correct bin */
-   bin = (sizeof(s) * 8) - rte_clz32(s) - 5;
-   stats->size_bins[bin]++;
-   } else {
-   if (s < 64)
-   stats->size_bins[0]++;
-   else if (s < 1519)
-   stats->size_bins[6]++;
-   else
-   stats->size_bins[7]++;
-   }
+   if (s >= 1024)
+   stats->size_bins[6 + (s > 1518)]++;
+   else if (s <= 64)
+   stats->size_bins[s >> 6]++;
+   else
+   stats->size_bins[32UL - rte_clz32(s) - 5]++;
  
-	ea = rte_pktmbuf_mtod(mbuf, struct rte_ether_addr *);

-   if (rte_is_multicast_ether_addr(ea)) {
-   if (rte_is_broadcast_ether_addr(ea))
-   stats->broadcast++;
-   else
-   stats->multicast++;
-   }
+   RTE_BUILD_BUG_ON(offsetof(struct virtnet_stats, broadcast) !=
+   offsetof(struct virtnet_stats, multicast) + sizeof(uint64_t));
+   if (unlikely(rte_is_multicast_ether_addr(ea)))
+   (&stats->multicast)[rte_is_broadcast_ether_addr(ea)]++;
  }
  
  static inline void

diff --git a/drivers/net/virtio/virtio_rxtx.h b/drivers/net/virtio/virtio_rxtx.h
index afc4b74534..68034c914b 100644
--- a/drivers/net/virtio/virtio_rxtx.h
+++ b/drivers/net/virtio/virtio_rxtx.h
@@ -35,7 +35,7 @@ struct virtnet_tx {
  };
  
  int virtio_rxq_vec_setup(struct virtnet_rx *rxvq);

-void virtio_update_packet_stats(struct virtnet_stats *stats,
-   struct rte_mbuf *mbuf);
+void virtio_update_packet_stats(struct virtnet_stats *const stats,
+   const struct rte_mbuf *const mbuf);
  
  #endif /* _VIRTIO_RXTX_H_ */
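
As a cross-check of the size-bin rewrite in this patch, the sketch below transcribes the removed and added bin selection into two standalone functions and compares them. It assumes a GCC or Clang compiler: `__builtin_clz` stands in for `rte_clz32`, and the function names are hypothetical.

```c
#include <stdint.h>

/* Bin selection as in the removed lines of the patch. */
static unsigned int
demo_bin_old(uint32_t s)
{
	if (s == 64)
		return 1;
	if (s > 64 && s < 1024)
		return 32 - (unsigned int)__builtin_clz(s) - 5;
	if (s < 64)
		return 0;
	if (s < 1519)
		return 6;
	return 7;
}

/* Bin selection as in the added lines of the patch. */
static unsigned int
demo_bin_new(uint32_t s)
{
	if (s >= 1024)
		return 6 + (s > 1518);
	if (s <= 64)
		return s >> 6; /* 0 for s < 64, 1 for s == 64 */
	/* 65..1023 only, so the clz argument is never zero */
	return 32 - (unsigned int)__builtin_clz(s) - 5;
}

/* Return 1 if both versions agree for every size in [0, limit]. */
static int
demo_bins_agree(uint32_t limit)
{
	uint64_t s;

	for (s = 0; s <= limit; s++)
		if (demo_bin_old((uint32_t)s) != demo_bin_new((uint32_t)s))
			return 0;
	return 1;
}
```

Both versions place 65..127 in bin 2, 128..255 in bin 3, 256..511 in bin 4 and 512..1023 in bin 5; the rewrite only reorders the comparisons.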


[PATCH v4 00/11] support software live migration

2024-08-04 Thread Chaoyong He
This patch series aims to add the support of software live migration
feature for NFP vDPA device.

---
v4:
* Rebase to the newest main branch.
* Add the 'Reviewed-by' tag of the external reviewer.
v3:
* Fix one compile error when using standard atomic.
v2:
* Fix some spelling in the commit messages.
* Split out a commit to enable this feature.
---

Xinying Yu (11):
  mailmap: add new contributor
  vdpa/nfp: fix logic in hardware init
  vdpa/nfp: fix the logic of reconfiguration
  vdpa/nfp: refactor the logic of datapath update
  vdpa/nfp: add the live migration logic
  vdpa/nfp: add the interrupt logic of vring relay
  vdpa/nfp: setup the VF configure
  vdpa/nfp: recover the ring index on new host
  vdpa/nfp: setup vring relay thread
  vdpa/nfp: enable feature bits of live migration
  doc: update nfp document

 .mailmap |   1 +
 doc/guides/vdpadevs/nfp.rst  |   9 +
 drivers/common/nfp/nfp_common_ctrl.h |  11 +-
 drivers/vdpa/nfp/nfp_vdpa.c  | 441 +--
 drivers/vdpa/nfp/nfp_vdpa_core.c | 135 ++--
 drivers/vdpa/nfp/nfp_vdpa_core.h |  14 +
 6 files changed, 565 insertions(+), 46 deletions(-)

-- 
2.39.1



[PATCH v4 01/11] mailmap: add new contributor

2024-08-04 Thread Chaoyong He
From: Xinying Yu 

Add new contributor.

Signed-off-by: Xinying Yu 
Reviewed-by: Chaoyong He 
Reviewed-by: Long Wu 
Reviewed-by: Peng Zhang 
---
 .mailmap | 1 +
 1 file changed, 1 insertion(+)

diff --git a/.mailmap b/.mailmap
index 4a508bafad..dc3a202a41 100644
--- a/.mailmap
+++ b/.mailmap
@@ -1645,6 +1645,7 @@ Xieming Katty 
 Xinfeng Zhao 
 Xingguang He 
 Xingyou Chen 
+Xinying Yu 
 Xin Long 
 Xi Zhang 
 Xuan Ding 
-- 
2.39.1



[PATCH v4 02/11] vdpa/nfp: fix logic in hardware init

2024-08-04 Thread Chaoyong He
From: Xinying Yu 

Reconfiguring the NIC fails because the queue configuration
pointer is never initialized.
Fix this by adding the missing initialization logic.

Fixes: d89f4990c14e ("vdpa/nfp: add hardware init")
Cc: chaoyong...@corigine.com
Cc: sta...@dpdk.org

Signed-off-by: Xinying Yu 
Reviewed-by: Chaoyong He 
Reviewed-by: Long Wu 
Reviewed-by: Peng Zhang 
Reviewed-by: Maxime Coquelin 
---
 drivers/vdpa/nfp/nfp_vdpa_core.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/drivers/vdpa/nfp/nfp_vdpa_core.c b/drivers/vdpa/nfp/nfp_vdpa_core.c
index 7b877605e4..291798196c 100644
--- a/drivers/vdpa/nfp/nfp_vdpa_core.c
+++ b/drivers/vdpa/nfp/nfp_vdpa_core.c
@@ -55,7 +55,10 @@ nfp_vdpa_hw_init(struct nfp_vdpa_hw *vdpa_hw,
struct rte_pci_device *pci_dev)
 {
uint32_t queue;
+   uint8_t *tx_bar;
+   uint32_t start_q;
struct nfp_hw *hw;
+   uint32_t tx_bar_off;
uint8_t *notify_base;
 
hw = &vdpa_hw->super;
@@ -82,6 +85,12 @@ nfp_vdpa_hw_init(struct nfp_vdpa_hw *vdpa_hw,
idx + 1, vdpa_hw->notify_addr[idx + 1]);
}
 
+   /* NFP vDPA cfg queue setup */
+   start_q = nn_cfg_readl(hw, NFP_NET_CFG_START_TXQ);
+   tx_bar_off = start_q * NFP_QCP_QUEUE_ADDR_SZ;
+   tx_bar = (uint8_t *)pci_dev->mem_resource[2].addr + tx_bar_off;
+   hw->qcp_cfg = tx_bar + NFP_QCP_QUEUE_ADDR_SZ;
+
vdpa_hw->features = (1ULL << VIRTIO_F_VERSION_1) |
(1ULL << VIRTIO_F_IN_ORDER) |
(1ULL << VHOST_USER_F_PROTOCOL_FEATURES);
-- 
2.39.1
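
The address arithmetic added by this patch can be sketched without the driver. In the sketch, `DEMO_QCP_QUEUE_ADDR_SZ` is an assumed per-queue stride chosen for illustration, `bar_base` stands in for the mapped BAR address, and `demo_cfg_queue_ptr` is a hypothetical name. The offset is computed in `size_t` here, whereas the patch uses `uint32_t`.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_QCP_QUEUE_ADDR_SZ 0x800u /* assumed stride, illustration only */

/*
 * Mirror of the two steps in the patch:
 *   tx_bar  = bar_base + start_q * stride
 *   qcp_cfg = tx_bar + stride   (the queue following start_q)
 */
static uint8_t *
demo_cfg_queue_ptr(uint8_t *bar_base, uint32_t start_q)
{
	size_t tx_bar_off = (size_t)start_q * DEMO_QCP_QUEUE_ADDR_SZ;
	uint8_t *tx_bar = bar_base + tx_bar_off;

	return tx_bar + DEMO_QCP_QUEUE_ADDR_SZ;
}
```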



[PATCH v4 03/11] vdpa/nfp: fix the logic of reconfiguration

2024-08-04 Thread Chaoyong He
From: Xinying Yu 

The control bits of vDPA are located in the extended control word,
so the code should use 'nfp_ext_reconfig()' rather than 'nfp_reconfig()'.

Also replace the misused 'NFP_NET_CFG_CTRL_SCATTER' macro
with 'NFP_NET_CFG_CTRL_VIRTIO'.

Fixes: b47a0373903f ("vdpa/nfp: add datapath update")
Cc: chaoyong...@corigine.com
Cc: sta...@dpdk.org

Signed-off-by: Xinying Yu 
Reviewed-by: Chaoyong He 
Reviewed-by: Long Wu 
Reviewed-by: Peng Zhang 
Reviewed-by: Maxime Coquelin 
---
 drivers/common/nfp/nfp_common_ctrl.h |  1 +
 drivers/vdpa/nfp/nfp_vdpa_core.c | 16 
 2 files changed, 13 insertions(+), 4 deletions(-)

diff --git a/drivers/common/nfp/nfp_common_ctrl.h b/drivers/common/nfp/nfp_common_ctrl.h
index 69596dd6f5..1b30f81fdb 100644
--- a/drivers/common/nfp/nfp_common_ctrl.h
+++ b/drivers/common/nfp/nfp_common_ctrl.h
@@ -205,6 +205,7 @@ struct nfp_net_fw_ver {
 #define NFP_NET_CFG_CTRL_IPSEC_LM_LOOKUP  (0x1 << 4) /**< SA long match lookup */
 #define NFP_NET_CFG_CTRL_MULTI_PF (0x1 << 5)
 #define NFP_NET_CFG_CTRL_FLOW_STEER   (0x1 << 8) /**< Flow Steering */
+#define NFP_NET_CFG_CTRL_VIRTIO   (0x1 << 10) /**< Virtio offload */
 #define NFP_NET_CFG_CTRL_IN_ORDER (0x1 << 11) /**< Virtio in-order flag */
 #define NFP_NET_CFG_CTRL_USO  (0x1 << 16) /**< UDP segmentation offload */
 
diff --git a/drivers/vdpa/nfp/nfp_vdpa_core.c b/drivers/vdpa/nfp/nfp_vdpa_core.c
index 291798196c..6d07356581 100644
--- a/drivers/vdpa/nfp/nfp_vdpa_core.c
+++ b/drivers/vdpa/nfp/nfp_vdpa_core.c
@@ -101,7 +101,7 @@ nfp_vdpa_hw_init(struct nfp_vdpa_hw *vdpa_hw,
 static uint32_t
 nfp_vdpa_check_offloads(void)
 {
-   return NFP_NET_CFG_CTRL_SCATTER |
+   return NFP_NET_CFG_CTRL_VIRTIO  |
NFP_NET_CFG_CTRL_IN_ORDER;
 }
 
@@ -112,6 +112,7 @@ nfp_vdpa_hw_start(struct nfp_vdpa_hw *vdpa_hw,
int ret;
uint32_t update;
uint32_t new_ctrl;
+   uint32_t new_ext_ctrl;
struct timespec wait_tst;
struct nfp_hw *hw = &vdpa_hw->super;
uint8_t mac_addr[RTE_ETHER_ADDR_LEN];
@@ -131,8 +132,6 @@ nfp_vdpa_hw_start(struct nfp_vdpa_hw *vdpa_hw,
nfp_disable_queues(hw);
nfp_enable_queues(hw, NFP_VDPA_MAX_QUEUES, NFP_VDPA_MAX_QUEUES);
 
-   new_ctrl = nfp_vdpa_check_offloads();
-
nn_cfg_writel(hw, NFP_NET_CFG_MTU, 9216);
nn_cfg_writel(hw, NFP_NET_CFG_FLBUFSZ, 10240);
 
@@ -147,8 +146,17 @@ nfp_vdpa_hw_start(struct nfp_vdpa_hw *vdpa_hw,
/* Writing new MAC to the specific port BAR address */
nfp_write_mac(hw, (uint8_t *)mac_addr);
 
+   new_ext_ctrl = nfp_vdpa_check_offloads();
+
+   update = NFP_NET_CFG_UPDATE_GEN;
+   ret = nfp_ext_reconfig(hw, new_ext_ctrl, update);
+   if (ret != 0)
+   return -EIO;
+
+   hw->ctrl_ext = new_ext_ctrl;
+
/* Enable device */
-   new_ctrl |= NFP_NET_CFG_CTRL_ENABLE;
+   new_ctrl = NFP_NET_CFG_CTRL_ENABLE;
 
/* Signal the NIC about the change */
update = NFP_NET_CFG_UPDATE_MACADDR |
-- 
2.39.1
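
The ordering introduced by this patch (program the extended control word first, then enable the device through the base control word) can be sketched as follows. Everything prefixed `demo_` or `DEMO_` is hypothetical: the reconfig function is a stub that always succeeds, and the value of the enable bit is an assumption. Only the bit positions 10 and 11 are taken from the diff.

```c
#include <errno.h>
#include <stdint.h>

#define DEMO_CTRL_ENABLE        (0x1u << 0)  /* assumed bit position */
#define DEMO_CTRL_EXT_VIRTIO    (0x1u << 10) /* from the diff */
#define DEMO_CTRL_EXT_IN_ORDER  (0x1u << 11) /* from the diff */

struct demo_hw {
	uint32_t ctrl;     /* base control word */
	uint32_t ctrl_ext; /* extended control word */
};

static uint32_t
demo_check_offloads(void)
{
	return DEMO_CTRL_EXT_VIRTIO | DEMO_CTRL_EXT_IN_ORDER;
}

/* Stub standing in for the extended reconfiguration call. */
static int
demo_ext_reconfig(struct demo_hw *hw, uint32_t ctrl_ext)
{
	(void)hw;
	(void)ctrl_ext;
	return 0;
}

static int
demo_start(struct demo_hw *hw)
{
	uint32_t new_ext_ctrl = demo_check_offloads();

	/* Step 1: virtio bits go to the extended word. */
	if (demo_ext_reconfig(hw, new_ext_ctrl) != 0)
		return -EIO;
	hw->ctrl_ext = new_ext_ctrl;

	/* Step 2: the base word now carries only the enable bit. */
	hw->ctrl = DEMO_CTRL_ENABLE;
	return 0;
}
```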



[PATCH v4 04/11] vdpa/nfp: refactor the logic of datapath update

2024-08-04 Thread Chaoyong He
From: Xinying Yu 

In order to add the new configuration logic of software live
migration, split the datapath update logic into two parts,
queue configuration and VF configuration.

Signed-off-by: Xinying Yu 
Reviewed-by: Chaoyong He 
Reviewed-by: Long Wu 
Reviewed-by: Peng Zhang 
Reviewed-by: Maxime Coquelin 
---
 drivers/vdpa/nfp/nfp_vdpa_core.c | 54 +---
 1 file changed, 36 insertions(+), 18 deletions(-)

diff --git a/drivers/vdpa/nfp/nfp_vdpa_core.c b/drivers/vdpa/nfp/nfp_vdpa_core.c
index 6d07356581..79ecd2b4fc 100644
--- a/drivers/vdpa/nfp/nfp_vdpa_core.c
+++ b/drivers/vdpa/nfp/nfp_vdpa_core.c
@@ -105,8 +105,8 @@ nfp_vdpa_check_offloads(void)
NFP_NET_CFG_CTRL_IN_ORDER;
 }
 
-int
-nfp_vdpa_hw_start(struct nfp_vdpa_hw *vdpa_hw,
+static int
+nfp_vdpa_vf_config(struct nfp_hw *hw,
int vid)
 {
int ret;
@@ -114,24 +114,8 @@ nfp_vdpa_hw_start(struct nfp_vdpa_hw *vdpa_hw,
uint32_t new_ctrl;
uint32_t new_ext_ctrl;
struct timespec wait_tst;
-   struct nfp_hw *hw = &vdpa_hw->super;
uint8_t mac_addr[RTE_ETHER_ADDR_LEN];
 
-   nn_cfg_writeq(hw, NFP_NET_CFG_TXR_ADDR(0), vdpa_hw->vring[1].desc);
-   nn_cfg_writeb(hw, NFP_NET_CFG_TXR_SZ(0), rte_log2_u32(vdpa_hw->vring[1].size));
-   nn_cfg_writeq(hw, NFP_NET_CFG_TXR_ADDR(1), vdpa_hw->vring[1].avail);
-   nn_cfg_writeq(hw, NFP_NET_CFG_TXR_ADDR(2), vdpa_hw->vring[1].used);
-
-   nn_cfg_writeq(hw, NFP_NET_CFG_RXR_ADDR(0), vdpa_hw->vring[0].desc);
-   nn_cfg_writeb(hw, NFP_NET_CFG_RXR_SZ(0), rte_log2_u32(vdpa_hw->vring[0].size));
-   nn_cfg_writeq(hw, NFP_NET_CFG_RXR_ADDR(1), vdpa_hw->vring[0].avail);
-   nn_cfg_writeq(hw, NFP_NET_CFG_RXR_ADDR(2), vdpa_hw->vring[0].used);
-
-   rte_wmb();
-
-   nfp_disable_queues(hw);
-   nfp_enable_queues(hw, NFP_VDPA_MAX_QUEUES, NFP_VDPA_MAX_QUEUES);
-
nn_cfg_writel(hw, NFP_NET_CFG_MTU, 9216);
nn_cfg_writel(hw, NFP_NET_CFG_FLBUFSZ, 10240);
 
@@ -177,6 +161,40 @@ nfp_vdpa_hw_start(struct nfp_vdpa_hw *vdpa_hw,
return 0;
 }
 
+static void
+nfp_vdpa_queue_config(struct nfp_vdpa_hw *vdpa_hw)
+{
+   struct nfp_hw *hw = &vdpa_hw->super;
+
+   nn_cfg_writeq(hw, NFP_NET_CFG_TXR_ADDR(0), vdpa_hw->vring[1].desc);
+   nn_cfg_writeb(hw, NFP_NET_CFG_TXR_SZ(0),
+   rte_log2_u32(vdpa_hw->vring[1].size));
+   nn_cfg_writeq(hw, NFP_NET_CFG_TXR_ADDR(1), vdpa_hw->vring[1].avail);
+   nn_cfg_writeq(hw, NFP_NET_CFG_TXR_ADDR(2), vdpa_hw->vring[1].used);
+
+   nn_cfg_writeq(hw, NFP_NET_CFG_RXR_ADDR(0), vdpa_hw->vring[0].desc);
+   nn_cfg_writeb(hw, NFP_NET_CFG_RXR_SZ(0),
+   rte_log2_u32(vdpa_hw->vring[0].size));
+   nn_cfg_writeq(hw, NFP_NET_CFG_RXR_ADDR(1), vdpa_hw->vring[0].avail);
+   nn_cfg_writeq(hw, NFP_NET_CFG_RXR_ADDR(2), vdpa_hw->vring[0].used);
+
+   rte_wmb();
+}
+
+int
+nfp_vdpa_hw_start(struct nfp_vdpa_hw *vdpa_hw,
+   int vid)
+{
+   struct nfp_hw *hw = &vdpa_hw->super;
+
+   nfp_vdpa_queue_config(vdpa_hw);
+
+   nfp_disable_queues(hw);
+   nfp_enable_queues(hw, NFP_VDPA_MAX_QUEUES, NFP_VDPA_MAX_QUEUES);
+
+   return nfp_vdpa_vf_config(hw, vid);
+}
+
 void
 nfp_vdpa_hw_stop(struct nfp_vdpa_hw *vdpa_hw)
 {
-- 
2.39.1



[PATCH v4 05/11] vdpa/nfp: add the live migration logic

2024-08-04 Thread Chaoyong He
From: Xinying Yu 

Add the basic logic of software live migration.

Unset the ring notify area to stop the direct I/O datapath if the
device supports it, then set up the vring relay to help with
live migration.

Signed-off-by: Xinying Yu 
Reviewed-by: Chaoyong He 
Reviewed-by: Long Wu 
Reviewed-by: Peng Zhang 
Reviewed-by: Maxime Coquelin 
---
 drivers/vdpa/nfp/nfp_vdpa.c  | 66 +++-
 drivers/vdpa/nfp/nfp_vdpa_core.c |  2 +
 drivers/vdpa/nfp/nfp_vdpa_core.h |  4 ++
 3 files changed, 70 insertions(+), 2 deletions(-)

diff --git a/drivers/vdpa/nfp/nfp_vdpa.c b/drivers/vdpa/nfp/nfp_vdpa.c
index cef80b5476..45092cb0af 100644
--- a/drivers/vdpa/nfp/nfp_vdpa.c
+++ b/drivers/vdpa/nfp/nfp_vdpa.c
@@ -603,6 +603,30 @@ update_datapath(struct nfp_vdpa_dev *device)
return ret;
 }
 
+static int
+nfp_vdpa_sw_fallback(struct nfp_vdpa_dev *device)
+{
+   int ret;
+   int vid = device->vid;
+
+   /* Stop the direct IO data path */
+   nfp_vdpa_unset_notify_relay(device);
+   nfp_vdpa_disable_vfio_intr(device);
+
+   ret = rte_vhost_host_notifier_ctrl(vid, RTE_VHOST_QUEUE_ALL, false);
+   if ((ret != 0) && (ret != -ENOTSUP)) {
+   DRV_VDPA_LOG(ERR, "Unset the host notifier failed.");
+   goto error;
+   }
+
+   device->hw.sw_fallback_running = true;
+
+   return 0;
+
+error:
+   return ret;
+}
+
 static int
 nfp_vdpa_dev_config(int vid)
 {
@@ -646,8 +670,18 @@ nfp_vdpa_dev_close(int vid)
}
 
device = node->device;
-   rte_atomic_store_explicit(&device->dev_attached, 0, rte_memory_order_relaxed);
-   update_datapath(device);
+   if (device->hw.sw_fallback_running) {
+   device->hw.sw_fallback_running = false;
+
+   rte_atomic_store_explicit(&device->dev_attached, 0,
+   rte_memory_order_relaxed);
+   rte_atomic_store_explicit(&device->running, 0,
+   rte_memory_order_relaxed);
+   } else {
+   rte_atomic_store_explicit(&device->dev_attached, 0,
+   rte_memory_order_relaxed);
+   update_datapath(device);
+   }
 
return 0;
 }
@@ -770,7 +804,35 @@ nfp_vdpa_get_protocol_features(struct rte_vdpa_device *vdev __rte_unused,
 static int
 nfp_vdpa_set_features(int32_t vid)
 {
+   int ret;
+   uint64_t features = 0;
+   struct nfp_vdpa_dev *device;
+   struct rte_vdpa_device *vdev;
+   struct nfp_vdpa_dev_node *node;
+
DRV_VDPA_LOG(DEBUG, "Start vid=%d", vid);
+
+   vdev = rte_vhost_get_vdpa_device(vid);
+   node = nfp_vdpa_find_node_by_vdev(vdev);
+   if (node == NULL) {
+   DRV_VDPA_LOG(ERR, "Invalid vDPA device: %p", vdev);
+   return -ENODEV;
+   }
+
+   rte_vhost_get_negotiated_features(vid, &features);
+
+   if (RTE_VHOST_NEED_LOG(features) == 0)
+   return 0;
+
+   device = node->device;
+   if (device->hw.sw_lm) {
+   ret = nfp_vdpa_sw_fallback(device);
+   if (ret != 0) {
+   DRV_VDPA_LOG(ERR, "Software fallback start failed");
+   return -1;
+   }
+   }
+
return 0;
 }
 
diff --git a/drivers/vdpa/nfp/nfp_vdpa_core.c b/drivers/vdpa/nfp/nfp_vdpa_core.c
index 79ecd2b4fc..50eda4cb2c 100644
--- a/drivers/vdpa/nfp/nfp_vdpa_core.c
+++ b/drivers/vdpa/nfp/nfp_vdpa_core.c
@@ -91,6 +91,8 @@ nfp_vdpa_hw_init(struct nfp_vdpa_hw *vdpa_hw,
tx_bar = (uint8_t *)pci_dev->mem_resource[2].addr + tx_bar_off;
hw->qcp_cfg = tx_bar + NFP_QCP_QUEUE_ADDR_SZ;
 
+   vdpa_hw->sw_lm = true;
+
vdpa_hw->features = (1ULL << VIRTIO_F_VERSION_1) |
(1ULL << VIRTIO_F_IN_ORDER) |
(1ULL << VHOST_USER_F_PROTOCOL_FEATURES);
diff --git a/drivers/vdpa/nfp/nfp_vdpa_core.h b/drivers/vdpa/nfp/nfp_vdpa_core.h
index a8e0d6dd70..0f880fc0c6 100644
--- a/drivers/vdpa/nfp/nfp_vdpa_core.h
+++ b/drivers/vdpa/nfp/nfp_vdpa_core.h
@@ -36,6 +36,10 @@ struct nfp_vdpa_hw {
uint8_t mac_addr[RTE_ETHER_ADDR_LEN];
uint8_t notify_region;
uint8_t nr_vring;
+
+   /** Software Live Migration */
+   bool sw_lm;
+   bool sw_fallback_running;
 };
 
 int nfp_vdpa_hw_init(struct nfp_vdpa_hw *vdpa_hw, struct rte_pci_device *dev);
-- 
2.39.1
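
The decision made in nfp_vdpa_set_features() in this patch can be reduced to a small sketch: do nothing unless dirty-page logging was negotiated, and start the software fallback only when the device advertises software live migration. The names below are hypothetical, the bit position of the logging feature is an assumption, and setting the flag stands in for the real fallback routine.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_VHOST_F_LOG_ALL 26 /* assumed feature bit position */

struct demo_hw {
	bool sw_lm;               /* software live migration supported */
	bool sw_fallback_running; /* relay datapath currently active */
};

static bool
demo_need_log(uint64_t features)
{
	return (features & (1ULL << DEMO_VHOST_F_LOG_ALL)) != 0;
}

static int
demo_set_features(struct demo_hw *hw, uint64_t features)
{
	/* Logging not negotiated: keep the direct datapath. */
	if (!demo_need_log(features))
		return 0;

	/* Stand-in for the fallback: stop direct I/O, mark relay active. */
	if (hw->sw_lm)
		hw->sw_fallback_running = true;

	return 0;
}
```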



[PATCH v4 06/11] vdpa/nfp: add the interrupt logic of vring relay

2024-08-04 Thread Chaoyong He
From: Xinying Yu 

Add the interrupt setup logic of vring relay.

The epoll fd is provided here so that the host can receive interrupts from
the device in the Rx direction; all other vring relay operations build on this.

Signed-off-by: Xinying Yu 
Reviewed-by: Chaoyong He 
Reviewed-by: Long Wu 
Reviewed-by: Peng Zhang 
Reviewed-by: Maxime Coquelin 
---
 drivers/vdpa/nfp/nfp_vdpa.c | 24 ++--
 1 file changed, 22 insertions(+), 2 deletions(-)

diff --git a/drivers/vdpa/nfp/nfp_vdpa.c b/drivers/vdpa/nfp/nfp_vdpa.c
index 45092cb0af..1643ebbb8c 100644
--- a/drivers/vdpa/nfp/nfp_vdpa.c
+++ b/drivers/vdpa/nfp/nfp_vdpa.c
@@ -336,8 +336,10 @@ nfp_vdpa_stop(struct nfp_vdpa_dev *device)
 }
 
 static int
-nfp_vdpa_enable_vfio_intr(struct nfp_vdpa_dev *device)
+nfp_vdpa_enable_vfio_intr(struct nfp_vdpa_dev *device,
+   bool relay)
 {
+   int fd;
int ret;
uint16_t i;
int *fd_ptr;
@@ -366,6 +368,19 @@ nfp_vdpa_enable_vfio_intr(struct nfp_vdpa_dev *device)
fd_ptr[RTE_INTR_VEC_RXTX_OFFSET + i] = vring.callfd;
}
 
+   if (relay) {
+   for (i = 0; i < nr_vring; i += 2) {
+   fd = eventfd(0, EFD_NONBLOCK | EFD_CLOEXEC);
+   if (fd < 0) {
+   DRV_VDPA_LOG(ERR, "Can't setup eventfd");
+   return -EINVAL;
+   }
+
+   device->intr_fd[i] = fd;
+   fd_ptr[RTE_INTR_VEC_RXTX_OFFSET + i] = fd;
+   }
+   }
+
ret = ioctl(device->vfio_dev_fd, VFIO_DEVICE_SET_IRQS, irq_set);
if (ret != 0) {
DRV_VDPA_LOG(ERR, "Error enabling MSI-X interrupts.");
@@ -556,7 +571,7 @@ update_datapath(struct nfp_vdpa_dev *device)
if (ret != 0)
goto unlock_exit;
 
-   ret = nfp_vdpa_enable_vfio_intr(device);
+   ret = nfp_vdpa_enable_vfio_intr(device, false);
if (ret != 0)
goto dma_map_rollback;
 
@@ -619,6 +634,11 @@ nfp_vdpa_sw_fallback(struct nfp_vdpa_dev *device)
goto error;
}
 
+   /* Setup interrupt for vring relay */
+   ret = nfp_vdpa_enable_vfio_intr(device, true);
+   if (ret != 0)
+   goto error;
+
device->hw.sw_fallback_running = true;
 
return 0;
-- 
2.39.1
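
For readers unfamiliar with eventfd, the sketch below creates one non-blocking eventfd per even-numbered vring, using the same flags and loop stride as the patch. It is Linux-specific. `demo_open_relay_fds` is a hypothetical name, and unlike the patch this sketch also closes the descriptors it already opened if a later call fails.

```c
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/*
 * Fill fds[0], fds[2], ... with fresh eventfds; odd slots are untouched.
 * Returns 0 on success, -1 on failure (with earlier fds closed again).
 */
static int
demo_open_relay_fds(int *fds, uint16_t nr_vring)
{
	uint16_t i;

	for (i = 0; i < nr_vring; i += 2) {
		int fd = eventfd(0, EFD_NONBLOCK | EFD_CLOEXEC);

		if (fd < 0) {
			/* Roll back indices i-2, i-4, ..., 0. */
			while (i >= 2) {
				i -= 2;
				close(fds[i]);
				fds[i] = -1;
			}
			return -1;
		}
		fds[i] = fd;
	}
	return 0;
}
```

An eventfd holds a 64-bit counter: a write of 8 bytes adds to it, and a read of 8 bytes returns the accumulated value and resets it to zero, which is how a signalled interrupt is consumed by the reader.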



[PATCH v4 07/11] vdpa/nfp: setup the VF configure

2024-08-04 Thread Chaoyong He
From: Xinying Yu 

Create the relay vring on the host and then write the address of the Rx
used ring to the VF config BAR, so the device can DMA the
used ring information to the host rather than directly to the VM.

Use 'NFP_NET_CFG_CTRL_LM_RELAY' to notify the device side, and
enable the MSI-X interrupt on the device.

The Tx ring address does not need to change, since the relay vring
only assists the Rx ring with dirty page logging.

Signed-off-by: Xinying Yu 
Reviewed-by: Chaoyong He 
Reviewed-by: Long Wu 
Reviewed-by: Peng Zhang 
Reviewed-by: Maxime Coquelin 
---
 drivers/common/nfp/nfp_common_ctrl.h |   3 +
 drivers/vdpa/nfp/nfp_vdpa.c  | 203 ---
 drivers/vdpa/nfp/nfp_vdpa_core.c |  55 ++--
 drivers/vdpa/nfp/nfp_vdpa_core.h |   8 ++
 4 files changed, 239 insertions(+), 30 deletions(-)

diff --git a/drivers/common/nfp/nfp_common_ctrl.h b/drivers/common/nfp/nfp_common_ctrl.h
index 1b30f81fdb..8a760ddb4b 100644
--- a/drivers/common/nfp/nfp_common_ctrl.h
+++ b/drivers/common/nfp/nfp_common_ctrl.h
@@ -207,6 +207,9 @@ struct nfp_net_fw_ver {
 #define NFP_NET_CFG_CTRL_FLOW_STEER   (0x1 << 8) /**< Flow Steering */
 #define NFP_NET_CFG_CTRL_VIRTIO   (0x1 << 10) /**< Virtio offload */
 #define NFP_NET_CFG_CTRL_IN_ORDER (0x1 << 11) /**< Virtio in-order flag */
+#define NFP_NET_CFG_CTRL_LM_RELAY (0x1 << 12) /**< Virtio live migration relay start */
+#define NFP_NET_CFG_CTRL_NOTIFY_DATA  (0x1 << 13) /**< Virtio notification data flag */
+#define NFP_NET_CFG_CTRL_SWLM (0x1 << 14) /**< Virtio SW live migration enable */
 #define NFP_NET_CFG_CTRL_USO  (0x1 << 16) /**< UDP segmentation offload */
 
 #define NFP_NET_CFG_CAP_WORD1   0x00a4
diff --git a/drivers/vdpa/nfp/nfp_vdpa.c b/drivers/vdpa/nfp/nfp_vdpa.c
index 1643ebbb8c..983123ba08 100644
--- a/drivers/vdpa/nfp/nfp_vdpa.c
+++ b/drivers/vdpa/nfp/nfp_vdpa.c
@@ -11,6 +11,8 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 #include 
 
 #include "nfp_vdpa_core.h"
@@ -21,6 +23,9 @@
 #define MSIX_IRQ_SET_BUF_LEN (sizeof(struct vfio_irq_set) + \
sizeof(int) * (NFP_VDPA_MAX_QUEUES * 2 + 1))
 
+#define NFP_VDPA_USED_RING_LEN(size) \
+   ((size) * sizeof(struct vring_used_elem) + sizeof(struct vring_used))
+
 struct nfp_vdpa_dev {
struct rte_pci_device *pci_dev;
struct rte_vdpa_device *vdev;
@@ -261,15 +266,85 @@ nfp_vdpa_qva_to_gpa(int vid,
return gpa;
 }
 
+static void
+nfp_vdpa_relay_vring_free(struct nfp_vdpa_dev *device,
+   uint16_t vring_index)
+{
+   uint16_t i;
+   uint64_t size;
+   struct rte_vhost_vring vring;
+   uint64_t m_vring_iova = NFP_VDPA_RELAY_VRING;
+
+   for (i = 0; i < vring_index; i++) {
+   rte_vhost_get_vhost_vring(device->vid, i, &vring);
+
+   size = RTE_ALIGN_CEIL(vring_size(vring.size, rte_mem_page_size()),
+   rte_mem_page_size());
+   rte_vfio_container_dma_unmap(device->vfio_container_fd,
+   (uint64_t)(uintptr_t)device->hw.m_vring[i].desc,
+   m_vring_iova, size);
+
+   rte_free(device->hw.m_vring[i].desc);
+   m_vring_iova += size;
+   }
+}
+
 static int
-nfp_vdpa_start(struct nfp_vdpa_dev *device)
+nfp_vdpa_relay_vring_alloc(struct nfp_vdpa_dev *device)
+{
+   int ret;
+   uint16_t i;
+   uint64_t size;
+   void *vring_buf;
+   uint64_t page_size;
+   struct rte_vhost_vring vring;
+   struct nfp_vdpa_hw *vdpa_hw = &device->hw;
+   uint64_t m_vring_iova = NFP_VDPA_RELAY_VRING;
+
+   page_size = rte_mem_page_size();
+
+   for (i = 0; i < vdpa_hw->nr_vring; i++) {
+   rte_vhost_get_vhost_vring(device->vid, i, &vring);
+
+   size = RTE_ALIGN_CEIL(vring_size(vring.size, page_size), page_size);
+   vring_buf = rte_zmalloc("nfp_vdpa_relay", size, page_size);
+   if (vring_buf == NULL)
+   goto vring_free_all;
+
+   vring_init(&vdpa_hw->m_vring[i], vring.size, vring_buf, page_size);
+
+   ret = rte_vfio_container_dma_map(device->vfio_container_fd,
+   (uint64_t)(uintptr_t)vring_buf, m_vring_iova, size);
+   if (ret != 0) {
+   DRV_VDPA_LOG(ERR, "vDPA vring relay dma map failed.");
+   goto vring_free_one;
+   }
+
+   m_vring_iova += size;
+   }
+
+   return 0;
+
+vring_free_one:
+   rte_free(device->hw.m_vring[i].desc);
+vring_free_all:
+   nfp_vdpa_relay_vring_free(device, i);
+
+   return -ENOSPC;
+}
+
+static int
+nfp_vdpa_start(struct nfp_vdpa_dev *device,
+   bool relay)
 {
int ret;
int vid;
uint16_t i;
uint64_t gpa;
+   uint16_t size;
struct rte_vhost_vring vring;
struct nfp_vdpa_hw *vdpa_hw = &device->

[PATCH v4 08/11] vdpa/nfp: recover the ring index on new host

2024-08-04 Thread Chaoyong He
From: Xinying Yu 

After migrating to a new host, the driver writes the last used index
of each ring to offsets 'NFP_NET_CFG_TX_USED_INDEX' and
'NFP_NET_CFG_RX_USED_INDEX' so that the firmware can recover the
vring information.

Signed-off-by: Xinying Yu 
Reviewed-by: Chaoyong He 
Reviewed-by: Long Wu 
Reviewed-by: Peng Zhang 
Reviewed-by: Maxime Coquelin 
---
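Note on the register usage: nfp_vdpa_hw_queue_init() writes the Rx index from vring[0] and the Tx index from vring[1] into two adjacent 32-bit slots of the config BAR. The sketch below models this with a plain byte array. It is an illustration under assumptions: the sketch_* names and SKETCH_BAR_SIZE are hypothetical, the indexes are assumed to be 16-bit, and the stand-in writer skips the byte-order handling that a real BAR accessor would need.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Offsets taken from the patch. */
#define NFP_NET_CFG_TX_USED_INDEX 0x00b0
#define NFP_NET_CFG_RX_USED_INDEX 0x00b4

#define SKETCH_BAR_SIZE 0x100 /* hypothetical size of the stand-in BAR */

/* Stand-in for a 32-bit BAR write: store 'val' at byte offset 'off'. */
static void
sketch_cfg_writel(uint8_t *bar, uint32_t off, uint32_t val)
{
	assert(off + sizeof(val) <= SKETCH_BAR_SIZE);
	memcpy(bar + off, &val, sizeof(val));
}

/* Stand-in for a 32-bit BAR read from byte offset 'off'. */
static uint32_t
sketch_cfg_readl(const uint8_t *bar, uint32_t off)
{
	uint32_t val;

	assert(off + sizeof(val) <= SKETCH_BAR_SIZE);
	memcpy(&val, bar + off, sizeof(val));

	return val;
}

/* Ring 0 is Rx and ring 1 is Tx, matching the indexing in the patch. */
static void
sketch_restore_used_idx(uint8_t *bar, uint16_t rx_last_used,
		uint16_t tx_last_used)
{
	sketch_cfg_writel(bar, NFP_NET_CFG_TX_USED_INDEX, tx_last_used);
	sketch_cfg_writel(bar, NFP_NET_CFG_RX_USED_INDEX, rx_last_used);
}
```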
 drivers/common/nfp/nfp_common_ctrl.h |  7 +--
 drivers/vdpa/nfp/nfp_vdpa_core.c | 13 +
 2 files changed, 18 insertions(+), 2 deletions(-)

diff --git a/drivers/common/nfp/nfp_common_ctrl.h b/drivers/common/nfp/nfp_common_ctrl.h
index 8a760ddb4b..b64bb1dd2d 100644
--- a/drivers/common/nfp/nfp_common_ctrl.h
+++ b/drivers/common/nfp/nfp_common_ctrl.h
@@ -214,8 +214,11 @@ struct nfp_net_fw_ver {
 
 #define NFP_NET_CFG_CAP_WORD1   0x00a4
 
-/* 16B reserved for future use (0x00b0 - 0x00c0). */
-#define NFP_NET_CFG_RESERVED0x00b0
+#define NFP_NET_CFG_TX_USED_INDEX   0x00b0
+#define NFP_NET_CFG_RX_USED_INDEX   0x00b4
+
+/* 16B reserved for future use (0x00b8 - 0x0010). */
+#define NFP_NET_CFG_RESERVED0x00b8
 #define NFP_NET_CFG_RESERVED_SZ 0x0010
 
 /*
diff --git a/drivers/vdpa/nfp/nfp_vdpa_core.c b/drivers/vdpa/nfp/nfp_vdpa_core.c
index 2b609dddc2..d7c48e2490 100644
--- a/drivers/vdpa/nfp/nfp_vdpa_core.c
+++ b/drivers/vdpa/nfp/nfp_vdpa_core.c
@@ -100,6 +100,16 @@ nfp_vdpa_hw_init(struct nfp_vdpa_hw *vdpa_hw,
return 0;
 }
 
+static void
+nfp_vdpa_hw_queue_init(struct nfp_vdpa_hw *vdpa_hw)
+{
+   /* Distribute ring information to firmware */
+   nn_cfg_writel(&vdpa_hw->super, NFP_NET_CFG_TX_USED_INDEX,
+   vdpa_hw->vring[1].last_used_idx);
+   nn_cfg_writel(&vdpa_hw->super, NFP_NET_CFG_RX_USED_INDEX,
+   vdpa_hw->vring[0].last_used_idx);
+}
+
 static uint32_t
 nfp_vdpa_check_offloads(void)
 {
@@ -198,6 +208,9 @@ nfp_vdpa_queue_config(struct nfp_vdpa_hw *vdpa_hw,
 
nn_cfg_writeq(hw, NFP_NET_CFG_RXR_ADDR(2), vdpa_hw->vring[0].used);
 
+   if (!relay)
+   nfp_vdpa_hw_queue_init(vdpa_hw);
+
rte_wmb();
 }
 
-- 
2.39.1



[PATCH v4 10/11] vdpa/nfp: enable feature bits of live migration

2024-08-04 Thread Chaoyong He
From: Xinying Yu 

Add the 'VHOST_F_LOG_ALL' feature bit in order to enable the
live migration function.

Signed-off-by: Xinying Yu 
Reviewed-by: Chaoyong He 
Reviewed-by: Long Wu 
Reviewed-by: Peng Zhang 
Reviewed-by: Maxime Coquelin 
---
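Note on the feature mask: the patch advertises VHOST_F_LOG_ALL alongside the existing bits. The sketch below rebuilds the same mask and tests whether logging was negotiated. It is an illustration only; the sketch_* helpers are hypothetical, and the bit positions are defined locally (with the values used by the virtio and vhost headers) so that the block builds on its own.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit positions, defined locally to keep the sketch standalone. */
#define VHOST_F_LOG_ALL                 26
#define VHOST_USER_F_PROTOCOL_FEATURES  30
#define VIRTIO_F_VERSION_1              32
#define VIRTIO_F_IN_ORDER               35

/* Turn a feature bit position into its 64-bit mask. */
static uint64_t
sketch_feature_mask(unsigned int bit)
{
	assert(bit < 64);
	return 1ULL << bit;
}

/* The feature set advertised after this patch. */
static uint64_t
sketch_supported_features(void)
{
	return sketch_feature_mask(VIRTIO_F_VERSION_1) |
		sketch_feature_mask(VIRTIO_F_IN_ORDER) |
		sketch_feature_mask(VHOST_F_LOG_ALL) |
		sketch_feature_mask(VHOST_USER_F_PROTOCOL_FEATURES);
}

/* True when dirty page logging is part of the negotiated feature set. */
static bool
sketch_logging_negotiated(uint64_t negotiated)
{
	return (negotiated & sketch_feature_mask(VHOST_F_LOG_ALL)) != 0;
}
```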
 drivers/vdpa/nfp/nfp_vdpa_core.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/vdpa/nfp/nfp_vdpa_core.c b/drivers/vdpa/nfp/nfp_vdpa_core.c
index 3b3481a99c..70aeb4a3ac 100644
--- a/drivers/vdpa/nfp/nfp_vdpa_core.c
+++ b/drivers/vdpa/nfp/nfp_vdpa_core.c
@@ -95,6 +95,7 @@ nfp_vdpa_hw_init(struct nfp_vdpa_hw *vdpa_hw,
 
vdpa_hw->features = (1ULL << VIRTIO_F_VERSION_1) |
(1ULL << VIRTIO_F_IN_ORDER) |
+   (1ULL << VHOST_F_LOG_ALL) |
(1ULL << VHOST_USER_F_PROTOCOL_FEATURES);
 
return 0;
-- 
2.39.1



[PATCH v4 09/11] vdpa/nfp: setup vring relay thread

2024-08-04 Thread Chaoyong He
From: Xinying Yu 

Set up the vring relay thread to monitor interrupts from the
device, and either do the dirty page logging or notify the device
according to the event data.

Signed-off-by: Xinying Yu 
Reviewed-by: Chaoyong He 
Reviewed-by: Long Wu 
Reviewed-by: Peng Zhang 
Reviewed-by: Maxime Coquelin 
---
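Note on the epoll event data: the relay thread packs three fields into ev.data.u64: bit 0 marks a device interrupt (EPOLL_DATA_INTR), the queue id sits above it shifted left by one, and the file descriptor occupies the upper 32 bits. A minimal sketch of that encoding and its decoding follows. The relay_event_* helper names are hypothetical; the bit layout is the one used in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define EPOLL_DATA_INTR 1ULL /* bit 0: event comes from the interrupt fd */

/* Pack queue id, fd and the interrupt flag into one 64-bit value. */
static uint64_t
relay_event_pack(uint32_t qid, int fd, bool is_intr)
{
	assert(fd >= 0 && qid <= (UINT32_MAX >> 1));

	return (is_intr ? EPOLL_DATA_INTR : 0) |
		((uint64_t)qid << 1) |
		((uint64_t)(uint32_t)fd << 32);
}

/* Queue id: low 32 bits with the flag bit shifted out. */
static uint32_t
relay_event_qid(uint64_t data)
{
	return (uint32_t)data >> 1;
}

/* File descriptor: upper 32 bits. */
static int
relay_event_fd(uint64_t data)
{
	return (int)(data >> 32);
}

/* True when the event was registered for the device interrupt fd. */
static bool
relay_event_is_intr(uint64_t data)
{
	return (data & EPOLL_DATA_INTR) != 0;
}
```

Keeping the fd in the event data lets the wait loop drain the right descriptor without a lookup table, at the cost of limiting the queue id to 31 bits.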
 drivers/vdpa/nfp/nfp_vdpa.c  | 148 +++
 drivers/vdpa/nfp/nfp_vdpa_core.c |   9 ++
 drivers/vdpa/nfp/nfp_vdpa_core.h |   2 +
 3 files changed, 159 insertions(+)

diff --git a/drivers/vdpa/nfp/nfp_vdpa.c b/drivers/vdpa/nfp/nfp_vdpa.c
index 983123ba08..91f1b8d779 100644
--- a/drivers/vdpa/nfp/nfp_vdpa.c
+++ b/drivers/vdpa/nfp/nfp_vdpa.c
@@ -26,6 +26,8 @@
 #define NFP_VDPA_USED_RING_LEN(size) \
    ((size) * sizeof(struct vring_used_elem) + sizeof(struct vring_used))
 
+#define EPOLL_DATA_INTR1
+
 struct nfp_vdpa_dev {
struct rte_pci_device *pci_dev;
struct rte_vdpa_device *vdev;
@@ -777,6 +779,139 @@ update_datapath(struct nfp_vdpa_dev *device)
return ret;
 }
 
+static int
+nfp_vdpa_vring_epoll_ctl(uint32_t queue_num,
+   struct nfp_vdpa_dev *device)
+{
+   int ret;
+   uint32_t qid;
+   struct epoll_event ev;
+   struct rte_vhost_vring vring;
+
+   for (qid = 0; qid < queue_num; qid++) {
+   ev.events = EPOLLIN | EPOLLPRI;
+   rte_vhost_get_vhost_vring(device->vid, qid, &vring);
+   ev.data.u64 = qid << 1 | (uint64_t)vring.kickfd << 32;
+   ret = epoll_ctl(device->epoll_fd, EPOLL_CTL_ADD, vring.kickfd, &ev);
+   if (ret < 0) {
+   DRV_VDPA_LOG(ERR, "Epoll add error for queue %u", qid);
+   return ret;
+   }
+   }
+
+   /* vDPA driver interrupt */
+   for (qid = 0; qid < queue_num; qid += 2) {
+   ev.events = EPOLLIN | EPOLLPRI;
+   /* Leave a flag to mark it's for interrupt */
+   ev.data.u64 = EPOLL_DATA_INTR | qid << 1 |
+   (uint64_t)device->intr_fd[qid] << 32;
+   ret = epoll_ctl(device->epoll_fd, EPOLL_CTL_ADD,
+   device->intr_fd[qid], &ev);
+   if (ret < 0) {
+   DRV_VDPA_LOG(ERR, "Epoll add error for queue %u", qid);
+   return ret;
+   }
+
+   nfp_vdpa_update_used_ring(device, qid);
+   }
+
+   return 0;
+}
+
+static int
+nfp_vdpa_vring_epoll_wait(uint32_t queue_num,
+   struct nfp_vdpa_dev *device)
+{
+   int i;
+   int fds;
+   int kickfd;
+   uint32_t qid;
+   struct epoll_event events[NFP_VDPA_MAX_QUEUES * 2];
+
+   for (;;) {
+   fds = epoll_wait(device->epoll_fd, events, queue_num * 2, -1);
+   if (fds < 0) {
+   if (errno == EINTR)
+   continue;
+
+   DRV_VDPA_LOG(ERR, "Epoll wait fail");
+   return -EACCES;
+   }
+
+   for (i = 0; i < fds; i++) {
+   qid = events[i].data.u32 >> 1;
+   kickfd = (uint32_t)(events[i].data.u64 >> 32);
+
+   nfp_vdpa_read_kickfd(kickfd);
+   if ((events[i].data.u32 & EPOLL_DATA_INTR) != 0) {
+   nfp_vdpa_update_used_ring(device, qid);
+   nfp_vdpa_irq_unmask(&device->hw);
+   } else {
+   nfp_vdpa_notify_queue(&device->hw, qid);
+   }
+   }
+   }
+
+   return 0;
+}
+
+static uint32_t
+nfp_vdpa_vring_relay(void *arg)
+{
+   int ret;
+   int epoll_fd;
+   uint16_t queue_id;
+   uint32_t queue_num;
+   struct nfp_vdpa_dev *device = arg;
+
+   epoll_fd = epoll_create(NFP_VDPA_MAX_QUEUES * 2);
+   if (epoll_fd < 0) {
+   DRV_VDPA_LOG(ERR, "failed to create epoll instance.");
+   return 1;
+   }
+
+   device->epoll_fd = epoll_fd;
+
+   queue_num = rte_vhost_get_vring_num(device->vid);
+
+   ret = nfp_vdpa_vring_epoll_ctl(queue_num, device);
+   if (ret != 0)
+   goto notify_exit;
+
+   /* Start relay with a first kick */
+   for (queue_id = 0; queue_id < queue_num; queue_id++)
+   nfp_vdpa_notify_queue(&device->hw, queue_id);
+
+   ret = nfp_vdpa_vring_epoll_wait(queue_num, device);
+   if (ret != 0)
+   goto notify_exit;
+
+   return 0;
+
+notify_exit:
+   close(device->epoll_fd);
+   device->epoll_fd = -1;
+
+   return 1;
+}
+
+static int
+nfp_vdpa_setup_vring_relay(struct nfp_vdpa_dev *device)
+{
+   int ret;
+   char name[RTE_THREAD_INTERNAL_NAME_SIZE];
+
+   snprintf(name, sizeof(name), "nfp_vring%d", device->vid);
+   ret = rte_thread_create_internal_control(&device->tid, name,
+   nfp_vdpa_vring_

[PATCH v4 11/11] doc: update nfp document

2024-08-04 Thread Chaoyong He
From: Xinying Yu 

Add the software assisted vDPA live migration feature
to the NFP document.

Signed-off-by: Xinying Yu 
Reviewed-by: Chaoyong He 
Reviewed-by: Long Wu 
Reviewed-by: Peng Zhang 
Reviewed-by: Maxime Coquelin 
---
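Note on dirty page logging: the documentation added here says the relay thread helps the device log dirty pages. As a generic illustration of what such logging amounts to (not the driver's implementation), the sketch below sets one bit per page touched by a write in a bitmap. The sketch_log_write() name, the 4 KiB granularity and the bitmap layout are assumptions made for this example.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_LOG_PAGE 4096ULL /* assumed logging granularity */

/*
 * Set one bit per page touched by a write of 'len' bytes at guest
 * address 'addr'. Bit N of the bitmap corresponds to page N.
 */
static void
sketch_log_write(uint8_t *bitmap, uint64_t bitmap_bytes,
		uint64_t addr, uint64_t len)
{
	uint64_t page;
	uint64_t last;

	if (len == 0)
		return;

	last = (addr + len - 1) / SKETCH_LOG_PAGE;
	assert(last / 8 < bitmap_bytes);

	for (page = addr / SKETCH_LOG_PAGE; page <= last; page++)
		bitmap[page / 8] |= (uint8_t)(1u << (page % 8));
}
```

A write that straddles a page boundary marks both pages. If the bitmap is shared with another process, a real implementation would likely need atomic updates, which this sketch omits.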
 doc/guides/vdpadevs/nfp.rst | 9 +
 1 file changed, 9 insertions(+)

diff --git a/doc/guides/vdpadevs/nfp.rst b/doc/guides/vdpadevs/nfp.rst
index dc9e94dbc8..e4736d9f61 100644
--- a/doc/guides/vdpadevs/nfp.rst
+++ b/doc/guides/vdpadevs/nfp.rst
@@ -19,6 +19,15 @@ device will be probed by net/nfp driver and will used as a VF net device.
 
 This PMD uses (common/nfp) code to access the device firmware.
 
+Software Live Migration
+~~~~~~~~~~~~~~~~~~~~~~~
+
+Currently, the NFP vDPA driver supports only software assisted live migration.
+In this mode, the driver sets up a software relay thread when live migration
+happens, and this thread helps the device log dirty pages. Although this mode
+does not require hardware to implement a dirty page logging function block, it
+consumes a percentage of CPU resources that depends on the network throughput.
+
 Per-Device Parameters
 ~~~~~~~~~~~~~~~~~~~~~
 
-- 
2.39.1



Re: 22.11.6 patches review and test

2024-08-04 Thread YangHang Liu
Red Hat QE tested the 18 scenarios below on RHEL 9.2 and did not find any new
DPDK issues.

   - VM with device assignment(PF) throughput testing(1G hugepage size): PASS
   - VM with device assignment(PF) throughput testing(2M hugepage size): PASS
   - VM with device assignment(VF) throughput testing: PASS
   - PVP (host dpdk testpmd as vswitch) 1Q: throughput testing: PASS
   - PVP vhost-user 2Q throughput testing: PASS
   - PVP vhost-user 1Q - cross numa node throughput testing: PASS
   - VM with vhost-user 2 queues throughput testing: PASS
   - vhost-user reconnect with dpdk-client, qemu-server qemu reconnect: PASS
   - vhost-user reconnect with dpdk-client, qemu-server ovs reconnect: PASS
   - PVP reconnect with dpdk-client, qemu-server: PASS
   - PVP 1Q live migration testing: PASS
   - PVP 1Q cross numa node live migration testing: PASS
   - VM with ovs+dpdk+vhost-user 1Q live migration testing: PASS
   - VM with ovs+dpdk+vhost-user 1Q live migration testing (2M): PASS
   - VM with ovs+dpdk+vhost-user 2Q live migration testing: PASS
   - VM with ovs+dpdk+vhost-user 4Q live migration testing: PASS
   - Host PF + DPDK testing: PASS
   - Host VF + DPDK testing: PASS

Test Versions:

   - qemu-kvm-7.2.0
   - kernel 5.14
   - libvirt 9.0
   - openvswitch 3.1
   - git log

commit 2480dbd434234a40e7f999ced4650581fd64a24e
Author: Luca Boccassi
Date: Wed Jul 31 20:35:00 2024 +0100

version: 22.11.6-rc1

Signed-off-by: Luca Boccassi


   - Test device : X540-AT2 NIC(ixgbe, 10G)

Tested-by: Yanghang Liu



On Thu, Aug 1, 2024 at 3:37 AM  wrote:

> Hi all,
>
> Here is a list of patches targeted for stable release 22.11.6.
>
> The planned date for the final release is August 20th.
>
> Please help with testing and validation of your use cases and report
> any issues/results with reply-all to this mail. For the final release
> the fixes and reported validations will be added to the release notes.
>
> A release candidate tarball can be found at:
>
> https://dpdk.org/browse/dpdk-stable/tag/?id=v22.11.6-rc1
>
> These patches are located at branch 22.11 of dpdk-stable repo:
> https://dpdk.org/browse/dpdk-stable/
>
> Thanks.
>
> Luca Boccassi
>
> ---
> Abdullah Ömer Yamaç (1):
>   hash: fix RCU reclamation size
>
> Akhil Goyal (1):
>   test/crypto: fix enqueue/dequeue callback case
>
> Alex Vesker (1):
>   net/mlx5/hws: fix port ID on root item convert
>
> Alexander Kozyrev (2):
>   net/mlx5: break flow resource release loop
>   app/testpmd: add postpone option to async flow destroy
>
> Anatoly Burakov (7):
>   net/e1000/base: fix link power down
>   fbarray: fix incorrect lookahead behavior
>   fbarray: fix incorrect lookbehind behavior
>   fbarray: fix lookahead ignore mask handling
>   fbarray: fix lookbehind ignore mask handling
>   fbarray: fix finding for unaligned length
>   malloc: fix multi-process wait condition handling
>
> Andrew Boyer (1):
>   net/ionic: fix mbuf double-free when emptying array
>
> Apeksha Gupta (2):
>   bus/dpaa: fix memory leak in bus scan
>   common/dpaax: fix node array overrun
>
> Arkadiusz Kusztal (1):
>   crypto/qat: fix placement of OOP offset
>
> Bing Zhao (3):
>   net/mlx5: fix end condition of reading xstats
>   net/mlx5: fix uplink port probing in bonding mode
>   common/mlx5: remove unneeded field when modify RQ table
>
> Brian Dooley (1):
>   crypto/qat: fix GEN4 write
>
> Bruce Richardson (2):
>   net/ice: fix sizing of filter hash table
>   ethdev: fix device init without socket-local memory
>
> Chaoyong He (5):
>   app/testpmd: fix help string of BPF load command
>   net/nfp: fix IPv6 TTL and DSCP flow action
>   net/nfp: fix allocation of switch domain
>   net/nfp: forbid offload flow rules with empty action list
>   net/nfp: remove redundant function call
>
> Chengwen Feng (2):
>   net/hns3: check Rx DMA address alignmnent
>   dma/hisilicon: remove support for HIP09 platform
>
> Chenming Chang (1):
>   hash: fix return code description in Doxygen
>
> Chinh Cao (1):
>   net/ice/base: fix return type of bitmap hamming weight
>
> Christian Ehrhardt (1):
>   test: force IOVA mode on PPC64 without huge pages
>
> Ciara Loftus (4):
>   net/af_xdp: fix port ID in Rx mbuf
>   net/af_xdp: count mbuf allocation failures
>   net/af_xdp: fix stats reset
>   net/af_xdp: remove unused local statistic
>
> Ciara Power (1):
>   test/crypto: fix vector global buffer overflow
>
> Conor Fogarty (1):
>   hash: check name when creating a hash
>
> Dariusz Sosnowski (2):
>   net/mlx5: fix MTU configuration
>   net/mlx5: fix disabling E-Switch default flow rules
>
> David Marchand (14):
>   bus/pci: fix build with musl 1.2.4 / Alpine 3.19
>   eal/unix: support ZSTD compression for firmware
>   net/ice: fix check for outer UDP checksum offload
>   app/testpmd: fix outer IP checksum offload
>   net: fix outer