2022-08-27 13:31 (UTC+0000), lic121:
> On Sat, Aug 27, 2022 at 12:57:50PM +0300, Dmitry Kozlyuk wrote:
> > 2022-08-27 09:25 (UTC+0000), chengt...@qq.com:
> > > From: lic121 <lic...@chinatelecom.cn>
> > >
> > > When RTE_MALLOC_DEBUG not configured, rte_zmalloc_socket() doesn't
> > > zero out allocated memory. Because memory are zeroed out when free
> > > in malloc_elem_free(). But seems the initial allocated memory is
> > > not zeroed out as expected.
> > >
> > > This patch zero out initial allocated memory in
> > > malloc_heap_add_memory().
> > >
> > > With dpdk 20.11.5, "QLogic Corp. FastLinQ QL41000" probe triggers
> > > this problem.
> > > ```
> > > Stack trace of thread 412780:
> > > #0 0x0000000000e5fb99 ecore_int_igu_read_cam (dpdk-testpmd)
> > > #1 0x0000000000e4df54 ecore_get_hw_info (dpdk-testpmd)
> > > #2 0x0000000000e504aa ecore_hw_prepare (dpdk-testpmd)
> > > #3 0x0000000000e8a7ca qed_probe (dpdk-testpmd)
> > > #4 0x0000000000e83c59 qede_common_dev_init (dpdk-testpmd)
> > > #5 0x0000000000e84c8e qede_eth_dev_init (dpdk-testpmd)
> > > #6 0x00000000009dd5a7 rte_pci_probe_one_driver (dpdk-testpmd)
> > > #7 0x00000000009734e3 rte_bus_probe (dpdk-testpmd)
> > > #8 0x00000000009933bd rte_eal_init (dpdk-testpmd)
> > > #9 0x000000000041768f main (dpdk-testpmd)
> > > #10 0x00007f41a7001b17 __libc_start_main (libc.so.6)
> > > #11 0x000000000067e34a _start (dpdk-testpmd)
> > > ```
> > >
> > > Signed-off-by: lic121 <lic...@chinatelecom.cn>
> > > ---
> > > lib/librte_eal/common/malloc_heap.c | 8 ++++++++
> > > 1 file changed, 8 insertions(+)
> > >
> > > diff --git a/lib/librte_eal/common/malloc_heap.c
> > > b/lib/librte_eal/common/malloc_heap.c
> > > index f4e20ea..1607401 100644
> > > --- a/lib/librte_eal/common/malloc_heap.c
> > > +++ b/lib/librte_eal/common/malloc_heap.c
> > > @@ -96,11 +96,19 @@
> > > void *start, size_t len)
> > > {
> > > struct malloc_elem *elem = start;
> > > + void *ptr;
> > > + size_t data_len
> > > +
> > >
> > > malloc_elem_init(elem, heap, msl, len, elem, len);
> > >
> > > malloc_elem_insert(elem);
> > >
> > > + /* Zero out new added memory. */
> > > + *ptr = RTE_PTR_ADD(elem, MALLOC_ELEM_HEADER_LEN);
> > > + data_len = elem->size - MALLOC_ELEM_OVERHEAD;
> > > + memset(ptr, 0, data_len);
> > > +
> > > elem = malloc_elem_join_adjacent_free(elem);
> > >
> > > malloc_elem_free_list_insert(elem);
> >
> > Hi,
> >
> > The kernel ensures that the newly mapped memory is zeroed,
> > and DPDK ensures that files in hugetlbfs are not re-mapped.
> > What makes you think that it is not zeroed?
> > Were you able to catch [start; start+len) contain non-zero bytes
> > at the start of this function?
> > If so, is it system memory (not an external heap)?
> > If so, what is the CPU, kernel, any custom settings?
> >
> > Can it be the PMD or the app that uses rte_malloc instead of rte_zmalloc?
> >
> > This patch cannot be accepted as-is anyway:
> > 1. It zeroes memory even if the code was called not via rte_zmalloc().
> > 2. It leads to zeroing on both alloc and free, which is suboptimal.
>
> Hi Dmitry, thanks for the review.
>
> In rte_eth_dev_pci_allocate(), immediately after rte_zmalloc_socket()[1]
> I printed the content in gdb. It's not zero.
>
> print ((struct qede_dev *)(eth_dev->data->dev_private))->edev->p_iov_info
>
> cpu: Intel(R) Xeon(R) Gold 5218 CPU @ 2.30GHz
> kernel: 4.19.90-2102
>
> [1]
> https://github.com/DPDK/dpdk/blob/v20.11/lib/librte_ethdev/rte_ethdev_pci.h#L91-L93

Sorry, it seems that something is wrong with your debugging.
Your link is for 20.11.0.
In 20.11.5 (and apparently in every version) struct qede_dev::edev
is not a pointer [2].
Even if it were, in zeroed memory it would be a NULL pointer,
so reading a member through it would give a random value
at NULL + some offset.

I suggest printing the content of the allocated memory with rte_hexdump().

[2]: http://git.dpdk.org/dpdk-stable/tree/drivers/net/qede/qede_ethdev.h?h=v20.11.5#n223
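
To illustrate the kind of check meant above, here is a minimal sketch in plain C, so that it builds without DPDK. Both helpers are hypothetical, not DPDK APIs: first_nonzero_offset() reports where the first non-zero byte is, and dump_bytes() only approximates what rte_hexdump() would print. In the real probe path one would presumably call rte_hexdump(stdout, "dev_private", eth_dev->data->dev_private, size) right after the allocation, with size being whatever was passed to rte_zmalloc_socket().

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper (not a DPDK API): return the offset of the first
 * non-zero byte in buf[0..len), or -1 if the whole range is zero.
 */
static long
first_nonzero_offset(const void *buf, size_t len)
{
	const unsigned char *p = buf;
	size_t i;

	assert(buf != NULL);
	for (i = 0; i < len; i++) {
		if (p[i] != 0)
			return (long)i;
	}
	return -1;
}

/*
 * Hypothetical helper: print the range as hex, 16 bytes per row.
 * This is only similar in spirit to rte_hexdump(); the exact output
 * format of the DPDK function differs.
 */
static void
dump_bytes(FILE *f, const void *buf, size_t len)
{
	const unsigned char *p = buf;
	size_t i;

	for (i = 0; i < len; i++) {
		if (i % 16 == 0)
			fprintf(f, "%s%08zx:", i ? "\n" : "", i);
		fprintf(f, " %02x", p[i]);
	}
	if (len > 0)
		fputc('\n', f);
}
```

Checking the raw bytes of the allocation this way does not depend on the layout of struct qede_dev, so it cannot be confused by a header from a different version.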