On Tue, Aug 30, 2022 at 01:11:25AM +0000, lic121 wrote:
> On Mon, Aug 29, 2022 at 03:49:25PM +0300, Dmitry Kozlyuk wrote:
> > 2022-08-29 14:37 (UTC+0200), Morten Brørup:
> > > > From: David Marchand [mailto:david.march...@redhat.com]
> > > > Sent: Monday, 29 August 2022 13.58
> > > >
> > > > > > > > On Sat, Aug 27, 2022 at 12:57:50PM +0300, Dmitry Kozlyuk wrote:
> > > > > > > >
> > > > > > > > > The kernel ensures that the newly mapped memory is zeroed,
> > > > > > > > > and DPDK ensures that files in hugetlbfs are not re-mapped.
> > >
> > > David, are you suggesting that this invariant - guaranteeing that DPDK
> > > memory is zeroed - was violated by SELinux in the SELinux/container issue
> > > you were tracking?
> > >
> > > If so, the method to ensure the invariant is faulty for SELinux. Assuming
> > > DPDK supports SELinux, this bug should be fixed.
> >
> > +1, I'd like to know more about that case.
> >
> > EAL checks the unlink() result, so if it fails, the allocation should fail
> > and the invariant should not be broken.
> > Code from 20.11.5:
> >
> > 	if (rte_eal_process_type() == RTE_PROC_PRIMARY &&
> > 			unlink(path) == -1 &&
> > 			errno != ENOENT) {
> > 		RTE_LOG(DEBUG, EAL, "%s(): could not remove '%s': %s\n",
> > 			__func__, path, strerror(errno));
> > 		return -1;
> > 	}
> >
> > Can SELinux restriction result in errno == ENOENT?
> > I'd expect EPERM/EACCESS.
>
> Thanks for your info, the selinux is disabled on my server. Also I
> checked that the selinux fix is already in my dpdk. Could any other
> settings may cause dirty memory? If you can think of any thing related,
> I can have a try.
>
> BTW, this is my nic info:
> ```
> Intel Corporation Ethernet Controller E810-XXV for SFP (rev 02)
>
> driver: ice
> version: 1.9.3
> firmware-version: 2.30 0x80005d22 1.2877.0
> expansion-rom-version:
> bus-info: 0000:3b:00.1
> supports-statistics: yes
> supports-test: yes
> supports-eeprom-access: yes
> supports-register-dump: yes
> supports-priv-flags: yes
> ```

Update with more debugging:

Preparation:
1. Reserve 2 GB of hugepages (2 x 1 GB pages).
```
[root@gz15-compute-s3-55e247e16e22 huge]# grep -i huge /proc/meminfo
AnonHugePages:    124928 kB
ShmemHugePages:        0 kB
HugePages_Total:       2
HugePages_Free:        2
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:    1048576 kB
Hugetlb:         2097152 kB
```
2. Write a simple program to poison the memory.
```c
#include <stdio.h>
#include <sys/mman.h>
#include <string.h>

/* Return non-zero if all `size` bytes at `memory` equal `val`. */
static int memvcmp(void *memory, unsigned char val, size_t size)
{
    unsigned char *mm = (unsigned char*)memory;
    return (*mm == val) && memcmp(mm, mm + 1, size - 1) == 0;
}

int main(int argc, char *argv[]){
    /* Just under 2 GiB; computed in size_t to avoid int overflow. */
    size_t size = 2 * ((size_t)1 << 30) - 1;
    void *ptr2 = mmap(NULL, size, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
    /* mmap() reports failure with MAP_FAILED, not NULL. */
    if (ptr2 == MAP_FAILED) {
        printf("failed to allocate memory\n");
        return 1;
    }
    /* Any extra argument: poison the mapping with 0xff. */
    if (argc > 1) {
        memset(ptr2, 0xff, size);
    }
    unsigned char * ss = ptr2;
    printf("ss: %x\n", *ss);
    if (memvcmp(ptr2, 0, size)){
        printf("all zero\n");
    } else {
        printf("not all zero\n");
    }
    return 0;
}
```
3. Insert debug logging to check whether the memory is all zero.
```
diff --git a/lib/librte_eal/common/malloc_heap.c b/lib/librte_eal/common/malloc_heap.c
index 5a09247a6..026560333 100644
--- a/lib/librte_eal/common/malloc_heap.c
+++ b/lib/librte_eal/common/malloc_heap.c
@@ -91,16 +91,32 @@ malloc_socket_to_heap_id(unsigned int socket_id)
 /*
  * Expand the heap with a memory area.
  */
+static int memvcmp(void *memory, unsigned char val, size_t size)
+{
+    unsigned char *mm = (unsigned char*)memory;
+    return (*mm == val) && memcmp(mm, mm + 1, size - 1) == 0;
+}
 static struct malloc_elem *
 malloc_heap_add_memory(struct malloc_heap *heap, struct rte_memseg_list *msl,
 		void *start, size_t len)
 {
 	struct malloc_elem *elem = start;
+	void *ptr;
+	size_t data_len;
+
 
 	malloc_elem_init(elem, heap, msl, len, elem, len);
 
 	malloc_elem_insert(elem);
 
+	ptr = RTE_PTR_ADD(elem, MALLOC_ELEM_HEADER_LEN);
+	data_len = elem->size - MALLOC_ELEM_OVERHEAD;
+	if (memvcmp(ptr, 0, data_len)){
+		RTE_LOG(ERR, EAL, "liiiiiiilog: all zero\n");
+	} else {
+		RTE_LOG(ERR, EAL, "liiiiiiilog: not all zero\n");
+	}
+
 	elem = malloc_elem_join_adjacent_free(elem);
 
 	malloc_elem_free_list_insert(elem);
```

Debug steps:
1. Poison 2 GB of memory.
```
[root@gz15-compute-s3-55e247e16e22 secure]# rm -rf /dev/hugepages/rtemap_* ; huge/a.out 1
ss: ff
not all zero
```
2. Run testpmd (with no NIC bound to vfio-pci).
```
[root@gz15-compute-s3-55e247e16e22 secure]# dpdk-testpmd -l 0-3 -n 4 -- -i --nb-cores=3
EAL: Detected 64 lcore(s)
EAL: Detected 2 NUMA nodes
EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
EAL: Selected IOVA mode 'VA'
EAL: Probing VFIO support...
EAL: VFIO support initialized
EAL: liiiiiiilog: not all zero
EAL: No legacy callbacks, legacy socket not created
testpmd: No probed ethernet devices
Interactive-mode selected
testpmd: create a new mbuf pool <mb_pool_0>: n=171456, size=2176, socket=0
testpmd: preferred mempool ops selected: ring_mp_mc
testpmd: create a new mbuf pool <mb_pool_1>: n=171456, size=2176, socket=1
testpmd: preferred mempool ops selected: ring_mp_mc
EAL: liiiiiiilog: not all zero
Done
testpmd>
```

Dirty memory appears even when no NIC is probed. I tried two different CPUs and saw the same issue on both:
- Intel(R) Xeon(R) Gold 5218 CPU @ 2.30GHz
- Intel(R) Xeon(R) Platinum 8378A CPU @ 3.00GHz
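
The memvcmp() check above only answers "all zero or not", and it assumes size is at least 1 (size - 1 wraps around for a zero-length region). To narrow down where the dirty data sits, a slightly more detailed check might help. The following is only a minimal sketch: first_nonzero_offset() and count_dirty_chunks() are hypothetical helper names, not DPDK APIs, and they were not part of the test above.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not a DPDK API): return the offset of the first
 * non-zero byte in [mem, mem + size), or `size` if every byte is zero.
 * A zero-length region is reported as clean.
 */
static size_t first_nonzero_offset(const void *mem, size_t size)
{
	const unsigned char *p = mem;
	size_t i;

	for (i = 0; i < size; i++) {
		if (p[i] != 0)
			return i;
	}
	return size;
}

/*
 * Hypothetical helper: split the region into `chunk`-byte pieces (the
 * last piece may be shorter) and count how many contain at least one
 * non-zero byte. Returns 0 if `chunk` is 0.
 */
static size_t count_dirty_chunks(const void *mem, size_t size, size_t chunk)
{
	const unsigned char *p = mem;
	size_t off, dirty = 0;

	if (chunk == 0)
		return 0;
	for (off = 0; off < size; off += chunk) {
		size_t n = (size - off < chunk) ? size - off : chunk;

		if (first_nonzero_offset(p + off, n) != n)
			dirty++;
	}
	return dirty;
}
```

In the debug hunk above, these could be called as first_nonzero_offset(ptr, data_len) and count_dirty_chunks(ptr, data_len, 4096) and the results logged. That would show whether the whole segment is poisoned or only part of it, at the cost of a byte-by-byte scan.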