Re: [PATCH] eal: zero out new added memory

2022-09-03 Thread lic121
On Tue, Aug 30, 2022 at 12:53:37PM +, lic121 wrote:
> On Tue, Aug 30, 2022 at 01:59:16PM +0300, Dmitry Kozlyuk wrote:
> > Thank you for the most detailed info!
> > 
> > 1. If you run the poisoner program the second time,
> >does it also see dirty memory immediately after mmap()?
> 
> No, running the poisoner program again doesn't show dirty memory
> immediately after mmap.
> 
> I assume that there are some differences in how my poisoner program
> uses hugepages and how testpmd uses them. I notice that testpmd
> leaves some files under /dev/hugepages/rtemap_xxx even after the
> testpmd process exits, but my program didn't create any file under
> /dev/hugepages/.
> > 
> > 2. Kernel 4.19.90-2102 patchlevel 2102 is very high,
> >can there be any unusual patches applied?
> >Your host has "compute" in its name,
> >can it have patches that trade security for performance?
> 
> I may need to talk to the kernel team. Indeed, I failed to reproduce the
> issue on a virtual machine with kernel 4.18.0-305.12.1.el8_4.x86_64
> (2M page size instead of 1G).

Verified that this issue is caused by a downstream Linux kernel issue.
Thanks for your attention.


Re: [PATCH] eal: zero out new added memory

2022-08-27 Thread lic121
On Sat, Aug 27, 2022 at 12:57:50PM +0300, Dmitry Kozlyuk wrote:
> 2022-08-27 09:25 (UTC+), chengt...@qq.com:
> > From: lic121 
> > 
> > When RTE_MALLOC_DEBUG is not configured, rte_zmalloc_socket() doesn't
> > zero out allocated memory, because memory is zeroed out when freed
> > in malloc_elem_free(). But it seems the initially added memory is
> > not zeroed out as expected.
> > 
> > This patch zeroes out the initially added memory in
> > malloc_heap_add_memory().
> > 
> > With DPDK 20.11.5, probing a "QLogic Corp. FastLinQ QL41000" NIC
> > triggers this problem.
> > ```
> >   Stack trace of thread 412780:
> >   #0  0x00e5fb99 ecore_int_igu_read_cam (dpdk-testpmd)
> >   #1  0x00e4df54 ecore_get_hw_info (dpdk-testpmd)
> >   #2  0x00e504aa ecore_hw_prepare (dpdk-testpmd)
> >   #3  0x00e8a7ca qed_probe (dpdk-testpmd)
> >   #4  0x00e83c59 qede_common_dev_init (dpdk-testpmd)
> >   #5  0x00e84c8e qede_eth_dev_init (dpdk-testpmd)
> >   #6  0x009dd5a7 rte_pci_probe_one_driver (dpdk-testpmd)
> >   #7  0x009734e3 rte_bus_probe (dpdk-testpmd)
> >   #8  0x009933bd rte_eal_init (dpdk-testpmd)
> >   #9  0x0041768f main (dpdk-testpmd)
> >   #10 0x7f41a7001b17 __libc_start_main (libc.so.6)
> >   #11 0x0067e34a _start (dpdk-testpmd)
> > ```
> > 
> > Signed-off-by: lic121 
> > ---
> >  lib/librte_eal/common/malloc_heap.c | 8 
> >  1 file changed, 8 insertions(+)
> > 
> > diff --git a/lib/librte_eal/common/malloc_heap.c 
> > b/lib/librte_eal/common/malloc_heap.c
> > index f4e20ea..1607401 100644
> > --- a/lib/librte_eal/common/malloc_heap.c
> > +++ b/lib/librte_eal/common/malloc_heap.c
> > @@ -96,11 +96,19 @@
> > void *start, size_t len)
> >  {
> > struct malloc_elem *elem = start;
> > +   void *ptr;
> > +   size_t data_len;
> > +
> >  
> > malloc_elem_init(elem, heap, msl, len, elem, len);
> >  
> > malloc_elem_insert(elem);
> >  
> > +   /* Zero out new added memory. */
> > +   ptr = RTE_PTR_ADD(elem, MALLOC_ELEM_HEADER_LEN);
> > +   data_len = elem->size - MALLOC_ELEM_OVERHEAD;
> > +   memset(ptr, 0, data_len);
> > +
> > elem = malloc_elem_join_adjacent_free(elem);
> >  
> > malloc_elem_free_list_insert(elem);
> 
> Hi,
> 
> The kernel ensures that the newly mapped memory is zeroed,
> and DPDK ensures that files in hugetlbfs are not re-mapped.
> What makes you think that it is not zeroed?
> Were you able to catch [start; start+len) containing non-zero bytes
> at the start of this function?
> If so, is it system memory (not an external heap)?
> If so, what is the CPU, kernel, any custom settings?
> 
> Can it be the PMD or the app that uses rte_malloc instead of rte_zmalloc?
> 
> This patch cannot be accepted as-is anyway:
> 1. It zeroes memory even if the code was not called via rte_zmalloc().
> 2. It leads to zeroing on both alloc and free, which is suboptimal.

Hi Dmitry, thanks for the review.

In rte_eth_dev_pci_allocate(), immediately after rte_zmalloc_socket() [1],
I printed the content in gdb. It's not zero.

print ((struct qede_dev *)(eth_dev->data->dev_private))->edev->p_iov_info

cpu: Intel(R) Xeon(R) Gold 5218 CPU @ 2.30GHz
kernel: 4.19.90-2102

[1]
https://github.com/DPDK/dpdk/blob/v20.11/lib/librte_ethdev/rte_ethdev_pci.h#L91-L93
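
For reference, the same check can be made without a PMD or gdb. Below is a
minimal sketch (not from the thread; it assumes a DPDK 20.11-style build
environment with hugepages already configured, and the allocation size is
arbitrary) that allocates with rte_zmalloc_socket() right after
rte_eal_init() and scans the returned buffer for non-zero bytes:

```c
/* Sketch: check whether rte_zmalloc_socket() returns zeroed memory right
 * after EAL init, without going through a PMD or gdb. */
#include <stdio.h>
#include <string.h>

#include <rte_eal.h>
#include <rte_lcore.h>
#include <rte_malloc.h>

int
main(int argc, char **argv)
{
	size_t len = 16 * 1024 * 1024; /* arbitrary, reasonably large */
	unsigned char *buf;

	if (rte_eal_init(argc, argv) < 0) {
		fprintf(stderr, "rte_eal_init() failed\n");
		return 1;
	}

	buf = rte_zmalloc_socket("zero_check", len, 0, rte_socket_id());
	if (buf == NULL) {
		fprintf(stderr, "rte_zmalloc_socket() failed\n");
		return 1;
	}

	/* Same idea as the memvcmp() helper later in the thread: check the
	 * first byte, then compare the buffer against itself shifted by one. */
	if (buf[0] == 0 && memcmp(buf, buf + 1, len - 1) == 0)
		printf("all zero\n");
	else
		printf("not all zero\n");

	rte_free(buf);
	return 0;
}
```

If such a standalone allocation also comes back dirty, the problem is below
the PMD; if it is always zero, suspicion moves back to the qede probe path or
to the gdb expression itself.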



Re: [PATCH] eal: zero out new added memory

2022-08-28 Thread lic121
On Sat, Aug 27, 2022 at 05:56:54PM +0300, Dmitry Kozlyuk wrote:
> 2022-08-27 13:31 (UTC+), lic121:
> > On Sat, Aug 27, 2022 at 12:57:50PM +0300, Dmitry Kozlyuk wrote:
> > > 2022-08-27 09:25 (UTC+), chengt...@qq.com:  
> > > > From: lic121 
> > > > 
> > > > When RTE_MALLOC_DEBUG is not configured, rte_zmalloc_socket() doesn't
> > > > zero out allocated memory, because memory is zeroed out when freed
> > > > in malloc_elem_free(). But it seems the initially added memory is
> > > > not zeroed out as expected.
> > > > 
> > > > This patch zeroes out the initially added memory in
> > > > malloc_heap_add_memory().
> > > > 
> > > > With DPDK 20.11.5, probing a "QLogic Corp. FastLinQ QL41000" NIC
> > > > triggers this problem.
> > > > ```
> > > >   Stack trace of thread 412780:
> > > >   #0  0x00e5fb99 ecore_int_igu_read_cam (dpdk-testpmd)
> > > >   #1  0x00e4df54 ecore_get_hw_info (dpdk-testpmd)
> > > >   #2  0x00e504aa ecore_hw_prepare (dpdk-testpmd)
> > > >   #3  0x00e8a7ca qed_probe (dpdk-testpmd)
> > > >   #4  0x00e83c59 qede_common_dev_init (dpdk-testpmd)
> > > >   #5  0x00e84c8e qede_eth_dev_init (dpdk-testpmd)
> > > >   #6  0x009dd5a7 rte_pci_probe_one_driver (dpdk-testpmd)
> > > >   #7  0x009734e3 rte_bus_probe (dpdk-testpmd)
> > > >   #8  0x009933bd rte_eal_init (dpdk-testpmd)
> > > >   #9  0x0041768f main (dpdk-testpmd)
> > > >   #10 0x7f41a7001b17 __libc_start_main (libc.so.6)
> > > >   #11 0x0067e34a _start (dpdk-testpmd)
> > > > ```
> > > > 
> > > > Signed-off-by: lic121 
> > > > ---
> > > >  lib/librte_eal/common/malloc_heap.c | 8 
> > > >  1 file changed, 8 insertions(+)
> > > > 
> > > > diff --git a/lib/librte_eal/common/malloc_heap.c 
> > > > b/lib/librte_eal/common/malloc_heap.c
> > > > index f4e20ea..1607401 100644
> > > > --- a/lib/librte_eal/common/malloc_heap.c
> > > > +++ b/lib/librte_eal/common/malloc_heap.c
> > > > @@ -96,11 +96,19 @@
> > > > void *start, size_t len)
> > > >  {
> > > > struct malloc_elem *elem = start;
> > > > +   void *ptr;
> > > > +   size_t data_len;
> > > > +
> > > >  
> > > > malloc_elem_init(elem, heap, msl, len, elem, len);
> > > >  
> > > > malloc_elem_insert(elem);
> > > >  
> > > > +   /* Zero out new added memory. */
> > > > +   ptr = RTE_PTR_ADD(elem, MALLOC_ELEM_HEADER_LEN);
> > > > +   data_len = elem->size - MALLOC_ELEM_OVERHEAD;
> > > > +   memset(ptr, 0, data_len);
> > > > +
> > > > elem = malloc_elem_join_adjacent_free(elem);
> > > >  
> > > > malloc_elem_free_list_insert(elem);  
> > > 
> > > Hi,
> > > 
> > > The kernel ensures that the newly mapped memory is zeroed,
> > > and DPDK ensures that files in hugetlbfs are not re-mapped.
> > > What makes you think that it is not zeroed?
> > > Were you able to catch [start; start+len) containing non-zero bytes
> > > at the start of this function?
> > > If so, is it system memory (not an external heap)?
> > > If so, what is the CPU, kernel, any custom settings?
> > > 
> > > Can it be the PMD or the app that uses rte_malloc instead of rte_zmalloc?
> > > 
> > > This patch cannot be accepted as-is anyway:
> > > 1. It zeroes memory even if the code was not called via rte_zmalloc().
> > > 2. It leads to zeroing on both alloc and free, which is suboptimal.  
> > 
> > Hi Dmitry, thanks for the review.
> > 
> > In rte_eth_dev_pci_allocate(), immediately after rte_zmalloc_socket() [1],
> > I printed the content in gdb. It's not zero.
> > 
> > print ((struct qede_dev *)(eth_dev->data->dev_private))->edev->p_iov_info
> > 
> > cpu: Intel(R) Xeon(R) Gold 5218 CPU @ 2.30GHz
> > kernel: 4.19.90-2102
> > 
> > [1]
> > https://github.com/DPDK/dpdk/blob/v20.11/lib/librte_ethdev/rte_ethdev_pci.h#L91-L93
> 
> Sorry, it seems that something is wrong with your debug.
> Your link is for 20.11.0.
> In 20.11.5 (apparently always) struct qede_dev::edev is not a pointer [2].
> Even if it was, in z

Re: [PATCH] eal: zero out new added memory

2022-08-29 Thread lic121
On Mon, Aug 29, 2022 at 01:18:36AM +, lic121 wrote:
> On Sat, Aug 27, 2022 at 05:56:54PM +0300, Dmitry Kozlyuk wrote:
> > 2022-08-27 13:31 (UTC+), lic121:
> > > On Sat, Aug 27, 2022 at 12:57:50PM +0300, Dmitry Kozlyuk wrote:
> > > > 2022-08-27 09:25 (UTC+), chengt...@qq.com:  
> > > > > From: lic121 
> > > > > 
> > > > > When RTE_MALLOC_DEBUG is not configured, rte_zmalloc_socket() doesn't
> > > > > zero out allocated memory, because memory is zeroed out when freed
> > > > > in malloc_elem_free(). But it seems the initially added memory is
> > > > > not zeroed out as expected.
> > > > > 
> > > > > This patch zeroes out the initially added memory in
> > > > > malloc_heap_add_memory().
> > > > > 
> > > > > With DPDK 20.11.5, probing a "QLogic Corp. FastLinQ QL41000" NIC
> > > > > triggers this problem.
> > > > > ```
> > > > >   Stack trace of thread 412780:
> > > > >   #0  0x00e5fb99 ecore_int_igu_read_cam (dpdk-testpmd)
> > > > >   #1  0x00e4df54 ecore_get_hw_info (dpdk-testpmd)
> > > > >   #2  0x00e504aa ecore_hw_prepare (dpdk-testpmd)
> > > > >   #3  0x00e8a7ca qed_probe (dpdk-testpmd)
> > > > >   #4  0x00e83c59 qede_common_dev_init (dpdk-testpmd)
> > > > >   #5  0x00e84c8e qede_eth_dev_init (dpdk-testpmd)
> > > > >   #6  0x009dd5a7 rte_pci_probe_one_driver (dpdk-testpmd)
> > > > >   #7  0x009734e3 rte_bus_probe (dpdk-testpmd)
> > > > >   #8  0x009933bd rte_eal_init (dpdk-testpmd)
> > > > >   #9  0x0041768f main (dpdk-testpmd)
> > > > >   #10 0x7f41a7001b17 __libc_start_main (libc.so.6)
> > > > >   #11 0x0067e34a _start (dpdk-testpmd)
> > > > > ```
> > > > > 
> > > > > Signed-off-by: lic121 
> > > > > ---
> > > > >  lib/librte_eal/common/malloc_heap.c | 8 
> > > > >  1 file changed, 8 insertions(+)
> > > > > 
> > > > > diff --git a/lib/librte_eal/common/malloc_heap.c 
> > > > > b/lib/librte_eal/common/malloc_heap.c
> > > > > index f4e20ea..1607401 100644
> > > > > --- a/lib/librte_eal/common/malloc_heap.c
> > > > > +++ b/lib/librte_eal/common/malloc_heap.c
> > > > > @@ -96,11 +96,19 @@
> > > > >   void *start, size_t len)
> > > > >  {
> > > > >   struct malloc_elem *elem = start;
> > > > > + void *ptr;
> > > > > + size_t data_len;
> > > > > +
> > > > >  
> > > > >   malloc_elem_init(elem, heap, msl, len, elem, len);
> > > > >  
> > > > >   malloc_elem_insert(elem);
> > > > >  
> > > > > + /* Zero out new added memory. */
> > > > > + ptr = RTE_PTR_ADD(elem, MALLOC_ELEM_HEADER_LEN);
> > > > > + data_len = elem->size - MALLOC_ELEM_OVERHEAD;
> > > > > + memset(ptr, 0, data_len);
> > > > > +
> > > > >   elem = malloc_elem_join_adjacent_free(elem);
> > > > >  
> > > > >   malloc_elem_free_list_insert(elem);  
> > > > 
> > > > Hi,
> > > > 
> > > > The kernel ensures that the newly mapped memory is zeroed,
> > > > and DPDK ensures that files in hugetlbfs are not re-mapped.
> > > > What makes you think that it is not zeroed?
> > > > Were you able to catch [start; start+len) containing non-zero bytes
> > > > at the start of this function?
> > > > If so, is it system memory (not an external heap)?
> > > > If so, what is the CPU, kernel, any custom settings?
> > > > 
> > > > Can it be the PMD or the app that uses rte_malloc instead of 
> > > > rte_zmalloc?
> > > > 
> > > > This patch cannot be accepted as-is anyway:
> > > > 1. It zeroes memory even if the code was not called via rte_zmalloc().
> > > > 2. It leads to zeroing on both alloc and free, which is suboptimal.  
> > > 
> > > Hi Dmitry, thanks for the review.
> > > 
> > > In rte_eth_dev_pci_allocate(), immediately after rte_zmalloc_socket() [1],
> > > I printed the content in gdb. It's not zero.
> > > 
> > > print ((struct qed

Re: [PATCH] eal: zero out new added memory

2022-08-29 Thread lic121
On Mon, Aug 29, 2022 at 03:49:25PM +0300, Dmitry Kozlyuk wrote:
> 2022-08-29 14:37 (UTC+0200), Morten Brørup:
> > > From: David Marchand [mailto:david.march...@redhat.com]
> > > Sent: Monday, 29 August 2022 13.58
> > >
> > > > > > > On Sat, Aug 27, 2022 at 12:57:50PM +0300, Dmitry Kozlyuk wrote:  
> > > > > > > > The kernel ensures that the newly mapped memory is zeroed,
> > > > > > > > and DPDK ensures that files in hugetlbfs are not re-mapped.  
> > 
> > David, are you suggesting that this invariant - guaranteeing that DPDK 
> > memory is zeroed - was violated by SELinux in the SELinux/container issue 
> > you were tracking?
> > 
> > If so, the method to ensure the invariant is faulty for SELinux. Assuming 
> > DPDK supports SELinux, this bug should be fixed.
> 
> +1, I'd like to know more about that case.
> 
> EAL checks the unlink() result, so if it fails, the allocation should fail
> and the invariant should not be broken.
> Code from 20.11.5:
> 
>   if (rte_eal_process_type() == RTE_PROC_PRIMARY &&
>   unlink(path) == -1 &&
>   errno != ENOENT) {
>   RTE_LOG(DEBUG, EAL, "%s(): could not remove '%s': %s\n",
>   __func__, path, strerror(errno));
>   return -1;
>   }
> 
> Can SELinux restriction result in errno == ENOENT?
> I'd expect EPERM/EACCESS.

Thanks for your info. SELinux is disabled on my server. Also, I
checked that the SELinux fix is already in my DPDK. Could any other
settings cause dirty memory? If you can think of anything related,
I can give it a try.

BTW, this is my nic info:
```
Intel Corporation Ethernet Controller E810-XXV for SFP (rev 02)

driver: ice
version: 1.9.3
firmware-version: 2.30 0x80005d22 1.2877.0
expansion-rom-version:
bus-info: :3b:00.1
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
supports-priv-flags: yes
```
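
On the leftover-files angle: whether a previous testpmd run left rtemap_*
files behind can be checked from outside DPDK before starting a new run. A
small sketch, assuming the default /dev/hugepages mount point (adjust for
other hugetlbfs mounts):

```c
/* Sketch: list leftover rtemap_* files in a hugetlbfs mount before starting
 * testpmd. The /dev/hugepages mount point is an assumption. */
#include <dirent.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

int
main(void)
{
	const char *mnt = "/dev/hugepages";
	struct dirent *de;
	DIR *dir = opendir(mnt);

	if (dir == NULL) {
		perror("opendir");
		return 1;
	}
	while ((de = readdir(dir)) != NULL) {
		char path[512];
		struct stat st;

		if (strncmp(de->d_name, "rtemap_", 7) != 0)
			continue;
		snprintf(path, sizeof(path), "%s/%s", mnt, de->d_name);
		if (stat(path, &st) == 0)
			printf("%s (%lld bytes)\n", path, (long long)st.st_size);
	}
	closedir(dir);
	return 0;
}
```

As the quoted EAL code shows, a primary process unlinks any such file before
mapping and fails the allocation if unlink() fails with anything other than
ENOENT, so stale files alone should not lead to re-mapped (dirty) pages.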


Re: [PATCH] eal: zero out new added memory

2022-08-30 Thread lic121
On Tue, Aug 30, 2022 at 01:11:25AM +, lic121 wrote:
> On Mon, Aug 29, 2022 at 03:49:25PM +0300, Dmitry Kozlyuk wrote:
> > 2022-08-29 14:37 (UTC+0200), Morten Brørup:
> > > > From: David Marchand [mailto:david.march...@redhat.com]
> > > > Sent: Monday, 29 August 2022 13.58
> > > >
> > > > > > > > On Sat, Aug 27, 2022 at 12:57:50PM +0300, Dmitry Kozlyuk wrote: 
> > > > > > > >  
> > > > > > > > > The kernel ensures that the newly mapped memory is zeroed,
> > > > > > > > > and DPDK ensures that files in hugetlbfs are not re-mapped.  
> > > 
> > > David, are you suggesting that this invariant - guaranteeing that DPDK 
> > > memory is zeroed - was violated by SELinux in the SELinux/container issue 
> > > you were tracking?
> > > 
> > > If so, the method to ensure the invariant is faulty for SELinux. Assuming 
> > > DPDK supports SELinux, this bug should be fixed.
> > 
> > +1, I'd like to know more about that case.
> > 
> > EAL checks the unlink() result, so if it fails, the allocation should fail
> > and the invariant should not be broken.
> > Code from 20.11.5:
> > 
> > if (rte_eal_process_type() == RTE_PROC_PRIMARY &&
> > unlink(path) == -1 &&
> > errno != ENOENT) {
> > RTE_LOG(DEBUG, EAL, "%s(): could not remove '%s': %s\n",
> > __func__, path, strerror(errno));
> > return -1;
> > }
> > 
> > Can SELinux restriction result in errno == ENOENT?
> > I'd expect EPERM/EACCESS.
> 
> Thanks for your info. SELinux is disabled on my server. Also, I
> checked that the SELinux fix is already in my DPDK. Could any other
> settings cause dirty memory? If you can think of anything related,
> I can give it a try.
> 
> BTW, this is my nic info:
> ```
> Intel Corporation Ethernet Controller E810-XXV for SFP (rev 02)
> 
> driver: ice
> version: 1.9.3
> firmware-version: 2.30 0x80005d22 1.2877.0
> expansion-rom-version:
> bus-info: :3b:00.1
> supports-statistics: yes
> supports-test: yes
> supports-eeprom-access: yes
> supports-register-dump: yes
> supports-priv-flags: yes
> ```


Update with more debug info:

Preparation:
1. Reserve 2 GB of hugepages (two 1 GB pages).
```
[root@gz15-compute-s3-55e247e16e22 huge]# grep -i huge /proc/meminfo
AnonHugePages:    124928 kB
ShmemHugePages:        0 kB
HugePages_Total:       2
HugePages_Free:        2
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:    1048576 kB
Hugetlb:         2097152 kB
```

2. Write a simple program to poison memory.
```c
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

/* Return 1 if every byte of the buffer equals val: check the first byte,
 * then compare the buffer against itself shifted by one byte. */
static int memvcmp(void *memory, unsigned char val, size_t size)
{
    unsigned char *mm = (unsigned char *)memory;
    return (*mm == val) && memcmp(mm, mm + 1, size - 1) == 0;
}

int main(int argc, char *argv[])
{
    size_t size = (2UL << 30) - 1; /* just under 2 GiB (two 1 GiB hugepages) */
    void *ptr2 = mmap(NULL, size,
                      PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB,
                      -1, 0);
    if (ptr2 == MAP_FAILED) {
        printf("failed to allocate hugepage memory\n");
        return 1;
    }
    /* With any extra argument, poison the memory with 0xff. */
    if (argc > 1) {
        memset(ptr2, 0xff, size);
    }
    unsigned char *ss = ptr2;
    printf("ss: %x\n", *ss);
    if (memvcmp(ptr2, 0, size)) {
        printf("all zero\n");
    } else {
        printf("not all zero\n");
    }
    return 0;
}
```

3. Insert debug code to check whether the memory is all zero.
```
diff --git a/lib/librte_eal/common/malloc_heap.c
b/lib/librte_eal/common/malloc_heap.c
index 5a09247a6..026560333 100644
--- a/lib/librte_eal/common/malloc_heap.c
+++ b/lib/librte_eal/common/malloc_heap.c
@@ -91,16 +91,32 @@ malloc_socket_to_heap_id(unsigned int socket_id)
 /*
  * Expand the heap with a memory area.
  */
+static int memvcmp(void *memory, unsigned char val, size_t size)
+{
+   unsigned char *mm = (unsigned char *)memory;
+   return (*mm == val) && memcmp(mm, mm + 1, size - 1) == 0;
+}
 static struct malloc_elem *
 malloc_heap_add_memory(struct malloc_heap *heap, struct rte_memseg_list
*msl,
void *start, size_t len)
 {
struct malloc_elem *elem = start;
+   void *ptr;
+   size_t data_len;
+

malloc_elem_init(elem, heap, msl, len, elem, len);

malloc_elem_insert(elem);

+   ptr = RTE_PTR_ADD(elem, MALLOC_ELEM_HEADER_LEN);
+   data_len = elem->size - MALLOC_ELEM_OVERHEAD;
+   if (memvcmp(ptr, 0, data_len)) {
+       RTE_LOG(ERR, EAL, "liiilog: all zero\n");
+   } else {
+       RTE_LOG(ERR, EAL, "liiilog: not all zero\n");
+   }
+

Re: [PATCH] eal: zero out new added memory

2022-08-30 Thread lic121
On Tue, Aug 30, 2022 at 01:59:16PM +0300, Dmitry Kozlyuk wrote:
> Thank you for the most detailed info!
> 
> 1. If you run the poisoner program the second time,
>does it also see dirty memory immediately after mmap()?

If I run the poisoner program a second time, there is no dirty memory.

There could be some difference in how my poisoner program and
testpmd use mmap, because I notice that testpmd leaves the hugepage
files under /dev/hugepages/rtemap_xxx even after testpmd exits, but my
poisoner program didn't create any file under /dev/hugepages.

> 
> 2. Kernel 4.19.90-2102 patchlevel 2102 is very high,
>can there be any unusual patches applied?
>Your host has "compute" in its name,
>can it have patches that trade security for performance?

I may need to talk with the kernel team on this. Indeed, I failed to
reproduce the issue on a virtual machine with kernel
4.18.0-305.12.1.el8_4.x86_64 (2M page size instead of 1G).


Re: [PATCH] eal: zero out new added memory

2022-08-30 Thread lic121
On Tue, Aug 30, 2022 at 01:59:16PM +0300, Dmitry Kozlyuk wrote:
> Thank you for the most detailed info!
> 
> 1. If you run the poisoner program the second time,
>does it also see dirty memory immediately after mmap()?

No, running the poisoner program again doesn't show dirty memory
immediately after mmap.

I assume that there are some differences in how my poisoner program
uses hugepages and how testpmd uses them. I notice that testpmd
leaves some files under /dev/hugepages/rtemap_xxx even after the
testpmd process exits, but my program didn't create any file under
/dev/hugepages/.
> 
> 2. Kernel 4.19.90-2102 patchlevel 2102 is very high,
>can there be any unusual patches applied?
>Your host has "compute" in its name,
>can it have patches that trade security for performance?

I may need to talk to the kernel team. Indeed, I failed to reproduce the
issue on a virtual machine with kernel 4.18.0-305.12.1.el8_4.x86_64
(2M page size instead of 1G).
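
One way to narrow down that difference would be a variant of the poisoner
program that uses a file-backed hugetlbfs mapping, similar to how testpmd
maps its rtemap_xxx files, instead of an anonymous MAP_HUGETLB mapping. A
sketch, where the /dev/hugepages mount point, the file name, and the 1 GB
page size are assumptions:

```c
/* Sketch: poison/check memory through a file in hugetlbfs (as DPDK does for
 * rtemap_xxx) rather than an anonymous MAP_HUGETLB mapping. */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

int
main(int argc, char *argv[])
{
	const char *path = "/dev/hugepages/zerocheck_0";
	size_t size = 1UL << 30; /* one 1 GB hugepage */
	unsigned char *p;
	int fd;

	fd = open(path, O_CREAT | O_RDWR, 0600);
	if (fd < 0) {
		perror("open");
		return 1;
	}
	p = mmap(NULL, size, PROT_READ | PROT_WRITE,
		 MAP_SHARED | MAP_POPULATE, fd, 0);
	if (p == MAP_FAILED) {
		perror("mmap");
		close(fd);
		return 1;
	}

	if (p[0] == 0 && memcmp(p, p + 1, size - 1) == 0)
		printf("all zero\n");
	else
		printf("not all zero\n");

	if (argc > 1) /* with any argument: poison the page for the next run */
		memset(p, 0xff, size);

	munmap(p, size);
	close(fd);
	/* Like rtemap_xxx, the file is left in place on purpose so that a
	 * second run maps the same hugetlbfs file again. */
	return 0;
}
```

If dirty bytes show up only with the file-backed mapping (for example, when
the file from a previous run still exists), that would point at the
hugetlbfs path of the downstream kernel rather than at anonymous hugepage
allocation.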