On 19.07.2018 12:48, Burakov, Anatoly wrote:
On 19-Jul-18 10:01 AM, Burakov, Anatoly wrote:
On 18-Jul-18 9:58 PM, Stephen Hemminger wrote:
On Wed, 18 Jul 2018 22:52:12 +0300
Andrew Rybchenko <arybche...@solarflare.com> wrote:
On 18.07.2018 20:18, Burakov, Anatoly wrote:
On 18-Jul-18 4:20 PM, Andrew Rybchenko wrote:
Hi Anatoly,
I'm investigating an issue that ultimately comes down to the fact that
memory allocated using rte_zmalloc() contains non-zero bytes.
If I add a memset() just after the allocation, everything works fine.
I've found out that memset was removed from rte_zmalloc_socket() some
time ago:
>>>
commit b78c9175118f7d61022ddc5c62ce54a1bd73cea5
Author: Sergio Gonzalez Monroy <sergio.gonzalez.mon...@intel.com>
Date: Tue Jul 5 12:01:16 2016 +0100
mem: do not zero out memory on zmalloc
Zeroing out memory on rte_zmalloc_socket is not required anymore since all
allocated memory is already zeroed.
Signed-off-by: Sergio Gonzalez Monroy <sergio.gonzalez.mon...@intel.com>
<<<
but maybe something has changed since then that makes the above statement
false.
I observe the problem when memory is reallocated, i.e. I configure 7 queues,
start, stop, reconfigure to 3 queues, and start again. Memory is allocated on
start and freed on stop. Since there are fewer queues on the second start,
memory is allocated in a different way and reuses previously allocated and
freed memory.
Do you have any ideas what could be wrong?
Andrew.
Hi Andrew,
I will look into it first thing tomorrow. In general, we memset(0) on
free, and the kernel gives us zeroed-out pages initially, so the most
likely point of failure is that I'm not overwriting some malloc headers
correctly on free.
OK, at least now I know how it is supposed to work in theory.
The following region was allocated (the second number below is the pointer
plus the size):
ALLOC 0x7fffa3264080-0x7fffa32640b8
The non-zeroed address is 16 bytes before it:
(gdb) p/x ((uint64_t *)0x7fffa3264070)[0]
$4 = 0x4000000002
(gdb) p/x ((uint64_t *)0x7fffa3264070)[1]
$5 = 0x80
then freed
FREE 0x7fffa3264080-0x7fffa32640b8
but the values above (from gdb) are still the same
then it is allocated as the part of bigger memory chunk
ALLOC 0x7fffa3245b80-0x7fffa3265fd8
which should contain zeros, but above values are still the same.
Interestingly, it looks like it was the first block freed on port stop.
I'm not 100% sure, since I put the printouts in my allocation wrapper,
not in EAL.
Many thanks,
Andrew.
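The kind of allocation wrapper described above can be sketched as follows. This is only an illustration, not the wrapper actually used in the report: first_nonzero() and trace_alloc() are hypothetical helper names, and the output format simply mimics the ALLOC lines in the trace.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Return the offset of the first non-zero byte in [buf, buf + len),
 * or len if every byte is zero.
 */
static size_t
first_nonzero(const void *buf, size_t len)
{
    const uint8_t *p = buf;
    size_t i;

    assert(buf != NULL || len == 0);
    for (i = 0; i < len; i++)
        if (p[i] != 0)
            return i;
    return len;
}

/*
 * Hypothetical tracing helper: print "ALLOC start-end" (end is pointer
 * plus size) and report whether the region is fully zeroed.
 * Returns 1 if all bytes are zero, 0 otherwise.
 */
static int
trace_alloc(const void *ptr, size_t size)
{
    size_t off = first_nonzero(ptr, size);

    printf("ALLOC %p-%p\n", ptr,
           (const void *)((const uint8_t *)ptr + size));
    if (off != size) {
        printf("  non-zero byte at offset %zu\n", off);
        return 0;
    }
    return 1;
}
```

Calling trace_alloc() on the pointer returned by an allocation that is expected to be zeroed gives both the trace line and the offset of the first stale byte, which is enough to locate it relative to earlier ALLOC/FREE lines.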
memset here is what is supposed to clear the data.
struct malloc_elem *
malloc_elem_free(struct malloc_elem *elem)
{
    void *ptr;
    size_t data_len;

    ptr = RTE_PTR_ADD(elem, MALLOC_ELEM_HEADER_LEN + elem->pad);
    data_len = elem->size - elem->pad - MALLOC_ELEM_OVERHEAD;

    elem = malloc_elem_join_adjacent_free(elem);

    malloc_elem_free_list_insert(elem);

    elem->pad = 0;

    /* decrease heap's count of allocated elements */
    elem->heap->alloc_count--;

    memset(ptr, 0, data_len);
Maybe data_len is not correct, either because of a bug or because your
application clobbered the malloc reserved regions in the element.
More likely, gcc is incorrectly optimizing this away.
https://wiki.sei.cmu.edu/confluence/display/c/MSC06-C.+Beware+of+compiler+optimizations
https://www.cryptologie.net/article/419/zeroing-memory-compiler-optimizations-and-memset_s/
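For reference, the concern in the links above is that a compiler may drop a memset() whose result it believes is never read. One commonly used workaround is to write through a volatile-qualified pointer. The sketch below only illustrates that technique; zero_volatile() is a hypothetical name, and nothing in this thread establishes that the memset in malloc_elem_free() is actually being optimized away.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Zero len bytes starting at ptr. The stores go through a
 * volatile-qualified pointer, so the compiler has to emit them even
 * if it cannot see any later read of the buffer.
 */
static void
zero_volatile(void *ptr, size_t len)
{
    volatile unsigned char *p = ptr;

    assert(ptr != NULL || len == 0);
    while (len--)
        *p++ = 0;
}
```

A byte-wise loop like this is slower than memset(), so it would be a debugging aid for ruling out the compiler rather than an obvious replacement.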
I tend to be very wary of blaming the compiler without exhausting any
other possibilities :) It used to work before without issues, so
presumably whatever is happening, our memset works correctly.
Andrew, you write:
<snip>
ALLOC 0x7fffa3264080-0x7fffa32640b8
The non-zeroed address is 16 bytes before it:
<snip>
Of course the memory *before* your pointer would not be zero: it is
preceded by a 64-byte malloc header, so what you're seeing is the
malloc header data (which doesn't go away if you free it; it will go
away only if it is merged with an adjacent free malloc element). So
I'm failing to see which problem you're describing, given that all
memory regions that are supposedly not zeroed lie outside of your
malloc-allocated memory.
I tried to highlight that the non-zeroed bytes belong to the malloc header of
the previously allocated memory region. Later that header becomes part of an
allocated region itself (a significantly bigger one, so merges happened):
>>>
then it is allocated as the part of bigger memory chunk
ALLOC 0x7fffa3245b80-0x7fffa3265fd8
which should contain zeros, but above values are still the same.
<<<
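Both observations can be checked with plain address arithmetic on the values from the trace. The sketch below is only that arithmetic: HDR_LEN is the 64-byte header size mentioned above, and in_range() and in_header_before() are hypothetical helper names.

```c
#include <assert.h>
#include <stdint.h>

/* Header length quoted in the thread, treated here as a plain constant. */
#define HDR_LEN 64u

/* Return 1 if addr lies in the half-open range [start, end). */
static int
in_range(uint64_t addr, uint64_t start, uint64_t end)
{
    assert(start <= end);
    return addr >= start && addr < end;
}

/* Return 1 if addr lies in the header that precedes the data pointer. */
static int
in_header_before(uint64_t addr, uint64_t data)
{
    return in_range(addr, data - HDR_LEN, data);
}
```

With the addresses from the trace, 0x7fffa3264070 falls inside the header preceding 0x7fffa3264080 and outside the first allocation, but inside the later allocation 0x7fffa3245b80-0x7fffa3265fd8, which is the case being reported.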
However, after careful analysis, I can see that there is one
possibility where memory is not zeroed on free: if the original
malloc element was padded, and there aren't any more adjacent free
elements, then newly allocated memory may contain the old pad header.
I'll submit a patch for you to try shortly.
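The padded-element case can be illustrated with a toy model. This is a deliberately simplified sketch, not DPDK code and not the content of the patch: struct toy_elem, TOY_HDR_LEN and the toy_* functions are hypothetical, and the layout is assumed to be a header, then pad bytes (whose tail holds the old pad header), then user data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_HDR_LEN 64u  /* header size assumed for this toy model */

/* Toy element: a header, then `pad` bytes, then the user data. */
struct toy_elem {
    uint8_t *base;  /* start of the element (its header) */
    size_t size;    /* total size, header and pad included */
    size_t pad;     /* bytes between the header and the data */
};

/* Zero only the user data, skipping the pad area (same range as the
 * ptr/data_len computation quoted earlier in the thread). */
static void
toy_free_data_only(struct toy_elem *e)
{
    assert(e->size >= TOY_HDR_LEN + e->pad);
    memset(e->base + TOY_HDR_LEN + e->pad, 0,
           e->size - e->pad - TOY_HDR_LEN);
    e->pad = 0;
}

/* Zero everything after the header, pad area included. */
static void
toy_free_with_pad(struct toy_elem *e)
{
    assert(e->size >= TOY_HDR_LEN);
    memset(e->base + TOY_HDR_LEN, 0, e->size - TOY_HDR_LEN);
    e->pad = 0;
}

/* Count non-zero bytes in the space an unpadded reallocation of this
 * element would hand out. */
static size_t
toy_stale_bytes(const struct toy_elem *e)
{
    size_t i, n = 0;

    for (i = TOY_HDR_LEN; i < e->size; i++)
        if (e->base[i] != 0)
            n++;
    return n;
}
```

In this model, freeing a 256-byte element with a 64-byte pad through toy_free_data_only() leaves 64 stale bytes visible to the next unpadded allocation, while toy_free_with_pad() leaves none.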
Patch:
http://patches.dpdk.org/patch/43196/
Yes, the patch fixes the problem I've observed. At least it passes the
simple test which I used for debugging.
I'll run more automated tests tonight.
Many thanks,
Andrew.