Hello Olivier,

On Wednesday 29 March 2017 01:48 PM, Olivier Matz wrote:
On Tue, 28 Mar 2017 17:12:47 +0530, Shreyansh Jain <shreyansh.j...@nxp.com> wrote:
Hello Olivier,

On Friday 24 March 2017 09:52 PM, Olivier Matz wrote:
[..]

I tried to pass the mempool autotest, and it issues a segfault.
I think the libraries are missing in rte.app.mk, so no handler is
registered.

I have been trying to simulate the segfault that you are referring to
above. But, I think it should not be the case. If a mempool handler is
not registered (as librte_mempool_ring was not included in
mk/rte.app.mk, so, no "ring_mp_mc"), the caller would get error.

The mempool_autotest is reporting:

--->8--
RTE>>mempool_autotest
cannot allocate mp_nocache mempool
Test Failed
--->8--

Here are the reproduction steps:


git clone http://dpdk.org/git/dpdk
cd dpdk/
wget -O - http://dpdk.org/dev/patchwork/patch/21986/mbox | git am -
wget -O - http://dpdk.org/dev/patchwork/patch/21985/mbox | git am -
make config T=x86_64-native-linuxapp-gcc
make -j32 test-build
echo 128 > /sys/devices/system/node/node0/hugepages/hugepages-2048kB/nr_hugepages
mkdir -p /mnt/huge
mount -t hugetlbfs none /mnt/huge
echo mempool_autotest | ./build/app/test --
# segfault

Thanks for the steps. I was able to reproduce this.
I don't know why it didn't work earlier.
One more comment below...


# replay with debug
make -j32 test-build
make -j32 EXTRA_CFLAGS="-O0 -g" test-build
ulimit -c unlimited
echo mempool_autotest | ./build/app/test --
# segfault + core dump
gdb -c core ./build/app/test

(gdb) bt
#1  0x000000000064dead in rte_mempool_ops_alloc (mp=0x7f8816abdb40)
    at /root/dpdk/lib/librte_mempool/rte_mempool_ops.c:101
#2  0x000000000064c1e7 in rte_mempool_populate_phys (mp=0x7f8816abdb40,
    vaddr=0x7f880987a800 <error: Cannot access memory at address 0x7f880987a800>,
    paddr=6958852096, len=26761152, free_cb=0x64c032 <rte_mempool_memchunk_mz_free>,
    opaque=0x7f8822334d4c) at /root/dpdk/lib/librte_mempool/rte_mempool.c:359
#3  0x000000000064c9db in rte_mempool_populate_default (mp=0x7f8816abdb40)
    at /root/dpdk/lib/librte_mempool/rte_mempool.c:572

I think the place where you suggested adding the code is not the right one. The problem is not in rte_mempool_ops_alloc, where you had suggested the check for NULL.

The problem is in rte_mempool_create, where the return value of rte_mempool_set_ops_byname is not checked.

When the handler libraries are not linked into a static build, rte_mempool_set_ops_byname fails and returns an error, which rte_mempool_create does not handle; it goes on to call rte_mempool_ops_alloc and eventually segfaults.
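
To illustrate the point, here is a minimal, self-contained sketch of the check I have in mind. It is not the DPDK code: toy_ops, toy_pool, toy_register, toy_set_ops_byname, toy_create and toy_ring_alloc are all hypothetical stand-ins, and the error values are assumptions chosen for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for the mempool structures. */
struct toy_pool;

struct toy_ops {
	const char *name;
	int (*alloc)(struct toy_pool *mp);
};

struct toy_pool {
	const struct toy_ops *ops; /* NULL until a handler is selected */
};

/* Registered handlers; stays empty when no handler library is linked in. */
#define TOY_MAX_OPS 4
static const struct toy_ops *toy_ops_table[TOY_MAX_OPS];
static size_t toy_ops_count;

int toy_register(const struct toy_ops *ops)
{
	if (toy_ops_count == TOY_MAX_OPS)
		return -ENOSPC;
	toy_ops_table[toy_ops_count++] = ops;
	return 0;
}

/* Returns 0 on success, -EINVAL when no handler has that name. */
int toy_set_ops_byname(struct toy_pool *mp, const char *name)
{
	size_t i;

	assert(mp != NULL && name != NULL);
	for (i = 0; i < toy_ops_count; i++) {
		if (strcmp(toy_ops_table[i]->name, name) == 0) {
			mp->ops = toy_ops_table[i];
			return 0;
		}
	}
	return -EINVAL;
}

/* A trivial handler callback used only for the example. */
int toy_ring_alloc(struct toy_pool *mp)
{
	(void)mp;
	return 0;
}

/* Propagate the lookup failure instead of calling through mp->ops. */
int toy_create(struct toy_pool *mp, const char *handler)
{
	int ret;

	mp->ops = NULL;
	ret = toy_set_ops_byname(mp, handler);
	if (ret < 0)
		return ret; /* no handler registered: fail early */
	return mp->ops->alloc(mp);
}
```

With an empty table, toy_create(&mp, "ring_mp_mc") returns the lookup error and never reaches the alloc callback; once a handler with that name is registered, the same call goes through.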


#4  0x000000000064d3d4 in rte_mempool_create (name=0x9b1ff0 "test_nocache", n=12671,
    elt_size=2048, cache_size=0, private_data_size=0, mp_init=0x0, mp_init_arg=0x0,
    obj_init=0x49f309 <my_obj_init>, obj_init_arg=0x0, socket_id=-1, flags=0)
    at /root/dpdk/lib/librte_mempool/rte_mempool.c:895
#5  0x00000000004a20ed in test_mempool () at /root/dpdk/test/test/test_mempool.c:519
#6  0x0000000000435189 in cmd_autotest_parsed (parsed_result=0x7ffe55006420,
    cl=0x7c87090, data=0x0) at /root/dpdk/test/test/commands.c:103
#7  0x00000000006749df in cmdline_parse (cl=0x7c87090,
    buf=0x7c870d8 "mempool_autotest\n")
    at /root/dpdk/lib/librte_cmdline/cmdline_parse.c:359
(gdb) up
#1  0x000000000064dead in rte_mempool_ops_alloc (mp=0x7f8816abdb40)
    at /root/dpdk/lib/librte_mempool/rte_mempool_ops.c:101
101             return ops->alloc(mp);
(gdb) print ops
$1 = (struct rte_mempool_ops *) 0x4e69c00 <rte_mempool_ops_table+64>
(gdb) print *ops
$2 = {name = '\000' <repeats 31 times>, alloc = 0x0, free = 0x0, enqueue = 0x0,
  dequeue = 0x0, get_count = 0x0}


Regards,
Olivier




Adding the following code in lib/librte_mempool/rte_mempool_ops.c
fixes the crash.

        ops = rte_mempool_get_ops(mp->ops_index);
+       if (ops == NULL || ops->alloc == NULL)
+               return -ENOTSUP;
        return ops->alloc(mp);
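
For reference, the effect of that guard can be sketched in isolation. This is only a toy model under stated assumptions: demo_pool, demo_ops, demo_ops_alloc and demo_alloc_ok are hypothetical names, and the zero-filled ops slot mimics what the gdb output above shows (all callbacks 0x0).

```c
#include <errno.h>
#include <stddef.h>

struct demo_pool;

/* Hypothetical ops slot; every callback is NULL when nothing registered. */
struct demo_ops {
	int (*alloc)(struct demo_pool *mp);
};

struct demo_pool {
	struct demo_ops ops;
	int allocated; /* set by the example callback below */
};

/* Example callback, used only to show the non-failing path. */
int demo_alloc_ok(struct demo_pool *mp)
{
	mp->allocated = 1;
	return 0;
}

/* Refuse to call through a NULL function pointer. */
int demo_ops_alloc(struct demo_pool *mp)
{
	const struct demo_ops *ops = &mp->ops;

	if (ops->alloc == NULL)
		return -ENOTSUP;
	return ops->alloc(mp);
}
```

A zero-initialized demo_pool makes demo_ops_alloc return -ENOTSUP instead of jumping to address 0; after setting ops.alloc to demo_alloc_ok, the call succeeds.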

If you think the above explanation suffices, I will push a patch that handles, in rte_mempool_create, the error returned by rte_mempool_set_ops_byname, rather than the change you originally suggested above.


Can you tell me in which case your code reached
rte_mempool_ops_alloc() and segfaulted?

In my case, librte_mempool_ring and librte_mempool_stack are not added
to mk/rte.app.mk and it is static compilation.


Now that drivers are not linked to the mempool library, it can
happen that there is no handler. Could you please add this patch in your
patchset?

Yes, once I can get this issue reproduced, because I think there is one
more place where similar code should go (rte_mempool_ops_getcount).
As far as I can see, this would only happen if rte_mempool_xmem_create
is called and then alloc is called directly. That is not happening for
mempool_autotest.

-
Shreyansh



