Hi Damjan, Dave,

I have tried running the mheap validation CLI ("test heap-validate now"), and it leads to a crash, which means the mheap is already corrupted while the system is coming up.

I then created an mheap_validation_check() function and called it from various places to narrow down which code path is corrupting the mheap.
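For completeness, the helper is essentially just a thin wrapper around the vppinfra validator. A minimal sketch of it (assuming the 18.01 mheap API; mheap_validate() walks the heap and panics internally if an element or free list is inconsistent, which matches the panic below):

    #include <vppinfra/mem.h>
    #include <vppinfra/mheap.h>

    /* Validate the calling thread's current heap; mheap_validate()
     * panics if the heap structure is inconsistent. */
    static void
    mheap_validation_check (void)
    {
      void *heap = clib_mem_get_heap ();
      mheap_validate (heap);
    }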
I found that the code below, which is called for each of the 12 workers, causes the issue once fm->fp_per_worker_sessions is 1M, but works fine when we set it to 524288:

    mheap_validation_check ();
    pool_alloc_aligned (wk->fp_sessions_pool, fm->fp_per_worker_sessions,
                        CLIB_CACHE_LINE_BYTES);
    mheap_validation_check ();   /* --> panic here */

Any suggestion/advice is really helpful.

Thanks,
Chetan Bhasin

On Tue, Jan 29, 2019 at 5:35 PM Damjan Marion <dmar...@me.com> wrote:

> Please search this mailing list archive; Dave provided some hints some
> time ago.
>
> 90M is not terribly high, but it can also be a victim of something else
> holding memory.
>
> On 29 Jan 2019, at 12:54, chetan bhasin <chetan.bhasin...@gmail.com> wrote:
>
> Hi Damjan,
>
> Thanks for the reply.
>
> What would be a typical way of debugging a corrupt vector pointer? For
> example, can we set a watchpoint on some field in the vector header that
> is most likely to get disturbed, so that we can nab whoever is corrupting
> the vector?
>
> With 1M entries, do you think 90M is an issue?
>
> Clearly we have a lurking bug somewhere.
>
> Thanks,
> Chetan Bhasin
>
> On Tue, Jan 29, 2019, 16:53 Damjan Marion <dmar...@me.com> wrote:
>>
>> Typically this happens when you run out of memory (main heap size) or
>> you have a corrupted vector pointer.
>>
>> It would be easier to read your traceback if it were captured with a
>> debug image, but according to frame 11 your vector is already 90 MB big.
>> Is this expected?
>>
>> On 29 Jan 2019, at 11:31, chetan bhasin <chetan.bhasin...@gmail.com>
>> wrote:
>>
>> Hello everyone,
>>
>> I know 18.01 is not supported now, but I just want to understand what
>> could be the reason for the crash below. We are adding entries to a pool
>> using pool_get_aligned, which causes a vec_resize. The issue appears
>> when the pool reaches around 1M entries. Is it due to limited memory,
>> memory corruption, or something else?
>>
>> Core was generated by `bin/vpp -c co'.
>> Program terminated with signal 6, Aborted.
>> #0  0x00002ab534028207 in __GI_raise (sig=sig@entry=6)
>>     at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
>> 56          return INLINE_SYSCALL (tgkill, 3, pid, selftid, sig);
>> Missing separate debuginfos, use: debuginfo-install OPWVmepCR-7.0-el7.x86_64
>> (gdb) bt
>> #0  0x00002ab534028207 in __GI_raise (sig=sig@entry=6)
>>     at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
>> #1  0x00002ab5340298f8 in __GI_abort () at abort.c:90
>> #2  0x0000000000405ea9 in os_panic ()
>>     at /bfs-build/build-area.42/builds/LinuxNBngp_7.X_RH7/2019-01-07-2044/third-party/vpp/vpp_1801/build-data/../src/vpp/vnet/main.c:266
>> #3  0x00002ab53213aad9 in unix_signal_handler (signum=<optimized out>,
>>     si=<optimized out>, uc=<optimized out>)
>>     at vpp/vpp_1801/build-data/../src/vlib/unix/main.c:126
>> #4  <signal handler called>
>> #5  _mm_storeu_si128 (__B=..., __P=<optimized out>)
>>     at /usr/lib/gcc/x86_64-redhat-linux/4.8.5/include/emmintrin.h:702
>> #6  clib_mov16 (src=<optimized out>, dst=<optimized out>)
>>     at vpp/vpp_1801/build-data/../src/vppinfra/memcpy_sse3.h:60
>> #7  clib_mov32 (src=<optimized out>, dst=<optimized out>)
>>     at vpp/vpp_1801/build-data/../src/vppinfra/memcpy_sse3.h:66
>> #8  clib_mov64 (src=0x2ab62d1b04e0 "", dst=0x2ab5426e1fe0 "")
>>     at vpp/vpp_1801/build-data/../src/vppinfra/memcpy_sse3.h:74
>> #9  clib_mov128 (src=0x2ab62d1b04e0 "", dst=0x2ab5426e1fe0 "")
>>     at vpp/vpp_1801/build-data/../src/vppinfra/memcpy_sse3.h:80
>> #10 clib_mov256 (src=0x2ab62d1b04e0 "", dst=0x2ab5426e1fe0 "")
>>     at vpp/vpp_1801/build-data/../src/vppinfra/memcpy_sse3.h:87
>> #11 clib_memcpy (n=90646888, src=0x2ab62d1b04e0, dst=0x2ab5426e1fe0)
>>     at vpp/vpp_1801/build-data/../src/vppinfra/memcpy_sse3.h:325
>> #12 vec_resize_allocate_memory (v=<optimized out>,
>>     length_increment=length_increment@entry=1, data_bytes=<optimized out>,
>>     header_bytes=<optimized out>, header_bytes@entry=48,
>>     data_align=data_align@entry=64)
>>     at vpp/vpp_1801/build-data/../src/vppinfra/vec.c:95
>> #13 0x00002ab7b74a61c1 in _vec_resize (data_align=64, header_bytes=48,
>>     data_bytes=<optimized out>, length_increment=1, v=<optimized out>)
>>     at include/vppinfra/vec.h:142
>> #14 xxx_allocate_flow (fm=0x2ab7b76c8fc0 <fp_main>)
>>     at vpp/plugins/src/fastpath/fastpath.c:1502
>>
>> Regards,
>> Chetan Bhasin
>>
>> --
>> Damjan
>
> --
> Damjan
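P.S. Some back-of-the-envelope math on frame 11 above (my own arithmetic; the per-element size is only an estimate derived from the backtrace numbers):

    #include <stdint.h>
    #include <stdio.h>

    int
    main (void)
    {
      /* Numbers from frame 11 of the backtrace; the per-element size
       * computed below is my estimate, not a measured value. */
      uint64_t copy_bytes = 90646888ULL; /* n passed to clib_memcpy */
      uint64_t n_entries = 1048576ULL;   /* fp_per_worker_sessions = 1M */
      printf ("~%llu bytes per element\n",
              (unsigned long long) (copy_bytes / n_entries)); /* ~86 */
      return 0;
    }

Since vec_resize_allocate_memory() allocates the new, larger block before clib_memcpy()ing the old contents across, the old ~90 MB vector and the new allocation are both live on the main heap during the grow. With 12 workers doing the same thing on one heap, running out of memory (as Damjan suggested) looks plausible.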
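P.P.S. On the watchpoint idea from my earlier mail above, the plan is roughly the following gdb session (untested, and it assumes the 18.01 vec_header_t layout, where a u32 len field sits immediately before the vector data):

    (gdb) set $h = (vec_header_t *) ((u8 *) wk->fp_sessions_pool - sizeof (vec_header_t))
    (gdb) print $h->len
    (gdb) watch -l $h->len
    (gdb) continue

If the corruption is a stray write into the vector header rather than a mangled pool pointer, the hardware watchpoint should stop at the offending store.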