I have done some memory-leak debugging with jemalloc before; the experience is written up here - https://cwiki.apache.org/confluence/display/TS/Presentations+-+2017?preview=/70255385/74684709/ATSSummit_jemalloc.pptx
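
If useful, here is a rough sketch of turning the jemalloc heap profiler on. It assumes traffic_server is linked against a jemalloc built with --enable-prof; the dump prefix, interval, and binary path are just illustrative:

    # dump a heap profile every 2^30 (~1 GiB) bytes allocated
    export MALLOC_CONF="prof:true,prof_prefix:jeprof.out,lg_prof_interval:30"
    # ... run traffic_server under load for a while ...
    # then summarize the dumped profiles against the binary
    jeprof --show_bytes /usr/local/bin/traffic_server jeprof.out.*.heap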
Perhaps we can then see what components of the code are contributing to the memory buildup.

Thanks,
Kit

On Thu, Mar 15, 2018 at 3:57 PM, Chou, Peter <pbc...@labs.att.com> wrote:
> Hi All,
>
> We have been experiencing a slow memory leak with ATS 6.2.1 which requires a
> process restart (before memory exhaustion) on the order of weeks. We have
> tried to gather information with a debug build (compiled with --enable-debug
> and --with-tcmalloc-lib), but the process seems to be unstable and crashes
> after a couple of days with these build options. Since the heap-check seems
> to output only on normal program exit, we have manually stopped the process
> after a couple of days with the following results --
>
> (pprof) top50
> Total: 737811 objects
> 737376  99.9%  99.9%  737376  99.9% ats_malloc
>         /usr/src/git/trafficserver/lib/ts/ink_memory.cc:59
>    334   0.0% 100.0%     667   0.1% BaseLogFile::open_file
>         /usr/src/git/trafficserver/lib/ts/BaseLogFile.cc:326
>     59   0.0% 100.0%      59   0.0% ClusterVConnectionCache::init
>         /usr/src/git/trafficserver/iocore/cluster/ClusterCache.cc:225
>     41   0.0% 100.0%      41   0.0% ats_memalign
>         /usr/src/git/trafficserver/lib/ts/ink_memory.cc:105
>      1   0.0% 100.0%       2   0.0% BaseLogFile::open_file
>         /usr/src/git/trafficserver/lib/ts/BaseLogFile.cc:320
>      0   0.0% 100.0%  728003  98.7% AIOCallbackInternal::io_complete
>         /usr/src/git/trafficserver/iocore/cache/../../iocore/aio/P_AIO.h:117
> ...
>      0   0.0% 100.0%  737024  99.9% CacheVC::handleReadDone
>         /usr/src/git/trafficserver/iocore/cache/Cache.cc:2403
> ...
>      0   0.0% 100.0%  737803 100.0% Continuation::handleEvent
>         /usr/src/git/trafficserver/proxy/../iocore/eventsystem/I_Continuation.h:153
> ...
>
> -- So does this look like a possible memory-leak signature (with the
> cache-related AIO)? Appreciate any recommendations and insights for this
> issue.
>
> Thanks,
> Peter
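
P.S. On the tcmalloc side: since the gperftools leak checker only reports at a
clean exit, the gperftools heap profiler may be easier to work with here, as it
dumps .heap files periodically while the process keeps running. Roughly, with
illustrative paths:

    export HEAPPROFILE=/tmp/ats.hprof
    # ... run traffic_server; a profile is written roughly every 1 GiB allocated ...
    pprof --inuse_objects /usr/local/bin/traffic_server /tmp/ats.hprof.0001.heap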