Hi Folks,

Based on the information at https://docs.trafficserver.apache.org/en/8.1.x/admin-guide/performance/index.en.html#memory-allocation, and with a fixed ram_cache.size setting (32GB), we expected memory usage to plateau after a couple of days of use. That is not what we saw in multiple production environments. Memory usage increases steadily over time, albeit at a slow pace once the system’s memory usage reaches 80-85% (there aren’t many other processes running on the system), until the ATS process is killed by the kernel (OOM kill) or by human intervention (server restart). On a system with 192GB of RAM (32GB used for a RAM disk, and ATS configured to use up to 32GB of RAM cache), with streaming throughput peaking at 10Gbps, ATS has to be killed/restarted about every 2 weeks. At peak hours, there are about 5k-6k client connections and fewer than 1k upstream connections (to mid-tier caches).

We did some analysis on the freelist dump (kill -USR1 pid) output (an example is attached) and found that the memory allocated in the ioBufAllocator[0-14] slots appears to be the main contributor to the total, and is also likely the source of the increase over time.
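For anyone who wants to repeat this kind of analysis, here is a minimal sketch of how the dump could be summarized. It is only an illustration: the helper name parse_freelist and the SAMPLE text (a few rows copied from the attached dump) are introduced here and are not part of ATS. It assumes each row has the form "allocated | in-use | type size | name", with the name sometimes wrapped onto the following line, as happened in the mail.

```python
import re

# One freelist row: allocated | in-use | type size | name (name may be empty
# when the mail client wrapped it onto the next line).
ROW = re.compile(r"^\s*(\d+)\s*\|\s*(\d+)\s*\|\s*(\d+)\s*\|\s*(\S*)\s*$")


def parse_freelist(text):
    """Return a list of (name, allocated, in_use, type_size) tuples."""
    rows = []
    pending = None  # numeric fields still waiting for a wrapped name
    for line in text.splitlines():
        m = ROW.match(line)
        if m:
            alloc, used, size, name = m.groups()
            if name:
                rows.append((name, int(alloc), int(used), int(size)))
                pending = None
            else:
                pending = (int(alloc), int(used), int(size))
        elif pending and line.strip().startswith("memory/"):
            rows.append((line.strip(), *pending))
            pending = None
    return rows


# A few rows copied from the attached dump, including wrapped ones.
SAMPLE = """\
         8388608000 |         4125097984 |    2097152 |
memory/ioBufAllocator[14]
        28689039360 |        24322768896 |    1048576 |
memory/ioBufAllocator[13]
          801112064 |          275644416 |      65536 | memory/ioBufAllocator[9]
            5201920 |            4293824 |         64 | memory/RamCacheLRUEntry
"""

rows = parse_freelist(SAMPLE)
iobuf = [r for r in rows if r[0].startswith("memory/ioBufAllocator")]
iobuf_allocated = sum(r[1] for r in iobuf)
iobuf_idle = sum(r[1] - r[2] for r in iobuf)  # allocated but not in use

print(f"ioBuf allocated: {iobuf_allocated / 1e9:.1f} GB")
print(f"ioBuf allocated but idle: {iobuf_idle / 1e9:.1f} GB")
```

Running the same function over two dumps taken a few days apart and comparing the per-slot numbers would show which slots are actually growing.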

In terms of configuration and plugin usage: in addition to setting ram_cache to 32GB, we also changed proxy.config.http.default_buffer_water_mark to INT 15000000 (from the default 64k) so that an entire video segment can be buffered on the upstream connection, which avoids a client starvation issue when the first client is on a slow-draining link. We also set proxy.config.cache.target_fragment_size to INT 4096 so that upstream chunked responses are written into cache storage promptly. There are no connection limits (the number of connections always appeared to be in the normal range). The inactivity timeout values are fairly low (<120 secs). The only plugin we use is header_rewrite.so. No HTTPS, no HTTP/2.
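To give a feel for what the water mark change could mean in the worst case, here is a back-of-the-envelope sketch. It assumes, purely for illustration, that every client connection holds a full water mark of buffered data at the same moment; that is an upper bound, not measured behavior, and it ignores how the allocator rounds requests up to its slot sizes. The function name worst_case_bytes is made up for this example.

```python
# Values quoted in this post.
WATER_MARK = 15_000_000         # proxy.config.http.default_buffer_water_mark
DEFAULT_WATER_MARK = 64 * 1024  # "default 64k" as stated above


def worst_case_bytes(connections, water_mark):
    """Upper bound if every connection buffers a full water mark at once."""
    return connections * water_mark


for conns in (5_000, 6_000):
    tuned = worst_case_bytes(conns, WATER_MARK)
    default = worst_case_bytes(conns, DEFAULT_WATER_MARK)
    print(f"{conns} connections: {tuned / 1e9:.1f} GB tuned "
          f"vs {default / 1e9:.2f} GB default")
```

Under that assumption the tuned setting allows tens of GB of buffered data at peak connection counts, which is the same order of magnitude as the ioBufAllocator totals in the dump.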

I would appreciate it if someone could shed some light on how to track this down further, and offer any practical tips for short-term mitigation. In particular:
1. Inside HttpSM, which states need to allocate or re-use ioBufs? Is there a way to put a ceiling on each slot or on the total allocation?
2. Is the ioBuf allocation ceiling a function of total connections, in which case I should set a connection limit?
3. memory/RamCacheLRUEntry shows 5.2M. How is this related to the actual ram_cache usage reported by traffic_top (32GB used)?
4. At the point of the freelist dump, the ATS process size was 78GB, the freelist total showed about 44GB, and 32GB of ram_cache was in use (as reported by traffic_top). Assuming these two numbers do not overlap, and knowing that the in-memory (disk) directory entry cache takes at least 10GB, these numbers do not add up: 44+32+10 = 86 > 78. What am I missing?
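The arithmetic behind questions 3 and 4 can be written out as a small sketch. The figures come from this post and the attached dump; the second reading, where the RAM cache data is already counted inside the freelist total, is only a hypothesis to test the numbers against, not something confirmed here.

```python
GB = 1e9  # the quoted figures are rounded, so decimal GB is close enough

process_size = 78 * GB
freelist_total = 44_182_686_784  # TOTAL "Allocated" row of the dump
ram_cache_used = 32 * GB         # as reported by traffic_top
directory_min = 10 * GB          # in-memory directory, "at least 10GB"

# Reading 1: freelist, RAM cache and directory are disjoint.
disjoint_sum = freelist_total + ram_cache_used + directory_min

# Reading 2 (hypothetical): RAM cache data is already inside the freelist total.
overlapping_sum = freelist_total + directory_min

print(f"disjoint:    {disjoint_sum / GB:.1f} GB vs process {process_size / GB:.0f} GB")
print(f"overlapping: {overlapping_sum / GB:.1f} GB vs process {process_size / GB:.0f} GB")

# Question 3: RamCacheLRUEntry rows are 64-byte records, so the in-use bytes
# give a count of entries rather than the amount of cached data.
entry_size = 64
entries_in_use = 4_293_824 // entry_size
avg_bytes_per_entry = ram_cache_used / entries_in_use

print(f"{entries_in_use} entries in use, "
      f"about {avg_bytes_per_entry / 1e3:.0f} KB of cache per entry")
```

Only the first reading exceeds the process size, so whether the two pools overlap is the key point to confirm.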


Thanks,
-Hongfei
     Allocated      |        In-Use      | Type Size  |   Free List Name
--------------------|--------------------|------------|----------------------------------
         8388608000 |         4125097984 |    2097152 | memory/ioBufAllocator[14]
        28689039360 |        24322768896 |    1048576 | memory/ioBufAllocator[13]
         3573547008 |         2275934208 |     524288 | memory/ioBufAllocator[12]
          947912704 |          637534208 |     262144 | memory/ioBufAllocator[11]
          679477248 |          454950912 |     131072 | memory/ioBufAllocator[10]
          801112064 |          275644416 |      65536 | memory/ioBufAllocator[9]
          473956352 |           99221504 |      32768 | memory/ioBufAllocator[8]
           33554432 |            7667712 |      16384 | memory/ioBufAllocator[7]
           64749568 |           20176896 |       8192 | memory/ioBufAllocator[6]
          120586240 |           77496320 |       4096 | memory/ioBufAllocator[5]
             524288 |                  0 |       2048 | memory/ioBufAllocator[4]
             131072 |                  0 |       1024 | memory/ioBufAllocator[3]
             131072 |               1536 |        512 | memory/ioBufAllocator[2]
              65536 |               3584 |        256 | memory/ioBufAllocator[1]
           45350912 |              92416 |        128 | memory/ioBufAllocator[0]
            2285568 |             670560 |         96 | memory/eventAllocator
            2766240 |            1918400 |         80 | memory/mutexAllocator
           47153152 |            1697920 |         64 | memory/ioBlockAllocator
           20954880 |            4207968 |         48 | memory/ioDataAllocator
            5940480 |            5809200 |        240 | memory/ioAllocator
                  0 |                  0 |        432 | memory/socksAllocator
                  0 |                  0 |        128 | memory/udpReadContAllocator
                  0 |                  0 |        160 | memory/udpPacketAllocator
           16619200 |            9008208 |        752 | memory/netVCAllocator
                  0 |                  0 |        128 | memory/UDPIOEventAllocator
                  0 |                  0 |        880 | memory/sslNetVCAllocator
            5201920 |            4293824 |         64 | memory/RamCacheLRUEntry
                  0 |                  0 |         96 | memory/RamCacheCLFUSEntry
             245760 |             239200 |        160 | memory/openDirEntry
                  0 |                  0 |         48 | memory/evacuationKey
               8192 |                  0 |         64 | memory/cacheRemoveCont
            1044480 |            1032672 |         96 | memory/evacuationBlock
           10431200 |           10361344 |        944 | memory/cacheVConnection
                  0 |                  0 |         48 | memory/ClusterVConnectionCache::Entry
                  0 |                  0 |        576 | memory/cacheContAllocator
                  0 |                  0 |         32 | memory/byteBankAllocator
                  0 |                  0 |        592 | memory/clusterVCAllocator
                  0 |                  0 |        112 | memory/inControlAllocator
                  0 |                  0 |        128 | memory/outControlAllocator
                  0 |                  0 |         16 | memory/DNSRequestDataAllocator
             135424 |              33856 |      33856 | memory/dnsBufAllocator
             163840 |                  0 |       1280 | memory/dnsEntryAllocator
               4096 |                320 |         16 | memory/expiryQueueEntry
               8192 |               1280 |         64 | memory/refCountCacheHashingValueAllocator
                  0 |                  0 |         96 | memory/hostDBFileContAllocator
           22568960 |               2320 |       2320 | memory/hostDBContAllocator
                  0 |                  0 |        128 | memory/OneWayTunnelAllocator
           54525952 |           49588224 |       2048 | memory/hdrStrHeap
           63438848 |           46405632 |       2048 | memory/hdrHeap
             622592 |              34304 |        256 | memory/httpCacheAltAllocator
                  0 |                  0 |         48 | memory/CongestRequestParamAllocator
                  0 |                  0 |        160 | memory/CongestionDBContAllocator
                  0 |                  0 |        128 | memory/RemapPluginsAlloc
                  0 |                  0 |        960 | memory/http2ClientSessionAllocator
                  0 |                  0 |        960 | memory/http2StreamAllocator
                  0 |                  0 |         48 | memory/CacheLookupHttpConfigAllocator
            1179648 |            1179264 |        192 | memory/httpServerSessionAllocator
           80621568 |            5007744 |       7776 | memory/httpSMAllocator
           15253504 |           15223040 |        896 | memory/http1ClientSessionAllocator
                  0 |                  0 |        128 | memory/socksProxyAllocator
               4096 |                  0 |         32 | memory/MIMEFieldSDKHandle
                  0 |                  0 |        240 | memory/INKVConnAllocator
             921600 |              34656 |         96 | memory/INKContAllocator
            1224704 |              43840 |         32 | memory/apiHookAllocator
                  0 |                  0 |        592 | memory/ICPRequestCont_allocator
                  0 |                  0 |        128 | memory/ICPPeerReadContAllocator
                  0 |                  0 |        432 | memory/PeerReadDataAllocator
                  0 |                  0 |        512 | memory/FetchSMAllocator
           10616832 |             659456 |       1024 | memory/ArenaBlock
        44182686784 |        32454043824 |            | TOTAL
-----------------------------------------------------------------------------------------
