Hello, I have an application running under Tomcat 7.0.23 that periodically crashes. The java process running Tomcat keeps growing in memory until the Linux oom-killer kills it. I do not get an OutOfMemoryError because the memory leak is not in the Java heap. In fact, the heap seems to be using only 4GB of the 6GB maximum specified in the -Xmx parameter. Nevertheless, the total memory held by the java process keeps growing, up to 16GB, at which point the OS kills the process. I haven't been able to find the conditions that trigger the problem, so I cannot reproduce it on demand. Nevertheless it keeps occurring: sometimes at midnight with no user activity, sometimes in the middle of a busy day.
The web application uses Spring/Postgres/Mongo. I know this is not a Tomcat-related problem, but some of you may have experienced a similar problem and may have some suggestions on how to troubleshoot it. I have already read many of the links that come up when searching the web for "java invoked oom-killer", but I still don't have any clue about what causes the problem or how to solve it. It looks like a memory leak in native code, not Java code, so my usual Java toolset is not useful. Tomcat runs behind nginx on an EC2 instance. The application uses Sun (now Oracle) JDK 1.6. Any suggestions on what I should look at?

-Jorge

Jun 4 16:02:49 ip-10-83-35-78 kernel: [1468800.179218] 3795110 pages non-shared
Jun 5 06:50:07 ip-10-83-35-78 rsyslogd: [origin software="rsyslogd" swVersion="4.2.0" x-pid="599" x-info="http://www.rsyslog.com"] rsyslogd was HUPed, type 'lightweight'.
Jun 5 22:06:40 ip-10-83-35-78 kernel: [1576977.209487] java invoked oom-killer: gfp_mask=0x201da, order=0, oom_adj=0
Jun 5 22:06:41 ip-10-83-35-78 kernel: [1576977.209492] java cpuset=/ mems_allowed=0
Jun 5 22:06:41 ip-10-83-35-78 kernel: [1576977.209496] Pid: 15618, comm: java Not tainted 2.6.32-317-ec2 #36-Ubuntu
Jun 5 22:06:41 ip-10-83-35-78 kernel: [1576977.209498] Call Trace:
Jun 5 22:06:41 ip-10-83-35-78 kernel: [1576977.209507] [<ffffffff8107cbbc>] ? cpuset_print_task_mems_allowed+0x8c/0xc0
Jun 5 22:06:41 ip-10-83-35-78 kernel: [1576977.209513] [<ffffffff810b1723>] oom_kill_process+0xe3/0x210
Jun 5 22:06:41 ip-10-83-35-78 kernel: [1576977.209516] [<ffffffff810b18a0>] __out_of_memory+0x50/0xb0
Jun 5 22:06:41 ip-10-83-35-78 kernel: [1576977.209519] [<ffffffff810b195f>] out_of_memory+0x5f/0xc0
Jun 5 22:06:41 ip-10-83-35-78 kernel: [1576977.209523] [<ffffffff810b4641>] __alloc_pages_slowpath+0x561/0x580
Jun 5 22:06:41 ip-10-83-35-78 kernel: [1576977.209526] [<ffffffff810b47d1>] __alloc_pages_nodemask+0x171/0x180
Jun 5 22:06:41 ip-10-83-35-78 kernel: [1576977.209530] [<ffffffff810b76f7>] __do_page_cache_readahead+0xd7/0x220
Jun 5 22:06:41 ip-10-83-35-78 kernel: [1576977.209534] [<ffffffff810b785c>] ra_submit+0x1c/0x20
Jun 5 22:06:41 ip-10-83-35-78 kernel: [1576977.209536] [<ffffffff810b01fe>] filemap_fault+0x3fe/0x450
Jun 5 22:06:41 ip-10-83-35-78 kernel: [1576977.209541] [<ffffffff810cbef0>] __do_fault+0x50/0x680
Jun 5 22:06:41 ip-10-83-35-78 kernel: [1576977.209547] [<ffffffff8102afdb>] ? __dequeue_entity+0x2b/0x50
Jun 5 22:06:41 ip-10-83-35-78 kernel: [1576977.209550] [<ffffffff810cde30>] handle_mm_fault+0x260/0x4f0
Jun 5 22:06:41 ip-10-83-35-78 kernel: [1576977.209555] [<ffffffff814b3ab7>] do_page_fault+0x147/0x390
Jun 5 22:06:41 ip-10-83-35-78 kernel: [1576977.209558] [<ffffffff814b18e8>] page_fault+0x28/0x30
Jun 5 22:06:41 ip-10-83-35-78 kernel: [1576977.209560] Mem-Info:
Jun 5 22:06:41 ip-10-83-35-78 kernel: [1576977.209561] DMA per-cpu:
Jun 5 22:06:41 ip-10-83-35-78 kernel: [1576977.209563] CPU 0: hi: 0, btch: 1 usd: 0
Jun 5 22:06:41 ip-10-83-35-78 kernel: [1576977.209565] CPU 1: hi: 0, btch: 1 usd: 0
Jun 5 22:06:41 ip-10-83-35-78 kernel: [1576977.209567] CPU 2: hi: 0, btch: 1 usd: 0
Jun 5 22:06:41 ip-10-83-35-78 kernel: [1576977.209569] CPU 3: hi: 0, btch: 1 usd: 0
Jun 5 22:06:41 ip-10-83-35-78 kernel: [1576977.209570] DMA32 per-cpu:
Jun 5 22:06:41 ip-10-83-35-78 kernel: [1576977.209572] CPU 0: hi: 155, btch: 38 usd: 44
Jun 5 22:06:41 ip-10-83-35-78 kernel: [1576977.209574] CPU 1: hi: 155, btch: 38 usd: 0
Jun 5 22:06:41 ip-10-83-35-78 kernel: [1576977.209575] CPU 2: hi: 155, btch: 38 usd: 0
Jun 5 22:06:41 ip-10-83-35-78 kernel: [1576977.209577] CPU 3: hi: 155, btch: 38 usd: 0
Jun 5 22:06:41 ip-10-83-35-78 kernel: [1576977.209578] Normal per-cpu:
Jun 5 22:06:41 ip-10-83-35-78 kernel: [1576977.209580] CPU 0: hi: 155, btch: 38 usd: 32
Jun 5 22:06:41 ip-10-83-35-78 kernel: [1576977.209582] CPU 1: hi: 155, btch: 38 usd: 0
Jun 5 22:06:41 ip-10-83-35-78 kernel: [1576977.209584] CPU 2: hi: 155, btch: 38 usd: 0
Jun 5 22:06:41 ip-10-83-35-78 kernel: [1576977.209585] CPU 3: hi: 155, btch: 38 usd: 0
Jun 5 22:06:41 ip-10-83-35-78 kernel: [1576977.209590] active_anon:3513144 inactive_anon:266669 isolated_anon:0
Jun 5 22:06:41 ip-10-83-35-78 kernel: [1576977.209591] active_file:101 inactive_file:15 isolated_file:0
Jun 5 22:06:41 ip-10-83-35-78 kernel: [1576977.209592] unevictable:16 dirty:2 writeback:0 unstable:0
Jun 5 22:06:41 ip-10-83-35-78 kernel: [1576977.209593] free:19129 slab_reclaimable:959 slab_unreclaimable:2729
Jun 5 22:06:41 ip-10-83-35-78 kernel: [1576977.209594] mapped:0 shmem:52 pagetables:0 bounce:0
Jun 5 22:06:41 ip-10-83-35-78 kernel: [1576977.209600] DMA free:16384kB min:16kB low:20kB high:24kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:16160kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
Jun 5 22:06:41 ip-10-83-35-78 kernel: [1576977.209604] lowmem_reserve[]: 0 4024 15134 15134
Jun 5 22:06:41 ip-10-83-35-78 kernel: [1576977.209611] DMA32 free:48712kB min:4184kB low:5228kB high:6276kB active_anon:3632260kB inactive_anon:24572kB active_file:224kB inactive_file:28kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:4120800kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:4kB slab_reclaimable:472kB slab_unreclaimable:240kB kernel_stack:80kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:107 all_unreclaimable? no
Jun 5 22:06:41 ip-10-83-35-78 kernel: [1576977.209616] lowmem_reserve[]: 0 0 11109 11109
Jun 5 22:06:41 ip-10-83-35-78 kernel: [1576977.209623] Normal free:11420kB min:11548kB low:14432kB high:17320kB active_anon:10420316kB inactive_anon:1042104kB active_file:180kB inactive_file:32kB unevictable:64kB isolated(anon):0kB isolated(file):0kB present:11376528kB mlocked:64kB dirty:8kB writeback:0kB mapped:0kB shmem:204kB slab_reclaimable:3364kB slab_unreclaimable:10676kB kernel_stack:1960kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:99 all_unreclaimable? no
Jun 5 22:06:41 ip-10-83-35-78 kernel: [1576977.209628] lowmem_reserve[]: 0 0 0 0
Jun 5 22:06:41 ip-10-83-35-78 kernel: [1576977.209631] DMA: 0*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 4*4096kB = 16384kB
Jun 5 22:06:41 ip-10-83-35-78 kernel: [1576977.209639] DMA32: 336*4kB 194*8kB 141*16kB 97*32kB 73*64kB 57*128kB 28*256kB 18*512kB 6*1024kB 1*2048kB 1*4096kB = 48896kB
Jun 5 22:06:41 ip-10-83-35-78 kernel: [1576977.209646] Normal: 953*4kB 23*8kB 11*16kB 14*32kB 12*64kB 11*128kB 8*256kB 2*512kB 2*1024kB 0*2048kB 0*4096kB = 11916kB
Jun 5 22:06:41 ip-10-83-35-78 kernel: [1576977.209653] 225 total pagecache pages
Jun 5 22:06:41 ip-10-83-35-78 kernel: [1576977.209654] 0 pages in swap cache
Jun 5 22:06:41 ip-10-83-35-78 kernel: [1576977.209656] Swap cache stats: add 0, delete 0, find 0/0
Jun 5 22:06:41 ip-10-83-35-78 kernel: [1576977.209657] Free swap = 0kB
Jun 5 22:06:41 ip-10-83-35-78 kernel: [1576977.209658] Total swap = 0kB
Jun 5 22:06:41 ip-10-83-35-78 kernel: [1576977.252355] 3934208 pages RAM
Jun 5 22:06:41 ip-10-83-35-78 kernel: [1576977.252359] 118009 pages reserved
Jun 5 22:06:41 ip-10-83-35-78 kernel: [1576977.252361] 749 pages shared
Jun 5 22:06:41 ip-10-83-35-78 kernel: [1576977.252362] 3795607 pages non-shared
Jun 6 06:48:07 ip-10-83-35-78 rsyslogd: [origin software="rsyslogd" swVersion="4.2.0" x-pid="599" x-info="http://www.rsyslog.com"] rsyslogd was HUPed, type 'lightweight'.
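
For anyone reading the Mem-Info counters in the log, here is a rough sketch of how the page counts can be turned into GiB. It assumes 4 KiB pages (the smallest block in the free lists above is 4kB); the function names `pages_to_gib` and `parse_counters` are only illustrative and not part of any tool.

```python
import re

PAGE_SIZE = 4096  # assumption: 4 KiB pages, as suggested by the 4kB free-list entries


def pages_to_gib(pages, page_size=PAGE_SIZE):
    """Convert a kernel page count to GiB."""
    return pages * page_size / float(1 << 30)


def parse_counters(text):
    """Return {name: value} for every 'name:number' token in a Mem-Info line."""
    return {name: int(value) for name, value in re.findall(r"(\w+):(\d+)", text)}


# Counters copied from the Mem-Info section of the log.
mem_info = (
    "active_anon:3513144 inactive_anon:266669 isolated_anon:0 "
    "active_file:101 inactive_file:15 isolated_file:0"
)
counters = parse_counters(mem_info)

anon_pages = counters["active_anon"] + counters["inactive_anon"]
file_pages = counters["active_file"] + counters["inactive_file"]
total_pages = 3934208  # from the "3934208 pages RAM" line

print("anonymous memory: %.2f GiB" % pages_to_gib(anon_pages))
print("file cache pages: %d" % file_pages)
print("total RAM:        %.2f GiB" % pages_to_gib(total_pages))
```

With these numbers, anonymous (non-file-backed) memory comes to roughly 14.4 GiB out of about 15 GiB of RAM, with almost no page cache left and no swap configured, which is consistent with a single process growing until the oom-killer fires.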