Hello,

We are still facing the same memory leak issue even after setting bluestore_cache_size to 100 MB: the ceph-osd processes keep growing until they are killed by the kernel OOM killer.
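For reference, the only BlueStore-related overrides we carry are the two lines repeated further below. Roughly, they live in the [osd] section of ceph.conf, and the value a running daemon actually uses can be cross-checked over its admin socket (osd.0 below is just an example id):

---
[osd]
bluestore_cache_size = 107374182    # ~100 MB
bluefs_buffered_io = true

# confirm what the running daemon actually uses
ceph daemon osd.0 config get bluestore_cache_size
---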
A typical OOM event from the kernel log looks like this:

Mar 27 01:57:05 cn1 kernel: ceph-osd invoked oom-killer: gfp_mask=0x280da, order=0, oom_score_adj=0
Mar 27 01:57:05 cn1 kernel: ceph-osd cpuset=/ mems_allowed=0-1
Mar 27 01:57:05 cn1 kernel: CPU: 0 PID: 422861 Comm: ceph-osd Not tainted 3.10.0-327.el7.x86_64 #1
Mar 27 01:57:05 cn1 kernel: Hardware name: HP ProLiant XL450 Gen9 Server/ProLiant XL450 Gen9 Server, BIOS U21 09/12/2016
Mar 27 01:57:05 cn1 kernel: ffff884546751700 00000000275e2e50 ffff88454137b6f0 ffffffff816351f1
Mar 27 01:57:05 cn1 kernel: ffff88454137b780 ffffffff81630191 000000000487ffff ffff8846f0665590
Mar 27 01:57:05 cn1 kernel: ffff8845411b3ad8 ffff88454137b7d0 ffffffffffffffd5 0000000000000001
Mar 27 01:57:05 cn1 kernel: Call Trace:
Mar 27 01:57:05 cn1 kernel: [<ffffffff816351f1>] dump_stack+0x19/0x1b
Mar 27 01:57:05 cn1 kernel: [<ffffffff81630191>] dump_header+0x8e/0x214
Mar 27 01:57:05 cn1 kernel: [<ffffffff8116cdee>] oom_kill_process+0x24e/0x3b0
Mar 27 01:57:05 cn1 kernel: [<ffffffff8116c956>] ? find_lock_task_mm+0x56/0xc0
Mar 27 01:57:05 cn1 kernel: [<ffffffff8116d616>] out_of_memory+0x4b6/0x4f0
Mar 27 01:57:05 cn1 kernel: [<ffffffff811737f5>] __alloc_pages_nodemask+0xa95/0xb90
Mar 27 01:57:05 cn1 kernel: [<ffffffff811b78ca>] alloc_pages_vma+0x9a/0x140
Mar 27 01:57:05 cn1 kernel: [<ffffffff81197655>] handle_mm_fault+0xb85/0xf50
Mar 27 01:57:05 cn1 kernel: [<ffffffffa04f5b22>] ? xfs_perag_get_tag+0x42/0xe0 [xfs]
Mar 27 01:57:05 cn1 kernel: [<ffffffff81640e22>] __do_page_fault+0x152/0x420
Mar 27 01:57:05 cn1 kernel: [<ffffffff81641113>] do_page_fault+0x23/0x80
Mar 27 01:57:05 cn1 kernel: [<ffffffff8163d408>] page_fault+0x28/0x30
Mar 27 01:57:05 cn1 kernel: [<ffffffff813000c9>] ? copy_user_enhanced_fast_string+0x9/0x20
Mar 27 01:57:05 cn1 kernel: [<ffffffff8130600a>] ? memcpy_toiovec+0x4a/0x90
Mar 27 01:57:05 cn1 kernel: [<ffffffff8151f91f>] skb_copy_datagram_iovec+0x12f/0x2a0
Mar 27 01:57:05 cn1 kernel: [<ffffffff81574418>] tcp_recvmsg+0x248/0xbc0
Mar 27 01:57:05 cn1 kernel: [<ffffffff810bb685>] ? sched_clock_cpu+0x85/0xc0
Mar 27 01:57:05 cn1 kernel: [<ffffffff815a10eb>] inet_recvmsg+0x7b/0xa0
Mar 27 01:57:05 cn1 kernel: [<ffffffff8150ffb6>] sock_aio_read.part.7+0x146/0x160
Mar 27 01:57:05 cn1 kernel: [<ffffffff8150fff1>] sock_aio_read+0x21/0x30
Mar 27 01:57:05 cn1 kernel: [<ffffffff811ddcdd>] do_sync_read+0x8d/0xd0
Mar 27 01:57:05 cn1 kernel: [<ffffffff811de4e5>] vfs_read+0x145/0x170
Mar 27 01:57:05 cn1 kernel: [<ffffffff811def8f>] SyS_read+0x7f/0xe0
Mar 27 01:57:05 cn1 kernel: [<ffffffff81645909>] system_call_fastpath+0x16/0x1b

The OOM killer has fired several times:

# dmesg -T | grep -i memory
[Mon Mar 27 02:51:25 2017] Out of memory: Kill process 459076 (ceph-osd) score 18 or sacrifice child
[Mon Mar 27 06:41:16 2017] [<ffffffff8116d616>] out_of_memory+0x4b6/0x4f0
[Mon Mar 27 06:41:16 2017] Out of memory: Kill process 976901 (java) score 31 or sacrifice child
[Mon Mar 27 06:43:55 2017] [<ffffffff8116d616>] out_of_memory+0x4b6/0x4f0
[Mon Mar 27 06:43:55 2017] Out of memory: Kill process 37351 (java) score 31 or sacrifice child
[Mon Mar 27 06:43:55 2017] [<ffffffff8116d616>] out_of_memory+0x4b6/0x4f0
[Mon Mar 27 06:43:55 2017] Out of memory: Kill process 435981 (ceph-osd) score 17 or sacrifice child
[Mon Mar 27 11:06:07 2017] [<ffffffff8116d616>] out_of_memory+0x4b6/0x4f0

# numactl -H
available: 2 nodes (0-1)
node 0 cpus: 0 1 2 3 4 5 6 7 16 17 18 19 20 21 22 23
node 0 size: 294786 MB
node 0 free: 3447 MB    ===>>> almost 98% used
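Between the OOM events, the growth of each ceph-osd's resident memory can also be watched directly from /proc; a rough, purely illustrative sampling loop (adjust the interval as needed):

---
# print the resident set size (kB) of every ceph-osd once a minute
while true; do
    date
    for pid in $(pgrep -x ceph-osd); do
        awk -v p="$pid" '/^VmRSS:/ {print "ceph-osd pid " p ": " $2 " kB"}' /proc/"$pid"/status
    done
    sleep 60
done
---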
While analysing the numastat results, you can see that each OSD consumes more than 5 GB:

====
# numastat -s ceph
Per-node process memory usage (in MBs)
PID                         Node 0           Node 1            Total
-----------------  ---------------  ---------------  ---------------
372602 (ceph-osd)          5418.34             2.84          5421.18
491602 (ceph-osd)          5351.95             2.83          5354.78
417717 (ceph-osd)          5175.98             2.83          5178.81
273980 (ceph-osd)          5167.83             2.82          5170.65
311956 (ceph-osd)          5167.04             2.84          5169.88
440537 (ceph-osd)          5161.57             2.84          5164.41
368422 (ceph-osd)          5157.87             2.83          5160.70
292227 (ceph-osd)          5156.42             2.83          5159.25
====

Is there any way to fix this memory leak? We await your comments.

---
bluestore_cache_size = 107374182
bluefs_buffered_io = true
---

Environment: RHEL 7.2, Ceph v11.2.0 (kraken), EC 4+1

FYI, we have already raised a tracker for this issue: http://tracker.ceph.com/issues/18924

Thanks

On Mon, Feb 20, 2017 at 11:18 AM, Jay Linux <jaylinuxg...@gmail.com> wrote:
> Hello Shinobu,
>
> We already raised a ticket for this issue. FYI - http://tracker.ceph.com/issues/18924
>
> Thanks
> Jayaram
>
>
> On Mon, Feb 20, 2017 at 12:36 AM, Shinobu Kinjo <ski...@redhat.com> wrote:
>
>> Please open a ticket at http://tracker.ceph.com, if you haven't yet.
>>
>> On Thu, Feb 16, 2017 at 6:07 PM, Muthusamy Muthiah
>> <muthiah.muthus...@gmail.com> wrote:
>> > Hi Wido,
>> >
>> > Thanks for the information, and let us know if this is a bug.
>> > As a workaround we will go with a small bluestore_cache_size of 100 MB.
>> >
>> > Thanks,
>> > Muthu
>> >
>> > On 16 February 2017 at 14:04, Wido den Hollander <w...@42on.com> wrote:
>> >>
>> >>
>> >> > On 16 February 2017 at 7:19, Muthusamy Muthiah
>> >> > <muthiah.muthus...@gmail.com> wrote:
>> >> >
>> >> >
>> >> > Thanks Ilya Letkowski for the information; we will change this value
>> >> > accordingly.
>> >> >
>> >>
>> >> What I understand from yesterday's performance meeting is that this seems
>> >> like a bug. Lowering this buffer reduces memory, but the root cause seems to
>> >> be memory not being freed: a few bytes of a larger allocation remain
>> >> allocated, which keeps this buffer from being freed.
>> >>
>> >> Tried:
>> >>
>> >> debug_mempools = true
>> >>
>> >> $ ceph daemon osd.X dump_mempools
>> >>
>> >> You might want to view the YouTube video of yesterday's meeting when it's online:
>> >> https://www.youtube.com/channel/UCno-Fry25FJ7B4RycCxOtfw/videos
>> >>
>> >> Wido
>> >>
>> >> > Thanks,
>> >> > Muthu
>> >> >
>> >> > On 15 February 2017 at 17:03, Ilya Letkowski <mj12.svetz...@gmail.com>
>> >> > wrote:
>> >> >
>> >> > > Hi, Muthusamy Muthiah
>> >> > >
>> >> > > I'm not totally sure that this is a memory leak.
>> >> > > We had the same problems with bluestore on ceph v11.2.0.
>> >> > > Reducing the bluestore cache helped us to solve it and stabilize OSD memory
>> >> > > consumption at the 3 GB level.
>> >> > >
>> >> > > Perhaps this will help you:
>> >> > >
>> >> > > bluestore_cache_size = 104857600
>> >> > >
>> >> > >
>> >> > >
>> >> > > On Tue, Feb 14, 2017 at 11:52 AM, Muthusamy Muthiah <
>> >> > > muthiah.muthus...@gmail.com> wrote:
>> >> > >
>> >> > >> Hi All,
>> >> > >>
>> >> > >> On all our 5-node clusters with ceph 11.2.0 we encounter memory leak
>> >> > >> issues.
>> >> > >>
>> >> > >> Cluster details: 5 nodes with 24/68 disks per node, EC: 4+1, RHEL 7.2
>> >> > >>
>> >> > >> Some traces using sar are below, and the memory utilisation graph
>> >> > >> is attached.
>> >> > >>
>> >> > >> (16:54:42)[cn2.c1 sa] # sar -r
>> >> > >> 07:50:01  kbmemfree  kbmemused  %memused  kbbuffers  kbcached  kbcommit  %commit  kbactive  kbinact  kbdirty
>> >> > >> 10:20:01   32077264  132754368     80.54      16176   3040244  77767024    47.18  51991692  2676468      260
>> >> > >> 10:30:01   32208384  132623248     80.46      16176   3048536  77832312    47.22  51851512  2684552       12
>> >> > >> 10:40:01   32067244  132764388     80.55      16176   3059076  77832316    47.22  51983332  2694708      264
>> >> > >> 10:50:01   30626144  134205488     81.42      16176   3064340  78177232    47.43  53414144  2693712        4
>> >> > >> 11:00:01   28927656  135903976     82.45      16176   3074064  78958568    47.90  55114284  2702892       12
>> >> > >> 11:10:01   27158548  137673084     83.52      16176   3080600  80553936    48.87  56873664  2708904       12
>> >> > >> 11:20:01   26455556  138376076     83.95      16176   3080436  81991036    49.74  57570280  2708500        8
>> >> > >> 11:30:01   26002252  138829380     84.22      16176   3090556  82223840    49.88  58015048  2718036       16
>> >> > >> 11:40:01   25965924  138865708     84.25      16176   3089708  83734584    50.80  58049980  2716740       12
>> >> > >> 11:50:01   26142888  138688744     84.14      16176   3089544  83800100    50.84  57869628  2715400       16
>> >> > >> ...
>> >> > >> ...
>> >> > >>
>> >> > >> In the attached graph there is an increase in memory utilisation by
>> >> > >> ceph-osd during the soak test. When it reaches the system limit of
>> >> > >> 128 GB RAM, we see the dmesg logs below: OSD.3 is killed due to
>> >> > >> out-of-memory and then started again.
>> >> > >>
>> >> > >> [Tue Feb 14 03:51:02 2017] tp_osd_tp invoked oom-killer: gfp_mask=0x280da, order=0, oom_score_adj=0
>> >> > >> [Tue Feb 14 03:51:02 2017] tp_osd_tp cpuset=/ mems_allowed=0-1
>> >> > >> [Tue Feb 14 03:51:02 2017] CPU: 20 PID: 11864 Comm: tp_osd_tp Not tainted 3.10.0-327.13.1.el7.x86_64 #1
>> >> > >> [Tue Feb 14 03:51:02 2017] Hardware name: HP ProLiant XL420 Gen9/ProLiant XL420 Gen9, BIOS U19 09/12/2016
>> >> > >> [Tue Feb 14 03:51:02 2017] ffff8819ccd7a280 0000000030e84036 ffff881fa58f7528 ffffffff816356f4
>> >> > >> [Tue Feb 14 03:51:02 2017] ffff881fa58f75b8 ffffffff8163068f ffff881fa3478360 ffff881fa3478378
>> >> > >> [Tue Feb 14 03:51:02 2017] ffff881fa58f75e8 ffff8819ccd7a280 0000000000000001 000000000001f65f
>> >> > >> [Tue Feb 14 03:51:02 2017] Call Trace:
>> >> > >> [Tue Feb 14 03:51:02 2017] [<ffffffff816356f4>] dump_stack+0x19/0x1b
>> >> > >> [Tue Feb 14 03:51:02 2017] [<ffffffff8163068f>] dump_header+0x8e/0x214
>> >> > >> [Tue Feb 14 03:51:02 2017] [<ffffffff8116ce7e>] oom_kill_process+0x24e/0x3b0
>> >> > >> [Tue Feb 14 03:51:02 2017] [<ffffffff8116c9e6>] ? find_lock_task_mm+0x56/0xc0
>> >> > >> [Tue Feb 14 03:51:02 2017] [<ffffffff8116d6a6>] out_of_memory+0x4b6/0x4f0
>> >> > >> [Tue Feb 14 03:51:02 2017] [<ffffffff81173885>] __alloc_pages_nodemask+0xa95/0xb90
>> >> > >> [Tue Feb 14 03:51:02 2017] [<ffffffff811b792a>] alloc_pages_vma+0x9a/0x140
>> >> > >> [Tue Feb 14 03:51:02 2017] [<ffffffff811976c5>] handle_mm_fault+0xb85/0xf50
>> >> > >> [Tue Feb 14 03:51:02 2017] [<ffffffff811957fb>] ? follow_page_mask+0xbb/0x5c0
>> >> > >> [Tue Feb 14 03:51:02 2017] [<ffffffff81197c2b>] __get_user_pages+0x19b/0x640
>> >> > >> [Tue Feb 14 03:51:02 2017] [<ffffffff8119843d>] get_user_pages_unlocked+0x15d/0x1f0
>> >> > >> [Tue Feb 14 03:51:02 2017] [<ffffffff8106544f>] get_user_pages_fast+0x9f/0x1a0
>> >> > >> [Tue Feb 14 03:51:02 2017] [<ffffffff8121de78>] do_blockdev_direct_IO+0x1a78/0x2610
>> >> > >> [Tue Feb 14 03:51:02 2017] [<ffffffff81218c40>] ? I_BDEV+0x10/0x10
>> >> > >> [Tue Feb 14 03:51:02 2017] [<ffffffff8121ea65>] __blockdev_direct_IO+0x55/0x60
>> >> > >> [Tue Feb 14 03:51:02 2017] [<ffffffff81218c40>] ? I_BDEV+0x10/0x10
>> >> > >> [Tue Feb 14 03:51:02 2017] [<ffffffff81219297>] blkdev_direct_IO+0x57/0x60
>> >> > >> [Tue Feb 14 03:51:02 2017] [<ffffffff81218c40>] ? I_BDEV+0x10/0x10
>> >> > >> [Tue Feb 14 03:51:02 2017] [<ffffffff8116af63>] generic_file_aio_read+0x6d3/0x750
>> >> > >> [Tue Feb 14 03:51:02 2017] [<ffffffffa038ad5c>] ? xfs_iunlock+0x11c/0x130 [xfs]
>> >> > >> [Tue Feb 14 03:51:02 2017] [<ffffffff811690db>] ? unlock_page+0x2b/0x30
>> >> > >> [Tue Feb 14 03:51:02 2017] [<ffffffff81192f21>] ? __do_fault+0x401/0x510
>> >> > >> [Tue Feb 14 03:51:02 2017] [<ffffffff8121970c>] blkdev_aio_read+0x4c/0x70
>> >> > >> [Tue Feb 14 03:51:02 2017] [<ffffffff811ddcfd>] do_sync_read+0x8d/0xd0
>> >> > >> [Tue Feb 14 03:51:02 2017] [<ffffffff811de45c>] vfs_read+0x9c/0x170
>> >> > >> [Tue Feb 14 03:51:02 2017] [<ffffffff811df182>] SyS_pread64+0x92/0xc0
>> >> > >> [Tue Feb 14 03:51:02 2017] [<ffffffff81645e89>] system_call_fastpath+0x16/0x1b
>> >> > >>
>> >> > >>
>> >> > >> Feb 14 03:51:40 fr-paris kernel: Out of memory: Kill process 7657 (ceph-osd) score 45 or sacrifice child
>> >> > >> Feb 14 03:51:40 fr-paris kernel: Killed process 7657 (ceph-osd) total-vm:8650208kB, anon-rss:6124660kB, file-rss:1560kB
>> >> > >> Feb 14 03:51:41 fr-paris systemd: ceph-osd@3.service: main process exited, code=killed, status=9/KILL
>> >> > >> Feb 14 03:51:41 fr-paris systemd: Unit ceph-osd@3.service entered failed state.
>> >> > >> Feb 14 03:51:41 fr-paris systemd: ceph-osd@3.service failed.
>> >> > >> Feb 14 03:51:41 fr-paris systemd: cassandra.service: main process exited, code=killed, status=9/KILL
>> >> > >> Feb 14 03:51:41 fr-paris systemd: Unit cassandra.service entered failed state.
>> >> > >> Feb 14 03:51:41 fr-paris systemd: cassandra.service failed.
>> >> > >> Feb 14 03:51:41 fr-paris ceph-mgr: 2017-02-14 03:51:41.978878 7f51a3154700 -1 mgr ms_dispatch osd_map(7517..7517 src has 6951..7517) v3
>> >> > >> Feb 14 03:51:42 fr-paris systemd: Device dev-disk-by\x2dpartlabel-ceph\x5cx20block.device appeared twice with different sysfs paths
>> >> > >> /sys/devices/pci0000:00/0000:00:03.2/0000:03:00.0/host0/target0:0:0/0:0:0:9/block/sdj/sdj2 and
>> >> > >> /sys/devices/pci0000:00/0000:00:03.2/0000:03:00.0/host0/target0:0:0/0:0:0:4/block/sde/sde2
>> >> > >> Feb 14 03:51:42 fr-paris ceph-mgr: 2017-02-14 03:51:42.992477 7f51a3154700 -1 mgr ms_dispatch osd_map(7518..7518 src has 6951..7518) v3
>> >> > >> Feb 14 03:51:43 fr-paris ceph-mgr: 2017-02-14 03:51:43.508990 7f51a3154700 -1 mgr ms_dispatch mgrdigest v1
>> >> > >> Feb 14 03:51:48 fr-paris ceph-mgr: 2017-02-14 03:51:48.508970 7f51a3154700 -1 mgr ms_dispatch mgrdigest v1
>> >> > >> Feb 14 03:51:53 fr-paris ceph-mgr: 2017-02-14 03:51:53.509592 7f51a3154700 -1 mgr ms_dispatch mgrdigest v1
>> >> > >> Feb 14 03:51:58 fr-paris ceph-mgr: 2017-02-14 03:51:58.509936 7f51a3154700 -1 mgr ms_dispatch mgrdigest v1
>> >> > >> Feb 14 03:52:01 fr-paris systemd: ceph-osd@3.service holdoff time over, scheduling restart.
>> >> > >> Feb 14 03:52:02 fr-paris systemd: Starting Ceph object storage daemon osd.3...
>> >> > >> Feb 14 03:52:02 fr-paris systemd: Started Ceph object storage daemon osd.3.
>> >> > >> Feb 14 03:52:02 fr-paris numactl: 2017-02-14 03:52:02.307106 7f1e499bb940 -1 WARNING: the following dangerous and experimental features are enabled: bluestore,rocksdb
>> >> > >> Feb 14 03:52:02 fr-paris numactl: 2017-02-14 03:52:02.317687 7f1e499bb940 -1 WARNING: the following dangerous and experimental features are enabled: bluestore,rocksdb
>> >> > >> Feb 14 03:52:02 fr-paris numactl: starting osd.3 at - osd_data /var/lib/ceph/osd/ceph-3 /var/lib/ceph/osd/ceph-3/journal
>> >> > >> Feb 14 03:52:02 fr-paris numactl: 2017-02-14 03:52:02.333522 7f1e499bb940 -1 WARNING: experimental feature 'bluestore' is enabled
>> >> > >> Feb 14 03:52:02 fr-paris numactl: Please be aware that this feature is experimental, untested,
>> >> > >> Feb 14 03:52:02 fr-paris numactl: unsupported, and may result in data corruption, data loss,
>> >> > >> Feb 14 03:52:02 fr-paris numactl: and/or irreparable damage to your cluster. Do not use
>> >> > >> Feb 14 03:52:02 fr-paris numactl: feature with important data.
>> >> > >>
>> >> > >> This seems to happen only in 11.2.0 and not in 11.1.x. Could you please
>> >> > >> help us resolve this issue, either through a config change that limits
>> >> > >> the memory used by ceph-osd, or by confirming it as a bug in the
>> >> > >> current kraken release.
>> >> > >>
>> >> > >> Thanks,
>> >> > >> Muthu
>> >> > >>
>> >> > >> _______________________________________________
>> >> > >> ceph-users mailing list
>> >> > >> ceph-users@lists.ceph.com
>> >> > >> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>> >> > >>
>> >> > >>
>> >> > >
>> >> > >
>> >> > > --
>> >> > > С уважением / Best regards
>> >> > >
>> >> > > Илья Летковский / Ilya Letkouski
>> >> > >
>> >> > > Phone, Viber: +375 29 3237335
>> >> > >
>> >> > > Minsk, Belarus (GMT+3)
>> >> > >
>> >> > _______________________________________________
>> >> > ceph-users mailing list
>> >> > ceph-users@lists.ceph.com
>> >> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>> >
>> >
>> >
>> > _______________________________________________
>> > ceph-users mailing list
>> > ceph-users@lists.ceph.com
>> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>> >
>> _______________________________________________
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>
>
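For reference, the mempool inspection Wido mentions above is done per OSD over the admin socket; a rough example (osd.0 is just an example id, and debug_mempools can equally be set in ceph.conf as Wido quoted it):

---
# enable mempool debugging on a running OSD
ceph daemon osd.0 config set debug_mempools true

# dump per-pool item and byte counts for that OSD
ceph daemon osd.0 dump_mempools
---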