With default memory settings, the assumed memory requirement for Ceph is roughly 1 GB of RAM per 1 TB of OSD size. Raising any settings above their defaults increases that baseline.
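As a quick sanity check, the rule of thumb above can be turned into back-of-the-envelope arithmetic. This is only a sketch: the 1 GB/TB baseline is the rule of thumb stated here, the cache figure is the 5 GB limit discussed below, and the 4 TB drive size is an illustrative assumption (the thread does not state drive sizes).

```python
# Back-of-the-envelope OSD memory estimate (a sketch, not an official formula).
# Baseline rule of thumb: ~1 GB of RAM per 1 TB of OSD size; a BlueStore cache
# limit raised above default adds on top of that baseline.

def estimate_node_ram_gb(osd_count, osd_size_tb, cache_limit_gb):
    """Rough RAM a node needs for its OSDs, in GB."""
    baseline = osd_count * osd_size_tb * 1.0   # ~1 GB per TB of OSD
    cache = osd_count * cache_limit_gb         # per-OSD BlueStore cache limit
    return baseline + cache

# Thread numbers: 18 OSDs, 5 GB cache limit each; 4 TB per OSD is assumed.
print(estimate_node_ram_gb(18, 4, 5))
```

The cache term alone is 18 × 5 = 90 GB, the figure Igor points out below.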
On Tue, Mar 27, 2018 at 1:10 AM Alex Gorbachev <a...@iss-integration.com> wrote:
> On Mon, Mar 26, 2018 at 3:08 PM, Igor Fedotov <ifedo...@suse.de> wrote:
> > Hi Alex,
> >
> > I can see your bug report: https://tracker.ceph.com/issues/23462
> >
> > If the settings from there apply to your comment here, then you have the
> > bluestore cache size limit set to 5 GB, which totals 90 GB of RAM across
> > 18 OSDs for the BlueStore cache alone.
> >
> > There is also additional memory overhead per OSD, hence the amount of free
> > memory you should expect isn't that much. If any at all...
> >
> > Can you reduce the bluestore cache size limits and check whether the
> > out-of-memory issue still happens?
>
> Thank you Igor, reducing to 3 GB now and will advise. I did not realize
> there is additional memory on top of the 90 GB; the nodes each have 128 GB.
>
> --
> Alex Gorbachev
> Storcium
>
> > Thanks,
> >
> > Igor
> >
> > On 3/26/2018 5:09 PM, Alex Gorbachev wrote:
> >> On Wed, Mar 21, 2018 at 2:26 PM, Kjetil Joergensen <kje...@medallia.com>
> >> wrote:
> >>> I retract my previous statement(s).
> >>>
> >>> My current suspicion is that this isn't a leak so much as it is
> >>> load-driven; after enough waiting, it generally seems to settle around
> >>> some equilibrium. We do seem to sit at mempools x 2.4 ~ ceph-osd RSS,
> >>> which is on the higher side (I see documentation alluding to expecting
> >>> ~1.5x).
> >>>
> >>> -KJ
> >>>
> >>> On Mon, Mar 19, 2018 at 3:05 AM, Konstantin Shalygin <k0...@k0ste.ru>
> >>> wrote:
> >>>>> We don't run compression as far as I know, so that wouldn't be it. We
> >>>>> do actually run a mix of bluestore & filestore - due to the rest of
> >>>>> the cluster predating a stable bluestore by some amount.
> >>>>
> >>>> 12.2.2 -> 12.2.4 at 2018/03/10: I don't see an increase in memory
> >>>> usage. No compression in use, of course.
> >>>>
> >>>> http://storage6.static.itmages.com/i/18/0319/h_1521453809_9131482_859b1fb0a5.png
> >>>>
> >> I am seeing these entries under load - there should be plenty of RAM on
> >> a node with 128 GB RAM and 18 OSDs:
> >>
> >> Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193331] winbindd cpuset=/ mems_allowed=0-1
> >> Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193337] CPU: 3 PID: 3406 Comm: winbindd Not tainted 4.14.14-041414-generic #201801201219
> >> Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193338] Hardware name: Supermicro X9DRi-LN4+/X9DR3-LN4+/X9DRi-LN4+/X9DR3-LN4+, BIOS 3.2 03/04/2015
> >> Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193339] Call Trace:
> >> Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193347]  dump_stack+0x5c/0x85
> >> Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193351]  dump_header+0x94/0x229
> >> Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193355]  ? do_try_to_free_pages+0x2a1/0x330
> >> Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193357]  ? get_page_from_freelist+0xa3/0xb20
> >> Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193359]  oom_kill_process+0x213/0x410
> >> Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193361]  out_of_memory+0x2af/0x4d0
> >> Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193363]  __alloc_pages_slowpath+0xab2/0xe40
> >> Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193366]  __alloc_pages_nodemask+0x261/0x280
> >> Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193370]  filemap_fault+0x33f/0x6b0
> >> Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193373]  ? filemap_map_pages+0x18a/0x3a0
> >> Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193376]  ext4_filemap_fault+0x2c/0x40
> >> Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193379]  __do_fault+0x19/0xe0
> >> Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193381]  __handle_mm_fault+0xcd6/0x1180
> >> Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193383]  handle_mm_fault+0xaa/0x1f0
> >> Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193387]  __do_page_fault+0x25d/0x4e0
> >> Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193391]  ? page_fault+0x36/0x60
> >> Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193393]  page_fault+0x4c/0x60
> >> Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193396] RIP: 0033:0x56443d3d1239
> >> Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193397] RSP: 002b:00007ffe6e44b3a0 EFLAGS: 00010246
> >> Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193399] Mem-Info:
> >> Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193407] active_anon:30843938 inactive_anon:1403277 isolated_anon:0
> >> Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193407]  active_file:121 inactive_file:977 isolated_file:18
> >> Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193407]  unevictable:3203 dirty:2 writeback:0 unstable:0
> >> Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193407]  slab_reclaimable:51522 slab_unreclaimable:95924
> >> Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193407]  mapped:2926 shmem:5220 pagetables:77204 bounce:0
> >> Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193407]  free:328371 free_pcp:0 free_cma:0
> >> Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193411] Node 0 active_anon:61155956kB inactive_anon:3014752kB active_file:864kB inactive_file:1432kB unevictable:10440kB isolated(anon):0kB isolated(file):80kB mapped:7648kB dirty:0kB writeback:0kB shmem:14460kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 0kB writeback_tmp:0kB unstable:0kB all_unreclaimable? no
> >> Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193414] Node 1 active_anon:62219796kB inactive_anon:2598356kB active_file:0kB inactive_file:2476kB unevictable:2372kB isolated(anon):0kB isolated(file):0kB mapped:4056kB dirty:8kB writeback:0kB shmem:6420kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 0kB writeback_tmp:0kB unstable:0kB all_unreclaimable? no
> >> Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193416] Node 0 DMA free:15896kB min:124kB low:152kB high:180kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:15980kB managed:15896kB mlocked:0kB kernel_stack:0kB pagetables:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
> >> Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193420] lowmem_reserve[]: 0 1889 64319 64319 64319
> >> Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193424] Node 0 DMA32 free:265308kB min:15732kB low:19664kB high:23596kB active_anon:1642352kB inactive_anon:63060kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:2045868kB managed:1980300kB mlocked:0kB kernel_stack:48kB pagetables:832kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
> >> Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193428] lowmem_reserve[]: 0 0 62430 62430 62430
> >> Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193432] Node 0 Normal free:507908kB min:507928kB low:634908kB high:761888kB active_anon:59513604kB inactive_anon:2951692kB active_file:732kB inactive_file:1720kB unevictable:10440kB writepending:0kB present:65011712kB managed:63934936kB mlocked:10440kB kernel_stack:16392kB pagetables:164944kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
> >> Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193436] lowmem_reserve[]: 0 0 0 0 0
> >> Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193440] Node 1 Normal free:524372kB min:524784kB low:655980kB high:787176kB active_anon:62219796kB inactive_anon:2598356kB active_file:504kB inactive_file:1392kB unevictable:2372kB writepending:8kB present:67108864kB managed:66056740kB mlocked:2372kB kernel_stack:17912kB pagetables:143040kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
> >> Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193444] lowmem_reserve[]: 0 0 0 0 0
> >> Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193447] Node 0 DMA: 0*4kB 1*8kB (U) 1*16kB (U) 0*32kB 2*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15896kB
> >> Mar 26 07:55:32 roc04r-sc3a085 kernel: [733474.193459] Node 0 DMA32: 403*4kB (UME) 238*8kB (UME) 196*16kB (UME) 102*32kB (UME) 56*64kB (UME) 24*128kB (UE) 25*256kB (UM) 11*512kB (UME) 4*1024kB (UE) 6*2048kB (UM) 54*4096kB (UM) = 266172kB
> >>
> >>>> k
> >>>
> >>> --
> >>> Kjetil Joergensen <kje...@medallia.com>
> >>> SRE, Medallia Inc
> >>>
> >>> _______________________________________________
> >>> ceph-users mailing list
> >>> ceph-users@lists.ceph.com
> >>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
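The sizing math running through this thread - 18 OSDs with a 5 GB cache limit on a 128 GB node, reduced to 3 GB, with an RSS of roughly 1.5x (per the docs Kjetil mentions) to 2.4x (as he observed) the mempool size - can be sanity-checked with a short script. The multipliers and arithmetic are the thread's own figures used illustratively, not an official sizing formula:

```python
# Project total ceph-osd resident memory on the node for the cache limits and
# RSS overhead factors discussed in the thread (illustrative assumptions).

NODE_RAM_GB = 128   # per-node RAM from the thread
OSDS = 18           # OSDs per node from the thread

def projected_rss_gb(cache_gb_per_osd, rss_factor):
    """Projected total ceph-osd RSS on the node, in GB."""
    return OSDS * cache_gb_per_osd * rss_factor

for cache_gb in (5, 3):
    for factor in (1.5, 2.4):
        total = projected_rss_gb(cache_gb, factor)
        fits = "fits" if total < NODE_RAM_GB else "oversubscribed"
        print(f"{cache_gb} GB cache @ {factor}x -> {total:.1f} GB ({fits})")
```

At 5 GB the node is oversubscribed even at the optimistic 1.5x factor (135 GB), consistent with the OOM kills in the kernel log above; the reduced 3 GB limit fits at 1.5x (81 GB) but is still marginal if the observed 2.4x ratio holds (129.6 GB), which matches Igor's warning about per-OSD overhead on top of the cache.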