> On 8 Mar 2019, at 14.30, Mark Nelson <mnel...@redhat.com> wrote:
> 
> 
> On 3/8/19 5:56 AM, Steffen Winther Sørensen wrote:
>> 
>>> On 5 Mar 2019, at 10.02, Paul Emmerich <paul.emmer...@croit.io>:
>>> 
>>> Yeah, there's a bug in 13.2.4. You need to set it to at least ~1.2GB.
>> Yep, thanks, setting it to 1G+256M worked :)
>> Hope this won't bloat memory during the coming weekend's VM backups through CephFS.
>> 
> 
> 
> FWIW, setting it to 1.2G will almost certainly result in the bluestore caches 
> being stuck at cache_min, i.e. 128MB, and the autotuner may not be able to keep 
> the OSD memory that low.  I typically recommend a bare minimum of 2GB per 
> OSD, and on SSD/NVMe backed OSDs 3-4GB+ can improve performance significantly.
This is a smaller dev cluster without much IO: 4 nodes of 16GB RAM, each with 6x HDD OSDs, so even 2GB per OSD would already mean 12GB of a node's 16GB.
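
For reference, a sketch of what worked per Paul's hint in our ceph.conf (value in bytes, i.e. 1G+256M):

[osd]
  ; 1G + 256M = 1342177280 bytes, just above the ~1.2GB lower bound
  osd memory target = 1342177280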

Just want to avoid consuming swap, which bloated after patching from 13.2.2 to 
13.2.4 and then performing VM snapshots to CephFS. Otherwise the cluster has been 
fine for ages…
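
BTW, the "_tune_cache_size … new cache size: 17349132402135320576" line in the 
crash log below looks like unsigned 64-bit underflow when the target is set too 
low; a minimal hypothetical C++ sketch (not the actual BlueStore autotuner code, 
and the overhead value is an assumption) of how a bogus value like that can arise:

  #include <cstdint>
  #include <iostream>

  int main() {
    // Hypothetical illustration, NOT the actual BlueStore autotuner:
    // subtracting an overhead larger than the configured target in
    // unsigned 64-bit arithmetic wraps around to a huge bogus value.
    uint64_t target   = 1073741824;  // osd_memory_target = 1 GiB
    uint64_t overhead = 1288490189;  // assumed base + fragmentation, ~1.2 GiB
    uint64_t new_cache_size = target - overhead;  // wraps below zero
    std::cout << "new cache size: " << new_cache_size << std::endl;
    // prints: new cache size: 18446744073494803251
    return 0;
  }

That would also fit Paul's ~1.2GB lower bound above.
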
/Steffen


> 
> 
> Mark
> 
> 
> 
>>> On Tue, Mar 5, 2019 at 9:00 AM Steffen Winther Sørensen
>>> <ste...@gmail.com> wrote:
>>>> 
>>>> 
>>>> On 4 Mar 2019, at 16.09, Paul Emmerich <paul.emmer...@croit.io> wrote:
>>>> 
>>>> Bloated to ~4 GB per OSD and you are on HDDs?
>>>> 
>>>> Something like that, yes.
>>>> 
>>>> 
>>>> 13.2.3 backported the cache auto-tuning which targets 4 GB memory
>>>> usage by default.
>>>> 
>>>> 
>>>> See https://ceph.com/releases/13-2-4-mimic-released/
>>>> 
>>>> Right, thanks…
>>>> 
>>>> 
>>>> The bluestore_cache_* options are no longer needed. They are replaced
>>>> by osd_memory_target, defaulting to 4GB. BlueStore will expand
>>>> and contract its cache to attempt to stay within this
>>>> limit. Users upgrading should note this is a higher default
>>>> than the previous bluestore_cache_size of 1GB, so OSDs using
>>>> BlueStore will use more memory by default.
>>>> For more details, see the BlueStore docs.
>>>> 
>>>> Adding an 'osd memory target' value to our ceph.conf and restarting an OSD 
>>>> just makes the OSD crash and dump like this:
>>>> 
>>>> [osd]
>>>>   ; this key makes 13.2.4 OSDs abort???
>>>>   osd memory target = 1073741824
>>>> 
>>>>   ; other OSD key settings
>>>>   osd pool default size = 2  # Write an object 2 times.
>>>>   osd pool default min size = 1 # Allow writing one copy in a degraded state.
>>>> 
>>>>   osd pool default pg num = 256
>>>>   osd pool default pgp num = 256
>>>> 
>>>>   client cache size = 131072
>>>>   osd client op priority = 40
>>>>   osd op threads = 8
>>>>   osd client message size cap = 512
>>>>   filestore min sync interval = 10
>>>>   filestore max sync interval = 60
>>>> 
>>>>   recovery max active = 2
>>>>   recovery op priority = 30
>>>>   osd max backfills = 2
>>>> 
>>>> 
>>>> 
>>>> 
>>>> osd log snippet:
>>>>  -472> 2019-03-05 08:36:02.233 7f2743a8c1c0  1 -- - start start
>>>>  -471> 2019-03-05 08:36:02.234 7f2743a8c1c0  2 osd.12 0 init 
>>>> /var/lib/ceph/osd/ceph-12 (looks like hdd)
>>>>  -470> 2019-03-05 08:36:02.234 7f2743a8c1c0  2 osd.12 0 journal 
>>>> /var/lib/ceph/osd/ceph-12/journal
>>>>  -469> 2019-03-05 08:36:02.234 7f2743a8c1c0  1 
>>>> bluestore(/var/lib/ceph/osd/ceph-12) _mount path /var/lib/ceph/osd/ceph-12
>>>>  -468> 2019-03-05 08:36:02.235 7f2743a8c1c0  1 bdev create path 
>>>> /var/lib/ceph/osd/ceph-12/block type kernel
>>>>  -467> 2019-03-05 08:36:02.235 7f2743a8c1c0  1 bdev(0x55b31af4a000 
>>>> /var/lib/ceph/osd/ceph-12/block) open path /var/lib/ceph/osd/ceph-12/block
>>>>  -466> 2019-03-05 08:36:02.236 7f2743a8c1c0  1 bdev(0x55b31af4a000 
>>>> /var/lib/ceph/osd/ceph-12/block) open size 146775474176 (0x222c800000, 137 
>>>> GiB) block_size 4096 (4 KiB) rotational
>>>>  -465> 2019-03-05 08:36:02.236 7f2743a8c1c0  1 
>>>> bluestore(/var/lib/ceph/osd/ceph-12) _set_cache_sizes cache_size 
>>>> 1073741824 meta 0.4 kv 0.4 data 0.2
>>>>  -464> 2019-03-05 08:36:02.237 7f2743a8c1c0  1 bdev create path 
>>>> /var/lib/ceph/osd/ceph-12/block type kernel
>>>>  -463> 2019-03-05 08:36:02.237 7f2743a8c1c0  1 bdev(0x55b31af4aa80 
>>>> /var/lib/ceph/osd/ceph-12/block) open path /var/lib/ceph/osd/ceph-12/block
>>>>  -462> 2019-03-05 08:36:02.238 7f2743a8c1c0  1 bdev(0x55b31af4aa80 
>>>> /var/lib/ceph/osd/ceph-12/block) open size 146775474176 (0x222c800000, 137 
>>>> GiB) block_size 4096 (4 KiB) rotational
>>>>  -461> 2019-03-05 08:36:02.238 7f2743a8c1c0  1 bluefs add_block_device 
>>>> bdev 1 path /var/lib/ceph/osd/ceph-12/block size 137 GiB
>>>>  -460> 2019-03-05 08:36:02.238 7f2743a8c1c0  1 bluefs mount
>>>>  -459> 2019-03-05 08:36:02.339 7f2743a8c1c0  0  set rocksdb option 
>>>> compaction_readahead_size = 2097152
>>>>  -458> 2019-03-05 08:36:02.339 7f2743a8c1c0  0  set rocksdb option 
>>>> compression = kNoCompression
>>>>  -457> 2019-03-05 08:36:02.339 7f2743a8c1c0  0  set rocksdb option 
>>>> max_write_buffer_number = 4
>>>>  -456> 2019-03-05 08:36:02.339 7f2743a8c1c0  0  set rocksdb option 
>>>> min_write_buffer_number_to_merge = 1
>>>>  -455> 2019-03-05 08:36:02.339 7f2743a8c1c0  0  set rocksdb option 
>>>> recycle_log_file_num = 4
>>>>  -454> 2019-03-05 08:36:02.339 7f2743a8c1c0  0  set rocksdb option 
>>>> writable_file_max_buffer_size = 0
>>>>  -453> 2019-03-05 08:36:02.339 7f2743a8c1c0  0  set rocksdb option 
>>>> write_buffer_size = 268435456
>>>>  -452> 2019-03-05 08:36:02.340 7f2743a8c1c0  0  set rocksdb option 
>>>> compaction_readahead_size = 2097152
>>>>  -451> 2019-03-05 08:36:02.340 7f2743a8c1c0  0  set rocksdb option 
>>>> compression = kNoCompression
>>>>  -450> 2019-03-05 08:36:02.340 7f2743a8c1c0  0  set rocksdb option 
>>>> max_write_buffer_number = 4
>>>>  -449> 2019-03-05 08:36:02.340 7f2743a8c1c0  0  set rocksdb option 
>>>> min_write_buffer_number_to_merge = 1
>>>>  -448> 2019-03-05 08:36:02.340 7f2743a8c1c0  0  set rocksdb option 
>>>> recycle_log_file_num = 4
>>>>  -447> 2019-03-05 08:36:02.340 7f2743a8c1c0  0  set rocksdb option 
>>>> writable_file_max_buffer_size = 0
>>>>  -446> 2019-03-05 08:36:02.340 7f2743a8c1c0  0  set rocksdb option 
>>>> write_buffer_size = 268435456
>>>>  -445> 2019-03-05 08:36:02.340 7f2743a8c1c0  1 rocksdb: do_open column 
>>>> families: [default]
>>>>  -444> 2019-03-05 08:36:02.341 7f2743a8c1c0  4 rocksdb: RocksDB version: 
>>>> 5.13.0
>>>>  -443> 2019-03-05 08:36:02.342 7f2743a8c1c0  4 rocksdb: Git sha 
>>>> rocksdb_build_git_sha:@0@
>>>>  -442> 2019-03-05 08:36:02.342 7f2743a8c1c0  4 rocksdb: Compile date Jan  
>>>> 4 2019
>>>> ...
>>>>  -271> 2019-03-05 08:36:02.431 7f2743a8c1c0  1 freelist init
>>>>  -270> 2019-03-05 08:36:02.535 7f2743a8c1c0  1 
>>>> bluestore(/var/lib/ceph/osd/ceph-12) _open_alloc opening allocation 
>>>> metadata
>>>>  -269> 2019-03-05 08:36:02.714 7f2743a8c1c0  1 
>>>> bluestore(/var/lib/ceph/osd/ceph-12) _open_alloc loaded 93 GiB in 31828 
>>>> extents
>>>>  -268> 2019-03-05 08:36:02.722 7f2743a8c1c0  2 osd.12 0 journal looks like 
>>>> hdd
>>>>  -267> 2019-03-05 08:36:02.722 7f2743a8c1c0  2 osd.12 0 boot
>>>>  -266> 2019-03-05 08:36:02.723 7f272a0f3700  5 
>>>> bluestore.MempoolThread(0x55b31af46a30) _tune_cache_size target: 
>>>> 1073741824 heap: 64675840 unmapped: 786432 mapped: 63889408 old 
>>>> cache_size: 134217728 new cache size: 17349132402135320576
>>>>  -265> 2019-03-05 08:36:02.723 7f272a0f3700  5 
>>>> bluestore.MempoolThread(0x55b31af46a30) _trim_shards cache_size: 
>>>> 17349132402135320576 kv_alloc: 134217728 kv_used: 5099462 meta_alloc: 0 
>>>> meta_used: 21301 data_alloc: 0 data_used: 0
>>>> ...
>>>> 2019-03-05 08:36:40.166 7f03fc57f700  1 osd.12 pg_epoch: 7063 pg[2.93( v 
>>>> 6687'5 (0'0,6687'5] local-lis/les=7015/7016 n=1 ec=103/103 lis/c 7015/7015 
>>>> les/c/f 7016/7016/0 7063/7063/7063) [12,19] r=0 lpr=7063 pi=[7015,7063)/1 
>>>> crt=6687'5 lcod 0'0 mlcod 0'0 unknown NOTIFY mbc={}] 
>>>> start_peering_interval up [19] -> [12,19], acting [19] -> [12,19], 
>>>> acting_primary 19 -> 12, up_primary 19 -> 12, role -1 -> 0, features 
>>>> acting 4611087854031142907 upacting 4611087854031142907
>>>> 2019-03-05 08:36:40.167 7f03fc57f700  1 osd.12 pg_epoch: 7063 pg[2.93( v 
>>>> 6687'5 (0'0,6687'5] local-lis/les=7015/7016 n=1 ec=103/103 lis/c 7015/7015 
>>>> les/c/f 7016/7016/0 7063/7063/7063) [12,19] r=0 lpr=7063 pi=[7015,7063)/1 
>>>> crt=6687'5 lcod 0'0 mlcod 0'0 unknown mbc={}] state<Start>: transitioning 
>>>> to Primary
>>>> 2019-03-05 08:36:40.167 7f03fb57d700  1 osd.12 pg_epoch: 7061 pg[2.40( v 
>>>> 6964'703 (0'0,6964'703] local-lis/les=6999/7000 n=1 ec=103/103 lis/c 
>>>> 6999/6999 les/c/f 7000/7000/0 7061/7061/6999) [8] r=-1 lpr=7061 
>>>> pi=[6999,7061)/1 crt=6964'703 lcod 0'0 unknown mbc={}] 
>>>> start_peering_interval up [8,12] -> [8], acting [8,12] -> [8], 
>>>> acting_primary 8 -> 8, up_primary 8 -> 8, role 1 -> -1, features acting 
>>>> 4611087854031142907 upacting 4611087854031142907
>>>>   1/ 5 heartbeatmap
>>>>   1/ 5 perfcounter
>>>>   1/ 5 rgw
>>>>   1/ 5 rgw_sync
>>>>   1/10 civetweb
>>>>   1/ 5 javaclient
>>>>   1/ 5 asok
>>>>   1/ 1 throttle
>>>>   0/ 0 refs
>>>>   1/ 5 xio
>>>>   1/ 5 compressor
>>>>   1/ 5 bluestore
>>>>   1/ 5 bluefs
>>>>   1/ 3 bdev
>>>>   1/ 5 kstore
>>>>   4/ 5 rocksdb
>>>>   4/ 5 leveldb
>>>>   4/ 5 memdb
>>>>   1/ 5 kinetic
>>>>   1/ 5 fuse
>>>>   1/ 5 mgr
>>>>   1/ 5 mgrc
>>>>   1/ 5 dpdk
>>>>   1/ 5 eventtrace
>>>>  -2/-2 (syslog threshold)
>>>>  -1/-1 (stderr threshold)
>>>>  max_recent     10000
>>>>  max_new         1000
>>>>  log_file /var/log/ceph/ceph-osd.12.log
>>>> --- end dump of recent events ---
>>>> 
>>>> 2019-03-05 08:36:07.750 7f272a0f3700 -1 *** Caught signal (Aborted) **
>>>> in thread 7f272a0f3700 thread_name:bstore_mempool
>>>> 
>>>> ceph version 13.2.4 (b10be4d44915a4d78a8e06aa31919e74927b142e) mimic 
>>>> (stable)
>>>> 1: (()+0x911e70) [0x55b318337e70]
>>>> 2: (()+0xf5d0) [0x7f2737a4e5d0]
>>>> 3: (gsignal()+0x37) [0x7f2736a6f207]
>>>> 4: (abort()+0x148) [0x7f2736a708f8]
>>>> 5: (ceph::__ceph_assert_fail(char const*, char const*, int, char 
>>>> const*)+0x242) [0x7f273aec62b2]
>>>> 6: (()+0x25a337) [0x7f273aec6337]
>>>> 7: (()+0x7a886e) [0x55b3181ce86e]
>>>> 8: (BlueStore::MempoolThread::entry()+0x3b0) [0x55b3181d0060]
>>>> 9: (()+0x7dd5) [0x7f2737a46dd5]
>>>> 10: (clone()+0x6d) [0x7f2736b36ead]
>>>> NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed 
>>>> to interpret this.
>>>> 
>>>> 
>>>> Even without the 'osd memory target' conf key, the OSD claims on start:
>>>> 
>>>> bluestore(/var/lib/ceph/osd/ceph-12) _set_cache_sizes cache_size 1073741824
>>>> 
>>>> Any hints appreciated!
>>>> 
>>>> /Steffen
>>>> 
>>>> 
>>>> Paul
>>>> 
>>>> --
>>>> Paul Emmerich
>>>> 
>>>> Looking for help with your Ceph cluster? Contact us at https://croit.io
>>>> 
>>>> croit GmbH
>>>> Freseniusstr. 31h
>>>> 81247 München
>>>> www.croit.io
>>>> Tel: +49 89 1896585 90
>>>> 
>>>> On Mon, Mar 4, 2019 at 3:55 PM Steffen Winther Sørensen
>>>> <ste...@gmail.com> wrote:
>>>> 
>>>> 
>>>> List Members,
>>>> 
>>>> Patched a CentOS 7 based cluster from 13.2.2 to 13.2.4 last Monday; 
>>>> everything appeared to be working fine.
>>>> 
>>>> Only this morning I found all OSDs in the cluster bloated in memory 
>>>> footprint, possibly after the weekend backup through the MDS.
>>>> 
>>>> Anyone else seeing a possible memory leak in 13.2.4 OSDs, primarily 
>>>> when using the MDS?
>>>> 
>>>> TIA
>>>> 
>>>> /Steffen
>>>> 
>>>> 
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
