I mean:

ceph tell mds.x heap start_profiler
... wait for some time
ceph tell mds.x heap stop_profiler
pprof --text /usr/bin/ceph-mds /var/log/ceph/ceph-mds.x.profile.<largest number>.heap

On Tue, Jul 24, 2018 at 3:18 PM Daniel Carrasco <d.carra...@i2tic.com> wrote:
>
> This is what i get:
>
> --------------------------------------------------------
> --------------------------------------------------------
> --------------------------------------------------------
> :/# ceph tell mds.kavehome-mgto-pro-fs01 heap dump
> 2018-07-24 09:05:19.350720 7fc562ffd700 0 client.1452545 ms_handle_reset on 10.22.0.168:6800/1685786126
> 2018-07-24 09:05:29.103903 7fc563fff700 0 client.1452548 ms_handle_reset on 10.22.0.168:6800/1685786126
> mds.kavehome-mgto-pro-fs01 dumping heap profile now.
> ------------------------------------------------
> MALLOC:      760199640 (  725.0 MiB) Bytes in use by application
> MALLOC: +            0 (    0.0 MiB) Bytes in page heap freelist
> MALLOC: +    246962320 (  235.5 MiB) Bytes in central cache freelist
> MALLOC: +     43933664 (   41.9 MiB) Bytes in transfer cache freelist
> MALLOC: +     41012664 (   39.1 MiB) Bytes in thread cache freelists
> MALLOC: +     10186912 (    9.7 MiB) Bytes in malloc metadata
> MALLOC:   ------------
> MALLOC: =   1102295200 ( 1051.2 MiB) Actual memory used (physical + swap)
> MALLOC: +   4268335104 ( 4070.6 MiB) Bytes released to OS (aka unmapped)
> MALLOC:   ------------
> MALLOC: =   5370630304 ( 5121.8 MiB) Virtual address space used
> MALLOC:
> MALLOC:          33027              Spans in use
> MALLOC:             19              Thread heaps in use
> MALLOC:           8192              Tcmalloc page size
> ------------------------------------------------
> Call ReleaseFreeMemory() to release freelist memory to the OS (via madvise()).
> Bytes released to the OS take up virtual address space but no physical memory.
>
>
> --------------------------------------------------------
> --------------------------------------------------------
> --------------------------------------------------------
> :/# ceph tell mds.kavehome-mgto-pro-fs01 heap stats
> 2018-07-24 09:14:25.747706 7f94fffff700 0 client.1452578 ms_handle_reset on 10.22.0.168:6800/1685786126
> 2018-07-24 09:14:25.754034 7f95057fa700 0 client.1452581 ms_handle_reset on 10.22.0.168:6800/1685786126
> mds.kavehome-mgto-pro-fs01 tcmalloc heap stats:------------------------------------------------
> MALLOC:      960649328 (  916.1 MiB) Bytes in use by application
> MALLOC: +            0 (    0.0 MiB) Bytes in page heap freelist
> MALLOC: +    108867288 (  103.8 MiB) Bytes in central cache freelist
> MALLOC: +     37179424 (   35.5 MiB) Bytes in transfer cache freelist
> MALLOC: +     40143000 (   38.3 MiB) Bytes in thread cache freelists
> MALLOC: +     10186912 (    9.7 MiB) Bytes in malloc metadata
> MALLOC:   ------------
> MALLOC: =   1157025952 ( 1103.4 MiB) Actual memory used (physical + swap)
> MALLOC: +   4213604352 ( 4018.4 MiB) Bytes released to OS (aka unmapped)
> MALLOC:   ------------
> MALLOC: =   5370630304 ( 5121.8 MiB) Virtual address space used
> MALLOC:
> MALLOC:          33028              Spans in use
> MALLOC:             19              Thread heaps in use
> MALLOC:           8192              Tcmalloc page size
> ------------------------------------------------
> Call ReleaseFreeMemory() to release freelist memory to the OS (via madvise()).
> Bytes released to the OS take up virtual address space but no physical memory.
>
> --------------------------------------------------------
> --------------------------------------------------------
> --------------------------------------------------------
> After heap release:
> :/# ceph tell mds.kavehome-mgto-pro-fs01 heap stats
> 2018-07-24 09:15:28.540203 7f2f7affd700 0 client.1443339 ms_handle_reset on 10.22.0.168:6800/1685786126
> 2018-07-24 09:15:28.547153 7f2f7bfff700 0 client.1443342 ms_handle_reset on 10.22.0.168:6800/1685786126
> mds.kavehome-mgto-pro-fs01 tcmalloc heap stats:------------------------------------------------
> MALLOC:      710315776 (  677.4 MiB) Bytes in use by application
> MALLOC: +            0 (    0.0 MiB) Bytes in page heap freelist
> MALLOC: +    246471880 (  235.1 MiB) Bytes in central cache freelist
> MALLOC: +     40802848 (   38.9 MiB) Bytes in transfer cache freelist
> MALLOC: +     38689304 (   36.9 MiB) Bytes in thread cache freelists
> MALLOC: +     10186912 (    9.7 MiB) Bytes in malloc metadata
> MALLOC:   ------------
> MALLOC: =   1046466720 (  998.0 MiB) Actual memory used (physical + swap)
> MALLOC: +   4324163584 ( 4123.8 MiB) Bytes released to OS (aka unmapped)
> MALLOC:   ------------
> MALLOC: =   5370630304 ( 5121.8 MiB) Virtual address space used
> MALLOC:
> MALLOC:          33177              Spans in use
> MALLOC:             19              Thread heaps in use
> MALLOC:           8192              Tcmalloc page size
> ------------------------------------------------
> Call ReleaseFreeMemory() to release freelist memory to the OS (via madvise()).
> Bytes released to the OS take up virtual address space but no physical memory.
>
>
> The other commands fail with a curl error:
> Failed to get profile: curl 'http:///pprof/profile?seconds=30' > /root/pprof/.tmp.ceph-mds.1532416424.:
>
>
> Greetings!!
>
> 2018-07-24 5:35 GMT+02:00 Yan, Zheng <uker...@gmail.com>:
>>
>> Could you profile the memory allocation of the MDS?
>>
>> http://docs.ceph.com/docs/mimic/rados/troubleshooting/memory-profiling/
>> On Tue, Jul 24, 2018 at 7:54 AM Daniel Carrasco <d.carra...@i2tic.com> wrote:
>> >
>> > Yeah, that is also my thread.
>> > That thread was created before I lowered the cache size from
>> > 512 MB to 8 MB. I thought that maybe it was my fault and I had
>> > misconfigured something, so I've ignored the problem until now.
>> >
>> > Greetings!
>> >
>> > On Tue, Jul 24, 2018 at 1:00, Gregory Farnum <gfar...@redhat.com> wrote:
>> >>
>> >> On Mon, Jul 23, 2018 at 11:08 AM Patrick Donnelly <pdonn...@redhat.com> wrote:
>> >>>
>> >>> On Mon, Jul 23, 2018 at 5:48 AM, Daniel Carrasco <d.carra...@i2tic.com> wrote:
>> >>> > Hi, thanks for your response.
>> >>> >
>> >>> > There are about 6 clients, and 4 of them are on standby most of the
>> >>> > time. Only two are active servers serving the webpage. We also have
>> >>> > Varnish in front, so they are not getting all the load (below 30% in
>> >>> > PHP is not much).
>> >>> > About the MDS cache, I now have mds_cache_memory_limit at 8 MB.
>> >>>
>> >>> What! Please post `ceph daemon mds.<name> config diff`, `... perf
>> >>> dump`, and `... dump_mempools` from the server the active MDS is on.
>> >>>
>> >>> > I've also tested 512 MB, but the CPU usage is the same and the MDS
>> >>> > RAM usage grows up to 15 GB (on a 16 GB server it starts to swap and
>> >>> > everything fails). With 8 MB, at least the memory usage is stable at
>> >>> > less than 6 GB (it is now using about 1 GB of RAM).
>> >>>
>> >>> We've seen reports of possible memory leaks before and the potential
>> >>> fixes for those were in 12.2.6. How fast does your MDS reach 15GB?
>> >>> Your MDS cache size should be configured to 1-8GB (depending on your
>> >>> preference) so it's disturbing to see you set it so low.
>> >>
>> >>
>> >> See also the thread "[ceph-users] Fwd: MDS memory usage is very high",
>> >> which had more discussion of that. The MDS daemon seemingly had 9.5GB of
>> >> allocated RSS but only believed 489MB was in use for the cache...
>> >> -Greg
>> >>
>> >>>
>> >>>
>> >>> --
>> >>> Patrick Donnelly
>> >>> _______________________________________________
>> >>> ceph-users mailing list
>> >>> ceph-users@lists.ceph.com
>> >>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>
> --
> _________________________________________
>
> Daniel Carrasco Marín
> Ingeniería para la Innovación i2TIC, S.L.
> Tlf: +34 911 12 32 84 Ext: 223
> www.i2tic.com
> _________________________________________
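
A note on reading the tcmalloc heap stats quoted in this thread: the "Actual memory used" line is the sum of the bytes in use by the application, the four freelist lines, and the malloc metadata. The following is a minimal Python sketch of that arithmetic, assuming the output format shown in the quoted messages. The names parse_tcmalloc_stats, summarize, and LINE_RE are illustrative only and are not part of Ceph or gperftools; the sample is the "after heap release" output from this thread.

```python
import re

# Sample copied from the "after heap release" stats quoted in this thread.
SAMPLE = """\
MALLOC:      710315776 (  677.4 MiB) Bytes in use by application
MALLOC: +            0 (    0.0 MiB) Bytes in page heap freelist
MALLOC: +    246471880 (  235.1 MiB) Bytes in central cache freelist
MALLOC: +     40802848 (   38.9 MiB) Bytes in transfer cache freelist
MALLOC: +     38689304 (   36.9 MiB) Bytes in thread cache freelists
MALLOC: +     10186912 (    9.7 MiB) Bytes in malloc metadata
MALLOC:   ------------
MALLOC: =   1046466720 (  998.0 MiB) Actual memory used (physical + swap)
MALLOC: +   4324163584 ( 4123.8 MiB) Bytes released to OS (aka unmapped)
MALLOC:   ------------
MALLOC: =   5370630304 ( 5121.8 MiB) Virtual address space used
"""

# Assumed line shape: optional "+" or "=", a byte count, "( N MiB)", a label.
LINE_RE = re.compile(r"^MALLOC:\s*[+=]?\s*(\d+)\s+\(\s*[\d.]+ MiB\)\s+(.+?)\s*$")


def parse_tcmalloc_stats(text):
    """Return {label: bytes} for every MALLOC line that carries a MiB figure."""
    stats = {}
    for line in text.splitlines():
        # Tolerate mail quote prefixes such as "> " in front of each line.
        match = LINE_RE.match(line.lstrip("> "))
        if match:
            stats[match.group(2)] = int(match.group(1))
    return stats


def summarize(stats):
    """Split resident memory into application bytes and allocator overhead."""
    in_use = stats["Bytes in use by application"]
    freelists = sum(v for k, v in stats.items() if "freelist" in k)
    metadata = stats["Bytes in malloc metadata"]
    return {
        "in_use": in_use,
        "freelists": freelists,
        "metadata": metadata,
        "computed_total": in_use + freelists + metadata,
        "reported_total": stats["Actual memory used (physical + swap)"],
    }


if __name__ == "__main__":
    summary = summarize(parse_tcmalloc_stats(SAMPLE))
    for name, value in summary.items():
        print(f"{name:>15}: {value / 2**20:8.1f} MiB")
```

For this sample the freelists add up to roughly 311 MiB of the 998 MiB reported as actually used, and the computed total matches the reported one. The same function can be pointed at the other two dumps in the thread to compare them.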