On Tue, Jul 24, 2018 at 4:59 PM Daniel Carrasco <d.carra...@i2tic.com> wrote:
>
> Hello,
>
> How much time is necessary? This is a production environment, and the memory
> profiler plus the low cache size (set because of the problem) cause a lot of
> CPU usage on the OSD and MDS, which makes them fail while the profiler is
> running. Is there any problem if it is done at a low-traffic time? (Less
> usage, so maybe it doesn't fail, but maybe also less info about the usage.)
>
Just one time; wait a few minutes between start_profiler and stop_profiler.

> Greetings!
>
> 2018-07-24 10:21 GMT+02:00 Yan, Zheng <uker...@gmail.com>:
>>
>> I mean:
>>
>> ceph tell mds.x heap start_profiler
>>
>> ... wait for some time
>>
>> ceph tell mds.x heap stop_profiler
>>
>> pprof --text /usr/bin/ceph-mds /var/log/ceph/ceph-mds.x.profile.<largest number>.heap
>>
>> On Tue, Jul 24, 2018 at 3:18 PM Daniel Carrasco <d.carra...@i2tic.com> wrote:
>> >
>> > This is what I get:
>> >
>> > --------------------------------------------------------
>> > :/# ceph tell mds.kavehome-mgto-pro-fs01 heap dump
>> > 2018-07-24 09:05:19.350720 7fc562ffd700  0 client.1452545 ms_handle_reset on 10.22.0.168:6800/1685786126
>> > 2018-07-24 09:05:29.103903 7fc563fff700  0 client.1452548 ms_handle_reset on 10.22.0.168:6800/1685786126
>> > mds.kavehome-mgto-pro-fs01 dumping heap profile now.
>> > ------------------------------------------------
>> > MALLOC:       760199640 (  725.0 MiB) Bytes in use by application
>> > MALLOC: +             0 (    0.0 MiB) Bytes in page heap freelist
>> > MALLOC: +     246962320 (  235.5 MiB) Bytes in central cache freelist
>> > MALLOC: +      43933664 (   41.9 MiB) Bytes in transfer cache freelist
>> > MALLOC: +      41012664 (   39.1 MiB) Bytes in thread cache freelists
>> > MALLOC: +      10186912 (    9.7 MiB) Bytes in malloc metadata
>> > MALLOC:   ------------
>> > MALLOC: =    1102295200 ( 1051.2 MiB) Actual memory used (physical + swap)
>> > MALLOC: +    4268335104 ( 4070.6 MiB) Bytes released to OS (aka unmapped)
>> > MALLOC:   ------------
>> > MALLOC: =    5370630304 ( 5121.8 MiB) Virtual address space used
>> > MALLOC:
>> > MALLOC:           33027 Spans in use
>> > MALLOC:              19 Thread heaps in use
>> > MALLOC:            8192 Tcmalloc page size
>> > ------------------------------------------------
>> > Call ReleaseFreeMemory() to release freelist memory to the OS (via madvise()).
>> > Bytes released to the OS take up virtual address space but no physical memory.
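(A quick sanity check on the dump above, since the tcmalloc accounting can be
confusing: bytes in use by the application plus the freelists and malloc
metadata give the "Actual memory used" figure, and adding the bytes released
to the OS gives the "Virtual address space used" figure. The shell arithmetic
below just reproduces that with the numbers copied from the dump:)

  # application bytes + freelists + malloc metadata = "Actual memory used"
  echo $(( 760199640 + 0 + 246962320 + 43933664 + 41012664 + 10186912 ))   # 1102295200
  # actual memory used + bytes released to the OS = "Virtual address space used"
  echo $(( 1102295200 + 4268335104 ))                                      # 5370630304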
>> >
>> > --------------------------------------------------------
>> > :/# ceph tell mds.kavehome-mgto-pro-fs01 heap stats
>> > 2018-07-24 09:14:25.747706 7f94fffff700  0 client.1452578 ms_handle_reset on 10.22.0.168:6800/1685786126
>> > 2018-07-24 09:14:25.754034 7f95057fa700  0 client.1452581 ms_handle_reset on 10.22.0.168:6800/1685786126
>> > mds.kavehome-mgto-pro-fs01 tcmalloc heap stats:------------------------------------------------
>> > MALLOC:       960649328 (  916.1 MiB) Bytes in use by application
>> > MALLOC: +             0 (    0.0 MiB) Bytes in page heap freelist
>> > MALLOC: +     108867288 (  103.8 MiB) Bytes in central cache freelist
>> > MALLOC: +      37179424 (   35.5 MiB) Bytes in transfer cache freelist
>> > MALLOC: +      40143000 (   38.3 MiB) Bytes in thread cache freelists
>> > MALLOC: +      10186912 (    9.7 MiB) Bytes in malloc metadata
>> > MALLOC:   ------------
>> > MALLOC: =    1157025952 ( 1103.4 MiB) Actual memory used (physical + swap)
>> > MALLOC: +    4213604352 ( 4018.4 MiB) Bytes released to OS (aka unmapped)
>> > MALLOC:   ------------
>> > MALLOC: =    5370630304 ( 5121.8 MiB) Virtual address space used
>> > MALLOC:
>> > MALLOC:           33028 Spans in use
>> > MALLOC:              19 Thread heaps in use
>> > MALLOC:            8192 Tcmalloc page size
>> > ------------------------------------------------
>> > Call ReleaseFreeMemory() to release freelist memory to the OS (via madvise()).
>> > Bytes released to the OS take up virtual address space but no physical memory.
>> >
>> > --------------------------------------------------------
>> > After heap release:
>> > :/# ceph tell mds.kavehome-mgto-pro-fs01 heap stats
>> > 2018-07-24 09:15:28.540203 7f2f7affd700  0 client.1443339 ms_handle_reset on 10.22.0.168:6800/1685786126
>> > 2018-07-24 09:15:28.547153 7f2f7bfff700  0 client.1443342 ms_handle_reset on 10.22.0.168:6800/1685786126
>> > mds.kavehome-mgto-pro-fs01 tcmalloc heap stats:------------------------------------------------
>> > MALLOC:       710315776 (  677.4 MiB) Bytes in use by application
>> > MALLOC: +             0 (    0.0 MiB) Bytes in page heap freelist
>> > MALLOC: +     246471880 (  235.1 MiB) Bytes in central cache freelist
>> > MALLOC: +      40802848 (   38.9 MiB) Bytes in transfer cache freelist
>> > MALLOC: +      38689304 (   36.9 MiB) Bytes in thread cache freelists
>> > MALLOC: +      10186912 (    9.7 MiB) Bytes in malloc metadata
>> > MALLOC:   ------------
>> > MALLOC: =    1046466720 (  998.0 MiB) Actual memory used (physical + swap)
>> > MALLOC: +    4324163584 ( 4123.8 MiB) Bytes released to OS (aka unmapped)
>> > MALLOC:   ------------
>> > MALLOC: =    5370630304 ( 5121.8 MiB) Virtual address space used
>> > MALLOC:
>> > MALLOC:           33177 Spans in use
>> > MALLOC:              19 Thread heaps in use
>> > MALLOC:            8192 Tcmalloc page size
>> > ------------------------------------------------
>> > Call ReleaseFreeMemory() to release freelist memory to the OS (via madvise()).
>> > Bytes released to the OS take up virtual address space but no physical memory.
>> >
>> > The other commands fail with a curl error:
>> > Failed to get profile: curl 'http:///pprof/profile?seconds=30' > /root/pprof/.tmp.ceph-mds.1532416424.:
>> >
>> > Greetings!!
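(A hedged guess about the curl failure above: the empty host in
'http:///pprof/profile?seconds=30' suggests pprof was not handed a path it
could read as a local heap file, so it tried to fetch a live profile over HTTP
instead. A minimal sketch of the offline invocation, assuming the heap files
follow the naming pattern Yan quotes earlier and sit in the default log
directory:)

  # list the heap profiles written while the profiler was running
  ls -l /var/log/ceph/ceph-mds.kavehome-mgto-pro-fs01.profile.*.heap

  # pass the newest one to pprof explicitly, together with the mds binary
  pprof --text /usr/bin/ceph-mds \
    "$(ls -t /var/log/ceph/ceph-mds.kavehome-mgto-pro-fs01.profile.*.heap | head -n 1)"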
>> >
>> > 2018-07-24 5:35 GMT+02:00 Yan, Zheng <uker...@gmail.com>:
>> >>
>> >> Could you profile the memory allocation of the MDS?
>> >>
>> >> http://docs.ceph.com/docs/mimic/rados/troubleshooting/memory-profiling/
>> >>
>> >> On Tue, Jul 24, 2018 at 7:54 AM Daniel Carrasco <d.carra...@i2tic.com> wrote:
>> >> >
>> >> > Yeah, that is also my thread. It was created before lowering the cache
>> >> > size from 512 MB to 8 MB. I thought that maybe it was my fault and I had
>> >> > made a misconfiguration, so I ignored the problem until now.
>> >> >
>> >> > Greetings!
>> >> >
>> >> > On Tue, Jul 24, 2018 at 1:00 AM Gregory Farnum <gfar...@redhat.com> wrote:
>> >> >>
>> >> >> On Mon, Jul 23, 2018 at 11:08 AM Patrick Donnelly <pdonn...@redhat.com> wrote:
>> >> >>>
>> >> >>> On Mon, Jul 23, 2018 at 5:48 AM, Daniel Carrasco <d.carra...@i2tic.com> wrote:
>> >> >>> > Hi, thanks for your response.
>> >> >>> >
>> >> >>> > There are about 6 clients, and 4 of them are on standby most of the
>> >> >>> > time. Only two are active servers that are serving the webpage. We
>> >> >>> > also have a Varnish in front, so they are not getting all the load
>> >> >>> > (below 30% in PHP is not much).
>> >> >>> > About the MDS cache, I now have mds_cache_memory_limit at 8 MB.
>> >> >>>
>> >> >>> What! Please post `ceph daemon mds.<name> config diff`, `... perf
>> >> >>> dump`, and `... dump_mempools` from the server the active MDS is on.
>> >> >>>
>> >> >>> > I've also tested 512 MB, but the CPU usage is the same and the MDS
>> >> >>> > RAM usage grows up to 15 GB (on a 16 GB server it starts to swap and
>> >> >>> > everything fails). With 8 MB, at least the memory usage is stable at
>> >> >>> > less than 6 GB (right now it is using about 1 GB of RAM).
>> >> >>>
>> >> >>> We've seen reports of possible memory leaks before, and the potential
>> >> >>> fixes for those were in 12.2.6. How fast does your MDS reach 15 GB?
>> >> >>> Your MDS cache size should be configured to 1-8 GB (depending on your
>> >> >>> preference), so it's disturbing to see you set it so low.
>> >> >>
>> >> >> See also the thread "[ceph-users] Fwd: MDS memory usage is very high",
>> >> >> which had more discussion of that. The MDS daemon seemingly had 9.5 GB
>> >> >> of allocated RSS but only believed 489 MB was in use for the cache...
>> >> >> -Greg
>> >> >>
>> >> >>> --
>> >> >>> Patrick Donnelly
>
> --
> _________________________________________
>
> Daniel Carrasco Marín
> Ingeniería para la Innovación i2TIC, S.L.
> Tlf: +34 911 12 32 84 Ext: 223
> www.i2tic.com
> _________________________________________

_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
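For completeness, the diagnostics Patrick asks for above can be gathered on
the host running the active MDS roughly as follows (a sketch; the daemon name
is taken from the heap output earlier in the thread, and the final `config
get` is only there to double-check the cache limit under discussion):

  # all of these go through the MDS admin socket, so run them where the
  # active MDS daemon lives
  ceph daemon mds.kavehome-mgto-pro-fs01 config diff
  ceph daemon mds.kavehome-mgto-pro-fs01 perf dump
  ceph daemon mds.kavehome-mgto-pro-fs01 dump_mempools

  # sanity-check the cache limit being discussed
  ceph daemon mds.kavehome-mgto-pro-fs01 config get mds_cache_memory_limit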