Hi Greg,

I've been looking at the tcmalloc issues, but those seemed to affect OSDs, and I do notice them in heavy read workloads (even after the patch and after increasing TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES to 134217728, i.e. 128 MiB). This one is affecting the mon process, though.

Looking at perf top, I'm getting most of the CPU usage in mutex lock/unlock:

  5.02%  libpthread-2.19.so  [.] pthread_mutex_unlock
  3.82%  libsoftokn3.so      [.] 0x000000000001e7cb
  3.46%  libpthread-2.19.so  [.] pthread_mutex_lock

I could try to use jemalloc, are you aware of any built binaries? Can I mix a cluster with different malloc binaries?

On Thu, Jul 23, 2015 at 10:50 AM, Gregory Farnum <g...@gregs42.com> wrote:
> On Thu, Jul 23, 2015 at 8:39 AM, Luis Periquito <periqu...@gmail.com> wrote:
> > The ceph-mon is already taking a lot of memory, and I ran a heap stats
> > ------------------------------------------------
> > MALLOC:       32391696 (   30.9 MiB) Bytes in use by application
> > MALLOC: +  27597135872 (26318.7 MiB) Bytes in page heap freelist
> > MALLOC: +     16598552 (   15.8 MiB) Bytes in central cache freelist
> > MALLOC: +     14693536 (   14.0 MiB) Bytes in transfer cache freelist
> > MALLOC: +     17441592 (   16.6 MiB) Bytes in thread cache freelists
> > MALLOC: +    116387992 (  111.0 MiB) Bytes in malloc metadata
> > MALLOC:   ------------
> > MALLOC: =  27794649240 (26507.0 MiB) Actual memory used (physical + swap)
> > MALLOC: +     26116096 (   24.9 MiB) Bytes released to OS (aka unmapped)
> > MALLOC:   ------------
> > MALLOC: =  27820765336 (26531.9 MiB) Virtual address space used
> > MALLOC:
> > MALLOC:           5683              Spans in use
> > MALLOC:             21              Thread heaps in use
> > MALLOC:           8192              Tcmalloc page size
> > ------------------------------------------------
> >
> > after that I ran the heap release and it went back to normal.
> > ------------------------------------------------
> > MALLOC:       22919616 (   21.9 MiB) Bytes in use by application
> > MALLOC: +      4792320 (    4.6 MiB) Bytes in page heap freelist
> > MALLOC: +     18743448 (   17.9 MiB) Bytes in central cache freelist
> > MALLOC: +     20645776 (   19.7 MiB) Bytes in transfer cache freelist
> > MALLOC: +     18456088 (   17.6 MiB) Bytes in thread cache freelists
> > MALLOC: +    116387992 (  111.0 MiB) Bytes in malloc metadata
> > MALLOC:   ------------
> > MALLOC: =    201945240 (  192.6 MiB) Actual memory used (physical + swap)
> > MALLOC: +  27618820096 (26339.4 MiB) Bytes released to OS (aka unmapped)
> > MALLOC:   ------------
> > MALLOC: =  27820765336 (26531.9 MiB) Virtual address space used
> > MALLOC:
> > MALLOC:           5639              Spans in use
> > MALLOC:             29              Thread heaps in use
> > MALLOC:           8192              Tcmalloc page size
> > ------------------------------------------------
> >
> > So it just seems the monitor is not returning unused memory into the OS or
> > reusing already allocated memory it deems as free...
>
> Yep. This is a bug (best we can tell) in some versions of tcmalloc
> combined with certain distribution stacks, although I don't think
> we've seen it reported on Trusty (nor on a tcmalloc distribution that
> new) before. Alternatively some folks are seeing tcmalloc use up lots
> of CPU in other scenarios involving memory return and it may manifest
> like this, but I'm not sure. You could look through the mailing list
> for information on it.
> -Greg
>
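
For anyone who wants to compare the two heap dumps quoted above without doing the arithmetic by hand, here is a rough Python sketch. It is not part of Ceph or gperftools: the helper names (parse_heap_stats, freelist_fraction) are made up for illustration, and the regular expression simply assumes the line layout shown in the dumps. It computes what share of the memory tcmalloc is holding sits in the page heap freelist, i.e. memory that is free but has not been handed back to the OS.

```python
import re

# Assumed layout: "MALLOC: [+|=] <bytes> (<n> MiB) <label>", optionally
# preceded by "> " quote markers. Separator and count lines do not match.
LINE_RE = re.compile(r"MALLOC:\s*[+=]?\s*(\d+)\s*\(\s*[\d.]+\s*MiB\)\s*(.+)")


def parse_heap_stats(text):
    """Return {label: bytes} for every byte-count line in a heap stats dump."""
    stats = {}
    for line in text.splitlines():
        match = LINE_RE.search(line)
        if match:
            stats[match.group(2).strip()] = int(match.group(1))
    return stats


def freelist_fraction(stats):
    """Share of 'actual memory used' that is sitting in the page heap freelist."""
    used = stats["Actual memory used (physical + swap)"]
    return stats["Bytes in page heap freelist"] / used


# Byte-count lines copied from the two dumps in this thread.
BEFORE = """\
> > MALLOC:       32391696 (   30.9 MiB) Bytes in use by application
> > MALLOC: +  27597135872 (26318.7 MiB) Bytes in page heap freelist
> > MALLOC: +     16598552 (   15.8 MiB) Bytes in central cache freelist
> > MALLOC: +     14693536 (   14.0 MiB) Bytes in transfer cache freelist
> > MALLOC: +     17441592 (   16.6 MiB) Bytes in thread cache freelists
> > MALLOC: +    116387992 (  111.0 MiB) Bytes in malloc metadata
> > MALLOC: =  27794649240 (26507.0 MiB) Actual memory used (physical + swap)
"""

AFTER = """\
> > MALLOC:       22919616 (   21.9 MiB) Bytes in use by application
> > MALLOC: +      4792320 (    4.6 MiB) Bytes in page heap freelist
> > MALLOC: +     18743448 (   17.9 MiB) Bytes in central cache freelist
> > MALLOC: +     20645776 (   19.7 MiB) Bytes in transfer cache freelist
> > MALLOC: +     18456088 (   17.6 MiB) Bytes in thread cache freelists
> > MALLOC: +    116387992 (  111.0 MiB) Bytes in malloc metadata
> > MALLOC: =    201945240 (  192.6 MiB) Actual memory used (physical + swap)
"""

if __name__ == "__main__":
    for name, dump in (("before release", BEFORE), ("after release", AFTER)):
        stats = parse_heap_stats(dump)
        print(f"{name}: {freelist_fraction(stats):.1%} of used memory is in the page heap freelist")
```

On these numbers the page heap freelist accounts for roughly 99% of the memory in use before the heap release and roughly 2% afterwards, which is consistent with the description above: the application itself only needs a few tens of MiB, and the rest was free memory that tcmalloc had not returned.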
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com