On Sat, Nov 9, 2013 at 7:53 AM, Mark Nelson <mark.nel...@inktank.com> wrote:

> One thing to try is to run the mon and then attach to it with perf and see
> what it's doing.  If CPU usage is high and leveldb is doing tons of
> compaction work, that could indicate that this is the same or a similar
> problem to what we were seeing back around cuttlefish.
>

I am sorry, I don't quite understand what "attach to the mon with perf"
means. Could you please elaborate on how to do it?
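
My guess, after a bit of searching, is that it means roughly the following
(assuming perf is installed on the board and the monitor there runs as
mon.c with the default data path; please correct me if I am wrong):

    # find the pid of the running monitor
    pidof ceph-mon

    # live view of where the process is spending its CPU time
    perf top -p $(pidof ceph-mon)

    # or record ~30 seconds with call graphs and inspect afterwards
    perf record -g -p $(pidof ceph-mon) -- sleep 30
    perf report

    # and, since compaction was mentioned, a rough check of the leveldb
    # store size on that monitor
    du -sh /var/lib/ceph/mon/ceph-c/store.db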


> Mark
>
>
> On 11/08/2013 04:53 PM, Gregory Farnum wrote:
>
>> Hrm, there's nothing too odd in those dumps. I asked around and it
>> sounds like the last time we saw this sort of strange memory use it
>> was a result of leveldb not being able to compact quickly enough. Joao
>> can probably help diagnose that faster than I can.
>> -Greg
>> Software Engineer #42 @ http://inktank.com | http://ceph.com
>>
>>
>> On Fri, Nov 8, 2013 at 5:00 AM, Yu Changyuan <rei...@gmail.com> wrote:
>>
>>> I tried to dump the perf counters via the admin socket, but I don't know
>>> what these numbers actually mean, or whether they have anything to do
>>> with the different memory usage between the ARM and AMD processors, so I
>>> have attached the dump logs (mon.a runs on the AMD processor, mon.c runs
>>> on an ARM processor).
>>>
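>>> (For anyone who wants to reproduce this: I believe the dump command is
>>> something like the following, assuming the default admin socket path
>>> under /var/run/ceph; the output file names are just examples.)
>>>
>>>     ceph --admin-daemon /var/run/ceph/ceph-mon.a.asok perf dump > mon.a.perf.json
>>>     ceph --admin-daemon /var/run/ceph/ceph-mon.c.asok perf dump > mon.c.perf.json
>>>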
>>> PS: after days of running (mon.b for 6 days, mon.c for 3 days), the
>>> memory consumption of both monitors running on the ARM boards has become
>>> stable at somewhere around 600 MB. Here are the heap stats:
>>>
>>> mon.b tcmalloc heap stats:------------------------------------------------
>>> MALLOC:      594258992 (  566.7 MiB) Bytes in use by application
>>> MALLOC: +     19529728 (   18.6 MiB) Bytes in page heap freelist
>>> MALLOC: +      3885120 (    3.7 MiB) Bytes in central cache freelist
>>> MALLOC: +      6486528 (    6.2 MiB) Bytes in transfer cache freelist
>>> MALLOC: +     12202384 (   11.6 MiB) Bytes in thread cache freelists
>>> MALLOC: +      2889952 (    2.8 MiB) Bytes in malloc metadata
>>> MALLOC:   ------------
>>> MALLOC: =    639252704 (  609.6 MiB) Actual memory used (physical + swap)
>>> MALLOC: +       122880 (    0.1 MiB) Bytes released to OS (aka unmapped)
>>> MALLOC:   ------------
>>> MALLOC: =    639375584 (  609.8 MiB) Virtual address space used
>>> MALLOC:
>>> MALLOC:          10231              Spans in use
>>> MALLOC:             24              Thread heaps in use
>>> MALLOC:           8192              Tcmalloc page size
>>> ------------------------------------------------
>>> Call ReleaseFreeMemory() to release freelist memory to the OS (via
>>> madvise()).  Bytes released to the OS take up virtual address space but
>>> no physical memory.
>>>
>>> mon.c tcmalloc heap stats:------------------------------------------------
>>> MALLOC:      593987584 (  566.5 MiB) Bytes in use by application
>>> MALLOC: +     23969792 (   22.9 MiB) Bytes in page heap freelist
>>> MALLOC: +      2172640 (    2.1 MiB) Bytes in central cache freelist
>>> MALLOC: +      5874688 (    5.6 MiB) Bytes in transfer cache freelist
>>> MALLOC: +      9268512 (    8.8 MiB) Bytes in thread cache freelists
>>> MALLOC: +      2889952 (    2.8 MiB) Bytes in malloc metadata
>>> MALLOC:   ------------
>>> MALLOC: =    638163168 (  608.6 MiB) Actual memory used (physical + swap)
>>> MALLOC: +       163840 (    0.2 MiB) Bytes released to OS (aka unmapped)
>>> MALLOC:   ------------
>>> MALLOC: =    638327008 (  608.8 MiB) Virtual address space used
>>> MALLOC:
>>> MALLOC:           9796              Spans in use
>>> MALLOC:             14              Thread heaps in use
>>> MALLOC:           8192              Tcmalloc page size
>>> ------------------------------------------------
>>> Call ReleaseFreeMemory() to release freelist memory to the OS (via
>>> madvise()).  Bytes released to the OS take up virtual address space but
>>> no physical memory.
>>>
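>>> Almost all of the ~600 MB is counted as "Bytes in use by application"
>>> rather than freelist, so I assume a release would only give back the few
>>> tens of MB shown in the freelists above, but for reference I believe the
>>> command to hand freelist memory back to the OS is something like:
>>>
>>>     ceph tell mon.b heap release
>>>     ceph tell mon.c heap release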
>>>
>>>
>>>
>>> On Fri, Nov 8, 2013 at 12:03 PM, Gregory Farnum <g...@inktank.com>
>>> wrote:
>>>
>>>>
>>>> I don't think this is anything we've observed before. Normally when a
>>>> Ceph node is using more memory than its peers it's a consequence of
>>>> something in that node getting backed up. You might try looking at the
>>>> perf counters via the admin socket and seeing if something about them
>>>> is different between your ARM and AMD processors.
>>>> -Greg
>>>> Software Engineer #42 @ http://inktank.com | http://ceph.com
>>>>
>>>>
>>>> On Tue, Nov 5, 2013 at 7:21 AM, Yu Changyuan <rei...@gmail.com> wrote:
>>>>
>>>>> Finally, my tiny ceph cluster has 3 monitors: the newly added mon.b and
>>>>> mon.c both run on a Cubieboard2, which is cheap but still has enough
>>>>> CPU power (dual-core ARM A7, 1.2 GHz) and memory (1 GB).
>>>>>
>>>>> But compared to mon.a, which runs on an amd64 CPU, both mon.b and mon.c
>>>>> consume much more memory, so I want to know whether this is caused by a
>>>>> memory leak. Below is the output of 'ceph tell mon.a heap stats' and
>>>>> 'ceph tell mon.c heap stats' (mon.c was started only 12 hours ago,
>>>>> while mon.a has been running for more than 10 days):
>>>>>
>>>>> mon.a tcmalloc heap stats:------------------------------------------------
>>>>> MALLOC:        5480160 (    5.2 MiB) Bytes in use by application
>>>>> MALLOC: +     28065792 (   26.8 MiB) Bytes in page heap freelist
>>>>> MALLOC: +     15242312 (   14.5 MiB) Bytes in central cache freelist
>>>>> MALLOC: +     10116608 (    9.6 MiB) Bytes in transfer cache freelist
>>>>> MALLOC: +     10432216 (    9.9 MiB) Bytes in thread cache freelists
>>>>> MALLOC: +      1667224 (    1.6 MiB) Bytes in malloc metadata
>>>>> MALLOC:   ------------
>>>>> MALLOC: =     71004312 (   67.7 MiB) Actual memory used (physical + swap)
>>>>> MALLOC: +     57540608 (   54.9 MiB) Bytes released to OS (aka unmapped)
>>>>> MALLOC:   ------------
>>>>> MALLOC: =    128544920 (  122.6 MiB) Virtual address space used
>>>>> MALLOC:
>>>>> MALLOC:           4655              Spans in use
>>>>> MALLOC:             34              Thread heaps in use
>>>>> MALLOC:           8192              Tcmalloc page size
>>>>> ------------------------------------------------
>>>>> Call ReleaseFreeMemory() to release freelist memory to the OS (via
>>>>> madvise()).  Bytes released to the OS take up virtual address space but
>>>>> no physical memory.
>>>>>
>>>>>
>>>>> mon.c tcmalloc heap stats:------------------------------------------------
>>>>> MALLOC:      175861640 (  167.7 MiB) Bytes in use by application
>>>>> MALLOC: +      2220032 (    2.1 MiB) Bytes in page heap freelist
>>>>> MALLOC: +      1007560 (    1.0 MiB) Bytes in central cache freelist
>>>>> MALLOC: +      2871296 (    2.7 MiB) Bytes in transfer cache freelist
>>>>> MALLOC: +      4686000 (    4.5 MiB) Bytes in thread cache freelists
>>>>> MALLOC: +      2758880 (    2.6 MiB) Bytes in malloc metadata
>>>>> MALLOC:   ------------
>>>>> MALLOC: =    189405408 (  180.6 MiB) Actual memory used (physical + swap)
>>>>> MALLOC: +            0 (    0.0 MiB) Bytes released to OS (aka unmapped)
>>>>> MALLOC:   ------------
>>>>> MALLOC: =    189405408 (  180.6 MiB) Virtual address space used
>>>>> MALLOC:
>>>>> MALLOC:           3445              Spans in use
>>>>> MALLOC:             14              Thread heaps in use
>>>>> MALLOC:           8192              Tcmalloc page size
>>>>> ------------------------------------------------
>>>>> Call ReleaseFreeMemory() to release freelist memory to the OS (via
>>>>> madvise()).  Bytes released to the OS take up virtual address space but
>>>>> no physical memory.
>>>>>
>>>>> The ceph version is 0.67.4, compiled with tcmalloc enabled, using
>>>>> gcc (armv7a-hardfloat-linux-gnueabi-gcc) 4.7.3. I also tried to dump
>>>>> the heap, but I cannot find anything useful in it. Below is a recent
>>>>> dump, produced by the command "pprof --text /usr/bin/ceph-mon
>>>>> mon.c.profile.0021.heap". What extra steps should I take to make the
>>>>> dump more meaningful?
>>>>>
>>>>> Using local file /usr/bin/ceph-mon.
>>>>> Using local file mon.c.profile.0021.heap.
>>>>> Total: 149.3 MB
>>>>>     146.2  97.9%  97.9%    146.2  97.9% 00000000b6a7ce7c
>>>>>       1.4   0.9%  98.9%      1.4   0.9% std::basic_string::_Rep::_S_create ??:0
>>>>>       1.4   0.9%  99.8%      1.4   0.9% 00000000002dd794
>>>>>       0.1   0.1%  99.9%      0.1   0.1% 00000000b6a81170
>>>>>       0.1   0.1%  99.9%      0.1   0.1% 00000000b6a80894
>>>>>       0.0   0.0% 100.0%      0.0   0.0% 00000000b6a7e2ac
>>>>>       0.0   0.0% 100.0%      0.0   0.0% 00000000b6a81410
>>>>>       0.0   0.0% 100.0%      0.0   0.0% 0000000000367450
>>>>>       0.0   0.0% 100.0%      0.0   0.0% 00000000001d4474
>>>>>       0.0   0.0% 100.0%      0.0   0.0% 000000000028847c
>>>>>       0.0   0.0% 100.0%      0.0   0.0% 00000000b6a7e8d8
>>>>>       0.0   0.0% 100.0%      0.0   0.0% 000000000020c80c
>>>>>       0.0   0.0% 100.0%      0.0   0.0% 000000000028bd20
>>>>>       0.0   0.0% 100.0%      0.0   0.0% 00000000b6a63248
>>>>>       0.0   0.0% 100.0%      0.0   0.0% 00000000b6a83478
>>>>>       0.0   0.0% 100.0%      0.0   0.0% 00000000b6a806f0
>>>>>       0.0   0.0% 100.0%      0.0   0.0% 00000000002eb8b8
>>>>>       0.0   0.0% 100.0%      0.0   0.0% 000000000024efb4
>>>>>       0.0   0.0% 100.0%      0.0   0.0% 000000000027e550
>>>>>       0.0   0.0% 100.0%      0.0   0.0% 00000000b6a77104
>>>>>       0.0   0.0% 100.0%      0.0   0.0% _dl_mcount ??:0
>>>>>       0.0   0.0% 100.0%      0.0   0.0% 00000000003673ec
>>>>>       0.0   0.0% 100.0%      0.0   0.0% 00000000b6a7a91c
>>>>>       0.0   0.0% 100.0%      0.0   0.0% 0000000000295e44
>>>>>       0.0   0.0% 100.0%      0.0   0.0% 00000000b6a7ee38
>>>>>       0.0   0.0% 100.0%      0.0   0.0% 0000000000283948
>>>>>       0.0   0.0% 100.0%      0.0   0.0% 00000000002a53c4
>>>>>       0.0   0.0% 100.0%      0.0   0.0% 00000000b6a7665c
>>>>>       0.0   0.0% 100.0%      0.0   0.0% 00000000002c4590
>>>>>       0.0   0.0% 100.0%      0.0   0.0% 00000000b6a7e88c
>>>>>       0.0   0.0% 100.0%      0.0   0.0% 00000000b6a8456c
>>>>>       0.0   0.0% 100.0%      0.0   0.0% 00000000b6a76ed4
>>>>>       0.0   0.0% 100.0%      0.0   0.0% 00000000b6a842f0
>>>>>       0.0   0.0% 100.0%      0.0   0.0% 00000000b6a72bd0
>>>>>       0.0   0.0% 100.0%      0.0   0.0% 00000000b6a73cf8
>>>>>       0.0   0.0% 100.0%      0.0   0.0% 00000000b6a7100c
>>>>>       0.0   0.0% 100.0%      0.0   0.0% 00000000b6a7dec4
>>>>>       0.0   0.0% 100.0%      0.0   0.0% 000000000035e6e8
>>>>>       0.0   0.0% 100.0%      0.0   0.0% 00000000b6a78f68
>>>>>       0.0   0.0% 100.0%      0.0   0.0% 00000000b6a7de9c
>>>>>       0.0   0.0% 100.0%      0.0   0.0% 0000000000220528
>>>>>       0.0   0.0% 100.0%      0.0   0.0% 000000000035e7c0
>>>>>       0.0   0.0% 100.0%      0.0   0.0% 00000000b6a6b2f8
>>>>>       0.0   0.0% 100.0%      0.0   0.0% 00000000b6a80a04
>>>>>       0.0   0.0% 100.0%      0.0   0.0% 00000000b6a62e7c
>>>>>       0.0   0.0% 100.0%      0.0   0.0% 00000000b6a66f50
>>>>>       0.0   0.0% 100.0%      0.0   0.0% 00000000b6a7e958
>>>>>       0.0   0.0% 100.0%      0.0   0.0% 00000000b6a6cfb8
>>>>>       0.0   0.0% 100.0%      0.0   0.0% leveldb::DBImpl::MakeRoomForWrite (inline) ??:0
>>>>>       0.0   0.0% 100.0%      0.0   0.0% 000000000020797c
>>>>>       0.0   0.0% 100.0%      0.0   0.0% 00000000b6a69de0
>>>>>       0.0   0.0% 100.0%      0.0   0.0% 00000000001d0af0
>>>>>       0.0   0.0% 100.0%      0.0   0.0% 00000000001d0ebc
>>>>>       0.0   0.0% 100.0%      0.0   0.0% 00000000002a0cd4
>>>>>       0.0   0.0% 100.0%      0.0   0.0% 000000000036909c
>>>>>       0.0   0.0% 100.0%      0.0   0.0% 000000000040b02c
>>>>>       0.0   0.0% 100.0%      0.0   0.0% 00000000001d0b68
>>>>>       0.0   0.0% 100.0%      0.0   0.0% 0000000000392fa0
>>>>>       0.0   0.0% 100.0%      0.0   0.0% 00000000b6a64404
>>>>>       0.0   0.0% 100.0%      0.0   0.0% 00000000b6a791b4
>>>>>       0.0   0.0% 100.0%      0.0   0.0% 00000000001d9824
>>>>>       0.0   0.0% 100.0%      0.0   0.0% 0000000000213928
>>>>>       0.0   0.0% 100.0%      0.0   0.0% 00000000002a0cb8
>>>>>       0.0   0.0% 100.0%      0.0   0.0% 00000000002a4fcc
>>>>>       0.0   0.0% 100.0%      0.0   0.0% 00000000b6a725ac
>>>>>       0.0   0.0% 100.0%      0.0   0.0% 00000000b6a66308
>>>>>       0.0   0.0% 100.0%      0.0   0.0% 00000000b6a79068
>>>>>       0.0   0.0% 100.0%      0.0   0.0% 00000000000013b2
>>>>>       0.0   0.0% 100.0%      0.0   0.0% 000000000040b000
>>>>>       0.0   0.0% 100.0%      0.1   0.1% 00000000004d839b
>>>>>       0.0   0.0% 100.0%      0.0   0.0% 0000000000f29887
>>>>>       0.0   0.0% 100.0%      0.0   0.0% 0000000000f3eb6b
>>>>>       0.0   0.0% 100.0%      0.0   0.0% 0000000000f6e1cb
>>>>>       0.0   0.0% 100.0%      0.0   0.0% 0000000000f6edab
>>>>>       0.0   0.0% 100.0%      0.0   0.0% 0000000000f873ab
>>>>>       0.0   0.0% 100.0%      0.0   0.0% 0000000000f8a26b
>>>>>       0.0   0.0% 100.0%      0.0   0.0% 0000000000f8b0cb
>>>>>       0.0   0.0% 100.0%      0.0   0.0% 0000000000f92dab
>>>>>       0.0   0.0% 100.0%      0.0   0.0% 0000000000f9c96b
>>>>>       0.0   0.0% 100.0%      0.0   0.0% 0000000000fa24bf
>>>>>       0.0   0.0% 100.0%      0.0   0.0% 0000000000fadd8b
>>>>>       0.0   0.0% 100.0%      0.0   0.0% 0000000000fb06ab
>>>>>       0.0   0.0% 100.0%      0.0   0.0% 0000000000fb0d0b
>>>>>       0.0   0.0% 100.0%      0.0   0.0% 0000000000fb494b
>>>>>       0.0   0.0% 100.0%      0.0   0.0% 0000000000fbad6b
>>>>>       0.0   0.0% 100.0%      0.0   0.0% 0000000000fbb2cb
>>>>>       0.0   0.0% 100.0%      0.0   0.0% 0000000000fbea6b
>>>>>       0.0   0.0% 100.0%      0.0   0.0% 0000000000fed0eb
>>>>>       0.0   0.0% 100.0%      0.0   0.0% 0000000000fed69b
>>>>>       0.0   0.0% 100.0%      0.0   0.0% 000000000129920b
>>>>>       0.0   0.0% 100.0%      0.0   0.0% 00000000014250eb
>>>>>       0.0   0.0% 100.0%      0.0   0.0% 000000000166cfc5
>>>>>       0.0   0.0% 100.0%      0.1   0.1% 000000000166d711
>>>>>       0.0   0.0% 100.0%      0.0   0.0% 0000000003531d2b
>>>>>       0.0   0.0% 100.0%      0.0   0.0% 000000000379adbb
>>>>>       0.0   0.0% 100.0%      0.0   0.0% 0000000004e888fb
>>>>>       0.0   0.0% 100.0%      0.0   0.0% 0000000004e894ab
>>>>>       0.0   0.0% 100.0%      0.0   0.0% 0000000004e8951b
>>>>>       0.0   0.0% 100.0%      0.0   0.0% 00000000060146d3
>>>>>       0.0   0.0% 100.0%      0.0   0.0% 000000000601482f
>>>>>       0.0   0.0% 100.0%      0.0   0.0% 00000000060fcd2b
>>>>>       0.0   0.0% 100.0%      0.0   0.0% 00000000060fd33b
>>>>>       0.0   0.0% 100.0%      0.0   0.0% 00000000060fdfbb
>>>>>       0.0   0.0% 100.0%      0.0   0.0% 000000000a820749
>>>>>       0.0   0.0% 100.0%      0.0   0.0% 000000000bfb1950
>>>>>       0.0   0.0% 100.0%      0.0   0.0% 00000000b6a43f23
>>>>>       0.0   0.0% 100.0%      0.0   0.0% __clone ??:0
>>>>>       0.0   0.0% 100.0%      0.0   0.0% leveldb::DBImpl::MakeRoomForWrite ??:0
>>>>>       0.0   0.0% 100.0%      0.2   0.1% std::num_put::do_put@806e4 ??:0
>>>>>       0.0   0.0% 100.0%      0.4   0.2% std::num_put::do_put@80b44 ??:0
>>>>>       0.0   0.0% 100.0%      0.1   0.1% std::num_put::do_put@80e00 ??:0
>>>>>
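>>>>> My guess is that the unresolved addresses are why the dump is not very
>>>>> useful: the 0xb6...... frames look like they live in a shared library
>>>>> (libtcmalloc, libleveldb, libc, ...) rather than in ceph-mon itself,
>>>>> and/or the symbols have been stripped. Something like the following
>>>>> might help to check that, and to try resolving the biggest frame by
>>>>> hand (the address is the top entry of the dump above):
>>>>>
>>>>>     # does the binary still carry symbols?
>>>>>     file /usr/bin/ceph-mon
>>>>>
>>>>>     # only meaningful if the address actually belongs to ceph-mon and
>>>>>     # the binary is not stripped
>>>>>     addr2line -C -f -e /usr/bin/ceph-mon 0xb6a7ce7c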
>>>>>
>>>>> PS: a Cubietruck board
>>>>> (http://docs.cubieboard.org/products/start#cubietruck_cubieboard3) was
>>>>> released recently, featuring a dual-core ARM A7 CPU, 2 GB of RAM, a
>>>>> 1 Gbit ethernet port and a SATA 2.0 port for $89; it might be suitable
>>>>> as a cheap dedicated OSD server with a single disk.
>>>>>
>>>>> --
>>>>> Best regards,
>>>>> Changyuan
>>>>>
>>>>>
>>>>>
>>>
>>>
>>>
>>> --
>>> Best regards,
>>> Changyuan
>>>
>>
>>
>



-- 
Best regards,
Changyuan
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
