I've tested this on the latest Kraken RC (installed on RHEL from the el7
repo). It seemed promising at first, but the OSDs still gradually consume
all available memory until they are OOM-killed; they just do so more slowly.
It takes them a couple of hours to go from 500M each to >2G each. After
they're restarted, they start at ~1G instead of 500M.
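
For anyone who wants to put numbers on this kind of growth, here is a minimal
sketch of how per-process RSS could be sampled and turned into a growth rate.
It assumes a Linux host where /proc/<pid>/status contains a "VmRSS: ... kB"
line. The function names (parse_vmrss_kb, read_rss_kb, growth_mb_per_hour) are
made up for illustration and are not part of Ceph or any Ceph tooling.

```python
import re
from pathlib import Path


def parse_vmrss_kb(status_text):
    """Return the VmRSS value in kB from /proc/<pid>/status text, or None."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def read_rss_kb(pid):
    """Read the current resident set size of a process (Linux only)."""
    return parse_vmrss_kb(Path(f"/proc/{pid}/status").read_text())


def growth_mb_per_hour(start_kb, end_kb, elapsed_seconds):
    """Average RSS growth between two samples, in MB per hour."""
    return (end_kb - start_kb) / 1024 / (elapsed_seconds / 3600)
```

Reading "a couple of hours" as two hours and ">2G" as 2048M,
growth_mb_per_hour(500 * 1024, 2048 * 1024, 2 * 3600) gives 774 MB/h per OSD,
which is only a rough lower bound on the rate described above.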

On Fri, Nov 18, 2016 at 6:32 PM, bobobo1...@gmail.com <bobobo1...@gmail.com>
wrote:

> Just to update, this is still an issue as of the latest Git commit
> (64bcf92e87f9fbb3045de49b7deb53aca1989123).
>
> On Fri, Nov 11, 2016 at 1:31 PM, bobobo1...@gmail.com <
> bobobo1...@gmail.com> wrote:
>
>> Here's another: http://termbin.com/smnm
>>
>> On Fri, Nov 11, 2016 at 1:28 PM, Sage Weil <sw...@redhat.com> wrote:
>> > On Fri, 11 Nov 2016, bobobo1...@gmail.com wrote:
>> >> Any more data needed?
>> >>
>> >> On Wed, Nov 9, 2016 at 9:29 AM, bobobo1...@gmail.com
>> >> <bobobo1...@gmail.com> wrote:
>> >> > Here it is after running overnight (~9h): http://ix.io/1DNi
>> >
>> > I'm getting a 500 on that URL...
>> >
>> > sage
>> >
>> >
>> >> >
>> >> > On Tue, Nov 8, 2016 at 11:00 PM, bobobo1...@gmail.com
>> >> > <bobobo1...@gmail.com> wrote:
>> >> >> Ah, I was actually mistaken. After running without Valgrind, it seems
>> >> >> I just underestimated how much it was slowed down. I'll leave it to
>> >> >> run overnight as suggested.
>> >> >>
>> >> >> On Tue, Nov 8, 2016 at 10:44 PM, bobobo1...@gmail.com
>> >> >> <bobobo1...@gmail.com> wrote:
>> >> >>> Okay, I left it for 3h and it seemed to actually stabilise at around
>> >> >>> 2.3G: http://ix.io/1DEK
>> >> >>>
>> >> >>> This was only after disabling other services on the system however.
>> >> >>> Generally this much RAM isn't available to Ceph (hence the OOM
>> >> >>> previously).
>> >> >>>
>> >> >>> On Tue, Nov 8, 2016 at 9:00 AM, Mark Nelson <mnel...@redhat.com> wrote:
>> >> >>>> It should be running much slower through valgrind, so it probably
>> >> >>>> won't accumulate very quickly.  That was the problem with the earlier
>> >> >>>> trace: there wasn't enough memory used yet to really get us out of
>> >> >>>> the weeds.  If it's still accumulating quickly, try to wait until the
>> >> >>>> OSD is up to 4+GB RSS if you can.  I usually kill the valgrind/osd
>> >> >>>> process with SIGTERM to make sure the output is preserved.  Not sure
>> >> >>>> what will happen with the OOM killer, as I haven't let it get that
>> >> >>>> far before killing.
>> >> >>>>
>> >> >>>> Mark
>> >> >>>>
>> >> >>>> On 11/08/2016 10:37 AM, bobobo1...@gmail.com wrote:
>> >> >>>>>
>> >> >>>>> Unfortunately I don't think overnight is possible. The OOM killer
>> >> >>>>> will kill it in hours, if not minutes. Will the output be
>> >> >>>>> preserved/usable if the process is uncleanly terminated?
>> >> >>>>>
>> >> >>>>>
>> >> >>>>> On 8 Nov 2016 08:33, "Mark Nelson" <mnel...@redhat.com> wrote:
>> >> >>>>>
>> >> >>>>>     Heya,
>> >> >>>>>
>> >> >>>>>     Sorry got distracted with other stuff yesterday.  Any chance you
>> >> >>>>>     could run this for longer?  It's tough to tell what's going on from
>> >> >>>>>     this run unfortunately.  Maybe overnight if possible.
>> >> >>>>>
>> >> >>>>>     Thanks!
>> >> >>>>>     Mark
>> >> >>>>>
>> >> >>>>>
>> >> >>>>>
>> >> >>>>>     On 11/08/2016 01:10 AM, bobobo1...@gmail.com wrote:
>> >> >>>>>
>> >> >>>>>         Just bumping this and CCing directly since I foolishly broke the
>> >> >>>>>         threading on my reply.
>> >> >>>>>
>> >> >>>>>
>> >> >>>>>         On 4 Nov. 2016 8:40 pm, "bobobo1...@gmail.com"
>> >> >>>>>         <bobobo1...@gmail.com> wrote:
>> >> >>>>>
>> >> >>>>>             > Then you can view the output data with ms_print or with
>> >> >>>>>             > massif-visualizer.  This may help narrow down where in the
>> >> >>>>>             > code we are using the memory.
>> >> >>>>>
>> >> >>>>>             Done! I've dumped the output from ms_print here:
>> >> >>>>>             http://ix.io/1CrS
>> >> >>>>>
>> >> >>>>>             It seems most of the memory comes from here:
>> >> >>>>>
>> >> >>>>>             92.78% (998,248,799B) (heap allocation functions) malloc/new/new[], --alloc-fns, etc.
>> >> >>>>>             ->46.63% (501,656,678B) 0xD38936: ceph::buffer::create_aligned(unsigned int, unsigned int) (in /usr/bin/ceph-osd)
>> >> >>>>>             | ->45.07% (484,867,174B) 0xDAFED9: AsyncConnection::process() (in /usr/bin/ceph-osd)
>> >> >>>>>             | | ->45.07% (484,867,174B) 0xC410EB: EventCenter::process_events(int) (in /usr/bin/ceph-osd)
>> >> >>>>>             | |   ->45.07% (484,867,174B) 0xC45210: ??? (in /usr/bin/ceph-osd)
>> >> >>>>>             | |     ->45.07% (484,867,174B) 0xC6FA31D: execute_native_thread_routine (thread.cc:83)
>> >> >>>>>             | |       ->45.07% (484,867,174B) 0xBE06452: start_thread (in /usr/lib/libpthread-2.24.so)
>> >> >>>>>             | |         ->45.07% (484,867,174B) 0xCFCA7DD: clone (in /usr/lib/libc-2.24.so)
>> >> >>>>>             | |
>> >> >>>>>             | ->01.56% (16,789,504B) in 6 places, all below massif's threshold (1.00%)
>> >> >>>>>             |
>> >> >>>>>             ->22.70% (244,179,072B) 0x9C9807: BitMapZone::init(long, long, bool) (in /usr/bin/ceph-osd)
>> >> >>>>>             | ->22.70% (244,179,072B) 0x9CACED: BitMapAreaLeaf::init(long, long, bool) (in /usr/bin/ceph-osd)
>> >> >>>>>             |   ->22.70% (244,179,072B) 0x9CAE88: BitMapAreaLeaf::BitMapAreaLeaf(long, long, bool) (in /usr/bin/ceph-osd)
>> >> >>>>>             |     ->22.67% (243,924,992B) 0x9CAF79: BitMapAreaIN::init(long, long, bool) (in /usr/bin/ceph-osd)
>> >> >>>>>             |     | ->12.46% (134,086,656B) 0x9CAFBE: BitMapAreaIN::init(long, long, bool) (in /usr/bin/ceph-osd)
>> >> >>>>>             |     | | ->12.46% (134,086,656B) 0x9CB237: BitAllocator::init_check(long, long, bmap_alloc_mode, bool, bool) (in /usr/bin/ceph-osd)
>> >> >>>>>             |     | |   ->12.46% (134,086,656B) 0x9CB431: BitAllocator::BitAllocator(long, long, bmap_alloc_mode, bool) (in /usr/bin/ceph-osd)
>> >> >>>>>             |     | |     ->12.46% (134,086,656B) 0x9C5C32: BitMapAllocator::BitMapAllocator(long, long) (in /usr/bin/ceph-osd)
>> >> >>>>>             |     | |       ->12.46% (134,086,656B) 0x968FF1: Allocator::create(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, long, long) (in /usr/bin/ceph-osd)
>> >> >>>>>             |     | |         ->12.46% (134,086,656B) 0x87F65C: BlueStore::_open_alloc() (in /usr/bin/ceph-osd)
>> >> >>>>>             |     | |           ->12.46% (134,086,656B) 0x8D8CDD: BlueStore::mount() (in /usr/bin/ceph-osd)
>> >> >>>>>             |     | |             ->12.46% (134,086,656B) 0x4C15EA: OSD::init() (in /usr/bin/ceph-osd)
>> >> >>>>>             |     | |               ->12.46% (134,086,656B) 0x40854C: main (in /usr/bin/ceph-osd)
>> >> >>>>>             |     | |
>> >> >>>>>             |     | ->10.21% (109,838,336B) 0x9CB00B: BitMapAreaIN::init(long, long, bool) (in /usr/bin/ceph-osd)
>> >> >>>>>             |     |   ->10.21% (109,838,336B) 0x9CB237: BitAllocator::init_check(long, long, bmap_alloc_mode, bool, bool) (in /usr/bin/ceph-osd)
>> >> >>>>>             |     |     ->10.21% (109,838,336B) 0x9CB431: BitAllocator::BitAllocator(long, long, bmap_alloc_mode, bool) (in /usr/bin/ceph-osd)
>> >> >>>>>             |     |       ->10.21% (109,838,336B) 0x9C5C32: BitMapAllocator::BitMapAllocator(long, long) (in /usr/bin/ceph-osd)
>> >> >>>>>             |     |         ->10.21% (109,838,336B) 0x968FF1: Allocator::create(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, long, long) (in /usr/bin/ceph-osd)
>> >> >>>>>             |     |           ->10.21% (109,838,336B) 0x87F65C: BlueStore::_open_alloc() (in /usr/bin/ceph-osd)
>> >> >>>>>             |     |             ->10.21% (109,838,336B) 0x8D8CDD: BlueStore::mount() (in /usr/bin/ceph-osd)
>> >> >>>>>             |     |               ->10.21% (109,838,336B) 0x4C15EA: OSD::init() (in /usr/bin/ceph-osd)
>> >> >>>>>             |     |                 ->10.21% (109,838,336B) 0x40854C: main (in /usr/bin/ceph-osd)
>> >> >>>>>             |     |
>> >> >>>>>             |     ->00.02% (254,080B) in 1+ places, all below ms_print's threshold (01.00%)
>> >> >>>>>             |
>> >> >>>>>             ->12.77% (137,350,728B) 0x9CACD8: BitMapAreaLeaf::init(long, long, bool) (in /usr/bin/ceph-osd)
>> >> >>>>>             | ->12.77% (137,350,728B) 0x9CAE88: BitMapAreaLeaf::BitMapAreaLeaf(long, long, bool) (in /usr/bin/ceph-osd)
>> >> >>>>>             |   ->12.75% (137,207,808B) 0x9CAF79: BitMapAreaIN::init(long, long, bool) (in /usr/bin/ceph-osd)
>> >> >>>>>             |   | ->07.01% (75,423,744B) 0x9CAFBE: BitMapAreaIN::init(long, long, bool) (in /usr/bin/ceph-osd)
>> >> >>>>>             |   | | ->07.01% (75,423,744B) 0x9CB237: BitAllocator::init_check(long, long, bmap_alloc_mode, bool, bool) (in /usr/bin/ceph-osd)
>> >> >>>>>             |   | |   ->07.01% (75,423,744B) 0x9CB431: BitAllocator::BitAllocator(long, long, bmap_alloc_mode, bool) (in /usr/bin/ceph-osd)
>> >> >>>>>             |   | |     ->07.01% (75,423,744B) 0x9C5C32: BitMapAllocator::BitMapAllocator(long, long) (in /usr/bin/ceph-osd)
>> >> >>>>>             |   | |       ->07.01% (75,423,744B) 0x968FF1: Allocator::create(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, long, long) (in /usr/bin/ceph-osd)
>> >> >>>>>             |   | |         ->07.01% (75,423,744B) 0x87F65C: BlueStore::_open_alloc() (in /usr/bin/ceph-osd)
>> >> >>>>>             |   | |           ->07.01% (75,423,744B) 0x8D8CDD: BlueStore::mount() (in /usr/bin/ceph-osd)
>> >> >>>>>             |   | |             ->07.01% (75,423,744B) 0x4C15EA: OSD::init() (in /usr/bin/ceph-osd)
>> >> >>>>>             |   | |               ->07.01% (75,423,744B) 0x40854C: main (in /usr/bin/ceph-osd)
>> >> >>>>>
>> >> >>>>
>> >> _______________________________________________
>> >> ceph-users mailing list
>> >> ceph-users@lists.ceph.com
>> >> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>> >>
>> >>
>>
>
>
