Hey Wido,
We upgraded a 550-OSD cluster from 14.2.4 to 14.2.6 and everything seems to
be working fine. Here's top:

    PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
1432693 ceph      20   0 3246580   2.0g  18260 S  78.4 13.9   2760:58 ceph-mgr
2075038 ceph      20   0 2235072   1.1g  16408 S  11.6  7.6 176:15.30 ceph-mon

And the balancer is quick:

# ceph balancer status
{
    "last_optimize_duration": "0:00:02.806449",
    "plans": [],
    "mode": "upmap",
    "active": true,
    "optimize_result": "Optimization plan created successfully",
    "last_optimize_started": "Thu Jan 16 11:26:19 2020"
}
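
If anyone wants to track that duration over time rather than eyeball it, here is a minimal Python sketch. It assumes the JSON has the shape shown above and that "last_optimize_duration" is always formatted as H:MM:SS.ffffff; the function name optimize_seconds is made up for illustration and is not part of Ceph.

```python
import json
from datetime import timedelta


def optimize_seconds(status_json: str) -> float:
    """Return last_optimize_duration from balancer status JSON, in seconds.

    Assumption: the field looks like "0:00:02.806449" (H:MM:SS.ffffff).
    """
    status = json.loads(status_json)
    hours, minutes, seconds = status["last_optimize_duration"].split(":")
    duration = timedelta(
        hours=int(hours), minutes=int(minutes), seconds=float(seconds)
    )
    return duration.total_seconds()


# Sample copied from the `ceph balancer status` output above.
sample = """{
    "last_optimize_duration": "0:00:02.806449",
    "plans": [],
    "mode": "upmap",
    "active": true,
    "optimize_result": "Optimization plan created successfully",
    "last_optimize_started": "Thu Jan 16 11:26:19 2020"
}"""

if __name__ == "__main__":
    print(f"last optimize took {optimize_seconds(sample):.3f} s")
```

In practice you would feed it the output of the status command instead of the pasted sample.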

Cheers, Dan


On Thu, Jan 16, 2020 at 11:19 AM Wido den Hollander <w...@42on.com> wrote:

> Anybody upgraded to 14.2.6 yet?
>
> On a 1800 OSD cluster I see that ceph-mgr is consuming 200 to 450% CPU
> on a 4C/8T system (Intel Xeon E3-1230, 3.3 GHz).
>
> The logs don't show anything very special, it's just that the mgr is
> super busy.
>
> I noticed this when I executed:
>
> $ ceph balancer status
>
> That command wouldn't return, so I checked the mgr. Only after
> restarting ceph-mgr did the balancer module return results again. The
> restart didn't change the CPU usage, which is still very high, but at
> least the balancer seems to work again.
>
> Wido
>
> On 1/9/20 10:21 AM, Lars Täuber wrote:
> > yesterday:
> > https://ceph.io/releases/v14-2-6-nautilus-released/
> >
> >
> > Cheers,
> > Lars
> >
> > Thu, 9 Jan 2020 10:10:12 +0100
> > Wido den Hollander <w...@42on.com> ==> Neha Ojha <no...@redhat.com>, Sasha Litvak <alexander.v.lit...@gmail.com> :
> >> On 12/24/19 9:19 PM, Neha Ojha wrote:
> >>> The root cause of this issue is the overhead added by the network ping
> >>> time monitoring feature for the mgr to process.
> >>> We have a fix that disables sending the network ping time related
> >>> stats to the mgr, and Eric has helped verify the fix (thanks, Eric!):
> >>> https://tracker.ceph.com/issues/43364#note-9. We'll get this fix out
> >>> in 14.2.6 after the holidays.
> >>>
> >>
> >> It's after the holidays now and this is affecting a lot of deployments.
> >> Can people expect 14.2.6 soon?
> >>
> >> Wido
> >>
> >>>
> >>>
> >>> On Fri, Dec 20, 2019 at 6:24 PM Neha Ojha <no...@redhat.com> wrote:
> >>>>
> >>>> Not yet, but we have a theory and a test build in
> >>>> https://tracker.ceph.com/issues/43364#note-6, if anybody would like
> >>>> to give it a try.
> >>>>
> >>>> Thanks,
> >>>> Neha
> >>>>
> >>>> On Fri, Dec 20, 2019 at 2:31 PM Sasha Litvak
> >>>> <alexander.v.lit...@gmail.com> wrote:
> >>>>>
> >>>>> Was the root cause found and fixed? If so, will the fix be
> >>>>> available in 14.2.6 or sooner?
> >>>>>
> >>>>> On Thu, Dec 19, 2019 at 5:48 PM Mark Nelson <mnel...@redhat.com> wrote:
> >>>>>>
> >>>>>> Hi Paul,
> >>>>>>
> >>>>>>
> >>>>>> Thanks for gathering this!  It looks to me like at the very least we
> >>>>>> should redo the fixed_u_to_string and fixed_to_string functions in
> >>>>>> common/Formatter.cc.  That alone looks like it's having a pretty
> >>>>>> significant impact.
> >>>>>>
> >>>>>>
> >>>>>> Mark
> >>>>>>
> >>>>>>
> >>>>>> On 12/19/19 2:09 PM, Paul Mezzanini wrote:
> >>>>>>> Based on what we've seen with perf, we think this is the relevant
> >>>>>>> section.  (attached is also the whole file)
> >>>>>>>
> >>>>>>> Thread: 73 (mgr-fin) - 1000 samples
> >>>>>>>
> >>>>>>> + 100.00% clone
> >>>>>>>    + 100.00% start_thread
> >>>>>>>      + 100.00% Finisher::finisher_thread_entry()
> >>>>>>>        + 99.40% Context::complete(int)
> >>>>>>>        | + 99.40% FunctionContext::finish(int)
> >>>>>>>        |   + 99.40% ActivePyModule::notify(std::string const&, std::string const&)
> >>>>>>>        |     + 91.30% PyObject_CallMethod
> >>>>>>>        |     | + 91.30% call_function_tail
> >>>>>>>        |     |   + 91.30% PyObject_Call
> >>>>>>>        |     |     + 91.30% instancemethod_call
> >>>>>>>        |     |       + 91.30% PyObject_Call
> >>>>>>>        |     |         + 91.30% function_call
> >>>>>>>        |     |           + 91.30% PyEval_EvalCodeEx
> >>>>>>>        |     |             + 88.40% PyEval_EvalFrameEx
> >>>>>>>        |     |             | + 88.40% PyEval_EvalFrameEx
> >>>>>>>        |     |             |   + 88.40% ceph_state_get(BaseMgrModule*, _object*)
> >>>>>>>        |     |             |     + 88.40% ActivePyModules::get_python(std::string const&)
> >>>>>>>        |     |             |       + 51.10% PGMap::dump_osd_stats(ceph::Formatter*) const
> >>>>>>>        |     |             |       | + 51.10% osd_stat_t::dump(ceph::Formatter*) const
> >>>>>>>        |     |             |       |   + 22.50% ceph::fixed_u_to_string(unsigned long, int)
> >>>>>>>        |     |             |       |   | + 10.50%
> std::basic_ostringstream<char, std::char_traits<char>, std::allocator<char>
> >::basic_ostringstream(std::_Ios_Openmode)
> >>>>>>>        |     |             |       |   | | + 9.30%
> std::basic_ios<char, std::char_traits<char>
> >::init(std::basic_streambuf<char, std::char_traits<char> >*)
> >>>>>>>        |     |             |       |   | | | + 7.00%
> std::basic_ios<char, std::char_traits<char> >::_M_cache_locale(std::locale
> const&)
> >>>>>>>        |     |             |       |   | | | | + 1.60%
> std::ctype<char> const& std::use_facet<std::ctype<char> >(std::locale
> const&)
> >>>>>>>        |     |             |       |   | | | | | + 1.50%
> __dynamic_cast
> >>>>>>>        |     |             |       |   | | | | |   + 0.80%
> __cxxabiv1::__vmi_class_type_info::__do_dyncast(long,
> __cxxabiv1::__class_type_info::__sub_kind, __cxxabiv1::__class_type_info
> const*, void const*, __cxxabiv1::__class_type_info const*, void const*,
> __cxxabiv1::__class_type_info::__dyncast_result&) const
> >>>>>>>        |     |             |       |   | | | | + 1.40% bool
> std::has_facet<std::ctype<char> >(std::locale const&)
> >>>>>>>        |     |             |       |   | | | | | + 1.30%
> __dynamic_cast
> >>>>>>>        |     |             |       |   | | | | |   + 0.90%
> __cxxabiv1::__vmi_class_type_info::__do_dyncast(long,
> __cxxabiv1::__class_type_info::__sub_kind, __cxxabiv1::__class_type_info
> const*, void const*, __cxxabiv1::__class_type_info const*, void const*,
> __cxxabiv1::__class_type_info::__dyncast_result&) const
> >>>>>>>        |     |             |       |   | | | | + 1.10% bool
> std::has_facet<std::num_put<char, std::ostreambuf_iterator<char,
> std::char_traits<char> > > >(std::locale const&)
> >>>>>>>        |     |             |       |   | | | | | + 0.90%
> __dynamic_cast
> >>>>>>>        |     |             |       |   | | | | + 1.00% bool
> std::has_facet<std::num_get<char, std::istreambuf_iterator<char,
> std::char_traits<char> > > >(std::locale const&)
> >>>>>>>        |     |             |       |   | | | | | + 0.70%
> __dynamic_cast
> >>>>>>>        |     |             |       |   | | | | | + 0.10%
> std::locale::id::_M_id() const
> >>>>>>>        |     |             |       |   | | | | | + 0.10%
> _ZNKSt6locale2id5_M_idEv@plt
> >>>>>>>        |     |             |       |   | | | | + 0.80%
> std::num_put<char, std::ostreambuf_iterator<char, std::char_traits<char> >
> > const& std::use_facet<std::num_put<char, std::ostreambuf_iterator<char,
> std::char_traits<char> > > >(std::locale const&)
> >>>>>>>        |     |             |       |   | | | | + 0.70%
> std::num_get<char, std::istreambuf_iterator<char, std::char_traits<char> >
> > const& std::use_facet<std::num_get<char, std::istreambuf_iterator<char,
> std::char_traits<char> > > >(std::locale const&)
> >>>>>>>        |     |             |       |   | | | | + 0.10%
> _ZSt9has_facetISt7num_putIcSt19ostreambuf_iteratorIcSt11char_traitsIcEEEEbRKSt6locale@plt
> >>>>>>>        |     |             |       |   | | | + 2.00%
> std::ios_base::_M_init()
> >>>>>>>        |     |             |       |   | | | | + 0.80%
> std::locale::operator=(std::locale const&)
> >>>>>>>        |     |             |       |   | | | | + 0.80%
> std::locale::locale()
> >>>>>>>        |     |             |       |   | | | | + 0.30%
> std::locale::~locale()
> >>>>>>>        |     |             |       |   | | | | + 0.10%
> _ZNSt6localeC1Ev@plt
> >>>>>>>        |     |             |       |   | | | + 0.20%
> _ZNSt8ios_base7_M_initEv@plt
> >>>>>>>        |     |             |       |   | | + 0.90%
> std::locale::locale()
> >>>>>>>        |     |             |       |   | | + 0.10%
> std::ios_base::ios_base()
> >>>>>>>        |     |             |       |   | | + 0.10%
> _ZNSt9basic_iosIcSt11char_traitsIcEE4initEPSt15basic_streambufIcS1_E@plt
> >>>>>>>        |     |             |       |   | + 2.80% std::ostream&
> std::ostream::_M_insert<unsigned long>(unsigned long)
> >>>>>>>        |     |             |       |   | | + 2.40%
> std::num_put<char, std::ostreambuf_iterator<char, std::char_traits<char> >
> >::do_put(std::ostreambuf_iterator<char, std::char_traits<char> >,
> std::ios_base&, char, unsigned long) const
> >>>>>>>        |     |             |       |   | | | + 2.10%
> std::ostreambuf_iterator<char, std::char_traits<char> > std::num_put<char,
> std::ostreambuf_iterator<char, std::char_traits<char> >
> >::_M_insert_int<unsigned long>(std::ostreambuf_iterator<char,
> std::char_traits<char> >, std::ios_base&, char, unsigned long) const
> >>>>>>>        |     |             |       |   | | | | + 1.60%
> std::basic_streambuf<char, std::char_traits<char> >::xsputn(char const*,
> long)
> >>>>>>>        |     |             |       |   | | | | | + 1.40%
> std::basic_stringbuf<char, std::char_traits<char>, std::allocator<char>
> >::overflow(int)
> >>>>>>>        |     |             |       |   | | | | | | + 0.90%
> std::string::reserve(unsigned long)
> >>>>>>>        |     |             |       |   | | | | | | + 0.10%
> std::basic_stringbuf<char, std::char_traits<char>, std::allocator<char>
> >::_M_sync(char*, unsigned long, unsigned long)
> >>>>>>>        |     |             |       |   | | | | | | + 0.10%
> _ZNSt15basic_stringbufIcSt11char_traitsIcESaIcEE7_M_syncEPcmm@plt
> >>>>>>>        |     |             |       |   | | | | | + 0.20%
> __memcpy_ssse3_back
> >>>>>>>        |     |             |       |   | | | | + 0.20% ???
> >>>>>>>        |     |             |       |   | | | | + 0.10%
> std::num_put<char, std::ostreambuf_iterator<char, std::char_traits<char> >
> >::_M_pad(char, long, std::ios_base&, char*, char const*, int&) const
> >>>>>>>        |     |             |       |   | | | + 0.10%
> _ZNKSt7num_putIcSt19ostreambuf_iteratorIcSt11char_traitsIcEEE13_M_insert_intImEES3_S3_RSt8ios_basecT_@plt
> >>>>>>>        |     |             |       |   | | + 0.10%
> std::ostream::sentry::sentry(std::ostream&)
> >>>>>>>        |     |             |       |   | + 2.80%
> std::basic_stringbuf<char, std::char_traits<char>, std::allocator<char>
> >::str() const
> >>>>>>>        |     |             |       |   | | + 1.00%
> std::string::assign(std::string const&)
> >>>>>>>        |     |             |       |   | | + 0.90% char*
> std::string::_S_construct<char*>(char*, char*, std::allocator<char> const&,
> std::forward_iterator_tag) [clone .part.1796]
> >>>>>>>        |     |             |       |   | + 1.50%
> std::string::append(char const*, unsigned long)
> >>>>>>>        |     |             |       |   | | + 1.20%
> std::string::reserve(unsigned long)
> >>>>>>>        |     |             |       |   | |   + 0.60%
> std::string::_Rep::_M_clone(std::allocator<char> const&, unsigned long)
> >>>>>>>        |     |             |       |   | |   + 0.10% tc_free
> >>>>>>>        |     |             |       |   | + 1.20%
> std::string::_Rep::_M_dispose(std::allocator<char> const&) [clone .isra.97]
> [clone .part.98]
> >>>>>>>        |     |             |       |   | + 1.00%
> std::string::append(std::string const&)
> >>>>>>>        |     |             |       |   | | + 0.70%
> std::string::reserve(unsigned long)
> >>>>>>>        |     |             |       |   | | + 0.10%
> __memcpy_ssse3_back
> >>>>>>>        |     |             |       |   | + 1.00%
> std::basic_string<char, std::char_traits<char>, std::allocator<char>
> >::basic_string(std::string const&, unsigned long, unsigned long)
> >>>>>>>        |     |             |       |   | | + 0.80% char*
> std::string::_S_construct<char*>(char*, char*, std::allocator<char> const&,
> std::forward_iterator_tag) [clone .part.220]
> >>>>>>>        |     |             |       |   | + 0.40%
> std::locale::~locale()
> >>>>>>>        |     |             |       |   | + 0.20% tc_free
> >>>>>>>        |     |             |       |   | + 0.20%
> __strlen_sse2_pminub
> >>>>>>>        |     |             |       |   | + 0.10%
> std::ios_base::~ios_base()
> >>>>>>>        |     |             |       |   | + 0.10%
> _ZNSt8ios_baseD2Ev@plt
> >>>>>>>        |     |             |       |   | + 0.10%
> _ZNKSt15basic_stringbufIcSt11char_traitsIcESaIcEE3strEv@plt
> >>>>>>>        |     |             |       |   + 18.20%
> PyFormatter::open_object_section(char const*)
> >>>>>>>        |     |             |       |   | + 17.10% PyDict_New
> >>>>>>>        |     |             |       |   | | + 16.70%
> _PyObject_GC_New
> >>>>>>>        |     |             |       |   | |   + 16.70%
> _PyObject_GC_Malloc
> >>>>>>>        |     |             |       |   | |     + 16.60% collect
> >>>>>>>        |     |             |       |   | |     | + 8.10%
> dict_traverse
> >>>>>>>        |     |             |       |   | |     | | + 3.20%
> visit_reachable
> >>>>>>>        |     |             |       |   | |     | | | + 0.10%
> type_is_gc
> >>>>>>>        |     |             |       |   | |     | | + 2.80%
> visit_decref
> >>>>>>>        |     |             |       |   | |     | | + 1.60%
> PyDict_Next
> >>>>>>>        |     |             |       |   | |     | + 1.30%
> list_traverse
> >>>>>>>        |     |             |       |   | |     | | + 0.40%
> visit_decref
> >>>>>>>        |     |             |       |   | |     | | + 0.30%
> visit_reachable
> >>>>>>>        |     |             |       |   | |     | + 0.60%
> func_traverse
> >>>>>>>        |     |             |       |   | |     | + 0.40%
> _PyDict_MaybeUntrack
> >>>>>>>        |     |             |       |   | |     | + 0.10%
> type_traverse
> >>>>>>>        |     |             |       |   | |     | + 0.10%
> subtype_traverse
> >>>>>>>        |     |             |       |   | |     | + 0.10%
> set_traverse
> >>>>>>>        |     |             |       |   | |     | + 0.10%
> class_traverse
> >>>>>>>        |     |             |       |   | |     | + 0.10%
> _PyDict_MaybeUntrack@plt
> >>>>>>>        |     |             |       |   | |     + 0.10%
> PyObject_Malloc
> >>>>>>>        |     |             |       |   | + 1.00%
> PyFormatter::dump_pyobject(char const*, _object*)
> >>>>>>>        |     |             |       |   |   + 0.40%
> PyString_FromString
> >>>>>>>        |     |             |       |   |   + 0.20%
> dict_set_item_by_hash_or_entry
> >>>>>>>        |     |             |       |   |   + 0.20% PyDict_SetItem
> >>>>>>>        |     |             |       |   |   + 0.10% app1
> >>>>>>>        |     |             |       |   + 6.60%
> ceph::Formatter::dump_format_unquoted(char const*, char const*, ...)
> >>>>>>>        |     |             |       |   | + 6.60%
> PyFormatter::dump_format_va(char const*, char const*, bool, char const*,
> __va_list_tag*)
> >>>>>>>        |     |             |       |   |   + 3.90% __vsnprintf_chk
> >>>>>>>        |     |             |       |   |   | + 3.40% vfprintf
> >>>>>>>        |     |             |       |   |   | | + 0.50% strchrnul
> >>>>>>>        |     |             |       |   |   | | + 0.40%
> __GI__IO_default_xsputn
> >>>>>>>        |     |             |       |   |   | | + 0.20% tc_free
> >>>>>>>        |     |             |       |   |   | | + 0.10% free@plt
> >>>>>>>        |     |             |       |   |   | | + 0.10% (anonymous
> namespace)::free_null_or_invalid(void*, void (*)(void*)) [clone
> .constprop.41]
> >>>>>>>        |     |             |       |   |   | + 0.20% _IO_no_init
> >>>>>>>        |     |             |       |   |   | + 0.10%
> _IO_str_init_static_internal
> >>>>>>>        |     |             |       |   |   + 1.50%
> PyFormatter::dump_pyobject(char const*, _object*)
> >>>>>>>        |     |             |       |   |   | + 0.50%
> PyString_FromString
> >>>>>>>        |     |             |       |   |   | + 0.40% PyDict_SetItem
> >>>>>>>        |     |             |       |   |   | + 0.10%
> dict_set_item_by_hash_or_entry
> >>>>>>>        |     |             |       |   |   | + 0.10%
> PyDict_SetItem@plt
> >>>>>>>        |     |             |       |   |   + 1.20%
> PyString_FromString
> >>>>>>>        |     |             |       |   |     + 0.60%
> PyObject_Malloc
> >>>>>>>        |     |             |       |   |     + 0.20%
> __strlen_sse2_pminub
> >>>>>>>        |     |             |       |   |     + 0.10%
> __memcpy_ssse3_back
> >>>>>>>        |     |             |       |   + 0.90% ctime_r
> >>>>>>>        |     |             |       |   + 0.80%
> PyFormatter::open_array_section(char const*)
> >>>>>>>        |     |             |       |   + 0.40%
> std::string::_Rep::_M_dispose(std::allocator<char> const&) [clone
> .isra.846] [clone .part.847]
> >>>>>>>        |     |             |       |   + 0.30%
> PyFormatter::dump_int(char const*, long)
> >>>>>>>        |     |             |       |   + 0.20%
> PyFormatter::close_section()
> >>>>>>>        |     |             |       |   + 0.10% tc_free
> >>>>>>>        |     |             |       |   + 0.10%
> std::basic_string<char, std::char_traits<char>, std::allocator<char>
> >::basic_string(char const*, std::allocator<char> const&)
> >>>>>>>        |     |             |       |   + 0.10%
> std::_Rb_tree_increment(std::_Rb_tree_node_base const*)
> >>>>>>>        |     |             |       |   + 0.10%
> pow2_hist_t::dump(ceph::Formatter*) const
> >>>>>>>        |     |             |       |   + 0.10%
> objectstore_perf_stat_t::dump(ceph::Formatter*) const
> >>>>>>>        |     |             |       |   + 0.10%
> PyFormatter::dump_string(char const*, std::basic_string_view<char,
> std::char_traits<char> >)
> >>>>>>>        |     |             |       |   + 0.10%
> PyFormatter::dump_pyobject(char const*, _object*)
> >>>>>>>        |     |             |       + 21.80% Mutex::lock(bool)
> >>>>>>>        |     |             |       | + 21.80% pthread_mutex_lock
> >>>>>>>        |     |             |       |   + 21.80% _L_lock_883
> >>>>>>>        |     |             |       |     + 21.80% __lll_lock_wait
> >>>>>>>        |     |             |       + 11.70%
> PGMap::dump(ceph::Formatter*) const
> >>>>>>>        |     |             |       | + 11.70%
> PGMap::dump_pg_stats(ceph::Formatter*, bool) const
> >>>>>>>        |     |             |       |   + 10.90%
> pg_stat_t::dump(ceph::Formatter*) const
> >>>>>>>        |     |             |       |   | + 4.20%
> PyFormatter::dump_stream(char const*)
> >>>>>>>        |     |             |       |   | | + 2.80%
> std::basic_ios<char, std::char_traits<char>
> >::init(std::basic_streambuf<char, std::char_traits<char> >*)
> >>>>>>>        |     |             |       |   | | | + 2.10%
> std::basic_ios<char, std::char_traits<char> >::_M_cache_locale(std::locale
> const&)
> >>>>>>>        |     |             |       |   | | | | + 0.50%
> std::num_put<char, std::ostreambuf_iterator<char, std::char_traits<char> >
> > const& std::use_facet<std::num_put<char, std::ostreambuf_iterator<char,
> std::char_traits<char> > > >(std::locale const&)
> >>>>>>>        |     |             |       |   | | | | + 0.50% bool
> std::has_facet<std::num_put<char, std::ostreambuf_iterator<char,
> std::char_traits<char> > > >(std::locale const&)
> >>>>>>>        |     |             |       |   | | | | + 0.40% bool
> std::has_facet<std::num_get<char, std::istreambuf_iterator<char,
> std::char_traits<char> > > >(std::locale const&)
> >>>>>>>        |     |             |       |   | | | | + 0.20%
> std::num_get<char, std::istreambuf_iterator<char, std::char_traits<char> >
> > const& std::use_facet<std::num_get<char, std::istreambuf_iterator<char,
> std::char_traits<char> > > >(std::locale const&)
> >>>>>>>        |     |             |       |   | | | | + 0.20%
> std::ctype<char> const& std::use_facet<std::ctype<char> >(std::locale
> const&)
> >>>>>>>        |     |             |       |   | | | | + 0.20% bool
> std::has_facet<std::ctype<char> >(std::locale const&)
> >>>>>>>        |     |             |       |   | | | | + 0.10%
> _ZSt9has_facetISt7num_putIcSt19ostreambuf_iteratorIcSt11char_traitsIcEEEEbRKSt6locale@plt
> >>>>>>>        |     |             |       |   | | | + 0.70%
> std::ios_base::_M_init()
> >>>>>>>        |     |             |       |   | | + 0.50%
> tcmalloc::ThreadCache::FetchFromCentralCache(unsigned int, int)
> >>>>>>>        |     |             |       |   | | + 0.40%
> std::string::assign(char const*, unsigned long)
> >>>>>>>        |     |             |       |   | | + 0.20%
> std::locale::locale()
> >>>>>>>        |     |             |       |   | | + 0.10%
> std::ios_base::ios_base()
> >>>>>>>        |     |             |       |   | + 1.80%
> object_stat_collection_t::dump(ceph::Formatter*) const
> >>>>>>>        |     |             |       |   | | + 1.70%
> object_stat_sum_t::dump(ceph::Formatter*) const
> >>>>>>>        |     |             |       |   | | | + 1.40%
> PyFormatter::dump_pyobject(char const*, _object*)
> >>>>>>>        |     |             |       |   | | | | + 0.60% dictresize
> >>>>>>>        |     |             |       |   | | | | + 0.30%
> PyString_FromString
> >>>>>>>        |     |             |       |   | | | | + 0.20%
> PyDict_SetItem
> >>>>>>>        |     |             |       |   | | | | + 0.10%
> dict_set_item_by_hash_or_entry
> >>>>>>>        |     |             |       |   | | | + 0.20%
> PyFormatter::dump_int(char const*, long)
> >>>>>>>        |     |             |       |   | | + 0.10%
> PyFormatter::open_object_section(char const*)
> >>>>>>>        |     |             |       |   | + 1.80%
> PyFormatter::open_array_section(char const*)
> >>>>>>>        |     |             |       |   | | + 1.60% PyList_New
> >>>>>>>        |     |             |       |   | | | + 1.60%
> _PyObject_GC_New
> >>>>>>>        |     |             |       |   | | |   + 1.60%
> _PyObject_GC_Malloc
> >>>>>>>        |     |             |       |   | | |     + 1.60% collect
> >>>>>>>        |     |             |       |   | | |       + 0.80%
> dict_traverse
> >>>>>>>        |     |             |       |   | | |       + 0.10%
> subtype_traverse
> >>>>>>>        |     |             |       |   | | |       + 0.10%
> list_traverse
> >>>>>>>        |     |             |       |   | | |       + 0.10%
> func_traverse
> >>>>>>>        |     |             |       |   | | |       + 0.10%
> _PyDict_MaybeUntrack
> >>>>>>>        |     |             |       |   | | + 0.20%
> PyFormatter::dump_pyobject(char const*, _object*)
> >>>>>>>        |     |             |       |   | + 1.70%
> utime_t::localtime(std::ostream&) const
> >>>>>>>        |     |             |       |   | | + 1.00% std::ostream&
> std::ostream::_M_insert<long>(long)
> >>>>>>>        |     |             |       |   | | | + 0.60%
> std::num_put<char, std::ostreambuf_iterator<char, std::char_traits<char> >
> >::do_put(std::ostreambuf_iterator<char, std::char_traits<char> >,
> std::ios_base&, char, long) const
> >>>>>>>        |     |             |       |   | | + 0.30%
> std::basic_ostream<char, std::char_traits<char> >&
> std::__ostream_insert<char, std::char_traits<char>
> >(std::basic_ostream<char, std::char_traits<char> >&, char const*, long)
> >>>>>>>        |     |             |       |   | | + 0.20% __tz_convert
> >>>>>>>        |     |             |       |   | + 0.90%
> PyFormatter::dump_pyobject(char const*, _object*)
> >>>>>>>        |     |             |       |   | + 0.20%
> pg_state_string(unsigned long)
> >>>>>>>        |     |             |       |   | + 0.20%
> operator<<(std::ostream&, eversion_t const&) [clone .isra.103]
> >>>>>>>        |     |             |       |   | + 0.10% std::ostream&
> std::ostream::_M_insert<unsigned long>(unsigned long)
> >>>>>>>        |     |             |       |   + 0.40%
> PyFormatter::dump_stream(char const*)
> >>>>>>>        |     |             |       |   + 0.30%
> operator<<(std::ostream&, pg_t const&)
> >>>>>>>        |     |             |       |   + 0.10%
> PyFormatter::open_object_section(char const*)
> >>>>>>>        |     |             |       + 2.70%
> PyFormatter::finish_pending_streams()
> >>>>>>>        |     |             |       | + 1.00%
> std::_List_base<std::shared_ptr<PyFormatter::PendingStream>,
> std::allocator<std::shared_ptr<PyFormatter::PendingStream> > >::_M_clear()
> >>>>>>>        |     |             |       | | + 0.40%
> std::_Sp_counted_ptr_inplace<PyFormatter::PendingStream,
> std::allocator<PyFormatter::PendingStream>,
> (__gnu_cxx::_Lock_policy)2>::_M_dispose()
> >>>>>>>        |     |             |       | | + 0.20%
> tcmalloc::ThreadCache::ListTooLong(tcmalloc::ThreadCache::FreeList*,
> unsigned int)
> >>>>>>>        |     |             |       | + 0.70%
> PyFormatter::dump_pyobject(char const*, _object*)
> >>>>>>>        |     |             |       | + 0.50%
> std::string::replace(unsigned long, unsigned long, char const*, unsigned
> long)
> >>>>>>>        |     |             |       | + 0.30% PyString_FromString
> >>>>>>>        |     |             |       + 1.10% PyEval_RestoreThread
> >>>>>>>        |     |             |         + 1.10% PyThread_acquire_lock
> >>>>>>>        |     |             |           + 1.10% sem_wait@@GLIBC_2.2.5
> >>>>>>>        |     |             |             + 1.10%
> __new_sem_wait_slow.constprop.0
> >>>>>>>        |     |             |               + 1.10%
> do_futex_wait.constprop.1
> >>>>>>>        |     |             + 2.90% frame_dealloc
> >>>>>>>        |     |               + 2.90% dict_dealloc
> >>>>>>>        |     |                 + 2.90% list_dealloc
> >>>>>>>        |     |                   + 2.90% dict_dealloc
> >>>>>>>        |     |                     + 1.90% list_dealloc
> >>>>>>>        |     |                     | + 1.90% dict_dealloc
> >>>>>>>        |     |                     |   + 1.70% list_dealloc
> >>>>>>>        |     |                     |     + 1.50% dict_dealloc
> >>>>>>>        |     |                     |     | + 0.90% dict_dealloc
> >>>>>>>        |     |                     |     | + 0.10% PyObject_Free
> >>>>>>>        |     |                     |     + 0.10%
> tcmalloc::ThreadCache::ListTooLong(tcmalloc::ThreadCache::FreeList*,
> unsigned int)
> >>>>>>>        |     |                     + 0.30% PyObject_Free
> >>>>>>>        |     |                     + 0.20% dict_dealloc
> >>>>>>>        |     + 8.10% Gil::Gil(SafeThreadState&, bool)
> >>>>>>>        |       + 8.10% PyEval_RestoreThread
> >>>>>>>        |         + 8.10% PyThread_acquire_lock
> >>>>>>>        |           + 8.10% sem_wait@@GLIBC_2.2.5
> >>>>>>>        |             + 8.10% __new_sem_wait_slow.constprop.0
> >>>>>>>        |               + 8.10% do_futex_wait.constprop.1
> >>>>>>>        + 0.60%
> std::condition_variable::wait(std::unique_lock<std::mutex>&)
> >>>>>>>
> >>>>>>> --
> >>>>>>> Paul Mezzanini
> >>>>>>> Sr Systems Administrator / Engineer, Research Computing
> >>>>>>> Information & Technology Services
> >>>>>>> Finance & Administration
> >>>>>>> Rochester Institute of Technology
> >>>>>>> o:(585) 475-3245 | pfm...@rit.edu
> >>>>>>>
> >>>>>>>
> >>>>>>> ________________________________________
> >>>>>>> From: Mark Nelson <mnel...@redhat.com>
> >>>>>>> Sent: Thursday, December 19, 2019 11:47 AM
> >>>>>>> To: ceph-users@ceph.io
> >>>>>>> Subject: [ceph-users] Re: High CPU usage by ceph-mgr in 14.2.5
> >>>>>>>
> >>>>>>> If you can get a wallclock profiler on the mgr process we might be
> >>>>>>> able to figure out specifics of what's taking so much time (i.e.
> >>>>>>> processing pg_summary or something else).  Assuming you have gdb with
> >>>>>>> the python bindings and the ceph debug packages installed, if you (or
> >>>>>>> anyone) could try gdbpmp on the 100% mgr process that would be fantastic.
> >>>>>>>
> >>>>>>>
> >>>>>>> https://github.com/markhpc/gdbpmp
> >>>>>>>
> >>>>>>>
> >>>>>>> gdbpmp.py -p`pidof ceph-mgr` -n 1000 -o mgr.gdbpmp
> >>>>>>>
> >>>>>>>
> >>>>>>> If you want to view the results:
> >>>>>>>
> >>>>>>>
> >>>>>>> gdbpmp.py -i mgr.gdbpmp -t 1
> >>>>>>>
> >>>>>>>
> >>>>>>> Thanks,
> >>>>>>>
> >>>>>>> Mark
> >>>>>>>
> >>>>>>>
> >>>>>>> On 12/19/19 6:29 AM, Paul Emmerich wrote:
> >>>>>>>> We're also seeing unusually high mgr CPU usage on some setups; the
> >>>>>>>> only thing they have in common seems to be > 300 OSDs.
> >>>>>>>>
> >>>>>>>> Threads using the CPU are "mgr-fin" and "ms_dispatch".
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Paul
> >>>>>>>>
> >>>>>>>> --
> >>>>>>>> Paul Emmerich
> >>>>>>>>
> >>>>>>>> Looking for help with your Ceph cluster? Contact us at
> https://croit.io
> >>>>>>>>
> >>>>>>>> croit GmbH
> >>>>>>>> Freseniusstr. 31h
> >>>>>>>> 81247 München
> >>>>>>>> www.croit.io <http://www.croit.io>
> >>>>>>>> Tel: +49 89 1896585 90
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On Thu, Dec 19, 2019 at 9:40 AM Serkan Çoban <cobanser...@gmail.com
> >>>>>>>> <mailto:cobanser...@gmail.com>> wrote:
> >>>>>>>>
> >>>>>>>>      +1
> >>>>>>>>      1500 OSDs, mgr is at a constant 100% after upgrading from
> >>>>>>>>      14.2.2 to 14.2.5.
> >>>>>>>>
> >>>>>>>>      On Thu, Dec 19, 2019 at 11:06 AM Toby Darling
> >>>>>>>>      <t...@mrc-lmb.cam.ac.uk <mailto:t...@mrc-lmb.cam.ac.uk>> wrote:
> >>>>>>>>      >
> >>>>>>>>      > On 18/12/2019 22:40, Bryan Stillwell wrote:
> >>>>>>>>      > > That's how we noticed it too.  Our graphs went silent after
> >>>>>>>>      > > the upgrade completed.  Is your large cluster over 350 OSDs?
> >>>>>>>>      >
> >>>>>>>>      > A 'me too' on this - graphs have gone quiet, and mgr is using
> >>>>>>>>      > 100% CPU. This happened when we grew our 14.2.5 cluster from
> >>>>>>>>      > 328 to 436 OSDs.
> >>>>>>>>      >
> >>>>>>>>      > Cheers
> >>>>>>>>      > Toby
> >>>>>>>>      > --
> >>>>>>>>      > Toby Darling, Scientific Computing (2N249)
> >>>>>>>>      > MRC Laboratory of Molecular Biology
> >>>>>>>>      > Francis Crick Avenue
> >>>>>>>>      > Cambridge Biomedical Campus
> >>>>>>>>      > Cambridge CB2 0QH
> >>>>>>>>      > Phone 01223 267070
> >>>>>>>>      > _______________________________________________
> >>>>>>>>      > ceph-users mailing list -- ceph-users@ceph.io
> >>>>>>>>      <mailto:ceph-users@ceph.io>
> >>>>>>>>      > To unsubscribe send an email to ceph-users-le...@ceph.io
> >>>>>>>>      <mailto:ceph-users-le...@ceph.io>