Hey Wido,

We upgraded a 550-osd cluster from 14.2.4 to 14.2.6 and everything seems to be working fine. Here's top:

    PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
1432693 ceph      20   0 3246580   2.0g  18260 S  78.4 13.9   2760:58 ceph-mgr
2075038 ceph      20   0 2235072   1.1g  16408 S  11.6  7.6 176:15.30 ceph-mon

And the balancer is quick:

# ceph balancer status
{
    "last_optimize_duration": "0:00:02.806449",
    "plans": [],
    "mode": "upmap",
    "active": true,
    "optimize_result": "Optimization plan created successfully",
    "last_optimize_started": "Thu Jan 16 11:26:19 2020"
}

Cheers,
Dan

On Thu, Jan 16, 2020 at 11:19 AM Wido den Hollander <w...@42on.com> wrote:
> Anybody upgraded to 14.2.6 yet?
>
> On a 1800 OSD cluster I see that ceph-mgr is consuming 200 to 450% CPU
> on a 4C/8T system (Intel Xeon E3-1230 3.3Ghz CPU).
>
> The logs don't show anything very special, it's just that the mgr is
> super busy.
>
> I noticed this when I executed:
>
> $ ceph balancer status
>
> That command wouldn't return and then I checked the mgr. Only after
> restarting ceph-mgr the balancer module returned results again. It
> didn't change the CPU usage, it's still consuming a lot of CPU, but at
> least the balancer seems to work again.
>
> Wido
>
> On 1/9/20 10:21 AM, Lars Täuber wrote:
> > yesterday:
> > https://ceph.io/releases/v14-2-6-nautilus-released/
> >
> >
> > Cheers,
> > Lars
> >
> > Thu, 9 Jan 2020 10:10:12 +0100
> > Wido den Hollander <w...@42on.com> ==> Neha Ojha <no...@redhat.com>, Sasha Litvak <alexander.v.lit...@gmail.com> :
> >> On 12/24/19 9:19 PM, Neha Ojha wrote:
> >>> The root cause of this issue is the overhead added by the network ping
> >>> time monitoring feature for the mgr to process.
> >>> We have a fix that disables sending the network ping times related
> >>> stats to the mgr and Eric has helped verify the fix(Thanks Eric!) -
> >>> https://tracker.ceph.com/issues/43364#note-9. We'll get this fix out
> >>> in 14.2.6 after the holidays.
> >>>
> >>
> >> It's after the holidays now and this is affecting a lot of deployments.
> >> Can people expect 14.2.6 soon?
> >>
> >> Wido
> >>
> >>>
> >>> On Fri, Dec 20, 2019 at 6:24 PM Neha Ojha <no...@redhat.com> wrote:
> >>>>
> >>>> Not yet, but we have a theory and a test build in
> >>>> https://tracker.ceph.com/issues/43364#note-6, if anybody would like to
> >>>> give it a try.
> >>>>
> >>>> Thanks,
> >>>> Neha
> >>>>
> >>>> On Fri, Dec 20, 2019 at 2:31 PM Sasha Litvak
> >>>> <alexander.v.lit...@gmail.com> wrote:
> >>>>>
> >>>>> Was the root cause found and fixed? If so, will the fix be available in 14.2.6 or sooner?
> >>>>>
> >>>>> On Thu, Dec 19, 2019 at 5:48 PM Mark Nelson <mnel...@redhat.com> wrote:
> >>>>>>
> >>>>>> Hi Paul,
> >>>>>>
> >>>>>> Thanks for gathering this! It looks to me like at the very least we
> >>>>>> should redo the fixed_u_to_string and fixed_to_string functions in
> >>>>>> common/Formatter.cc. That alone looks like it's having a pretty
> >>>>>> significant impact.
> >>>>>>
> >>>>>> Mark
> >>>>>>
> >>>>>> On 12/19/19 2:09 PM, Paul Mezzanini wrote:
> >>>>>>> Based on what we've seen with perf, we think this is the relevant section.
(attached is also the whole file) > >>>>>>> > >>>>>>> Thread: 73 (mgr-fin) - 1000 samples > >>>>>>> > >>>>>>> + 100.00% clone > >>>>>>> + 100.00% start_thread > >>>>>>> + 100.00% Finisher::finisher_thread_entry() > >>>>>>> + 99.40% Context::complete(int) > >>>>>>> | + 99.40% FunctionContext::finish(int) > >>>>>>> | + 99.40% ActivePyModule::notify(std::string const&, > std::string const&) > >>>>>>> | + 91.30% PyObject_CallMethod > >>>>>>> | | + 91.30% call_function_tail > >>>>>>> | | + 91.30% PyObject_Call > >>>>>>> | | + 91.30% instancemethod_call > >>>>>>> | | + 91.30% PyObject_Call > >>>>>>> | | + 91.30% function_call > >>>>>>> | | + 91.30% PyEval_EvalCodeEx > >>>>>>> | | + 88.40% PyEval_EvalFrameEx > >>>>>>> | | | + 88.40% PyEval_EvalFrameEx > >>>>>>> | | | + 88.40% > ceph_state_get(BaseMgrModule*, _object*) > >>>>>>> | | | + 88.40% > ActivePyModules::get_python(std::string const&) > >>>>>>> | | | + 51.10% > PGMap::dump_osd_stats(ceph::Formatter*) const > >>>>>>> | | | | + 51.10% > osd_stat_t::dump(ceph::Formatter*) const > >>>>>>> | | | | + 22.50% > ceph::fixed_u_to_string(unsigned long, int) > >>>>>>> | | | | | + 10.50% > std::basic_ostringstream<char, std::char_traits<char>, std::allocator<char> > >::basic_ostringstream(std::_Ios_Openmode) > >>>>>>> | | | | | | + 9.30% > std::basic_ios<char, std::char_traits<char> > >::init(std::basic_streambuf<char, std::char_traits<char> >*) > >>>>>>> | | | | | | | + 7.00% > std::basic_ios<char, std::char_traits<char> >::_M_cache_locale(std::locale > const&) > >>>>>>> | | | | | | | | + 1.60% > std::ctype<char> const& std::use_facet<std::ctype<char> >(std::locale > const&) > >>>>>>> | | | | | | | | | + 1.50% > __dynamic_cast > >>>>>>> | | | | | | | | | + 0.80% > __cxxabiv1::__vmi_class_type_info::__do_dyncast(long, > __cxxabiv1::__class_type_info::__sub_kind, __cxxabiv1::__class_type_info > const*, void const*, __cxxabiv1::__class_type_info const*, void const*, > __cxxabiv1::__class_type_info::__dyncast_result&) const > 
>>>>>>> | | | | | | | | + 1.40% bool > std::has_facet<std::ctype<char> >(std::locale const&) > >>>>>>> | | | | | | | | | + 1.30% > __dynamic_cast > >>>>>>> | | | | | | | | | + 0.90% > __cxxabiv1::__vmi_class_type_info::__do_dyncast(long, > __cxxabiv1::__class_type_info::__sub_kind, __cxxabiv1::__class_type_info > const*, void const*, __cxxabiv1::__class_type_info const*, void const*, > __cxxabiv1::__class_type_info::__dyncast_result&) const > >>>>>>> | | | | | | | | + 1.10% bool > std::has_facet<std::num_put<char, std::ostreambuf_iterator<char, > std::char_traits<char> > > >(std::locale const&) > >>>>>>> | | | | | | | | | + 0.90% > __dynamic_cast > >>>>>>> | | | | | | | | + 1.00% bool > std::has_facet<std::num_get<char, std::istreambuf_iterator<char, > std::char_traits<char> > > >(std::locale const&) > >>>>>>> | | | | | | | | | + 0.70% > __dynamic_cast > >>>>>>> | | | | | | | | | + 0.10% > std::locale::id::_M_id() const > >>>>>>> | | | | | | | | | + 0.10% > _ZNKSt6locale2id5_M_idEv@plt > >>>>>>> | | | | | | | | + 0.80% > std::num_put<char, std::ostreambuf_iterator<char, std::char_traits<char> > > > const& std::use_facet<std::num_put<char, std::ostreambuf_iterator<char, > std::char_traits<char> > > >(std::locale const&) > >>>>>>> | | | | | | | | + 0.70% > std::num_get<char, std::istreambuf_iterator<char, std::char_traits<char> > > > const& std::use_facet<std::num_get<char, std::istreambuf_iterator<char, > std::char_traits<char> > > >(std::locale const&) > >>>>>>> | | | | | | | | + 0.10% > _ZSt9has_facetISt7num_putIcSt19ostreambuf_iteratorIcSt11char_traitsIcEEEEbRKSt6locale@plt > >>>>>>> | | | | | | | + 2.00% > std::ios_base::_M_init() > >>>>>>> | | | | | | | | + 0.80% > std::locale::operator=(std::locale const&) > >>>>>>> | | | | | | | | + 0.80% > std::locale::locale() > >>>>>>> | | | | | | | | + 0.30% > std::locale::~locale() > >>>>>>> | | | | | | | | + 0.10% > _ZNSt6localeC1Ev@plt > >>>>>>> | | | | | | | + 0.20% > _ZNSt8ios_base7_M_initEv@plt > >>>>>>> | | | | | | 
+ 0.90% > std::locale::locale() > >>>>>>> | | | | | | + 0.10% > std::ios_base::ios_base() > >>>>>>> | | | | | | + 0.10% > _ZNSt9basic_iosIcSt11char_traitsIcEE4initEPSt15basic_streambufIcS1_E@plt > >>>>>>> | | | | | + 2.80% std::ostream& > std::ostream::_M_insert<unsigned long>(unsigned long) > >>>>>>> | | | | | | + 2.40% > std::num_put<char, std::ostreambuf_iterator<char, std::char_traits<char> > > >::do_put(std::ostreambuf_iterator<char, std::char_traits<char> >, > std::ios_base&, char, unsigned long) const > >>>>>>> | | | | | | | + 2.10% > std::ostreambuf_iterator<char, std::char_traits<char> > std::num_put<char, > std::ostreambuf_iterator<char, std::char_traits<char> > > >::_M_insert_int<unsigned long>(std::ostreambuf_iterator<char, > std::char_traits<char> >, std::ios_base&, char, unsigned long) const > >>>>>>> | | | | | | | | + 1.60% > std::basic_streambuf<char, std::char_traits<char> >::xsputn(char const*, > long) > >>>>>>> | | | | | | | | | + 1.40% > std::basic_stringbuf<char, std::char_traits<char>, std::allocator<char> > >::overflow(int) > >>>>>>> | | | | | | | | | | + 0.90% > std::string::reserve(unsigned long) > >>>>>>> | | | | | | | | | | + 0.10% > std::basic_stringbuf<char, std::char_traits<char>, std::allocator<char> > >::_M_sync(char*, unsigned long, unsigned long) > >>>>>>> | | | | | | | | | | + 0.10% > _ZNSt15basic_stringbufIcSt11char_traitsIcESaIcEE7_M_syncEPcmm@plt > >>>>>>> | | | | | | | | | + 0.20% > __memcpy_ssse3_back > >>>>>>> | | | | | | | | + 0.20% ??? 
> >>>>>>> | | | | | | | | + 0.10% > std::num_put<char, std::ostreambuf_iterator<char, std::char_traits<char> > > >::_M_pad(char, long, std::ios_base&, char*, char const*, int&) const > >>>>>>> | | | | | | | + 0.10% > _ZNKSt7num_putIcSt19ostreambuf_iteratorIcSt11char_traitsIcEEE13_M_insert_intImEES3_S3_RSt8ios_basecT_@plt > >>>>>>> | | | | | | + 0.10% > std::ostream::sentry::sentry(std::ostream&) > >>>>>>> | | | | | + 2.80% > std::basic_stringbuf<char, std::char_traits<char>, std::allocator<char> > >::str() const > >>>>>>> | | | | | | + 1.00% > std::string::assign(std::string const&) > >>>>>>> | | | | | | + 0.90% char* > std::string::_S_construct<char*>(char*, char*, std::allocator<char> const&, > std::forward_iterator_tag) [clone .part.1796] > >>>>>>> | | | | | + 1.50% > std::string::append(char const*, unsigned long) > >>>>>>> | | | | | | + 1.20% > std::string::reserve(unsigned long) > >>>>>>> | | | | | | + 0.60% > std::string::_Rep::_M_clone(std::allocator<char> const&, unsigned long) > >>>>>>> | | | | | | + 0.10% tc_free > >>>>>>> | | | | | + 1.20% > std::string::_Rep::_M_dispose(std::allocator<char> const&) [clone .isra.97] > [clone .part.98] > >>>>>>> | | | | | + 1.00% > std::string::append(std::string const&) > >>>>>>> | | | | | | + 0.70% > std::string::reserve(unsigned long) > >>>>>>> | | | | | | + 0.10% > __memcpy_ssse3_back > >>>>>>> | | | | | + 1.00% > std::basic_string<char, std::char_traits<char>, std::allocator<char> > >::basic_string(std::string const&, unsigned long, unsigned long) > >>>>>>> | | | | | | + 0.80% char* > std::string::_S_construct<char*>(char*, char*, std::allocator<char> const&, > std::forward_iterator_tag) [clone .part.220] > >>>>>>> | | | | | + 0.40% > std::locale::~locale() > >>>>>>> | | | | | + 0.20% tc_free > >>>>>>> | | | | | + 0.20% > __strlen_sse2_pminub > >>>>>>> | | | | | + 0.10% > std::ios_base::~ios_base() > >>>>>>> | | | | | + 0.10% > _ZNSt8ios_baseD2Ev@plt > >>>>>>> | | | | | + 0.10% > 
_ZNKSt15basic_stringbufIcSt11char_traitsIcESaIcEE3strEv@plt > >>>>>>> | | | | + 18.20% > PyFormatter::open_object_section(char const*) > >>>>>>> | | | | | + 17.10% PyDict_New > >>>>>>> | | | | | | + 16.70% > _PyObject_GC_New > >>>>>>> | | | | | | + 16.70% > _PyObject_GC_Malloc > >>>>>>> | | | | | | + 16.60% collect > >>>>>>> | | | | | | | + 8.10% > dict_traverse > >>>>>>> | | | | | | | | + 3.20% > visit_reachable > >>>>>>> | | | | | | | | | + 0.10% > type_is_gc > >>>>>>> | | | | | | | | + 2.80% > visit_decref > >>>>>>> | | | | | | | | + 1.60% > PyDict_Next > >>>>>>> | | | | | | | + 1.30% > list_traverse > >>>>>>> | | | | | | | | + 0.40% > visit_decref > >>>>>>> | | | | | | | | + 0.30% > visit_reachable > >>>>>>> | | | | | | | + 0.60% > func_traverse > >>>>>>> | | | | | | | + 0.40% > _PyDict_MaybeUntrack > >>>>>>> | | | | | | | + 0.10% > type_traverse > >>>>>>> | | | | | | | + 0.10% > subtype_traverse > >>>>>>> | | | | | | | + 0.10% > set_traverse > >>>>>>> | | | | | | | + 0.10% > class_traverse > >>>>>>> | | | | | | | + 0.10% > _PyDict_MaybeUntrack@plt > >>>>>>> | | | | | | + 0.10% > PyObject_Malloc > >>>>>>> | | | | | + 1.00% > PyFormatter::dump_pyobject(char const*, _object*) > >>>>>>> | | | | | + 0.40% > PyString_FromString > >>>>>>> | | | | | + 0.20% > dict_set_item_by_hash_or_entry > >>>>>>> | | | | | + 0.20% PyDict_SetItem > >>>>>>> | | | | | + 0.10% app1 > >>>>>>> | | | | + 6.60% > ceph::Formatter::dump_format_unquoted(char const*, char const*, ...) 
> >>>>>>> | | | | | + 6.60% > PyFormatter::dump_format_va(char const*, char const*, bool, char const*, > __va_list_tag*) > >>>>>>> | | | | | + 3.90% __vsnprintf_chk > >>>>>>> | | | | | | + 3.40% vfprintf > >>>>>>> | | | | | | | + 0.50% strchrnul > >>>>>>> | | | | | | | + 0.40% > __GI__IO_default_xsputn > >>>>>>> | | | | | | | + 0.20% tc_free > >>>>>>> | | | | | | | + 0.10% free@plt > >>>>>>> | | | | | | | + 0.10% (anonymous > namespace)::free_null_or_invalid(void*, void (*)(void*)) [clone > .constprop.41] > >>>>>>> | | | | | | + 0.20% _IO_no_init > >>>>>>> | | | | | | + 0.10% > _IO_str_init_static_internal > >>>>>>> | | | | | + 1.50% > PyFormatter::dump_pyobject(char const*, _object*) > >>>>>>> | | | | | | + 0.50% > PyString_FromString > >>>>>>> | | | | | | + 0.40% PyDict_SetItem > >>>>>>> | | | | | | + 0.10% > dict_set_item_by_hash_or_entry > >>>>>>> | | | | | | + 0.10% > PyDict_SetItem@plt > >>>>>>> | | | | | + 1.20% > PyString_FromString > >>>>>>> | | | | | + 0.60% > PyObject_Malloc > >>>>>>> | | | | | + 0.20% > __strlen_sse2_pminub > >>>>>>> | | | | | + 0.10% > __memcpy_ssse3_back > >>>>>>> | | | | + 0.90% ctime_r > >>>>>>> | | | | + 0.80% > PyFormatter::open_array_section(char const*) > >>>>>>> | | | | + 0.40% > std::string::_Rep::_M_dispose(std::allocator<char> const&) [clone > .isra.846] [clone .part.847] > >>>>>>> | | | | + 0.30% > PyFormatter::dump_int(char const*, long) > >>>>>>> | | | | + 0.20% > PyFormatter::close_section() > >>>>>>> | | | | + 0.10% tc_free > >>>>>>> | | | | + 0.10% > std::basic_string<char, std::char_traits<char>, std::allocator<char> > >::basic_string(char const*, std::allocator<char> const&) > >>>>>>> | | | | + 0.10% > std::_Rb_tree_increment(std::_Rb_tree_node_base const*) > >>>>>>> | | | | + 0.10% > pow2_hist_t::dump(ceph::Formatter*) const > >>>>>>> | | | | + 0.10% > objectstore_perf_stat_t::dump(ceph::Formatter*) const > >>>>>>> | | | | + 0.10% > PyFormatter::dump_string(char const*, std::basic_string_view<char, > 
std::char_traits<char> >) > >>>>>>> | | | | + 0.10% > PyFormatter::dump_pyobject(char const*, _object*) > >>>>>>> | | | + 21.80% Mutex::lock(bool) > >>>>>>> | | | | + 21.80% pthread_mutex_lock > >>>>>>> | | | | + 21.80% _L_lock_883 > >>>>>>> | | | | + 21.80% __lll_lock_wait > >>>>>>> | | | + 11.70% > PGMap::dump(ceph::Formatter*) const > >>>>>>> | | | | + 11.70% > PGMap::dump_pg_stats(ceph::Formatter*, bool) const > >>>>>>> | | | | + 10.90% > pg_stat_t::dump(ceph::Formatter*) const > >>>>>>> | | | | | + 4.20% > PyFormatter::dump_stream(char const*) > >>>>>>> | | | | | | + 2.80% > std::basic_ios<char, std::char_traits<char> > >::init(std::basic_streambuf<char, std::char_traits<char> >*) > >>>>>>> | | | | | | | + 2.10% > std::basic_ios<char, std::char_traits<char> >::_M_cache_locale(std::locale > const&) > >>>>>>> | | | | | | | | + 0.50% > std::num_put<char, std::ostreambuf_iterator<char, std::char_traits<char> > > > const& std::use_facet<std::num_put<char, std::ostreambuf_iterator<char, > std::char_traits<char> > > >(std::locale const&) > >>>>>>> | | | | | | | | + 0.50% bool > std::has_facet<std::num_put<char, std::ostreambuf_iterator<char, > std::char_traits<char> > > >(std::locale const&) > >>>>>>> | | | | | | | | + 0.40% bool > std::has_facet<std::num_get<char, std::istreambuf_iterator<char, > std::char_traits<char> > > >(std::locale const&) > >>>>>>> | | | | | | | | + 0.20% > std::num_get<char, std::istreambuf_iterator<char, std::char_traits<char> > > > const& std::use_facet<std::num_get<char, std::istreambuf_iterator<char, > std::char_traits<char> > > >(std::locale const&) > >>>>>>> | | | | | | | | + 0.20% > std::ctype<char> const& std::use_facet<std::ctype<char> >(std::locale > const&) > >>>>>>> | | | | | | | | + 0.20% bool > std::has_facet<std::ctype<char> >(std::locale const&) > >>>>>>> | | | | | | | | + 0.10% > _ZSt9has_facetISt7num_putIcSt19ostreambuf_iteratorIcSt11char_traitsIcEEEEbRKSt6locale@plt > >>>>>>> | | | | | | | + 0.70% > std::ios_base::_M_init() 
> >>>>>>> | | | | | | + 0.50% > tcmalloc::ThreadCache::FetchFromCentralCache(unsigned int, int) > >>>>>>> | | | | | | + 0.40% > std::string::assign(char const*, unsigned long) > >>>>>>> | | | | | | + 0.20% > std::locale::locale() > >>>>>>> | | | | | | + 0.10% > std::ios_base::ios_base() > >>>>>>> | | | | | + 1.80% > object_stat_collection_t::dump(ceph::Formatter*) const > >>>>>>> | | | | | | + 1.70% > object_stat_sum_t::dump(ceph::Formatter*) const > >>>>>>> | | | | | | | + 1.40% > PyFormatter::dump_pyobject(char const*, _object*) > >>>>>>> | | | | | | | | + 0.60% dictresize > >>>>>>> | | | | | | | | + 0.30% > PyString_FromString > >>>>>>> | | | | | | | | + 0.20% > PyDict_SetItem > >>>>>>> | | | | | | | | + 0.10% > dict_set_item_by_hash_or_entry > >>>>>>> | | | | | | | + 0.20% > PyFormatter::dump_int(char const*, long) > >>>>>>> | | | | | | + 0.10% > PyFormatter::open_object_section(char const*) > >>>>>>> | | | | | + 1.80% > PyFormatter::open_array_section(char const*) > >>>>>>> | | | | | | + 1.60% PyList_New > >>>>>>> | | | | | | | + 1.60% > _PyObject_GC_New > >>>>>>> | | | | | | | + 1.60% > _PyObject_GC_Malloc > >>>>>>> | | | | | | | + 1.60% collect > >>>>>>> | | | | | | | + 0.80% > dict_traverse > >>>>>>> | | | | | | | + 0.10% > subtype_traverse > >>>>>>> | | | | | | | + 0.10% > list_traverse > >>>>>>> | | | | | | | + 0.10% > func_traverse > >>>>>>> | | | | | | | + 0.10% > _PyDict_MaybeUntrack > >>>>>>> | | | | | | + 0.20% > PyFormatter::dump_pyobject(char const*, _object*) > >>>>>>> | | | | | + 1.70% > utime_t::localtime(std::ostream&) const > >>>>>>> | | | | | | + 1.00% std::ostream& > std::ostream::_M_insert<long>(long) > >>>>>>> | | | | | | | + 0.60% > std::num_put<char, std::ostreambuf_iterator<char, std::char_traits<char> > > >::do_put(std::ostreambuf_iterator<char, std::char_traits<char> >, > std::ios_base&, char, long) const > >>>>>>> | | | | | | + 0.30% > std::basic_ostream<char, std::char_traits<char> >& > std::__ostream_insert<char, 
std::char_traits<char> > >(std::basic_ostream<char, std::char_traits<char> >&, char const*, long) > >>>>>>> | | | | | | + 0.20% __tz_convert > >>>>>>> | | | | | + 0.90% > PyFormatter::dump_pyobject(char const*, _object*) > >>>>>>> | | | | | + 0.20% > pg_state_string(unsigned long) > >>>>>>> | | | | | + 0.20% > operator<<(std::ostream&, eversion_t const&) [clone .isra.103] > >>>>>>> | | | | | + 0.10% std::ostream& > std::ostream::_M_insert<unsigned long>(unsigned long) > >>>>>>> | | | | + 0.40% > PyFormatter::dump_stream(char const*) > >>>>>>> | | | | + 0.30% > operator<<(std::ostream&, pg_t const&) > >>>>>>> | | | | + 0.10% > PyFormatter::open_object_section(char const*) > >>>>>>> | | | + 2.70% > PyFormatter::finish_pending_streams() > >>>>>>> | | | | + 1.00% > std::_List_base<std::shared_ptr<PyFormatter::PendingStream>, > std::allocator<std::shared_ptr<PyFormatter::PendingStream> > >::_M_clear() > >>>>>>> | | | | | + 0.40% > std::_Sp_counted_ptr_inplace<PyFormatter::PendingStream, > std::allocator<PyFormatter::PendingStream>, > (__gnu_cxx::_Lock_policy)2>::_M_dispose() > >>>>>>> | | | | | + 0.20% > tcmalloc::ThreadCache::ListTooLong(tcmalloc::ThreadCache::FreeList*, > unsigned int) > >>>>>>> | | | | + 0.70% > PyFormatter::dump_pyobject(char const*, _object*) > >>>>>>> | | | | + 0.50% > std::string::replace(unsigned long, unsigned long, char const*, unsigned > long) > >>>>>>> | | | | + 0.30% PyString_FromString > >>>>>>> | | | + 1.10% PyEval_RestoreThread > >>>>>>> | | | + 1.10% PyThread_acquire_lock > >>>>>>> | | | + 1.10% sem_wait@ > @GLIBC_2.2.5 > >>>>>>> | | | + 1.10% > __new_sem_wait_slow.constprop.0 > >>>>>>> | | | + 1.10% > do_futex_wait.constprop.1 > >>>>>>> | | + 2.90% frame_dealloc > >>>>>>> | | + 2.90% dict_dealloc > >>>>>>> | | + 2.90% list_dealloc > >>>>>>> | | + 2.90% dict_dealloc > >>>>>>> | | + 1.90% list_dealloc > >>>>>>> | | | + 1.90% dict_dealloc > >>>>>>> | | | + 1.70% list_dealloc > >>>>>>> | | | + 1.50% dict_dealloc > >>>>>>> | | | | + 0.90% 
dict_dealloc > >>>>>>> | | | | + 0.10% PyObject_Free > >>>>>>> | | | + 0.10% > tcmalloc::ThreadCache::ListTooLong(tcmalloc::ThreadCache::FreeList*, > unsigned int) > >>>>>>> | | + 0.30% PyObject_Free > >>>>>>> | | + 0.20% dict_dealloc > >>>>>>> | + 8.10% Gil::Gil(SafeThreadState&, bool) > >>>>>>> | + 8.10% PyEval_RestoreThread > >>>>>>> | + 8.10% PyThread_acquire_lock > >>>>>>> | + 8.10% sem_wait@@GLIBC_2.2.5 > >>>>>>> | + 8.10% __new_sem_wait_slow.constprop.0 > >>>>>>> | + 8.10% do_futex_wait.constprop.1 > >>>>>>> + 0.60% > std::condition_variable::wait(std::unique_lock<std::mutex>&) > >>>>>>> > >>>>>>> -- > >>>>>>> Paul Mezzanini > >>>>>>> Sr Systems Administrator / Engineer, Research Computing > >>>>>>> Information & Technology Services > >>>>>>> Finance & Administration > >>>>>>> Rochester Institute of Technology > >>>>>>> o:(585) 475-3245 | pfm...@rit.edu > >>>>>>> > >>>>>>> CONFIDENTIALITY NOTE: The information transmitted, including > attachments, is > >>>>>>> intended only for the person(s) or entity to which it is addressed > and may > >>>>>>> contain confidential and/or privileged material. Any review, > retransmission, > >>>>>>> dissemination or other use of, or taking of any action in reliance > upon this > >>>>>>> information by persons or entities other than the intended > recipient is > >>>>>>> prohibited. If you received this in error, please contact the > sender and > >>>>>>> destroy any copies of this information. > >>>>>>> ------------------------ > >>>>>>> > >>>>>>> ________________________________________ > >>>>>>> From: Mark Nelson <mnel...@redhat.com> > >>>>>>> Sent: Thursday, December 19, 2019 11:47 AM > >>>>>>> To: ceph-users@ceph.io > >>>>>>> Subject: [ceph-users] Re: High CPU usage by ceph-mgr in 14.2.5 > >>>>>>> > >>>>>>> If you can get a wallclock profiler on the mgr process we might be > able > >>>>>>> to figure out specifics of what's taking so much time (ie > processing > >>>>>>> pg_summary or something else). 
> >>>>>>> Assuming you have gdb with the python bindings and the ceph debug
> >>>>>>> packages installed, if you (or anyone) could try gdbpmp on the
> >>>>>>> 100% mgr process that would be fantastic.
> >>>>>>>
> >>>>>>> https://github.com/markhpc/gdbpmp
> >>>>>>>
> >>>>>>> gdbpmp.py -p`pidof ceph-mgr` -n 1000 -o mgr.gdbpmp
> >>>>>>>
> >>>>>>> If you want to view the results:
> >>>>>>>
> >>>>>>> gdbpmp.py -i mgr.gdbpmp -t 1
> >>>>>>>
> >>>>>>> Thanks,
> >>>>>>>
> >>>>>>> Mark
> >>>>>>>
> >>>>>>> On 12/19/19 6:29 AM, Paul Emmerich wrote:
> >>>>>>>> We're also seeing unusually high mgr CPU usage on some setups, the
> >>>>>>>> only thing they have in common seems to be > 300 OSDs.
> >>>>>>>>
> >>>>>>>> Threads using the CPU are "mgr-fin" and "ms_dispatch"
> >>>>>>>>
> >>>>>>>> Paul
> >>>>>>>>
> >>>>>>>> --
> >>>>>>>> Paul Emmerich
> >>>>>>>>
> >>>>>>>> Looking for help with your Ceph cluster? Contact us at https://croit.io
> >>>>>>>>
> >>>>>>>> croit GmbH
> >>>>>>>> Freseniusstr. 31h
> >>>>>>>> 81247 München
> >>>>>>>> www.croit.io
> >>>>>>>> Tel: +49 89 1896585 90
> >>>>>>>>
> >>>>>>>> On Thu, Dec 19, 2019 at 9:40 AM Serkan Çoban <cobanser...@gmail.com> wrote:
> >>>>>>>>
> >>>>>>>>     +1
> >>>>>>>>     1500 OSDs, mgr is at a constant 100% after upgrading from 14.2.2 to 14.2.5.
> >>>>>>>>
> >>>>>>>>     On Thu, Dec 19, 2019 at 11:06 AM Toby Darling
> >>>>>>>>     <t...@mrc-lmb.cam.ac.uk> wrote:
> >>>>>>>>     >
> >>>>>>>>     > On 18/12/2019 22:40, Bryan Stillwell wrote:
> >>>>>>>>     > > That's how we noticed it too. Our graphs went silent after
> >>>>>>>>     > > the upgrade completed. Is your large cluster over 350 OSDs?
> >>>>>>>>     >
> >>>>>>>>     > A 'me too' on this - graphs have gone quiet, and mgr is using
> >>>>>>>>     > 100% CPU.
> >>>>>>>>     > This happened when we grew our 14.2.5 cluster from 328 to 436 OSDs.
> >>>>>>>>     >
> >>>>>>>>     > Cheers
> >>>>>>>>     > Toby
> >>>>>>>>     > --
> >>>>>>>>     > Toby Darling, Scientific Computing (2N249)
> >>>>>>>>     > MRC Laboratory of Molecular Biology
> >>>>>>>>     > Francis Crick Avenue
> >>>>>>>>     > Cambridge Biomedical Campus
> >>>>>>>>     > Cambridge CB2 0QH
> >>>>>>>>     > Phone 01223 267070
> >>>>>>>>     > _______________________________________________
> >>>>>>>>     > ceph-users mailing list -- ceph-users@ceph.io
> >>>>>>>>     > To unsubscribe send an email to ceph-users-le...@ceph.io
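For anyone who wants to watch for the "ceph balancer status wouldn't return" symptom Wido describes, here is a minimal sketch in Python. It only relies on the `ceph balancer status` command and the JSON fields shown at the top of this message. The function names and the 30-second timeout are my own assumptions, not anything from Ceph, and the duration parser assumes the H:MM:SS.ffffff form seen above.

```python
import json
import subprocess
from datetime import timedelta

# Sample copied from the `ceph balancer status` output quoted in this thread.
SAMPLE_STATUS = """
{
    "last_optimize_duration": "0:00:02.806449",
    "plans": [],
    "mode": "upmap",
    "active": true,
    "optimize_result": "Optimization plan created successfully",
    "last_optimize_started": "Thu Jan 16 11:26:19 2020"
}
"""


def optimize_seconds(status_json):
    """Return last_optimize_duration as seconds.

    Assumes the H:MM:SS[.ffffff] form shown in the thread; other forms
    (for example durations longer than a day) are not handled.
    """
    status = json.loads(status_json)
    hours, minutes, seconds = status["last_optimize_duration"].split(":")
    duration = timedelta(
        hours=int(hours), minutes=int(minutes), seconds=float(seconds)
    )
    return duration.total_seconds()


def balancer_status(timeout_s=30.0):
    """Run `ceph balancer status` and return its raw output.

    Returns None if the command does not come back within timeout_s
    seconds (hypothetical threshold), which is the hang described above.
    """
    try:
        result = subprocess.run(
            ["ceph", "balancer", "status"],
            capture_output=True,
            text=True,
            timeout=timeout_s,
            check=True,
        )
    except subprocess.TimeoutExpired:
        return None
    return result.stdout
```

On a node with a working admin keyring, something like `raw = balancer_status()` followed by `optimize_seconds(raw)` when `raw` is not None would give a number to graph or alert on; a None result would suggest the mgr is too busy to answer.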