Hello, just a quick report:

It looks like changing the message type to simple helped avoid the memory leak.
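For reference, this is the change I applied, as Zheng suggested below (just a
sketch of the relevant part of my ceph.conf; the rest of the [global] section
is omitted):

    [global]
            ms type = simple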
About a day later, the memory is still OK:
   1264 ceph      20   0 12,547g 1,247g  16652 S   3,3  8,2 110:16.93 ceph-mds


The memory usage is more than 2x the MDS cache limit (512 MB), but maybe that
is daemon overhead and memory fragmentation. At least it is not 13-15 GB like
before.
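To check how much of that is cache versus allocator overhead, I think these
commands should work ("heap stats" was used earlier in this thread; "cache
status" is from the Luminous docs, so treat it as an assumption on my part):

    ceph daemon mds.kavehome-mgto-pro-fs01 cache status
    ceph tell mds.kavehome-mgto-pro-fs01 heap stats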

Greetings!!

2018-07-25 23:16 GMT+02:00 Daniel Carrasco <d.carra...@i2tic.com>:

> I've changed the configuration, adding your line and setting the MDS
> memory limit to 512 MB, and for now it looks stable (it's at about 3-6% and
> sometimes even below 3%). I saw very high usage on boot:
> 1264 ceph      20   0 12,543g 6,251g  16184 S   2,0 41,1%   0:19.34 ceph-mds
>
> but now it looks acceptable:
> 1264 ceph      20   0 12,543g 737952  16188 S   1,0  4,6%   0:41.05 ceph-mds
>
> Anyway, I need more time to test it, because 15 minutes is too short.
>
> Greetings!!
>
> 2018-07-25 17:16 GMT+02:00 Daniel Carrasco <d.carra...@i2tic.com>:
>
>> Hello,
>>
>> Thanks for all your help.
>>
>> Is "dd" an option of some other command? At least on Debian/Ubuntu it is
>> an application for copying blocks, so that command fails.
>> For now I cannot change the configuration, but I'll try later.
>> About the logs, I haven't seen anything like "warning", "error", "failed",
>> "message" or similar, so it looks like there are no messages of that
>> kind.
>>
>>
>> Greetings!!
>>
>> 2018-07-25 14:48 GMT+02:00 Yan, Zheng <uker...@gmail.com>:
>>
>>> On Wed, Jul 25, 2018 at 8:12 PM Yan, Zheng <uker...@gmail.com> wrote:
>>> >
>>> > On Wed, Jul 25, 2018 at 5:04 PM Daniel Carrasco <d.carra...@i2tic.com>
>>> wrote:
>>> > >
>>> > > Hello,
>>> > >
>>> > > I've attached the PDF.
>>> > >
>>> > > I don't know if it is important, but I made configuration changes
>>> and restarted the servers after dumping that heap file. I've changed the
>>> memory_limit to 25 MB to test if the RAM usage stays at acceptable values.
>>> > >
>>> >
>>> > Looks like there is a memory leak in the async messenger. What's the output
>>> > of "dd /usr/bin/ceph-mds"? Could you try the simple messenger (add "ms type =
>>> > simple" to the 'global' section of ceph.conf)?
>>> >
>>>
>>> Besides, are there any suspicious messages in the mds log, such as "failed
>>> to decode message of type"?
>>>
>>>
>>>
>>>
>>> > Regards
>>> > Yan, Zheng
>>> >
>>> > > Greetings!
>>> > >
>>> > > 2018-07-25 2:53 GMT+02:00 Yan, Zheng <uker...@gmail.com>:
>>> > >>
>>> > >> On Wed, Jul 25, 2018 at 4:52 AM Daniel Carrasco <
>>> d.carra...@i2tic.com> wrote:
>>> > >> >
>>> > >> > Hello,
>>> > >> >
>>> > >> > I've run the profiler for about 5-6 minutes and this is what I've
>>> got:
>>> > >> >
>>> > >>
>>> > >> Please run: pprof --pdf /usr/bin/ceph-mds
>>> > >> /var/log/ceph/ceph-mds.x.profile.<largest number>.heap >
>>> > >> /tmp/profile.pdf and send me the PDF.
>>> > >>
>>> > >>
>>> > >>
>>> > >> > ------------------------------------------------------------
>>> --------------------------------
>>> > >> > ------------------------------------------------------------
>>> --------------------------------
>>> > >> > ------------------------------------------------------------
>>> --------------------------------
>>> > >> > Using local file /usr/bin/ceph-mds.
>>> > >> > Using local file /var/log/ceph/mds.kavehome-mgt
>>> o-pro-fs01.profile.0009.heap.
>>> > >> > Total: 400.0 MB
>>> > >> >    362.5  90.6%  90.6%    362.5  90.6%
>>> ceph::buffer::create_aligned_in_mempool
>>> > >> >     20.4   5.1%  95.7%     29.8   7.5% CDir::_load_dentry
>>> > >> >      5.9   1.5%  97.2%      6.9   1.7% CDir::add_primary_dentry
>>> > >> >      4.7   1.2%  98.4%      4.7   1.2%
>>> ceph::logging::Log::create_entry
>>> > >> >      1.8   0.5%  98.8%      1.8   0.5%
>>> std::_Rb_tree::_M_emplace_hint_unique
>>> > >> >      1.8   0.5%  99.3%      2.2   0.5% compact_map_base::decode
>>> > >> >      0.6   0.1%  99.4%      0.7   0.2% CInode::add_client_cap
>>> > >> >      0.5   0.1%  99.5%      0.5   0.1%
>>> std::__cxx11::basic_string::_M_mutate
>>> > >> >      0.4   0.1%  99.6%      0.4   0.1% SimpleLock::more
>>> > >> >      0.4   0.1%  99.7%      0.4   0.1% MDCache::add_inode
>>> > >> >      0.3   0.1%  99.8%      0.3   0.1% CDir::add_to_bloom
>>> > >> >      0.2   0.1%  99.9%      0.2   0.1% CDir::steal_dentry
>>> > >> >      0.2   0.0%  99.9%      0.2   0.0% CInode::get_or_open_dirfrag
>>> > >> >      0.1   0.0%  99.9%      0.8   0.2% std::enable_if::type decode
>>> > >> >      0.1   0.0% 100.0%      0.1   0.0% ceph::buffer::list::crc32c
>>> > >> >      0.1   0.0% 100.0%      0.1   0.0% decode_message
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% OpTracker::create_request
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% TrackedOp::TrackedOp
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0%
>>> std::vector::_M_emplace_back_aux
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0%
>>> std::_Rb_tree::_M_insert_unique
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% CInode::add_dirfrag
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% MDLog::_prepare_new_segment
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% DispatchQueue::enqueue
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0%
>>> ceph::buffer::list::push_back
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% Server::prepare_new_inode
>>> > >> >      0.0   0.0% 100.0%    365.6  91.4% EventCenter::process_events
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% std::_Rb_tree::_M_copy
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% CDir::add_null_dentry
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0%
>>> Locker::check_inode_max_size
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% CDentry::add_client_lease
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% CInode::project_inode
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0%
>>> std::__cxx11::list::_M_insert
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0%
>>> MDBalancer::handle_heartbeat
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% MDBalancer::send_heartbeat
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0%
>>> C_GatherBase::C_GatherSub::complete
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0%
>>> EventCenter::create_time_event
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% CDir::_omap_fetch
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0%
>>> Locker::handle_inode_file_caps
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0%
>>> std::_Rb_tree::_M_insert_equal
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% Locker::issue_caps
>>> > >> >      0.0   0.0% 100.0%      0.1   0.0% MDLog::_submit_thread
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% Journaler::_wait_for_flush
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% Journaler::wrap_finisher
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% MDSCacheObject::add_waiter
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% std::__cxx11::list::insert
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0%
>>> std::__detail::_Map_base::operator[]
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0%
>>> Locker::mark_updated_scatterlock
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% std::_Rb_tree::_M_insert_
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% alloc_ptr::operator->
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0%
>>> ceph::buffer::list::append@5c1560
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0%
>>> ceph::buffer::malformed_input::~malformed_input
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% compact_set_base::insert
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% CDir::add_waiter
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% InoTable::apply_release_ids
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0%
>>> InoTable::project_release_ids
>>> > >> >      0.0   0.0% 100.0%      2.2   0.5% InodeStoreBase::decode_bare
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% interval_set::erase
>>> > >> >      0.0   0.0% 100.0%      1.1   0.3% std::map::operator[]
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% Beacon::_send
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% MDSDaemon::reset_tick
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% MgrClient::send_report
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% Journaler::_do_flush
>>> > >> >      0.0   0.0% 100.0%      0.1   0.0% Locker::rdlock_start
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% MDCache::_get_waiter
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% CDentry::~CDentry
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% MonClient::schedule_tick
>>> > >> >      0.0   0.0% 100.0%      0.1   0.0%
>>> AsyncConnection::handle_write
>>> > >> >      0.0   0.0% 100.0%      0.1   0.0%
>>> AsyncConnection::prepare_send_message
>>> > >> >      0.0   0.0% 100.0%    365.5  91.4% AsyncConnection::process
>>> > >> >      0.0   0.0% 100.0%      0.3   0.1%
>>> AsyncConnection::send_message
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% AsyncConnection::tick
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0%
>>> AsyncMessenger::_send_message
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0%
>>> AsyncMessenger::send_message
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0%
>>> AsyncMessenger::submit_message
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% Beacon::notify_health
>>> > >> >      0.0   0.0% 100.0%      0.2   0.1% CDentry::CDentry
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% CDentry::_mark_dirty
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% CDentry::auth_pin
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% CDentry::mark_dirty
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0%
>>> CDentry::pop_projected_linkage
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% CDir::_mark_dirty
>>> > >> >      0.0   0.0% 100.0%     29.8   7.5% CDir::_omap_fetched
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% CDir::auth_pin
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% CDir::fetch
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% CDir::link_inode_work
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% CDir::link_primary_inode
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% CInode::_mark_dirty
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% CInode::add_waiter
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% CInode::auth_pin
>>> > >> >      0.0   0.0% 100.0%      0.4   0.1% CInode::choose_ideal_loner
>>> > >> >      0.0   0.0% 100.0%      0.4   0.1% CInode::encode_inodestat
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% CInode::mark_dirty
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% CInode::mark_dirty_parent
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% CInode::mark_dirty_rstat
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0%
>>> CInode::pop_and_dirty_projected_inode
>>> > >> >      0.0   0.0% 100.0%      0.4   0.1% CInode::set_loner_cap
>>> > >> >      0.0   0.0% 100.0%     29.8   7.5%
>>> C_IO_Dir_OMAP_Fetched::finish
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0%
>>> C_Locker_FileUpdate_finish::finish
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% C_MDL_CheckMaxSize::finish
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0%
>>> C_MDS_RetryRequest::~C_MDS_RetryRequest
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0%
>>> C_MDS_inode_update_finish::finish
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% C_MDS_openc_finish::finish
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0%
>>> C_MDS_session_finish::finish
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% C_OnFinisher::finish
>>> > >> >      0.0   0.0% 100.0%      1.2   0.3% Context::complete
>>> > >> >      0.0   0.0% 100.0%      3.7   0.9%
>>> DispatchQueue::DispatchThread::entry
>>> > >> >      0.0   0.0% 100.0%      3.7   0.9% DispatchQueue::entry
>>> > >> >      0.0   0.0% 100.0%      0.9   0.2%
>>> DispatchQueue::fast_dispatch
>>> > >> >      0.0   0.0% 100.0%      2.1   0.5% DispatchQueue::pre_dispatch
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% EMetaBlob::print
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0%
>>> EventCenter::process_time_events
>>> > >> >      0.0   0.0% 100.0%     29.9   7.5%
>>> Finisher::finisher_thread_entry
>>> > >> >      0.0   0.0% 100.0%      0.4   0.1% FunctionContext::finish
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% Journaler::_flush
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% Journaler::_write_head
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% Journaler::flush
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% Journaler::wait_for_flush
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% Locker::_do_cap_release
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% Locker::_do_cap_update
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% Locker::_drop_non_rdlocks
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% Locker::_rdlock_kick
>>> > >> >      0.0   0.0% 100.0%      0.2   0.0% Locker::acquire_locks
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% Locker::adjust_cap_wanted
>>> > >> >      0.0   0.0% 100.0%      0.2   0.0% Locker::dispatch
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% Locker::drop_locks
>>> > >> >      0.0   0.0% 100.0%      0.4   0.1% Locker::eval
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% Locker::eval_gather
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% Locker::file_update_finish
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0%
>>> Locker::handle_client_cap_release
>>> > >> >      0.0   0.0% 100.0%      0.2   0.0% Locker::handle_client_caps
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% Locker::handle_client_lease
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% Locker::handle_file_lock
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% Locker::handle_lock
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% Locker::issue_caps_set
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% Locker::issue_client_lease
>>> > >> >      0.0   0.0% 100.0%      0.6   0.2% Locker::issue_new_caps
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% Locker::local_wrlock_start
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% Locker::nudge_log
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% Locker::scatter_eval
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% Locker::scatter_mix
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% Locker::scatter_nudge
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% Locker::scatter_tick
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% Locker::scatter_writebehind
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0%
>>> Locker::scatter_writebehind_finish
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0%
>>> Locker::send_lock_message@42d5b0
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0%
>>> Locker::send_lock_message@42f2b0
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0%
>>> Locker::share_inode_max_size
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% Locker::simple_lock
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% Locker::simple_sync
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% Locker::tick
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% Locker::try_eval@43da60
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% Locker::try_eval@441fb0
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% Locker::wrlock_finish
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% Locker::wrlock_force
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% Locker::wrlock_start
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% Locker::xlock_start
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% MClientCaps::print
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0%
>>> MClientRequest::decode_payload
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% MClientRequest::print
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% MDBalancer::prep_rebalance
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% MDBalancer::proc_message
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% MDCache::check_memory_usage
>>> > >> >      0.0   0.0% 100.0%      0.2   0.1% MDCache::path_traverse
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0%
>>> MDCache::predirty_journal_parents
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% MDCache::request_cleanup
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% MDCache::request_finish
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% MDCache::request_start
>>> > >> >      0.0   0.0% 100.0%      0.3   0.1% MDCache::trim
>>> > >> >      0.0   0.0% 100.0%      0.3   0.1% MDCache::trim_dentry
>>> > >> >      0.0   0.0% 100.0%      0.3   0.1% MDCache::trim_lru
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% MDCache::truncate_inode
>>> > >> >      0.0   0.0% 100.0%      0.1   0.0% MDLog::SubmitThread::entry
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% MDLog::_start_new_segment
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% MDLog::_submit_entry
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% MDLog::submit_entry
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0%
>>> MDSCacheObject::finish_waiting
>>> > >> >      0.0   0.0% 100.0%      0.2   0.1% MDSCacheObject::get
>>> > >> >      0.0   0.0% 100.0%      1.7   0.4% MDSDaemon::ms_dispatch
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% MDSDaemon::tick
>>> > >> >      0.0   0.0% 100.0%     29.9   7.5% MDSIOContextBase::complete
>>> > >> >      0.0   0.0% 100.0%      0.7   0.2%
>>> MDSInternalContextBase::complete
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% MDSLogContextBase::complete
>>> > >> >      0.0   0.0% 100.0%      0.3   0.1%
>>> MDSRank::ProgressThread::entry
>>> > >> >      0.0   0.0% 100.0%      0.7   0.2% MDSRank::_advance_queues
>>> > >> >      0.0   0.0% 100.0%      1.7   0.4% MDSRank::_dispatch
>>> > >> >      0.0   0.0% 100.0%      1.3   0.3%
>>> MDSRank::handle_deferrable_message
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0%
>>> MDSRank::send_message_client
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0%
>>> MDSRank::send_message_client_counted@2a9260
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0%
>>> MDSRank::send_message_client_counted@2a94f0
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0%
>>> MDSRank::send_message_client_counted@2b1920
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% MDSRank::send_message_mds
>>> > >> >      0.0   0.0% 100.0%      1.7   0.4%
>>> MDSRankDispatcher::ms_dispatch
>>> > >> >      0.0   0.0% 100.0%      0.4   0.1% MDSRankDispatcher::tick
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% MOSDOp::print
>>> > >> >      0.0   0.0% 100.0%      0.1   0.0% Message::encode
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0%
>>> MonClient::_check_auth_rotating
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0%
>>> MonClient::_check_auth_tickets
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0%
>>> MonClient::_send_mon_message
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% MonClient::tick
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% MutationImpl::MutationImpl
>>> > >> >      0.0   0.0% 100.0%      0.1   0.0% MutationImpl::auth_pin
>>> > >> >      0.0   0.0% 100.0%      0.1   0.0% MutationImpl::pin
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% MutationImpl::start_locking
>>> > >> >      0.0   0.0% 100.0%    365.6  91.4% NetworkStack::get_worker
>>> > >> >      0.0   0.0% 100.0%      0.8   0.2%
>>> ObjectOperation::C_ObjectOperation_decodevals::finish
>>> > >> >      0.0   0.0% 100.0%      0.1   0.0% Objecter::_op_submit
>>> > >> >      0.0   0.0% 100.0%      0.1   0.0%
>>> Objecter::_op_submit_with_budget
>>> > >> >      0.0   0.0% 100.0%      0.1   0.0% Objecter::_send_op
>>> > >> >      0.0   0.0% 100.0%      0.8   0.2%
>>> Objecter::handle_osd_op_reply
>>> > >> >      0.0   0.0% 100.0%      0.8   0.2% Objecter::ms_dispatch
>>> > >> >      0.0   0.0% 100.0%      0.8   0.2% Objecter::ms_fast_dispatch
>>> > >> >      0.0   0.0% 100.0%      0.1   0.0% Objecter::op_submit
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% Objecter::sg_write_trunc
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% OpHistory::insert
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0%
>>> OpTracker::unregister_inflight_op
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0%
>>> PrebufferedStreambuf::overflow
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% SafeTimer::add_event_after
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% SafeTimer::add_event_at
>>> > >> >      0.0   0.0% 100.0%      0.4   0.1% SafeTimer::timer_thread
>>> > >> >      0.0   0.0% 100.0%      0.4   0.1% SafeTimerThread::entry
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% Server::_session_logged
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0%
>>> Server::apply_allocated_inos
>>> > >> >      0.0   0.0% 100.0%      1.1   0.3% Server::dispatch
>>> > >> >      0.0   0.0% 100.0%      1.7   0.4%
>>> Server::dispatch_client_request
>>> > >> >      0.0   0.0% 100.0%      0.7   0.2%
>>> Server::handle_client_getattr
>>> > >> >      0.0   0.0% 100.0%      0.9   0.2% Server::handle_client_open
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% Server::handle_client_openc
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0%
>>> Server::handle_client_readdir
>>> > >> >      0.0   0.0% 100.0%      1.1   0.3%
>>> Server::handle_client_request
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0%
>>> Server::handle_client_session
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0%
>>> Server::handle_client_setattr
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% Server::journal_and_reply
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0%
>>> Server::journal_close_session
>>> > >> >      0.0   0.0% 100.0%      0.3   0.1% Server::rdlock_path_pin_ref
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% Server::recall_client_state
>>> > >> >      0.0   0.0% 100.0%      0.5   0.1%
>>> Server::reply_client_request
>>> > >> >      0.0   0.0% 100.0%      0.5   0.1% Server::respond_to_request
>>> > >> >      0.0   0.0% 100.0%      0.4   0.1% Server::set_trace_dist
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% SessionMap::_mark_dirty
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% SessionMap::mark_dirty
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% SessionMap::remove_session
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0%
>>> ceph::buffer::list::iterator_impl::copy
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0%
>>> ceph::buffer::list::iterator_impl::copy_shallow
>>> > >> >      0.0   0.0% 100.0%    400.0 100.0% clone
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% filepath::parse_bits
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% inode_t::operator=
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% operator<<@2a2890
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% operator<<@2c9760
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% operator<<@3eadf0
>>> > >> >      0.0   0.0% 100.0%    400.0 100.0% start_thread
>>> > >> >      0.0   0.0% 100.0%      0.1   0.0%
>>> std::__cxx11::basic_string::_M_append
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0%
>>> std::__cxx11::basic_string::_M_replace_aux
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0%
>>> std::__cxx11::list::operator=
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% std::__ostream_insert
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0%
>>> std::basic_streambuf::xsputn
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% std::num_put::_M_insert_int
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% std::num_put::do_put
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% std::operator<<
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% std::ostream::_M_insert
>>> > >> >      0.0   0.0% 100.0%    365.6  91.4%
>>> std::this_thread::__sleep_for
>>> > >> >      0.0   0.0% 100.0%      0.0   0.0% utime_t::localtime
>>> > >> >      0.0   0.0% 100.0%      0.1   0.0% void finish_contexts@2a30f0
>>> > >> > ------------------------------------------------------------
>>> --------------------------------
>>> > >> > ------------------------------------------------------------
>>> --------------------------------
>>> > >> > ------------------------------------------------------------
>>> --------------------------------
>>> > >> >
>>> > >> >
>>> > >> > Greetings!!
>>> > >> >
>>> > >> > 2018-07-24 12:07 GMT+02:00 Yan, Zheng <uker...@gmail.com>:
>>> > >> >>
>>> > >> >> On Tue, Jul 24, 2018 at 4:59 PM Daniel Carrasco <
>>> d.carra...@i2tic.com> wrote:
>>> > >> >> >
>>> > >> >> > Hello,
>>> > >> >> >
>>> > >> >> > How much time is necessary? This is a production
>>> environment, and the memory profiler plus the low cache size (due to the
>>> problem) causes a lot of CPU usage on the OSDs and MDS, which makes them fail
>>> while the profiler is running. Is there any problem if it is done at a low
>>> traffic time? (Less usage, so maybe it doesn't fail, but maybe less info
>>> about usage.)
>>> > >> >> >
>>> > >> >>
>>> > >> >> Just one time. Wait a few minutes between start_profiler and
>>> stop_profiler.
>>> > >> >>
>>> > >> >> > Greetings!
>>> > >> >> >
>>> > >> >> > 2018-07-24 10:21 GMT+02:00 Yan, Zheng <uker...@gmail.com>:
>>> > >> >> >>
>>> > >> >> >> I mean:
>>> > >> >> >>
>>> > >> >> >> ceph tell mds.x heap start_profiler
>>> > >> >> >>
>>> > >> >> >> ... wait for some time
>>> > >> >> >>
>>> > >> >> >> ceph tell mds.x heap stop_profiler
>>> > >> >> >>
>>> > >> >> >> pprof --text  /usr/bin/ceph-mds
>>> > >> >> >> /var/log/ceph/ceph-mds.x.profile.<largest number>.heap
>>> > >> >> >>
>>> > >> >> >>
>>> > >> >> >>
>>> > >> >> >>
>>> > >> >> >> On Tue, Jul 24, 2018 at 3:18 PM Daniel Carrasco <
>>> d.carra...@i2tic.com> wrote:
>>> > >> >> >> >
>>> > >> >> >> > This is what i get:
>>> > >> >> >> >
>>> > >> >> >> > --------------------------------------------------------
>>> > >> >> >> > --------------------------------------------------------
>>> > >> >> >> > --------------------------------------------------------
>>> > >> >> >> > :/# ceph tell mds.kavehome-mgto-pro-fs01 heap dump
>>> > >> >> >> > 2018-07-24 09:05:19.350720 7fc562ffd700  0 client.1452545
>>> ms_handle_reset on 10.22.0.168:6800/1685786126
>>> > >> >> >> > 2018-07-24 09:05:29.103903 7fc563fff700  0 client.1452548
>>> ms_handle_reset on 10.22.0.168:6800/1685786126
>>> > >> >> >> > mds.kavehome-mgto-pro-fs01 dumping heap profile now.
>>> > >> >> >> > ------------------------------------------------
>>> > >> >> >> > MALLOC:      760199640 (  725.0 MiB) Bytes in use by
>>> application
>>> > >> >> >> > MALLOC: +            0 (    0.0 MiB) Bytes in page heap
>>> freelist
>>> > >> >> >> > MALLOC: +    246962320 (  235.5 MiB) Bytes in central cache
>>> freelist
>>> > >> >> >> > MALLOC: +     43933664 (   41.9 MiB) Bytes in transfer
>>> cache freelist
>>> > >> >> >> > MALLOC: +     41012664 (   39.1 MiB) Bytes in thread cache
>>> freelists
>>> > >> >> >> > MALLOC: +     10186912 (    9.7 MiB) Bytes in malloc
>>> metadata
>>> > >> >> >> > MALLOC:   ------------
>>> > >> >> >> > MALLOC: =   1102295200 ( 1051.2 MiB) Actual memory used
>>> (physical + swap)
>>> > >> >> >> > MALLOC: +   4268335104 ( 4070.6 MiB) Bytes released to OS
>>> (aka unmapped)
>>> > >> >> >> > MALLOC:   ------------
>>> > >> >> >> > MALLOC: =   5370630304 ( 5121.8 MiB) Virtual address space
>>> used
>>> > >> >> >> > MALLOC:
>>> > >> >> >> > MALLOC:          33027              Spans in use
>>> > >> >> >> > MALLOC:             19              Thread heaps in use
>>> > >> >> >> > MALLOC:           8192              Tcmalloc page size
>>> > >> >> >> > ------------------------------------------------
>>> > >> >> >> > Call ReleaseFreeMemory() to release freelist memory to the
>>> OS (via madvise()).
>>> > >> >> >> > Bytes released to the OS take up virtual address space but
>>> no physical memory.
>>> > >> >> >> >
>>> > >> >> >> >
>>> > >> >> >> > --------------------------------------------------------
>>> > >> >> >> > --------------------------------------------------------
>>> > >> >> >> > --------------------------------------------------------
>>> > >> >> >> > :/# ceph tell mds.kavehome-mgto-pro-fs01 heap stats
>>> > >> >> >> > 2018-07-24 09:14:25.747706 7f94fffff700  0 client.1452578
>>> ms_handle_reset on 10.22.0.168:6800/1685786126
>>> > >> >> >> > 2018-07-24 09:14:25.754034 7f95057fa700  0 client.1452581
>>> ms_handle_reset on 10.22.0.168:6800/1685786126
>>> > >> >> >> > mds.kavehome-mgto-pro-fs01 tcmalloc heap
>>> stats:------------------------------------------------
>>> > >> >> >> > MALLOC:      960649328 (  916.1 MiB) Bytes in use by
>>> application
>>> > >> >> >> > MALLOC: +            0 (    0.0 MiB) Bytes in page heap
>>> freelist
>>> > >> >> >> > MALLOC: +    108867288 (  103.8 MiB) Bytes in central cache
>>> freelist
>>> > >> >> >> > MALLOC: +     37179424 (   35.5 MiB) Bytes in transfer
>>> cache freelist
>>> > >> >> >> > MALLOC: +     40143000 (   38.3 MiB) Bytes in thread cache
>>> freelists
>>> > >> >> >> > MALLOC: +     10186912 (    9.7 MiB) Bytes in malloc
>>> metadata
>>> > >> >> >> > MALLOC:   ------------
>>> > >> >> >> > MALLOC: =   1157025952 ( 1103.4 MiB) Actual memory used
>>> (physical + swap)
>>> > >> >> >> > MALLOC: +   4213604352 ( 4018.4 MiB) Bytes released to OS
>>> (aka unmapped)
>>> > >> >> >> > MALLOC:   ------------
>>> > >> >> >> > MALLOC: =   5370630304 ( 5121.8 MiB) Virtual address space
>>> used
>>> > >> >> >> > MALLOC:
>>> > >> >> >> > MALLOC:          33028              Spans in use
>>> > >> >> >> > MALLOC:             19              Thread heaps in use
>>> > >> >> >> > MALLOC:           8192              Tcmalloc page size
>>> > >> >> >> > ------------------------------------------------
>>> > >> >> >> > Call ReleaseFreeMemory() to release freelist memory to the
>>> OS (via madvise()).
>>> > >> >> >> > Bytes released to the OS take up virtual address space but
>>> no physical memory.
>>> > >> >> >> >
>>> > >> >> >> > --------------------------------------------------------
>>> > >> >> >> > --------------------------------------------------------
>>> > >> >> >> > --------------------------------------------------------
>>> > >> >> >> > After heap release:
>>> > >> >> >> > :/# ceph tell mds.kavehome-mgto-pro-fs01 heap stats
>>> > >> >> >> > 2018-07-24 09:15:28.540203 7f2f7affd700  0 client.1443339
>>> ms_handle_reset on 10.22.0.168:6800/1685786126
>>> > >> >> >> > 2018-07-24 09:15:28.547153 7f2f7bfff700  0 client.1443342
>>> ms_handle_reset on 10.22.0.168:6800/1685786126
>>> > >> >> >> > mds.kavehome-mgto-pro-fs01 tcmalloc heap
>>> stats:------------------------------------------------
>>> > >> >> >> > MALLOC:      710315776 (  677.4 MiB) Bytes in use by
>>> application
>>> > >> >> >> > MALLOC: +            0 (    0.0 MiB) Bytes in page heap
>>> freelist
>>> > >> >> >> > MALLOC: +    246471880 (  235.1 MiB) Bytes in central cache
>>> freelist
>>> > >> >> >> > MALLOC: +     40802848 (   38.9 MiB) Bytes in transfer
>>> cache freelist
>>> > >> >> >> > MALLOC: +     38689304 (   36.9 MiB) Bytes in thread cache
>>> freelists
>>> > >> >> >> > MALLOC: +     10186912 (    9.7 MiB) Bytes in malloc
>>> metadata
>>> > >> >> >> > MALLOC:   ------------
>>> > >> >> >> > MALLOC: =   1046466720 (  998.0 MiB) Actual memory used
>>> (physical + swap)
>>> > >> >> >> > MALLOC: +   4324163584 ( 4123.8 MiB) Bytes released to OS
>>> (aka unmapped)
>>> > >> >> >> > MALLOC:   ------------
>>> > >> >> >> > MALLOC: =   5370630304 ( 5121.8 MiB) Virtual address space
>>> used
>>> > >> >> >> > MALLOC:
>>> > >> >> >> > MALLOC:          33177              Spans in use
>>> > >> >> >> > MALLOC:             19              Thread heaps in use
>>> > >> >> >> > MALLOC:           8192              Tcmalloc page size
>>> > >> >> >> > ------------------------------------------------
>>> > >> >> >> > Call ReleaseFreeMemory() to release freelist memory to the
>>> OS (via madvise()).
>>> > >> >> >> > Bytes released to the OS take up virtual address space but
>>> no physical memory.
>>> > >> >> >> >
>>> > >> >> >> >
>>> > >> >> >> > The other commands fail with a curl error:
>>> > >> >> >> > Failed to get profile: curl 'http:///pprof/profile?seconds=30'
>>> > /root/pprof/.tmp.ceph-mds.1532416424.:
>>> > >> >> >> >
>>> > >> >> >> >
>>> > >> >> >> > Greetings!!
>>> > >> >> >> >
>>> > >> >> >> > 2018-07-24 5:35 GMT+02:00 Yan, Zheng <uker...@gmail.com>:
>>> > >> >> >> >>
>>> > >> >> >> >> could you profile memory allocation of mds
>>> > >> >> >> >>
>>> > >> >> >> >> http://docs.ceph.com/docs/mimic/rados/troubleshooting/memory-profiling/
>>> > >> >> >> >> On Tue, Jul 24, 2018 at 7:54 AM Daniel Carrasco <
>>> d.carra...@i2tic.com> wrote:
>>> > >> >> >> >> >
>>> > >> >> >> >> > Yeah, it's also my thread. That thread was created before I
>>> lowered the cache size from 512 MB to 8 MB. I thought that maybe it was my fault
>>> and I had made a misconfiguration, so I ignored the problem until now.
>>> > >> >> >> >> >
>>> > >> >> >> >> > Greetings!
>>> > >> >> >> >> >
>>> > >> >> >> >> >> On Tue, Jul 24, 2018 at 1:00 AM, Gregory Farnum <
gfar...@redhat.com> wrote:
>>> > >> >> >> >> >>
>>> > >> >> >> >> >> On Mon, Jul 23, 2018 at 11:08 AM Patrick Donnelly <
>>> pdonn...@redhat.com> wrote:
>>> > >> >> >> >> >>>
>>> > >> >> >> >> >>> On Mon, Jul 23, 2018 at 5:48 AM, Daniel Carrasco <
>>> d.carra...@i2tic.com> wrote:
>>> > >> >> >> >> >>> > Hi, thanks for your response.
>>> > >> >> >> >> >>> >
>>> > >> >> >> >> >>> > There are about 6 clients, and 4 of them are on
>>> standby most of the time. Only two
>>> > >> >> >> >> >>> > are active servers serving the webpage.
>>> Also, we have a Varnish in
>>> > >> >> >> >> >>> > front, so they don't get all the load (below 30% in
>>> PHP is not much).
>>> > >> >> >> >> >>> > About the MDS cache, I now have
>>> mds_cache_memory_limit at 8 MB.
>>> > >> >> >> >> >>>
>>> > >> >> >> >> >>> What! Please post `ceph daemon mds.<name> config
>>> diff`,  `... perf
>>> > >> >> >> >> >>> dump`, and `... dump_mempools` from the server the
>>> active MDS is on.
>>> > >> >> >> >> >>>
>>> > >> >> >> >> >>> > I've also tested
>>> > >> >> >> >> >>> > 512 MB, but the CPU usage is the same and the
>>> MDS RAM usage grows up to
>>> > >> >> >> >> >>> > 15 GB (on a 16 GB server it starts to swap and everything
>>> fails). With 8 MB, at least
>>> > >> >> >> >> >>> > the memory usage is stable at less than 6 GB (now it is
>>> using about 1 GB of RAM).
>>> > >> >> >> >> >>>
>>> > >> >> >> >> >>> We've seen reports of possible memory leaks before and
>>> the potential
>>> > >> >> >> >> >>> fixes for those were in 12.2.6. How fast does your MDS
>>> reach 15GB?
>>> > >> >> >> >> >>> Your MDS cache size should be configured to 1-8GB
>>> (depending on your
>>> > >> >> >> >> >>> preference) so it's disturbing to see you set it so
>>> low.
>>> > >> >> >> >> >>
>>> > >> >> >> >> >>
>>> > >> >> >> >> >> See also the thread "[ceph-users] Fwd: MDS memory usage
>>> is very high", which had more discussion of that. The MDS daemon seemingly
>>> had 9.5GB of allocated RSS but only believed 489MB was in use for the
>>> cache...
>>> > >> >> >> >> >> -Greg
>>> > >> >> >> >> >>
>>> > >> >> >> >> >>>
>>> > >> >> >> >> >>>
>>> > >> >> >> >> >>> --
>>> > >> >> >> >> >>> Patrick Donnelly
>>> > >> >> >> >> >>> _______________________________________________
>>> > >> >> >> >> >>> ceph-users mailing list
>>> > >> >> >> >> >>> ceph-users@lists.ceph.com
>>> > >> >> >> >> >>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>> > >> >> >> >> >
>>> > >> >> >> >> > _______________________________________________
>>> > >> >> >> >> > ceph-users mailing list
>>> > >> >> >> >> > ceph-users@lists.ceph.com
>>> > >> >> >> >> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>> > >> >> >> >
>>> > >> >> >> >
>>> > >> >> >> >
>>> > >> >> >> >
>>> > >> >> >> > --
>>> > >> >> >> > _________________________________________
>>> > >> >> >> >
>>> > >> >> >> >       Daniel Carrasco Marín
>>> > >> >> >> >       Ingeniería para la Innovación i2TIC, S.L.
>>> > >> >> >> >       Tlf:  +34 911 12 32 84 Ext: 223
>>> > >> >> >> >       www.i2tic.com
>>> > >> >> >> > _________________________________________
>>> > >> >> >
>>> > >> >> >
>>> > >> >> >
>>> > >> >> >
>>> > >> >> > --
>>> > >> >> > _________________________________________
>>> > >> >> >
>>> > >> >> >       Daniel Carrasco Marín
>>> > >> >> >       Ingeniería para la Innovación i2TIC, S.L.
>>> > >> >> >       Tlf:  +34 911 12 32 84 Ext: 223
>>> > >> >> >       www.i2tic.com
>>> > >> >> > _________________________________________
>>> > >> >
>>> > >> >
>>> > >> >
>>> > >> >
>>> > >> > --
>>> > >> > _________________________________________
>>> > >> >
>>> > >> >       Daniel Carrasco Marín
>>> > >> >       Ingeniería para la Innovación i2TIC, S.L.
>>> > >> >       Tlf:  +34 911 12 32 84 Ext: 223
>>> > >> >       www.i2tic.com
>>> > >> > _________________________________________
>>> > >
>>> > >
>>> > >
>>> > >
>>> > > --
>>> > > _________________________________________
>>> > >
>>> > >       Daniel Carrasco Marín
>>> > >       Ingeniería para la Innovación i2TIC, S.L.
>>> > >       Tlf:  +34 911 12 32 84 Ext: 223
>>> > >       www.i2tic.com
>>> > > _________________________________________
>>>
>>
>>
>>
>> --
>> _________________________________________
>>
>>       Daniel Carrasco Marín
>>       Ingeniería para la Innovación i2TIC, S.L.
>>       Tlf:  +34 911 12 32 84 Ext: 223
>>       www.i2tic.com
>> _________________________________________
>>
>
>
>
> --
> _________________________________________
>
>       Daniel Carrasco Marín
>       Ingeniería para la Innovación i2TIC, S.L.
>       Tlf:  +34 911 12 32 84 Ext: 223
>       www.i2tic.com
> _________________________________________
>



-- 
_________________________________________

      Daniel Carrasco Marín
      Ingeniería para la Innovación i2TIC, S.L.
      Tlf:  +34 911 12 32 84 Ext: 223
      www.i2tic.com
_________________________________________
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
