Hi Igor
Yes we got all in same device : [root@CEPH-MON01 ~]# ceph osd df tree ID CLASS WEIGHT REWEIGHT SIZE RAW USE DATA OMAP META AVAIL %USE VAR PGS STATUS TYPE NAME 31 130.96783 - 131 TiB 114 TiB 114 TiB 14 MiB 204 GiB 17 TiB 86.88 1.03 - host CEPH008 5 archive 10.91399 0.80002 11 TiB 7.9 TiB 7.9 TiB 2.6 MiB 15 GiB 3.0 TiB 72.65 0.86 181 up osd.5 6 archive 10.91399 1.00000 11 TiB 9.4 TiB 9.3 TiB 5.8 MiB 17 GiB 1.6 TiB 85.76 1.01 222 up osd.6 11 archive 10.91399 1.00000 11 TiB 10 TiB 10 TiB 48 KiB 19 GiB 838 GiB 92.50 1.09 251 up osd.11 45 archive 10.91399 1.00000 11 TiB 10 TiB 10 TiB 148 KiB 18 GiB 678 GiB 93.94 1.11 248 up osd.45 46 archive 10.91399 1.00000 11 TiB 9.6 TiB 9.5 TiB 4.7 MiB 17 GiB 1.4 TiB 87.52 1.04 235 up osd.46 47 archive 10.91399 1.00000 11 TiB 8.8 TiB 8.8 TiB 68 KiB 17 GiB 2.1 TiB 80.43 0.95 211 up osd.47 55 archive 10.91399 1.00000 11 TiB 9.9 TiB 9.9 TiB 132 KiB 17 GiB 1.0 TiB 90.74 1.07 243 up osd.55 70 archive 10.91399 1.00000 11 TiB 10 TiB 10 TiB 44 KiB 19 GiB 864 GiB 92.27 1.09 236 up osd.70 71 archive 10.91399 1.00000 11 TiB 9.2 TiB 9.2 TiB 28 KiB 16 GiB 1.7 TiB 84.19 1.00 228 up osd.71 78 archive 10.91399 1.00000 11 TiB 8.9 TiB 8.9 TiB 182 KiB 16 GiB 2.0 TiB 81.87 0.97 215 up osd.78 79 archive 10.91399 1.00000 11 TiB 10 TiB 10 TiB 152 KiB 17 GiB 958 GiB 91.43 1.08 238 up osd.79 91 archive 10.91399 1.00000 11 TiB 9.7 TiB 9.7 TiB 92 KiB 17 GiB 1.2 TiB 89.22 1.06 232 up osd.91 Disk are HGST of 12TB for archive porpourse. In the same osd we got sine commit bluestore log latency 2019-08-07 06:57:33.681 7f059b06e700 0 bluestore(/var/lib/ceph/osd/ceph-46) queue_transactions slow operation observed for l_bluestore_submit_lat, latency = 11.163s 2019-08-07 06:57:33.703 7f05a8088700 0 bluestore(/var/lib/ceph/osd/ceph-46) _txc_committed_kv slow operation observed for l_bluestore_commit_lat, latency = 11.1858s, txc = 0x55e9e3ea2c00 2019-08-07 09:14:00.620 7f059d072700 0 bluestore(/var/lib/ceph/osd/ceph-46) queue_transactions slow operation observed for l_bluestore_submit_lat, latency = 7.23777s 2019-08-07 09:14:00.650 7f05a8088700 0 bluestore(/var/lib/ceph/osd/ceph-46) _txc_committed_kv slow operation observed for l_bluestore_commit_lat, latency = 7.26778s, txc = 0x55eaafbf6600 2019-08-07 09:19:08.242 7f059e875700 0 bluestore(/var/lib/ceph/osd/ceph-46) queue_transactions slow operation observed for l_bluestore_submit_lat, latency = 81.8293s 2019-08-07 09:19:08.291 7f05a8088700 0 bluestore(/var/lib/ceph/osd/ceph-46) _txc_committed_kv slow operation observed for l_bluestore_commit_lat, latency = 81.8609s, txc = 0x55ea05ee6000 2019-08-07 09:19:08.467 7f059b06e700 0 bluestore(/var/lib/ceph/osd/ceph-46) queue_transactions slow operation observed for l_bluestore_submit_lat, latency = 87.7795s 2019-08-07 09:19:08.481 7f05a8088700 0 bluestore(/var/lib/ceph/osd/ceph-46) _txc_committed_kv slow operation observed for l_bluestore_commit_lat, latency = 87.7928s, txc = 0x55eaa7a40600 Maybe move OMAP +META from all OSD to a NVME of 480GB per node helps in this situation but not sure. Manuel De: Igor Fedotov <ifedo...@suse.de> Enviado el: miércoles, 7 de agosto de 2019 13:10 Para: EDH - Manuel Rios Fernandez <mrios...@easydatahost.com>; 'Ceph Users' <ceph-users@lists.ceph.com> Asunto: Re: [ceph-users] 14.2.2 - OSD Crash Hi Manuel, as Brad pointed out timeouts and suicides are rather consequences of some other issues with OSDs. I recall at least two recent relevant tickets: https://tracker.ceph.com/issues/36482 https://tracker.ceph.com/issues/40741 (see last comments) Both had massive and slow reads from RocksDB which caused timeouts.. Visible symptom for both cases was unexpectedly high read I/O from underlying disks (main and/or DB). You can use iotop for inspection.., These were worsened by having significant part of DB at spinners due to spillovers. So wondering what's your layout in this respect: what drives back troublesome OSDs, is there any spillover to slow device, how massive it is? Also could you please inspect your OSD logs for the presence of lines containing "slow operation observed" substring. And share them if any.. Hope this helps. Thanks, Igor On 8/7/2019 2:16 AM, EDH - Manuel Rios Fernandez wrote: Hi We got a pair of OSD located in node that crash randomly since 14.2.2 OS Version : Centos 7.6 Therere a ton of lines before crash , I will unespected: -- 3045> 2019-08-07 00:39:32.013 7fe9a4996700 1 heartbeat_map is_healthy 'OSD::osd_op_tp thread 0x7fe987e49700' had timed out after 15 -3044> 2019-08-07 00:39:32.013 7fe9a3994700 1 heartbeat_map is_healthy 'OSD::osd_op_tp thread 0x7fe987e49700' had timed out after 15 -3043> 2019-08-07 00:39:32.033 7fe9a4195700 1 heartbeat_map is_healthy 'OSD::osd_op_tp thread 0x7fe987e49700' had timed out after 15 -3042> 2019-08-07 00:39:32.033 7fe9a4996700 1 heartbeat_map is_healthy 'OSD::osd_op_tp thread 0x7fe987e49700' had timed out after 15 -- ----- Some hundred lines of: -164> 2019-08-07 00:47:36.628 7fe9a3994700 1 heartbeat_map is_healthy 'OSD::osd_op_tp thread 0x7fe98964c700' had timed out after 60 -163> 2019-08-07 00:47:36.632 7fe9a3994700 1 heartbeat_map is_healthy 'OSD::osd_op_tp thread 0x7fe98964c700' had timed out after 60 -162> 2019-08-07 00:47:36.632 7fe9a3994700 1 heartbeat_map is_healthy 'OSD::osd_op_tp thread 0x7fe98964c700' had timed out after 60 ----- -78> 2019-08-07 00:50:51.755 7fe995bfa700 10 monclient: tick -77> 2019-08-07 00:50:51.755 7fe995bfa700 10 monclient: _check_auth_rotating have uptodate secrets (they expire after 2019-08-07 00:50:21.756453) -76> 2019-08-07 00:51:01.755 7fe995bfa700 10 monclient: tick -75> 2019-08-07 00:51:01.755 7fe995bfa700 10 monclient: _check_auth_rotating have uptodate secrets (they expire after 2019-08-07 00:50:31.756604) -74> 2019-08-07 00:51:11.755 7fe995bfa700 10 monclient: tick -73> 2019-08-07 00:51:11.755 7fe995bfa700 10 monclient: _check_auth_rotating have uptodate secrets (they expire after 2019-08-07 00:50:41.756788) -72> 2019-08-07 00:51:21.756 7fe995bfa700 10 monclient: tick -71> 2019-08-07 00:51:21.756 7fe995bfa700 10 monclient: _check_auth_rotating have uptodate secrets (they expire after 2019-08-07 00:50:51.756982) -70> 2019-08-07 00:51:31.755 7fe995bfa700 10 monclient: tick -69> 2019-08-07 00:51:31.755 7fe995bfa700 10 monclient: _check_auth_rotating have uptodate secrets (they expire after 2019-08-07 00:51:01.757206) -68> 2019-08-07 00:51:41.756 7fe995bfa700 10 monclient: tick -67> 2019-08-07 00:51:41.756 7fe995bfa700 10 monclient: _check_auth_rotating have uptodate secrets (they expire after 2019-08-07 00:51:11.757364) -66> 2019-08-07 00:51:51.756 7fe995bfa700 10 monclient: tick -65> 2019-08-07 00:51:51.756 7fe995bfa700 10 monclient: _check_auth_rotating have uptodate secrets (they expire after 2019-08-07 00:51:21.757535) -64> 2019-08-07 00:51:52.861 7fe987e49700 1 heartbeat_map clear_timeout 'OSD::osd_op_tp thread 0x7fe987e49700' had timed out after 15 -63> 2019-08-07 00:51:52.861 7fe987e49700 1 heartbeat_map clear_timeout 'OSD::osd_op_tp thread 0x7fe987e49700' had suicide timed out after 150 -62> 2019-08-07 00:51:52.948 7fe99966c700 5 bluestore.MempoolThread(0x55ff04ad6a88) _tune_cache_size target: 4294967296 heap: 6018998272 unmapped: 1721180160 mapped: 4297818112 old cache_size: 1994018210 new cache size: 1992784572 -61> 2019-08-07 00:51:52.948 7fe99966c700 5 bluestore.MempoolThread(0x55ff04ad6a88) _trim_shards cache_size: 1992784572 kv_alloc: 763363328 kv_used: 749381098 meta_alloc: 763363328 meta_used: 654593191 data_alloc: 452984832 data_used: 455929856 -60> 2019-08-07 00:51:57.923 7fe99966c700 5 bluestore.MempoolThread(0x55ff04ad6a88) _trim_shards cache_size: 1994110827 kv_alloc: 763363328 kv_used: 749381098 meta_alloc: 763363328 meta_used: 654590799 data_alloc: 452984832 data_used: 451538944 -59> 2019-08-07 00:51:57.973 7fe99966c700 5 bluestore.MempoolThread(0x55ff04ad6a88) _tune_cache_size target: 4294967296 heap: 6018998272 unmapped: 1725702144 mapped: 4293296128 old cache_size: 1994110827 new cache size: 1994442069 -58> 2019-08-07 00:52:01.756 7fe995bfa700 10 monclient: tick -57> 2019-08-07 00:52:01.756 7fe995bfa700 10 monclient: _check_auth_rotating have uptodate secrets (they expire after 2019-08-07 00:51:31.757684) -56> 2019-08-07 00:52:02.933 7fe99966c700 5 bluestore.MempoolThread(0x55ff04ad6a88) _trim_shards cache_size: 1995765747 kv_alloc: 763363328 kv_used: 749381098 meta_alloc: 763363328 meta_used: 654590799 data_alloc: 452984832 data_used: 451538944 -55> 2019-08-07 00:52:02.983 7fe99966c700 5 bluestore.MempoolThread(0x55ff04ad6a88) _tune_cache_size target: 4294967296 heap: 6018998272 unmapped: 1725702144 mapped: 4293296128 old cache_size: 1995765747 new cache size: 1996096345 -54> 2019-08-07 00:52:07.943 7fe99966c700 5 bluestore.MempoolThread(0x55ff04ad6a88) _trim_shards cache_size: 1997417449 kv_alloc: 763363328 kv_used: 749381098 meta_alloc: 763363328 meta_used: 654590799 data_alloc: 452984832 data_used: 451538944 -53> 2019-08-07 00:52:07.993 7fe99966c700 5 bluestore.MempoolThread(0x55ff04ad6a88) _tune_cache_size target: 4294967296 heap: 6018998272 unmapped: 1725702144 mapped: 4293296128 old cache_size: 1997417449 new cache size: 1997747404 -52> 2019-08-07 00:52:11.757 7fe995bfa700 10 monclient: tick -51> 2019-08-07 00:52:11.757 7fe995bfa700 10 monclient: _check_auth_rotating have uptodate secrets (they expire after 2019-08-07 00:51:41.757855) -50> 2019-08-07 00:52:12.952 7fe99966c700 5 bluestore.MempoolThread(0x55ff04ad6a88) _trim_shards cache_size: 1999065941 kv_alloc: 763363328 kv_used: 749381098 meta_alloc: 763363328 meta_used: 654590799 data_alloc: 452984832 data_used: 451538944 -49> 2019-08-07 00:52:13.002 7fe99966c700 5 bluestore.MempoolThread(0x55ff04ad6a88) _tune_cache_size target: 4294967296 heap: 6018998272 unmapped: 1725702144 mapped: 4293296128 old cache_size: 1999065941 new cache size: 1999395254 -48> 2019-08-07 00:52:17.962 7fe99966c700 5 bluestore.MempoolThread(0x55ff04ad6a88) _trim_shards cache_size: 2000711226 kv_alloc: 771751936 kv_used: 749381098 meta_alloc: 771751936 meta_used: 654590799 data_alloc: 452984832 data_used: 451538944 -47> 2019-08-07 00:52:18.012 7fe99966c700 5 bluestore.MempoolThread(0x55ff04ad6a88) _tune_cache_size target: 4294967296 heap: 6018998272 unmapped: 1725702144 mapped: 4293296128 old cache_size: 2000711226 new cache size: 2001039899 -46> 2019-08-07 00:52:21.756 7fe995bfa700 10 monclient: tick -45> 2019-08-07 00:52:21.756 7fe995bfa700 10 monclient: _check_auth_rotating have uptodate secrets (they expire after 2019-08-07 00:51:51.758043) -44> 2019-08-07 00:52:22.971 7fe99966c700 5 bluestore.MempoolThread(0x55ff04ad6a88) _trim_shards cache_size: 2002353314 kv_alloc: 771751936 kv_used: 749381098 meta_alloc: 771751936 meta_used: 654590799 data_alloc: 452984832 data_used: 451538944 -43> 2019-08-07 00:52:23.022 7fe99966c700 5 bluestore.MempoolThread(0x55ff04ad6a88) _tune_cache_size target: 4294967296 heap: 6018998272 unmapped: 1725702144 mapped: 4293296128 old cache_size: 2002353314 new cache size: 2002681348 -42> 2019-08-07 00:52:27.982 7fe99966c700 5 bluestore.MempoolThread(0x55ff04ad6a88) _trim_shards cache_size: 2003992210 kv_alloc: 771751936 kv_used: 749381098 meta_alloc: 771751936 meta_used: 654590799 data_alloc: 452984832 data_used: 451538944 -41> 2019-08-07 00:52:28.031 7fe99966c700 5 bluestore.MempoolThread(0x55ff04ad6a88) _tune_cache_size target: 4294967296 heap: 6018998272 unmapped: 1725702144 mapped: 4293296128 old cache_size: 2003992210 new cache size: 2004319607 -40> 2019-08-07 00:52:31.756 7fe995bfa700 10 monclient: tick -39> 2019-08-07 00:52:31.756 7fe995bfa700 10 monclient: _check_auth_rotating have uptodate secrets (they expire after 2019-08-07 00:52:01.758219) -38> 2019-08-07 00:52:32.991 7fe99966c700 5 bluestore.MempoolThread(0x55ff04ad6a88) _trim_shards cache_size: 2005627920 kv_alloc: 771751936 kv_used: 749381098 meta_alloc: 771751936 meta_used: 654590799 data_alloc: 452984832 data_used: 451538944 -37> 2019-08-07 00:52:33.041 7fe99966c700 5 bluestore.MempoolThread(0x55ff04ad6a88) _tune_cache_size target: 4294967296 heap: 6018998272 unmapped: 1725702144 mapped: 4293296128 old cache_size: 2005627920 new cache size: 2005954680 -36> 2019-08-07 00:52:38.001 7fe99966c700 5 bluestore.MempoolThread(0x55ff04ad6a88) _trim_shards cache_size: 2007260450 kv_alloc: 771751936 kv_used: 749381098 meta_alloc: 771751936 meta_used: 654590799 data_alloc: 452984832 data_used: 451538944 -35> 2019-08-07 00:52:38.051 7fe99966c700 5 bluestore.MempoolThread(0x55ff04ad6a88) _tune_cache_size target: 4294967296 heap: 6018998272 unmapped: 1725702144 mapped: 4293296128 old cache_size: 2007260450 new cache size: 2007586575 -34> 2019-08-07 00:52:41.757 7fe995bfa700 10 monclient: tick -33> 2019-08-07 00:52:41.757 7fe995bfa700 10 monclient: _check_auth_rotating have uptodate secrets (they expire after 2019-08-07 00:52:11.758447) -32> 2019-08-07 00:52:43.011 7fe99966c700 5 bluestore.MempoolThread(0x55ff04ad6a88) _trim_shards cache_size: 2008889806 kv_alloc: 771751936 kv_used: 749381098 meta_alloc: 771751936 meta_used: 654590799 data_alloc: 452984832 data_used: 451538944 -31> 2019-08-07 00:52:43.061 7fe99966c700 5 bluestore.MempoolThread(0x55ff04ad6a88) _tune_cache_size target: 4294967296 heap: 6018998272 unmapped: 1725702144 mapped: 4293296128 old cache_size: 2008889806 new cache size: 2009215297 -30> 2019-08-07 00:52:48.021 7fe99966c700 5 bluestore.MempoolThread(0x55ff04ad6a88) _trim_shards cache_size: 2010515995 kv_alloc: 771751936 kv_used: 749381098 meta_alloc: 771751936 meta_used: 654590799 data_alloc: 452984832 data_used: 451538944 -29> 2019-08-07 00:52:48.071 7fe99966c700 5 bluestore.MempoolThread(0x55ff04ad6a88) _tune_cache_size target: 4294967296 heap: 6018998272 unmapped: 1725702144 mapped: 4293296128 old cache_size: 2010515995 new cache size: 2010840853 -28> 2019-08-07 00:52:51.757 7fe995bfa700 10 monclient: tick -27> 2019-08-07 00:52:51.757 7fe995bfa700 10 monclient: _check_auth_rotating have uptodate secrets (they expire after 2019-08-07 00:52:21.758631) -26> 2019-08-07 00:52:53.031 7fe99966c700 5 bluestore.MempoolThread(0x55ff04ad6a88) _trim_shards cache_size: 2012139023 kv_alloc: 771751936 kv_used: 749381098 meta_alloc: 771751936 meta_used: 654590799 data_alloc: 452984832 data_used: 451538944 -25> 2019-08-07 00:52:53.081 7fe99966c700 5 bluestore.MempoolThread(0x55ff04ad6a88) _tune_cache_size target: 4294967296 heap: 6018998272 unmapped: 1725702144 mapped: 4293296128 old cache_size: 2012139023 new cache size: 2012463250 -24> 2019-08-07 00:52:58.042 7fe99966c700 5 bluestore.MempoolThread(0x55ff04ad6a88) _trim_shards cache_size: 2013758896 kv_alloc: 771751936 kv_used: 749381098 meta_alloc: 771751936 meta_used: 654590799 data_alloc: 452984832 data_used: 451538944 -23> 2019-08-07 00:52:58.092 7fe99966c700 5 bluestore.MempoolThread(0x55ff04ad6a88) _tune_cache_size target: 4294967296 heap: 6018998272 unmapped: 1725702144 mapped: 4293296128 old cache_size: 2013758896 new cache size: 2014082492 -22> 2019-08-07 00:53:01.758 7fe995bfa700 10 monclient: tick -21> 2019-08-07 00:53:01.758 7fe995bfa700 10 monclient: _check_auth_rotating have uptodate secrets (they expire after 2019-08-07 00:52:31.758799) -20> 2019-08-07 00:53:03.052 7fe99966c700 5 bluestore.MempoolThread(0x55ff04ad6a88) _trim_shards cache_size: 2015375620 kv_alloc: 771751936 kv_used: 749381098 meta_alloc: 771751936 meta_used: 654590799 data_alloc: 452984832 data_used: 451538944 -19> 2019-08-07 00:53:03.102 7fe99966c700 5 bluestore.MempoolThread(0x55ff04ad6a88) _tune_cache_size target: 4294967296 heap: 6018998272 unmapped: 1725702144 mapped: 4293296128 old cache_size: 2015375620 new cache size: 2015698587 -18> 2019-08-07 00:53:08.062 7fe99966c700 5 bluestore.MempoolThread(0x55ff04ad6a88) _trim_shards cache_size: 2016989201 kv_alloc: 771751936 kv_used: 749381098 meta_alloc: 771751936 meta_used: 654590799 data_alloc: 452984832 data_used: 451538944 -17> 2019-08-07 00:53:08.112 7fe99966c700 5 bluestore.MempoolThread(0x55ff04ad6a88) _tune_cache_size target: 4294967296 heap: 6018998272 unmapped: 1725702144 mapped: 4293296128 old cache_size: 2016989201 new cache size: 2017311541 -16> 2019-08-07 00:53:11.758 7fe995bfa700 10 monclient: tick -15> 2019-08-07 00:53:11.758 7fe995bfa700 10 monclient: _check_auth_rotating have uptodate secrets (they expire after 2019-08-07 00:52:41.759013) -14> 2019-08-07 00:53:13.071 7fe99966c700 5 bluestore.MempoolThread(0x55ff04ad6a88) _trim_shards cache_size: 2018599645 kv_alloc: 771751936 kv_used: 749381098 meta_alloc: 771751936 meta_used: 654590799 data_alloc: 452984832 data_used: 451538944 -13> 2019-08-07 00:53:13.121 7fe99966c700 5 bluestore.MempoolThread(0x55ff04ad6a88) _tune_cache_size target: 4294967296 heap: 6018998272 unmapped: 1725702144 mapped: 4293296128 old cache_size: 2018599645 new cache size: 2018921358 -12> 2019-08-07 00:53:18.081 7fe99966c700 5 bluestore.MempoolThread(0x55ff04ad6a88) _trim_shards cache_size: 2020206960 kv_alloc: 771751936 kv_used: 749381098 meta_alloc: 771751936 meta_used: 654590799 data_alloc: 452984832 data_used: 451538944 -11> 2019-08-07 00:53:18.130 7fe99966c700 5 bluestore.MempoolThread(0x55ff04ad6a88) _tune_cache_size target: 4294967296 heap: 6018998272 unmapped: 1725702144 mapped: 4293296128 old cache_size: 2020206960 new cache size: 2020528048 -10> 2019-08-07 00:53:21.757 7fe995bfa700 10 monclient: tick -9> 2019-08-07 00:53:21.757 7fe995bfa700 10 monclient: _check_auth_rotating have uptodate secrets (they expire after 2019-08-07 00:52:51.759214) -8> 2019-08-07 00:53:23.090 7fe99966c700 5 bluestore.MempoolThread(0x55ff04ad6a88) _trim_shards cache_size: 2021811150 kv_alloc: 780140544 kv_used: 749381098 meta_alloc: 780140544 meta_used: 654590799 data_alloc: 461373440 data_used: 451538944 -7> 2019-08-07 00:53:23.140 7fe99966c700 5 bluestore.MempoolThread(0x55ff04ad6a88) _tune_cache_size target: 4294967296 heap: 6018998272 unmapped: 1725702144 mapped: 4293296128 old cache_size: 2021811150 new cache size: 2022131613 -6> 2019-08-07 00:53:28.100 7fe99966c700 5 bluestore.MempoolThread(0x55ff04ad6a88) _trim_shards cache_size: 2023412220 kv_alloc: 780140544 kv_used: 749381098 meta_alloc: 780140544 meta_used: 654590799 data_alloc: 461373440 data_used: 451538944 -5> 2019-08-07 00:53:28.150 7fe99966c700 5 bluestore.MempoolThread(0x55ff04ad6a88) _tune_cache_size target: 4294967296 heap: 6018998272 unmapped: 1725702144 mapped: 4293296128 old cache_size: 2023412220 new cache size: 2023732060 -4> 2019-08-07 00:53:31.758 7fe995bfa700 10 monclient: tick -3> 2019-08-07 00:53:31.758 7fe995bfa700 10 monclient: _check_auth_rotating have uptodate secrets (they expire after 2019-08-07 00:53:01.759334) -2> 2019-08-07 00:53:33.110 7fe99966c700 5 bluestore.MempoolThread(0x55ff04ad6a88) _trim_shards cache_size: 2025010178 kv_alloc: 780140544 kv_used: 749381098 meta_alloc: 780140544 meta_used: 654590799 data_alloc: 461373440 data_used: 451538944 -1> 2019-08-07 00:53:33.160 7fe99966c700 5 bluestore.MempoolThread(0x55ff04ad6a88) _tune_cache_size target: 4294967296 heap: 6018998272 unmapped: 1725702144 mapped: 4293296128 old cache_size: 2025010178 new cache size: 2025329397 0> 2019-08-07 00:53:37.655 7fe987e49700 -1 *** Caught signal (Aborted) ** in thread 7fe987e49700 thread_name:tp_osd_tp ceph version 14.2.2 (4f8fa0a0024755aae7d95567c63f11d6862d55be) nautilus (stable) 1: (()+0xf5d0) [0x7fe9a7cba5d0] 2: (pthread_kill()+0x31) [0x7fe9a7cb79d1] 3: (ceph::HeartbeatMap::_check(ceph::heartbeat_handle_d const*, char const*, unsigned long)+0x466) [0x55fef8748176] 4: (ceph::HeartbeatMap::clear_timeout(ceph::heartbeat_handle_d*)+0x7b) [0x55fef874878b] 5: (BlueStore::queue_transactions(boost::intrusive_ptr<ObjectStore::CollectionI mpl>&, std::vector<ObjectStore::Transaction, std::allocator<ObjectStore::Transaction> >&, boost::intrusive_ptr<TrackedOp>, ThreadPool::TPHandle*)+0xa9e) [0x55fef86085de] 6: (ObjectStore::queue_transaction(boost::intrusive_ptr<ObjectStore::Collection Impl>&, ObjectStore::Transaction&&, boost::intrusive_ptr<TrackedOp>, ThreadPool::TPHandle*)+0x7f) [0x55fef81cd7ff] 7: (OSD::dispatch_context_transaction(PG::RecoveryCtx&, PG*, ThreadPool::TPHandle*)+0x58) [0x55fef8118298] 8: (OSD::dequeue_peering_evt(OSDShard*, PG*, std::shared_ptr<PGPeeringEvent>, ThreadPool::TPHandle&)+0x202) [0x55fef81767c2] 9: (PGPeeringItem::run(OSD*, OSDShard*, boost::intrusive_ptr<PG>&, ThreadPool::TPHandle&)+0x50) [0x55fef83eb490] 10: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0x9f4) [0x55fef816aef4] 11: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x433) [0x55fef8769ce3] 12: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0x55fef876cd80] 13: (()+0x7dd5) [0x7fe9a7cb2dd5] 14: (clone()+0x6d) [0x7fe9a6b7202d] About server load: [root@CEPH008 ~]# top top - 00:57:30 up 186 days, 22 min, 1 user, load average: 11.65, 13.42, 13.51 Tasks: 316 total, 1 running, 315 sleeping, 0 stopped, 0 zombie %Cpu(s): 2.3 us, 1.2 sy, 0.0 ni, 74.1 id, 22.4 wa, 0.0 hi, 0.1 si, 0.0 st KiB Mem : 65737480 total, 431824 free, 49046608 used, 16259048 buff/cache KiB Swap: 29241340 total, 19406504 free, 9834836 used. 15917556 avail Mem Currently the server is doing some deep-scrub that we got off during the last two weeks due a node evict and a new node install. _______________________________________________ ceph-users mailing list ceph-users@lists.ceph.com <mailto:ceph-users@lists.ceph.com> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
_______________________________________________ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com