Re: [ceph-users] CephFS and many small files
Hi Paul! Thanks for your answer. Yep, bluestore_min_alloc_size and your calculation sound very reasonable to me :) On 29.03.2019 at 23:56, Paul Emmerich wrote: Are you running on HDDs? The minimum allocation size is 64kb by default here. You can control that via the parameter bluestore_min_alloc_size during OSD creation. 64 kb times 8 million files is 512 GB which is the amount of usable space you reported before running the test, so that seems to add up. My test cluster is virtualized on vSphere, but the OSDs are reported as HDDs. And our production cluster also uses HDDs only. All OSDs use the default value for bluestore_min_alloc_size. If we should really consider tinkering with bluestore_min_alloc_size: As this is probably not tunable afterwards, we would need to replace all OSDs in a rolling update. Should we expect any problems while we have OSDs with mixed min_alloc_sizes? There's also some metadata overhead etc. You might want to consider enabling inline data in cephfs to handle small files in a store-efficient way (note that this feature is officially marked as experimental, though). http://docs.ceph.com/docs/master/cephfs/experimental-features/#inline-data I'll give it a try on my test cluster. -- Jörn Clausen Daten- und Rechenzentrum GEOMAR Helmholtz-Zentrum für Ozeanforschung Kiel Düsternbrookerweg 20 24105 Kiel ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] PG stuck in active+clean+remapped
As we fixed failed node next day, cluster rebalanced to it's original state without any issues, so crush dump would be irrelevant at this point I guess. Will have to wait for next occurence. Here's a tunables part, maybe it will help to shed some light: "tunables": { "choose_local_tries": 0, "choose_local_fallback_tries": 0, "choose_total_tries": 50, "chooseleaf_descend_once": 1, "chooseleaf_vary_r": 1, "chooseleaf_stable": 0, "straw_calc_version": 1, "allowed_bucket_algs": 22, "profile": "firefly", "optimal_tunables": 0, "legacy_tunables": 0, "minimum_required_version": "firefly", "require_feature_tunables": 1, "require_feature_tunables2": 1, "has_v2_rules": 0, "require_feature_tunables3": 1, "has_v3_rules": 0, "has_v4_buckets": 0, "require_feature_tunables5": 0, "has_v5_rules": 0 }, вс, 31 мар. 2019 г. в 13:28, huang jun : > seems like the crush cannot get enough osds for this pg, > what the output of 'ceph osd crush dump' and especially the 'tunables' > section values? > > Vladimir Prokofev 于2019年3月27日周三 上午4:02写道: > > > > CEPH 12.2.11, pool size 3, min_size 2. > > > > One node went down today(private network interface started flapping, and > after a while OSD processes crashed), no big deal, cluster recovered, but > not completely. 1 PG stuck in active+clean+remapped state. > > > > PG_STAT OBJECTS MISSING_ON_PRIMARY DEGRADED MISPLACED UNFOUND BYTES > LOG DISK_LOG STATE STATE_STAMPVERSION > REPORTEDUP UP_PRIMARY ACTING ACTING_PRIMARY > LAST_SCRUB SCRUB_STAMPLAST_DEEP_SCRUB > DEEP_SCRUB_STAMP SNAPTRIMQ_LEN > > 20.a2 511 00 511 0 > 1584410172 1500 1500 active+clean+remapped 2019-03-26 20:50:18.639452 > 96149'18920496861:935872[26,14] 26 [26,14,9] > 2696149'189204 2019-03-26 10:47:36.17476995989'187669 2019-03-22 > 23:29:02.322848 0 > > > > it states it's placed on 26,14 OSDs, should be on 26,14,9. As far as I > can see there's nothing wrong with any of those OSDs, they work, host other > PGs, peer with each other, etc. I tried restarting all of them one after > another, but without any success. > > OSD 9 hosts 95 other PGs, don't think it's PG overdose. > > > > Last line of log from osd.9 mentioning PG 20.a2: > > 2019-03-26 20:50:16.294500 7fe27963a700 1 osd.9 pg_epoch: 96860 > pg[20.a2( v 96149'189204 (95989'187645,96149'189204] > local-lis/les=96857/96858 n=511 ec=39164/39164 lis/c 96857/96855 les/c/f > 96858/96856/66611 96859/96860/96855) [26,14]/[26,14,9] r=2 lpr=96860 > pi=[96855,96860)/1 crt=96149'189204 lcod 0'0 remapped NOTIFY mbc={}] > state: transitioning to Stray > > > > Nothing else out of ordinary, just usual scrubs/deep-scrubs > notifications. > > Any ideas what it it can be, or any other steps to troubleshoot this? > > ___ > > ceph-users mailing list > > ceph-users@lists.ceph.com > > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com > > > > -- > Thank you! > HuangJun > ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
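As a rough sketch of how to dig further into a PG stuck in active+clean+remapped (assuming the PG id 20.a2 from above and that jq is installed for filtering the JSON), the following read-only commands show why the acting set differs from the up set and dump just the tunables:
  ceph pg 20.a2 query | jq '.recovery_state'
  ceph pg map 20.a2
  ceph osd crush dump | jq '.tunables'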
Re: [ceph-users] REQUEST_SLOW across many OSDs at the same time
Thanks for this advice. It helped me to identify a subset of devices (only 3 in the whole cluster) where this problem was happening. The SAS adapter (LSI SAS 3008) on my Supermicro board was the issue. There is a RAID mode enabled by default. I have flashed the latest firmware (v16) and switched to IT mode (no RAID). Issues with slow requests immediately ceased. I hope it will help someone else with the same issue :-) Best, Martin

I am afraid I was not clear enough. Suppose that ceph health detail reports a slow request involving osd.14. In the osd.14 log I see this line: 2019-02-24 16:58:39.475740 7fe25a84d700 0 log_channel(cluster) log [WRN] : slow request 30.328572 seconds old, received at 2019-02-24 16:58:09.147037: osd_op(client.148580771.0:476351313 8.1d6 8:6ba6a916:::rbd_data.ba32e7238e1f 29.04b3:head [set-alloc-hint object_size 4194304 write_size 4194304,write 3776512~4096] snapc 0=[] ondisk+write+known_if_redirected e 1242718) currently op_applied Here the PG is 8.1d6: [root@ceph-osd-02 ceph]# ceph pg map 8.1d6 osdmap e1247126 pg 8.1d6 (8.1d6) -> up [14,38,24] acting [14,38,24] So the problem is not necessarily in osd.14. It could also be in osd.38 or osd.24, or in the relevant hosts. ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
[ceph-users] co-located cephfs client deadlock
Hi all, We have been benchmarking a hyperconverged cephfs cluster (kernel clients + osd on same machines) for awhile. Over the weekend (for the first time) we had one cephfs mount deadlock while some clients were running ior. All the ior processes are stuck in D state with this stack: [] wait_on_page_bit+0x83/0xa0 [] __filemap_fdatawait_range+0x111/0x190 [] filemap_fdatawait_range+0x14/0x30 [] filemap_write_and_wait_range+0x56/0x90 [] ceph_fsync+0x55/0x420 [ceph] [] do_fsync+0x67/0xb0 [] SyS_fsync+0x10/0x20 [] system_call_fastpath+0x22/0x27 [] 0x We tried restarting the co-located OSDs, and tried evicting the client, but the processes stay deadlocked. We've seen the recent issue related to co-location (https://bugzilla.redhat.com/show_bug.cgi?id=1665248) but we don't have the `usercopy` warning in dmesg. Are there other known issues related to co-locating? Thanks! Dan ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] CephFS and many small files
There are no problems with mixed bluestore_min_alloc_size; that's an abstraction layer lower than the concept of multiple OSDs. (Also, you always have that when mixing SSDs and HDDs) I'm not sure about the real-world impacts of a lower min alloc size or the rationale behind the default values for HDDs (64) and SSDs (16kb). Paul -- Paul Emmerich Looking for help with your Ceph cluster? Contact us at https://croit.io croit GmbH Freseniusstr. 31h 81247 München www.croit.io Tel: +49 89 1896585 90 On Mon, Apr 1, 2019 at 10:36 AM Clausen, Jörn wrote: > > Hi Paul! > > Thanks for your answer. Yep, bluestore_min_alloc_size and your > calculation sounds very reasonable to me :) > > Am 29.03.2019 um 23:56 schrieb Paul Emmerich: > > Are you running on HDDs? The minimum allocation size is 64kb by > > default here. You can control that via the parameter > > bluestore_min_alloc_size during OSD creation. > > 64 kb times 8 million files is 512 GB which is the amount of usable > > space you reported before running the test, so that seems to add up. > > My test cluster is virtualized on vSphere, but the OSDs are reported as > HDDs. And our production cluster also uses HDDs only. All OSDs use the > default value for bluestore_min_alloc_size. > > If we should really consider tinkering with bluestore_min_alloc_size: As > this is probably not tunable afterwards, we would need to replace all > OSDs in a rolling update. Should we expect any problems while we have > OSDs with mixed min_alloc_sizes? > > > There's also some metadata overhead etc. You might want to consider > > enabling inline data in cephfs to handle small files in a > > store-efficient way (note that this feature is officially marked as > > experimental, though). > > http://docs.ceph.com/docs/master/cephfs/experimental-features/#inline-data > > I'll give it a try on my test cluster. > > -- > Jörn Clausen > Daten- und Rechenzentrum > GEOMAR Helmholtz-Zentrum für Ozeanforschung Kiel > Düsternbrookerweg 20 > 24105 Kiel > ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
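For what it's worth, a minimal sketch of how the allocation size could be changed for newly created OSDs (assumption: ceph-volume is used and /dev/sdX is a placeholder for a disk that may be wiped; the value is baked in at OSD creation time, so existing OSDs keep whatever they were built with):
  # in ceph.conf on the OSD host, before rebuilding the OSD
  [osd]
  bluestore_min_alloc_size_hdd = 16384   # example: 16 kb instead of the 64 kb default
  bluestore_min_alloc_size_ssd = 16384
  # then recreate the OSD, e.g.
  ceph-volume lvm zap /dev/sdX --destroy
  ceph-volume lvm create --data /dev/sdX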
Re: [ceph-users] co-located cephfs client deadlock
Which kernel version are you using? We've had lots of problems with random deadlocks in kernels with cephfs but 4.19 seems to be pretty stable. Paul -- Paul Emmerich Looking for help with your Ceph cluster? Contact us at https://croit.io croit GmbH Freseniusstr. 31h 81247 München www.croit.io Tel: +49 89 1896585 90 On Mon, Apr 1, 2019 at 12:45 PM Dan van der Ster wrote: > > Hi all, > > We have been benchmarking a hyperconverged cephfs cluster (kernel > clients + osd on same machines) for awhile. Over the weekend (for the > first time) we had one cephfs mount deadlock while some clients were > running ior. > > All the ior processes are stuck in D state with this stack: > > [] wait_on_page_bit+0x83/0xa0 > [] __filemap_fdatawait_range+0x111/0x190 > [] filemap_fdatawait_range+0x14/0x30 > [] filemap_write_and_wait_range+0x56/0x90 > [] ceph_fsync+0x55/0x420 [ceph] > [] do_fsync+0x67/0xb0 > [] SyS_fsync+0x10/0x20 > [] system_call_fastpath+0x22/0x27 > [] 0x > > We tried restarting the co-located OSDs, and tried evicting the > client, but the processes stay deadlocked. > > We've seen the recent issue related to co-location > (https://bugzilla.redhat.com/show_bug.cgi?id=1665248) but we don't > have the `usercopy` warning in dmesg. > > Are there other known issues related to co-locating? > > Thanks! > Dan > ___ > ceph-users mailing list > ceph-users@lists.ceph.com > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] co-located cephfs client deadlock
It's the latest CentOS 7.6 kernel. Known pain there? The user was running a 1.95TiB ior benchmark -- so, trying to do parallel writes to one single 1.95TiB file. We have max_file_size 219902322 (exactly 2 TiB) so it should fit. Thanks! Dan On Mon, Apr 1, 2019 at 1:06 PM Paul Emmerich wrote: > > Which kernel version are you using? We've had lots of problems with > random deadlocks in kernels with cephfs but 4.19 seems to be pretty > stable. > > > Paul > > -- > Paul Emmerich > > Looking for help with your Ceph cluster? Contact us at https://croit.io > > croit GmbH > Freseniusstr. 31h > 81247 München > www.croit.io > Tel: +49 89 1896585 90 > > On Mon, Apr 1, 2019 at 12:45 PM Dan van der Ster wrote: > > > > Hi all, > > > > We have been benchmarking a hyperconverged cephfs cluster (kernel > > clients + osd on same machines) for awhile. Over the weekend (for the > > first time) we had one cephfs mount deadlock while some clients were > > running ior. > > > > All the ior processes are stuck in D state with this stack: > > > > [] wait_on_page_bit+0x83/0xa0 > > [] __filemap_fdatawait_range+0x111/0x190 > > [] filemap_fdatawait_range+0x14/0x30 > > [] filemap_write_and_wait_range+0x56/0x90 > > [] ceph_fsync+0x55/0x420 [ceph] > > [] do_fsync+0x67/0xb0 > > [] SyS_fsync+0x10/0x20 > > [] system_call_fastpath+0x22/0x27 > > [] 0x > > > > We tried restarting the co-located OSDs, and tried evicting the > > client, but the processes stay deadlocked. > > > > We've seen the recent issue related to co-location > > (https://bugzilla.redhat.com/show_bug.cgi?id=1665248) but we don't > > have the `usercopy` warning in dmesg. > > > > Are there other known issues related to co-locating? > > > > Thanks! > > Dan > > ___ > > ceph-users mailing list > > ceph-users@lists.ceph.com > > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
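For reference, a quick sketch for checking (and, if needed, raising) the limit, assuming the filesystem is named cephfs; the value is in bytes:
  ceph fs get cephfs | grep max_file_size
  ceph fs set cephfs max_file_size 4398046511104   # example: 4 TiB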
Re: [ceph-users] Samsung 983 NVMe M.2 - experiences?
Hi Fabian, We've just started building a cluster using the PM983 for the bucket index. Let me know if you want us to perform any test on them. Thanks, Martin > -Original Message- > From: ceph-users On Behalf Of > Fabian Figueredo > Sent: 30. marts 2019 07:55 > To: ceph-users@lists.ceph.com > Subject: [ceph-users] Samsung 983 NVMe M.2 - experiences? > > Hello, > I'm in the process of building a new ceph cluster, this time around i was > considering going with nvme ssd drives. > In searching for something in the line of 1TB per ssd drive, i found "Samsung > 983 DCT 960GB NVMe M.2 Enterprise SSD for Business". > > More info: > https://www.samsung.com/us/business/products/computing/ssd/enterpris > e/983-dct-960gb-mz-1lb960ne/ > > The idea is buy 10 units. > > Anyone have any thoughts/experiences with this drives? > > Thanks, > Fabian ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
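If it helps, a common single-threaded sync-write test for journal/WAL suitability looks roughly like this (a sketch only; it writes directly to the device, so /dev/nvme0n1 is a placeholder for a scratch drive whose contents can be destroyed):
  fio --name=sync-write --filename=/dev/nvme0n1 --direct=1 --sync=1 --rw=write --bs=4k --numjobs=1 --iodepth=1 --runtime=60 --time_based --group_reporting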
[ceph-users] op_w_latency
Hello Ceph Users, I am finding that the write latency across my ceph clusters isn't great and I wanted to see what other people are getting for op_w_latency. Generally I am getting 70-110ms latency. I am using: ceph --admin-daemon /var/run/ceph/ceph-osd.102.asok perf dump | grep -A3 '\"op_w_latency' | grep 'avgtime' Ram, CPU and network don't seem to be the bottleneck. The drives are behind a dell H810p raid card with a 1GB writeback cache and battery. I have tried with LSI JBOD cards and haven't found it faster ( as you would expect with write cache ). The disks through iostat -xyz 1 show 10-30% usage with general service + write latency around 3-4ms. Queue depth is normally less than one. RocksDB write latency is around 0.6ms, read 1-2ms. Usage is RBD backend for Cloudstack. Dumping the ops seems to show the latency here: (ceph --admin-daemon /var/run/ceph/ceph-osd.102.asok dump_historic_ops_by_duration |less) { "time": "2019-04-01 22:24:38.432000", "event": "queued_for_pg" }, { "time": "2019-04-01 22:24:38.438691", "event": "reached_pg" }, { "time": "2019-04-01 22:24:38.438740", "event": "started" }, { "time": "2019-04-01 22:24:38.727820", "event": "sub_op_started" }, { "time": "2019-04-01 22:24:38.728448", "event": "sub_op_committed" }, { "time": "2019-04-01 22:24:39.129175", "event": "commit_sent" }, { "time": "2019-04-01 22:24:39.129231", "event": "done" } ] } } This write was around a very slow one and I am wondering if I have a few ops that are taking along time and most that are good What else can I do to figure out where the issue is? This e-mail is intended solely for the benefit of the addressee(s) and any other named recipient. It is confidential and may contain legally privileged or confidential information. If you are not the recipient, any use, distribution, disclosure or copying of this e-mail is prohibited. The confidentiality and legal privilege attached to this communication is not waived or lost by reason of the mistaken transmission or delivery to you. If you have received this e-mail in error, please notify us immediately. ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
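A small sketch for collecting that counter from every OSD admin socket on one host (assuming the default /var/run/ceph socket paths and jq installed):
  for sock in /var/run/ceph/ceph-osd.*.asok; do
    echo -n "$sock: "
    ceph --admin-daemon "$sock" perf dump | jq '.osd.op_w_latency.avgtime'
  done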
Re: [ceph-users] co-located cephfs client deadlock
On Mon, Apr 1, 2019 at 6:45 PM Dan van der Ster wrote: > > Hi all, > > We have been benchmarking a hyperconverged cephfs cluster (kernel > clients + osd on same machines) for awhile. Over the weekend (for the > first time) we had one cephfs mount deadlock while some clients were > running ior. > > All the ior processes are stuck in D state with this stack: > > [] wait_on_page_bit+0x83/0xa0 > [] __filemap_fdatawait_range+0x111/0x190 > [] filemap_fdatawait_range+0x14/0x30 > [] filemap_write_and_wait_range+0x56/0x90 > [] ceph_fsync+0x55/0x420 [ceph] > [] do_fsync+0x67/0xb0 > [] SyS_fsync+0x10/0x20 > [] system_call_fastpath+0x22/0x27 > [] 0x > are there hang osd requests in /sys/kernel/debug/ceph/xxx/osdc? > We tried restarting the co-located OSDs, and tried evicting the > client, but the processes stay deadlocked. > > We've seen the recent issue related to co-location > (https://bugzilla.redhat.com/show_bug.cgi?id=1665248) but we don't > have the `usercopy` warning in dmesg. > > Are there other known issues related to co-locating? > > Thanks! > Dan > ___ > ceph-users mailing list > ceph-users@lists.ceph.com > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
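In case it is useful, a sketch for dumping the kernel client's in-flight requests from debugfs (the directory name under /sys/kernel/debug/ceph/ is the cluster fsid plus the client id; requires root):
  for d in /sys/kernel/debug/ceph/*/; do
    echo "== $d"
    cat "${d}osdc"   # outstanding OSD requests
    cat "${d}mdsc"   # outstanding MDS requests
  done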
Re: [ceph-users] CephFS and many small files
I haven't had any issues with 4k allocation size in a cluster holding 189M files. April 1, 2019 2:04 PM, "Paul Emmerich" wrote: > I'm not sure about the real-world impacts of a lower min alloc size or > the rationale behind the default values for HDDs (64) and SSDs (16kb). > > Paul ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] Ceph nautilus upgrade problem
Hi, Please let us know how this ended for you! -- Mark Schouten Tuxis, Ede, https://www.tuxis.nl T: +31 318 200208 - Originele bericht - Van: Stadsnet (jwil...@stads.net) Datum: 26-03-2019 16:42 Naar: Ashley Merrick (singap...@amerrick.co.uk) Cc: ceph-users@lists.ceph.com Onderwerp: Re: [ceph-users] Ceph nautilus upgrade problem On 26-3-2019 16:39, Ashley Merrick wrote: Have you upgraded any OSD's? No didn't go through with the osd's On a test cluster I saw the same and as I upgraded / restarted the OSD's the PG's started to show online till it was 100%. I know it says to not change anything to do with pool's during the upgrade so I am guessing there is a code change that cause this till all is on the same version. will continue On Tue, Mar 26, 2019 at 11:37 PM Stadsnet wrote: We did a upgrade from luminous to nautilus after upgrading the three monitors we got that all our pgs where inactive cluster: id: 5bafad08-31b2-4716-be77-07ad2e2647eb health: HEALTH_ERR noout flag(s) set 1 scrub errors Reduced data availability: 1429 pgs inactive 316 pgs not deep-scrubbed in time 520 pgs not scrubbed in time 3 monitors have not enabled msgr2 services: mon: 3 daemons, quorum Ceph-Mon1,Ceph-Mon2,Ceph-Mon3 (age 51m) mgr: Ceph-Mon1(active, since 23m), standbys: Ceph-Mon3, Ceph-Mon2 osd: 103 osds: 103 up, 103 in flags noout rgw: 2 daemons active (S3-Ceph1, S3-Ceph2) data: pools: 26 pools, 3248 pgs objects: 134.92M objects, 202 TiB usage: 392 TiB used, 486 TiB / 879 TiB avail pgs: 100.000% pgs unknown 3248 unknown System seems to keep working. Did we loose reference "-1 0 root default" ? is there a fix for that ? ceph osd tree ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF -18 16.0 root ssd -10 2.0 host Ceph-Stor1-SSD 80 nvme 2.0 osd.80 up 1.0 1.0 -11 2.0 host Ceph-Stor2-SSD 81 nvme 2.0 osd.81 up 1.0 1.0 -12 2.0 host Ceph-Stor3-SSD 82 nvme 2.0 osd.82 up 1.0 1.0 -13 2.0 host Ceph-Stor4-SSD 83 nvme 2.0 osd.83 up 1.0 1.0 -14 2.0 host Ceph-Stor5-SSD 84 nvme 2.0 osd.84 up 1.0 1.0 -15 2.0 host Ceph-Stor6-SSD 85 nvme 2.0 osd.85 up 1.0 1.0 -16 2.0 host Ceph-Stor7-SSD 86 nvme 2.0 osd.86 up 1.0 1.0 -17 2.0 host Ceph-Stor8-SSD 87 nvme 2.0 osd.87 up 1.0 1.0 -1 865.93420 root default -2 110.96700 host Ceph-Stor1 0 hdd 9.09599 osd.0 up 1.0 1.0 1 hdd 9.09599 osd.1 up 1.0 1.0 2 hdd 9.09599 osd.2 up 1.0 1.0 3 hdd 9.09599 osd.3 up 1.0 1.0 4 hdd 9.09599 osd.4 up 1.0 1.0 5 hdd 9.09599 osd.5 up 1.0 1.0 6 hdd 9.09599 osd.6 up 1.0 1.0 7 hdd 9.09599 osd.7 up 1.0 1.0 8 hdd 9.09599 osd.8 up 1.0 1.0 9 hdd 9.09599 osd.9 up 1.0 1.0 88 hdd 9.09599 osd.88 up 1.0 1.0 89 hdd 9.09599 osd.89 up 1.0 1.0 -3 109.15189 host Ceph-Stor2 10 hdd 9.09599 osd.10 up 1.0 1.0 11 hdd 9.09599 osd.11 up 1.0 1.0 12 hdd 9.09599 osd.12 up 1.0 1.0 13 hdd 9.09599 osd.13 up 1.0 1.0 14 hdd 9.09599 osd.14 up 1.0 1.0 15 hdd 9.09599 osd.15 up 1.0 1.0 16 hdd 9.09599 osd.16 up 1.0 1.0 17 hdd 9.09599 osd.17 up 1.0 1.0 18 hdd 9.09599 osd.18 up 1.0 1.0 19 hdd 9.09599 osd.19 up 1.0 1.0 90 hdd 9.09598 osd.90 up 1.0 1.0 91 hdd 9.09598 osd.91 up 1.0 1.0 -4 109.15189 host Ceph-Stor3 20 hdd 9.09599 osd.20 up 1.0 1.0 21 hdd 9.09599 osd.21 up 1.0 1.0 22 hdd 9.09599 osd.22 up 1.0 1.0 23 hdd 9.09599 osd.23 up 1.0 1.0 24 hdd 9.09599 osd.24 up 1.0 1.0 25 hdd 9.09599 osd.25 up 1.0 1.0 26 hdd 9.09599 osd.26 up 1.0 1.0 27 hdd 9.09599 osd.27 up 1.0 1.0 28 hdd 9.09599 osd.28 up 1.0 1.0 29 hdd 9.09599 osd.29 up 1.0 1.0 92 hdd 9.09598 osd.92 up 1.0 1.0 93 hdd 9.09598 osd.93 up 0.80002 1.0 -5 109.15189 host Ceph-Stor4 30 hdd 9.09599 osd.30 up 1.0 1.0 31 hdd 9.09599 osd.31 up 1.0 1.0 32 
hdd 9.09599 osd.32 up 1.0 1.0 33 hdd 9.09599 osd.33 up 1.0 1.0 34 hdd 9.09599 osd.34 up 0.90002 1.0 35 hdd 9.09599 osd.35 up 1.0 1.0 36 hdd 9.09599 osd.36 up 1.0 1.0 37 hdd 9.09599 osd.37 up 1.0 1.0 38 hdd 9.09599 osd.38 up 1.0 1.0 39 hdd 9.09599 osd.39 up 1.0 1.0 94 hdd 9.09598 osd.94 up 1.0 1.0 95 hdd 9.09598 osd.95 up 1.0 1.0 -6 109.15189 host Ceph-Stor5 40 hdd 9.09599 osd.40 up 1.0 1.0 41 hdd 9.09599 osd.41 up 1.0 1.0 42 hdd 9.09599 osd.42 up 1.0 1.0 43 hdd 9.09599 osd.43 up 1.0 1.0 44 hdd 9.09599 osd.44 up 1.0 1.0 45 hdd 9.09599 osd.45 up 1.0 1.0 46 hdd 9.09599 osd.46 up 1.0 1.0 47 hdd 9.09599 osd.47 up 1.0 1.0 48 hdd 9.09599 osd.48 up 1.0 1.0 49 hdd 9.09599 osd.49 up 1.0 1.0 96 hdd 9.09598 osd.96 up 1.0 1.0 97 hdd 9.09598 osd.97 up 1.0 1.0 -7 109.15187 host Ceph-Stor6 50 hdd 9.09599 osd.5
[ceph-users] MDS allocates all memory (>500G) replaying, OOM-killed, repeat
Hello We are experiencing an issue where our ceph MDS gobbles up 500G of RAM, is killed by the kernel, dies, then repeats. We have 3 MDS daemons on different machines, and all are exhibiting this behavior. We are running the following versions (from Docker): * ceph/daemon:v3.2.1-stable-3.2-luminous-centos-7 * ceph/daemon:v3.2.1-stable-3.2-luminous-centos-7 * ceph/daemon:v3.1.0-stable-3.1-luminous-centos-7 (downgraded in last-ditch effort to resolve, didn't help) The machines hosting the MDS instances have 512G RAM. We tried adding swap, and the MDS just started eating into the swap (and got really slow, eventually being kicked out for exceeding the mds_beacon_grace of 240). mds_cache_memory_limit has been many values ranging from 200G to the default of 1073741824, and the result of replay is always the same: keep allocating memory until the kernel OOM killer stops it (or the mds_beacon_grace period expires, if swap is enabled). Before it died, the active MDS reported 1.592 million inodes to Prometheus (ceph_mds_inodes) and 1.493 million caps (ceph_mds_caps). This appears to be the same problem as http://lists.ceph.com/pipermail/ceph-users-ceph.com/2018-October/030872.html At this point I feel like my best option is to try to destroy the journal and hope things come back, but while we can probably recover from this, I'd like to prevent it happening in the future. Any advice? Neale Pickett A-4: Advanced Research in Cyber Systems Los Alamos National Laboratory ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
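A rough sketch for watching what the MDS actually holds while it replays (the mds name is a placeholder, and dump_mempools assumes a version that exposes it over the admin socket):
  ceph daemon mds.<name> cache status
  ceph daemon mds.<name> dump_mempools
  ceph daemon mds.<name> config get mds_cache_memory_limit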
Re: [ceph-users] ceph-iscsi: (Config.lock) Timed out (30s) waiting for excl lock on gateway.conf object
What happens when you run "rados -p rbd lock list gateway.conf"? On Fri, Mar 29, 2019 at 12:19 PM Matthias Leopold wrote: > > Hi, > > I upgraded my test Ceph iSCSI gateways to > ceph-iscsi-3.0-6.g433bbaa.el7.noarch. > I'm trying to use the new parameter "cluster_client_name", which - to me > - sounds like I don't have to access the ceph cluster as "client.admin" > anymore. I created a "client.iscsi" user and watched what happened. The > gateways can obviously read the config (which I created when I was still > client.admin), but when I try to change anything (like create a new disk > in pool "iscsi") I get the following error: > > (Config.lock) Timed out (30s) waiting for excl lock on gateway.conf object > > I suspect this is related to the privileges of "client.iscsi", but I > couldn't find the correct settings yet. The last thing I tried was: > > caps: [mon] allow r, allow command "osd blacklist" > caps: [osd] allow * pool=rbd, profile rbd pool=iscsi > > Can anybody tell me how to solve this? > My Ceph version is 12.2.10 on CentOS 7. > > thx > Matthias > ___ > ceph-users mailing list > ceph-users@lists.ceph.com > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com -- Jason ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
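If the lock turns out to be stale (e.g. left behind by a crashed gateway), a sketch for inspecting and breaking it; the lock and locker names are placeholders to be taken from the 'lock list' output:
  rados -p rbd lock list gateway.conf
  rados -p rbd lock info gateway.conf <lock_name>
  rados -p rbd lock break gateway.conf <lock_name> <locker_name>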
Re: [ceph-users] MDS allocates all memory (>500G) replaying, OOM-killed, repeat
We decided to go ahead and try truncating the journal, but before we did, we would try to back it up. However, there are ridiculous values in the header. It can't write a journal this large because (I presume) my ext4 filesystem can't seek to this position in the (sparse) file. I would not be surprised to learn that memory allocation is trying to do something similar, hence the allocation of all available memory. This seems like a new kind of journal corruption that isn't being reported correctly. [root@lima /]# time cephfs-journal-tool --cluster=prodstore journal export backup.bin journal is 24652730602129~673601102 2019-04-01 17:49:52.776977 7fdcb999e040 -1 Error 22 ((22) Invalid argument) seeking to 0x166be9401291 Error ((22) Invalid argument) real0m27.832s user0m2.028s sys 0m3.438s [root@lima /]# cephfs-journal-tool --cluster=prodstore event get summary Events by type: EXPORT: 187 IMPORTFINISH: 182 IMPORTSTART: 182 OPEN: 3133 SUBTREEMAP: 129 UPDATE: 42185 Errors: 0 [root@lima /]# cephfs-journal-tool --cluster=prodstore header get { "magic": "ceph fs volume v011", "write_pos": 24653404029749, "expire_pos": 24652730602129, "trimmed_pos": 24652730597376, "stream_format": 1, "layout": { "stripe_unit": 4194304, "stripe_count": 1, "object_size": 4194304, "pool_id": 2, "pool_ns": "" } } [root@lima /]# printf "%x\n" "24653404029749" 166c1163c335 [root@lima /]# printf "%x\n" "24652730602129" 166be9401291 ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] MDS allocates all memory (>500G) replaying, OOM-killed, repeat
Since my problem is going to be archived on the Internet I'll keep following up, so the next person with this problem might save some time. The seek was because ext4 can't seek to 23TB, but changing to an xfs mount to create this file resulted in success. Here is what I wound up doing to fix this: * Bring down all MDSes so they stop flapping * Back up journal (as seen in previous message) * Apply journal manually * Reset journal manually * Clear session table * Clear other tables (not sure I needed to do this) * Mark FS down * Mark the rank 0 MDS as failed * Reset the FS (yes, I really mean it) * Restart MDSes * Finally get some sleep If anybody has any idea what may have caused this situation, I am keenly interested. If not, hopefully I at least helped someone else. From: Pickett, Neale T Sent: Monday, April 1, 2019 12:31 To: ceph-users@lists.ceph.com Subject: Re: MDS allocates all memory (>500G) replaying, OOM-killed, repeat We decided to go ahead and try truncating the journal, but before we did, we would try to back it up. However, there are ridiculous values in the header. It can't write a journal this large because (I presume) my ext4 filesystem can't seek to this position in the (sparse) file. I would not be surprised to learn that memory allocation is trying to do something similar, hence the allocation of all available memory. This seems like a new kind of journal corruption that isn't being reported correctly. [root@lima /]# time cephfs-journal-tool --cluster=prodstore journal export backup.bin journal is 24652730602129~673601102 2019-04-01 17:49:52.776977 7fdcb999e040 -1 Error 22 ((22) Invalid argument) seeking to 0x166be9401291 Error ((22) Invalid argument) real0m27.832s user0m2.028s sys 0m3.438s [root@lima /]# cephfs-journal-tool --cluster=prodstore event get summary Events by type: EXPORT: 187 IMPORTFINISH: 182 IMPORTSTART: 182 OPEN: 3133 SUBTREEMAP: 129 UPDATE: 42185 Errors: 0 [root@lima /]# cephfs-journal-tool --cluster=prodstore header get { "magic": "ceph fs volume v011", "write_pos": 24653404029749, "expire_pos": 24652730602129, "trimmed_pos": 24652730597376, "stream_format": 1, "layout": { "stripe_unit": 4194304, "stripe_count": 1, "object_size": 4194304, "pool_id": 2, "pool_ns": "" } } [root@lima /]# printf "%x\n" "24653404029749" 166c1163c335 [root@lima /]# printf "%x\n" "24652730602129" 166be9401291 ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
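For the record, a command-level sketch of the steps above, roughly following the cephfs disaster-recovery procedure (fs name and rank are placeholders; take a journal backup first as shown earlier, since several of these commands are destructive):
  cephfs-journal-tool journal export backup.bin
  cephfs-journal-tool event recover_dentries summary
  cephfs-journal-tool journal reset
  cephfs-table-tool all reset session
  ceph fs set <fs_name> cluster_down true
  ceph mds fail <fs_name>:0
  ceph fs reset <fs_name> --yes-i-really-mean-it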
Re: [ceph-users] MDS allocates all memory (>500G) replaying, OOM-killed, repeat
These steps pretty well correspond to http://docs.ceph.com/docs/mimic/cephfs/disaster-recovery/ (http://docs.ceph.com/docs/mimic/cephfs/disaster-recovery/) Were you able to replay journal manually with no issues? IIRC, "cephfs-journal-tool recover_dentries" would lead to OOM in case of MDS doing so, and it has already been discussed on this list. April 2, 2019 1:37 AM, "Pickett, Neale T" mailto:ne...@lanl.gov?to=%22Pickett,%20Neale%20T%22%20)> wrote: Here is what I wound up doing to fix this: * Bring down all MDSes so they stop flapping * Back up journal (as seen in previous message) * Apply journal manually * Reset journal manually * Clear session table * Clear other tables (not sure I needed to do this) * Mark FS down * Mark the rank 0 MDS as failed * Reset the FS (yes, I really mean it) * Restart MDSes * Finally get some sleep ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
[ceph-users] Unable to list rbd block > images in nautilus dashboard
Hi all, I've been having an issue with the dashboard being unable to list block images. In the mimic and luminous dashboards it would take a very long time to load, eventually telling me it was showing a cached list, and after a few auto refreshes it would finally show all rbd images and their properties. In the nautilus dashboard however it just times out and never tries again, display 'Could not load data. Please check the cluster health' - cluster however reports healthy. Using the cli to retrieve information works, however it can be slow to calculate du for every image. I know this isn't a problem with the dashboard itself but instead whatever mechanism its using under the hood to retrieve the information regarding the block images. I'm not sure what component needs to be diagnosed here though. The cluster itself is performant, VMs running from the rbd images are performant. I do have multiple rbd pools though, one on EC hdds for slow/large storage and another on replicated ssd for fast storage. Is this an issue with having multiple rbd pools? Or is this an issue with mon health? I should mention that this is a relatively small cluster of just a couple nodes, single mon, single mds, 2 rgw, 13 osd - it's basically a lab and home storage. Thanks, Wes Cilldhaire Sol1 (null) ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
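A sketch for checking whether the delay is really the per-image usage calculation (pool/image names below are only examples; rbd du is known to be slow for images without the object-map/fast-diff features):
  time rbd ls -l rbd
  time rbd du --pool rbd
  rbd info rbd/<image> | grep features
  rbd feature enable rbd/<image> exclusive-lock object-map fast-diff   # if the features are missing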
Re: [ceph-users] Unable to list rbd block > images in nautilus dashboard
Sorry slight correction, nautilus dashboard has finally listed the images, it just took even longer still. It's also reporting the same "Warning Displaying previously cached data for pools rbd, rbd_repl_ssd." messages as before and is clearly struggling. Thanks, Wes Cilldhaire Sol1 - Original Message - From: "Wes Cilldhaire" To: "ceph-users" Sent: Tuesday, 2 April, 2019 11:38:30 AM Subject: [ceph-users] Unable to list rbd block > images in nautilus dashboard Hi all, I've been having an issue with the dashboard being unable to list block images. In the mimic and luminous dashboards it would take a very long time to load, eventually telling me it was showing a cached list, and after a few auto refreshes it would finally show all rbd images and their properties. In the nautilus dashboard however it just times out and never tries again, display 'Could not load data. Please check the cluster health' - cluster however reports healthy. Using the cli to retrieve information works, however it can be slow to calculate du for every image. I know this isn't a problem with the dashboard itself but instead whatever mechanism its using under the hood to retrieve the information regarding the block images. I'm not sure what component needs to be diagnosed here though. The cluster itself is performant, VMs running from the rbd images are performant. I do have multiple rbd pools though, one on EC hdds for slow/large storage and another on replicated ssd for fast storage. Is this an issue with having multiple rbd pools? Or is this an issue with mon health? I should mention that this is a relatively small cluster of just a couple nodes, single mon, single mds, 2 rgw, 13 osd - it's basically a lab and home storage. Thanks, Wes Cilldhaire Sol1 (null) ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com (null) ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
[ceph-users] Update crushmap when monitors are down
Hi, Our ceph production cluster is down when updating crushmap. Now we can't get out monitors to come online and when they come online for a fraction of a second we see crush map errors in logs. How can we update crushmap when monitors are down as none of the ceph commands are working. Thanks, Pardhiv Karri ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] Update crushmap when monitors are down
Can you provide detail error logs when mon crash? Pardhiv Karri 于2019年4月2日周二 上午9:02写道: > > Hi, > > Our ceph production cluster is down when updating crushmap. Now we can't get > out monitors to come online and when they come online for a fraction of a > second we see crush map errors in logs. How can we update crushmap when > monitors are down as none of the ceph commands are working. > > Thanks, > Pardhiv Karri > > > > > ___ > ceph-users mailing list > ceph-users@lists.ceph.com > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com -- Thank you! HuangJun ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] Update crushmap when monitors are down
Hi Huang, We are on ceph Luminous 12.2.11 The primary is sh1ora1300 but that is not coming up at all. sh1ora1301 and sh1ora1302 are coming up and are in quorum as per log but still not able to run any ceph commands. Below is part of the log. 2019-04-02 00:48:51.644339 mon.sh1ora1302 mon.2 10.15.29.21:6789/0 105 : cluster [INF] mon.sh1ora1302 calling monitor election 2019-04-02 00:51:41.706135 mon.sh1ora1301 mon.1 10.15.29.15:6789/0 292 : cluster [WRN] overall HEALTH_WARN crush map has legacy tunables (require bobtail, min is firefly); 399 osds down; 14 hosts (17 osds) down; 785718/146017356 objects misplaced (0.538%); 10/48672452 objects unfound (0.000%); Reduced data availability: 11606 pgs inactive, 86 pgs down, 779 pgs peering, 3081 pgs stale; Degraded data redundancy: 59329035/146017356 objects degraded (40.631%), 16508 pgs degraded, 19795 pgs undersized; 1/3 mons down, quorum sh1ora1301,sh1ora1302 2019-04-02 00:52:15.583292 mon.sh1ora1301 mon.1 10.15.29.15:6789/0 293 : cluster [INF] mon.sh1ora1301 calling monitor election 2019-04-02 00:52:31.224838 mon.sh1ora1301 mon.1 10.15.29.15:6789/0 294 : cluster [INF] mon.sh1ora1301 calling monitor election 2019-04-02 00:52:31.256251 mon.sh1ora1301 mon.1 10.15.29.15:6789/0 295 : cluster [INF] mon.sh1ora1301 calling monitor election 2019-04-02 00:52:39.810572 mon.sh1ora1301 mon.1 10.15.29.15:6789/0 296 : cluster [INF] mon.sh1ora1301 is new leader, mons sh1ora1301,sh1ora1302 in quorum (ranks 1,2) 2019-04-02 00:48:06.751139 mon.sh1ora1302 mon.2 10.15.29.21:6789/0 104 : cluster [INF] mon.sh1ora1302 calling monitor election 2019-04-02 00:48:51.644339 mon.sh1ora1302 mon.2 10.15.29.21:6789/0 105 : cluster [INF] mon.sh1ora1302 calling monitor election 2019-04-02 00:51:41.706135 mon.sh1ora1301 mon.1 10.15.29.15:6789/0 292 : cluster [WRN] overall HEALTH_WARN crush map has legacy tunables (require bobtail, min is firefly); 399 osds down; 14 hosts (17 osds) down; 785718/146017356 objects misplaced (0.538%); 10/48672452 objects unfound (0.000%); Reduced data availability: 11606 pgs inactive, 86 pgs down, 779 pgs peering, 3081 pgs stale; Degraded data redundancy: 59329035/146017356 objects degraded (40.631%), 16508 pgs degraded, 19795 pgs undersized; 1/3 mons down, quorum sh1ora1301,sh1ora1302 2019-04-02 00:52:15.583292 mon.sh1ora1301 mon.1 10.15.29.15:6789/0 293 : cluster [INF] mon.sh1ora1301 calling monitor election 2019-04-02 00:52:31.224838 mon.sh1ora1301 mon.1 10.15.29.15:6789/0 294 : cluster [INF] mon.sh1ora1301 calling monitor election 2019-04-02 00:52:31.256251 mon.sh1ora1301 mon.1 10.15.29.15:6789/0 295 : cluster [INF] mon.sh1ora1301 calling monitor election 2019-04-02 00:52:39.810572 mon.sh1ora1301 mon.1 10.15.29.15:6789/0 296 : cluster [INF] mon.sh1ora1301 is new leader, mons sh1ora1301,sh1ora1302 in quorum (ranks 1,2) 2019-04-02 00:48:06.751139 mon.sh1ora1302 mon.2 10.15.29.21:6789/0 104 : cluster [INF] mon.sh1ora1302 calling monitor election 2019-04-02 00:48:51.644339 mon.sh1ora1302 mon.2 10.15.29.21:6789/0 105 : cluster [INF] mon.sh1ora1302 calling monitor election 2019-04-02 00:51:41.706135 mon.sh1ora1301 mon.1 10.15.29.15:6789/0 292 : cluster [WRN] overall HEALTH_WARN crush map has legacy tunables (require bobtail, min is firefly); 399 osds down; 14 hosts (17 osds) down; 785718/146017356 objects misplaced (0.538%); 10/48672452 objects unfound (0.000%); Reduced data availability: 11606 pgs inactive, 86 pgs down, 779 pgs peering, 3081 pgs stale; Degraded data redundancy: 59329035/146017356 objects degraded (40.631%), 16508 pgs degraded, 19795 
pgs undersized; 1/3 mons down, quorum sh1ora1301,sh1ora1302 2019-04-02 00:52:15.583292 mon.sh1ora1301 mon.1 10.15.29.15:6789/0 293 : cluster [INF] mon.sh1ora1301 calling monitor election 2019-04-02 00:52:31.224838 mon.sh1ora1301 mon.1 10.15.29.15:6789/0 294 : cluster [INF] mon.sh1ora1301 calling monitor election 2019-04-02 00:52:31.256251 mon.sh1ora1301 mon.1 10.15.29.15:6789/0 295 : cluster [INF] mon.sh1ora1301 calling monitor election 2019-04-02 00:52:39.810572 mon.sh1ora1301 mon.1 10.15.29.15:6789/0 296 : cluster [INF] mon.sh1ora1301 is new leader, mons sh1ora1301,sh1ora1302 in quorum (ranks 1,2) Thanks, Pardhiv Karri On Mon, Apr 1, 2019 at 6:16 PM huang jun wrote: > Can you provide detail error logs when mon crash? > > Pardhiv Karri 于2019年4月2日周二 上午9:02写道: > > > > Hi, > > > > Our ceph production cluster is down when updating crushmap. Now we can't > get out monitors to come online and when they come online for a fraction of > a second we see crush map errors in logs. How can we update crushmap when > monitors are down as none of the ceph commands are working. > > > > Thanks, > > Pardhiv Karri > > > > > > > > > > ___ > > ceph-users mailing list > > ceph-users@lists.ceph.com > > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com > > > > -- > Thank you! > HuangJun > -- *Pardhiv Karri* "Rise and Rise again until LAMBS become LIONS" __
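When no mon stays up long enough to accept commands, the crush map can at least be extracted from a stopped mon's store for inspection, e.g. (paths are examples; work on a copy of the mon store if possible, and note that newer ceph-monstore-tool builds also carry a rewrite-crush operation, so check its --help before relying on it):
  ceph-monstore-tool /var/lib/ceph/mon/ceph-sh1ora1301 get osdmap -- --out /tmp/osdmap.bin
  osdmaptool /tmp/osdmap.bin --export-crush /tmp/crush.bin
  crushtool -d /tmp/crush.bin -o /tmp/crush.txt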
[ceph-users] MDS stuck at replaying status
Hi, This happens after we restart the active MDS, and somehow the standby MDS daemon cannot take over successfully and is stuck at up:replaying. It is showing the following log. Any idea on how to fix this? 2019-04-02 12:54:00.985079 7f6f70670700 1 mds.WXS0023 respawn 2019-04-02 12:54:00.985095 7f6f70670700 1 mds.WXS0023 e: '/usr/bin/ceph-mds' 2019-04-02 12:54:00.985097 7f6f70670700 1 mds.WXS0023 0: '/usr/bin/ceph-mds' 2019-04-02 12:54:00.985099 7f6f70670700 1 mds.WXS0023 1: '-f' 2019-04-02 12:54:00.985100 7f6f70670700 1 mds.WXS0023 2: '--cluster' 2019-04-02 12:54:00.985101 7f6f70670700 1 mds.WXS0023 3: 'ceph' 2019-04-02 12:54:00.985102 7f6f70670700 1 mds.WXS0023 4: '--id' 2019-04-02 12:54:00.985103 7f6f70670700 1 mds.WXS0023 5: 'WXS0023' 2019-04-02 12:54:00.985104 7f6f70670700 1 mds.WXS0023 6: '--setuser' 2019-04-02 12:54:00.985105 7f6f70670700 1 mds.WXS0023 7: 'ceph' 2019-04-02 12:54:00.985106 7f6f70670700 1 mds.WXS0023 8: '--setgroup' 2019-04-02 12:54:00.985107 7f6f70670700 1 mds.WXS0023 9: 'ceph' 2019-04-02 12:54:00.985142 7f6f70670700 1 mds.WXS0023 respawning with exe /usr/bin/ceph-mds 2019-04-02 12:54:00.985145 7f6f70670700 1 mds.WXS0023 exe_path /proc/self/exe 2019-04-02 12:54:02.139272 7ff8a739a200 0 ceph version 12.2.5 (cad919881333ac92274171586c827e01f554a70a) luminous (stable), process (unknown), pid 3369045 2019-04-02 12:54:02.141565 7ff8a739a200 0 pidfile_write: ignore empty --pid-file 2019-04-02 12:54:06.675604 7ff8a0ecd700 1 mds.WXS0023 handle_mds_map standby 2019-04-02 12:54:26.114757 7ff8a0ecd700 1 mds.0.136021 handle_mds_map i am now mds.0.136021 2019-04-02 12:54:26.114764 7ff8a0ecd700 1 mds.0.136021 handle_mds_map state change up:boot --> up:replay 2019-04-02 12:54:26.114779 7ff8a0ecd700 1 mds.0.136021 replay_start 2019-04-02 12:54:26.114784 7ff8a0ecd700 1 mds.0.136021 recovery set is 2019-04-02 12:54:26.114789 7ff8a0ecd700 1 mds.0.136021 waiting for osdmap 14333 (which blacklists prior instance) 2019-04-02 12:54:26.141256 7ff89a6c0700 0 mds.0.cache creating system inode with ino:0x100 2019-04-02 12:54:26.141454 7ff89a6c0700 0 mds.0.cache creating system inode with ino:0x1 2019-04-02 12:54:50.148022 7ff89dec7700 1 heartbeat_map is_healthy 'MDSRank' had timed out after 15 2019-04-02 12:54:50.148049 7ff89dec7700 1 mds.beacon.WXS0023 _send skipping beacon, heartbeat map not healthy 2019-04-02 12:54:52.143637 7ff8a1ecf700 1 heartbeat_map is_healthy 'MDSRank' had timed out after 15 2019-04-02 12:54:54.148122 7ff89dec7700 1 heartbeat_map is_healthy 'MDSRank' had timed out after 15 2019-04-02 12:54:54.148157 7ff89dec7700 1 mds.beacon.WXS0023 _send skipping beacon, heartbeat map not healthy 2019-04-02 12:54:57.143730 7ff8a1ecf700 1 heartbeat_map is_healthy 'MDSRank' had timed out after 15 2019-04-02 12:54:58.148239 7ff89dec7700 1 heartbeat_map is_healthy 'MDSRank' had timed out after 15 2019-04-02 12:54:58.148249 7ff89dec7700 1 mds.beacon.WXS0023 _send skipping beacon, heartbeat map not healthy 2019-04-02 12:55:02.143819 7ff8a1ecf700 1 heartbeat_map is_healthy 'MDSRank' had timed out after 15 2019-04-02 12:55:02.148311 7ff89dec7700 1 heartbeat_map is_healthy 'MDSRank' had timed out after 15 2019-04-02 12:55:02.148330 7ff89dec7700 1 mds.beacon.WXS0023 _send skipping beacon, heartbeat map not healthy 2019-04-02 12:55:06.148393 7ff89dec7700 1 heartbeat_map is_healthy 'MDSRank' had timed out after 15 2019-04-02 12:55:06.148416 7ff89dec7700 1 mds.beacon.WXS0023 _send skipping beacon, heartbeat map not healthy 2019-04-02 12:55:07.143914 7ff8a1ecf700 1 heartbeat_map is_healthy 
'MDSRank' had timed out after 15 2019-04-02 12:55:07.615602 7ff89e6c8700 1 heartbeat_map reset_timeout 'MDSRank' had timed out after 15 2019-04-02 12:55:07.618294 7ff8a0ecd700 1 mds.WXS0023 map removed me (mds.-1 gid:7441294) from cluster due to lost contact; respawning 2019-04-02 12:55:07.618296 7ff8a0ecd700 1 mds.WXS0023 respawn 2019-04-02 12:55:07.618314 7ff8a0ecd700 1 mds.WXS0023 e: '/usr/bin/ceph-mds' 2019-04-02 12:55:07.618318 7ff8a0ecd700 1 mds.WXS0023 0: '/usr/bin/ceph-mds' 2019-04-02 12:55:07.618319 7ff8a0ecd700 1 mds.WXS0023 1: '-f' 2019-04-02 12:55:07.618320 7ff8a0ecd700 1 mds.WXS0023 2: '--cluster' 2019-04-02 12:55:07.618320 7ff8a0ecd700 1 mds.WXS0023 3: 'ceph' 2019-04-02 12:55:07.618321 7ff8a0ecd700 1 mds.WXS0023 4: '--id' 2019-04-02 12:55:07.618321 7ff8a0ecd700 1 mds.WXS0023 5: 'WXS0023' 2019-04-02 12:55:07.618322 7ff8a0ecd700 1 mds.WXS0023 6: '--setuser' 2019-04-02 12:55:07.618323 7ff8a0ecd700 1 mds.WXS0023 7: 'ceph' 2019-04-02 12:55:07.618323 7ff8a0ecd700 1 mds.WXS0023 8: '--setgroup' 2019-04-02 12:55:07.618325 7ff8a0ecd700 1 mds.WXS0023 9: 'ceph' 2019-04-02 12:55:07.618352 7ff8a0ecd700 1 mds.WXS0023 respawning with exe /usr/bin/ceph-mds 2019-04-02 12:55:07.618353 7ff8a0ecd700 1 mds.WXS0023 exe_path /proc/self/exe 2019-04-02 12:55:09.174064 7f4c596be200 0 ceph version 12.2.5 (cad919881333ac92274171586c827e01f554a70a) luminous (stabl