[ceph-users] How to recover/mount mirrored rbd image for file recovery

2020-03-19 Thread Ml Ml
Hello, my goal is to back up a Proxmox cluster with rbd-mirror for disaster recovery. Promoting/demoting, etc. works great. But how can I access a single file on the mirrored cluster? I tried: root@ceph01:~# rbd-nbd --read-only map cluster5-rbd/vm-114-disk-1 --cluster backup /dev/nbd1 Bu

[ceph-users] Re: How to recover/mount mirrored rbd image for file recovery

2020-03-19 Thread Eugen Block
Hi, one workaround would be to create a protected snapshot on the primary image (the snapshot is mirrored as well) and then clone that snapshot on the remote site. That clone can be accessed as required. I'm not sure if there's a way to directly access the remote image since it's read-only. Regards
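A rough sketch of that workaround, using the pool/image names from the original question; the snapshot name, clone name and mount point are made up for illustration:

# on the primary cluster: create and protect a snapshot of the image
rbd snap create cluster5-rbd/vm-114-disk-1@file-restore
rbd snap protect cluster5-rbd/vm-114-disk-1@file-restore
# once the snapshot has been mirrored, on the backup cluster: clone it and map the clone
rbd clone --cluster backup cluster5-rbd/vm-114-disk-1@file-restore cluster5-rbd/vm-114-disk-1-restore
rbd-nbd map --cluster backup cluster5-rbd/vm-114-disk-1-restore
# rbd-nbd prints the nbd device; if the disk carries a partition table, mount /dev/nbd0p<N> instead
mount -o ro /dev/nbd0 /mnt/restore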

[ceph-users] Re: Full OSD's on cephfs_metadata pool

2020-03-19 Thread Eugen Block
Hi, "I have tried extending the LV of one of the OSDs but it can't make use of it, and I have added a separate db volume but that didn't help." Can you tell why it can't make use of the additional space? Extending LVs has worked for me in Nautilus. Maybe you could share the steps you performed?
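For reference, a typical sequence for growing a BlueStore OSD in place looks roughly like this (a sketch only, not necessarily the steps referred to above; the OSD id, VG/LV names and size increment are placeholders, and the OSD is stopped first):

systemctl stop ceph-osd@<id>
lvextend -L +20G /dev/<vg>/<osd-block-lv>
# let BlueFS grow into the newly added space
ceph-bluestore-tool bluefs-bdev-expand --path /var/lib/ceph/osd/ceph-<id>
systemctl start ceph-osd@<id>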

[ceph-users] Re: OSDs continuously restarting under load

2020-03-19 Thread Igor Fedotov
Hi Samuel, I've never seen that sort of signal in real life: 2020-03-18 18:39:26.426584 201e35fdb40 -1 *** Caught signal (Bus error) ** I suppose this has some hardware roots. Have you checked the dmesg output? Just in case, here is some info on the "Bus error" signal, maybe it will provide s

[ceph-users] Re: Full OSD's on cephfs_metadata pool

2020-03-19 Thread Igor Fedotov
Hi Robert, there was a thread named "bluefs enospc" a couple of days ago where Derek shared steps to bring in a standalone DB volume and get rid of the "enospc" error. I'm currently working on a fix which will hopefully allow recovery from this failure, but it might take some time before it lands
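Roughly, attaching a standalone DB volume with ceph-bluestore-tool looks like the following (a sketch, not necessarily the exact steps from that thread; the OSD id and target device are placeholders, exact options may vary by release, and the OSD must be stopped first):

systemctl stop ceph-osd@<id>
# migrate RocksDB onto a new, dedicated DB device
ceph-bluestore-tool bluefs-bdev-new-db --path /var/lib/ceph/osd/ceph-<id> --dev-target /dev/<new-db-device>
systemctl start ceph-osd@<id>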

[ceph-users] Re: Full OSD's on cephfs_metadata pool

2020-03-19 Thread Robert Ruge
Thanks Igor. I found that thread in my mailbox a few hours into the episode and it saved the day. I managed to get 6 of the 8 OSDs up, which was enough to get the 10 missing PGs online and transitioned back onto HDD. However, I also appear to have killed two of the OSDs through maybe using ina

[ceph-users] Re: How to recover/mount mirrored rbd image for file recovery

2020-03-19 Thread Jason Dillaman
On Thu, Mar 19, 2020 at 6:19 AM Eugen Block wrote: > > Hi, > > one workaround would be to create a protected snapshot on the primary > image which is also mirrored, and then clone that snapshot on the > remote site. That clone can be accessed as required. +1. This is the correct approach. If you

[ceph-users] Re: OSDs continuously restarting under load

2020-03-19 Thread huxia...@horebdata.cn
Hi Igor, thanks for the tip. dmesg does not show anything suspicious. I will investigate whether the hardware has any problems. Best regards, Samuel huxia...@horebdata.cn From: Igor Fedotov Sent: 2020-03-19 12:07 To: huxia...@horebdata.cn; ceph-users; ceph-users Subject: Re: [ceph-u

[ceph-users] Re: Full OSD's on cephfs_metadata pool

2020-03-19 Thread Derek Yarnell
Hi Robert, sorry to hear that this impacted you, but I feel a bit better that I wasn't alone. Did you have a lot of log segments to trim on the MDSs when you recovered? I would agree that this was a very odd, sudden onset of space consumption for us. We usually have something like 600GB consumed of around

[ceph-users] Re: MGRs failing once per day and generally slow response times

2020-03-19 Thread Janek Bevendorff
Sorry for nagging, but is there a solution to this? Routinely restarting my MGRs every few hours isn't how I want to spend my time (although I guess I could schedule a cron job for that). On 16/03/2020 09:35, Janek Bevendorff wrote: > Over the weekend, all five MGRs failed, which means we have no
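For what it's worth, the cron stopgap mentioned above could be as simple as the following on a systemd-managed node (purely illustrative; it works around the symptom, not the cause):

# /etc/cron.d/ceph-mgr-restart (hypothetical): restart the local mgr every six hours
0 */6 * * * root /bin/systemctl restart ceph-mgr.target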

[ceph-users] Re: Full OSD's on cephfs_metadata pool

2020-03-19 Thread Robert Ruge
Derek, you are my champion. Your instructions were spot on and so timely. Thank you so much for posting them for those who follow in your footsteps. How do I tell if my MDS was behind on log trimming? I didn't see any health messages to that effect. I think my NVMe OSDs were too small for t
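On the trimming question, one way to check is via the cluster health output and the MDS perf counters (a sketch; the MDS name is a placeholder):

# a warning along the lines of "MDSs behind on trimming" would show up here
ceph health detail
# on the MDS host: compare num_segments in the mds_log counters against the trim target
ceph daemon mds.<name> perf dump mds_log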