[ceph-users] Re: [EXTERNAL] RE: OSDs flapping with "_open_alloc loaded 132 GiB in 2930776 extents available 113 GiB"

2021-10-18 Thread Dave Piper
end of our troubles. Thanks very much again for all your help Igor. Kind regards, Dave -Original Message- From: Igor Fedotov Sent: 30 September 2021 17:03 To: Dave Piper ; ceph-users@ceph.io Subject: Re: [EXTERNAL] RE: [ceph-users] OSDs flapping with "_open_alloc loaded 132 GiB in

[ceph-users] Re: RGW pubsub deprecation

2021-10-13 Thread Dave Piper
Hi Yuval, We're using pubsub! We opted for pubsub over bucket notifications as the pull mode fits well with our requirements. 1) We want to be able to guarantee that our client (the external server) has received and processed each event. My initial understanding of bucket notifications was th
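
To make the pull-mode requirement concrete, below is a minimal sketch of the pull-and-ack loop we have in mind, assuming the REST endpoints documented for the octopus pubsub sync module (GET /subscriptions/<name>?events to fetch, POST /subscriptions/<name>?ack&event-id=... to acknowledge). The endpoint host, credentials, subscription name and response field names are placeholders rather than anything we've productised.

# Minimal pull-and-ack loop against the RGW pubsub REST API (octopus pubsub
# sync module). The endpoint host, credentials, subscription name and the
# response field names ("events", "is_truncated", "next_marker", "id") are
# assumptions based on the module docs and may need adjusting.
import requests
from requests_aws4auth import AWS4Auth

RGW = "http://pubsub-zone.example.com:8000"   # pubsub zone endpoint (placeholder)
AUTH = AWS4Auth("ACCESS_KEY", "SECRET_KEY", "default", "s3")

def process(event):
    # Hand the event to the external server here; raise on failure so the
    # event is not acked and gets redelivered on the next pull.
    print(event.get("event"), event.get("info", {}).get("key"))

def pull_and_ack(subscription, max_entries=100):
    marker = None
    while True:
        params = {"events": "", "max-entries": max_entries}
        if marker:
            params["marker"] = marker
        resp = requests.get(f"{RGW}/subscriptions/{subscription}",
                            params=params, auth=AUTH)
        resp.raise_for_status()
        body = resp.json()
        for event in body.get("events", []):
            process(event)
            # Ack only after successful processing: this is the delivery
            # guarantee that pull mode gives us over push notifications.
            requests.post(f"{RGW}/subscriptions/{subscription}",
                          params={"ack": "", "event-id": event["id"]},
                          auth=AUTH).raise_for_status()
        if str(body.get("is_truncated", "false")).lower() != "true":
            break
        marker = body.get("next_marker")

pull_and_ack("my-subscription")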

[ceph-users] Re: [EXTERNAL] RE: OSDs flapping with "_open_alloc loaded 132 GiB in 2930776 extents available 113 GiB"

2021-09-30 Thread Dave Piper
? We used ceph-ansible to deploy initially so I'm expecting to use the add-mon.yml playbooks to do this; I'll look into that. Cheers, Dave -Original Message- From: Igor Fedotov Sent: 29 September 2021 13:27 To: Dave Piper ; ceph-users@ceph.io Subject: Re: [EXTERNAL] RE: [ce

[ceph-users] Re: [EXTERNAL] RE: OSDs flapping with "_open_alloc loaded 132 GiB in 2930776 extents available 113 GiB"

2021-09-29 Thread Dave Piper
Some interesting updates on our end. This cluster (condor) is in a multisite RGW zonegroup with another cluster (albans). Albans is still on nautilus and was healthy back when we started this thread. As a last resort, we decided to destroy condor and recreate it, putting it back in the zonegro

[ceph-users] Re: [EXTERNAL] RE: OSDs flapping with "_open_alloc loaded 132 GiB in 2930776 extents available 113 GiB"

2021-09-21 Thread Dave Piper
I still can't find a way to get ceph-bluestore-tool working in my containerized deployment. As soon as the OSD daemon stops, the contents of /var/lib/ceph/osd/ceph- are unreachable. I've found this blog post that suggests changes to the container's entrypoint are required, but the proposed fix
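
For reference, the sort of entrypoint override I'm experimenting with looks roughly like the sketch below. It assumes a ceph-ansible style containerized deployment with podman; the image tag, OSD id and OSD fsid are placeholders (the fsid comes from ceph-volume lvm list), and I haven't verified this end to end yet.

# Sketch: run ceph-bluestore-tool against a stopped containerized OSD by
# launching a one-shot container that overrides the image entrypoint,
# re-activates the OSD's tmpfs data dir with ceph-volume, then runs the tool.
# Image tag, OSD id and OSD fsid are placeholders.
import subprocess

OSD_ID = "2"                                         # placeholder
OSD_FSID = "00000000-0000-0000-0000-000000000000"    # placeholder, see ceph-volume lvm list
IMAGE = "docker.io/ceph/daemon:latest-octopus"       # placeholder image tag

inner = (
    f"ceph-volume lvm activate --no-systemd {OSD_ID} {OSD_FSID} && "
    f"ceph-bluestore-tool bluefs-bdev-sizes --path /var/lib/ceph/osd/ceph-{OSD_ID}"
)

cmd = [
    "podman", "run", "--rm", "--privileged", "--net=host",
    "-v", "/dev:/dev",
    "-v", "/var/lib/ceph:/var/lib/ceph:z",
    "-v", "/etc/ceph:/etc/ceph:z",
    "--entrypoint", "bash",
    IMAGE, "-c", inner,
]

# The normal OSD container must be stopped first, otherwise the tool cannot
# take the bluestore lock.
print(subprocess.run(cmd, capture_output=True, text=True).stdout)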

[ceph-users] Re: [EXTERNAL] Re: OSDs flapping with "_open_alloc loaded 132 GiB in 2930776 extents available 113 GiB"

2021-09-20 Thread Dave Piper
Okay - I've finally got full debug logs from the flapping OSDs. The raw logs are both 100M each - I can email them directly if necessary. (Igor I've already sent these your way.) Both flapping OSDs are reporting the same "bluefs _allocate failed to allocate" errors as before. I've also noticed

[ceph-users] Re: [EXTERNAL] Re: OSDs flapping with "_open_alloc loaded 132 GiB in 2930776 extents available 113 GiB"

2021-09-08 Thread Dave Piper
We've started hitting this issue again, despite having bitmap allocator configured. The logs just before the crash look similar to before (pasted below). So perhaps this isn't a hybrid allocator issue after all? I'm still struggling to collect the full set of diags / run ceph-bluestore-tool c
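
For reference, switching allocators amounted to something like the sketch below. bluestore_allocator and bluefs_allocator are the real Octopus option names; the rest is placeholder, and each OSD has to be restarted before the change takes effect.

# Sketch: switch the BlueStore/BlueFS allocators to bitmap cluster-wide via
# the mon config store and read one value back. osd.0 is just a placeholder
# for the verification step; the new allocator is only picked up when each
# OSD is restarted.
import subprocess

def ceph(*args):
    return subprocess.run(["ceph", *args], check=True,
                          capture_output=True, text=True).stdout.strip()

ceph("config", "set", "osd", "bluestore_allocator", "bitmap")
ceph("config", "set", "osd", "bluefs_allocator", "bitmap")
print(ceph("config", "get", "osd.0", "bluestore_allocator"))   # expect: bitmap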

[ceph-users] Re: [EXTERNAL] Re: OSDs flapping with "_open_alloc loaded 132 GiB in 2930776 extents available 113 GiB"

2021-08-26 Thread Dave Piper
one? I'm nervous about switching from the default without understanding what that means. Dave -Original Message- From: Igor Fedotov Sent: 23 August 2021 14:22 To: Dave Piper ; ceph-users@ceph.io Subject: Re: [EXTERNAL] Re: [ceph-users] OSDs flapping with "_open_alloc loaded

[ceph-users] Re: [EXTERNAL] Re: OSDs flapping with "_open_alloc loaded 132 GiB in 2930776 extents available 113 GiB"

2021-08-20 Thread Dave Piper
the issue here; the container seems to be dying before we've had time to flush the log stream to file. I'll keep looking for a way around this. Dave -Original Message- From: Igor Fedotov Sent: 12 August 2021 13:36 To: Dave Piper ; ceph-users@ceph.io Subject: Re: [EXTERN
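
One workaround I'm considering is making the OSD write its log straight to a file rather than relying on the container's stderr stream, roughly as sketched below. This assumes /var/log/ceph is bind-mounted into the OSD container and that the container entrypoint isn't already pinning the logging options on its command line, since command-line flags would take precedence over the config store.

# Sketch: raise bluefs/bluestore debug levels and make the OSD log to a file
# via the mon config store, so the log survives the container dying before
# stderr is flushed. osd.2 is a placeholder id; drop the levels again once
# the crash has been captured.
import subprocess

for opt, val in [("debug_bluefs", "20/20"),
                 ("debug_bluestore", "20/20"),
                 ("log_to_file", "true"),
                 ("log_to_stderr", "false")]:
    subprocess.run(["ceph", "config", "set", "osd.2", opt, val], check=True)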

[ceph-users] Re: [EXTERNAL] Re: OSDs flapping with "_open_alloc loaded 132 GiB in 2930776 extents available 113 GiB"

2021-08-12 Thread Dave Piper
figured it out yet. Cheers again for all your help, Dave -Original Message- From: Igor Fedotov Sent: 26 July 2021 13:30 To: Dave Piper ; ceph-users@ceph.io Subject: Re: [EXTERNAL] Re: [ceph-users] OSDs flapping with "_open_alloc loaded 132 GiB in 2930776 extents available 113 GiB"

[ceph-users] Re: [EXTERNAL] Re: OSDs flapping with "_open_alloc loaded 132 GiB in 2930776 extents available 113 GiB"

2021-07-26 Thread Dave Piper
38c49cb) octopus (stable) Cheers, Dave -Original Message- From: Igor Fedotov Sent: 23 July 2021 20:45 To: Dave Piper ; ceph-users@ceph.io Subject: [EXTERNAL] Re: [ceph-users] OSDs flapping with "_open_alloc loaded 132 GiB in 2930776 extents available 113 GiB" Hi Dave, The

[ceph-users] Re: [EXTERNAL] Re: OSDs flapping with "_open_alloc loaded 132 GiB in 2930776 extents available 113 GiB"

2021-07-26 Thread Dave Piper
issue we've seen in isolation elsewhere. So - that's a big step forward! Should I retry with my original config on the latest octopus release and see if this is now fixed? Cheers again, Dave -Original Message- From: Igor Fedotov Sent: 26 July 2021 11:14 To: Dave Piper ; ceph-users@c

[ceph-users] OSDs flapping with "_open_alloc loaded 132 GiB in 2930776 extents available 113 GiB"

2021-07-23 Thread Dave Piper
Hi all, We've got a containerized test cluster with 3 OSDs and ~ 220GiB of data. Shortly after upgrading from nautilus -> octopus, 2 of the 3 OSDs have started flapping. I've also got alarms about the MDS being damaged, which we've seen elsewhere and have a recovery process for, but I'm unable