Responding to myself to follow up with what I found.
While going over the release notes for 14.2.3/14.2.4, I found that this was a
known problem that has already been fixed. Upgrading the cluster to 14.2.4
resolved the issue.
Bryan
> On Oct 30, 2019, at 10:33 AM, Bryan Stillwell wrote:
>
> This morn
So, I ended up checking all datalog shards with:
radosgw-admin data sync status --shard-id=XY --source-zone=us-east-1
and found one with a few hundred references to a bucket that had been
deleted.
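For reference, the check itself was just a loop over the shards, something like
this (assuming the default of 128 datalog shards; adjust for your
rgw_data_log_num_shards):

for shard in $(seq 0 127); do
    echo "=== shard $shard ==="
    radosgw-admin data sync status --shard-id=$shard --source-zone=us-east-1
done > data-sync-shards.txt

and then going through the output file by hand.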
I then shut down HAProxy on both ends and ran
radosgw-admin data sync init
This seeme
Not sure if this is related to the dirlisting issue, since the deep-scrubs
have always been way behind schedule.
But let's see if it has any effect on clearing this warning. It seems I can
only deep-scrub 5 PGs at a time, though. How can I increase this?
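In case it helps to see what I mean, this is roughly what I'm running per PG,
plus the knob I suspect is involved (guessing it's osd_max_scrubs, which caps
concurrent scrubs per OSD, default 1):

ceph pg deep-scrub <pgid>                            # what I trigger per PG
ceph config set osd osd_max_scrubs 2                 # Nautilus and later, persistent
ceph tell 'osd.*' injectargs '--osd_max_scrubs=2'    # runtime change on older releases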
On Wed, Oct 30, 2019 at 6:53 AM Lars Täuber wrote
This morning I noticed that on a new cluster the number of PGs for the
default.rgw.buckets.data pool was way too small (just 8 PGs), but when I try to
split the PGs the cluster doesn't do anything:
# ceph osd pool set default.rgw.buckets.data pg_num 16
set pool 13 pg_num to 16
It seems to set t
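A few things I'm planning to check next, in case one of them is the cause
(guessing at both the autoscaler and pgp_num, depending on the release):

ceph osd pool autoscale-status                                    # Nautilus: is the autoscaler managing this pool?
ceph osd pool set default.rgw.buckets.data pg_autoscale_mode off  # if so, stop it from overriding the manual value
ceph osd pool get default.rgw.buckets.data pgp_num                # pre-Nautilus: data only moves once pgp_num follows
ceph osd pool set default.rgw.buckets.data pgp_num 16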
RADOS import/export? Will look into that. I ended up going with Paul's
suggestion, and created a separate zone under the main zonegroup with
another RGW instance running that zone, and had it sync the data into the
erasure-coded zone. Made it very easy to do!
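Roughly what that looked like, from memory (zone and pool names and the
endpoint are placeholders, and I'm glossing over the system user and keys):

ceph osd pool create zone2.rgw.buckets.data 64 64 erasure
radosgw-admin zone create --rgw-zonegroup=default --rgw-zone=zone2 \
    --endpoints=http://rgw2.example.com:8080
radosgw-admin zone placement modify --rgw-zone=zone2 \
    --placement-id=default-placement \
    --data-pool=zone2.rgw.buckets.data
radosgw-admin period update --commit

Then start a radosgw instance with rgw_zone = zone2 and let data sync pull
everything into the EC-backed data pool.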
Thanks,
Mac
On Wed, Oct 30, 2019 at
Hey Dan,
We've got three RGWs with the following configuration:
- We're running 12.2.12 with civetweb.
- 3 RGWs behind HAProxy round-robin
- 32 GiB RAM (handles = 4, thread pool = 512)
- We run mon+mgr+rgw on the same hardware.
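In ceph.conf terms that's roughly the following (section name is just an
example, and I'm assuming "handles" maps to rgw_num_rados_handles):

[client.rgw.gateway1]
rgw frontends = civetweb port=7480 num_threads=512
rgw thread pool size = 512
rgw num rados handles = 4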
Looking at our grafana dashboards, I don't see us runnin
I am getting these since the Nautilus upgrade:
[Wed Oct 30 01:32:09 2019] ceph: build_snap_context 100020859dd 911cca33b800 fail -12
[Wed Oct 30 01:32:09 2019] ceph: build_snap_context 100020859d2 911d3eef5a00 fail -12
[Wed Oct 30 01:32:09 2019] ceph: build_snap_context 100020859d9 911
We've solved this off-list (because I already had access to the cluster).
For the list:
Copying at the RADOS level is possible, but it requires shutting down radosgw
to get a consistent copy. That wasn't feasible here due to the size and
performance requirements.
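For reference, the rados-level copy would have been something along these
lines (pool names are placeholders, and radosgw has to stay down for the
duration to keep the copy consistent):

rados -p default.rgw.buckets.data export /backup/buckets.data.dump
rados -p default.rgw.buckets.ec-data import /backup/buckets.data.dump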
We've instead added a second zone where the placement m