This is a tangent on Paul Emmerich's response to "[ceph-users] Correct
Migration Workflow Replicated -> Erasure Code". I've tried Paul's method
before to migrate between two data pools, but I ran into some issues.

The first issue looks like a bug in RGW: the RGW for the new zone was
able to pull data directly from the data pool of the original zone once
the metadata had been synced. Because the metadata showed the object as
existing, the new zone's RGW simply fetched it from the pool backing the
other zone. I partially worked around that by using cephx caps to
restrict which pools each zone's RGW can access, but that turns the
error into permission denied instead of file not found. This happens
both for buckets that are set not to replicate and for buckets that
failed to sync properly. It seems like a bit of a security concern,
though not a common situation at all.
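
For reference, the cephx restriction was along these lines (a rough
sketch; the client name and pool names are placeholders for my actual
setup):

  # limit the cephx user the new zone's RGW runs as to its own zone's pools
  # (list every pool for that zone: data, index, non-ec, meta, log, control)
  ceph auth caps client.rgw.zone2 \
      mon 'allow rw' \
      osd 'allow rwx pool=zone2.rgw.buckets.data, allow rwx pool=zone2.rgw.buckets.index'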

The second issue, I think, comes down to corrupt index objects in my
index pool. Some of the buckets I no longer need, so I tried to delete
them for simplicity, but the delete command failed. I've set those aside
for now and can simply mark the ones I don't need as not replicating at
the bucket level. That works for most things, but there are a few
buckets I do need to migrate, and when I enable replication on them the
data sync between the zones gets stuck. Does anyone have ideas on how to
clean up the bucket indexes so these operations can succeed?
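
For concreteness, the operations I'm trying to get working are along
these lines (the bucket name is a placeholder):

  # inspect and optionally rebuild a bucket's index
  radosgw-admin bucket check --bucket=mybucket
  radosgw-admin bucket check --bucket=mybucket --check-objects --fix

  # the delete that currently fails for some buckets
  radosgw-admin bucket rm --bucket=mybucket --purge-objects

  # per-bucket control over multisite replication
  radosgw-admin bucket sync disable --bucket=mybucket
  radosgw-admin bucket sync enable --bucket=mybucket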

At this point I've disabled multisite and torn down the new zone so I
can run operations on these buckets without multisite and replication
getting in the way. I've tried a few things and can gather more details
on my specific errors tomorrow at work.
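
Tearing the zone down was roughly the following (the zonegroup and zone
names are placeholders):

  # remove the secondary zone from the zonegroup, delete it, and commit
  radosgw-admin zonegroup remove --rgw-zonegroup=us --rgw-zone=us-ec
  radosgw-admin zone delete --rgw-zone=us-ec
  radosgw-admin period update --commit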


---------- Forwarded message ---------
From: Paul Emmerich <paul.emmer...@croit.io>
Date: Wed, Oct 30, 2019 at 4:32 AM
Subject: [ceph-users] Re: Correct Migration Workflow Replicated -> Erasure
Code
To: Konstantin Shalygin <k0...@k0ste.ru>
Cc: Mac Wynkoop <mwynk...@netdepot.com>, ceph-users <ceph-us...@ceph.com>


We've solved this off-list (because I already got access to the cluster)

For the list:

Copying at the rados level is possible, but it requires shutting down
radosgw to get a consistent copy. That wasn't feasible here due to the
size and performance requirements.
We've instead added a second zone, whose placement maps to an EC pool,
to the zonegroup, and it's currently copying data over. We'll then make
the second zone master and default and ultimately delete the first one.
This allows for a migration without downtime.
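
Roughly, that maps to commands like these (the zone, zonegroup, pool,
and key names are placeholders, not the actual setup):

  # secondary zone whose default placement points at an EC data pool
  radosgw-admin zone create --rgw-zonegroup=us --rgw-zone=us-ec \
      --endpoints=http://rgw2:8080 --access-key=SYSKEY --secret=SYSSECRET
  radosgw-admin zone placement modify --rgw-zone=us-ec \
      --placement-id=default-placement --data-pool=us-ec.rgw.buckets.ec
  radosgw-admin period update --commit

  # once sync has caught up: promote the new zone, then retire the old one
  radosgw-admin zone modify --rgw-zone=us-ec --master --default
  radosgw-admin period update --commit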

Another possibility would be using a Transition lifecycle rule, but
that's not ideal because it doesn't actually change the bucket.
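
Such a rule would look roughly like this (the storage class is a
placeholder and has to be defined in the zonegroup/zone placement and
backed by the EC pool first):

  aws s3api put-bucket-lifecycle-configuration --bucket mybucket \
      --endpoint-url http://rgw:8080 \
      --lifecycle-configuration '{"Rules": [{"ID": "to-ec",
        "Status": "Enabled", "Filter": {"Prefix": ""},
        "Transitions": [{"Days": 1, "StorageClass": "EC_CLASS"}]}]}'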

I don't think it would be too complicated to add a native bucket
migration mechanism that works similarly to "bucket rewrite" (which is
intended for something similar but different).
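
For reference, that existing command is simply:

  radosgw-admin bucket rewrite --bucket=mybucket   # placeholder bucket name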

Paul

-- 
Paul Emmerich

Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90
