[ceph-users] Re: Revisit Large OMAP Objects

2021-04-14 Thread Konstantin Shalygin
Run reshard instances rm
And reshard your bucket by hand, or leave the dynamic resharding process to do this
work
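
For reference, a rough sketch of the commands being referred to here and later in
this thread (the bucket name and shard count are placeholders):

radosgw-admin reshard stale-instances list --yes-i-really-mean-it
radosgw-admin reshard stale-instances rm
radosgw-admin bucket reshard --bucket=<bucket> --num-shards=<N>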


k

Sent from my iPhone

> On 13 Apr 2021, at 19:33, dhils...@performair.com wrote:
> 
> All;
> 
> We run 2 Nautilus clusters, with RADOSGW replication (14.2.11 --> 14.2.16).
> 
> Initially our bucket grew very quickly, as I was loading old data into it and 
> we quickly ran into Large OMAP Object warnings.
> 
> I have since done a couple manual reshards, which has fixed the warning on 
> the primary cluster.  I have never been able to get rid of the issue on the 
> cluster with the replica.
> 
> A prior conversation on this list led me to this command:
> radosgw-admin reshard stale-instances list --yes-i-really-mean-it
> 
> The results of which look like this:
> [
>"nextcloud-ra:f91aeff8-a365-47b4-a1c8-928cd66134e8.185262.1",
>"nextcloud:f91aeff8-a365-47b4-a1c8-928cd66134e8.53761.6",
>"nextcloud:f91aeff8-a365-47b4-a1c8-928cd66134e8.53761.2",
>"nextcloud:f91aeff8-a365-47b4-a1c8-928cd66134e8.53761.5",
>"nextcloud:f91aeff8-a365-47b4-a1c8-928cd66134e8.53761.4",
>"nextcloud:f91aeff8-a365-47b4-a1c8-928cd66134e8.53761.3",
>"nextcloud:f91aeff8-a365-47b4-a1c8-928cd66134e8.53761.1",
>"3520ae821f974340afd018110c1065b8/OS 
> Development:f91aeff8-a365-47b4-a1c8-928cd66134e8.4298264.1",
>
> "10dfdfadb7374ea1ba37bee1435d87ad/volumebackups:f91aeff8-a365-47b4-a1c8-928cd66134e8.4298264.2",
>"WorkOrder:f91aeff8-a365-47b4-a1c8-928cd66134e8.44130.1"
> ]
> 
> I find this particularly interesting, as the nextcloud-ra, /OS 
> Development, /volumebackups, and WorkOrder buckets no longer exist.
> 
> When I run:
> for obj in $(rados -p 300.rgw.buckets.index ls | grep 
> f91aeff8-a365-47b4-a1c8-928cd66134e8.3512190.1);   do   printf "%-60s %7d\n" 
> $obj $(rados -p 300.rgw.buckets.index listomapkeys $obj | wc -l);   done
> 
> I get the expected 64 entries, with counts around 2 +/- 1000.
> 
> Are the above listed stale instances ok to delete?  If so, how do I go about 
> doing so?
> 
> Thank you,
> 
> Dominic L. Hilsbos, MBA 
> Director - Information Technology 
> Perform Air International Inc.
> dhils...@performair.com 
> www.PerformAir.com
> 
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Exporting CephFS using Samba preferred method

2021-04-14 Thread Konstantin Shalygin
Hi,

Actually vfs_ceph should perform better, but this method will not work together
with other VFS modules, like recycle bin or audit, in one stack
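
For reference, a minimal smb.conf share sketch using vfs_ceph (share name, CephX
user and paths are placeholders, not taken from this thread):

[cephfs]
    path = /
    vfs objects = ceph
    ceph:config_file = /etc/ceph/ceph.conf
    ceph:user_id = samba
    read only = no

Note that 'vfs objects' is a stacked list, which is where combining ceph with
modules such as recycle or full_audit gets awkward.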


k

Sent from my iPhone

> On 14 Apr 2021, at 09:56, Martin Palma  wrote:
> 
> Hello,
> 
> what is the currently preferred method, in terms of stability and
> performance, for exporting a CephFS directory with Samba?
> 
> - locally mount the CephFS directory and export it via Samba?
> - using the "vfs_ceph" module of Samba?
> 
> Best,
> Martin
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Exporting CephFS using Samba preferred method

2021-04-14 Thread Alexander Sporleder
Hello Konstantin,
In my experience the CephFS kernel driver (Ubuntu 20.04) was always
faster and the CPU load was much lower compared to vfs_ceph.

Alex
 

Am Mittwoch, dem 14.04.2021 um 10:19 +0300 schrieb Konstantin Shalygin:
> Hi,
> 
> Actually vfs_ceph should perform better, but this method will not
> work with another's vfs's, like recycle bin or audit, in one stack
> 
> 
> k
> 
> Sent from my iPhone
> 
> > On 14 Apr 2021, at 09:56, Martin Palma  wrote:
> > 
> > Hello,
> > 
> > what is the currently preferred method, in terms of stability and
> > performance, for exporting a CephFS directory with Samba?
> > 
> > - locally mount the CephFS directory and export it via Samba?
> > - using the "vfs_ceph" module of Samba?
> > 
> > Best,
> > Martin
> > ___
> > ceph-users mailing list -- ceph-users@ceph.io
> > To unsubscribe send an email to ceph-users-le...@ceph.io
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Exporting CephFS using Samba preferred method

2021-04-14 Thread Magnus HAGDORN
On Wed, 2021-04-14 at 08:55 +0200, Martin Palma wrote:
> Hello,
>
> what is the currently preferred method, in terms of stability and
> performance, for exporting a CephFS directory with Samba?
>
> - locally mount the CephFS directory and export it via Samba?
> - using the "vfs_ceph" module of Samba?
>

We use cephfs to serve files to both Linux and Windows machines. In
order to be able to get a consistent layout on both Windows and Linux
clients we automount cephfs on Linux clients and the Samba servers.
Samba serves the automounted files. This way absolute symbolic links
point to the same location on both systems.
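
A rough sketch of that layout with the kernel client (monitor addresses, CephX
user and paths are placeholders):

mount -t ceph mon1,mon2,mon3:/ /cephfs -o name=samba,secretfile=/etc/ceph/samba.secret

with the Samba share's 'path' then pointing at /cephfs (or an automounted
subdirectory of it).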
Regards
magnus
The University of Edinburgh is a charitable body, registered in Scotland, with 
registration number SC005336.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] DocuBetter Meeting This Week -- 1630 UTC

2021-04-14 Thread John Zachary Dover
This week's meeting will focus on the ongoing rewrite of the cephadm
documentation and the upcoming Google Season of Docs project.

Meeting: https://bluejeans.com/908675367
Etherpad: https://pad.ceph.com/p/Ceph_Documentation
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Abandon incomplete (damaged EC) pgs - How to manage the impact on cephfs?

2021-04-14 Thread Joshua West
Just working this through, how does one identify the OIDs within a PG,
without list_unfound?

I've been poking around, but can't seem to find a command that outputs
the necessary OIDs. I tried a handful of cephfs commands, but they of
course become stuck, and ceph pg commands haven't revealed the OID
yet.
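
For what it's worth, a hedged sketch of two ways to list the objects in a single
PG (OSD id, pool name and pgid are placeholders):

# offline, on an OSD that holds the PG, with that OSD stopped
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-<id> --pgid <pgid> --op list

# online, via the rados tool (the pgid must belong to the given pool)
rados -p <pool> ls --pgid <pgid>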

Joshua


Joshua West
President
403-456-0072
CAYK.ca


On Fri, Apr 9, 2021 at 12:15 PM Joshua West  wrote:
>
> Absolutely!
>
> Attached the files, they're not duplicate, but revised (as I tidied up
> what I could to make things easier)
>
> > Correct me if I'm wrong, but you are willing to throw away all of the data 
> > on this pool?
>
> Correct, if push comes to shove, I accept that data-loss is probable.
> If I can manage to save the data, I would definitely be okay with that
> too though.
>
> Still learning to program, but know python quite well. I am going to
> push off on a script to clean up per your previously noted steps in
> the language I know! But will hold off on unlinking everything for the
> moment.
>
> Thank you again for your time, your help has already been invaluable to me.
>
> Joshua
>
>
> Joshua West
> President
> 403-456-0072
> CAYK.ca
>
>
> On Fri, Apr 9, 2021 at 7:03 AM Michael Thomas  wrote:
> >
> > Hi Joshua,
> >
> > I'll dig into this output a bit more later, but here are my thoughts
> > right now.  I'll preface this by saying that I've never had to clean up
> > from unrecoverable incomplete PGs, so some of what I suggest may not
> > work/apply or be the ideal fix in your case.
> >
> > Correct me if I'm wrong, but you are willing to throw away all of the
> > data on this pool?  This should make it easier because we don't have to
> > worry about recovering any lost data.
> >
> > If this is the case, then I think the general strategy would be:
> >
> > 1) Identify and remove any files/directories in cephfs that are located
> > on this pool (based on ceph.file.layout.pool=claypool and
> > ceph.dir.layout.pool=claypool).  Use 'unlink' instead of 'rm' to remove
> > the files; it should be less prone to hanging.
> >
> > 2) Wait a bit for ceph to clean up any unreferenced objects.  Watch the
> > output of 'ceph df' to see how many objects are listed for the pool.
> >
> > 3) Use 'rados -p claypool ls' to identify the remaining objects.  Use
> > the OID identifier to calculate the inode number of each file, then
> > search cephfs to identify which files these belong to.  I would expect
> > it would be none, as you already deleted the files in step 1.
> >
> > 4) With nothing in the cephfs metadata referring to the objects anymore,
> > it should be safe to remove them with 'rados -p rm'.
> >
> > 5) Remove the now-empty pool from cephfs
> >
> > 6) Remove the now-empty pool from ceph
> >
> > Can you also include the output of 'ceph df'?
> >
> > --Mike
> >
> > On 4/9/21 7:31 AM, Joshua West wrote:
> > > Thank you Mike!
> > >
> > > This is honestly a way more detailed reply than I was expecting.
> > > You've equipped me with new tools to work with.  Thank you!
> > >
> > > I don't actually have any unfound pgs... only "incomplete" ones, which
> > > limits the usefulness of:
> > > `grep recovery_unfound`
> > > `ceph pg $pg list_unfound`
> > > `ceph pg $pg mark_unfound_lost delete`
> > >
> > > I don't seem to see equivalent commands for incomplete pgs, save for
> > > grep of course.
> > >
> > > This does make me slightly more hopeful that recovery might be
> > > possible if the pgs are incomplete and stuck, but not unfound..? Not
> > > going to get my hopes too high.
> > >
> > > Going to attach a few items just to keep from bugging me, if anyone
> > > can take a glance, it would be appreciated.
> > >
> > > In the meantime, in the absence of the above commands, what's the best
> > > way to clean this up under the assumption that the data is lost?
> > >
> > > ~Joshua
> > >
> > >
> > > Joshua West
> > > President
> > > 403-456-0072
> > > CAYK.ca
> > >
> > >
> > > On Thu, Apr 8, 2021 at 6:15 PM Michael Thomas  wrote:
> > >>
> > >> Hi Joshua,
> > >>
> > >> I have had a similar issue three different times on one of my cephfs
> > >> pools (15.2.10). The first time this happened I had lost some OSDs.  In
> > >> all cases I ended up with degraded PGs with unfound objects that could
> > >> not be recovered.
> > >>
> > >> Here's how I recovered from the situation.  Note that this will
> > >> permanently remove the affected files from ceph.  Restoring them from
> > >> backup is an excercise left to the reader.
> > >>
> > >> * Make a list of the affected PGs:
> > >> ceph pg dump_stuck  | grep recovery_unfound > pg.txt
> > >>
> > >> * Make a list of the affected objects (OIDs):
> > >> cat pg.txt | awk '{print $1}' | while read pg ; do echo $pg ; ceph pg
> > >> $pg list_unfound | jq '.objects[].oid.oid' ; done | sed -e 's/"//g' >
> > >> oid.txt
> > >>
> > >> * Convert the OID numbers to inodes using 'printf "%d\n" 0x${oid}' and
> > >> put the results in a file called 'inum.txt'
> > >>
> > >> * On a ceph c

[ceph-users] Re: Abandon incomplete (damaged EC) pgs - How to manage the impact on cephfs?

2021-04-14 Thread Joshua West
In addition to my last note, I should have mentioned that I am exploring
options to delete the damaged data, in hopes of preserving what I can
before moving on to simply deleting all data on that pool.

When trying to simply empty pgs, it seems like the pgs don't exist.

In attempting to follow:
https://medium.com/opsops/recovering-ceph-from-reduced-data-availability-3-pgs-inactive-3-pgs-incomplete-b97cbcb4b5a1
with regard to deleting pgs with zero objects/data, I receive:

#ceph pg ls incomplete

47.3ff  0  0  0  0  0  0  0  0  incomplete  3m  0'0  527856:6054  [7,9,2]p7  [7,9,2]p7 ...

#ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-7 --op info
--pgid 47.3ff
PG '47.3ff' not found

#ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-7 --op
remove --pgid 47.3ff --force
PG '47.3ff' not found

#ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-7 --op
mark-complete --pgid 47.3ff
PG '47.3ff' not found

# ceph osd force-create-pg 47.3ff --yes-i-really-mean-it
(worked, but I don't have the output handy) --> No change.


Since I am having troubles with this process, can't delete pgs, can't
get OIDs for incomplete pgs, I had the idea to start from the other
end:
Is there a method to determine which files are not stuck, to copy them
out, prior to deleting the whole pool?

As we know, `ls` becomes stuck, so what is the best way to get a list of
file paths + file names for cephfs?
My current plan is to get that list and then simply brute-force copy all
files, each copy in its own thread plus a timeout (see the sketch below).
Does this make sense?
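
A rough sketch of that brute-force approach (mount point, destination and timeout
are placeholders; the file listing itself may still hang on directories whose
objects live in the damaged pool):

# build the file list once; interrupt and trim it if it stalls
find /mnt/cephfs -type f > /tmp/cephfs_files.txt

# copy each file with its own timeout, several in parallel
xargs -a /tmp/cephfs_files.txt -d '\n' -I{} -P 8 \
    timeout 60 cp --parents -- {} /mnt/rescue/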


Joshua

On Wed, Apr 14, 2021 at 6:03 AM Joshua West  wrote:
>
> Just working this through, how does one identify the OIDs within a PG,
> without list_unfound?
>
> I've been poking around, but can't seem to find a command that outputs
> the necessary OIDs. I tried a handful of cephfs commands, but they of
> course become stuck, and ceph pg commands haven't revealed the OID
> yet.
>
> Joshua
>
>
> Joshua West
> President
> 403-456-0072
> CAYK.ca
>
>
> On Fri, Apr 9, 2021 at 12:15 PM Joshua West  wrote:
> >
> > Absolutely!
> >
> > Attached the files, they're not duplicate, but revised (as I tidied up
> > what I could to make things easier)
> >
> > > Correct me if I'm wrong, but you are willing to throw away all of the 
> > > data on this pool?
> >
> > Correct, if push comes to shove, I accept that data-loss is probable.
> > If I can manage to save the data, I would definitely be okay with that
> > too though.
> >
> > Still learning to program, but know python quite well. I am going to
> > push off on a script to clean up per your previously noted steps in
> > the language I know! But will hold off on unlinking everything for the
> > moment.
> >
> > Thank you again for your time, your help has already been invaluable to me.
> >
> > Joshua
> >
> >
> > Joshua West
> > President
> > 403-456-0072
> > CAYK.ca
> >
> >
> > On Fri, Apr 9, 2021 at 7:03 AM Michael Thomas  wrote:
> > >
> > > Hi Joshua,
> > >
> > > I'll dig into this output a bit more later, but here are my thoughts
> > > right now.  I'll preface this by saying that I've never had to clean up
> > > from unrecoverable incomplete PGs, so some of what I suggest may not
> > > work/apply or be the ideal fix in your case.
> > >
> > > Correct me if I'm wrong, but you are willing to throw away all of the
> > > data on this pool?  This should make it easier because we don't have to
> > > worry about recovering any lost data.
> > >
> > > If this is the case, then I think the general strategy would be:
> > >
> > > 1) Identify and remove any files/directories in cephfs that are located
> > > on this pool (based on ceph.file.layout.pool=claypool and
> > > ceph.dir.layout.pool=claypool).  Use 'unlink' instead of 'rm' to remove
> > > the files; it should be less prone to hanging.
> > >
> > > 2) Wait a bit for ceph to clean up any unreferenced objects.  Watch the
> > > output of 'ceph df' to see how many objects are listed for the pool.
> > >
> > > 3) Use 'rados -p claypool ls' to identify the remaining objects.  Use
> > > the OID identifier to calculate the inode number of each file, then
> > > search cephfs to identify which files these belong to.  I would expect
> > > it would be none, as you already deleted the files in step 1.
> > >
> > > 4) With nothing in the cephfs metadata referring to the objects anymore,
> > > it should be safe to remove them with 'rados -p rm'.
> > >
> > > 5) Remove the now-empty pool from cephfs
> > >
> > > 6) Remove the now-empty pool from ceph
> > >
> > > Can you also include the output of 'ceph df'?
> > >
> > > --Mike
> > >
> > > On 4/9/21 7:31 AM, Joshua West wrote:
> > > > Thank you Mike!
> > > >
> > > > This is honestly a way more detailed reply than I was expecting.
> > > > You've equipped me with new tools to work with.  Thank you!
> > > >
> > > > I don't actually have any unfound pgs... only "incomplete" ones, which
> > > > limits 

[ceph-users] Re: How to disable ceph-grafana during cephadm bootstrap

2021-04-14 Thread Sebastian Wagner
cephadm bootstrap --skip-monitoring-stack

should do the trick. See man cephadm

On Tue, Apr 13, 2021 at 6:05 PM mabi  wrote:

> Hello,
>
> When bootstrapping a new ceph Octopus cluster with "cephadm bootstrap",
> how can I tell the cephadm bootstrap NOT to install the ceph-grafana
> container?
>
> Thank you very much in advance for your answer.
>
> Best regards,
> Mabi
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Monitor dissapears/stopped after testing monitor-host loss and recovery

2021-04-14 Thread Kai Börnert

Hi,

I'm currently testing some disaster scenarios.

When removing one OSD/monitor host, I see that a new quorum is built 
without the missing host. The missing host is listed in the dashboard 
under Not In Quorum, so probably everything is as expected.


After restarting the host, I see that the OSDs come back online and 
everything appears to be working; however, the quorum still consists of 
only two monitors.


Looking at the services, I can see that it is somehow stopped. Is this 
expected and I must start it manually somehow, or should it work? The 
whole cluster is deployed using cephadm (the node was the initial 
bootstrap one, if that is important)
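
For reference, one way to inspect and manually start the stopped mon on a
cephadm-managed cluster might be (daemon name, host and fsid are placeholders):

ceph orch ps
ceph orch daemon start mon.<hostname>
# or directly via systemd on that host:
systemctl start ceph-<fsid>@mon.<hostname>.service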



Greetings, Kai

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Revisit Large OMAP Objects

2021-04-14 Thread DHilsbos
Konstantin;

Dynamic resharding is disabled in multisite environments.

I believe you mean radosgw-admin reshard stale-instances rm.

Documentation suggests this shouldn't be run in a multisite environment.  Does 
anyone know the reason for this?

Is it, in fact, safe, even in a multisite environment?

Thank you,

Dominic L. Hilsbos, MBA 
Director – Information Technology 
Perform Air International Inc.
dhils...@performair.com 
www.PerformAir.com


-Original Message-
From: Konstantin Shalygin [mailto:k0...@k0ste.ru] 
Sent: Wednesday, April 14, 2021 12:15 AM
To: Dominic Hilsbos
Cc: ceph-users@ceph.io
Subject: Re: [ceph-users] Revisit Large OMAP Objects

Run reshard instances rm
And reshard your bucket by hand or leave dynamic resharding process to do this 
work


k

Sent from my iPhone

> On 13 Apr 2021, at 19:33, dhils...@performair.com wrote:
> 
> All;
> 
> We run 2 Nautilus clusters, with RADOSGW replication (14.2.11 --> 14.2.16).
> 
> Initially our bucket grew very quickly, as I was loading old data into it and 
> we quickly ran into Large OMAP Object warnings.
> 
> I have since done a couple manual reshards, which has fixed the warning on 
> the primary cluster.  I have never been able to get rid of the issue on the 
> cluster with the replica.
> 
> I prior conversation on this list led me to this command:
> radosgw-admin reshard stale-instances list --yes-i-really-mean-it
> 
> The results of which look like this:
> [
>"nextcloud-ra:f91aeff8-a365-47b4-a1c8-928cd66134e8.185262.1",
>"nextcloud:f91aeff8-a365-47b4-a1c8-928cd66134e8.53761.6",
>"nextcloud:f91aeff8-a365-47b4-a1c8-928cd66134e8.53761.2",
>"nextcloud:f91aeff8-a365-47b4-a1c8-928cd66134e8.53761.5",
>"nextcloud:f91aeff8-a365-47b4-a1c8-928cd66134e8.53761.4",
>"nextcloud:f91aeff8-a365-47b4-a1c8-928cd66134e8.53761.3",
>"nextcloud:f91aeff8-a365-47b4-a1c8-928cd66134e8.53761.1",
>"3520ae821f974340afd018110c1065b8/OS 
> Development:f91aeff8-a365-47b4-a1c8-928cd66134e8.4298264.1",
>
> "10dfdfadb7374ea1ba37bee1435d87ad/volumebackups:f91aeff8-a365-47b4-a1c8-928cd66134e8.4298264.2",
>"WorkOrder:f91aeff8-a365-47b4-a1c8-928cd66134e8.44130.1"
> ]
> 
> I find this particularly interesting, as nextcloud-ra, /OS 
> Development, /volumbackups, and WorkOrder buckets no longer exist.
> 
> When I run:
> for obj in $(rados -p 300.rgw.buckets.index ls | grep 
> f91aeff8-a365-47b4-a1c8-928cd66134e8.3512190.1);   do   printf "%-60s %7d\n" 
> $obj $(rados -p 300.rgw.buckets.index listomapkeys $obj | wc -l);   done
> 
> I get the expected 64 entries, with counts around 2 +/- 1000.
> 
> Are the above listed stale instances ok to delete?  If so, how do I go about 
> doing so?
> 
> Thank you,
> 
> Dominic L. Hilsbos, MBA 
> Director - Information Technology 
> Perform Air International Inc.
> dhils...@performair.com 
> www.PerformAir.com
> 
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Revisit Large OMAP Objects

2021-04-14 Thread Casey Bodley
On Wed, Apr 14, 2021 at 11:44 AM  wrote:
>
> Konstantin;
>
> Dynamic resharding is disabled in multisite environments.
>
> I believe you mean radosgw-admin reshard stale-instances rm.
>
> Documentation suggests this shouldn't be run in a multisite environment.  
> Does anyone know the reason for this?

say there's a bucket with 10 objects in it, and that's been fully
replicated to a secondary zone. if you want to remove the bucket, you
delete its objects then delete the bucket

when the bucket is deleted, rgw can't delete its bucket instance yet
because the secondary zone may not be caught up with sync - it
requires access to the bucket instance (and its index) to sync those
last 10 object deletions

so the risk with 'stales-instances rm' in multisite is that you might
delete instances before other zones catch up, which can lead to
orphaned objects
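
as a practical check before any cleanup, something like the following can show
whether the other zones have caught up (bucket name is a placeholder):

radosgw-admin sync status
radosgw-admin bucket sync status --bucket=<bucket>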

>
> Is it, in fact, safe, even in a multisite environment?
>
> Thank you,
>
> Dominic L. Hilsbos, MBA
> Director – Information Technology
> Perform Air International Inc.
> dhils...@performair.com
> www.PerformAir.com
>
>
> -Original Message-
> From: Konstantin Shalygin [mailto:k0...@k0ste.ru]
> Sent: Wednesday, April 14, 2021 12:15 AM
> To: Dominic Hilsbos
> Cc: ceph-users@ceph.io
> Subject: Re: [ceph-users] Revisit Large OMAP Objects
>
> Run reshard instances rm
> And reshard your bucket by hand or leave dynamic resharding process to do 
> this work
>
>
> k
>
> Sent from my iPhone
>
> > On 13 Apr 2021, at 19:33, dhils...@performair.com wrote:
> >
> > All;
> >
> > We run 2 Nautilus clusters, with RADOSGW replication (14.2.11 --> 14.2.16).
> >
> > Initially our bucket grew very quickly, as I was loading old data into it 
> > and we quickly ran into Large OMAP Object warnings.
> >
> > I have since done a couple manual reshards, which has fixed the warning on 
> > the primary cluster.  I have never been able to get rid of the issue on the 
> > cluster with the replica.
> >
> > I prior conversation on this list led me to this command:
> > radosgw-admin reshard stale-instances list --yes-i-really-mean-it
> >
> > The results of which look like this:
> > [
> >"nextcloud-ra:f91aeff8-a365-47b4-a1c8-928cd66134e8.185262.1",
> >"nextcloud:f91aeff8-a365-47b4-a1c8-928cd66134e8.53761.6",
> >"nextcloud:f91aeff8-a365-47b4-a1c8-928cd66134e8.53761.2",
> >"nextcloud:f91aeff8-a365-47b4-a1c8-928cd66134e8.53761.5",
> >"nextcloud:f91aeff8-a365-47b4-a1c8-928cd66134e8.53761.4",
> >"nextcloud:f91aeff8-a365-47b4-a1c8-928cd66134e8.53761.3",
> >"nextcloud:f91aeff8-a365-47b4-a1c8-928cd66134e8.53761.1",
> >"3520ae821f974340afd018110c1065b8/OS 
> > Development:f91aeff8-a365-47b4-a1c8-928cd66134e8.4298264.1",
> >
> > "10dfdfadb7374ea1ba37bee1435d87ad/volumebackups:f91aeff8-a365-47b4-a1c8-928cd66134e8.4298264.2",
> >"WorkOrder:f91aeff8-a365-47b4-a1c8-928cd66134e8.44130.1"
> > ]
> >
> > I find this particularly interesting, as nextcloud-ra, /OS 
> > Development, /volumbackups, and WorkOrder buckets no longer exist.
> >
> > When I run:
> > for obj in $(rados -p 300.rgw.buckets.index ls | grep 
> > f91aeff8-a365-47b4-a1c8-928cd66134e8.3512190.1);   do   printf "%-60s 
> > %7d\n" $obj $(rados -p 300.rgw.buckets.index listomapkeys $obj | wc -l);   
> > done
> >
> > I get the expected 64 entries, with counts around 2 +/- 1000.
> >
> > Are the above listed stale instances ok to delete?  If so, how do I go 
> > about doing so?
> >
> > Thank you,
> >
> > Dominic L. Hilsbos, MBA
> > Director - Information Technology
> > Perform Air International Inc.
> > dhils...@performair.com
> > www.PerformAir.com
> >
> > ___
> > ceph-users mailing list -- ceph-users@ceph.io
> > To unsubscribe send an email to ceph-users-le...@ceph.io
>
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Cephadm upgrade to Pacific problem

2021-04-14 Thread Radoslav Milanov

Hello,

Cluster is 3 nodes Debian 10. Started cephadm upgrade on healthy 15.2.10 
cluster. Managers were upgraded fine then first monitor went down for 
upgrade and never came back. Researching at the unit files container 
fails to run because of an error:


root@host1:/var/lib/ceph/97d9f40e-9d33-11eb-8e3f-1c34da4b9fb6/mon.host1# 
cat unit.run


set -e
/usr/bin/install -d -m0770 -o 167 -g 167 
/var/run/ceph/97d9f40e-9d33-11eb-8e3f-1c34da4b9fb6

# mon.host1
! /usr/bin/docker rm -f 
ceph-97d9f40e-9d33-11eb-8e3f-1c34da4b9fb6-mon.host1 2> /dev/null
/usr/bin/docker run --rm --ipc=host --net=host --entrypoint 
/usr/bin/ceph-mon --privileged --group-add=disk --init --name 
ceph-97d9f40e-9d33-11eb-8e3f-1c34da4b9fb6-mon.host1 -e 
CONTAINER_IMAGE=ceph/ceph@sha256:9b04c0f15704c49591640a37c7adfd40ffad0a4b42fecb950c3407687cb4f29a 
-e NODE_NAME=host1 -e CEPH_USE_RANDOM_NONCE=1 -v 
/var/run/ceph/97d9f40e-9d33-11eb-8e3f-1c34da4b9fb6:/var/run/ceph:z -v 
/var/log/ceph/97d9f40e-9d33-11eb-8e3f-1c34da4b9fb6:/var/log/ceph:z -v 
/var/lib/ceph/97d9f40e-9d33-11eb-8e3f-1c34da4b9fb6/crash:/var/lib/ceph/crash:z 
-v 
/var/lib/ceph/97d9f40e-9d33-11eb-8e3f-1c34da4b9fb6/mon.host1:/var/lib/ceph/mon/ceph-host1:z 
-v 
/var/lib/ceph/97d9f40e-9d33-11eb-8e3f-1c34da4b9fb6/mon.host1/config:/etc/ceph/ceph.conf:z 
-v /dev:/dev -v /run/udev:/run/udev 
ceph/ceph@sha256:9b04c0f15704c49591640a37c7adfd40ffad0a4b42fecb950c3407687cb4f29a 
-n mon.host1 -f --setuser ceph --setgroup ceph 
--default-log-to-file=false --default-log-to-stderr=true 
'--default-log-stderr-prefix=debug ' 
--default-mon-cluster-log-to-file=false 
--default-mon-cluster-log-to-stderr=true


root@host1:/var/lib/ceph/97d9f40e-9d33-11eb-8e3f-1c34da4b9fb6/mon.host1# 
/usr/bin/docker run --rm --ipc=host --net=host --entrypoint 
/usr/bin/ceph-mon --privileged --group-add=disk --init --name 
ceph-97d9f40e-9d33-11eb-8e3f-1c34da4b9fb6-mon.host1 -e 
CONTAINER_IMAGE=ceph/ceph@sha256:9b04c0f15704c49591640a37c7adfd40ffad0a4b42fecb950c3407687cb4f29a 
-e NODE_NAME=host1 -e CEPH_USE_RANDOM_NONCE=1 -v 
/var/run/ceph/97d9f40e-9d33-11eb-8e3f-1c34da4b9fb6:/var/run/ceph:z -v 
/var/log/ceph/97d9f40e-9d33-11eb-8e3f-1c34da4b9fb6:/var/log/ceph:z -v 
/var/lib/ceph/97d9f40e-9d33-11eb-8e3f-1c34da4b9fb6/crash:/var/lib/ceph/crash:z 
-v 
/var/lib/ceph/97d9f40e-9d33-11eb-8e3f-1c34da4b9fb6/mon.host1:/var/lib/ceph/mon/ceph-host1:z 
-v 
/var/lib/ceph/97d9f40e-9d33-11eb-8e3f-1c34da4b9fb6/mon.host1/config:/etc/ceph/ceph.conf:z 
-v /dev:/dev -v /run/udev:/run/udev 
ceph/ceph@sha256:9b04c0f15704c49591640a37c7adfd40ffad0a4b42fecb950c3407687cb4f29a 
-n mon.host1 -f --setuser ceph --setgroup ceph 
--default-log-to-file=false --default-log-to-stderr=true 
'--default-log-stderr-prefix=debug ' 
--default-mon-cluster-log-to-file=false 
--default-mon-cluster-log-to-stderr=true



/usr/bin/docker: Error response from daemon: OCI runtime create failed: 
container_linux.go:344: starting container process caused "exec: 
\"/dev/init\": stat /dev/init: no such file or directory": unknown.


Any suggestions how to resolve that ?

Thank you.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Revisit Large OMAP Objects

2021-04-14 Thread DHilsbos
Casey;

That makes sense, and I appreciate the explanation.

If I were to shut down all uses of RGW, and wait for replication to catch up, 
would this then address most known issues with running this command in a 
multi-site environment?  Can I offline RADOSGW daemons as an added precaution?

Thank you,

Dominic L. Hilsbos, MBA 
Director – Information Technology 
Perform Air International Inc.
dhils...@performair.com 
www.PerformAir.com


-Original Message-
From: Casey Bodley [mailto:cbod...@redhat.com] 
Sent: Wednesday, April 14, 2021 9:03 AM
To: Dominic Hilsbos
Cc: k0...@k0ste.ru; ceph-users@ceph.io
Subject: Re: [ceph-users] Re: Revisit Large OMAP Objects

On Wed, Apr 14, 2021 at 11:44 AM  wrote:
>
> Konstantin;
>
> Dynamic resharding is disabled in multisite environments.
>
> I believe you mean radosgw-admin reshard stale-instances rm.
>
> Documentation suggests this shouldn't be run in a multisite environment.  
> Does anyone know the reason for this?

say there's a bucket with 10 objects in it, and that's been fully
replicated to a secondary zone. if you want to remove the bucket, you
delete its objects then delete the bucket

when the bucket is deleted, rgw can't delete its bucket instance yet
because the secondary zone may not be caught up with sync - it
requires access to the bucket instance (and its index) to sync those
last 10 object deletions

so the risk with 'stales-instances rm' in multisite is that you might
delete instances before other zones catch up, which can lead to
orphaned objects

>
> Is it, in fact, safe, even in a multisite environment?
>
> Thank you,
>
> Dominic L. Hilsbos, MBA
> Director – Information Technology
> Perform Air International Inc.
> dhils...@performair.com
> www.PerformAir.com
>
>
> -Original Message-
> From: Konstantin Shalygin [mailto:k0...@k0ste.ru]
> Sent: Wednesday, April 14, 2021 12:15 AM
> To: Dominic Hilsbos
> Cc: ceph-users@ceph.io
> Subject: Re: [ceph-users] Revisit Large OMAP Objects
>
> Run reshard instances rm
> And reshard your bucket by hand or leave dynamic resharding process to do 
> this work
>
>
> k
>
> Sent from my iPhone
>
> > On 13 Apr 2021, at 19:33, dhils...@performair.com wrote:
> >
> > All;
> >
> > We run 2 Nautilus clusters, with RADOSGW replication (14.2.11 --> 14.2.16).
> >
> > Initially our bucket grew very quickly, as I was loading old data into it 
> > and we quickly ran into Large OMAP Object warnings.
> >
> > I have since done a couple manual reshards, which has fixed the warning on 
> > the primary cluster.  I have never been able to get rid of the issue on the 
> > cluster with the replica.
> >
> > I prior conversation on this list led me to this command:
> > radosgw-admin reshard stale-instances list --yes-i-really-mean-it
> >
> > The results of which look like this:
> > [
> >"nextcloud-ra:f91aeff8-a365-47b4-a1c8-928cd66134e8.185262.1",
> >"nextcloud:f91aeff8-a365-47b4-a1c8-928cd66134e8.53761.6",
> >"nextcloud:f91aeff8-a365-47b4-a1c8-928cd66134e8.53761.2",
> >"nextcloud:f91aeff8-a365-47b4-a1c8-928cd66134e8.53761.5",
> >"nextcloud:f91aeff8-a365-47b4-a1c8-928cd66134e8.53761.4",
> >"nextcloud:f91aeff8-a365-47b4-a1c8-928cd66134e8.53761.3",
> >"nextcloud:f91aeff8-a365-47b4-a1c8-928cd66134e8.53761.1",
> >"3520ae821f974340afd018110c1065b8/OS 
> > Development:f91aeff8-a365-47b4-a1c8-928cd66134e8.4298264.1",
> >
> > "10dfdfadb7374ea1ba37bee1435d87ad/volumebackups:f91aeff8-a365-47b4-a1c8-928cd66134e8.4298264.2",
> >"WorkOrder:f91aeff8-a365-47b4-a1c8-928cd66134e8.44130.1"
> > ]
> >
> > I find this particularly interesting, as nextcloud-ra, /OS 
> > Development, /volumbackups, and WorkOrder buckets no longer exist.
> >
> > When I run:
> > for obj in $(rados -p 300.rgw.buckets.index ls | grep 
> > f91aeff8-a365-47b4-a1c8-928cd66134e8.3512190.1);   do   printf "%-60s 
> > %7d\n" $obj $(rados -p 300.rgw.buckets.index listomapkeys $obj | wc -l);   
> > done
> >
> > I get the expected 64 entries, with counts around 2 +/- 1000.
> >
> > Are the above listed stale instances ok to delete?  If so, how do I go 
> > about doing so?
> >
> > Thank you,
> >
> > Dominic L. Hilsbos, MBA
> > Director - Information Technology
> > Perform Air International Inc.
> > dhils...@performair.com
> > www.PerformAir.com
> >
> > ___
> > ceph-users mailing list -- ceph-users@ceph.io
> > To unsubscribe send an email to ceph-users-le...@ceph.io
>
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] _delete_some new onodes has appeared since PG removal started

2021-04-14 Thread Dan van der Ster
Hi Igor,

After updating to 14.2.19 and then moving some PGs around we have a
few warnings related to the new efficient PG removal code, e.g. [1].
Is that something to worry about?

Best Regards,

Dan

[1]

/var/log/ceph/ceph-osd.792.log:2021-04-14 20:34:34.353 7fb2439d4700  0
osd.792 pg_epoch: 40906 pg[10.14b2s0( v 40734'290069
(33782'287000,40734'290069] lb MIN (bitwise) local-lis/les=33990/33991
n=36272 ec=4951/4937 lis/c 33990/33716 les/c/f 33991/33747/0
40813/40813/37166) [933,626,260,804,503,491]p933(0) r=-1 lpr=40813
DELETING pi=[33716,40813)/4 crt=40734'290069 unknown NOTIFY mbc={}]
_delete_some additional unexpected onode list (new onodes has appeared
since PG removal started[0#10:4d28head#]

/var/log/ceph/ceph-osd.851.log:2021-04-14 18:40:13.312 7fd87bded700  0
osd.851 pg_epoch: 40671 pg[10.133fs5( v 40662'288967
(33782'285900,40662'288967] lb MIN (bitwise) local-lis/les=33786/33787
n=13 ec=4947/4937 lis/c 40498/33714 les/c/f 40499/33747/0
40670/40670/33432) [859,199,913,329,439,79]p859(0) r=-1 lpr=40670
DELETING pi=[33714,40670)/4 crt=40662'288967 unknown NOTIFY mbc={}]
_delete_some additional unexpected onode list (new onodes has appeared
since PG removal started[5#10:fcc8head#]

/var/log/ceph/ceph-osd.851.log:2021-04-14 20:58:14.393 7fd87adeb700  0
osd.851 pg_epoch: 40906 pg[10.2e8s3( v 40610'288991
(33782'285900,40610'288991] lb MIN (bitwise) local-lis/les=33786/33787
n=161220 ec=4937/4937 lis/c 39826/33716 les/c/f 39827/33747/0
40617/40617/39225) [717,933,727,792,607,129]p717(0) r=-1 lpr=40617
DELETING pi=[33716,40617)/3 crt=40610'288991 unknown NOTIFY mbc={}]
_delete_some additional unexpected onode list (new onodes has appeared
since PG removal started[3#10:1740head#]

/var/log/ceph/ceph-osd.883.log:2021-04-14 18:55:16.822 7f78c485d700  0
osd.883 pg_epoch: 40857 pg[7.d4( v 40804'9911289
(35835'9908201,40804'9911289] lb MIN (bitwise)
local-lis/les=40782/40783 n=195 ec=2063/1989 lis/c 40782/40782 les/c/f
40783/40844/0 40781/40845/40845) [877,870,894] r=-1 lpr=40845 DELETING
pi=[40782,40845)/1 crt=40804'9911289 lcod 40804'9911288 unknown NOTIFY
mbc={}] _delete_some additional unexpected onode list (new onodes has
appeared since PG removal started[#7:2b00head#]
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Ceph Month June 2021 Event

2021-04-14 Thread Mike Perez
Hi everyone,

In June 2021, we're hosting a month of Ceph presentations, lightning
talks, and unconference sessions such as BOFs. There is no
registration or cost to attend this event.

The CFP is now open until May 12th.

https://ceph.io/events/ceph-month-june-2021/cfp

Speakers will receive confirmation that their presentation is accepted
and further instructions for scheduling by May 16th.

The schedule will be available on May 19th.

Join the Ceph community as we discuss how Ceph, the massively
scalable, open-source, software-defined storage system, can radically
improve the economics and management of data storage for your
enterprise.

--
Mike Perez
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: [External Email] Cephadm upgrade to Pacific problem

2021-04-14 Thread Radoslav Milanov

Thanks for the pointer Dave,

in my case, though, the problem proved to be the old Docker version (18) provided 
by the OS repos. Installing the latest docker-ce from docker.com resolves the 
problem. It would be nice, though, if the host were checked for compatibility 
before starting an upgrade.
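
For anyone hitting the same, a rough sketch of switching to docker-ce on Debian 10
(buster), following the upstream Docker instructions (not specific to this thread):

apt-get remove docker docker.io containerd runc
curl -fsSL https://download.docker.com/linux/debian/gpg | apt-key add -
echo "deb [arch=amd64] https://download.docker.com/linux/debian buster stable" \
    > /etc/apt/sources.list.d/docker.list
apt-get update && apt-get install docker-ce docker-ce-cli containerd.io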




On 14.4.2021 at 13:15, Dave Hall wrote:

Radoslav,

I ran into the same.  For Debian 10 - recent updates - you have to add 
'cgroup_enable=memory swapaccount=1' to the kernel command line 
(/etc/default/grub).  The reference I found said that Debian 
decided to disable this by default and make us turn it on if we want 
to run containers.


-Dave

--
Dave Hall
Binghamton University
kdh...@binghamton.edu 

On Wed, Apr 14, 2021 at 12:51 PM Radoslav Milanov 
<radoslav.mila...@gmail.com> wrote:


Hello,

Cluster is 3 nodes Debian 10. Started cephadm upgrade on healthy
15.2.10
cluster. Managers were upgraded fine then first monitor went down for
upgrade and never came back. Researching at the unit files container
fails to run because of an error:

root@host1:/var/lib/ceph/97d9f40e-9d33-11eb-8e3f-1c34da4b9fb6/mon.host1#

cat unit.run

set -e
/usr/bin/install -d -m0770 -o 167 -g 167
/var/run/ceph/97d9f40e-9d33-11eb-8e3f-1c34da4b9fb6
# mon.host1
! /usr/bin/docker rm -f
ceph-97d9f40e-9d33-11eb-8e3f-1c34da4b9fb6-mon.host1 2> /dev/null
/usr/bin/docker run --rm --ipc=host --net=host --entrypoint
/usr/bin/ceph-mon --privileged --group-add=disk --init --name
ceph-97d9f40e-9d33-11eb-8e3f-1c34da4b9fb6-mon.host1 -e

CONTAINER_IMAGE=ceph/ceph@sha256:9b04c0f15704c49591640a37c7adfd40ffad0a4b42fecb950c3407687cb4f29a

-e NODE_NAME=host1 -e CEPH_USE_RANDOM_NONCE=1 -v
/var/run/ceph/97d9f40e-9d33-11eb-8e3f-1c34da4b9fb6:/var/run/ceph:z -v
/var/log/ceph/97d9f40e-9d33-11eb-8e3f-1c34da4b9fb6:/var/log/ceph:z -v

/var/lib/ceph/97d9f40e-9d33-11eb-8e3f-1c34da4b9fb6/crash:/var/lib/ceph/crash:z

-v

/var/lib/ceph/97d9f40e-9d33-11eb-8e3f-1c34da4b9fb6/mon.host1:/var/lib/ceph/mon/ceph-host1:z

-v

/var/lib/ceph/97d9f40e-9d33-11eb-8e3f-1c34da4b9fb6/mon.host1/config:/etc/ceph/ceph.conf:z

-v /dev:/dev -v /run/udev:/run/udev

ceph/ceph@sha256:9b04c0f15704c49591640a37c7adfd40ffad0a4b42fecb950c3407687cb4f29a

-n mon.host1 -f --setuser ceph --setgroup ceph
--default-log-to-file=false --default-log-to-stderr=true
'--default-log-stderr-prefix=debug '
--default-mon-cluster-log-to-file=false
--default-mon-cluster-log-to-stderr=true

root@host1:/var/lib/ceph/97d9f40e-9d33-11eb-8e3f-1c34da4b9fb6/mon.host1#

/usr/bin/docker run --rm --ipc=host --net=host --entrypoint
/usr/bin/ceph-mon --privileged --group-add=disk --init --name
ceph-97d9f40e-9d33-11eb-8e3f-1c34da4b9fb6-mon.host1 -e

CONTAINER_IMAGE=ceph/ceph@sha256:9b04c0f15704c49591640a37c7adfd40ffad0a4b42fecb950c3407687cb4f29a

-e NODE_NAME=host1 -e CEPH_USE_RANDOM_NONCE=1 -v
/var/run/ceph/97d9f40e-9d33-11eb-8e3f-1c34da4b9fb6:/var/run/ceph:z -v
/var/log/ceph/97d9f40e-9d33-11eb-8e3f-1c34da4b9fb6:/var/log/ceph:z -v

/var/lib/ceph/97d9f40e-9d33-11eb-8e3f-1c34da4b9fb6/crash:/var/lib/ceph/crash:z

-v

/var/lib/ceph/97d9f40e-9d33-11eb-8e3f-1c34da4b9fb6/mon.host1:/var/lib/ceph/mon/ceph-host1:z

-v

/var/lib/ceph/97d9f40e-9d33-11eb-8e3f-1c34da4b9fb6/mon.host1/config:/etc/ceph/ceph.conf:z

-v /dev:/dev -v /run/udev:/run/udev

ceph/ceph@sha256:9b04c0f15704c49591640a37c7adfd40ffad0a4b42fecb950c3407687cb4f29a

-n mon.host1 -f --setuser ceph --setgroup ceph
--default-log-to-file=false --default-log-to-stderr=true
'--default-log-stderr-prefix=debug '
--default-mon-cluster-log-to-file=false
--default-mon-cluster-log-to-stderr=true


/usr/bin/docker: Error response from daemon: OCI runtime create
failed:
container_linux.go:344: starting container process caused "exec:
\"/dev/init\": stat /dev/init: no such file or directory": unknown.

Any suggestions how to resolve that ?

Thank you.
___
ceph-users mailing list -- ceph-users@ceph.io

To unsubscribe send an email to ceph-users-le...@ceph.io



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: OSDs RocksDB corrupted when upgrading nautilus->octopus: unknown WriteBatch tag

2021-04-14 Thread Jorge Boncompte
	Hi, every OSD on an SSD that I have upgraded from 15.2.9->15.2.10 logs 
errors like the ones below. The OSDs on HDD or NVMe don't. But they 
restart OK and a deep-scrub of the entire pool finishes OK. Could it be 
the same bug?


2021-04-14T00:29:27.740+0200 7f364750d700  3 rocksdb: 
[table/block_based_table_reader.cc:1117] Encountered error while reading 
data from compression dictionary block Corruption: truncated block read 
from db/044714.sst offset 18446744073709551615, expected 4 bytes, got 0


2021-04-14T00:29:51.852+0200 7f364750d700  3 rocksdb: 
[table/block_based_table_reader.cc:1117] Encountered error while reading 
data from compression dictionary block Corruption: block checksum 
mismatch: expected 0, got 2326482265  in db/044743.sst offset 
18446744073709551615 size 18446744073709551615


Br.

On 12/4/21 at 18:15, Igor Fedotov wrote:
The workaround would be to disable bluestore_fsck_quick_fix_on_mount, do 
an upgrade and then do a regular fsck.


Depending on fsck  results either proceed with a repair or not.
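
A minimal sketch of that sequence (the OSD id is a placeholder; assumes the option
is set before the OSDs are restarted on the new version):

ceph config set osd bluestore_fsck_quick_fix_on_mount false
# ... upgrade and restart the OSDs ...
# then, per OSD, with the daemon stopped:
ceph-bluestore-tool fsck --path /var/lib/ceph/osd/ceph-<id>
# only if fsck reports issues:
ceph-bluestore-tool repair --path /var/lib/ceph/osd/ceph-<id>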


Thanks,

Igor


On 4/12/2021 6:35 PM, dhils...@performair.com wrote:
Is there a way to check for these zombie blobs, and other issues 
needing repair, prior to the upgrade?  That would allow us to know 
that issues might be coming, and perhaps address them before they 
result in corrupt OSDs.


I'm considering upgrading our clusters from 14 to 15, and would really 
like to avoid these kinds of issues.


Thank you,

Dominic L. Hilsbos, MBA
Director - Information Technology
Perform Air International Inc.
dhils...@performair.com
www.PerformAir.com

-Original Message-
From: Igor Fedotov [mailto:ifedo...@suse.de]
Sent: Monday, April 12, 2021 7:55 AM
To: ceph-users@ceph.io
Subject: [ceph-users] Re: OSDs RocksDB corrupted when upgrading 
nautilus->octopus: unknown WriteBatch tag


Sorry for being too late to the party...

I think the root cause is related to the high amount of repairs made
during the first post-upgrade fsck run.

The check (and fix) for zombie spanning blobs has been backported to
v15.2.9 (here is the PR https://github.com/ceph/ceph/pull/39256). And I
presume it's the one which causes BlueFS data corruption due to the huge
transaction happening during such a repair.

I haven't seen this exact issue (as having that many zombie blobs is a
rarely met bug by itself) but we had to some degree similar issue with
upgrading omap names, see: https://github.com/ceph/ceph/pull/39377

Huge resulting transaction could cause too big write to WAL which in
turn caused data corruption (see https://github.com/ceph/ceph/pull/39701)

Although the fix for the latter has been merged for 15.2.10 some
additional issues with huge transactions might still exist...


If someone can afford another OSD loss it could be interesting to get an
OSD log for such a repair with debug-bluefs set to 20...

I'm planning to make a fix to cap transaction size for repair in the
nearest future anyway though..


Thanks,

Igor


On 4/12/2021 5:15 PM, Dan van der Ster wrote:
Too bad. Let me continue trying to invoke Cunningham's Law for you 
... ;)


Have you excluded any possible hardware issues?

15.2.10 has a new option to check for all zero reads; maybe try it 
with true?


  Option("bluefs_check_for_zeros", Option::TYPE_BOOL, 
Option::LEVEL_DEV)

  .set_default(false)
  .set_flag(Option::FLAG_RUNTIME)
  .set_description("Check data read for suspicious pages")
  .set_long_description("Looks into data read to check if there is a
4K block entirely filled with zeros. "
  "If this happens, we re-read data. If there is
difference, we print error to log.")
  .add_see_also("bluestore_retry_disk_reads"),

The "fix zombie spanning blobs" feature was added in 15.2.9. Does
15.2.8 work for you?

Cheers, Dan

On Sun, Apr 11, 2021 at 10:17 PM Jonas Jelten  wrote:
Thanks for the idea, I've tried it with 1 thread, and it shredded 
another OSD.

I've updated the tracker ticket :)

At least non-racecondition bugs are hopefully easier to spot...

I wouldn't just disable the fsck and upgrade anyway until the cause 
is rooted out.


-- Jonas


On 29/03/2021 14.34, Dan van der Ster wrote:

Hi,

Saw that, looks scary!

I have no experience with that particular crash, but I was thinking
that if you have already backfilled the degraded PGs, and can afford
to try another OSD, you could try:

  "bluestore_fsck_quick_fix_threads": "1",  # because
https://github.com/facebook/rocksdb/issues/5068 showed a similar crash
and the dev said it occurs because WriteBatch is not thread safe.

  "bluestore_fsck_quick_fix_on_mount": "false", # should 
disable the

fsck during upgrade. See https://github.com/ceph/ceph/pull/40198

-- Dan

On Mon, Mar 29, 2021 at 2:23 PM Jonas Jelten  wrote:

Hi!

After upgrading MONs and MGRs successfully, the first OSD host I 
upgraded on Ubuntu Bionic from 14.2.16 to 15.2.10
shredded all OSDs on it by corrupting RocksDB, and they now refuse 
to boot.

Roc

[ceph-users] Re: [External Email] Cephadm upgrade to Pacific problem

2021-04-14 Thread Dave Hall
Radoslav,

I ran into the same.  For Debian 10 - recent updates - you have to add
'cgroup_enable=memory swapaccount=1' to the kernel command line
(/etc/default/grub).  The reference I found said that Debian decided to
disable this by default and make us turn it on if we want to run containers.
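
Roughly (on each host, as root; keep whatever options are already in
GRUB_CMDLINE_LINUX):

# /etc/default/grub
GRUB_CMDLINE_LINUX="... cgroup_enable=memory swapaccount=1"

update-grub
reboot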

-Dave

--
Dave Hall
Binghamton University
kdh...@binghamton.edu

On Wed, Apr 14, 2021 at 12:51 PM Radoslav Milanov <
radoslav.mila...@gmail.com> wrote:

> Hello,
>
> Cluster is 3 nodes Debian 10. Started cephadm upgrade on healthy 15.2.10
> cluster. Managers were upgraded fine then first monitor went down for
> upgrade and never came back. Researching at the unit files container
> fails to run because of an error:
>
> root@host1:/var/lib/ceph/97d9f40e-9d33-11eb-8e3f-1c34da4b9fb6/mon.host1#
> cat unit.run
>
> set -e
> /usr/bin/install -d -m0770 -o 167 -g 167
> /var/run/ceph/97d9f40e-9d33-11eb-8e3f-1c34da4b9fb6
> # mon.host1
> ! /usr/bin/docker rm -f
> ceph-97d9f40e-9d33-11eb-8e3f-1c34da4b9fb6-mon.host1 2> /dev/null
> /usr/bin/docker run --rm --ipc=host --net=host --entrypoint
> /usr/bin/ceph-mon --privileged --group-add=disk --init --name
> ceph-97d9f40e-9d33-11eb-8e3f-1c34da4b9fb6-mon.host1 -e
> CONTAINER_IMAGE=ceph/ceph@sha256:9b04c0f15704c49591640a37c7adfd40ffad0a4b42fecb950c3407687cb4f29a
>
> -e NODE_NAME=host1 -e CEPH_USE_RANDOM_NONCE=1 -v
> /var/run/ceph/97d9f40e-9d33-11eb-8e3f-1c34da4b9fb6:/var/run/ceph:z -v
> /var/log/ceph/97d9f40e-9d33-11eb-8e3f-1c34da4b9fb6:/var/log/ceph:z -v
> /var/lib/ceph/97d9f40e-9d33-11eb-8e3f-1c34da4b9fb6/crash:/var/lib/ceph/crash:z
>
> -v
> /var/lib/ceph/97d9f40e-9d33-11eb-8e3f-1c34da4b9fb6/mon.host1:/var/lib/ceph/mon/ceph-host1:z
>
> -v
> /var/lib/ceph/97d9f40e-9d33-11eb-8e3f-1c34da4b9fb6/mon.host1/config:/etc/ceph/ceph.conf:z
>
> -v /dev:/dev -v /run/udev:/run/udev
> ceph/ceph@sha256:9b04c0f15704c49591640a37c7adfd40ffad0a4b42fecb950c3407687cb4f29a
>
> -n mon.host1 -f --setuser ceph --setgroup ceph
> --default-log-to-file=false --default-log-to-stderr=true
> '--default-log-stderr-prefix=debug '
> --default-mon-cluster-log-to-file=false
> --default-mon-cluster-log-to-stderr=true
>
> root@host1:/var/lib/ceph/97d9f40e-9d33-11eb-8e3f-1c34da4b9fb6/mon.host1#
> /usr/bin/docker run --rm --ipc=host --net=host --entrypoint
> /usr/bin/ceph-mon --privileged --group-add=disk --init --name
> ceph-97d9f40e-9d33-11eb-8e3f-1c34da4b9fb6-mon.host1 -e
> CONTAINER_IMAGE=ceph/ceph@sha256:9b04c0f15704c49591640a37c7adfd40ffad0a4b42fecb950c3407687cb4f29a
>
> -e NODE_NAME=host1 -e CEPH_USE_RANDOM_NONCE=1 -v
> /var/run/ceph/97d9f40e-9d33-11eb-8e3f-1c34da4b9fb6:/var/run/ceph:z -v
> /var/log/ceph/97d9f40e-9d33-11eb-8e3f-1c34da4b9fb6:/var/log/ceph:z -v
> /var/lib/ceph/97d9f40e-9d33-11eb-8e3f-1c34da4b9fb6/crash:/var/lib/ceph/crash:z
>
> -v
> /var/lib/ceph/97d9f40e-9d33-11eb-8e3f-1c34da4b9fb6/mon.host1:/var/lib/ceph/mon/ceph-host1:z
>
> -v
> /var/lib/ceph/97d9f40e-9d33-11eb-8e3f-1c34da4b9fb6/mon.host1/config:/etc/ceph/ceph.conf:z
>
> -v /dev:/dev -v /run/udev:/run/udev
> ceph/ceph@sha256:9b04c0f15704c49591640a37c7adfd40ffad0a4b42fecb950c3407687cb4f29a
>
> -n mon.host1 -f --setuser ceph --setgroup ceph
> --default-log-to-file=false --default-log-to-stderr=true
> '--default-log-stderr-prefix=debug '
> --default-mon-cluster-log-to-file=false
> --default-mon-cluster-log-to-stderr=true
>
>
> /usr/bin/docker: Error response from daemon: OCI runtime create failed:
> container_linux.go:344: starting container process caused "exec:
> \"/dev/init\": stat /dev/init: no such file or directory": unknown.
>
> Any suggestions how to resolve that ?
>
> Thank you.
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: _delete_some new onodes has appeared since PG removal started

2021-04-14 Thread Igor Fedotov

Hi Dan,

Seen that once before and haven't thoroughly investigated yet but I 
think the new PG removal stuff just revealed this "issue". In fact it 
had been in the code before the patch.


The warning means that new object(s) (given the object names these are 
apparently system objects; I don't remember what this is exactly) have been 
written to a PG after it was staged for removal.


New PG removal properly handles that case - that was just a paranoid 
check for an unexpected situation which has actually triggered. Hence 
IMO no need to worry at this point but developers might want to validate 
why this is happening



Thanks,

Igor

On 4/14/2021 10:26 PM, Dan van der Ster wrote:

Hi Igor,

After updating to 14.2.19 and then moving some PGs around we have a
few warnings related to the new efficient PG removal code, e.g. [1].
Is that something to worry about?

Best Regards,

Dan

[1]

/var/log/ceph/ceph-osd.792.log:2021-04-14 20:34:34.353 7fb2439d4700  0
osd.792 pg_epoch: 40906 pg[10.14b2s0( v 40734'290069
(33782'287000,40734'290069] lb MIN (bitwise) local-lis/les=33990/33991
n=36272 ec=4951/4937 lis/c 33990/33716 les/c/f 33991/33747/0
40813/40813/37166) [933,626,260,804,503,491]p933(0) r=-1 lpr=40813
DELETING pi=[33716,40813)/4 crt=40734'290069 unknown NOTIFY mbc={}]
_delete_some additional unexpected onode list (new onodes has appeared
since PG removal started[0#10:4d28head#]

/var/log/ceph/ceph-osd.851.log:2021-04-14 18:40:13.312 7fd87bded700  0
osd.851 pg_epoch: 40671 pg[10.133fs5( v 40662'288967
(33782'285900,40662'288967] lb MIN (bitwise) local-lis/les=33786/33787
n=13 ec=4947/4937 lis/c 40498/33714 les/c/f 40499/33747/0
40670/40670/33432) [859,199,913,329,439,79]p859(0) r=-1 lpr=40670
DELETING pi=[33714,40670)/4 crt=40662'288967 unknown NOTIFY mbc={}]
_delete_some additional unexpected onode list (new onodes has appeared
since PG removal started[5#10:fcc8head#]

/var/log/ceph/ceph-osd.851.log:2021-04-14 20:58:14.393 7fd87adeb700  0
osd.851 pg_epoch: 40906 pg[10.2e8s3( v 40610'288991
(33782'285900,40610'288991] lb MIN (bitwise) local-lis/les=33786/33787
n=161220 ec=4937/4937 lis/c 39826/33716 les/c/f 39827/33747/0
40617/40617/39225) [717,933,727,792,607,129]p717(0) r=-1 lpr=40617
DELETING pi=[33716,40617)/3 crt=40610'288991 unknown NOTIFY mbc={}]
_delete_some additional unexpected onode list (new onodes has appeared
since PG removal started[3#10:1740head#]

/var/log/ceph/ceph-osd.883.log:2021-04-14 18:55:16.822 7f78c485d700  0
osd.883 pg_epoch: 40857 pg[7.d4( v 40804'9911289
(35835'9908201,40804'9911289] lb MIN (bitwise)
local-lis/les=40782/40783 n=195 ec=2063/1989 lis/c 40782/40782 les/c/f
40783/40844/0 40781/40845/40845) [877,870,894] r=-1 lpr=40845 DELETING
pi=[40782,40845)/1 crt=40804'9911289 lcod 40804'9911288 unknown NOTIFY
mbc={}] _delete_some additional unexpected onode list (new onodes has
appeared since PG removal started[#7:2b00head#]

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] ERROR: read_key_entry() idx= 1000_ ret=-2

2021-04-14 Thread by morphin
Hello everyone!

I'm running Nautilus 14.2.16 and I'm using RGW with the Beast frontend.
I see this error log on every SSD OSD which is used for the RGW index.
Can you please tell me what the problem is?

OSD LOG:
cls_rgw.cc:1102: ERROR: read_key_entry()
idx=�1000_matches/xdir/05/21/27260.jpg ret=-2
cls_rgw.cc:1102: ERROR: read_key_entry()
idx=�1000_matches/xdir/05/21/27253.jpg ret=-2


RADOSGW LOG:
2021-04-15 01:53:54.385 7f2e0f8e7700  1 beast: 0x55a4439f8710:
10.151.101.15 - - [2021-04-15 01:53:54.0.385327s] "HEAD
/xdir/04/13/704745.jpg HTTP/1.1" 200 0 - "aws-sdk-java/1.11.638
Linux/3.10.0-1062.12.1.el7.x86_64
Java_HotSpot(TM)_64-Bit_Server_VM/25.251-b08 java/1.8.0_251 groovy/2.4.3
vendor/Oracle_Corporation" -
2021-04-15 01:53:54.385 7f2d8b7df700  1 == starting new request
req=0x55a4439f8710 =
2021-04-15 01:53:54.405 7f2e008c9700  1 == req done req=0x55a43dbc6710
op status=0 http_status=204 latency=0.33s ==
2021-04-15 01:53:54.405 7f2e008c9700  1 beast: 0x55a43dbc6710:
10.151.101.15 - - [2021-04-15 01:53:54.0.405327s] "DELETE
/xdir/05/21/21586.gz HTTP/1.1" 204 0 - "aws-sdk-java/1.11.638
Linux/3.10.0-1062.12.1.el7.x86_64
Java_HotSpot(TM)_64-Bit_Server_VM/25.251-b08 java/1.8.0_251 groovy/2.4.3
vendor/Oracle_Corporation" -
2021-04-15 01:53:54.405 7f2d92fee700  1 == starting new request
req=0x55a43dbc6710 =
2021-04-15 01:53:54.405 7f2d92fee700  0 WARNING: couldn't find acl header
for object, generating default
2021-04-15 01:53:54.405 7f2d92fee700  1 == req done req=0x55a43dbc6710
op status=0 http_status=200 latency=0s ==
2021-04-15 01:53:54.405 7f2d92fee700  1 beast: 0x55a43dbc6710:
10.151.101.15 - - [2021-04-15 01:53:54.0.405327s] "HEAD
/xdir/2013/11/20/2a67508e-d7dd-4e0f-b959-d7575d5f65b1 HTTP/1.1" 200 0 -
"aws-sdk-java/1.11.638 Linux/3.10.0-1160.11.1.el7.x86_64
Java_HotSpot(TM)_64-Bit_Server_VM/25.281-b09 java/1.8.0_281 groovy/2.5.6
vendor/Oracle_Corporation" -


CEPH OSD DF
ID  CLASS WEIGHT   REWEIGHT SIZERAW USE DATAOMAPMETAAVAIL
%USE  VAR  PGS STATUS
 19   ssd  0.87320  1.0 894 GiB 436 GiB 101 GiB 332 GiB 2.5 GiB 458 GiB
48.75 1.84 115 up
208   ssd  0.87329  1.0 894 GiB 161 GiB  87 GiB  73 GiB 978 MiB 733 GiB
18.00 0.68 113 up
199   ssd  0.87320  1.0 894 GiB 272 GiB 106 GiB 163 GiB 2.4 GiB 623 GiB
30.37 1.14 123 up
202   ssd  0.87329  1.0 894 GiB 239 GiB  73 GiB 165 GiB 1.4 GiB 655 GiB
26.77 1.01 106 up
 39   ssd  0.87320  1.0 894 GiB 450 GiB  87 GiB 361 GiB 2.3 GiB 444 GiB
50.36 1.90 113 up
207   ssd  0.87329  1.0 894 GiB 204 GiB 100 GiB  98 GiB 6.0 GiB 691 GiB
22.76 0.86 118 up
 59   ssd  0.87320  1.0 894 GiB 372 GiB 107 GiB 263 GiB 3.0 GiB 522 GiB
41.64 1.57 122 up
203   ssd  0.87329  1.0 894 GiB 206 GiB  79 GiB 124 GiB 2.4 GiB 689 GiB
23.00 0.87 117 up
 79   ssd  0.87320  1.0 894 GiB 447 GiB 103 GiB 342 GiB 1.8 GiB 447 GiB
49.97 1.88 120 up
206   ssd  0.87329  1.0 894 GiB 200 GiB  81 GiB 119 GiB 1.0 GiB 694 GiB
22.38 0.84  94 up
 99   ssd  0.87320  1.0 894 GiB 333 GiB  87 GiB 244 GiB 2.0 GiB 562 GiB
37.19 1.40 106 up
205   ssd  0.87329  1.0 894 GiB 316 GiB  83 GiB 232 GiB 1.1 GiB 579 GiB
35.29 1.33 117 up
114   ssd  0.87329  1.0 894 GiB 256 GiB 100 GiB 154 GiB 1.7 GiB 638 GiB
28.61 1.08 113 up
200   ssd  0.87329  1.0 894 GiB 266 GiB 100 GiB 165 GiB 1.1 GiB 628 GiB
29.76 1.12 128 up
139   ssd  0.87320  1.0 894 GiB 234 GiB  79 GiB 153 GiB 1.7 GiB 660 GiB
26.14 0.98 104 up
204   ssd  0.87329  1.0 894 GiB 173 GiB 113 GiB  59 GiB 1.2 GiB 721 GiB
19.37 0.73 124 up
119   ssd  0.87329  1.0 894 GiB 248 GiB 108 GiB 139 GiB 1.9 GiB 646 GiB
27.76 1.05 130 up
159   ssd  0.87329  1.0 894 GiB 196 GiB  94 GiB  99 GiB 2.6 GiB 699 GiB
21.87 0.82 109 up
179   ssd  0.87329  1.0 894 GiB 427 GiB  81 GiB 341 GiB 4.7 GiB 467 GiB
47.73 1.80 114 up
201   ssd  0.87329  1.0 894 GiB 346 GiB 102 GiB 242 GiB 1.8 GiB 548 GiB
38.71 1.46 128 up

CEPH IOSTAT
+---+---+---+---+---+---+
|  Read | Write | Total | Read IOPS |Write
IOPS |Total IOPS |
+---+---+---+---+---+---+
| 329 MiB/s |  39 MiB/s | 368 MiB/s |109027 |
1646 |110673 |
| 329 MiB/s |  39 MiB/s | 368 MiB/s |109027 |
1646 |110673 |
| 331 MiB/s |  39 MiB/s | 371 MiB/s |114915 |
1631 |116547 |
| 331 MiB/s |  39 MiB/s | 371 MiB/s |114915 |
1631 |116547 |
| 308 MiB/s |  42 MiB/s | 350 MiB/s |108469 |
1635 |110104 |
| 308 MiB/s |  42 MiB/s | 350 MiB/s |108469 |
1635 |110104 |
| 291 MiB/s |  44 MiB/s | 335 MiB/s |105828 |
1687 |107516 |
___
ceph-users mailing list -- c

[ceph-users] Re: _delete_some new onodes has appeared since PG removal started

2021-04-14 Thread Neha Ojha
We saw this warning once in testing
(https://tracker.ceph.com/issues/49900#note-1), but there the problem
was different and also led to a crash. That issue has been fixed, but
if you can provide OSD logs with verbose logging, we might be able to
investigate further.
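
For example, verbosity on the affected OSD can be raised temporarily with
something like the commands below (osd.792 is taken from the log excerpt
quoted further down; the reset values assume the default 1/5 levels):

    ceph tell osd.792 config set debug_osd 20
    ceph tell osd.792 config set debug_bluestore 20
    # reproduce the PG removal, then collect /var/log/ceph/ceph-osd.792.log
    ceph tell osd.792 config set debug_osd 1/5
    ceph tell osd.792 config set debug_bluestore 1/5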

Neha

On Wed, Apr 14, 2021 at 4:14 PM Igor Fedotov  wrote:
>
> Hi Dan,
>
> I've seen that once before and haven't thoroughly investigated yet, but I
> think the new PG removal code just revealed this "issue". In fact it
> had been in the code before the patch.
>
> The warning means that new object(s) (given the object names these are
> apparently system objects; I don't remember exactly what they are) have been
> written to a PG after it was staged for removal.
>
> The new PG removal code properly handles that case - it was just a paranoid
> check for an unexpected situation which has actually triggered. Hence IMO
> there is no need to worry at this point, but developers might want to
> validate why this is happening.
>
>
> Thanks,
>
> Igor
>
> On 4/14/2021 10:26 PM, Dan van der Ster wrote:
> > Hi Igor,
> >
> > After updating to 14.2.19 and then moving some PGs around we have a
> > few warnings related to the new efficient PG removal code, e.g. [1].
> > Is that something to worry about?
> >
> > Best Regards,
> >
> > Dan
> >
> > [1]
> >
> > /var/log/ceph/ceph-osd.792.log:2021-04-14 20:34:34.353 7fb2439d4700  0
> > osd.792 pg_epoch: 40906 pg[10.14b2s0( v 40734'290069
> > (33782'287000,40734'290069] lb MIN (bitwise) local-lis/les=33990/33991
> > n=36272 ec=4951/4937 lis/c 33990/33716 les/c/f 33991/33747/0
> > 40813/40813/37166) [933,626,260,804,503,491]p933(0) r=-1 lpr=40813
> > DELETING pi=[33716,40813)/4 crt=40734'290069 unknown NOTIFY mbc={}]
> > _delete_some additional unexpected onode list (new onodes has appeared
> > since PG removal started[0#10:4d28head#]
> >
> > /var/log/ceph/ceph-osd.851.log:2021-04-14 18:40:13.312 7fd87bded700  0
> > osd.851 pg_epoch: 40671 pg[10.133fs5( v 40662'288967
> > (33782'285900,40662'288967] lb MIN (bitwise) local-lis/les=33786/33787
> > n=13 ec=4947/4937 lis/c 40498/33714 les/c/f 40499/33747/0
> > 40670/40670/33432) [859,199,913,329,439,79]p859(0) r=-1 lpr=40670
> > DELETING pi=[33714,40670)/4 crt=40662'288967 unknown NOTIFY mbc={}]
> > _delete_some additional unexpected onode list (new onodes has appeared
> > since PG removal started[5#10:fcc8head#]
> >
> > /var/log/ceph/ceph-osd.851.log:2021-04-14 20:58:14.393 7fd87adeb700  0
> > osd.851 pg_epoch: 40906 pg[10.2e8s3( v 40610'288991
> > (33782'285900,40610'288991] lb MIN (bitwise) local-lis/les=33786/33787
> > n=161220 ec=4937/4937 lis/c 39826/33716 les/c/f 39827/33747/0
> > 40617/40617/39225) [717,933,727,792,607,129]p717(0) r=-1 lpr=40617
> > DELETING pi=[33716,40617)/3 crt=40610'288991 unknown NOTIFY mbc={}]
> > _delete_some additional unexpected onode list (new onodes has appeared
> > since PG removal started[3#10:1740head#]
> >
> > /var/log/ceph/ceph-osd.883.log:2021-04-14 18:55:16.822 7f78c485d700  0
> > osd.883 pg_epoch: 40857 pg[7.d4( v 40804'9911289
> > (35835'9908201,40804'9911289] lb MIN (bitwise)
> > local-lis/les=40782/40783 n=195 ec=2063/1989 lis/c 40782/40782 les/c/f
> > 40783/40844/0 40781/40845/40845) [877,870,894] r=-1 lpr=40845 DELETING
> > pi=[40782,40845)/1 crt=40804'9911289 lcod 40804'9911288 unknown NOTIFY
> > mbc={}] _delete_some additional unexpected onode list (new onodes has
> > appeared since PG removal started[#7:2b00head#]
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: ERROR: read_key_entry() idx= 1000_ ret=-2

2021-04-14 Thread by morphin
More information:

I have an over-limit bucket, and the error belongs to this bucket.

fill_status=OVER 100%
objects_per_shard: 363472 (I use default 100K per shard)
num_shards: 750
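
For reference, this per-shard fill report can be re-generated at any time with
something like the commands below (a suggestion, assuming radosgw-admin from
the same Nautilus install; <bucket-name> is a placeholder):

    radosgw-admin bucket limit check
    radosgw-admin bucket stats --bucket=<bucket-name>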


I'm deleting objects from this bucket by absolute path, and I don't use
dynamic bucket resharding because of multisite.
I've reviewed the code, and I think the error appears because these objects
no longer exist in the index.
Can anyone explain the code and the error, please?

OSD LOG:
cls_rgw.cc:1102: ERROR: read_key_entry()
idx=�1000_matches/xdir/05/21/27260.jpg ret=-2

https://github.com/ceph/ceph/blob/master/src/cls/rgw/cls_rgw.cc


public:
  BIVerObjEntry(cls_method_context_t& _hctx, const cls_rgw_obj_key& _key) :
      hctx(_hctx), key(_key), initialized(false) {
  }

  int init(bool check_delete_marker = true) {
    // Look up this key's entry in the bucket index; read_key_entry() returns
    // a negative errno on failure (-2 is -ENOENT, i.e. no entry for the key).
    int ret = read_key_entry(hctx, key, &instance_idx, &instance_entry,
                             check_delete_marker && key.instance.empty());
                             /* this is potentially a delete marker, for null
                                objects we keep separate instance entry for
                                the delete markers */
    if (ret < 0) {
      CLS_LOG(0, "ERROR: read_key_entry() idx=%s ret=%d",
              instance_idx.c_str(), ret);
      return ret;
    }
    initialized = true;
    CLS_LOG(20, "read instance_entry key.name=%s key.instance=%s flags=%d",
            instance_entry.key.name.c_str(), instance_entry.key.instance.c_str(),
            instance_entry.flags);
    return 0;
  }

  rgw_bucket_dir_entry& get_dir_entry() {
    return instance_entry;
  }
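
If I read it right, ret=-2 is -ENOENT, so read_key_entry() simply found no
entry for that key in the index shard, which matches my guess above. A quick
way to double-check whether a given key still has a bucket-index entry could
be something like the following (the bucket and object names are placeholders,
not values taken from the logs above):

    radosgw-admin bi get --bucket=<bucket> --object="<object-key>"
    radosgw-admin bi list --bucket=<bucket> --max-entries=10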


On Thu, 15 Apr 2021 at 02:19, by morphin wrote:

> Hello everyone!
>
> I'm running nautilus 14.2.16 and I'm using RGW with Beast frontend.
> I see this error log on every SSD OSD that is used for the RGW index.
> Can you please tell me what is the problem?
>
> OSD LOG:
> cls_rgw.cc:1102: ERROR: read_key_entry()
> idx=�1000_matches/xdir/05/21/27260.jpg ret=-2
> cls_rgw.cc:1102: ERROR: read_key_entry()
> idx=�1000_matches/xdir/05/21/27253.jpg ret=-2
>
>
> RADOSGW LOG:
> 2021-04-15 01:53:54.385 7f2e0f8e7700  1 beast: 0x55a4439f8710:
> 10.151.101.15 - - [2021-04-15 01:53:54.0.385327s] "HEAD
> /xdir/04/13/704745.jpg HTTP/1.1" 200 0 - "aws-sdk-java/1.11.638
> Linux/3.10.0-1062.12.1.el7.x86_64
> Java_HotSpot(TM)_64-Bit_Server_VM/25.251-b08 java/1.8.0_251 groovy/2.4.3
> vendor/Oracle_Corporation" -
> 2021-04-15 01:53:54.385 7f2d8b7df700  1 == starting new request
> req=0x55a4439f8710 =
> 2021-04-15 01:53:54.405 7f2e008c9700  1 == req done req=0x55a43dbc6710
> op status=0 http_status=204 latency=0.33s ==
> 2021-04-15 01:53:54.405 7f2e008c9700  1 beast: 0x55a43dbc6710:
> 10.151.101.15 - - [2021-04-15 01:53:54.0.405327s] "DELETE
> /xdir/05/21/21586.gz HTTP/1.1" 204 0 - "aws-sdk-java/1.11.638
> Linux/3.10.0-1062.12.1.el7.x86_64
> Java_HotSpot(TM)_64-Bit_Server_VM/25.251-b08 java/1.8.0_251 groovy/2.4.3
> vendor/Oracle_Corporation" -
> 2021-04-15 01:53:54.405 7f2d92fee700  1 == starting new request
> req=0x55a43dbc6710 =
> 2021-04-15 01:53:54.405 7f2d92fee700  0 WARNING: couldn't find acl header
> for object, generating default
> 2021-04-15 01:53:54.405 7f2d92fee700  1 == req done req=0x55a43dbc6710
> op status=0 http_status=200 latency=0s ==
> 2021-04-15 01:53:54.405 7f2d92fee700  1 beast: 0x55a43dbc6710:
> 10.151.101.15 - - [2021-04-15 01:53:54.0.405327s] "HEAD
> /xdir/2013/11/20/2a67508e-d7dd-4e0f-b959-d7575d5f65b1 HTTP/1.1" 200 0 -
> "aws-sdk-java/1.11.638 Linux/3.10.0-1160.11.1.el7.x86_64
> Java_HotSpot(TM)_64-Bit_Server_VM/25.281-b09 java/1.8.0_281 groovy/2.5.6
> vendor/Oracle_Corporation" -
>
>
> CEPH OSD DF
> ID  CLASS WEIGHT   REWEIGHT SIZE    RAW USE DATA    OMAP    META    AVAIL   %USE  VAR  PGS STATUS
>  19   ssd  0.87320  1.0     894 GiB 436 GiB 101 GiB 332 GiB 2.5 GiB 458 GiB 48.75 1.84 115 up
> 208   ssd  0.87329  1.0     894 GiB 161 GiB  87 GiB  73 GiB 978 MiB 733 GiB 18.00 0.68 113 up
> 199   ssd  0.87320  1.0     894 GiB 272 GiB 106 GiB 163 GiB 2.4 GiB 623 GiB 30.37 1.14 123 up
> 202   ssd  0.87329  1.0     894 GiB 239 GiB  73 GiB 165 GiB 1.4 GiB 655 GiB 26.77 1.01 106 up
>  39   ssd  0.87320  1.0     894 GiB 450 GiB  87 GiB 361 GiB 2.3 GiB 444 GiB 50.36 1.90 113 up
> 207   ssd  0.87329  1.0     894 GiB 204 GiB 100 GiB  98 GiB 6.0 GiB 691 GiB 22.76 0.86 118 up
>  59   ssd  0.87320  1.0     894 GiB 372 GiB 107 GiB 263 GiB 3.0 GiB 522 GiB 41.64 1.57 122 up
> 203   ssd  0.87329  1.0     894 GiB 206 GiB  79 GiB 124 GiB 2.4 GiB 689 GiB 23.00 0.87 117 up
>  79   ssd  0.87320  1.0     894 GiB 447 GiB 103 GiB 342 GiB 1.8 GiB 447 GiB 49.97 1.88 120 up
> 206   ssd  0.87329  1.0     894 GiB 200 GiB  81 GiB 119 GiB 1.0 GiB 694 GiB 22.38 0.84  94 up
>  99   ssd  0.87320  1.0     894 GiB 333 GiB  87 GiB 244 GiB 2.0 GiB 562 GiB 37.19 1.40 106 up
> 205   ssd  0.87329  1.0     894 GiB 316 GiB  83 GiB 232 GiB 1.1 GiB 579 GiB 35.29 1.33 117 up
> 114   ssd  0.87329  1.0     894 GiB 256 GiB 100 GiB 154 GiB 1.7 GiB 638 GiB 28.61 1.08 113 up
> 200   ssd  0.87329  1.0     894 GiB 266 GiB 100 GiB 165 GiB 1.1 GiB 628 GiB 29.76 1.12 128 up
> 139   s

[ceph-users] Re: Revisit Large OMAP Objects

2021-04-14 Thread by morphin
I have the same issue and have joined the club.
Almost every deleted bucket is still there because of multisite. I've also
removed the secondary zone and stopped sync, but these stale instances are
still there.
Before adding a new secondary zone I want to remove them. If you are going to
run anything, please let me know.
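
For what it's worth, before removing anything I plan to double-check what the
remaining zone reports on the sync side - a minimal sketch, assuming
radosgw-admin from the same Nautilus release, with <bucket-name> as a
placeholder:

    radosgw-admin sync status
    radosgw-admin bucket sync status --bucket=<bucket-name>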




On Wed, 14 Apr 2021 at 21:20, the user wrote:

> Casey;
>
> That makes sense, and I appreciate the explanation.
>
> If I were to shut down all uses of RGW, and wait for replication to catch
> up, would this then address most known issues with running this command in
> a multi-site environment?  Can I take the RADOSGW daemons offline as an
> added precaution?
>
> Thank you,
>
> Dominic L. Hilsbos, MBA
> Director – Information Technology
> Perform Air International Inc.
> dhils...@performair.com
> www.PerformAir.com
>
>
> -Original Message-
> From: Casey Bodley [mailto:cbod...@redhat.com]
> Sent: Wednesday, April 14, 2021 9:03 AM
> To: Dominic Hilsbos
> Cc: k0...@k0ste.ru; ceph-users@ceph.io
> Subject: Re: [ceph-users] Re: Revisit Large OMAP Objects
>
> On Wed, Apr 14, 2021 at 11:44 AM  wrote:
> >
> > Konstantin;
> >
> > Dynamic resharding is disabled in multisite environments.
> >
> > I believe you mean radosgw-admin reshard stale-instances rm.
> >
> > Documentation suggests this shouldn't be run in a multisite
> environment.  Does anyone know the reason for this?
>
> say there's a bucket with 10 objects in it, and that's been fully
> replicated to a secondary zone. if you want to remove the bucket, you
> delete its objects then delete the bucket
>
> when the bucket is deleted, rgw can't delete its bucket instance yet
> because the secondary zone may not be caught up with sync - it
> requires access to the bucket instance (and its index) to sync those
> last 10 object deletions
>
> so the risk with 'stales-instances rm' in multisite is that you might
> delete instances before other zones catch up, which can lead to
> orphaned objects
>
> >
> > Is it, in fact, safe, even in a multisite environment?
> >
> > Thank you,
> >
> > Dominic L. Hilsbos, MBA
> > Director – Information Technology
> > Perform Air International Inc.
> > dhils...@performair.com
> > www.PerformAir.com
> >
> >
> > -Original Message-
> > From: Konstantin Shalygin [mailto:k0...@k0ste.ru]
> > Sent: Wednesday, April 14, 2021 12:15 AM
> > To: Dominic Hilsbos
> > Cc: ceph-users@ceph.io
> > Subject: Re: [ceph-users] Revisit Large OMAP Objects
> >
> > Run reshard instances rm
> > And reshard your bucket by hand or leave dynamic resharding process to
> do this work
> >
> >
> > k
> >
> > Sent from my iPhone
> >
> > > On 13 Apr 2021, at 19:33, dhils...@performair.com wrote:
> > >
> > > All;
> > >
> > > We run 2 Nautilus clusters, with RADOSGW replication (14.2.11 -->
> 14.2.16).
> > >
> > > Initially our bucket grew very quickly, as I was loading old data into
> it and we quickly ran into Large OMAP Object warnings.
> > >
> > > I have since done a couple manual reshards, which has fixed the
> warning on the primary cluster.  I have never been able to get rid of the
> issue on the cluster with the replica.
> > >
> > > I prior conversation on this list led me to this command:
> > > radosgw-admin reshard stale-instances list --yes-i-really-mean-it
> > >
> > > The results of which look like this:
> > > [
> > >"nextcloud-ra:f91aeff8-a365-47b4-a1c8-928cd66134e8.185262.1",
> > >"nextcloud:f91aeff8-a365-47b4-a1c8-928cd66134e8.53761.6",
> > >"nextcloud:f91aeff8-a365-47b4-a1c8-928cd66134e8.53761.2",
> > >"nextcloud:f91aeff8-a365-47b4-a1c8-928cd66134e8.53761.5",
> > >"nextcloud:f91aeff8-a365-47b4-a1c8-928cd66134e8.53761.4",
> > >"nextcloud:f91aeff8-a365-47b4-a1c8-928cd66134e8.53761.3",
> > >"nextcloud:f91aeff8-a365-47b4-a1c8-928cd66134e8.53761.1",
> > >"3520ae821f974340afd018110c1065b8/OS
> Development:f91aeff8-a365-47b4-a1c8-928cd66134e8.4298264.1",
> > >
> "10dfdfadb7374ea1ba37bee1435d87ad/volumebackups:f91aeff8-a365-47b4-a1c8-928cd66134e8.4298264.2",
> > >"WorkOrder:f91aeff8-a365-47b4-a1c8-928cd66134e8.44130.1"
> > > ]
> > >
> > > I find this particularly interesting, as nextcloud-ra, /OS
> Development, /volumbackups, and WorkOrder buckets no longer exist.
> > >
> > > When I run:
> > > for obj in $(rados -p 300.rgw.buckets.index ls | grep
> f91aeff8-a365-47b4-a1c8-928cd66134e8.3512190.1);   do   printf "%-60s
> %7d\n" $obj $(rados -p 300.rgw.buckets.index listomapkeys $obj | wc -l);
>  done
> > >
> > > I get the expected 64 entries, with counts around 2 +/- 1000.
> > >
> > > Are the above listed stale instances ok to delete?  If so, how do I go
> about doing so?
> > >
> > > Thank you,
> > >
> > > Dominic L. Hilsbos, MBA
> > > Director - Information Technology
> > > Perform Air International Inc.
> > > dhils...@performair.com
> > > www.PerformAir.com
> > >
> > > ___
> > > ceph-users mailing list -- ceph-users@ceph.io
> > > To unsubscribe send an email to

[ceph-users] Re: How to disable ceph-grafana during cephadm bootstrap

2021-04-14 Thread mabi
Thank you for the hint regarding the --skip-monitoring-stack parameter.

Actually, I already bootstrapped my cluster without this option, so is there a
way to disable and remove the ceph-grafana part now, or do I need to bootstrap
the cluster again?
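
In case it helps anyone searching later, what I intend to try is roughly the
following - a sketch, not verified, assuming the monitoring services were
deployed by cephadm under their default service names:

    ceph orch ls                 # list the services cephadm currently manages
    ceph orch rm grafana         # remove the grafana service and its container
    ceph orch rm prometheus      # optionally drop the rest of the stack
    ceph orch rm alertmanager
    ceph orch rm node-exporter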

‐‐‐ Original Message ‐‐‐
On Wednesday, April 14, 2021 3:41 PM, Sebastian Wagner  
wrote:

> cephadm bootstrap --skip-monitoring-stack
>
> should do the trick. See man cephadm
>
> On Tue, Apr 13, 2021 at 6:05 PM mabi  wrote:
>
>> Hello,
>>
>> When bootstrapping a new ceph Octopus cluster with "cephadm bootstrap", how 
>> can I tell the cephadm bootstrap NOT to install the ceph-grafana container?
>>
>> Thank you very much in advance for your answer.
>>
>> Best regards,
>> Mabi
>> ___
>> ceph-users mailing list -- ceph-users@ceph.io
>> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io