[ceph-users] Re: Mysterious Disk-Space Eater

2023-01-12 Thread Eneko Lacunza

Hi,

El 12/1/23 a las 3:59, duluxoz escribió:

Got a funny one, which I'm hoping someone can help us with.

We've got three identical(?) Ceph Quincy Nodes running on Rocky Linux 
8.7. Each Node has 4 OSDs, plus Monitor, Manager, and iSCSI G/W 
services running on them (we're only a small shop). Each Node has a 
separate 16 GiB partition mounted as /var. Everything is running well 
and the Ceph Cluster is handling things very well.


However, one of the Nodes (not the one currently acting as the Active 
Manager) is running out of space on /var. Normally, all of the Nodes 
have around 10% space used (via a df -H command), but the problem Node 
only takes 1 to 3 days to run out of space, hence taking it out of 
Quorum. It's currently at 85% and growing.


At first we thought this was caused by an overly large log file, but 
investigations showed that all the logs on all 3 Nodes were of 
comparable size. Also, searching for the 20 largest files on the 
problem Node's /var didn't produce any significant results.


Coincidentally, unrelated to this issue, the problem Node (but not the 
other 2 Nodes) was re-booted a couple of days ago and, when the 
Cluster had re-balanced itself and everything was back online and 
reporting as Healthy, the problem Node's /var was back down to around 
10%, the same as the other two Nodes.


This led us to suspect that there was some sort of "run-away" process 
or journaling/logging/temporary file(s) or whatever that the re-boot 
had "cleaned up". So we've been keeping an eye on things but we can't 
see anything causing the issue and now, as I said above, the problem 
Node's /var is back up to 85% and growing.


I've been looking at the log files, trying to determine the issue, but 
as I don't really know what I'm looking for I don't even know if I'm 
looking in the *correct* log files...


Obviously rebooting the problem Node every couple of days is not a 
viable option, and increasing the size of the /var partition is only 
going to postpone the issue, not resolve it. So if anyone has any 
ideas we'd love to hear about it - thanks


This looks like one or more files that were removed but that some process 
still holds open (and may still be writing to...). When rebooting, the process is 
terminated and the file(s) are effectively removed.


Try to inspect each process's open files and find which file(s) no 
longer have a directory entry... that would give you a hint.
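
For example (a rough sketch, assuming lsof is available on the node):

  lsof +L1 | grep /var      # open files with link count 0, i.e. deleted but still held open

or, without lsof, via the per-process fd symlinks:

  find /proc/*/fd -ls 2>/dev/null | grep '(deleted)'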


Cheers


Eneko Lacunza
Zuzendari teknikoa | Director técnico
Binovo IT Human Project

Tel. +34 943 569 206 |https://www.binovo.es
Astigarragako Bidea, 2 - 2º izda. Oficina 10-11, 20180 Oiartzun

https://www.youtube.com/user/CANALBINOVO
https://www.linkedin.com/company/37269706/
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Mysterious Disk-Space Eater

2023-01-12 Thread E Taka
We had a similar problem, and it was a (visible) logfile. It is easy to
find with the ncdu utility (`ncdu -x /var`). There's no need for a reboot;
you can get rid of it by restarting the Monitor with `ceph orch daemon
restart mon.NODENAME`. You may also lower the debug level.
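
For example (just a sketch - the exact subsystem producing the noise may
differ in your case), something like:

ceph config set mon debug_mon 1/5
ceph config set mon debug_paxos 0/5
ceph config set mon debug_rocksdb 1/5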

Am Do., 12. Jan. 2023 um 09:14 Uhr schrieb Eneko Lacunza :

> Hi,
>
> El 12/1/23 a las 3:59, duluxoz escribió:
> > Got a funny one, which I'm hoping someone can help us with.
> >
> > We've got three identical(?) Ceph Quincy Nodes running on Rocky Linux
> > 8.7. Each Node has 4 OSDs, plus Monitor, Manager, and iSCSI G/W
> > services running on them (we're only a small shop). Each Node has a
> > separate 16 GiB partition mounted as /var. Everything is running well
> > and the Ceph Cluster is handling things very well).
> >
> > However, one of the Nodes (not the one currently acting as the Active
> > Manager) is running out of space on /var. Normally, all of the Nodes
> > have around 10% space used (via a df -H command), but the problem Node
> > only takes 1 to 3 days to run out of space, hence taking it out of
> > Quorum. Its currently at 85% and growing.
> >
> > At first we thought this was caused by an overly large log file, but
> > investigations showed that all the logs on all 3 Nodes were of
> > comparable size. Also, searching for the 20 largest files on the
> > problem Node's /var didn't produce any significant results.
> >
> > Coincidentally, unrelated to this issue, the problem Node (but not the
> > other 2 Nodes) was re-booted a couple of days ago and, when the
> > Cluster had re-balanced itself and everything was back online and
> > reporting as Healthy, the problem Node's /var was back down to around
> > 10%, the same as the other two Nodes.
> >
> > This lead us to suspect that there was some sort of "run-away" process
> > or journaling/logging/temporary file(s) or whatever that the re-boot
> > has "cleaned up". So we've been keeping an eye on things but we can't
> > see anything causing the issue and now, as I said above, the problem
> > Node's /var is back up to 85% and growing.
> >
> > I've been looking at the log files, tying to determine the issue, but
> > as I don't really know what I'm looking for I don't even know if I'm
> > looking in the *correct* log files...
> >
> > Obviously rebooting the problem Node every couple of days is not a
> > viable option, and increasing the size of the /var partition is only
> > going to postpone the issue, not resolve it. So if anyone has any
> > ideas we'd love to hear about it - thanks
>
> This seems one or more files that are removed but some process has their
> handle open (and maybe is still writing...). When rebooting process is
> terminated and file(s) effectively removed.
>
> Try to inspect each process' open files and find what file(s) have no
> longer a directory entry... that would give you a hint.
>
> Cheers
>
>
> Eneko Lacunza
> Zuzendari teknikoa | Director técnico
> Binovo IT Human Project
>
> Tel. +34 943 569 206 |https://www.binovo.es
> Astigarragako Bidea, 2 - 2º izda. Oficina 10-11, 20180 Oiartzun
>
> https://www.youtube.com/user/CANALBINOVO
> https://www.linkedin.com/company/37269706/
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Removing OSDs - draining but never completes.

2023-01-12 Thread E Taka
You have to wait until the rebalancing has finished.
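
You can watch the progress with the usual commands, e.g.:

ceph orch osd rm status
ceph -s
ceph osd df tree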

Am Di., 10. Jan. 2023 um 17:14 Uhr schrieb Wyll Ingersoll <
wyllys.ingers...@keepertech.com>:

> Running ceph-pacific 16.2.9 using ceph orchestrator.
>
> We made a mistake adding a disk to the cluster and immediately issued a
> command to remove it using "ceph orch osd rm ### --replace --force".
>
> This OSD had no data on it at the time and was removed after just a few
> minutes.  "ceph orch osd rm status" shows that it is still "draining".
> ceph osd df shows that the osd being removed has -1 PGs.
>
> So - why is the simple act of removal taking so long and can we abort it
> and manually remove that osd somehow?
>
> Note: the cluster is also doing a rebalance while this is going on, but
> the osd being removed never had any data and should not be affected by the
> rebalance.
>
> thanks!
>
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Current min_alloc_size of OSD?

2023-01-12 Thread Robert Sander

On 11.01.23 23:47, Anthony D'Atri wrote:


It’s printed in the OSD log at startup.


But which info is it exactly?

This line looks like reporting the block_size of the device:

  bdev(0x55b50a2e5800 /var/lib/ceph/osd/ceph-0/block) open size 107369988096 
(0x18ffc00000, 100 GiB) block_size 4096 (4 KiB) non-rotational discard supported

Is it this line?

  bluefs _init_alloc shared, id 1, capacity 0x18ffc00000, block size 0x10000

  0x10000 equals 65536, aka 64K, in decimal.

Is it this line?

  bluestore(/var/lib/ceph/osd/ceph-0) _open_super_meta min_alloc_size 0x1000

Or this one?

  bluestore(/var/lib/ceph/osd/ceph-0) _init_alloc loaded 100 GiB in 1 extents, 
allocator type hybrid, capacity 0x18ffc00000, block size 0x1000, free 
0x18ffbfd000, fragmentation 0


I don’t immediately see it in `ceph osd metadata` ; arguably it should be there.


An entry in ceph osd metadata would be great to have.


`config show` on the admin socket I suspect does not show the existing value.


This shows the value currently set in the configuration.

Regards
--
Robert Sander
Heinlein Consulting GmbH
Schwedter Str. 8/9b, 10119 Berlin

https://www.heinlein-support.de

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Amtsgericht Berlin-Charlottenburg - HRB 220009 B
Geschäftsführer: Peer Heinlein - Sitz: Berlin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Current min_alloc_size of OSD?

2023-01-12 Thread Gerdriaan Mulder

Hi,

On 12/01/2023 10.26, Robert Sander wrote:

Is it this line?

   bluestore(/var/lib/ceph/osd/ceph-0) _open_super_meta min_alloc_size 
0x1000


That seems to be it: 
https://github.com/ceph/ceph/blob/v15.2.17/src/os/bluestore/BlueStore.cc#L11754-L11755


A few lines later it should state the same, but with function name 
"_set_alloc_sizes": 
https://github.com/ceph/ceph/blob/v15.2.17/src/os/bluestore/BlueStore.cc#L5220-L5226, 
althoug "dout(10)" probably means it only outputs this in a higher debug 
level.


Best regards,
Gerdriaan Mulder
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: [solved] Current min_alloc_size of OSD?

2023-01-12 Thread Robert Sander

Hi,

On 12.01.23 11:11, Gerdriaan Mulder wrote:


On 12/01/2023 10.26, Robert Sander wrote:

Is it this line?

    bluestore(/var/lib/ceph/osd/ceph-0) _open_super_meta min_alloc_size
0x1000


That seems to be it:
https://github.com/ceph/ceph/blob/v15.2.17/src/os/bluestore/BlueStore.cc#L11754-L11755


Thanks for the confirmation.

So one can grep for "min_alloc_size" in the OSD's log output.
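
For example (a sketch for a cephadm deployment; osd.0 and the log path are
just placeholders, and cephadm logs has to be run on the host holding the OSD):

cephadm logs --name osd.0 | grep min_alloc_size
# or, with classic file logging:
grep min_alloc_size /var/log/ceph/ceph-osd.0.log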

Regards
--
Robert Sander
Heinlein Consulting GmbH
Schwedter Str. 8/9b, 10119 Berlin

https://www.heinlein-support.de

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Amtsgericht Berlin-Charlottenburg - HRB 220009 B
Geschäftsführer: Peer Heinlein - Sitz: Berlin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Ceph Octopus rbd images stuck in trash

2023-01-12 Thread Eugen Block

Hi,

just wondering if you're looking in the right pool(s)? The default  
pool is "rbd", are those images you listed from the "rbd" pool? Do you  
use an alias for the "rbd" command? If that's not it maybe increase  
rbd client debug logs to see where it goes wrong.
From time to time I also have to clean up some orphans from the 
trash, but I believe I was able to restore the images from trash 
before looking for watchers. But just in case you get there, with 'rbd 
status <pool>/<image>' you should see if there is a watcher which you 
can then blacklist, and then clean up snapshots etc.
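
A rough sketch of that sequence (pool/image names and the client address are
placeholders, and newer releases call it "blocklist" instead of "blacklist"):

rbd status <pool>/<image>                      # shows watchers, e.g. watcher=192.168.0.10:0/123456
ceph osd blacklist add 192.168.0.10:0/123456   # evict the client holding the watch
rbd snap purge <pool>/<image>
rbd rm <pool>/<image>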


Regards,
Eugen

Zitat von Jeff Welling :


Hello there,

I'm running Ceph 15.2.17 (Octopus) on Debian Buster and I'm starting  
an upgrade but I'm seeing a problem and I wanted to ask how best to  
proceed in case I make things worse by mucking with it without  
asking experts.


I've moved an rbd image to the trash without clearing the snapshots  
first, and then tried to 'trash purge'. This resulted in an error  
because the image still has snapshots, but I'm unable to remove the  
image from the pool to clear the snapshots either. At least one of  
these images is from a clone of a snapshot from another trashed  
image, which I'm already kicking myself for.


The contents of my trash:

# rbd trash ls
07afadac0ed69c nfsroot_pi08
240ae5a5eb3214 bigdisk
7fd5138848231e nfsroot_pi01
f33e1f5bad0952 bigdisk2
fcdeb1f96a6124 raspios-64bit-lite-manuallysetup-p1
fcdebd2237697a raspios-64bit-lite-manuallysetup-p2
fd51418d5c43da nfsroot_pi02
fd514a6b4d3441 nfsroot_pi03
fd515061816c70 nfsroot_pi04
fd51566859250b nfsroot_pi05
fd5162c5885d9c nfsroot_pi07
fd5171c27c36c2 nfsroot_pi09
fd51743cb8813c nfsroot_pi10
fd517ad3bc3c9d nfsroot_pi11
fd5183bfb1e588 nfsroot_pi12


This is the error I get trying to purge the trash:

# rbd trash purge
Removing images: 0% complete...failed.
rbd: some expired images could not be removed
Ensure that they are closed/unmapped, do not have snapshots  
(including trashed snapshots with linked clones), are not in a group  
and were moved to the trash successfully.



This is the error when I try and restore one of the trashed images:

# rbd trash restore nfsroot_pi08
rbd: error: image does not exist in trash
2023-01-11T12:28:52.982-0800 7f4b69a7c3c0 -1 librbd::api::Trash:  
restore: error getting image id nfsroot_pi08 info from trash: (2) No  
such file or directory


Trying to restore other images gives the same error.

These trash images are now taking up a significant portion of the  
cluster space. One thought was to upgrade and see if that resolves  
the problem, but I've shot myself in the foot doing that in the past  
without confirming it would solve the problem, so I'm looking for a  
second opinion on how best to clear these?


These are all Debian Buster systems, the kernel version of the host  
I'm running these commands on is:


Linux zim 4.19.0-8-amd64 #1 SMP Debian 4.19.98-1+deb10u1  
(2020-04-27) x86_64 GNU/Linux


I'm going to be upgrading that too but one step at a time.
The exact ceph version is:

ceph version 15.2.17 (8a82819d84cf884bd39c17e3236e0632ac146dc4)  
octopus (stable)


This was installed from the ceph repos, not the debian repos, using  
cephadm. If there's any additional details I can share please let me  
know, any and all thoughts welcome! I've been googling and have  
found folks with similar issues but nothing similar enough to feel  
helpful.


Thanks in advance, and thank you to any and everyone who contributes  
to Ceph, it's awesome!

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Creating nfs RGW export makes nfs-gnaesha server in crash loop

2023-01-12 Thread Ruidong Gao
Hi,

This is running Quincy 17.2.5 deployed by rook on k8s. An RGW NFS export will 
crash the Ganesha server pod; a CephFS export works just fine. Here are the steps to reproduce it:
1, create export:
bash-4.4$ ceph nfs export create rgw --cluster-id nfs4rgw --pseudo-path 
/bucketexport --bucket testbk
{
"bind": "/bucketexport",
"path": "testbk",
"cluster": "nfs4rgw",
"mode": "RW",
"squash": "none"
}

2, check pods status afterwards:
rook-ceph-nfs-nfs1-a-679fdb795-82tcx  2/2 Running 0 
 4h3m
rook-ceph-nfs-nfs4rgw-a-5c594d67dc-nlr42  1/2 Error   2 
 4h6m

3, check failing pod’s logs:

11/01/2023 08:11:53 : epoch 63be6f49 : rook-ceph-nfs-nfs4rgw-a-5c594d67dc-nlr42 
: nfs-ganesha-1[main] nfs_start_grace :STATE :EVENT :NFS Server Now IN GRACE, 
duration 90
11/01/2023 08:11:54 : epoch 63be6f49 : rook-ceph-nfs-nfs4rgw-a-5c594d67dc-nlr42 
: nfs-ganesha-1[main] nfs_start_grace :STATE :EVENT :grace reload client info 
completed from backend
11/01/2023 08:11:54 : epoch 63be6f49 : rook-ceph-nfs-nfs4rgw-a-5c594d67dc-nlr42 
: nfs-ganesha-1[main] nfs_try_lift_grace :STATE :EVENT :check grace:reclaim 
complete(0) clid count(0)
11/01/2023 08:11:57 : epoch 63be6f49 : rook-ceph-nfs-nfs4rgw-a-5c594d67dc-nlr42 
: nfs-ganesha-1[main] nfs_lift_grace_locked :STATE :EVENT :NFS Server Now NOT 
IN GRACE
11/01/2023 08:11:57 : epoch 63be6f49 : rook-ceph-nfs-nfs4rgw-a-5c594d67dc-nlr42 
: nfs-ganesha-1[main] export_defaults_commit :CONFIG :INFO :Export Defaults now 
(options=03303002/0008   , ,,   ,   
, ,,, expire=   0)
2023-01-11T08:11:57.853+ 7f59dac7c200 -1 auth: unable to find a keyring on 
/var/lib/ceph/radosgw/ceph-admin/keyring: (2) No such file or directory
2023-01-11T08:11:57.853+ 7f59dac7c200 -1 AuthRegistry(0x56476817a480) no 
keyring found at /var/lib/ceph/radosgw/ceph-admin/keyring, disabling cephx
2023-01-11T08:11:57.855+ 7f59dac7c200 -1 auth: unable to find a keyring on 
/var/lib/ceph/radosgw/ceph-admin/keyring: (2) No such file or directory
2023-01-11T08:11:57.855+ 7f59dac7c200 -1 AuthRegistry(0x7ffe4d092c90) no 
keyring found at /var/lib/ceph/radosgw/ceph-admin/keyring, disabling cephx
2023-01-11T08:11:57.856+ 7f5987537700 -1 monclient(hunting): 
handle_auth_bad_method server allowed_methods [2] but i only support [1]
2023-01-11T08:11:57.856+ 7f5986535700 -1 monclient(hunting): 
handle_auth_bad_method server allowed_methods [2] but i only support [1]
2023-01-11T08:12:00.861+ 7f5986d36700 -1 monclient(hunting): 
handle_auth_bad_method server allowed_methods [2] but i only support [1]
2023-01-11T08:12:00.861+ 7f59dac7c200 -1 monclient: authenticate NOTE: no 
keyring found; disabled cephx authentication
failed to fetch mon config (--no-mon-config to skip)

4, delete the export:
ceph nfs export delete nfs4rgw /bucketexport

Ganesha servers go back normal:
rook-ceph-nfs-nfs1-a-679fdb795-82tcx  2/2 Running 0 
 4h30m
rook-ceph-nfs-nfs4rgw-a-5c594d67dc-nlr42  2/2 Running 
10 4h33m

Any ideas to make it work?

Thanks
Ben
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: iscsi target lun error

2023-01-12 Thread Frédéric Nass
Hi Xiubo, Randy,

This is due to 'host.containers.internal' being added to the 
container's /etc/hosts since Podman 4.1+.

The workaround consists of either downgrading the Podman package to v4.0 (on RHEL8, 
dnf downgrade podman-4.0.2-6.module+el8.6.0+14877+f643d2d6) or adding the 
--no-hosts option to the 'podman run' command in /var/lib/ceph/$(ceph 
fsid)/iscsi.iscsi.test-iscsi1.xx/unit.run and restarting the iscsi container 
service.

[1] and [2] could well have the same cause. The RHCS Block Device Guide [3] quotes 
RHEL 8.4 as a prerequisite. I don't know what the version of Podman in 
RHEL 8.4 was at the time, but with RHEL 8.7 and Podman 4.2, it's broken.

I'll open a RHCS case today to have it fixed and have other containers like 
grafana, prometheus, etc. checked against this new podman behavior.

Regards,
Frédéric.

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1979449
[2] https://tracker.ceph.com/issues/57018
[3] 
https://access.redhat.com/documentation/en-us/red_hat_ceph_storage/5/html-single/block_device_guide/index#prerequisites_9

- Le 21 Nov 22, à 6:45, Xiubo Li xiu...@redhat.com a écrit :

> On 15/11/2022 23:44, Randy Morgan wrote:
>> You are correct I am using the cephadm to create the iscsi portals.
>> The cluster had been one I was learning a lot with and I wondered if
>> it was because of the number of creations and deletions of things, so
>> I rebuilt the cluster, now I am getting this response even when
>> creating my first iscsi target.   Here is the output of the gwcli ls:
>>
>> sh-4.4# gwcli ls
>> o- /
>> 
>> [...]
>>   o- cluster
>> 
>> [Clusters: 1]
>>   | o- ceph
>> .
>> [HEALTH_WARN]
>>   |   o- pools
>> .
>> [Pools: 8]
>>   |   | o- .rgw.root
>>  [(x3),
>> Commit: 0.00Y/71588776M (0%), Used: 1323b]
>>   |   | o- cephfs_data
>> .. [(x3),
>> Commit: 0.00Y/71588776M (0%), Used: 1639b]
>>   |   | o- cephfs_metadata
>> .. [(x3), Commit:
>> 0.00Y/71588776M (0%), Used: 3434b]
>>   |   | o- default.rgw.control
>> .. [(x3), Commit:
>> 0.00Y/71588776M (0%), Used: 0.00Y]
>>   |   | o- default.rgw.log
>> .. [(x3), Commit:
>> 0.00Y/71588776M (0%), Used: 3702b]
>>   |   | o- default.rgw.meta
>> .. [(x3), Commit:
>> 0.00Y/71588776M (0%), Used: 382b]
>>   |   | o- device_health_metrics
>>  [(x3), Commit:
>> 0.00Y/71588776M (0%), Used: 0.00Y]
>>   |   | o- rhv-ceph-ssd
>> . [(x3), Commit:
>> 0.00Y/7868560896K (0%), Used: 511746b]
>>   |   o- topology
>> ..
>> [OSDs: 36,MONs: 3]
>>   o- disks
>> ..
>> [0.00Y, Disks: 0]
>>   o- iscsi-targets
>> ..
>> [DiscoveryAuth: None, Targets: 1]
>>     o- iqn.2001-07.com.ceph:1668466555428
>> ... [Auth:
>> None, Gateways: 1]
>>   o- disks
>> .
>> [Disks: 0]
>>   o- gateways
>> ...
>> [Up: 1/1, Portals: 1]
>>   | o- host.containers.internal
>> 
>> [192.168.105.145 (UP)]
> 
> Please manually remove this gateway before doing further steps.
> 
> It should be a bug in cephadm and you can raise one tracker for this.
> 
> Thanks
> 
> 
>> o- host-groups
>> .
>> [Groups : 0]
>>   o- hosts
>> ..
>> [Auth: ACL_ENABLED, Hosts: 0]
>> sh-4.4#
>>
>> Randy
>>
>> On 11/9/2022 6:36 PM, Xiubo Li wrote:
>>>
>>> On 10/11/2022 02:21, Randy Morgan wrote:
 I am trying to create a second iscsi target and I keep getting an
 error when I create the second target:


    Failed to update target 'iqn.2001-07.com.ceph:1667946365517'

 disk 

[ceph-users] Re: Creating nfs RGW export makes nfs-gnaesha server in crash loop

2023-01-12 Thread Matt Benjamin
Hi Ben,

The issue seems to be that you don't have a ceph keyring available to the
nfs-ganesha server.  The upstream doc talks about this.  The nfs-ganesha
runtime environment needs to be essentially identical to one (a pod, I
guess) that would run radosgw.
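
A quick way to confirm that is to check, inside the failing pod, whether the
keyring path from the log actually exists (the namespace and container name
below are assumptions, adjust them to your rook setup):

kubectl -n rook-ceph exec -it rook-ceph-nfs-nfs4rgw-a-5c594d67dc-nlr42 \
    -c nfs-ganesha -- ls -l /var/lib/ceph/radosgw/ceph-admin/keyring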

Matt

On Thu, Jan 12, 2023 at 7:27 AM Ruidong Gao  wrote:

> Hi,
>
> This is running Quincy 17.2.5 deployed by rook on k8s. RGW nfs export will
> crash Ganesha server pod. CephFS export works just fine. Here are steps of
> it:
> 1, create export:
> bash-4.4$ ceph nfs export create rgw --cluster-id nfs4rgw --pseudo-path
> /bucketexport --bucket testbk
> {
> "bind": "/bucketexport",
> "path": "testbk",
> "cluster": "nfs4rgw",
> "mode": "RW",
> "squash": "none"
> }
>
> 2, check pods status afterwards:
> rook-ceph-nfs-nfs1-a-679fdb795-82tcx  2/2 Running
>0  4h3m
> rook-ceph-nfs-nfs4rgw-a-5c594d67dc-nlr42  1/2 Error
>2  4h6m
>
> 3, check failing pod’s logs:
>
> 11/01/2023 08:11:53 : epoch 63be6f49 :
> rook-ceph-nfs-nfs4rgw-a-5c594d67dc-nlr42 : nfs-ganesha-1[main]
> nfs_start_grace :STATE :EVENT :NFS Server Now IN GRACE, duration 90
> 11/01/2023 08:11:54 : epoch 63be6f49 :
> rook-ceph-nfs-nfs4rgw-a-5c594d67dc-nlr42 : nfs-ganesha-1[main]
> nfs_start_grace :STATE :EVENT :grace reload client info completed from
> backend
> 11/01/2023 08:11:54 : epoch 63be6f49 :
> rook-ceph-nfs-nfs4rgw-a-5c594d67dc-nlr42 : nfs-ganesha-1[main]
> nfs_try_lift_grace :STATE :EVENT :check grace:reclaim complete(0) clid
> count(0)
> 11/01/2023 08:11:57 : epoch 63be6f49 :
> rook-ceph-nfs-nfs4rgw-a-5c594d67dc-nlr42 : nfs-ganesha-1[main]
> nfs_lift_grace_locked :STATE :EVENT :NFS Server Now NOT IN GRACE
> 11/01/2023 08:11:57 : epoch 63be6f49 :
> rook-ceph-nfs-nfs4rgw-a-5c594d67dc-nlr42 : nfs-ganesha-1[main]
> export_defaults_commit :CONFIG :INFO :Export Defaults now
> (options=03303002/0008   , ,,   ,
>  , ,,, expire=   0)
> 2023-01-11T08:11:57.853+ 7f59dac7c200 -1 auth: unable to find a
> keyring on /var/lib/ceph/radosgw/ceph-admin/keyring: (2) No such file or
> directory
> 2023-01-11T08:11:57.853+ 7f59dac7c200 -1 AuthRegistry(0x56476817a480)
> no keyring found at /var/lib/ceph/radosgw/ceph-admin/keyring, disabling
> cephx
> 2023-01-11T08:11:57.855+ 7f59dac7c200 -1 auth: unable to find a
> keyring on /var/lib/ceph/radosgw/ceph-admin/keyring: (2) No such file or
> directory
> 2023-01-11T08:11:57.855+ 7f59dac7c200 -1 AuthRegistry(0x7ffe4d092c90)
> no keyring found at /var/lib/ceph/radosgw/ceph-admin/keyring, disabling
> cephx
> 2023-01-11T08:11:57.856+ 7f5987537700 -1 monclient(hunting):
> handle_auth_bad_method server allowed_methods [2] but i only support [1]
> 2023-01-11T08:11:57.856+ 7f5986535700 -1 monclient(hunting):
> handle_auth_bad_method server allowed_methods [2] but i only support [1]
> 2023-01-11T08:12:00.861+ 7f5986d36700 -1 monclient(hunting):
> handle_auth_bad_method server allowed_methods [2] but i only support [1]
> 2023-01-11T08:12:00.861+ 7f59dac7c200 -1 monclient: authenticate NOTE:
> no keyring found; disabled cephx authentication
> failed to fetch mon config (--no-mon-config to skip)
>
> 4, delete the export:
> ceph nfs export delete nfs4rgw /bucketexport
>
> Ganesha servers go back normal:
> rook-ceph-nfs-nfs1-a-679fdb795-82tcx  2/2 Running
>0  4h30m
> rook-ceph-nfs-nfs4rgw-a-5c594d67dc-nlr42  2/2 Running
>10 4h33m
>
> Any ideas to make it work?
>
> Thanks
> Ben
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>


-- 

Matt Benjamin
Red Hat, Inc.
315 West Huron Street, Suite 140A
Ann Arbor, Michigan 48103

http://www.redhat.com/en/technologies/storage

tel.  734-821-5101
fax.  734-769-8938
cel.  734-216-5309
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] BlueFS spillover warning gone after upgrade to Quincy

2023-01-12 Thread Peter van Heusden
Hello everyone

I have a Ceph installation where some of the OSDs were misconfigured to use
1GB SSD partitions for rocksdb. This caused a spillover ("BlueFS *spillover*
detected"). I recently upgraded to quincy using cephadm (17.2.5) the
spillover warning vanished. This is
despite bluestore_warn_on_bluefs_spillover still being set to true.

Is there a way to investigate the current state of the DB to see if
spillover is, indeed, still happening?

Thank you,
Peter
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Mysterious Disk-Space Eater

2023-01-12 Thread Anthony D'Atri
One can even remove the log and tell the daemon to reopen it without having to 
restart.  I’ve had mons do enough weird things on me that I try to avoid 
restarting them.  ymmv.

It’s possible that the OP has a large file that’s unlinked but still open, 
historically “fsck -n” would find these, today that would depend on the 
filesystem in use.

It’s also possible that there is data under a mountpoint directory within /var, 
that’s masked by the overlaid mount.




http://cephnotes.ksperis.com/blog/2017/01/20/change-log-level-on-the-fly-to-ceph-daemons/

> On Jan 12, 2023, at 4:04 AM, E Taka <0eta...@gmail.com> wrote:
> 
> We had a similar problem, and it was a (visible) logfile. It is easy to
> find with the ncdu utility (`ncdu -x /var`). There's no need of a reboot,
> you can get rid of it with restarting the Monitor with `ceph orch daemon
> restart mon.NODENAME`. You may also lower the debug level.
> 
> Am Do., 12. Jan. 2023 um 09:14 Uhr schrieb Eneko Lacunza > :
> 
>> Hi,
>> 
>> El 12/1/23 a las 3:59, duluxoz escribió:
>>> Got a funny one, which I'm hoping someone can help us with.
>>> 
>>> We've got three identical(?) Ceph Quincy Nodes running on Rocky Linux
>>> 8.7. Each Node has 4 OSDs, plus Monitor, Manager, and iSCSI G/W
>>> services running on them (we're only a small shop). Each Node has a
>>> separate 16 GiB partition mounted as /var. Everything is running well
>>> and the Ceph Cluster is handling things very well).
>>> 
>>> However, one of the Nodes (not the one currently acting as the Active
>>> Manager) is running out of space on /var. Normally, all of the Nodes
>>> have around 10% space used (via a df -H command), but the problem Node
>>> only takes 1 to 3 days to run out of space, hence taking it out of
>>> Quorum. Its currently at 85% and growing.
>>> 
>>> At first we thought this was caused by an overly large log file, but
>>> investigations showed that all the logs on all 3 Nodes were of
>>> comparable size. Also, searching for the 20 largest files on the
>>> problem Node's /var didn't produce any significant results.
>>> 
>>> Coincidentally, unrelated to this issue, the problem Node (but not the
>>> other 2 Nodes) was re-booted a couple of days ago and, when the
>>> Cluster had re-balanced itself and everything was back online and
>>> reporting as Healthy, the problem Node's /var was back down to around
>>> 10%, the same as the other two Nodes.
>>> 
>>> This lead us to suspect that there was some sort of "run-away" process
>>> or journaling/logging/temporary file(s) or whatever that the re-boot
>>> has "cleaned up". So we've been keeping an eye on things but we can't
>>> see anything causing the issue and now, as I said above, the problem
>>> Node's /var is back up to 85% and growing.
>>> 
>>> I've been looking at the log files, tying to determine the issue, but
>>> as I don't really know what I'm looking for I don't even know if I'm
>>> looking in the *correct* log files...
>>> 
>>> Obviously rebooting the problem Node every couple of days is not a
>>> viable option, and increasing the size of the /var partition is only
>>> going to postpone the issue, not resolve it. So if anyone has any
>>> ideas we'd love to hear about it - thanks
>> 
>> This seems one or more files that are removed but some process has their
>> handle open (and maybe is still writing...). When rebooting process is
>> terminated and file(s) effectively removed.
>> 
>> Try to inspect each process' open files and find what file(s) have no
>> longer a directory entry... that would give you a hint.
>> 
>> Cheers
>> 
>> 
>> Eneko Lacunza
>> Zuzendari teknikoa | Director técnico
>> Binovo IT Human Project
>> 
>> Tel. +34 943 569 206 |https://www.binovo.es
>> Astigarragako Bidea, 2 - 2º izda. Oficina 10-11, 20180 Oiartzun
>> 
>> https://www.youtube.com/user/CANALBINOVO
>> https://www.linkedin.com/company/37269706/
>> ___
>> ceph-users mailing list -- ceph-users@ceph.io
>> To unsubscribe send an email to ceph-users-le...@ceph.io
>> 
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: BlueFS spillover warning gone after upgrade to Quincy

2023-01-12 Thread Eugen Block

Hi,

I usually look for this:

[ceph: root@storage01 /]# ceph daemon osd.0 perf dump bluefs | grep -E  
"db_|slow_"

"db_total_bytes": 21470642176,
"db_used_bytes": 179699712,
"slow_total_bytes": 0,
"slow_used_bytes": 0,

If you have spillover I would expect the "slow_bytes" values to be >  
0. Is it possible that the OSDs were compacted during/after the  
upgrade so the spillover would have been corrected (temporarily)? Do  
you know how much spillover you had before? And how big was the db  
when you had the warnings?
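
To check all OSDs at once, a small loop along these lines should also work (a
sketch, assuming your release supports perf dump via ceph tell; otherwise run
ceph daemon on each host):

for o in $(ceph osd ls); do
  echo -n "osd.$o: "
  ceph tell osd.$o perf dump | jq '.bluefs.slow_used_bytes'
done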


Regards,
Eugen

Zitat von Peter van Heusden :


Hello everyone

I have a Ceph installation where some of the OSDs were misconfigured to use
1GB SSD partitions for rocksdb. This caused a spillover ("BlueFS *spillover*
detected"). I recently upgraded to quincy using cephadm (17.2.5) the
spillover warning vanished. This is
despite bluestore_warn_on_bluefs_spillover still being set to true.

Is there a way to investigate the current state of the DB to see if
spillover is, indeed, still happening?

Thank you,
Peter
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Creating nfs RGW export makes nfs-gnaesha server in crash loop

2023-01-12 Thread Ruidong Gao
Hi Matt,

Thanks for the reply.

I did following as you suggested:
bash-4.4$ ceph auth get-or-create client.demouser mon 'allow r' osd 'allow rw 
pool=.nfs namespace=nfs4rgw, allow rw tag cephfs data=myfs' mds 'allow rw 
path=/bucketexport'
[client.demouser]
key = AQCZJ8BjDbqZKBAAQVQbGZ4EYATtENbMv6a/sA==

bash-4.4$ ceph nfs export create rgw --cluster-id nfs4rgw --pseudo-path 
/bucketexport --bucket testbk --user-id demouser

After this, the Ganesha server crashes in the same way.

I wonder where the problem lies in setting it up.

Ben
> 2023年1月12日 21:03,Matt Benjamin  写道:
> 
> Hi Ben,
> 
> The issue seems to be that you don't have a ceph keyring available to the 
> nfs-ganesha server.  The upstream doc talks about this.  The nfs-ganesha 
> runtime environment needs to be essentially identical to one (a pod, I guess) 
> that would run radosgw.
> 
> Matt
> 
> On Thu, Jan 12, 2023 at 7:27 AM Ruidong Gao  > wrote:
>> Hi,
>> 
>> This is running Quincy 17.2.5 deployed by rook on k8s. RGW nfs export will 
>> crash Ganesha server pod. CephFS export works just fine. Here are steps of 
>> it:
>> 1, create export:
>> bash-4.4$ ceph nfs export create rgw --cluster-id nfs4rgw --pseudo-path 
>> /bucketexport --bucket testbk
>> {
>> "bind": "/bucketexport",
>> "path": "testbk",
>> "cluster": "nfs4rgw",
>> "mode": "RW",
>> "squash": "none"
>> }
>> 
>> 2, check pods status afterwards:
>> rook-ceph-nfs-nfs1-a-679fdb795-82tcx  2/2 Running
>>  0  4h3m
>> rook-ceph-nfs-nfs4rgw-a-5c594d67dc-nlr42  1/2 Error  
>>  2  4h6m
>> 
>> 3, check failing pod’s logs:
>> 
>> 11/01/2023 08:11:53 : epoch 63be6f49 : 
>> rook-ceph-nfs-nfs4rgw-a-5c594d67dc-nlr42 : nfs-ganesha-1[main] 
>> nfs_start_grace :STATE :EVENT :NFS Server Now IN GRACE, duration 90
>> 11/01/2023 08:11:54 : epoch 63be6f49 : 
>> rook-ceph-nfs-nfs4rgw-a-5c594d67dc-nlr42 : nfs-ganesha-1[main] 
>> nfs_start_grace :STATE :EVENT :grace reload client info completed from 
>> backend
>> 11/01/2023 08:11:54 : epoch 63be6f49 : 
>> rook-ceph-nfs-nfs4rgw-a-5c594d67dc-nlr42 : nfs-ganesha-1[main] 
>> nfs_try_lift_grace :STATE :EVENT :check grace:reclaim complete(0) clid 
>> count(0)
>> 11/01/2023 08:11:57 : epoch 63be6f49 : 
>> rook-ceph-nfs-nfs4rgw-a-5c594d67dc-nlr42 : nfs-ganesha-1[main] 
>> nfs_lift_grace_locked :STATE :EVENT :NFS Server Now NOT IN GRACE
>> 11/01/2023 08:11:57 : epoch 63be6f49 : 
>> rook-ceph-nfs-nfs4rgw-a-5c594d67dc-nlr42 : nfs-ganesha-1[main] 
>> export_defaults_commit :CONFIG :INFO :Export Defaults now 
>> (options=03303002/0008   , ,,   ,
>>, ,,, expire=   0)
>> 2023-01-11T08:11:57.853+ 7f59dac7c200 -1 auth: unable to find a keyring 
>> on /var/lib/ceph/radosgw/ceph-admin/keyring: (2) No such file or directory
>> 2023-01-11T08:11:57.853+ 7f59dac7c200 -1 AuthRegistry(0x56476817a480) no 
>> keyring found at /var/lib/ceph/radosgw/ceph-admin/keyring, disabling cephx
>> 2023-01-11T08:11:57.855+ 7f59dac7c200 -1 auth: unable to find a keyring 
>> on /var/lib/ceph/radosgw/ceph-admin/keyring: (2) No such file or directory
>> 2023-01-11T08:11:57.855+ 7f59dac7c200 -1 AuthRegistry(0x7ffe4d092c90) no 
>> keyring found at /var/lib/ceph/radosgw/ceph-admin/keyring, disabling cephx
>> 2023-01-11T08:11:57.856+ 7f5987537700 -1 monclient(hunting): 
>> handle_auth_bad_method server allowed_methods [2] but i only support [1]
>> 2023-01-11T08:11:57.856+ 7f5986535700 -1 monclient(hunting): 
>> handle_auth_bad_method server allowed_methods [2] but i only support [1]
>> 2023-01-11T08:12:00.861+ 7f5986d36700 -1 monclient(hunting): 
>> handle_auth_bad_method server allowed_methods [2] but i only support [1]
>> 2023-01-11T08:12:00.861+ 7f59dac7c200 -1 monclient: authenticate NOTE: 
>> no keyring found; disabled cephx authentication
>> failed to fetch mon config (--no-mon-config to skip)
>> 
>> 4, delete the export:
>> ceph nfs export delete nfs4rgw /bucketexport
>> 
>> Ganesha servers go back normal:
>> rook-ceph-nfs-nfs1-a-679fdb795-82tcx  2/2 Running
>>  0  4h30m
>> rook-ceph-nfs-nfs4rgw-a-5c594d67dc-nlr42  2/2 Running
>>  10 4h33m
>> 
>> Any ideas to make it work?
>> 
>> Thanks
>> Ben
>> ___
>> ceph-users mailing list -- ceph-users@ceph.io 
>> To unsubscribe send an email to ceph-users-le...@ceph.io 
>> 
> 
> 
> -- 
> 
> Matt Benjamin
> Red Hat, Inc.
> 315 West Huron Street, Suite 140A
> Ann Arbor, Michigan 48103
> 
> http://www.redhat.com/en/technologies/storage
> 
> tel.  734-821-5101
> fax.  734-769-8938
> cel.  734-216-5309

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: pg mapping verification

2023-01-12 Thread Eugen Block

Hi,

I don't have an automation for that. I check whether a couple of random pg 
mappings meet my requirements, usually I do that directly with 
the output of crushtool. Here's one example from a small test cluster  
with three different rooms in the crushmap:


# test cluster (note that the columns may differ between ceph versions  
when using awk as I did here)

storage01:~ # ceph pg ls-by-pool  | awk '{print $15}'
ACTING
[8,13,5]p8
[22,4,13]p22
[28,22,26]p28
[21,5,1]p21
[20,34,27]p20
[...]

for i in {20,34,27}; do ceph osd find $i | grep room; done
"room": "room2",
"room": "room3",
"room": "room1",

For this rule I have a room resiliency requirement so I grep for the  
room of each acting set.
The output of crushtool is helpful if you don't want to inject a new  
osdmap into a production cluster. Just one example:


crushtool -i crushmap.bin --test --rule 5 --show-mappings --num-rep 6 | head
CRUSH rule 5 x 0 [19,7,13,22,16,28]
CRUSH rule 5 x 1 [21,3,15,31,19,7]
[...]
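
If you want to script the check over all PGs instead of sampling, something
along these lines should work (a sketch; the pool name and the failure-domain
key, here "room", are placeholders, and the exact JSON paths may differ
slightly between releases):

pool=<pool>
ceph pg ls-by-pool $pool -f json | jq -r '.pg_stats[] | [.pgid, (.acting | join(" "))] | @tsv' |
while read pgid osds; do
  rooms=$(for o in $osds; do ceph osd find $o | jq -r '.crush_location.room'; done | sort -u | wc -l)
  echo "$pgid acting [$osds] spans $rooms distinct rooms"
done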

Regards,
Eugen

Zitat von Christopher Durham :


Hi,
For a given crush rule and pool that uses it, how can I verify that 
the pgs in that pool follow the rule? I have a requirement to 
'prove' that the pgs are mapping correctly.

I see: https://pypi.org/project/crush/
This allows me to read in a crushmap file that I could then use to  
verify a pg with some scripting, but this PyPI package is very old and seems 
not to have been maintained or updated since 2017.
I am sure there is a way, using osdmaptool or something else, but it  
is not obvious. Before I spend a lot of time searching, I thought I 
would ask here.

Basically, having a list of pgs like this:
[[1,2,3,4,5],[2,3,4,5,6],...]
Given a read-in crushmap and a specific rule therein, I want to  
verify that all pgs in my list are consistent with the rule specified.

Let me know if there is a proper way to do this, and thanks.
-Chris


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: OSD crash on Onode::put

2023-01-12 Thread Igor Fedotov

Hi Frank,

IMO all the below logic is a bit of overkill and no one can provide 100% 
valid guidance on specific numbers atm. Generally I agree with 
Dongdong's point that a crash is effectively an OSD restart and hence there is not 
much sense in performing such a restart manually - well, the rationale 
might be to do it gracefully and avoid some potential issues though...


Anyway I'd rather recommend doing periodic(!) manual OSD restarts, e.g. on 
a daily basis at off-peak hours, instead of using tricks with mempool 
stats analysis.
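
A minimal sketch of such a rolling restart (assuming a cephadm-managed cluster;
the HEALTH_OK check is simplistic, in practice you may want to wait for all PGs
to be active+clean instead):

for o in $(ceph osd ls); do
  ceph orch daemon restart osd.$o
  # wait for the cluster to settle before touching the next OSD
  sleep 60
  while ! ceph health | grep -q HEALTH_OK; do sleep 30; done
done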



Thanks,

Igor


On 1/10/2023 1:15 PM, Frank Schilder wrote:

Hi Dongdong and Igor,

thanks for pointing to this issue. I guess if it's a memory leak issue (well, 
cache pool trim issue), checking for some indicator and an OSD restart should 
be a work-around? Dongdong promised a work-around but talks only about a patch 
(fix).

Looking at the tracker items, my conclusion is that unusually low values of 
.mempool.by_pool.bluestore_cache_onode.items of an OSD might be such an 
indicator. I just run a very simple check on all our OSDs:

for o in $(ceph osd ls); do n_onode="$(ceph tell "osd.$o" dump_mempools | jq 
".mempool.by_pool.bluestore_cache_onode.items")"; echo -n "$o: "; ((n_onode<10)) && echo "$n_onode"; 
done; echo ""

and found 2 with seemingly very unusual values:

: 3098
1112: 7403

Comparing two OSDs with same disk on the same host gives:

# ceph daemon osd. dump_mempools | jq 
".mempool.by_pool.bluestore_cache_onode.items,.mempool.by_pool.bluestore_cache_onode.bytes,.mempool.by_pool.bluestore_cache_other.items,.mempool.by_pool.bluestore_cache_other.bytes"
3200
1971200
260924
900303680

# ceph daemon osd.1030 dump_mempools | jq 
".mempool.by_pool.bluestore_cache_onode.items,.mempool.by_pool.bluestore_cache_onode.bytes,.mempool.by_pool.bluestore_cache_other.items,.mempool.by_pool.bluestore_cache_other.bytes"
60281
37133096
8908591
255862680

OSD  does look somewhat bad. Shortly after restarting this OSD I get

# ceph daemon osd. dump_mempools | jq 
".mempool.by_pool.bluestore_cache_onode.items,.mempool.by_pool.bluestore_cache_onode.bytes,.mempool.by_pool.bluestore_cache_other.items,.mempool.by_pool.bluestore_cache_other.bytes"
20775
12797400
803582
24017100

So, the above procedure seems to work and, yes, there seems to be a leak of 
items in cache_other that pushes other pools down to 0. There seem to be 2 
useful indicators:

- very low .mempool.by_pool.bluestore_cache_onode.items
- very high 
.mempool.by_pool.bluestore_cache_other.bytes/.mempool.by_pool.bluestore_cache_other.items

Here a command to get both numbers with OSD ID in an awk-friendly format:

for o in $(ceph osd ls); do printf "%6d %8d %7.2f\n" "$o" $(ceph tell "osd.$o" 
dump_mempools | jq 
".mempool.by_pool.bluestore_cache_onode.items,.mempool.by_pool.bluestore_cache_other.bytes/.mempool.by_pool.bluestore_cache_other.items");
 done

Pipe it to a file and do things like:

awk '$2<5 || $3>200' FILE

For example, I still get:

# awk '$2<5 || $3>200' cache_onode.txt
   109249225   43.74
   109346193   43.70
   109847550   43.47
   110148873   43.34
   110248008   43.31
   110348152   43.29
   110549235   43.59
   110746694   43.35
   110948511   43.08
   111314612  739.46
   111413199  693.76
   111645300  205.70

flagging 3 more outliers.

Would it be possible to provide a bit of guidance to everyone about when to 
consider restarting an OSD? What values of the above variables are critical and 
what are tolerable? Of course a proper fix would be better, but I doubt that 
everyone is willing to apply a patch. Therefore, some guidance on how to 
mitigate this problem to acceptable levels might be useful. I'm thinking here 
how few onode items are acceptable before performance drops painfully.

Best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14


From: Igor Fedotov
Sent: 09 January 2023 13:34:42
To: Dongdong Tao;ceph-users@ceph.io
Cc:d...@ceph.io
Subject: [ceph-users] Re: OSD crash on Onode::put

Hi Dongdong,

thanks a lot for your post, it's really helpful.


Thanks,

Igor

On 1/5/2023 6:12 AM, Dongdong Tao wrote:

I see many users recently reporting that they have been struggling
with this Onode::put race condition issue[1] on both the latest
Octopus and pacific.
Igor opened a PR [2]  to address this issue, I've reviewed it
carefully, and looks good to me. I'm hoping this could get some
priority from the community.

For those who had been hitting this issue, I would like to share a
workaround that could unblock you:

During the investigation of this issue, I found this race condition
always happens after the bluestore onode cache size becomes 0.
Setting debug_bluestore = 1/30 will allow you to see the cache size
after the crash:
---
2022-10-25T00:47:26.562+ 7f424f78e700 30
bluestore.MempoolThread(0x564a9dae2a68) _resize_shards
max_shard_onodes: 0 max_shard_buffer: 8

[ceph-users] CephFS: Questions regarding Namespaces, Subvolumes and Mirroring

2023-01-12 Thread Jonas Schwab

Dear everyone,

I have several questions regarding CephFS connected to Namespaces, 
Subvolumes and snapshot Mirroring:


*1. How to display/create namespaces used for isolating subvolumes?*
    I have created multiple subvolumes with the option 
--namespace-isolated, so I was expecting to see the namespaces returned from

ceph fs subvolume info  
    also returned by
rbd namespace ls  --format=json
    But the latter command just returns an empty list. Are the 
namespaces used for rbd and CephFS different ones?


*2. Can CephFS Snapshot mirroring also be applied to subvolumes?*
    I tried this, but without success. Is there something to take into 
account rather than just mirroring the directory, or is it just not 
possible right now?


*3. Can xattr for namespaces and pools also be mirrored?*
    Or more specifically, is there a way to preserve the namespace and 
pool layout of mirrored directories?


Thank you for your help!

Best regards,
Jonas
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] rbd-mirror ceph quincy Not able to find rbd_mirror_journal_max_fetch_bytes config in rbd mirror

2023-01-12 Thread ankit raikwar
Hello All,
In Ceph Quincy I am not able to find the rbd_mirror_journal_max_fetch_bytes 
config option for rbd-mirror.
I configured a Ceph cluster of almost 400 TB and enabled rbd-mirror. In the 
initial stage I was able to achieve a speed of almost 9 GB, but after the 
rebalance of all the images completed, the rbd-mirror speed automatically 
dropped to between 4 and 5 mbps.
On the primary cluster we are continuously writing 50 to 400 mbps of data, but 
the replication speed we get is only 4 to 5 mbps, even though we have 10 Gbps 
of replication network bandwidth.
   

Note: I also tried to find the option rbd_mirror_journal_max_fetch_bytes, but 
I'm not able to find this option in the configuration. Also, when I try to set 
it from the command line, it shows an error like:

command:
 ceph config set client.rbd rbd_mirror_journal_max_fetch_bytes 33554432

error:
Error EINVAL: unrecognized config option 'rbd_mirror_journal_max_fetch_bytes'

cluster version
ceph version 17.2.5 (98318ae89f1a893a6ded3a640405cdbb33e08757) quincy (stable)

Please suggest an alternative way to configure this option, or how I can 
improve the replication network speed.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: CephFS: Questions regarding Namespaces, Subvolumes and Mirroring

2023-01-12 Thread Robert Sander

On 12.01.23 17:13, Jonas Schwab wrote:


rbd namespace ls  --format=json
      But the latter command just returns an empty list. Are the
namespaces used for rdb and CephFS different ones?


RBD and CephFS are different interfaces. You would need to use rados to 
list all objects and their namespaces. I have not found a way to only 
list namespaces of a pool.


root@cephtest20:~# rados -p .nfs --all ls
nfs01   rec-0049:nfs.nfs01.0
nfs01   export-1
nfs01   rec-002a:nfs.nfs01.1
nfs01   conf-nfs.nfs01
nfs01   rec-0049:nfs.nfs01.1
nfs01   rec-0013:nfs.nfs01.1
nfs01   grace
nfs01   rec-0008:nfs.nfs01.1
nfs01   rec-0049:nfs.nfs01.2
nfs01   rec-0029:nfs.nfs01.2
nfs01   rec-0010:nfs.nfs01.1
nfs01   rec-0011:nfs.nfs01.2
nfs01   rec-0009:nfs.nfs01.0
root@cephtest20:~# rbd namespace ls .nfs
root@cephtest20:~#

Where "nfs01" is a namespace in the pool .nfs

Regards
--
Robert Sander
Heinlein Consulting GmbH
Schwedter Str. 8/9b, 10119 Berlin

https://www.heinlein-support.de

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Amtsgericht Berlin-Charlottenburg - HRB 220009 B
Geschäftsführer: Peer Heinlein - Sitz: Berlin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] OSD upgrade problem nautilus->octopus - snap_mapper upgrade stuck

2023-01-12 Thread Jan Pekař - Imatic

Hi all,

I have a problem upgrading from Nautilus to Octopus on my OSD.

Upgrading the mon and mgr was OK, but the first OSD is stuck on

2023-01-12T09:25:54.122+0100 7f49ff3eae00  1 osd.0 126556 init upgrade 
snap_mapper (first start as octopus)

and there was no activity after that for more than 48 hours. No disk activity.

I restarted OSD many times and nothing changed.

It is an old filestore OSD based on an XFS filesystem. Is the upgrade to snap mapper 2 reliable? What is the OSD waiting for? Can I start the OSD without 
the upgrade and get the cluster healthy with the old snap structure? Or should I skip the Octopus upgrade and go to Pacific directly (is some bug backport 
missing?).


Thank you for your help, I'm sending some logs below.

Log shows

2023-01-09T19:12:49.471+0100 7f41f60f1e00  0 ceph version 15.2.17 (694d03a6f6c6e9f814446223549caf9a9f60dba0) octopus (stable), process 
ceph-osd, pid 2566563

2023-01-09T19:12:49.471+0100 7f41f60f1e00  0 pidfile_write: ignore empty 
--pid-file
2023-01-09T19:12:49.499+0100 7f41f60f1e00 -1 missing 'type' file, inferring 
filestore from current/ dir
2023-01-09T19:12:49.531+0100 7f41f60f1e00  0 starting osd.0 osd_data 
/var/lib/ceph/osd/ceph-0 /var/lib/ceph/osd/ceph-0/journal
2023-01-09T19:12:49.531+0100 7f41f60f1e00 -1 Falling back to public interface
2023-01-09T19:12:49.871+0100 7f41f60f1e00  0 load: jerasure load: lrc load: isa
2023-01-09T19:12:49.875+0100 7f41f60f1e00  0 
filestore(/var/lib/ceph/osd/ceph-0) backend xfs (magic 0x58465342)
2023-01-09T19:12:49.883+0100 7f41f60f1e00  0 osd.0:0.OSDShard using op scheduler ClassedOpQueueScheduler(queue=WeightedPriorityQueue, 
cutoff=196)
2023-01-09T19:12:49.883+0100 7f41f60f1e00  0 osd.0:1.OSDShard using op scheduler ClassedOpQueueScheduler(queue=WeightedPriorityQueue, 
cutoff=196)
2023-01-09T19:12:49.883+0100 7f41f60f1e00  0 osd.0:2.OSDShard using op scheduler ClassedOpQueueScheduler(queue=WeightedPriorityQueue, 
cutoff=196)
2023-01-09T19:12:49.883+0100 7f41f60f1e00  0 osd.0:3.OSDShard using op scheduler ClassedOpQueueScheduler(queue=WeightedPriorityQueue, 
cutoff=196)
2023-01-09T19:12:49.883+0100 7f41f60f1e00  0 osd.0:4.OSDShard using op scheduler ClassedOpQueueScheduler(queue=WeightedPriorityQueue, 
cutoff=196)

2023-01-09T19:12:49.883+0100 7f41f60f1e00  0 
filestore(/var/lib/ceph/osd/ceph-0) backend xfs (magic 0x58465342)
2023-01-09T19:12:49.927+0100 7f41f60f1e00  0 genericfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_features: FIEMAP ioctl is disabled via 
'filestore fiemap' config option
2023-01-09T19:12:49.927+0100 7f41f60f1e00  0 genericfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_features: SEEK_DATA/SEEK_HOLE is 
disabled via 'filestore seek data hole' config option
2023-01-09T19:12:49.927+0100 7f41f60f1e00  0 genericfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_features: splice() is disabled via 
'filestore splice' config option
2023-01-09T19:12:49.983+0100 7f41f60f1e00  0 genericfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_features: syncfs(2) syscall fully 
supported (by glibc and kernel)

2023-01-09T19:12:49.983+0100 7f41f60f1e00  0 
xfsfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_feature: extsize is 
disabled by conf
2023-01-09T19:12:50.015+0100 7f41f60f1e00  0 
filestore(/var/lib/ceph/osd/ceph-0) start omap initiation
2023-01-09T19:12:50.079+0100 7f41f60f1e00  1 leveldb: Recovering log #165531
2023-01-09T19:12:50.083+0100 7f41f60f1e00  1 leveldb: Level-0 table #165533: 
started
2023-01-09T19:12:50.235+0100 7f41f60f1e00  1 leveldb: Level-0 table #165533: 
1598 bytes OK
2023-01-09T19:12:50.583+0100 7f41f60f1e00  1 leveldb: Delete type=0 #165531

2023-01-09T19:12:50.615+0100 7f41f60f1e00  1 leveldb: Delete type=3 #165529

2023-01-09T19:12:51.339+0100 7f41f60f1e00  0 filestore(/var/lib/ceph/osd/ceph-0) mount(1861): enabling WRITEAHEAD journal mode: checkpoint 
is not enabled
2023-01-09T19:12:51.379+0100 7f41f60f1e00  1 journal _open /var/lib/ceph/osd/ceph-0/journal fd 35: 2998927360 bytes, block size 4096 bytes, 
directio = 1, aio = 1

2023-01-09T19:12:51.931+0100 7f41f60f1e00 -1 journal do_read_entry(243675136): 
bad header magic
2023-01-09T19:12:51.939+0100 7f41f60f1e00  1 journal _open /var/lib/ceph/osd/ceph-0/journal fd 35: 2998927360 bytes, block size 4096 bytes, 
directio = 1, aio = 1

2023-01-09T19:12:51.943+0100 7f41f60f1e00  1 
filestore(/var/lib/ceph/osd/ceph-0) upgrade(1466)
2023-01-09T19:12:52.015+0100 7f41f60f1e00  1 osd.0 126556 init upgrade 
snap_mapper (first start as octopus)

lsof shows

COMMAND PID USER   FD  TYPE DEVICE SIZE/OFF   NODE NAME
ceph-osd 225860 ceph  cwd   DIR  9,127 4096  2 /
ceph-osd 225860 ceph  rtd   DIR  9,127 4096  2 /
ceph-osd 225860 ceph  txt   REG  9,127 31762544   5021 
/usr/bin/ceph-osd
ceph-osd 225860 ceph  mem   REG   8,70  2147237 68104224 
/var/lib/ceph/osd/ceph-0/current/omap/165546.ldb
ceph-osd 225860 ceph  mem   REG   8,70  2147792 68104190 
/var/lib/ceph/osd/ce

[ceph-users] CephFS: Questions regarding Namespaces, Subvolumes and Mirroring

2023-01-12 Thread Jonas Schwab

Dear everyone,

I have several questions regarding CephFS connected to Namespaces, 
Subvolumes and snapshot Mirroring:


*1. How to display/create namespaces used for isolating subvolumes?*
    I have created multiple subvolumes with the option 
--namespace-isolated, so I was expecting to see the namespaces returned from

ceph fs subvolume info  
    also returned by
rbd namespace ls  --format=json
    But the latter command just returns an empty list. Are the 
namespaces used for rbd and CephFS different ones?


*2. Can CephFS Snapshot mirroring also be applied to subvolumes?*
    I tried this, but without success. Is there something to take into 
account rather than just mirroring the directory, or is it just not 
possible right now?


*3. Can xattr for namespaces and pools also be mirrored?*
    Or more specifically, is there a way to preserve the namespace and 
pool layout of mirrored directories?


Thank you for your help!

Best regards,
Jonas

PS: You could receive this mail twice, sine this email address somehow 
got removed from the ceph-users list.

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: BlueFS spillover warning gone after upgrade to Quincy

2023-01-12 Thread Fox, Kevin M
If you have prometheus enabled, the metrics should be in there I think?
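
If so, a query along these lines should show per-OSD spillover (metric names
come from the mgr prometheus module and may vary slightly between releases):

ceph_bluefs_slow_used_bytes > 0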

Thanks,
Kevin


From: Peter van Heusden 
Sent: Thursday, January 12, 2023 6:12 AM
To: ceph-users@ceph.io
Subject: [ceph-users] BlueFS spillover warning gone after upgrade to Quincy

Check twice before you click! This email originated from outside PNNL.


Hello everyone

I have a Ceph installation where some of the OSDs were misconfigured to use
1GB SSD partitions for rocksdb. This caused a spillover ("BlueFS *spillover*
detected"). I recently upgraded to quincy using cephadm (17.2.5) the
spillover warning vanished. This is
despite bluestore_warn_on_bluefs_spillover still being set to true.

Is there a way to investigate the current state of the DB to see if
spillover is, indeed, still happening?

Thank you,
Peter
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Laggy PGs on a fairly high performance cluster

2023-01-12 Thread Matthew Stroud
We have a 14 osd node all ssd cluster and for some reason we are continually 
getting laggy PGs and those seem to correlate to slow requests on Quincy 
(doesn't seem to happen on our Pacific clusters). These laggy pgs seem to shift 
between osds. The network seems solid, as in I'm not seeing errors or slowness. 
OSD hosts are heavily underutilized, normally sub 1 load and the cpus are 98% 
idle. I have been looking through the logs and nothing is really standing out 
in the OSD or ceph logs.

Some things we have tried:

  1.  Updating our cluster to 17.2.5
  2.  Manually setting our mClock profile to high_client_ops.
  3.  Increasing our total number of PGs (this is something that should've 
happened anyway.)
  4.  Verified that jumbo frames, lacp, and throughput were functioning as 
intended.
  5.  Took some of our newer nodes out to see if that was an issue. Also 
rebooted the cluster just to be sure.

I'm curious if someone in the community has experience with this kind of issue 
and maybe could point to something I have overlooked.

Some example logs:

2023-01-10T22:50:23.245823+ mgr.openstack-mon01.b.pc.ostk.com.flbudm 
(mgr.120371640) 231175 : cluster [DBG] pgmap v235204: 2625 pgs: 1 
active+clean+laggy, 2624 active+clean; 6.0 TiB data, 18 TiB used, 84 TiB
 / 102 TiB avail; 19 MiB/s rd, 67 MiB/s wr, 4.76k op/s
2023-01-10T22:50:23.762562+ osd.83 (osd.83) 906 : cluster [WRN] 6 slow 
requests (by type [ 'delayed' : 5 'waiting for sub ops' : 1 ] most affected 
pool [ 'vms' : 6 ])
2023-01-10T22:50:24.771260+ osd.83 (osd.83) 907 : cluster [WRN] 6 slow 
requests (by type [ 'delayed' : 5 'waiting for sub ops' : 1 ] most affected 
pool [ 'vms' : 6 ])




CONFIDENTIALITY NOTICE: This message is intended only for the use and review of 
the individual or entity to which it is addressed and may contain information 
that is privileged and confidential. If the reader of this message is not the 
intended recipient, or the employee or agent responsible for delivering the 
message solely to the intended recipient, you are hereby notified that any 
dissemination, distribution or copying of this communication is strictly 
prohibited. If you have received this communication in error, please notify 
sender immediately by telephone or return email. Thank you.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] OSDs failed to start after host reboot | Cephadm

2023-01-12 Thread Ben Meinhart
Hello all!

Linked stackoverflow post: 
https://stackoverflow.com/questions/75101087/cephadm-ceph-osd-fails-to-start-after-reboot-of-host
 


A couple of weeks ago I deployed a new Ceph cluster using cephadm. It is a 
three-node cluster (node1, node2, & node3) with 6 OSDs each: 6x18TB Seagate 
hard drives with a 2TB NVMe drive set as a DB device. Everything had been 
running smoothly until today, when I went to perform maintenance on one of the 
nodes. I first moved all of the services off the host and put it into 
maintenance mode. I then made some changes to one of the NICs and ran 
updates. After the updates were done, I rebooted the machine. This is when the 
issue occurred.

When the node (node1) finished rebooting, it was still showing as offline in 
the Ceph Dashboard, so from one of the hosts I ran `ceph orch host rescan node1` 
and it came back online in the dashboard. I've seen this before when I've had 
to reboot hosts, so no big deal so far.

However, after a couple of minutes had passed, the OSDs on that host still 
hadn't come online. I then checked the status of the services with 
`systemctl | grep ceph` and saw that all of the OSDs had failed. 
# systemctl status ceph-0a7ec2ae-816d-11ed-9791-97c1d8fb9dc6@osd.0.service
× ceph-0a7ec2ae-816d-11ed-9791-97c1d8fb9dc6@osd.0.service - Ceph osd.0 for 
0a7ec2ae-816d-11ed-9791-97c1d8fb9dc6
 Loaded: loaded 
(/etc/systemd/system/ceph-0a7ec2ae-816d-11ed-9791-97c1d8fb9dc6@.service; 
enabled; vendor preset: enabled)
 Active: failed (Result: exit-code) since Thu 2023-01-12 18:14:27 UTC; 1h 
42min ago
   Main PID: 385982 (code=exited, status=1/FAILURE)
CPU: 292ms

Jan 12 19:48:30 node1 systemd[1]: 
/etc/systemd/system/ceph-0a7ec2ae-816d-11ed-9791-97c1d8fb9dc6@.service:24: Unit 
configured to use KillMode=none. This is unsafe, as it disables systemd's 
process lifecycle management for the service. Please update your service to use 
a safer Kill

The unit had hit its restart counter limit, so I had to run `systemctl 
reset-failed`, and then I tried restarting the OSDs by running `systemctl 
restart ceph.target`. I watched the service try to start, but it kept failing. 

This was the output of /var/log/ceph//ceph-osd.0.log:
2023-01-12T18:12:06.501+ 7fb5d3b1e3c0  0 set uid:gid to 167:167 (ceph:ceph)
2023-01-12T18:12:06.501+ 7fb5d3b1e3c0  0 ceph version 17.2.5 
(98318ae89f1a893a6ded3a640405cdbb33e08757) quincy (stable), process ceph-osd, 
pid 7
2023-01-12T18:12:06.501+ 7fb5d3b1e3c0  0 pidfile_write: ignore empty 
--pid-file
2023-01-12T18:12:06.505+ 7fb5d3b1e3c0  1 bdev(0x5591e1f87400 
/var/lib/ceph/osd/ceph-0/block) open path /var/lib/ceph/osd/ceph-0/block
2023-01-12T18:12:06.505+ 7fb5d3b1e3c0  1 bdev(0x5591e1f87400 
/var/lib/ceph/osd/ceph-0/block) open size 2584761344 (0x1230bfc0, 18 
TiB) block_size 4096 (4 KiB) rotational discard not supported
2023-01-12T18:12:06.505+ 7fb5d3b1e3c0  1 
bluestore(/var/lib/ceph/osd/ceph-0) _set_cache_sizes cache_size 1073741824 meta 
0.45 kv 0.45 data 0.06
2023-01-12T18:12:06.505+ 7fb5d3b1e3c0  1 bdev(0x5591e1f86c00 
/var/lib/ceph/osd/ceph-0/block.db) open path /var/lib/ceph/osd/ceph-0/block.db
2023-01-12T18:12:06.505+ 7fb5d3b1e3c0  1 bdev(0x5591e1f86c00 
/var/lib/ceph/osd/ceph-0/block.db) open size 96836352 (0x4da000, 310 
GiB) block_size 4096 (4 KiB) non-rotational discard supported
2023-01-12T18:12:06.505+ 7fb5d3b1e3c0  1 bluefs add_block_device bdev 1 
path /var/lib/ceph/osd/ceph-0/block.db size 310 GiB
2023-01-12T18:12:06.513+ 7fb5d3b1e3c0  1 bdev(0x5591e1f86800 
/var/lib/ceph/osd/ceph-0/block) open path /var/lib/ceph/osd/ceph-0/block
2023-01-12T18:12:06.513+ 7fb5d3b1e3c0  1 bdev(0x5591e1f86800 
/var/lib/ceph/osd/ceph-0/block) open size 2584761344 (0x1230bfc0, 18 
TiB) block_size 4096 (4 KiB) rotational discard not supported
2023-01-12T18:12:06.513+ 7fb5d3b1e3c0  1 bluefs add_block_device bdev 2 
path /var/lib/ceph/osd/ceph-0/block size 18 TiB
2023-01-12T18:12:06.513+ 7fb5d3b1e3c0  1 bdev(0x5591e1f86c00 
/var/lib/ceph/osd/ceph-0/block.db) close
2023-01-12T18:12:06.817+ 7fb5d3b1e3c0  1 bdev(0x5591e1f86800 
/var/lib/ceph/osd/ceph-0/block) close
2023-01-12T18:12:07.085+ 7fb5d3b1e3c0  1 bdev(0x5591e1f87400 
/var/lib/ceph/osd/ceph-0/block) close
2023-01-12T18:12:07.305+ 7fb5d3b1e3c0  0 starting osd.0 osd_data 
/var/lib/ceph/osd/ceph-0 /var/lib/ceph/osd/ceph-0/journal
2023-01-12T18:12:07.321+ 7fb5d3b1e3c0  0 load: jerasure load: lrc 
2023-01-12T18:12:07.321+ 7fb5d3b1e3c0  1 bdev(0x5591e2d8e000 
/var/lib/ceph/osd/ceph-0/block) open path /var/lib/ceph/osd/ceph-0/block
2023-01-12T18:12:07.321+ 7fb5d3b1e3c0 -1 bdev(0x5591e2d8e000 
/var/lib/ceph/osd/ceph-0/block) open open got: (13) Permission denied
2023-01-12T18:12:07.321+ 7fb5d3b1e3c0  1 bdev(0x5591e2d8e000 
/var/lib/ceph/osd/ceph-0/block) open path /var/lib/ceph/osd/ceph-0/block
2023-01-12T18:12
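
The "(13) Permission denied" on the block device, together with the 
"set uid:gid to 167:167 (ceph:ceph)" line above, makes me suspect the 
ownership of the underlying device node reverted after the reboot. A rough 
sketch of the checks that seem worth running on node1 (the paths assume the 
standard cephadm layout and the fsid from the unit name above, so adjust as 
needed):

# the OSD's block symlink should resolve to a device owned by 167:167 (ceph:ceph)
ls -lL /var/lib/ceph/0a7ec2ae-816d-11ed-9791-97c1d8fb9dc6/osd.0/block

# if it came back as root:root, re-own the device node and restart the unit
chown 167:167 "$(readlink -f /var/lib/ceph/0a7ec2ae-816d-11ed-9791-97c1d8fb9dc6/osd.0/block)"
systemctl restart ceph-0a7ec2ae-816d-11ed-9791-97c1d8fb9dc6@osd.0.service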

[ceph-users] Re: BlueFS spillover warning gone after upgrade to Quincy

2023-01-12 Thread Peter van Heusden
Thanks. The command definitely shows "slow_bytes":

"db_total_bytes": 1073733632,
"db_used_bytes": 240123904,
"slow_total_bytes": 4000681103360,
"slow_used_bytes": 8355381248,

So I am not sure why the warnings are no longer appearing.
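
If that means the spillover is in fact still happening, my understanding is 
that the usual remedies are a compaction as a stopgap and, properly, growing 
the DB device. A rough sketch, assuming cephadm-managed OSDs and that each 
OSD is stopped while ceph-bluestore-tool runs against it (osd.0 is just an 
example; I would test on a single OSD first):

# stopgap: online compaction sometimes pulls metadata back onto the DB device,
# though with a 1GB partition it is unlikely to be enough on its own
ceph tell osd.0 compact

# proper fix: enlarge the DB partition/LV, stop osd.0, then from inside
# "cephadm shell --name osd.0" grow BlueFS into the new space
ceph-bluestore-tool bluefs-bdev-expand --path /var/lib/ceph/osd/ceph-0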

Peter

On Thu, 12 Jan 2023 at 17:41, Eugen Block  wrote:

> Hi,
>
> I usually look for this:
>
> [ceph: root@storage01 /]# ceph daemon osd.0 perf dump bluefs | grep -E
> "db_|slow_"
>  "db_total_bytes": 21470642176,
>  "db_used_bytes": 179699712,
>  "slow_total_bytes": 0,
>  "slow_used_bytes": 0,
>
> If you have spillover I would expect the "slow_bytes" values to be >
> 0. Is it possible that the OSDs were compacted during/after the
> upgrade so the spillover would have been corrected (temporarily)? Do
> you know how much spillover you had before? And how big was the db
> when you had the warnings?
>
> Regards,
> Eugen
>
> Quoting Peter van Heusden :
>
> > Hello everyone
> >
> > I have a Ceph installation where some of the OSDs were misconfigured to
> use
> > 1GB SSD partitions for rocksdb. This caused a spillover ("BlueFS
> *spillover*
> > detected"). I recently upgraded to quincy using cephadm (17.2.5) the
> > spillover warning vanished. This is
> > despite bluestore_warn_on_bluefs_spillover still being set to true.
> >
> > Is there a way to investigate the current state of the DB to see if
> > spillover is, indeed, still happening?
> >
> > Thank you,
> > Peter
> > ___
> > ceph-users mailing list -- ceph-users@ceph.io
> > To unsubscribe send an email to ceph-users-le...@ceph.io
>
>
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: BlueFS spillover warning gone after upgrade to Quincy

2023-01-12 Thread Benoît Knecht
Hi Peter,

On Thursday, January 12th, 2023 at 15:12, Peter van Heusden  
wrote:
> I have a Ceph installation where some of the OSDs were misconfigured to use
> 1GB SSD partitions for rocksdb. This caused a spillover ("BlueFS spillover
> detected"). I recently upgraded to quincy using cephadm (17.2.5) the
> spillover warning vanished. This is
> despite bluestore_warn_on_bluefs_spillover still being set to true.

I noticed this on Pacific as well, and I think it's due to this commit:
https://github.com/ceph/ceph/commit/d17cd6604b4031ca997deddc5440248aff451269.
It removes the logic that would normally update the spillover health check, so
it never triggers anymore.

As others mentioned, you can get the relevant metrics from Prometheus and set up
alerts there instead. But it does make me wonder how many people might have
spillover in their clusters and not even realize it, since there's no warning by
default.
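
For example, a minimal alerting rule (a sketch only: the metric and label 
names below, and the rules file location, are assumptions and can differ 
between releases and between the mgr prometheus module and ceph-exporter):

# write a rule that fires whenever any OSD reports bytes on its slow device
cat > /etc/prometheus/rules.d/ceph-bluefs-spillover.yml <<'EOF'
groups:
  - name: ceph-bluefs
    rules:
      - alert: CephBlueFSSpillover
        expr: ceph_bluefs_slow_used_bytes > 0
        for: 15m
        labels:
          severity: warning
        annotations:
          summary: "BlueFS spillover detected on {{ $labels.ceph_daemon }}"
EOF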

Cheers,

-- 
Ben
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io