[ceph-users] Re: Mysterious Disk-Space Eater

2023-01-12 Thread Eneko Lacunza
Hi, On 12/1/23 at 3:59, duluxoz wrote: Got a funny one, which I'm hoping someone can help us with. We've got three identical(?) Ceph Quincy Nodes running on Rocky Linux 8.7. Each Node has 4 OSDs, plus Monitor, Manager, and iSCSI G/W services running on them (we're only a small shop). Ea

[ceph-users] Re: Mysterious Disk-Space Eater

2023-01-12 Thread E Taka
We had a similar problem, and it was a (visible) logfile. It is easy to find with the ncdu utility (`ncdu -x /var`). There's no need for a reboot; you can get rid of it by restarting the Monitor with `ceph orch daemon restart mon.NODENAME`. You may also lower the debug level. On Thu., 12 Jan. 202
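A minimal sketch of the commands referred to above, assuming a cephadm-managed cluster; NODENAME is a placeholder and the debug settings are only examples of lowering verbosity:

    ncdu -x /var                            # locate the oversized file under /var
    ceph orch daemon restart mon.NODENAME   # restart the monitor so it reopens its log
    ceph config set mon debug_mon 1/5       # example: return mon debug logging to a low level
    ceph config set mon debug_paxos 1/5     # example: same for another chatty subsystem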

[ceph-users] Re: Removing OSDs - draining but never completes.

2023-01-12 Thread E Taka
You have to wait until the rebalancing has finished. On Tue., 10 Jan. 2023 at 17:14, Wyll Ingersoll <wyllys.ingers...@keepertech.com> wrote: > Running ceph-pacific 16.2.9 using ceph orchestrator. > > We made a mistake adding a disk to the cluster and immediately issued a > command to remove it
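A hedged sketch of how one might watch the drain and rebalance progress under the orchestrator; osd.12 is a hypothetical ID:

    ceph -s                          # overall recovery / rebalance progress
    ceph orch osd rm status          # state of pending OSD removals (draining PGs)
    ceph osd safe-to-destroy osd.12  # reports when the OSD can go without risking data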

[ceph-users] Re: Current min_alloc_size of OSD?

2023-01-12 Thread Robert Sander
On 11.01.23 23:47, Anthony D'Atri wrote: It’s printed in the OSD log at startup. But which info is it exactly? This line looks like it reports the block_size of the device: bdev(0x55b50a2e5800 /var/lib/ceph/osd/ceph-0/block) open size 107369988096 (0x18ffc0, 100 GiB) block_size 4096 (4

[ceph-users] Re: Current min_alloc_size of OSD?

2023-01-12 Thread Gerdriaan Mulder
Hi, On 12/01/2023 10.26, Robert Sander wrote: Is it this line?   bluestore(/var/lib/ceph/osd/ceph-0) _open_super_meta min_alloc_size 0x1000 That seems to be it: https://github.com/ceph/ceph/blob/v15.2.17/src/os/bluestore/BlueStore.cc#L11754-L11755 A few lines later it should state the sa
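A minimal sketch of pulling that line out of an OSD's startup output; the fsid, OSD id and log path are placeholders that depend on how the OSD was deployed:

    # cephadm / systemd-managed OSD
    journalctl -u ceph-<fsid>@osd.0 | grep min_alloc_size
    # classic package install writing to /var/log/ceph
    grep min_alloc_size /var/log/ceph/ceph-osd.0.log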

[ceph-users] Re: [solved] Current min_alloc_size of OSD?

2023-01-12 Thread Robert Sander
Hi, On 12.01.23 11:11, Gerdriaan Mulder wrote: On 12/01/2023 10.26, Robert Sander wrote: Is it this line?   bluestore(/var/lib/ceph/osd/ceph-0) _open_super_meta min_alloc_size 0x1000 That seems to be it: https://github.com/ceph/ceph/blob/v15.2.17/src/os/bluestore/BlueStore.cc#L11754-L1175

[ceph-users] Re: Ceph Octopus rbd images stuck in trash

2023-01-12 Thread Eugen Block
Hi, just wondering if you're looking in the right pool(s)? The default pool is "rbd", are those images you listed from the "rbd" pool? Do you use an alias for the "rbd" command? If that's not it maybe increase rbd client debug logs to see where it goes wrong. From time to time I also have t
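A hedged sketch of the checks suggested above; <pool> is a placeholder for whichever pool actually holds the images:

    ceph osd lspools                           # confirm which pools exist
    rbd trash ls --pool <pool> --all --long    # trashed images, including source and deferment time
    rbd --debug-rbd 20 trash ls --pool <pool>  # same listing with verbose client-side debug output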

[ceph-users] Creating nfs RGW export makes nfs-ganesha server in crash loop

2023-01-12 Thread Ruidong Gao
Hi, This is running Quincy 17.2.5 deployed by Rook on k8s. An RGW NFS export crashes the Ganesha server pod; a CephFS export works just fine. Here are the steps: 1. create export: bash-4.4$ ceph nfs export create rgw --cluster-id nfs4rgw --pseudo-path /bucketexport --bucket testbk { "bind": "/bu
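For reference, a sketch of inspecting the export and the crashing pod, reusing the cluster-id and paths from the post; the kubectl namespace and deployment name are guesses for a typical Rook layout:

    ceph nfs export ls nfs4rgw                                  # exports known to this NFS cluster
    ceph nfs export info nfs4rgw /bucketexport                  # full export definition
    kubectl -n rook-ceph logs deploy/rook-ceph-nfs-nfs4rgw-a    # hypothetical Ganesha deployment name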

[ceph-users] Re: iscsi target lun error

2023-01-12 Thread Frédéric Nass
Hi Xiubo, Randy, This is due to 'host.containers.internal' being added to the container's /etc/hosts since Podman 4.1+. The workaround consists of either downgrading the Podman package to v4.0 (on RHEL8, dnf downgrade podman-4.0.2-6.module+el8.6.0+14877+f643d2d6) or adding the --no-hosts option t
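A sketch of the two workarounds, assuming a cephadm-style layout; the fsid and gateway daemon name in the path are placeholders:

    dnf downgrade podman-4.0.2-6.module+el8.6.0+14877+f643d2d6       # option 1, RHEL 8
    # option 2: add --no-hosts to the container invocation of the affected daemon
    grep -n 'podman run' /var/lib/ceph/<fsid>/iscsi.<gw-name>/unit.run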

[ceph-users] Re: Creating nfs RGW export makes nfs-ganesha server in crash loop

2023-01-12 Thread Matt Benjamin
Hi Ben, The issue seems to be that you don't have a ceph keyring available to the nfs-ganesha server. The upstream doc talks about this. The nfs-ganesha runtime environment needs to be essentially identical to one (a pod, I guess) that would run radosgw. Matt On Thu, Jan 12, 2023 at 7:27 AM Ru

[ceph-users] BlueFS spillover warning gone after upgrade to Quincy

2023-01-12 Thread Peter van Heusden
Hello everyone, I have a Ceph installation where some of the OSDs were misconfigured to use 1GB SSD partitions for RocksDB. This caused a spillover ("BlueFS *spillover* detected"). I recently upgraded to Quincy using cephadm (17.2.5) and the spillover warning vanished. This is despite bluestore_warn_on

[ceph-users] Re: Mysterious Disk-Space Eater

2023-01-12 Thread Anthony D'Atri
One can even remove the log and tell the daemon to reopen it without having to restart. I’ve had mons do enough weird things on me that I try to avoid restarting them. ymmv. It’s possible that the OP has a large file that’s unlinked but still open, historically “fsck -n” would find these, tod
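A minimal sketch of both ideas, assuming the mon's admin socket is reachable; <id> is a placeholder:

    lsof -nP +L1 | grep '(deleted)'   # files that are unlinked but still held open
    ceph daemon mon.<id> log reopen   # have the mon reopen its log after the file is removed/truncated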

[ceph-users] Re: BlueFS spillover warning gone after upgrade to Quincy

2023-01-12 Thread Eugen Block
Hi, I usually look for this: [ceph: root@storage01 /]# ceph daemon osd.0 perf dump bluefs | grep -E "db_|slow_" "db_total_bytes": 21470642176, "db_used_bytes": 179699712, "slow_total_bytes": 0, "slow_used_bytes": 0, If you have spillover I would expect the "sl

[ceph-users] Re: Creating nfs RGW export makes nfs-ganesha server in crash loop

2023-01-12 Thread Ruidong Gao
Hi Matt, Thanks for the reply. I did the following as you suggested: bash-4.4$ ceph auth get-or-create client.demouser mon 'allow r' osd 'allow rw pool=.nfs namespace=nfs4rgw, allow rw tag cephfs data=myfs' mds 'allow rw path=/bucketexport' [client.demouser] key = AQCZJ8BjDbqZKBAAQVQbGZ4EYA

[ceph-users] Re: pg mapping verification

2023-01-12 Thread Eugen Block
Hi, I don't have an automation for that. I test a couple of random pg mappings to see if they meet my requirements; usually I do that directly with the output of crushtool. Here's one example from a small test cluster with three different rooms in the crushmap: # test cluster (note that the colu
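A hedged sketch of driving crushtool directly against the live crushmap; the rule number and PG range are only examples:

    ceph osd getcrushmap -o crushmap.bin
    crushtool -i crushmap.bin --test --rule 0 --num-rep 3 --show-mappings --min-x 0 --max-x 9
    crushtool -i crushmap.bin --test --rule 0 --num-rep 3 --show-statistics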

[ceph-users] Re: OSD crash on Onode::put

2023-01-12 Thread Igor Fedotov
Hi Frank, IMO all the below logic is a bit of overkill and no one can provide 100% valid guidance on specific numbers atm. Generally I agree with Dongdong's point that a crash is effectively an OSD restart, and hence there is not much sense in performing such a restart manually - well, the rationale might be

[ceph-users] CephFS: Questions regarding Namespaces, Subvolumes and Mirroring

2023-01-12 Thread Jonas Schwab
Dear everyone, I have several questions regarding CephFS connected to Namespaces, Subvolumes and snapshot Mirroring: *1. How to display/create namespaces used for isolating subvolumes?* I have created multiple subvolumes with the option --namespace-isolated, so I was expecting to see the

[ceph-users] rbd-mirror ceph quincy Not able to find rbd_mirror_journal_max_fetch_bytes config in rbd mirror

2023-01-12 Thread ankit raikwar
Hello All, In Ceph Quincy I am not able to find the rbd_mirror_journal_max_fetch_bytes config in rbd-mirror. I configured the Ceph cluster (almost 400 TB) and enabled rbd-mirror; in the starting stage I was able to achieve almost 9 GB speed, but after the rebalance completed
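A sketch of how one might check whether that option is known to the running release; the admin socket path is a guess based on common naming:

    ceph config ls | grep -i rbd_mirror_journal
    ceph daemon /var/run/ceph/ceph-client.rbd-mirror.*.asok config show | grep -i journal_max_fetch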

[ceph-users] Re: CephFS: Questions regarding Namespaces, Subvolumes and Mirroring

2023-01-12 Thread Robert Sander
On 12.01.23 17:13, Jonas Schwab wrote: rbd namespace ls --format=json But the latter command just returns an empty list. Are the namespaces used for rbd and CephFS different ones? RBD and CephFS are different interfaces. You would need to use rados to list all objects and their namespa
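A minimal sketch of listing the RADOS namespaces behind a CephFS data pool; the pool, volume and subvolume names are placeholders, and the exact `--all` output format may vary slightly by release:

    rados -p cephfs.myfs.data ls --all | cut -f1 | sort -u         # namespace column of all objects
    ceph fs subvolume info myfs <subvolume> | grep pool_namespace  # namespace assigned to an isolated subvolume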

[ceph-users] OSD upgrade problem nautilus->octopus - snap_mapper upgrade stuck

2023-01-12 Thread Jan Pekař - Imatic
Hi all, I have a problem upgrading Nautilus to Octopus on my OSDs. Upgrading mon and mgr was OK, but the first OSD got stuck on 2023-01-12T09:25:54.122+0100 7f49ff3eae00  1 osd.0 126556 init upgrade snap_mapper (first start as octopus) and there was no activity after that for more than 48 hours. No disk a

[ceph-users] Re: BlueFS spillover warning gone after upgrade to Quincy

2023-01-12 Thread Fox, Kevin M
If you have prometheus enabled, the metrics should be in there I think? Thanks, Kevin From: Peter van Heusden Sent: Thursday, January 12, 2023 6:12 AM To: ceph-users@ceph.io Subject: [ceph-users] BlueFS spillover warning gone after upgrade to Quincy Chec
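If the mgr prometheus module is enabled, something like the following PromQL should surface spillover; the metric name is assumed from the exported bluefs perf counters:

    ceph_bluefs_slow_used_bytes > 0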

[ceph-users] Laggy PGs on a fairly high performance cluster

2023-01-12 Thread Matthew Stroud
We have a 14 osd node all ssd cluster and for some reason we are continually getting laggy PGs and those seem to correlate to slow requests on Quincy (doesn't seem to happen on our Pacific clusters). These laggy pgs seem to shift between osds. The network seems solid, as in I'm not seeing errors
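Some hedged starting points for narrowing this down; osd.<id> is a placeholder for one of the OSDs currently holding laggy PGs:

    ceph health detail | grep -iE 'laggy|slow'
    ceph daemon osd.<id> dump_ops_in_flight      # what is currently stuck and where
    ceph daemon osd.<id> dump_historic_slow_ops  # recent slow ops with per-stage timings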

[ceph-users] OSDs failed to start after host reboot | Cephadm

2023-01-12 Thread Ben Meinhart
Hello all! Linked stackoverflow post: https://stackoverflow.com/questions/75101087/cephadm-ceph-osd-fails-to-start-after-reboot-of-host A couple of weeks ago I deployed a new Ceph cluster using
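A minimal sketch of the usual first checks on the affected host, assuming a cephadm deployment; the fsid and OSD id are placeholders:

    cephadm ls                             # what cephadm believes is deployed on this host
    cephadm ceph-volume lvm list           # are the backing LVs/devices still visible?
    journalctl -u ceph-<fsid>@osd.<id> -b  # unit logs since the reboot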

[ceph-users] Re: BlueFS spillover warning gone after upgrade to Quincy

2023-01-12 Thread Peter van Heusden
Thanks. The command definitely shows "slow_bytes": "db_total_bytes": 1073733632, "db_used_bytes": 240123904, "slow_total_bytes": 4000681103360, "slow_used_bytes": 8355381248, So I am not sure why the warnings are no longer appearing. Peter On Thu, 12 Jan 2023 at
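A hedged way to check whether the health warning is still wired up in that release; if the first command reports an unknown option, that alone would explain the missing alert:

    ceph config get osd bluestore_warn_on_bluefs_spillover
    ceph daemon osd.<id> perf dump bluefs | grep -E 'slow_used_bytes|slow_total_bytes'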

[ceph-users] Re: BlueFS spillover warning gone after upgrade to Quincy

2023-01-12 Thread Benoît Knecht
Hi Peter, On Thursday, January 12th, 2023 at 15:12, Peter van Heusden wrote: > I have a Ceph installation where some of the OSDs were misconfigured to use > 1GB SSD partitions for rocksdb. This caused a spillover ("BlueFS spillover > detected"). I recently upgraded to quincy using cephadm (17.2.