[ceph-users] Re: filesystem became read only after Quincy upgrade

2022-11-24 Thread Adrien Georget
Hi Xiubo, We did the upgrade in rolling mode as always, with only a few Kubernetes pods as clients accessing their PVCs on CephFS. I can reproduce the problem every time I restart the MDS daemon. You can find the MDS log with debug_mds 25 and debug_ms 1 here: https://filesender.renater.fr/?s=dow
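
(For readers following along: a minimal sketch of how such MDS debug levels are typically raised and later removed with the standard ceph CLI; exact daemon targeting depends on the deployment.)

    # raise MDS verbosity before restarting the daemon, as in the log referenced above
    ceph config set mds debug_mds 25
    ceph config set mds debug_ms 1
    # ... restart the MDS, reproduce the issue, collect the log ...
    # then remove the overrides again
    ceph config rm mds debug_mds
    ceph config rm mds debug_ms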

[ceph-users] Best practice taking cluster down

2022-11-24 Thread Dominique Ramaekers
Hi, We are going to do some maintenance on our power grid. I'll need to put my ceph cluster down. My cluster is a simple three-node cluster. Before shutting down the systems, I'll take down all virtual machines and other services that depend on the cluster storage. Is it sufficient if I set the 'n
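
(A rough sketch of the OSD flag sequence commonly recommended before a planned full-cluster shutdown; treat it as illustrative rather than authoritative and check the documentation linked later in this thread.)

    # after quiescing client I/O, set the cluster-wide flags
    ceph osd set noout
    ceph osd set norebalance
    ceph osd set norecover
    ceph osd set nobackfill
    ceph osd set nodown
    ceph osd set pause
    # once power is back and all daemons are up, unset them again (ceph osd unset ...)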

[ceph-users] SSE-KMS vs SSE-S3 with per-object-data-keys

2022-11-24 Thread Stefan Schueffler
Hi, I very much appreciate the recently added SSE-S3 encryption in radosgw. As far as I know, this encryption works very similarly to the "original" design in Amazon S3: - it uses a per-bucket master key (used solely to encrypt the data-keys), stored in rgw_crypt_sse_s3_vault_prefix. - and it creates
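
(For context, a hedged sketch of the rgw-side options involved in SSE-S3 with Vault; the option names mirror the SSE-KMS ones and may vary by release, and the Vault address and prefix below are placeholders.)

    # per-RGW configuration for SSE-S3 backed by a Vault transit engine (example values)
    ceph config set client.rgw rgw_crypt_sse_s3_backend vault
    ceph config set client.rgw rgw_crypt_sse_s3_vault_auth token
    ceph config set client.rgw rgw_crypt_sse_s3_vault_addr http://vault.example:8200
    ceph config set client.rgw rgw_crypt_sse_s3_vault_prefix /v1/transit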

[ceph-users] rook 1.10.6 problem with rgw

2022-11-24 Thread Oğuz Yarımtepe
Hi, I am not sure whether others are having problems with the latest rook version: CephObjectStore: failed to commit RGW configuration period changes (helm install) · Issue #11333 · rook/rook (github.com). I would like to know whether there is any workaround for it?

[ceph-users] Re: Issues during Nautilus Pacific upgrade

2022-11-24 Thread Ana Aviles
On 11/23/22 19:49, Marc wrote: We would like to share our experience upgrading one of our clusters from Nautilus (14.2.22-1bionic) to Pacific (16.2.10-1bionic) a few weeks ago. To start with, we had to convert our monitors' databases to rocksdb in Weirdly I have just one monitor db in leveldb st
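
(A quick way to check which backend each monitor is on, assuming the default non-containerized data path of a Nautilus-era install; purely illustrative.)

    # on each monitor host; prints "rocksdb" or "leveldb"
    cat /var/lib/ceph/mon/ceph-$(hostname -s)/kv_backend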

[ceph-users] Re: Ceph cluster shutdown procedure

2022-11-24 Thread Steven Goodliff
Hi, Thanks Eugen, I found some similar docs on the Red Hat site as well and made an Ansible playbook to follow the steps. Cheers On Thu, 17 Nov 2022 at 13:28, Steven Goodliff wrote: > Hi, > > Is there a recommended way of shutting a cephadm cluster down completely? > > I tried using cephadm to
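
(A sketch of the per-host stop sequence such a playbook typically wraps for a cephadm deployment; the host ordering and flag handling are assumptions taken from the docs referenced in this thread.)

    # after setting the OSD flags and stopping clients, run on each host
    # (client/gateway hosts first, OSD hosts next, monitor hosts last):
    sudo systemctl stop ceph.target
    # power the node off once all ceph daemons have stopped
    sudo shutdown -h now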

[ceph-users] Persistent Bucket Notification performance

2022-11-24 Thread Steven Goodliff
Hi, I'm really struggling with persistent bucket notifications running 17.2.3. I can't get much more than 600 notifications a second, but when changing to async I see higher rates, using the following metric: sum(rate(ceph_rgw_pubsub_push_ok[$__rate_interval])). I believe this is mainly down to
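
(For readers unfamiliar with the two modes: persistency is a per-topic attribute set at topic creation time via the SNS API; the endpoints below are placeholders and the exact attribute syntax should be checked against the bucket-notification docs for your release.)

    # persistent topic: notifications are queued in the rgw log pool and pushed asynchronously
    aws --endpoint-url http://rgw.example:8080 sns create-topic --name=mytopic \
        --attributes='{"push-endpoint": "http://receiver.example:10900", "persistent": "true"}'
    # omitting "persistent" (or setting it to "false") gives synchronous delivery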

[ceph-users] Re: Best practice taking cluster down

2022-11-24 Thread Murilo Morais
Hi Dominique! On this list, there was recently a thread discussing the same subject. [1] You can follow SUSE's recommendations and it's a success! [2] Have a good day! [1] https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/QN4GUPPZ5IZYLQ4PD4KV737L5M6DJ4CI/ [2] https://documentation.

[ceph-users] Re: Persistent Bucket Notification performance

2022-11-24 Thread Yuval Lifshitz
Hi Steven, When using synchronous (=non-persistent) notifications, the overall rate depends on the latency between the RGW and the endpoint to which you are sending the notifications. The protocols used for sending the notifications (kafka/amqp) use batches and are usually very efficient. How

[ceph-users] Re: Persistent Bucket Notification performance

2022-11-24 Thread Steven Goodliff
Hi, Thanks for the quick response. I have the notifications going to an http endpoint running on one of the RGW machines, so the latency is as low as I can make it for both methods. If the limiting factor is at the rados layer, my only tuning options are to put the rgw log pool on the fastest media
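
(One hedged way to do that with a device-class CRUSH rule, assuming the default zone's log pool name; repointing the pool moves data, so time it carefully.)

    # create a replicated rule restricted to the ssd device class, then repoint the log pool
    ceph osd crush rule create-replicated replicated-ssd default host ssd
    ceph osd pool set default.rgw.log crush_rule replicated-ssd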

[ceph-users] Clean prometheus files in /var/lib/ceph

2022-11-24 Thread Mevludin Blazevic
Hi all, on my ceph admin machine, a lot of large files are produced by prometheus, e.g.: ./var/lib/ceph/8c774934-1535-11ec-973e-525400130e4f/prometheus.cephadm/data/wal/00026165 ./var/lib/ceph/8c774934-1535-11ec-973e-525400130e4f/prometheus.cephadm/data/wal/00026166 ./var/lib/ceph/8c774934-153
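
(Those WAL segments belong to the cephadm-deployed Prometheus instance and are pruned by Prometheus itself according to its retention settings; a quick way to see what is actually consuming the space, reusing the fsid path from the example above.)

    # size of the embedded Prometheus TSDB (WAL + blocks)
    du -sh /var/lib/ceph/8c774934-1535-11ec-973e-525400130e4f/prometheus.*/data
    # check which retention flags the prometheus container was started with
    ps aux | grep -- '--storage.tsdb.retention'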

[ceph-users] Re: Configuring rgw connection timeouts

2022-11-24 Thread Thilo-Alexander Ginkel
Hi Kevin, all, I tried what you suggested, but AFAICS (and judging from the error message) supplying these config parameters via the RGW service spec is not supported right now. Applying it causes an error: Error EINVAL: Invalid config option request_timeout_ms in spec The spec looks like this
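
(A possible workaround sketch while the spec path is unsupported: set the beast frontend option directly through the config database instead; the config section and values below are placeholders and depend on how your RGW daemons are named.)

    # applies to all RGW daemons; adjust the port to match your deployment
    ceph config set client.rgw rgw_frontends "beast port=8080 request_timeout_ms=65000"
    # restart the rgw daemons for the frontend change to take effect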

[ceph-users] Re: failure resharding radosgw bucket

2022-11-24 Thread Jan Horstmann
On Wed, 2022-11-23 at 12:57 -0500, Casey Bodley wrote: > hi Jan, > > On Wed, Nov 23, 2022 at 12:45 PM Jan Horstmann > wrote: > > > > Hi list, > > I am completely lost trying to reshard a radosgw bucket which fails > > with the error: > > > > process_single_logshard: Error during resharding buc
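
(For reference, the usual manual resharding commands that the background log-shard process wraps; bucket name and shard count are placeholders, and they will not help until the underlying error is resolved.)

    # queue a reshard and run it manually instead of waiting for the background thread
    radosgw-admin reshard add --bucket <bucket> --num-shards <n>
    radosgw-admin reshard list
    radosgw-admin reshard process
    radosgw-admin reshard status --bucket <bucket>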

[ceph-users] Upgrade 16.2.10 to 17.2.x: any caveats?

2022-11-24 Thread Zakhar Kirpichenko
Hi! I'm planning a service window to make some network upgrades, and would like to use the same window to upgrade our Ceph cluster from 16.2.10 to the latest 17.2.x available on that date. The cluster is a fairly simple 6-node setup with a mix of NVME (WAL/DB) and HDD (block) drives, several repl
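
(Assuming a cephadm-managed cluster, the upgrade itself is usually just the following; the version is a placeholder for whatever 17.2.x is current at the time.)

    ceph orch upgrade start --ceph-version 17.2.5
    # watch progress
    ceph orch upgrade status
    ceph -W cephadm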

[ceph-users] Re: filesystem became read only after Quincy upgrade

2022-11-24 Thread Xiubo Li
Hi Adrien, Thank you for your logs. From your logs I found one bug, and I have raised a new tracker [1] to follow it and a ceph PR [2] to fix it. For more detail, please see my analysis in the tracker [1]. [1] https://tracker.ceph.com/issues/58082 [2] https://github.com/ceph/ceph/pull/49048

[ceph-users] Re: Upgrade 16.2.10 to 17.2.x: any caveats?

2022-11-24 Thread Zakhar Kirpichenko
Thanks, Stefan. I've read this very well. My question is whether there's anything not covered by the available documentation that we should be aware of. /Z On Fri, 25 Nov 2022 at 09:11, Stefan Kooman wrote: > On 11/24/22 18:53, Zakhar Kirpichenko wrote: > > Hi! > > > > I'm planning a service w