[ceph-users] How to repair the OSDs while WAL/DB device breaks down

2023-03-14 Thread Norman
Hi everyone, I have a question about repairing a broken WAL/DB device. I have a cluster with 8 OSDs and 4 WAL/DB devices (1 OSD per WAL/DB device). How can I repair the OSDs quickly if one WAL/DB device breaks down, without rebuilding them? Thanks.
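
A minimal sketch, not from the thread, of how one might first scope which OSDs depend on the failed device; the OSD ID is a placeholder:

```
# Show which block.db device a given OSD uses (osd.3 is a placeholder)
ceph osd metadata 3 | grep -E 'bluefs_db|devices'

# On the OSD host, show the full LVM layout, including which OSDs
# point at the same shared DB device
ceph-volume lvm list
```

Note that if the DB device itself is lost, the RocksDB metadata stored on it is gone with it, so as far as I know the affected OSDs usually have to be recreated and backfilled; the commands above only help scope the damage.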

[ceph-users] Re: 10x more used space than expected

2023-03-14 Thread Richard Bade
Hi, I found the documentation for metadata get unhelpful about what syntax to use. I eventually found that it's this: radosgw-admin metadata get bucket:{bucket_name} or radosgw-admin metadata get bucket.instance:{bucket_name}:{instance_id} Hopefully that helps you or someone else struggling wi
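
For readability, the two command forms from this message as a block; the bucket name and instance ID are placeholders:

```
# Get the current metadata record for a bucket by name
radosgw-admin metadata get bucket:mybucket

# Get a specific bucket instance; the instance ID comes from the
# record above or from `radosgw-admin bucket stats`
radosgw-admin metadata get bucket.instance:mybucket:a1b2c3d4.12345.1
```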

[ceph-users] Re: 10x more used space than expected

2023-03-14 Thread Gaël THEROND
Thanks a lot for this spreadsheet, I’ll check that, but I doubt we store data smaller than the min_alloc size. Yes, we do use an EC pool of type 2+1 with failure_domain at the host level. On Tue, Mar 14, 2023 at 19:38, Mark Nelson wrote: > Is it possible that you are storing objects (chunks

[ceph-users] Re: 10x more used space than expected

2023-03-14 Thread Mark Nelson
Is it possible that you are storing objects (chunks, if EC) that are smaller than the min_alloc size? This cheat sheet might help: https://docs.google.com/spreadsheets/d/1rpGfScgG-GLoIGMJWDixEkqs-On9w8nAUToPQjN8bDI/edit?usp=sharing Mark On 3/14/23 12:34, Gaël THEROND wrote: Hi everyone, I’ve g
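
To make the effect concrete, a back-of-the-envelope calculation, assuming the pre-Pacific HDD default of bluestore_min_alloc_size_hdd = 64 KiB and the 2+1 EC profile mentioned in this thread:

```
# A 16 KiB object in a 2+1 EC pool is split into two 8 KiB data
# chunks plus one 8 KiB coding chunk. Each chunk is rounded up to
# min_alloc_size on its OSD:
#   allocated = 3 chunks * 64 KiB = 192 KiB
#   stored    = 16 KiB
#   overhead  = 192 / 16 = 12x
# instead of the nominal 1.5x for 2+1 EC, so many small objects can
# easily produce a ~10x gap between logical and raw usage.
```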

[ceph-users] Re: CephFS thrashing through the page cache

2023-03-14 Thread Ashu Pachauri
Got the answer to my own question; posting here in case someone else encounters the same problem. The issue is that the default stripe size in a CephFS mount is 4 MB. If you are doing small reads (like the 4k reads in the test I posted) inside the file, you'll end up pulling at least 4 MB to the client (and
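
A sketch of how to inspect and change a file's layout on a CephFS mount; the paths are placeholders, and layouts can only be set while a file is still empty:

```
# Show the layout (stripe unit, stripe count, object size) of a file
getfattr -n ceph.file.layout /mnt/cephfs/somefile

# For a new, still-empty file, shrink the object size so small random
# reads pull less data to the client (1 MiB here, as an example)
touch /mnt/cephfs/newfile
setfattr -n ceph.file.layout.object_size -v 1048576 /mnt/cephfs/newfile
```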

[ceph-users] Re: 10x more used space than expected

2023-03-14 Thread Gaël THEROND
Alright, it seems something is odd out there. If I do a radosgw-admin metadata list I get the following list: [ "bucket", "bucket.instance", "otp", "user" ] BUT when I try a radosgw-admin metadata get bucket or bucket.instance it complains with the following error: ERROR: can’t

[ceph-users] Re: 10x more used space than expected

2023-03-14 Thread Robin H. Johnson
On Tue, Mar 14, 2023 at 06:59:51PM +0100, Gaël THEROND wrote: > Versioning wasn’t enabled, at least not explicitly, and according to the > documentation it isn’t enabled by default. > > Using Nautilus. > > I’ll get all the missing information tomorrow morning, thanks > for the help! > > Is the

[ceph-users] Re: 10x more used space than expected

2023-03-14 Thread Gaël THEROND
Versioning wasn’t enabled, at least not explicitly, and according to the documentation it isn’t enabled by default. Using Nautilus. I’ll get all the missing information tomorrow morning, thanks for the help! Is there a way to tell Ceph to delete versions that aren’t the currently used one with rados
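
If versioning did turn out to be enabled after all, one standard way to purge noncurrent versions is an S3 lifecycle rule rather than radosgw-admin; a sketch using the AWS CLI, where the endpoint and bucket name are placeholders:

```
aws --endpoint-url http://rgw.example.com s3api \
    put-bucket-lifecycle-configuration --bucket mybucket \
    --lifecycle-configuration '{
      "Rules": [{
        "ID": "purge-noncurrent",
        "Status": "Enabled",
        "Filter": {"Prefix": ""},
        "NoncurrentVersionExpiration": {"NoncurrentDays": 1}
      }]
    }'
```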

[ceph-users] Re: 10x more used space than expected

2023-03-14 Thread Robin H. Johnson
On Tue, Mar 14, 2023 at 06:34:54PM +0100, Gaël THEROND wrote: > Hi everyone, I’ve got a quick question regarding one of our RadosGW buckets. > > This bucket is used to store docker registries, and the total amount of > data we use is supposed to be 4.5 TB, BUT it looks like Ceph tells us we > actually u

[ceph-users] 10x more used space than expected

2023-03-14 Thread Gaël THEROND
Hi everyone, I’ve got a quick question regarding one of our RadosGW buckets. This bucket is used to store docker registries, and the total amount of data we use is supposed to be 4.5 TB, BUT it looks like Ceph tells us we actually use ~53 TB of data. One interesting thing is that this bucket seems to shard
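
A sketch of the usual first diagnostics for this kind of discrepancy; the bucket name is a placeholder:

```
# Compare logical object count/size with what the cluster reports
radosgw-admin bucket stats --bucket=docker-registry

# Incomplete multipart uploads are a classic cause of bloat for
# docker registries; so are stale bucket instances left behind by
# dynamic resharding on Nautilus
radosgw-admin metadata list bucket.instance | grep docker-registry
radosgw-admin reshard stale-instances list
```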

[ceph-users] Last day to sponsor Cephalocon Amsterdam 2023

2023-03-14 Thread Mike Perez
Hi everyone, Today is the last day to sponsor Cephalocon Amsterdam 2023! I want to thank our current sponsors: Platinum: IBM Silver: 42on, Canonical Ubuntu, Clyso Startup: Koor Also, thank you to Clyso for their lanyard add-on and 42on's offsite attendee party. We are still short of covering th

[ceph-users] Re: pg wait too long when osd restart

2023-03-14 Thread yite gu
Hello Baergen, Thanks for your reply, I got it. ☺ Best regards Yitte Gu On Mon, Mar 13, 2023 at 23:15, Josh Baergen wrote: > (trimming out the dev list and Radoslaw's email) > > Hello, > > I think the two critical PRs were: > * https://github.com/ceph/ceph/pull/44585 - included in 15.2.16 > * https://github.c

[ceph-users] Re: Upgrade 16.2.11 -> 17.2.0 failed

2023-03-14 Thread Robert Sander
On 14.03.23 14:21, bbk wrote: > # ceph orch upgrade start --ceph-version 17.2.0 I would never recommend updating to a .0 release. Why not go directly to the latest 17.2.5? Regards -- Robert Sander Heinlein Consulting GmbH Schwedter Str. 8/9b, 10119 Berlin https://www.heinlein-support.de Te
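
Following that advice, the command from this thread pointed at the newer release would presumably be:

```
ceph orch upgrade start --ceph-version 17.2.5
```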

[ceph-users] Re: Upgrade 16.2.11 -> 17.2.0 failed

2023-03-14 Thread Adam King
That's very odd, I haven't seen this before. What container image is the upgraded mgr running on? (To know for sure, you can check the podman/docker run command at the end of the /var/lib/ceph/<fsid>/mgr.<name>/unit.run file on the mgr's host.) Also, you could try "ceph mgr module enable cephadm" to see if it doe
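
A sketch of the checks described above; <fsid> and <name> stand in for the values elided in the archive:

```
# On the mgr host: which image is the upgraded mgr actually running?
grep -E 'podman|docker' /var/lib/ceph/<fsid>/mgr.<name>/unit.run

# Try re-enabling the cephadm orchestrator module
ceph mgr module enable cephadm
```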

[ceph-users] Upgrade 16.2.11 -> 17.2.0 failed

2023-03-14 Thread bbk
Dear List, Today I was successfully upgrading with cephadm from 16.2.8 -> 16.2.9 -> 16.2.10 -> 16.2.11. Now I wanted to upgrade to 17.2.0, but after starting the upgrade with ``` # ceph orch upgrade start --ceph-version 17.2.0 ``` the orch manager module seems to be gone now and the upgrade doesn't
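
Not from the original mail, but a few basic checks that still work through the monitors even while the mgr orchestrator module is unavailable:

```
# Which daemon versions are actually running cluster-wide?
ceph versions

# Overall health, including mgr module errors
ceph health detail

# Which mgr modules are enabled/loaded?
ceph mgr module ls
```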

[ceph-users] Re: rbd on EC pool with fast and extremely slow writes/reads

2023-03-14 Thread Rok Jaklič
Once or twice a year we have a similar problem in a *non*-Ceph disk cluster, where working but slow disk writes give us slow reads. We somewhat "understand" it, since slow writes probably fill up queues and buffers. On Thu, Mar 9, 2023 at 11:37 AM Andrej Filipcic wrote: > > Thanks for the hint

[ceph-users] handle_read_frame_preamble_main read frame preamble failed r=-1 ((1) Operation not permitted)

2023-03-14 Thread Arvid Picciani
Since Quincy I'm randomly getting authentication issues from clients to OSDs. The symptom is that QEMU hangs, and when it happens I can reproduce it using: > ceph tell osd.\* version Some - but only some - OSDs will never respond, and only to clients on _some_ hosts. The client gets stuck in a loop w

[ceph-users] Re: Mixed mode ssd and hdd issue

2023-03-14 Thread Eneko Lacunza
Hi, We need more info to be able to help you. What CPU and network do the nodes have? What model of SSD? Cheers On 13/3/23 at 16:27, xadhoo...@gmail.com wrote: Hi, we have a cluster with 3 nodes. Each node has 4 HDDs and 1 SSD. We would like to have a pool only on SSD and a pool only on HDD, usin
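
The usual way to split pools like this is CRUSH device classes; a minimal sketch assuming the default ssd/hdd class names and replicated pools, with pool names and PG counts as placeholders:

```
# Device classes are normally detected automatically; verify with:
ceph osd tree

# One CRUSH rule per device class
ceph osd crush rule create-replicated replicated_ssd default host ssd
ceph osd crush rule create-replicated replicated_hdd default host hdd

# Pin one pool to each rule
ceph osd pool create pool-ssd 64 64 replicated replicated_ssd
ceph osd pool create pool-hdd 64 64 replicated replicated_hdd
```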

[ceph-users] Re: upgrading from 15.2.17 to 16.2.11 - Health ERROR

2023-03-14 Thread Alessandro Bolgia
cephadm is 16.2.11, because the error comes from the upgrade from 15 to 16. On Mon, Mar 13, 2023 at 18:27, Clyso GmbH - Ceph Foundation Member wrote: > which version of cephadm are you using? > > Clyso GmbH - Ceph Foundation Member > > Am 10.0