[ceph-users] Mysterious Disk-Space Eater

2023-01-11 Thread duluxoz
Hi All, Got a funny one, which I'm hoping someone can help us with. We've got three identical(?) Ceph Quincy Nodes running on Rocky Linux 8.7. Each Node has 4 OSDs, plus Monitor, Manager, and iSCSI G/W services running on them (we're only a small shop). Each Node has a separate 16 GiB partiti

[ceph-users] Re: Current min_alloc_size of OSD?

2023-01-11 Thread Anthony D'Atri
It’s printed in the OSD log at startup. I don’t immediately see it in `ceph osd metadata`; arguably it should be there. I suspect `config show` on the admin socket does not show the existing value. > > Hi, > > Ceph 16 Pacific introduced a new smaller default min_alloc_size of 4096 bytes >
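
A minimal sketch of the log-grepping approach described above (the OSD id, log path, and systemd unit name are placeholders and depend on the deployment):

    # min_alloc_size is logged when the OSD opens its BlueStore superblock at startup
    grep -i min_alloc_size /var/log/ceph/ceph-osd.0.log

    # on systemd-managed or containerized deployments the startup log may be in the journal
    journalctl -u ceph-osd@0 | grep -i min_alloc_size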

[ceph-users] Current min_alloc_size of OSD?

2023-01-11 Thread Robert Sander
Hi, Ceph 16 Pacific introduced a new smaller default min_alloc_size of 4096 bytes for HDD and SSD OSDs. How can I get the current min_alloc_size of OSDs that were created with older Ceph versions? Is there a command that shows this info from the on-disk format of a BlueStore OSD? Regards --

[ceph-users] Re: pg mapping verification

2023-01-11 Thread Stephen Smith6
I think “ceph pg dump” is what you’re after; look for the “UP” and “ACTING” fields to map a PG to an OSD. From there it’s just a matter of verifying your PG placement matches the CRUSH rule. From: Christopher Durham Date: Wednesday, January 11, 2023 at 3:56 PM To: ceph-users@ceph.io Subject: [
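
A quick sketch of reading that mapping out (the pgid below is a placeholder; column layout can differ slightly between releases):

    # one row per PG with its UP and ACTING OSD sets
    ceph pg dump pgs_brief | head

    # or look up a single PG
    ceph pg map 7.1a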

[ceph-users] pg mapping verification

2023-01-11 Thread Christopher Durham
Hi, For a given crush rule and pool that uses it, how can I verify that the PGs in that pool follow the rule? I have a requirement to 'prove' that the PGs are mapping correctly. I see: https://pypi.org/project/crush/ This allows me to read in a crushmap file that I could then use to verify a pg
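
One way to check this offline is with crushtool, which ships with Ceph; a sketch, where the rule id and replica count are placeholders for the pool's actual settings:

    # export the binary CRUSH map from the cluster
    ceph osd getcrushmap -o crushmap.bin

    # simulate mappings for rule 0 with 3 replicas
    crushtool -i crushmap.bin --test --rule 0 --num-rep 3 --show-mappings | head

    # report any inputs the rule cannot satisfy
    crushtool -i crushmap.bin --test --rule 0 --num-rep 3 --show-bad-mappings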

[ceph-users] Ceph Octopus rbd images stuck in trash

2023-01-11 Thread Jeff Welling
Hello there, I'm running Ceph 15.2.17 (Octopus) on Debian Buster and I'm starting an upgrade but I'm seeing a problem and I wanted to ask how best to proceed in case I make things worse by mucking with it without asking experts. I've moved an rbd image to the trash without clearing the snapsh
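
A hedged sketch of one possible recovery path, assuming the image only needs its snapshots purged before deletion (the pool, image id, and image name are placeholders, exact restore syntax may vary by release, and snapshots with protected clones will still refuse to purge):

    rbd trash ls --all mypool            # note the image ID and original name
    rbd trash restore mypool/1234abcd    # restore the image by its ID
    rbd snap purge mypool/myimage        # remove all of its snapshots
    rbd rm mypool/myimage                # then delete it (or trash it again)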

[ceph-users] Move bucket between realms

2023-01-11 Thread mahnoosh shahidi
Hi all, Is there any way in rgw to move a bucket from one realm to another one in the same cluster? Best regards, Mahnoosh

[ceph-users] Re: adding OSD to orchestrated system, ignoring osd service spec.

2023-01-11 Thread Eugen Block
I just wanted to see if something like "all available devices" is managed and could possibly override your drivegroups.yml. Here's an example: storage01:~ # ceph orch ls osd NAME PORTS RUNNING REFRESHED AGE PLACEMENT osd 3 9m ago
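
For reference, a sketch of how one might inspect and pause such a catch-all service so it stops claiming new devices, assuming a Pacific-era cephadm deployment:

    # dump the currently applied OSD service specs as YAML
    ceph orch ls osd --export

    # stop the all-available-devices service from grabbing new disks
    ceph orch apply osd --all-available-devices --unmanaged=true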

[ceph-users] Re: Serious cluster issue - Incomplete PGs

2023-01-11 Thread Eugen Block
I can't recall having used the objectstore-tool to mark PGs as complete, so I can't really confirm if that will work and if it will unblock the stuck requests (I would assume it does). Hopefully someone can chime in here. Zitat von Deep Dish : Eugen, I never insinuated my circumstance is
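
For completeness, a heavily hedged sketch of the mark-complete invocation being discussed; it should only be attempted on a stopped OSD and with backups, since marking a PG complete gives up on recovering whatever data is still missing (the OSD id, data path, and pgid are placeholders):

    systemctl stop ceph-osd@12
    ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-12 \
        --pgid 7.1a --op mark-complete
    systemctl start ceph-osd@12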

[ceph-users] Re: adding OSD to orchestrated system, ignoring osd service spec.

2023-01-11 Thread Wyll Ingersoll
Not really, it's on an airgapped/secure network and I cannot copy-and-paste from it. What are you looking for? This cluster has 720 OSDs across 18 storage nodes. I think we have identified the problem and it may not be a ceph issue, but we need to investigate further. It has something to do wit

[ceph-users] Permanently ignore some warning classes

2023-01-11 Thread Nicola Mori
Dear Ceph users, my cluster is built from old hardware on a gigabit network, so I often experience warnings like OSD_SLOW_PING_TIME_BACK. These in turn trigger alert emails too often, forcing me to disable alerts, which is not sustainable. So my question is: is it possible to tell Ceph to ignor
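
If the alert mails key off the cluster health status, the built-in health mutes in recent releases may be enough; a short sketch (the duration is a placeholder):

    # mute a specific health code; --sticky keeps the mute even if the
    # condition clears and comes back later
    ceph health mute OSD_SLOW_PING_TIME_BACK --sticky
    ceph health mute OSD_SLOW_PING_TIME_FRONT 1w   # or mute for a fixed period

    ceph health unmute OSD_SLOW_PING_TIME_BACK     # lift the mute again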

[ceph-users] OSD crash with "FAILED ceph_assert(v.length() == p->shard_info->bytes)"

2023-01-11 Thread Yu Changyuan
One OSD (the other OSDs are fine) crashed, and trying "ceph-bluestore-tool fsck" also crashed with the same error. Besides destroying this OSD and re-creating it, are there any other steps I can take to restore the OSD? Below is part of the message: /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILAB
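
Besides fsck, the tool also has a repair mode; a hedged sketch against the stopped OSD (the path is a placeholder, and since the assert already fires inside fsck, repair may well crash the same way, so capture the full backtrace and take a backup first):

    ceph-bluestore-tool repair --path /var/lib/ceph/osd/ceph-3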

[ceph-users] Re: OSD crash on Onode::put

2023-01-11 Thread Frank Schilder
Hi Anthony and Serkan, I checked the drive temperatures and there is nothing special about this slot. The disks in this slot are from different vendors and were not populated incrementally. It might be a very weird coincidence. I seem to have an OSD developing this problem in another slot on a

[ceph-users] Re: Snap trimming best practice

2023-01-11 Thread Frank Schilder
Hi Istvan, our experience is the opposite. We put as many PGs in pools as the OSDs can manage. We aim for between 100 and 200 for HDDs and accept more than 200 for SSDs. The smaller the PGs, the better all internal operations work, including snaptrim, recovery, scrubbing etc. on our cluster.
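
As a rough sanity check on that target, the per-OSD count for a single pool works out to pg_num times the replica count divided by the number of OSDs serving the pool; the numbers below are placeholders:

    # 3-replica pool with pg_num=4096 spread over 100 HDD OSDs
    echo $(( 4096 * 3 / 100 ))   # 122 PG copies per OSD, inside the 100-200 band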

[ceph-users] Re: OSD crash on Onode::put

2023-01-11 Thread Frank Schilder
Hi Dongdong. > is simple and can be applied cleanly. I understand this statement from a developer's perspective. Now, try to explain to a user with a cephadm-deployed containerized cluster how to build a container from source, point cephadm at this container, and what to do for the next upg

[ceph-users] Re: [ERR] OSD_SCRUB_ERRORS: 2 scrub errors

2023-01-11 Thread Konstantin Shalygin
Hi, > On 10 Jan 2023, at 07:10, David Orman wrote: > > We ship all of this to our centralized monitoring system (and a lot more) and > have dashboards/proactive monitoring/alerting with 100PiB+ of Ceph. If you're > running Ceph in production, I believe host-level monitoring is critical, > abo

[ceph-users] Intel Cache Solution with HA Cluster on the iSCSI Gateway node

2023-01-11 Thread Kamran Zafar Syed
Hi there, Is there someone who has experience implementing the Intel Cache Accelerator Solution on top of the iSCSI Gateway? Thanks and Regards, Koki

[ceph-users] Re: adding OSD to orchestrated system, ignoring osd service spec.

2023-01-11 Thread Eugen Block
Hi, can you share the output of storage01:~ # ceph orch ls osd Thanks, Eugen Zitat von Wyll Ingersoll : When adding a new OSD to a ceph orchestrated system (16.2.9) on a storage node that has a specification profile that dictates which devices to use as the db_devices (SSDs), the newly ad

[ceph-users] Snap trimming best practice

2023-01-11 Thread Szabo, Istvan (Agoda)
Hi, I wonder whether you have ever faced issues with snaptrimming while following the Ceph PG allocation recommendation (100 PGs/OSD)? We have a Nautilus cluster and we are afraid to increase the PGs of the pools, because it seems like, even with 4 OSDs per NVMe, a higher PG number means slower snaptrimming.
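
As a hedged aside, independent of the PG count, the snaptrim pace on Nautilus can also be tuned at runtime: a larger snap-trim sleep reduces client impact at the cost of slower trimming, zero removes the throttle. The values below are placeholders:

    # seconds of sleep between snap-trim operations, per OSD
    ceph tell 'osd.*' injectargs '--osd_snap_trim_sleep 0.5'
    # number of PGs an OSD will trim concurrently
    ceph tell 'osd.*' injectargs '--osd_max_trimming_pgs 2'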