[ceph-users] Re: Testing CEPH scrubbing / self-healing capabilities

2024-06-10 Thread Petr Bena
Hello, no, I don't have osd_scrub_auto_repair enabled. Interestingly, about a week after forgetting about this, an error manifested: [ERR] OSD_SCRUB_ERRORS: 1 scrub errors [ERR] PG_DAMAGED: Possible data damage: 1 pg inconsistent pg 4.1d is active+clean+inconsistent, acting [4,2] which could be
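For reference, a minimal sketch of how an inconsistency like the one reported above is typically inspected and repaired (the PG id 4.1d is taken from the health output; run against your own cluster at your own discretion):

    ceph health detail                                       # shows which PG is inconsistent
    rados list-inconsistent-obj 4.1d --format=json-pretty    # list the damaged object(s) and shards
    ceph pg repair 4.1d                                      # ask the primary OSD to repair the PG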

[ceph-users] Re: Testing CEPH scrubbing / self-healing capabilities

2024-06-10 Thread Eugen Block
That would have been my next question: did you verify that the corrupted OSD was a primary? The default deep-scrub config scrubs all PGs within a week, so yes, it can take a week until it's detected. It could have been detected sooner if those objects had been in use by clients and
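As a hedged aside, the weekly window mentioned here corresponds to the osd_deep_scrub_interval setting; a sketch of how to check it and, for testing only, shorten it:

    ceph config get osd osd_deep_scrub_interval          # default is 604800 seconds (one week)
    ceph config set osd osd_deep_scrub_interval 86400    # example only: deep-scrub roughly daily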

[ceph-users] Re: Question regarding bluestore labels

2024-06-10 Thread Igor Fedotov
Hi Bailey, yes, this should be doable using the following steps: 1. Copy the very first block 0~4096 from a different OSD to that non-working one. 2. Use ceph-bluestore-tool's set-label-key command to modify "osd_uuid" at the target OSD. 3. Adjust the "size" field at the target OSD if the DB volume size at
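A rough sketch of the steps Igor describes; the device paths, UUID and size values are placeholders, and this is risky low-level surgery best rehearsed on a scratch device first:

    # 1. copy the first 4 KiB label block from a healthy OSD to the broken one
    dd if=/dev/healthy-osd-block of=/dev/broken-osd-block bs=4096 count=1 conv=notrunc
    # 2. fix the osd_uuid on the target OSD
    ceph-bluestore-tool set-label-key --dev /dev/broken-osd-block -k osd_uuid -v <correct-uuid>
    # 3. fix the size field if the volume sizes differ
    ceph-bluestore-tool set-label-key --dev /dev/broken-osd-block -k size -v <size-in-bytes>
    # verify the result
    ceph-bluestore-tool show-label --dev /dev/broken-osd-block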

[ceph-users] Performance issues RGW (S3)

2024-06-10 Thread sinan
Hi all, My Ceph setup: - 12 OSD nodes, 4 OSD nodes per rack. Replication of 3, 1 replica per rack. - 20 spinning SAS disks per node. - Some nodes have 256GB RAM, some nodes 128GB. - CPU varies between Intel E5-2650 and Intel Gold 5317. - Each node has 10Gbit/s network. Using rados bench I am g
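For context, a typical rados bench invocation to establish a baseline on such a cluster might look like the following; the pool name and durations are placeholders, and write benchmarks leave objects behind unless cleaned up:

    rados bench -p testpool 60 write -b 4M -t 16 --no-cleanup   # sequential 4 MiB writes, 16 threads
    rados bench -p testpool 60 seq -t 16                        # sequential reads of the objects just written
    rados bench -p testpool 60 rand -t 16                       # random reads
    rados -p testpool cleanup                                   # remove the benchmark objects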

[ceph-users] Re: Question regarding bluestore labels

2024-06-10 Thread Bailey Allison
Hey Igor, Thanks for the validation. I was also able to validate this in testing over the weekend myself, though on a DB I had messed up myself, and it was able to be restored. If this ends up being the solution for the customer in this case, I will follow up here in case anyone is curious. Thanks again Ig

[ceph-users] Re: Performance issues RGW (S3)

2024-06-10 Thread Anthony D'Atri
> Hi all, > > My Ceph setup: > - 12 OSD nodes, 4 OSD nodes per rack. Replication of 3, 1 replica per rack. > - 20 spinning SAS disks per node. Don't use legacy HDDs if you care about performance. > - Some nodes have 256GB RAM, some nodes 128GB. 128GB is on the low side for 20 OSDs. > - CPU

[ceph-users] Stuck OSD down/out + workaround

2024-06-10 Thread Mazzystr
Hi ceph users, I've seen this happen a couple times and been meaning to ask the group about it. Sometimes I get a failed block device and I have to replace it. My normal process is: * stop the osd process * remove the osd from crush map * rm -rf /var/lib/ceph/osd/<cluster>-<id>/* * run mkfs * start osd proce
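A hedged sketch of the replacement flow the poster describes, for a package-based (non-cephadm) OSD, with <id>, <cluster> and <new-uuid> as placeholders; exact steps differ for ceph-volume or cephadm deployments:

    systemctl stop ceph-osd@<id>
    ceph osd crush remove osd.<id>                    # remove the osd from the crush map
    rm -rf /var/lib/ceph/osd/<cluster>-<id>/*
    ceph-osd -i <id> --mkfs --osd-uuid <new-uuid>     # recreate the OSD data store
    systemctl start ceph-osd@<id>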

[ceph-users] Re: Testing CEPH scrubbing / self-healing capabilities

2024-06-10 Thread Petr Bena
Most likely it wasn't, the ceph help or documentation is not very clear about this: osd deep-scrub <who>: initiate deep scrub on osd <who>, or use <all|any> to deep scrub all. It doesn't say anything like "initiate dee

[ceph-users] Re: Performance issues RGW (S3)

2024-06-10 Thread sinan
On 2024-06-10 15:20, Anthony D'Atri wrote: Hi all, My Ceph setup: - 12 OSD nodes, 4 OSD nodes per rack. Replication of 3, 1 replica per rack. - 20 spinning SAS disks per node. Don't use legacy HDDs if you care about performance. You are right here, but we use Ceph mainly for RBD. It perfor

[ceph-users] Re: Performance issues RGW (S3)

2024-06-10 Thread Anthony D'Atri
>>> - 20 spinning SAS disks per node. >> Don't use legacy HDDs if you care about performance. > > You are right here, but we use Ceph mainly for RBD. It performs 'good enough' > for our RBD load. You use RBD for archival? >>> - Some nodes have 256GB RAM, some nodes 128GB. >> 128GB is on the

[ceph-users] Re: Testing CEPH scrubbing / self-healing capabilities

2024-06-10 Thread Anthony D'Atri
Scrubs are of PGs, not OSDs; the lead OSD for a PG orchestrates subops to the secondary OSDs. If you can point me to where this is in docs/src I'll clarify it; ideally, if you can, put in a tracker ticket and send me a link. Scrubbing all PGs on an OSD at once or even in sequence would be impactful.
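To illustrate the distinction, a sketch of deep-scrubbing a specific PG (after checking its acting set and primary) versus asking one OSD to deep-scrub; the PG and OSD ids are placeholders taken from the earlier messages:

    ceph pg map 4.1d            # shows the up set, acting set and primary OSD for that PG
    ceph pg deep-scrub 4.1d     # deep-scrub just that PG
    ceph osd deep-scrub 4       # asks osd.4 to deep-scrub the PGs it is primary for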

[ceph-users] Ceph Leadership Team Weekly Minutes 2024-06-10

2024-06-10 Thread Casey Bodley
# quincy now past estimated 2024-06-01 end-of-life will 17.2.8 be the last point release? maybe not, depending on timing # centos 8 eol * Casey tried to summarize the fallout in https://lists.ceph.io/hyperkitty/list/d...@ceph.io/thread/H7I4Q4RAIT6UZQNPPZ5O3YB6AUXLLAFI/ * c8 builds were disabled

[ceph-users] Re: Ceph RBD, MySQL write IOPs - what is possible?

2024-06-10 Thread Mark Lehrer
> Not the most helpful response, but on a (admittedly well-tuned) Actually this was the most helpful since you ran the same rados bench command. I'm trying to stay away from rbd & qemu issues and just test rados bench on a non-virtualized client. I have a test instance with newer drives, CPUs, and Ce
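For the record, a single-threaded 16 KiB write test with rados bench along the lines being discussed could look like this (pool name is a placeholder):

    rados bench -p testpool 60 write -b 16384 -t 1 --no-cleanup   # one in-flight 16 KiB write at a time
    rados -p testpool cleanup                                     # remove the benchmark objects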

[ceph-users] Re: Ceph RBD, MySQL write IOPs - what is possible?

2024-06-10 Thread Anthony D'Atri
Eh? cf. Mark and Dan's 1TB/s presentation. > On Jun 10, 2024, at 13:58, Mark Lehrer wrote: > > It seems like Ceph still hasn't adjusted to SSD performance.

[ceph-users] Re: Performance issues RGW (S3)

2024-06-10 Thread sinan
On 2024-06-10 17:42, Anthony D'Atri wrote: - 20 spinning SAS disks per node. Don't use legacy HDDs if you care about performance. You are right here, but we use Ceph mainly for RBD. It performs 'good enough' for our RBD load. You use RBD for archival? No, storage for (light-weight) virtua

[ceph-users] Attention: Documentation - mon states and names

2024-06-10 Thread Joel Davidow
As this is my first submission to the Ceph docs, I want to start by saying a big thank you to the Ceph team for all the efforts that have been put into improving the docs. The improvements already made have been many and have made it easier for me to operate Ceph. In https://docs.ceph.com/en/lates

[ceph-users] Re: Performance issues RGW (S3)

2024-06-10 Thread Anthony D'Atri
>>> You are right here, but we use Ceph mainly for RBD. It performs 'good >>> enough' for our RBD load. >> You use RBD for archival? > > No, storage for (light-weight) virtual machines. I'm surprised that it's enough, I've seen HDDs fail miserably in that role. > The (CPU) load on the

[ceph-users] Re: MDS crashes to damaged metadata

2024-06-10 Thread Patrick Donnelly
You could try manually deleting the files from the directory fragments, using `rados` commands. Make sure to flush your MDS journal first and take the fs offline (`ceph fs fail`). On Tue, Jun 4, 2024 at 8:50 AM Stolte, Felix wrote: > > Hi Patrick, > > it has been a year now and we did not have a
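A very rough sketch of the kind of rados-level surgery Patrick alludes to; the filesystem name, pool name, object name and omap key below are purely illustrative, and this should only be attempted with a backup of the metadata pool and, ideally, upstream guidance:

    ceph tell mds.<daemon-name> flush journal                         # flush the MDS journal first
    ceph fs fail <fsname>                                             # take the filesystem offline
    # directory fragments are stored as omap keys on objects in the metadata pool
    rados -p <metadata-pool> listomapkeys <dir-inode-hex>.<frag>      # inspect the fragment's dentries
    rados -p <metadata-pool> rmomapkey <dir-inode-hex>.<frag> <filename>_head   # remove one dentry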

[ceph-users] Re: Performance issues RGW (S3)

2024-06-10 Thread sinan
On 2024-06-10 21:37, Anthony D'Atri wrote: You are right here, but we use Ceph mainly for RBD. It performs 'good enough' for our RBD load. You use RBD for archival? No, storage for (light-weight) virtual machines. I'm surprised that it's enough, I've seen HDDs fail miserably in that role.

[ceph-users] multipart uploads in reef 18.2.2

2024-06-10 Thread Christopher Durham
We have a reef 18.2.2 cluster with 6 radosgw servers on Rocky 8.9. The radosgw servers are not fronted by anything like HAProxy; the clients connect directly to a DNS name via round-robin DNS. Each of the radosgw servers has a certificate using SAN entries for all 6 radosgw servers as well

[ceph-users] Re: Attention: Documentation - mon states and names

2024-06-10 Thread Zac Dover
Joel, Thank you for this message. This is a model of what, in a perfect world, user communication with upstream documentation could be. I identify four things in your message that I can work on immediately: 1. leader/peon documentation improvement 2. Ceph command-presentation convention standardi

[ceph-users] Re: Performance issues RGW (S3)

2024-06-10 Thread Anthony D'Atri
>> To be clear, you don't need more nodes. You can add RGWs to the ones you >> already have. You have 12 OSD nodes - why not put an RGW on each? > Might be an option, just don't like the idea to host multiple components on > nodes. But I'll consider it. I really don't like mixing mon/mgr wi
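If the cluster is cephadm-managed, colocating RGW daemons on the existing OSD nodes is a short orchestrator command; a sketch with placeholder service id, host names and port:

    ceph orch apply rgw myrgw --placement="host1 host2 host3" --port 8080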

[ceph-users] Re: Ceph RBD, MySQL write IOPs - what is possible?

2024-06-10 Thread Mark Lehrer
If they can do 1 TB/s with a single 16K write thread, that will be quite impressive :D  Otherwise not really applicable. Ceph scaling has always been good. More seriously, would you mind sending a link to this? Thanks! Mark On Mon, Jun 10, 2024 at 12:01 PM Anthony D'Atri wrote: > > Eh? cf

[ceph-users] Re: About disk disk iops and ultil peak

2024-06-10 Thread Anthony D'Atri
What specifically are your OSD devices? > On Jun 10, 2024, at 22:23, Phong Tran Thanh wrote: > > Hi ceph user! > > I am encountering a problem with IOPS and disk utilization of OSDs. Sometimes, > my disk IOPS and utilization peak too high, which affects my > cluster and causes slow
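As a starting point for answering that, a sketch of how to observe per-device IOPS and saturation on an OSD node alongside Ceph's own view of OSD latency:

    iostat -x 1      # the r/s, w/s and %util columns show per-disk IOPS and saturation (sysstat package)
    ceph osd perf    # commit/apply latency per OSD as reported by the cluster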