[ceph-users] Re: Best practice for removing failing host from cluster?

2022-11-09 Thread Robert Sander
On 10.11.22 03:19, Matt Larson wrote: Should I use `ceph orch osd rm XX` for each of the OSDs of this host or should I set the weights of each of the OSDs as 0? Can I do this while the host is offline, or should I bring it online first before setting weights or using `ceph orch osd rm`? I wou
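A minimal sketch of the two approaches discussed, assuming the failing host's OSDs are osd.10, osd.11 and osd.12 (hypothetical IDs):

  # Option 1: let the orchestrator drain and remove each OSD
  ceph orch osd rm 10 11 12
  ceph orch osd rm status          # watch the drain/removal progress

  # Option 2: reweight the OSDs to 0 so data migrates off them first
  ceph osd crush reweight osd.10 0
  ceph osd crush reweight osd.11 0
  ceph osd crush reweight osd.12 0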

[ceph-users] Re: Question regarding Quincy mclock scheduler.

2022-11-09 Thread philippe
Hi, Thanks a lot for the clarification, so we will adapt our setup using a custom profile with the proposed parameters. Kr Philippe On 9/11/22 14:04, Aishwarya Mathuria wrote: Hello Philippe, Your understanding is correct, 50% of IOPS are reserved for client operations. osd_mclock_max_capacity_
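A rough sketch of what switching to the custom profile looks like; the parameter values below are placeholders rather than the proposed ones, and the units depend on the release (check the mClock configuration reference):

  ceph config set osd osd_mclock_profile custom
  # QoS knobs become tunable under the custom profile (example values only)
  ceph config set osd osd_mclock_scheduler_client_res 2000
  ceph config set osd osd_mclock_scheduler_client_wgt 2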

[ceph-users] Re: CephFS constant high write I/O to the metadata pool

2022-11-09 Thread Venky Shankar
Hi Olli, On Mon, Oct 17, 2022 at 1:08 PM Olli Rajala wrote: > > Hi Patrick, > > With "objecter_ops" did you mean "ceph tell mds.pve-core-1 ops" and/or > "ceph tell mds.pve-core-1 objecter_requests"? Both these show very few > requests/ops - many times just returning empty lists. I'm pretty sure t
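For reference, a quick sketch of the introspection commands being referred to, assuming the MDS name pve-core-1 from the thread:

  ceph tell mds.pve-core-1 ops                 # in-flight MDS operations
  ceph tell mds.pve-core-1 objecter_requests   # outstanding OSD requests issued by the MDS
  ceph tell mds.pve-core-1 perf dump           # MDS perf counters (journal/mdlog activity etc.)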

[ceph-users] Best practice for removing failing host from cluster?

2022-11-09 Thread Matt Larson
We have a Ceph cluster running Octopus v15.2.3, and 1 of the 12 hosts in the cluster has started having what appears to be a hardware issue causing it to freeze. This began with a freeze and a reported 'CATERR' in the server logs. The host has been having repeated freeze issues over the last we

[ceph-users] Re: iscsi target lun error

2022-11-09 Thread Xiubo Li
On 10/11/2022 02:21, Randy Morgan wrote: I am trying to create a second iscsi target and I keep getting an error when I create the second target: Failed to update target 'iqn.2001-07.com.ceph:1667946365517' disk create/update failed on host.containers.internal. LUN allocation fai
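A hedged sketch of creating a second target interactively with gwcli; the IQN, pool and image names below are only examples:

  gwcli
  /> cd /iscsi-targets
  /iscsi-targets> create iqn.2001-07.com.ceph:second-target
  /> cd /disks
  /disks> create pool=rbd image=lun1 size=10G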

[ceph-users] iscsi target lun error

2022-11-09 Thread Randy Morgan
I am trying to create a second iscsi target and I keep getting an error when I create the second target: Failed to update target 'iqn.2001-07.com.ceph:1667946365517' disk create/update failed on host.containers.internal. LUN allocation failure. I am running Ceph Pacific: *Version*

[ceph-users] Rook mgr module failing

2022-11-09 Thread Mikhail Sidorov
Hello! I tried to turn on the rook ceph mgr module like so: ceph mgr module enable rook; ceph orch set backend rook. But after that I started getting errors: 500 - Internal Server Error. The server encountered an unexpected condition which prevented it from fulfilling the request. Mgr logs show this t
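A sketch of checking the module state and backing the change out while debugging:

  ceph mgr module ls | grep rook   # confirm whether the module is enabled
  ceph orch set backend ''         # drop the orchestrator backend again
  ceph mgr module disable rook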

[ceph-users] Re: How to check available storage with EC and different sized OSD's ?

2022-11-09 Thread Paweł Kowalski
I don't need to redistribute data after an OSD failure. All I want to do in this test setup is to keep data safe in RO after such a failure. Paweł On 9.11.2022 at 17:09, Danny Webb wrote: With a 3 osd pool it's not possible for data to be redistributed on failure of an OSD. With a K=2,M=1 va

[ceph-users] Re: How to check available storage with EC and different sized OSD's ?

2022-11-09 Thread Danny Webb
With a 3 osd pool it's not possible for data to be redistributed on failure of an OSD. With a K=2,M=1 value your minimum number of OSDs for distribution's sake is 3. If you need the ability to redistribute data on failure you'd need a 4th OSD. Your k/m value can't be larger than your failure do
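For illustration, a sketch of such a profile with k=2, m=1 and the failure domain dropped to osd so that three OSDs suffice (profile and pool names are examples):

  ceph osd erasure-code-profile set ec21 k=2 m=1 crush-failure-domain=osd
  ceph osd pool create ecpool 32 32 erasure ec21
  ceph osd pool get ecpool min_size   # defaults to k+1 = 3 for EC pools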

[ceph-users] Re: How to check available storage with EC and different sized OSD's ?

2022-11-09 Thread Paweł Kowalski
If I start to use all available space that pool can offer (4.5T) and the first OSD (2.7T) fails, I'm sure I'll end up with lost data since it's not possible to fit 4.5T on the 2 remaining drives with a total raw capacity of 3.6T. I'm wondering why ceph isn't complaining now. I thought it should place d
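A sketch of the commands that show what Ceph actually considers usable, since the per-pool estimate follows the OSD expected to fill up first rather than the raw sum:

  ceph osd df tree   # size, weight and utilisation of each OSD
  ceph df detail     # per-pool MAX AVAIL, an estimate based on the fullest OSD and the EC overhead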

[ceph-users] Re: Recent ceph.io Performance Blog Posts

2022-11-09 Thread Mark Nelson
On 11/9/22 4:48 AM, Stefan Kooman wrote: On 11/8/22 21:20, Mark Nelson wrote: Hi Folks, I thought I would mention that I've released a couple of performance articles on the Ceph blog recently that might be of interest to people: For sure, thanks a lot, it's really informative! Can we also

[ceph-users] Re: Recent ceph.io Performance Blog Posts

2022-11-09 Thread Mark Nelson
On 11/9/22 6:03 AM, Eshcar Hillel wrote: Hi Mark, Thanks for posting these blogs. They are very interesting to read. Maybe you have an answer to a question I asked in the dev list: We run a fio benchmark against a 3-node ceph cluster with 96 OSDs. Objects are 4kb. We use the gdbpmp profiler https://

[ceph-users] Re: Question regarding Quincy mclock scheduler.

2022-11-09 Thread Aishwarya Mathuria
Hello Philippe, Your understanding is correct, 50% of IOPS are reserved for client operations. osd_mclock_max_capacity_iops_hdd defines the capacity per OSD. There is a mClock queue for each OSD shard. The number of shards is defined by osd_op_num_shards_hdd
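A sketch of how to inspect those two values on a running OSD (osd.0 is just an example):

  ceph config show osd.0 osd_mclock_max_capacity_iops_hdd   # IOPS capacity attributed to this OSD
  ceph config show osd.0 osd_op_num_shards_hdd              # number of op shards, each with its own mClock queue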

[ceph-users] Re: Recent ceph.io Performance Blog Posts

2022-11-09 Thread Eshcar Hillel
Hi Mark, Thanks for posting these blogs. They are very interesting to read. Maybe you have an answer to a question I asked in the dev list: We run a fio benchmark against a 3-node ceph cluster with 96 OSDs. Objects are 4kb. We use the gdbpmp profiler https://github.com/markhpc/gdbpmp to analyze the t
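For context, a sketch of a 4k fio job of the kind described, using fio's rbd engine; pool and image names are placeholders:

  fio --name=4k-randwrite --ioengine=rbd --clientname=admin \
      --pool=testpool --rbdname=testimg \
      --rw=randwrite --bs=4k --iodepth=32 --numjobs=1 --runtime=60 --time_based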

[ceph-users] Large strange flip in storage accounting

2022-11-09 Thread Frank Schilder
Hi all, during maintenance yesterday we observed something extremely strange on our production cluster. We needed to rebalance storage from slow to fast SSDs in small pools. The pools affected by this operation were con-rbd-meta-hpc-one, con-fs2-meta1 and con-fs2-meta2 (see ceph df output below
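For reference, the kind of operation involved, sketched with an example rule name and device class ('ssd-fast' is hypothetical), plus the accounting view:

  ceph osd crush rule create-replicated meta-fast default host ssd-fast
  ceph osd pool set con-fs2-meta1 crush_rule meta-fast
  ceph df detail   # per-pool STORED/USED and MAX AVAIL before and after the move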

[ceph-users] Re: Recent ceph.io Performance Blog Posts

2022-11-09 Thread Stefan Kooman
On 11/8/22 21:20, Mark Nelson wrote: Hi Folks, I thought I would mention that I've released a couple of performance articles on the Ceph blog recently that might be of interest to people: For sure, thanks a lot, it's really informative! Can we also ask for special requests? One of the things

[ceph-users] Re: How to ... alertmanager and prometheus

2022-11-09 Thread Sake Paulusma
Hi, I noticed that cephadm would update the grafana-frontend-api-url with version 17.2.3, but it looks broken with version 17.2.5. It isn't a big deal to update the URL myself, but it's quite irritating to have to when it used to correct itself. Best regards, Sake
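Updating it by hand is a one-liner against the dashboard module; the hostname below is a placeholder:

  ceph dashboard set-grafana-api-url https://grafana.example.com:3000
  ceph dashboard get-grafana-api-url   # verify what the dashboard will use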

[ceph-users] Re: How to ... alertmanager and prometheus

2022-11-09 Thread Eugen Block
The only thing I noticed was that I had to change the grafana-api-url for the dashboard when I stopped one of the two grafana instances. I wasn't able to test the dashboard before because I had to wait for new certificates so my browser wouldn't complain about the cephadm cert. So it seems

[ceph-users] Question regarding Quincy mclock scheduler.

2022-11-09 Thread philippe
Hi, We have a Quincy 17.2.5 based cluster, and we have some questions regarding the mClock IOPS scheduler. Looking into the documentation, the default profile is HIGH_CLIENT_OPS, which means that 50% of the IOPS of an OSD are reserved for client operations. But looking into the OSD configuration settin
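A quick sketch of checking which profile is in effect and what an OSD reports for the scheduler settings (osd.0 is only an example):

  ceph config get osd osd_mclock_profile               # high_client_ops by default on this release
  ceph config show osd.0 | grep osd_mclock_scheduler   # effective QoS values as reported by the daemon (built-in profiles manage these internally)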