[ceph-users] Re: CephFS metadata pool size

2024-06-12 Thread Frank Schilder
Hi, there seem to be replies missing from this list. For example, I can't find 
any messages containing the information that could lead to this conclusion:

> * pg_num too low (defaults are too low)
> * pg_num not a power of 2
> * pg_num != number of OSDs in the pool
> * balancer not enabled

It is horrible for other users trying to follow threads or learn from them when part of 
the communication is private. This thread is not the first occurrence; it seems 
to be getting more frequent lately. Could posters please reply to the list 
instead of to individual users?

Thanks for your consideration.
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14


From: Anthony D'Atri 
Sent: Wednesday, June 12, 2024 2:53 PM
To: Eugen Block
Cc: Lars Köppel; ceph-users@ceph.io
Subject: [ceph-users] Re: CephFS metadata pool size

If you have:

* pg_num too low (defaults are too low)
* pg_num not a power of 2
* pg_num != number of OSDs in the pool
* balancer not enabled

any of those might result in imbalance.
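For reference, a quick way to check each of these is something like the following 
(a rough sketch; the pool name "metadata" is only an example):

  ceph osd pool get metadata pg_num     # current pg_num; compare against OSD count / powers of 2
  ceph osd pool autoscale-status        # pg_num targets if the autoscaler is involved
  ceph balancer status                  # whether the balancer is enabled and active
  ceph osd df tree                      # per-OSD utilisation, to see the actual imbalance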

> On Jun 12, 2024, at 07:33, Eugen Block  wrote:
>
> I don't have any good explanation at this point. Can you share some more 
> information like:
>
> ceph pg ls-by-pool 
> ceph osd df (for the relevant OSDs)
> ceph df
>
> Thanks,
> Eugen
>
> Zitat von Lars Köppel :
>
>> Since my last update the size of the largest OSD increased by 0.4 TiB while
>> the smallest one only increased by 0.1 TiB. How is this possible?
>>
>> Because the metadata pool reported to have only 900MB space left, I stopped
>> the hot-standby MDS. This gave me 8GB back but these filled up in the last
>> 2h.
>> I think I have to zap the next OSD because the filesystem is getting read
>> only...
>>
>> How is it possible that an OSD has over 1 TiB less data on it after a
>> rebuild? And how is it possible to have so different sizes of OSDs?
>>
>>
>> [image: ariadne.ai Logo] Lars Köppel
>> Developer
>> Email: lars.koep...@ariadne.ai
>> Phone: +49 6221 5993580 <+4962215993580>
>> ariadne.ai (Germany) GmbH
>> Häusserstraße 3, 69115 Heidelberg
>> Amtsgericht Mannheim, HRB 744040
>> Geschäftsführer: Dr. Fabian Svara
>> https://ariadne.ai
>>
>>
>> On Tue, Jun 11, 2024 at 3:47 PM Lars Köppel  wrote:
>>
>>> Only in warning mode. And there were no PG splits or merges in the last 2
>>> month.
>>>
>>>
>>> [image: ariadne.ai Logo] Lars Köppel
>>> Developer
>>> Email: lars.koep...@ariadne.ai
>>> Phone: +49 6221 5993580 <+4962215993580>
>>> ariadne.ai (Germany) GmbH
>>> Häusserstraße 3, 69115 Heidelberg
>>> Amtsgericht Mannheim, HRB 744040
>>> Geschäftsführer: Dr. Fabian Svara
>>> https://ariadne.ai
>>>
>>>
>>> On Tue, Jun 11, 2024 at 3:32 PM Eugen Block  wrote:
>>>
 I don't think scrubs can cause this. Do you have autoscaler enabled?

 Zitat von Lars Köppel :

 > Hi,
 >
 > thank you for your response.
 >
 > I don't think this thread covers my problem, because the OSDs for the
 > metadata pool fill up at different rates. So I would think this is no
 > direct problem with the journal.
 > Because we had earlier problems with the journal I changed some
 > settings(see below). I already restarted all MDS multiple times but no
 > change here.
 >
 > The health warnings regarding cache pressure resolve normally after a
 > short period of time, when the heavy load on the client ends. Sometimes
 it
 > stays a bit longer because an rsync is running and copying data on the
 > cluster(rsync is not good at releasing the caps).
 >
 > Could it be a problem if scrubs run most of the time in the background?
 Can
 > this block any other tasks or generate new data itself?
 >
 > Best regards,
 > Lars
 >
 >
 > global  basic     mds_cache_memory_limit                 17179869184
 > global  advanced  mds_max_caps_per_client                16384
 > global  advanced  mds_recall_global_max_decay_threshold  262144
 > global  advanced  mds_recall_max_decay_rate              1.00
 > global  advanced  mds_recall_max_decay_threshold         262144
 > mds     advanced  mds_cache_trim_threshold               131072
 > mds     advanced  mds_heartbeat_grace                    120.00
 > mds     advanced  mds_heartbeat_reset_grace              7400
 > mds     advanced  mds_tick_interval                      3.00
 >
 >
 > [image: ariadne.ai Logo] Lars Köppel
 > Developer
 > Email: lars.koep...@ariadne.ai
 > Phone: +49 6221 5993580 <+4962215993580>
 > ariadne.ai (Germany) GmbH
 > Häusserstraße 3, 69115 Heidelberg
 > Amtsgericht Mannheim, HRB 744040
 > Geschäftsführer: Dr. Fabian Svara
 > https://ariadne.ai
 >
 >
 > On Tue, Jun 11, 2024 at 2:05 PM Eugen Block  wrote:
 >
 >> Hi,
 >>
>

[ceph-users] Re: How radosgw considers that the file upload is done?

2024-06-12 Thread Daniel Gryniewicz

On 6/12/24 5:43 AM, Szabo, Istvan (Agoda) wrote:

Hi,

I wonder how radosgw knows that a transaction is done and that the connection 
between the client and the gateway didn't break?

Let's see this is one request:

2024-06-12T16:26:03.386+0700 7fa34c7f0700  1 beast: 0x7fa5bc776750: 1.1.1.1 - - 
[2024-06-12T16:26:03.386063+0700] "PUT /bucket/0/2/966394.delta HTTP/1.1" 200 238 - 
"User-Agent: APN/1.0 Hortonworks/1.0 HDP/3.1.0.0-78, Hadoop 3.2.2, aws-sdk-java/1.11.563 
Linux/5.15.0-101-generic OpenJDK_64-Bit_Server_VM/11.0.18+10-post-Debian-1deb10u1 java/11.0.18 
scala/2.12.15 vendor/Debian com.amazonaws.services.s3.transfer.TransferManager/1.11.563" -
2024-06-12T16:26:03.386+0700 7fa4e9ffb700  1 == req done req=0x7fa5a4572750 
op status=0 http_status=200 latency=737ns ==

What I can see here is the
req done
op status=0

I guess that if the connection broke between the user and the gateway, the req would 
still be marked done, but what is op status? Is it the one I'm actually looking for? If 
the connection broke, would that have a different value?

Thank you



op status will effectively be the error returned from the recv() system call. 
So probably something like -ECONNRESET, which is -104.
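If you want to spot such broken requests in the logs, a simple sketch based on the 
beast log line shown above is to look for non-zero op status values, e.g.:

  # assumes the rgw log lands under /var/log/ceph/; adjust the path for your deployment
  grep '== req done' /var/log/ceph/ceph-rgw.*.log | grep -v 'op status=0 '

Anything left over has a non-zero (usually negative) op status and did not complete cleanly.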


Daniel
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Patching Ceph cluster

2024-06-12 Thread Michael Worsham
Interesting. How do you set this "maintenance mode"? If you have a series of 
documented steps that you follow and could provide them as an example, that 
would be beneficial for my efforts.

We are in the process of standing up both a dev-test environment consisting of 
3 Ceph servers (strictly for testing purposes) and a new production environment 
consisting of 20+ Ceph servers.

We are using Ubuntu 22.04.

-- Michael


From: Daniel Brown 
Sent: Wednesday, June 12, 2024 9:18 AM
To: Anthony D'Atri 
Cc: Michael Worsham ; ceph-users@ceph.io 

Subject: Re: [ceph-users] Patching Ceph cluster

This is an external email. Please take care when clicking links or opening 
attachments. When in doubt, check with the Help Desk or Security.


There’s also a Maintenance mode that you can set for each server, as you’re 
doing updates, so that the cluster doesn’t try to move data from affected 
OSD’s, while the server being updated is offline or down. I’ve worked some on 
automating this with Ansible, but have found my process (and/or my cluster) 
still requires some manual intervention while it’s running to get things done 
cleanly.



> On Jun 12, 2024, at 8:49 AM, Anthony D'Atri  wrote:
>
> Do you mean patching the OS?
>
> If so, easy -- one node at a time, then after it comes back up, wait until 
> all PGs are active+clean and the mon quorum is complete before proceeding.
>
>
>
>> On Jun 12, 2024, at 07:56, Michael Worsham  
>> wrote:
>>
>> What is the proper way to patch a Ceph cluster and reboot the servers in 
>> said cluster if a reboot is necessary for said updates? And is it possible 
>> to automate it via Ansible?
>> ___
>> ceph-users mailing list -- ceph-users@ceph.io
>> To unsubscribe send an email to ceph-users-le...@ceph.io
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: CephFS metadata pool size

2024-06-12 Thread Eugen Block

Which version did you upgrade to 18.2.2 from?
I can’t pin it down to a specific issue, but somewhere in the back of  
my mind is something related to a new omap format or something. But  
I’m really not sure at all.


Zitat von Lars Köppel :


I am happy to help you with as much information as possible. I probably
just don't know where to look for it.
Below are the requested information. The cluster is rebuilding the
zapped OSD at the moment. This will probably take the next few days.


sudo ceph pg ls-by-pool metadata
PG OBJECTS  DEGRADED  MISPLACED  UNFOUND  BYTES OMAP_BYTES*
 OMAP_KEYS*  LOG   LOG_DUPS  STATE
 SINCE  VERSION  REPORTED UP ACTING
SCRUB_STAMP  DEEP_SCRUB_STAMP
LAST_SCRUB_DURATION  SCRUB_SCHEDULING
10.0   5217325   4994695  00   4194304   5880891340
9393865  1885  3000  active+undersized+degraded+remapped+backfill_wait
2h  79875'180849582  79875:391519635  [0,1,2]p0  [1,2]p1
 2024-06-11T09:08:09.829362+  2024-05-28T05:52:59.321589+
   627  periodic scrub scheduled @ 2024-06-17T08:21:31.808348+
10.1   5214785   5193424  00 0   5843682713
9410150  1912  3000  active+undersized+degraded+remapped+backfill_wait
3h  79875'180914288  79875:342746928  [2,1,0]p2  [2,1]p2
 2024-06-01T15:56:28.927288+  2024-05-27T03:31:37.682966+
   966  queued for scrub
10.2   5218432   5187168  00 0   6402011266
9812513  1874  3000  active+undersized+degraded+remapped+backfill_wait
3h  79875'180970531  79875:341340204  [0,1,2]p0  [1,2]p1
 2024-06-11T13:40:58.994256+  2024-06-11T13:40:58.994256+
  1942  periodic scrub scheduled @ 2024-06-17T06:07:15.329675+
10.3   5217413   5217413  00   8388788   5766005023
9271787  1923  3000  active+undersized+degraded+remapped+backfill_wait
3h  79875'181012233  79875:388295881  [1,0,2]p1  [1,2]p1
 2024-06-12T00:35:56.965547+  2024-05-23T19:54:56.121729+
   492  periodic scrub scheduled @ 2024-06-18T06:39:31.103864+
10.4   5220069   5220069  00  12583466   6027548724
9537290  1959  3000  active+undersized+degraded+remapped+backfill_wait
3h  79875'181576075  79875:405295868  [1,2,0]p1  [1,2]p1
 2024-06-11T17:47:22.923514+  2024-05-31T02:06:55.339574+
   581  periodic scrub scheduled @ 2024-06-17T00:59:37.214420+
10.5   5216162   5211999  00   4194304   5941347251
9542764  1930  3000  active+undersized+degraded+remapped+backfill_wait
3h  79875'180455793  79875:338418517  [2,1,0]p2  [2,1]p2
 2024-06-11T22:50:16.170708+  2024-05-30T23:49:54.316379+
   528  periodic scrub scheduled @ 2024-06-17T04:39:25.905185+
10.6   5216100   4980459  00   4521984   6428088514
9850762  1911  3000  active+undersized+degraded+remapped+backfill_wait
2h  79875'184045876  79875:396809795  [0,2,1]p0  [1,2]p1
 2024-06-11T22:24:05.102716+  2024-06-11T22:24:05.102716+
  1082  periodic scrub scheduled @ 2024-06-17T07:58:44.289885+
10.7   5218232   5218232  00   4194304   6377065363
9849360  1919  3000  active+undersized+degraded+remapped+backfill_wait
3h  79875'182672562  79875:342449062  [1,0,2]p1  [1,2]p1
 2024-06-11T06:22:15.689422+  2024-06-11T06:22:15.689422+
  8225  periodic scrub scheduled @ 2024-06-17T13:05:59.225052+
10.8   5219620   5182816  00 0   6167304290
9691796  1896  3000  active+undersized+degraded+remapped+backfill_wait
3h  79875'179628377  79875:378022884  [2,1,0]p2  [2,1]p2
 2024-06-11T22:06:01.386763+  2024-06-11T22:06:01.386763+
  1286  periodic scrub scheduled @ 2024-06-17T07:54:54.133093+
10.9   5219448   5164591  00   8388698   5796048346
9338312  1868  3000  active+undersized+degraded+remapped+backfill_wait
3h  79875'181739392  79875:387412389  [2,1,0]p2  [2,1]p2
 2024-06-12T05:21:00.586747+  2024-05-26T11:10:59.780673+
   539  periodic scrub scheduled @ 2024-06-18T15:32:59.155092+
10.a   5219861   5163635  00  12582912   5841839055
9387200  1916  3000  active+undersized+degraded+remapped+backfill_wait
3h  79875'180205688  79875:379381294  [1,2,0]p1  [1,2]p1
 2024-06-11T12:35:05.571200+  2024-05-22T11:07:16.041773+
  1093  periodic deep scrub scheduled @ 2024-06-17T05:21:40.136463+
10.b   5217949   5217949  00  16777216   5935863260
9462127  1881  3000  active+undersized+degraded+remapped+backfill_wait
3h  79875'181655745  79875:343806807  [0,1,2]p0  [1,2]p1
 2024-06-11T22:41:28.976920+  2024-05-26T08:43:29.217457+
   520  periodic scrub scheduled @ 2024-06-17T17:44:32.764093+
10.c   5221697   5217118  00   4194304   6015217841
9574445  1928  3000  active+undersized+degraded+remapped+backfill_wait
3h  79875'180892

[ceph-users] Re: Patching Ceph cluster

2024-06-12 Thread Anthony D'Atri
That's just setting noout, norebalance, etc.
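In its simplest form that is something like the following around each node's 
maintenance window (a sketch; add or drop flags to taste):

  ceph osd set noout
  ceph osd set norebalance
  # ... patch and reboot the node, wait for its OSDs to rejoin ...
  ceph osd unset norebalance
  ceph osd unset noout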

> On Jun 12, 2024, at 11:28, Michael Worsham  
> wrote:
> 
> Interesting. How do you set this "maintenance mode"? If you have a series of 
> documented steps that you have to do and could provide as an example, that 
> would be beneficial for my efforts.
> 
> We are in the process of standing up both a dev-test environment consisting 
> of 3 Ceph servers (strictly for testing purposes) and a new production 
> environment consisting of 20+ Ceph servers.
> 
> We are using Ubuntu 22.04.
> 
> -- Michael
> 
> 
> From: Daniel Brown 
> Sent: Wednesday, June 12, 2024 9:18 AM
> To: Anthony D'Atri 
> Cc: Michael Worsham ; ceph-users@ceph.io 
> 
> Subject: Re: [ceph-users] Patching Ceph cluster
> 
> This is an external email. Please take care when clicking links or opening 
> attachments. When in doubt, check with the Help Desk or Security.
> 
> 
> There’s also a Maintenance mode that you can set for each server, as you’re 
> doing updates, so that the cluster doesn’t try to move data from affected 
> OSD’s, while the server being updated is offline or down. I’ve worked some on 
> automating this with Ansible, but have found my process (and/or my cluster) 
> still requires some manual intervention while it’s running to get things done 
> cleanly.
> 
> 
> 
>> On Jun 12, 2024, at 8:49 AM, Anthony D'Atri  wrote:
>> 
>> Do you mean patching the OS?
>> 
>> If so, easy -- one node at a time, then after it comes back up, wait until 
>> all PGs are active+clean and the mon quorum is complete before proceeding.
>> 
>> 
>> 
>>> On Jun 12, 2024, at 07:56, Michael Worsham  
>>> wrote:
>>> 
>>> What is the proper way to patch a Ceph cluster and reboot the servers in 
>>> said cluster if a reboot is necessary for said updates? And is it possible 
> >> to automate it via Ansible?
>>> ___
>>> ceph-users mailing list -- ceph-users@ceph.io
>>> To unsubscribe send an email to ceph-users-le...@ceph.io
>> ___
>> ceph-users mailing list -- ceph-users@ceph.io
>> To unsubscribe send an email to ceph-users-le...@ceph.io
> 
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Patching Ceph cluster

2024-06-12 Thread Eugen Block

There’s also a maintenance mode available for the orchestrator:

https://docs.ceph.com/en/reef/cephadm/host-management/#maintenance-mode

There’s some more information about that in the dev section:
https://docs.ceph.com/en/reef/dev/cephadm/host-maintenance/
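The basic flow looks roughly like this (see the docs above for details and caveats):

  ceph orch host maintenance enter <hostname>
  # ... patch and reboot the host ...
  ceph orch host maintenance exit <hostname>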

Zitat von Anthony D'Atri :


That's just setting noout, norebalance, etc.

On Jun 12, 2024, at 11:28, Michael Worsham  
 wrote:


Interesting. How do you set this "maintenance mode"? If you have a  
series of documented steps that you have to do and could provide as  
an example, that would be beneficial for my efforts.


We are in the process of standing up both a dev-test environment  
consisting of 3 Ceph servers (strictly for testing purposes) and a  
new production environment consisting of 20+ Ceph servers.


We are using Ubuntu 22.04.

-- Michael


From: Daniel Brown 
Sent: Wednesday, June 12, 2024 9:18 AM
To: Anthony D'Atri 
Cc: Michael Worsham ;  
ceph-users@ceph.io 

Subject: Re: [ceph-users] Patching Ceph cluster

This is an external email. Please take care when clicking links or  
opening attachments. When in doubt, check with the Help Desk or  
Security.



There’s also a Maintenance mode that you can set for each server,  
as you’re doing updates, so that the cluster doesn’t try to move  
data from affected OSD’s, while the server being updated is offline  
or down. I’ve worked some on automating this with Ansible, but have  
found my process (and/or my cluster) still requires some manual  
intervention while it’s running to get things done cleanly.




On Jun 12, 2024, at 8:49 AM, Anthony D'Atri  
 wrote:


Do you mean patching the OS?

If so, easy -- one node at a time, then after it comes back up,  
wait until all PGs are active+clean and the mon quorum is complete  
before proceeding.




On Jun 12, 2024, at 07:56, Michael Worsham  
 wrote:


What is the proper way to patch a Ceph cluster and reboot the  
servers in said cluster if a reboot is necessary for said  
updates? And is it possible to automate it via Ansible?

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Patching Ceph cluster

2024-06-12 Thread Daniel Brown


I have two Ansible roles, one for enter and one for exit. There are likely better 
ways to do this, and I'll not be surprised if someone here lets me know. 
They're using orch commands via the cephadm shell. I'm using Ansible for other 
configuration management in my environment as well, including setting up 
clients of the Ceph cluster. 


Below are excerpts from main.yml in the "tasks" directory of the enter/exit roles. The host 
I'm running Ansible from is one of my Ceph servers; I've limited which processes 
run there, though, so it's in the cluster but not equal to the others. 


—
Enter
—

- name: Ceph Maintenance Mode Enter
  shell:
    cmd: 'cephadm shell ceph orch host maintenance enter {{ (ansible_ssh_host|default(ansible_host))|default(inventory_hostname) }} --force --yes-i-really-mean-it'
  become: True



—
Exit
— 


- name: Ceph Maintenance Mode Exit
  shell:
    cmd: 'cephadm shell ceph orch host maintenance exit {{ (ansible_ssh_host|default(ansible_host))|default(inventory_hostname) }}'
  become: True
  connection: local


- name: Wait for Ceph to be available
  ansible.builtin.wait_for:
    delay: 60
    host: '{{ (ansible_ssh_host|default(ansible_host))|default(inventory_hostname) }}'
    port: 9100
  connection: local
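For completeness, a hypothetical playbook wrapping roles like these (the role names 
here are made up) could run one host at a time with serial, matching the 
one-node-at-a-time advice earlier in the thread:

- hosts: ceph_nodes
  serial: 1
  roles:
    - ceph_maintenance_enter
    - patch_and_reboot
    - ceph_maintenance_exit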






> On Jun 12, 2024, at 11:28 AM, Michael Worsham  
> wrote:
> 
> Interesting. How do you set this "maintenance mode"? If you have a series of 
> documented steps that you have to do and could provide as an example, that 
> would be beneficial for my efforts.
> 
> We are in the process of standing up both a dev-test environment consisting 
> of 3 Ceph servers (strictly for testing purposes) and a new production 
> environment consisting of 20+ Ceph servers.
> 
> We are using Ubuntu 22.04.
> 
> -- Michael
> From: Daniel Brown 
> Sent: Wednesday, June 12, 2024 9:18 AM
> To: Anthony D'Atri 
> Cc: Michael Worsham ; ceph-users@ceph.io 
> 
> Subject: Re: [ceph-users] Patching Ceph cluster
>  This is an external email. Please take care when clicking links or opening 
> attachments. When in doubt, check with the Help Desk or Security.
> 
> 
> There’s also a Maintenance mode that you can set for each server, as you’re 
> doing updates, so that the cluster doesn’t try to move data from affected 
> OSD’s, while the server being updated is offline or down. I’ve worked some on 
> automating this with Ansible, but have found my process (and/or my cluster) 
> still requires some manual intervention while it’s running to get things done 
> cleanly.
> 
> 
> 
> > On Jun 12, 2024, at 8:49 AM, Anthony D'Atri  wrote:
> >
> > Do you mean patching the OS?
> >
> > If so, easy -- one node at a time, then after it comes back up, wait until 
> > all PGs are active+clean and the mon quorum is complete before proceeding.
> >
> >
> >
> >> On Jun 12, 2024, at 07:56, Michael Worsham  
> >> wrote:
> >>
> >> What is the proper way to patch a Ceph cluster and reboot the servers in 
> >> said cluster if a reboot is necessary for said updates? And is it possible 
> >> to automate it via Ansible?
> >> ___
> >> ceph-users mailing list -- ceph-users@ceph.io
> >> To unsubscribe send an email to ceph-users-le...@ceph.io
> > ___
> > ceph-users mailing list -- ceph-users@ceph.io
> > To unsubscribe send an email to ceph-users-le...@ceph.io
> 

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io

[ceph-users] Re: CephFS metadata pool size

2024-06-12 Thread Marc







On 12 June 2024 13:19:10 UTC, "Lars Köppel"  wrote:
>I am happy to help you with as much information as possible. I probably
>just don't know where to look for it.
>Below are the requested information. The cluster is rebuilding the
>zapped OSD at the moment. This will probably take the next few days.

[ceph-users] Re: Safe to move misplaced hosts between failure domains in the crush tree?

2024-06-12 Thread Janne Johansson
> We made a mistake when we moved the servers physically so while the
> replica 3 is intact the crush tree is not accurate.
>
> If we just remedy the situation with "ceph osd crush move ceph-flashX
> datacenter=Y" we will just end up with a lot of misplaced data and some
> churn, right? Or will the affected pool go degraded/unavailable?

I know I am late here, but for the record, if you ask crush to change
in such a way that PGs are asked to move to "impossible" places, they
will just end up being remapped/misplaced and continue to serve data.
They will obviously not backfill anywhere, but they will also not
cause trouble apart from ceph -s telling you that the whole pool(s) is
currently misplaced. Then you can revert the crush change and
everything goes back to normal again.

I have made such "mistakes" several times, and ceph kept going even
though I panicked and flailed with my arms a lot until I managed to
revert the bad crush map changes.
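As a small safety net it can help to save the current CRUSH map before making such
changes, so a revert is a one-liner (plain ceph commands, shown here as a sketch):

  ceph osd getcrushmap -o crushmap.backup
  # ... make the changes, e.g. ceph osd crush move ... ...
  # if things look wrong, roll back:
  ceph osd setcrushmap -i crushmap.backup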

-- 
May the most significant bit of your life be positive.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Safe to move misplaced hosts between failure domains in the crush tree?

2024-06-12 Thread Torkil Svensgaard

Hi

We have 3 servers for replica 3 with failure domain datacenter:

  -1 4437.29248  root default 

 -33 1467.84814  datacenter 714 

 -69   69.86389  host ceph-flash1 

 -34 1511.25378  datacenter HX1 

 -73   69.86389  host ceph-flash2 

 -36 1458.19067  datacenter UXH 

 -77   69.86389  host ceph-flash3 



We made a mistake when we moved the servers physically so while the 
replica 3 is intact the crush tree is not accurate.


If we just remedy the situation with "ceph osd crush move ceph-flashX 
datacenter=Y" we will just end up with a lot of misplaced data and some 
churn, right? Or will the affected pool go degraded/unavailable?


Mvh.

Torkil
--
Torkil Svensgaard
Sysadmin
MR-Forskningssektionen, afs. 714
DRCMR, Danish Research Centre for Magnetic Resonance
Hvidovre Hospital
Kettegård Allé 30
DK-2650 Hvidovre
Denmark
Tel: +45 386 22828
E-mail: tor...@drcmr.dk
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Safe to move misplaced hosts between failure domains in the crush tree?

2024-06-12 Thread Matthias Grandl
Correct, this should only result in misplaced objects. 

> We made a mistake when we moved the servers physically so while the replica 3 
> is intact the crush tree is not accurate.

Can you elaborate on that? Does this mean that after the move, multiple hosts are 
inside the same physical datacenter? In that case, once you correct the CRUSH 
layout, you would be running misplaced without a way to rebalance pools that 
are using a datacenter crush rule.

Cheers!

--

Matthias Grandl
Head Storage Engineer
matthias.gra...@croit.io 

Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263
Web: https://croit.io | YouTube: https://goo.gl/PGE1Bx

> On 12. Jun 2024, at 09:13, Torkil Svensgaard  wrote:
> 
> Hi
> 
> We have 3 servers for replica 3 with failure domain datacenter:
> 
>  -1 4437.29248  root default 
> -33 1467.84814  datacenter 714 
> -69   69.86389  host ceph-flash1 
> -34 1511.25378  datacenter HX1 
> -73   69.86389  host ceph-flash2 
> -36 1458.19067  datacenter UXH 
> -77   69.86389  host ceph-flash3 
> 
> We made a mistake when we moved the servers physically so while the replica 3 
> is intact the crush tree is not accurate.
> 
> If we just remedy the situation with "ceph osd crush move ceph-flashX 
> datacenter=Y" we will just end up with a lot of misplaced data and some 
> churn, right? Or will the affected pool go degraded/unavailable?
> 
> Mvh.
> 
> Torkil
> -- 
> Torkil Svensgaard
> Sysadmin
> MR-Forskningssektionen, afs. 714
> DRCMR, Danish Research Centre for Magnetic Resonance
> Hvidovre Hospital
> Kettegård Allé 30
> DK-2650 Hvidovre
> Denmark
> Tel: +45 386 22828
> E-mail: tor...@drcmr.dk
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Safe to move misplaced hosts between failure domains in the crush tree?

2024-06-12 Thread Torkil Svensgaard



On 12/06/2024 10:22, Matthias Grandl wrote:

Correct, this should only result in misplaced objects.

 > We made a mistake when we moved the servers physically so while the 
replica 3 is intact the crush tree is not accurate.


Can you elaborate on that? Does this mean after the move, multiple hosts 
are inside the same physical datacenter? In that case, once you correct 
the CRUSH layout, you would be running misplaced without a way to 
rebalance pools that are using a datacenter crush rule.


Hi Matthias

Thanks for replying. Two of the three hosts were swapped, so I would do:

ceph osd crush move ceph-flash1 datacenter=HX1
ceph osd crush move ceph-flash2 datacenter=714


And end up with 2/3 misplaced:

  -1 4437.29248  root default
 -33 1467.84814  datacenter 714
 -69   69.86389  host ceph-flash2
 -34 1511.25378  datacenter HX1
 -73   69.86389  host ceph-flash1
 -36 1458.19067  datacenter UXH
 -77   69.86389  host ceph-flash3

It would only briefly be invalid between the two commands.

Mvh.

Torkil



Cheers!

--

Matthias Grandl
Head Storage Engineer
matthias.gra...@croit.io 

Looking for help with your Ceph cluster? Contact us at https://croit.io


croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263
Web: https://croit.io | YouTube: https://goo.gl/PGE1Bx


On 12. Jun 2024, at 09:13, Torkil Svensgaard  wrote:

Hi

We have 3 servers for replica 3 with failure domain datacenter:

 -1 4437.29248  root default
-33 1467.84814  datacenter 714
-69   69.86389  host ceph-flash1
-34 1511.25378  datacenter HX1
-73   69.86389  host ceph-flash2
-36 1458.19067  datacenter UXH
-77   69.86389  host ceph-flash3

We made a mistake when we moved the servers physically so while the 
replica 3 is intact the crush tree is not accurate.


If we just remedy the situation with "ceph osd crush move ceph-flashX 
datacenter=Y" we will just end up with a lot of misplaced data and 
some churn, right? Or will the affected pool go degraded/unavailable?


Mvh.

Torkil
--
Torkil Svensgaard
Sysadmin
MR-Forskningssektionen, afs. 714
DRCMR, Danish Research Centre for Magnetic Resonance
Hvidovre Hospital
Kettegård Allé 30
DK-2650 Hvidovre
Denmark
Tel: +45 386 22828
E-mail: tor...@drcmr.dk
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io




--
Torkil Svensgaard
Sysadmin
MR-Forskningssektionen, afs. 714
DRCMR, Danish Research Centre for Magnetic Resonance
Hvidovre Hospital
Kettegård Allé 30
DK-2650 Hvidovre
Denmark
Tel: +45 386 22828
E-mail: tor...@drcmr.dk
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Safe to move misplaced hosts between failure domains in the crush tree?

2024-06-12 Thread Matthias Grandl
Yeah that should work no problem.

In this case I would even recommend setting `norebalance` and using the trusty 
old upmap-remapped script (credits to Cern), to avoid unnecessary data 
movements: 
https://github.com/cernceph/ceph-scripts/blob/master/tools/upmap/upmap-remapped.py
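Roughly, that workflow would look like this, using the host/datacenter names from 
your example (a sketch; check the script's README before piping its output into a shell):

  ceph osd set norebalance
  ceph osd crush move ceph-flash1 datacenter=HX1
  ceph osd crush move ceph-flash2 datacenter=714
  ./upmap-remapped.py | sh      # pins the now-misplaced PGs to their current OSDs via upmaps
  ceph osd unset norebalance
  # then let the balancer remove the upmap entries gradually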

Cheers!
--

Matthias Grandl
Head Storage Engineer
matthias.gra...@croit.io 

Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263
Web: https://croit.io | YouTube: https://goo.gl/PGE1Bx

> On 12. Jun 2024, at 09:33, Torkil Svensgaard  wrote:
> 
> 
> 
> On 12/06/2024 10:22, Matthias Grandl wrote:
>> Correct, this should only result in misplaced objects.
>> > We made a mistake when we moved the servers physically so while the 
>> > replica 3 is intact the crush tree is not accurate.
>> Can you elaborate on that? Does this mean after the move, multiple hosts are 
>> inside the same physical datacenter? In that case, once you correct the 
>> CRUSH layout, you would be running misplaced without a way to rebalance 
>> pools that are using a datacenter crush rule.
> 
> Hi Matthias
> 
> Thanks for replying. Two of the three hosts were swapped, so I would do:
> 
> ceph osd crush move ceph-flash1 datacenter=HX1
> ceph osd crush move ceph-flash2 datacenter=714
> 
> 
> And end up with 2/3 misplaced:
> 
>  -1 4437.29248  root default
> -33 1467.84814  datacenter 714
> -69   69.86389  host ceph-flash2
> -34 1511.25378  datacenter HX1
> -73   69.86389  host ceph-flash1
> -36 1458.19067  datacenter UXH
> -77   69.86389  host ceph-flash3
> 
> It would only briefly be invalid between the two commands.
> 
> Mvh.
> 
> Torkil
> 
> 
>> Cheers!
>> --
>> Matthias Grandl
>> Head Storage Engineer
>> matthias.gra...@croit.io  
>> 
>> Looking for help with your Ceph cluster? Contact us at https://croit.io
>> croit GmbH, Freseniusstr. 31h, 81247 Munich
>> CEO: Martin Verges - VAT-ID: DE310638492
>> Com. register: Amtsgericht Munich HRB 231263
>> Web: https://croit.io | YouTube: https://goo.gl/PGE1Bx
>>> On 12. Jun 2024, at 09:13, Torkil Svensgaard  wrote:
>>> 
>>> Hi
>>> 
>>> We have 3 servers for replica 3 with failure domain datacenter:
>>> 
>>>  -1 4437.29248  root default
>>> -33 1467.84814  datacenter 714
>>> -69   69.86389  host ceph-flash1
>>> -34 1511.25378  datacenter HX1
>>> -73   69.86389  host ceph-flash2
>>> -36 1458.19067  datacenter UXH
>>> -77   69.86389  host ceph-flash3
>>> 
>>> We made a mistake when we moved the servers physically so while the replica 
>>> 3 is intact the crush tree is not accurate.
>>> 
>>> If we just remedy the situation with "ceph osd crush move ceph-flashX 
>>> datacenter=Y" we will just end up with a lot of misplaced data and some 
>>> churn, right? Or will the affected pool go degraded/unavailable?
>>> 
>>> Mvh.
>>> 
>>> Torkil
>>> -- 
>>> Torkil Svensgaard
>>> Sysadmin
>>> MR-Forskningssektionen, afs. 714
>>> DRCMR, Danish Research Centre for Magnetic Resonance
>>> Hvidovre Hospital
>>> Kettegård Allé 30
>>> DK-2650 Hvidovre
>>> Denmark
>>> Tel: +45 386 22828
>>> E-mail: tor...@drcmr.dk
>>> ___
>>> ceph-users mailing list -- ceph-users@ceph.io
>>> To unsubscribe send an email to ceph-users-le...@ceph.io
> 
> -- 
> Torkil Svensgaard
> Sysadmin
> MR-Forskningssektionen, afs. 714
> DRCMR, Danish Research Centre for Magnetic Resonance
> Hvidovre Hospital
> Kettegård Allé 30
> DK-2650 Hvidovre
> Denmark
> Tel: +45 386 22828
> E-mail: tor...@drcmr.dk 
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: CephFS metadata pool size

2024-06-12 Thread Lars Köppel
Since my last update the size of the largest OSD increased by 0.4 TiB while
the smallest one only increased by 0.1 TiB. How is this possible?

Because the metadata pool reported only 900 MB of space left, I stopped
the hot-standby MDS. This gave me 8 GB back, but that filled up again in the last
2h.
I think I have to zap the next OSD because the filesystem is becoming read-only...

How is it possible that an OSD has over 1 TiB less data on it after a
rebuild? And how is it possible for the OSDs to have such different sizes?


[image: ariadne.ai Logo] Lars Köppel
Developer
Email: lars.koep...@ariadne.ai
Phone: +49 6221 5993580 <+4962215993580>
ariadne.ai (Germany) GmbH
Häusserstraße 3, 69115 Heidelberg
Amtsgericht Mannheim, HRB 744040
Geschäftsführer: Dr. Fabian Svara
https://ariadne.ai


On Tue, Jun 11, 2024 at 3:47 PM Lars Köppel  wrote:

> Only in warning mode. And there were no PG splits or merges in the last 2
> month.
>
>
> [image: ariadne.ai Logo] Lars Köppel
> Developer
> Email: lars.koep...@ariadne.ai
> Phone: +49 6221 5993580 <+4962215993580>
> ariadne.ai (Germany) GmbH
> Häusserstraße 3, 69115 Heidelberg
> Amtsgericht Mannheim, HRB 744040
> Geschäftsführer: Dr. Fabian Svara
> https://ariadne.ai
>
>
> On Tue, Jun 11, 2024 at 3:32 PM Eugen Block  wrote:
>
>> I don't think scrubs can cause this. Do you have autoscaler enabled?
>>
>> Zitat von Lars Köppel :
>>
>> > Hi,
>> >
>> > thank you for your response.
>> >
>> > I don't think this thread covers my problem, because the OSDs for the
>> > metadata pool fill up at different rates. So I would think this is no
>> > direct problem with the journal.
>> > Because we had earlier problems with the journal I changed some
>> > settings(see below). I already restarted all MDS multiple times but no
>> > change here.
>> >
>> > The health warnings regarding cache pressure resolve normally after a
>> > short period of time, when the heavy load on the client ends. Sometimes
>> it
>> > stays a bit longer because an rsync is running and copying data on the
>> > cluster(rsync is not good at releasing the caps).
>> >
>> > Could it be a problem if scrubs run most of the time in the background?
>> Can
>> > this block any other tasks or generate new data itself?
>> >
>> > Best regards,
>> > Lars
>> >
>> >
>> > global  basic     mds_cache_memory_limit                 17179869184
>> > global  advanced  mds_max_caps_per_client                16384
>> > global  advanced  mds_recall_global_max_decay_threshold  262144
>> > global  advanced  mds_recall_max_decay_rate              1.00
>> > global  advanced  mds_recall_max_decay_threshold         262144
>> > mds     advanced  mds_cache_trim_threshold               131072
>> > mds     advanced  mds_heartbeat_grace                    120.00
>> > mds     advanced  mds_heartbeat_reset_grace              7400
>> > mds     advanced  mds_tick_interval                      3.00
>> >
>> >
>> > [image: ariadne.ai Logo] Lars Köppel
>> > Developer
>> > Email: lars.koep...@ariadne.ai
>> > Phone: +49 6221 5993580 <+4962215993580>
>> > ariadne.ai (Germany) GmbH
>> > Häusserstraße 3, 69115 Heidelberg
>> > Amtsgericht Mannheim, HRB 744040
>> > Geschäftsführer: Dr. Fabian Svara
>> > https://ariadne.ai
>> >
>> >
>> > On Tue, Jun 11, 2024 at 2:05 PM Eugen Block  wrote:
>> >
>> >> Hi,
>> >>
>> >> can you check if this thread [1] applies to your situation? You don't
>> >> have multi-active MDS enabled, but maybe it's still some journal
>> >> trimming, or maybe misbehaving clients? In your first post there were
>> >> health warnings regarding cache pressure and cache size. Are those
>> >> resolved?
>> >>
>> >> [1]
>> >>
>> >>
>> https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/7U27L27FHHPDYGA6VNNVWGLTXCGP7X23/#VOOV235D4TP5TEOJUWHF4AVXIOTHYQQE
>> >>
>> >> Zitat von Lars Köppel :
>> >>
>> >> > Hello everyone,
>> >> >
>> >> > short update to this problem.
>> >> > The zapped OSD is rebuilt and it has now 1.9 TiB (the expected size
>> >> ~50%).
>> >> > The other 2 OSDs are now at 2.8 respectively 3.2 TiB. They jumped up
>> and
>> >> > down a lot but the higher one has now also reached 'nearfull'
>> status. How
>> >> > is this possible? What is going on?
>> >> >
>> >> > Does anyone have a solution how to fix this without zapping the OSD?
>> >> >
>> >> > Best regards,
>> >> > Lars
>> >> >
>> >> >
>> >> > [image: ariadne.ai Logo] Lars Köppel
>> >> > Developer
>> >> > Email: lars.koep...@ariadne.ai
>> >> > Phone: +49 6221 5993580 <+4962215993580>
>> >> > ariadne.ai (Germany) GmbH
>> >> > Häusserstraße 3, 69115 Heidelberg
>> >> > Amtsgericht Mannheim, HRB 744040
>> >> > Geschäftsführer: Dr. Fabian Svara
>> >> > https://ariadne.ai
>> >> > ___
>> >> > ceph-users mailing list -- ceph-users@ceph.io
>> >> > To unsubscribe send an email to ceph-users-le...@ceph.io
>> >>
>> 

[ceph-users] How radosgw considers that the file upload is done?

2024-06-12 Thread Szabo, Istvan (Agoda)
Hi,

I wonder how radosgw knows that a transaction is done and that the connection 
between the client and the gateway didn't break?

Let's see this is one request:

2024-06-12T16:26:03.386+0700 7fa34c7f0700  1 beast: 0x7fa5bc776750: 1.1.1.1 - - 
[2024-06-12T16:26:03.386063+0700] "PUT /bucket/0/2/966394.delta HTTP/1.1" 200 
238 - "User-Agent: APN/1.0 Hortonworks/1.0 HDP/3.1.0.0-78, Hadoop 3.2.2, 
aws-sdk-java/1.11.563 Linux/5.15.0-101-generic 
OpenJDK_64-Bit_Server_VM/11.0.18+10-post-Debian-1deb10u1 java/11.0.18 
scala/2.12.15 vendor/Debian 
com.amazonaws.services.s3.transfer.TransferManager/1.11.563" -
2024-06-12T16:26:03.386+0700 7fa4e9ffb700  1 == req done req=0x7fa5a4572750 
op status=0 http_status=200 latency=737ns ==

What I can see here is the
req done
op status=0

I guess that if the connection broke between the user and the gateway, the req would 
still be marked done, but what is op status? Is it the one I'm actually looking for? If 
the connection broke, would that have a different value?

Thank you


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Safe to move misplaced hosts between failure domains in the crush tree?

2024-06-12 Thread Torkil Svensgaard



On 12/06/2024 11:20, Matthias Grandl wrote:

Yeah that should work no problem.

In this case I would even recommend setting `norebalance` and using the 
trusty old upmap-remapped script (credits to Cern), to avoid unnecessary 
data movements: 
https://github.com/cernceph/ceph-scripts/blob/master/tools/upmap/upmap-remapped.py 


Worked like a charm with hardly any data movement. I used the pgremapper 
tool[1] just in case, and I'm now letting the balancer do its thing.



Cheers!


Thanks!

Mvh.

Torkil

[1] https://github.com/digitalocean/pgremapper


--

Matthias Grandl
Head Storage Engineer
matthias.gra...@croit.io 

Looking for help with your Ceph cluster? Contact us at https://croit.io


croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263
Web: https://croit.io | YouTube: https://goo.gl/PGE1Bx


On 12. Jun 2024, at 09:33, Torkil Svensgaard  wrote:



On 12/06/2024 10:22, Matthias Grandl wrote:

Correct, this should only result in misplaced objects.
> We made a mistake when we moved the servers physically so while the 
replica 3 is intact the crush tree is not accurate.
Can you elaborate on that? Does this mean after the move, multiple 
hosts are inside the same physical datacenter? In that case, once you 
correct the CRUSH layout, you would be running misplaced without a 
way to rebalance pools that are you using a datacenter crush rule.


Hi Matthias

Thanks for replying. Two of the three hosts were swapped, so I would do:

ceph osd crush move ceph-flash1 datacenter=HX1
ceph osd crush move ceph-flash2 datacenter=714


And end up with 2/3 misplaced:

 -1 4437.29248  root default
-33 1467.84814  datacenter 714
-69   69.86389  host ceph-flash2
-34 1511.25378  datacenter HX1
-73   69.86389  host ceph-flash1
-36 1458.19067  datacenter UXH
-77   69.86389  host ceph-flash3

It would only briefly be invalid between the two commands.

Mvh.

Torkil



Cheers!
--
Matthias Grandl
Head Storage Engineer
matthias.gra...@croit.io 
>
Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263
Web: https://croit.io | YouTube: https://goo.gl/PGE1Bx

On 12. Jun 2024, at 09:13, Torkil Svensgaard  wrote:

Hi

We have 3 servers for replica 3 with failure domain datacenter:

 -1 4437.29248  root default
-33 1467.84814  datacenter 714
-69   69.86389  host ceph-flash1
-34 1511.25378  datacenter HX1
-73   69.86389  host ceph-flash2
-36 1458.19067  datacenter UXH
-77   69.86389  host ceph-flash3

We made a mistake when we moved the servers physically so while the 
replica 3 is intact the crush tree is not accurate.


If we just remedy the situation with "ceph osd crush move 
ceph-flashX datacenter=Y" we will just end up with a lot of 
misplaced data and some churn, right? Or will the affected pool go 
degraded/unavailable?


Mvh.

Torkil
--
Torkil Svensgaard
Sysadmin
MR-Forskningssektionen, afs. 714
DRCMR, Danish Research Centre for Magnetic Resonance
Hvidovre Hospital
Kettegård Allé 30
DK-2650 Hvidovre
Denmark
Tel: +45 386 22828
E-mail: tor...@drcmr.dk
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


--
Torkil Svensgaard
Sysadmin
MR-Forskningssektionen, afs. 714
DRCMR, Danish Research Centre for Magnetic Resonance
Hvidovre Hospital
Kettegård Allé 30
DK-2650 Hvidovre
Denmark
Tel: +45 386 22828
E-mail:tor...@drcmr.dk 




--
Torkil Svensgaard
Sysadmin
MR-Forskningssektionen, afs. 714
DRCMR, Danish Res

[ceph-users] Re: CephFS metadata pool size

2024-06-12 Thread Eugen Block
I don't have any good explanation at this point. Can you share some  
more information like:


ceph pg ls-by-pool 
ceph osd df (for the relevant OSDs)
ceph df

Thanks,
Eugen

Zitat von Lars Köppel :


Since my last update the size of the largest OSD increased by 0.4 TiB while
the smallest one only increased by 0.1 TiB. How is this possible?

Because the metadata pool reported to have only 900MB space left, I stopped
the hot-standby MDS. This gave me 8GB back but these filled up in the last
2h.
I think I have to zap the next OSD because the filesystem is getting read
only...

How is it possible that an OSD has over 1 TiB less data on it after a
rebuild? And how is it possible to have so different sizes of OSDs?


[image: ariadne.ai Logo] Lars Köppel
Developer
Email: lars.koep...@ariadne.ai
Phone: +49 6221 5993580 <+4962215993580>
ariadne.ai (Germany) GmbH
Häusserstraße 3, 69115 Heidelberg
Amtsgericht Mannheim, HRB 744040
Geschäftsführer: Dr. Fabian Svara
https://ariadne.ai


On Tue, Jun 11, 2024 at 3:47 PM Lars Köppel  wrote:


Only in warning mode. And there were no PG splits or merges in the last 2
month.


[image: ariadne.ai Logo] Lars Köppel
Developer
Email: lars.koep...@ariadne.ai
Phone: +49 6221 5993580 <+4962215993580>
ariadne.ai (Germany) GmbH
Häusserstraße 3, 69115 Heidelberg
Amtsgericht Mannheim, HRB 744040
Geschäftsführer: Dr. Fabian Svara
https://ariadne.ai


On Tue, Jun 11, 2024 at 3:32 PM Eugen Block  wrote:


I don't think scrubs can cause this. Do you have autoscaler enabled?

Zitat von Lars Köppel :

> Hi,
>
> thank you for your response.
>
> I don't think this thread covers my problem, because the OSDs for the
> metadata pool fill up at different rates. So I would think this is no
> direct problem with the journal.
> Because we had earlier problems with the journal I changed some
> settings(see below). I already restarted all MDS multiple times but no
> change here.
>
> The health warnings regarding cache pressure resolve normally after a
> short period of time, when the heavy load on the client ends. Sometimes
it
> stays a bit longer because an rsync is running and copying data on the
> cluster(rsync is not good at releasing the caps).
>
> Could it be a problem if scrubs run most of the time in the background?
Can
> this block any other tasks or generate new data itself?
>
> Best regards,
> Lars
>
>
> global  basic     mds_cache_memory_limit                 17179869184
> global  advanced  mds_max_caps_per_client                16384
> global  advanced  mds_recall_global_max_decay_threshold  262144
> global  advanced  mds_recall_max_decay_rate              1.00
> global  advanced  mds_recall_max_decay_threshold         262144
> mds     advanced  mds_cache_trim_threshold               131072
> mds     advanced  mds_heartbeat_grace                    120.00
> mds     advanced  mds_heartbeat_reset_grace              7400
> mds     advanced  mds_tick_interval                      3.00
>
>
> [image: ariadne.ai Logo] Lars Köppel
> Developer
> Email: lars.koep...@ariadne.ai
> Phone: +49 6221 5993580 <+4962215993580>
> ariadne.ai (Germany) GmbH
> Häusserstraße 3, 69115 Heidelberg
> Amtsgericht Mannheim, HRB 744040
> Geschäftsführer: Dr. Fabian Svara
> https://ariadne.ai
>
>
> On Tue, Jun 11, 2024 at 2:05 PM Eugen Block  wrote:
>
>> Hi,
>>
>> can you check if this thread [1] applies to your situation? You don't
>> have multi-active MDS enabled, but maybe it's still some journal
>> trimming, or maybe misbehaving clients? In your first post there were
>> health warnings regarding cache pressure and cache size. Are those
>> resolved?
>>
>> [1]
>>
>>
https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/7U27L27FHHPDYGA6VNNVWGLTXCGP7X23/#VOOV235D4TP5TEOJUWHF4AVXIOTHYQQE
>>
>> Zitat von Lars Köppel :
>>
>> > Hello everyone,
>> >
>> > short update to this problem.
>> > The zapped OSD has been rebuilt and now has 1.9 TiB (the expected size,
>> > ~50%). The other 2 OSDs are now at 2.8 and 3.2 TiB, respectively. They
>> > jumped up and down a lot, but the higher one has now also reached
>> > 'nearfull' status. How is this possible? What is going on?
>> >
>> > Does anyone have a solution for how to fix this without zapping the OSD?
>> >
>> > Best regards,
>> > Lars
>> >
>> >
>> > [image: ariadne.ai Logo] Lars Köppel
>> > Developer
>> > Email: lars.koep...@ariadne.ai
>> > Phone: +49 6221 5993580 <+4962215993580>
>> > ariadne.ai (Germany) GmbH
>> > Häusserstraße 3, 69115 Heidelberg
>> > Amtsgericht Mannheim, HRB 744040
>> > Geschäftsführer: Dr. Fabian Svara
>> > https://ariadne.ai

[ceph-users] Patching Ceph cluster

2024-06-12 Thread Michael Worsham
What is the proper way to patch a Ceph cluster and reboot the servers in said 
cluster if a reboot is necessary for said updates? And is it possible to 
automate it via Ansible?


[ceph-users] Re: Patching Ceph cluster

2024-06-12 Thread Anthony D'Atri
Do you mean patching the OS?

If so, easy -- one node at a time, then after it comes back up, wait until all 
PGs are active+clean and the mon quorum is complete before proceeding.
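
For illustration, a minimal per-node sketch of that sequence (the hostname,
package manager, and the optional noout flag are assumptions, adapt to your
environment):

    ceph osd set noout              # optional: avoid rebalancing during the short reboot
    ssh ceph-node01 'apt-get update && apt-get -y upgrade && reboot'
    # once the node is back up, re-run this until all PGs are active+clean
    # and all mons are in quorum again:
    ceph -s
    ceph osd unset noout            # clear the flag before moving to the next node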



> On Jun 12, 2024, at 07:56, Michael Worsham  
> wrote:
> 
> What is the proper way to patch a Ceph cluster and reboot the servers in said 
> cluster if a reboot is necessary for said updates? And is it possible to 
> automate it via Ansible?


[ceph-users] Re: CephFS metadata pool size

2024-06-12 Thread Anthony D'Atri
If you have:

* pg_num too low (defaults are too low)
* pg_num not a power of 2
* pg_num != number of OSDs in the pool
* balancer not enabled

any of those might result in imbalance.
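
For reference, a few commands that check those points (the pool name below is
a placeholder):

    ceph osd pool get cephfs_metadata pg_num   # is it a power of 2 and high enough?
    ceph osd pool autoscale-status             # what the autoscaler would recommend
    ceph balancer status                       # is the balancer enabled and active?
    ceph osd df                                # per-OSD utilization, to quantify the imbalance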

> On Jun 12, 2024, at 07:33, Eugen Block  wrote:
> 
> I don't have any good explanation at this point. Can you share some more 
> information like:
> 
> ceph pg ls-by-pool 
> ceph osd df (for the relevant OSDs)
> ceph df
> 
> Thanks,
> Eugen
> 
> Zitat von Lars Köppel :
> 
>> Since my last update the size of the largest OSD increased by 0.4 TiB while
>> the smallest one only increased by 0.1 TiB. How is this possible?
>> 
>> Because the metadata pool reported only 900 MB of space left, I stopped
>> the hot-standby MDS. This gave me 8 GB back, but that space filled up
>> again within the last 2 hours.
>> I think I have to zap the next OSD because the filesystem is becoming
>> read-only...
>> 
>> How is it possible that an OSD has over 1 TiB less data on it after a
>> rebuild? And how is it possible for the OSDs to have such different sizes?
>> 
>> 
>> [image: ariadne.ai Logo] Lars Köppel
>> Developer
>> Email: lars.koep...@ariadne.ai
>> Phone: +49 6221 5993580 <+4962215993580>
>> ariadne.ai (Germany) GmbH
>> Häusserstraße 3, 69115 Heidelberg
>> Amtsgericht Mannheim, HRB 744040
>> Geschäftsführer: Dr. Fabian Svara
>> https://ariadne.ai
>> 
>> 
>> On Tue, Jun 11, 2024 at 3:47 PM Lars Köppel  wrote:
>> 
>>> Only in warning mode. And there were no PG splits or merges in the last 2
>>> months.
>>> 
>>> 
>>> [image: ariadne.ai Logo] Lars Köppel
>>> Developer
>>> Email: lars.koep...@ariadne.ai
>>> Phone: +49 6221 5993580 <+4962215993580>
>>> ariadne.ai (Germany) GmbH
>>> Häusserstraße 3, 69115 Heidelberg
>>> Amtsgericht Mannheim, HRB 744040
>>> Geschäftsführer: Dr. Fabian Svara
>>> https://ariadne.ai
>>> 
>>> 
>>> On Tue, Jun 11, 2024 at 3:32 PM Eugen Block  wrote:
>>> 
 I don't think scrubs can cause this. Do you have autoscaler enabled?
 
 Zitat von Lars Köppel :
 
 > Hi,
 >
 > Thank you for your response.
 >
 > I don't think this thread covers my problem, because the OSDs for the
 > metadata pool fill up at different rates. So I don't think this is a
 > direct problem with the journal.
 > Because we had earlier problems with the journal, I changed some
 > settings (see below). I have already restarted all MDS daemons multiple
 > times, but nothing changed.
 >
 > The health warnings regarding cache pressure normally resolve after a
 > short period of time, once the heavy load on the client ends. Sometimes
 > they stay a bit longer because an rsync is running and copying data on
 > the cluster (rsync is not good at releasing its caps).
 >
 > Could it be a problem if scrubs run in the background most of the time?
 > Can they block other tasks or generate new data themselves?
 >
 > Best regards,
 > Lars
 >
 >
 > global  basic     mds_cache_memory_limit                  17179869184
 > global  advanced  mds_max_caps_per_client                 16384
 > global  advanced  mds_recall_global_max_decay_threshold   262144
 > global  advanced  mds_recall_max_decay_rate               1.00
 > global  advanced  mds_recall_max_decay_threshold          262144
 > mds     advanced  mds_cache_trim_threshold                131072
 > mds     advanced  mds_heartbeat_grace                     120.00
 > mds     advanced  mds_heartbeat_reset_grace               7400
 > mds     advanced  mds_tick_interval                       3.00
 >
 >
 > [image: ariadne.ai Logo] Lars Köppel
 > Developer
 > Email: lars.koep...@ariadne.ai
 > Phone: +49 6221 5993580 <+4962215993580>
 > ariadne.ai (Germany) GmbH
 > Häusserstraße 3, 69115 Heidelberg
 > Amtsgericht Mannheim, HRB 744040
 > Geschäftsführer: Dr. Fabian Svara
 > https://ariadne.ai
 >
 >
 > On Tue, Jun 11, 2024 at 2:05 PM Eugen Block  wrote:
 >
 >> Hi,
 >>
 >> can you check if this thread [1] applies to your situation? You don't
 >> have multi-active MDS enabled, but maybe it's still some journal
 >> trimming, or maybe misbehaving clients? In your first post there were
 >> health warnings regarding cache pressure and cache size. Are those
 >> resolved?
 >>
 >> [1]
 >>
 >>
 https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/7U27L27FHHPDYGA6VNNVWGLTXCGP7X23/#VOOV235D4TP5TEOJUWHF4AVXIOTHYQQE
 >>
 >> Zitat von Lars Köppel :
 >>
 >> > Hello everyone,
 >> >
 >> > short update to this problem.
 >> > The zapped OSD has been rebuilt and now has 1.9 TiB (the expected size,
 >> > ~50%). The other 2 OSDs are now at 2.8 and 3.2 TiB, respectively. They
 >> > jumped up and down a lot but the higher one has now

[ceph-users] Re: Patching Ceph cluster

2024-06-12 Thread Daniel Brown


There’s also a maintenance mode that you can set for each server while you’re 
doing updates, so that the cluster doesn’t try to move data off the affected 
OSDs while the server being updated is offline or down. I’ve worked some on 
automating this with Ansible, but have found that my process (and/or my 
cluster) still requires some manual intervention while it’s running to get 
things done cleanly.
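
On a cephadm-managed cluster, a sketch of that maintenance mode per host
(the hostname is a placeholder):

    ceph orch host maintenance enter ceph-node01   # stops the host's daemons and sets noout for its OSDs
    # ... patch and reboot the host ...
    ceph orch host maintenance exit ceph-node01    # restarts the daemons and clears the flag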



> On Jun 12, 2024, at 8:49 AM, Anthony D'Atri  wrote:
> 
> Do you mean patching the OS?
> 
> If so, easy -- one node at a time, then after it comes back up, wait until 
> all PGs are active+clean and the mon quorum is complete before proceeding.
> 
> 
> 
>> On Jun 12, 2024, at 07:56, Michael Worsham  
>> wrote:
>> 
>> What is the proper way to patch a Ceph cluster and reboot the servers in 
>> said cluster if a reboot is necessary for said updates? And is it possible 
>> to automate it via Ansible?


[ceph-users] Re: CephFS metadata pool size

2024-06-12 Thread Lars Köppel
I am happy to help you with as much information as possible. I probably
just don't know where to look for it.
Below is the requested information. The cluster is rebuilding the
zapped OSD at the moment. This will probably take the next few days.


sudo ceph pg ls-by-pool metadata
PG OBJECTS  DEGRADED  MISPLACED  UNFOUND  BYTES OMAP_BYTES*
 OMAP_KEYS*  LOG   LOG_DUPS  STATE
 SINCE  VERSION  REPORTED UP ACTING
SCRUB_STAMP  DEEP_SCRUB_STAMP
LAST_SCRUB_DURATION  SCRUB_SCHEDULING
10.0   5217325   4994695  00   4194304   5880891340
9393865  1885  3000  active+undersized+degraded+remapped+backfill_wait
2h  79875'180849582  79875:391519635  [0,1,2]p0  [1,2]p1
 2024-06-11T09:08:09.829362+  2024-05-28T05:52:59.321589+
   627  periodic scrub scheduled @ 2024-06-17T08:21:31.808348+
10.1   5214785   5193424  00 0   5843682713
9410150  1912  3000  active+undersized+degraded+remapped+backfill_wait
3h  79875'180914288  79875:342746928  [2,1,0]p2  [2,1]p2
 2024-06-01T15:56:28.927288+  2024-05-27T03:31:37.682966+
   966  queued for scrub
10.2   5218432   5187168  00 0   6402011266
9812513  1874  3000  active+undersized+degraded+remapped+backfill_wait
3h  79875'180970531  79875:341340204  [0,1,2]p0  [1,2]p1
 2024-06-11T13:40:58.994256+  2024-06-11T13:40:58.994256+
  1942  periodic scrub scheduled @ 2024-06-17T06:07:15.329675+
10.3   5217413   5217413  00   8388788   5766005023
9271787  1923  3000  active+undersized+degraded+remapped+backfill_wait
3h  79875'181012233  79875:388295881  [1,0,2]p1  [1,2]p1
 2024-06-12T00:35:56.965547+  2024-05-23T19:54:56.121729+
   492  periodic scrub scheduled @ 2024-06-18T06:39:31.103864+
10.4   5220069   5220069  00  12583466   6027548724
9537290  1959  3000  active+undersized+degraded+remapped+backfill_wait
3h  79875'181576075  79875:405295868  [1,2,0]p1  [1,2]p1
 2024-06-11T17:47:22.923514+  2024-05-31T02:06:55.339574+
   581  periodic scrub scheduled @ 2024-06-17T00:59:37.214420+
10.5   5216162   5211999  00   4194304   5941347251
9542764  1930  3000  active+undersized+degraded+remapped+backfill_wait
3h  79875'180455793  79875:338418517  [2,1,0]p2  [2,1]p2
 2024-06-11T22:50:16.170708+  2024-05-30T23:49:54.316379+
   528  periodic scrub scheduled @ 2024-06-17T04:39:25.905185+
10.6   5216100   4980459  00   4521984   6428088514
9850762  1911  3000  active+undersized+degraded+remapped+backfill_wait
2h  79875'184045876  79875:396809795  [0,2,1]p0  [1,2]p1
 2024-06-11T22:24:05.102716+  2024-06-11T22:24:05.102716+
  1082  periodic scrub scheduled @ 2024-06-17T07:58:44.289885+
10.7   5218232   5218232  00   4194304   6377065363
9849360  1919  3000  active+undersized+degraded+remapped+backfill_wait
3h  79875'182672562  79875:342449062  [1,0,2]p1  [1,2]p1
 2024-06-11T06:22:15.689422+  2024-06-11T06:22:15.689422+
  8225  periodic scrub scheduled @ 2024-06-17T13:05:59.225052+
10.8   5219620   5182816  00 0   6167304290
9691796  1896  3000  active+undersized+degraded+remapped+backfill_wait
3h  79875'179628377  79875:378022884  [2,1,0]p2  [2,1]p2
 2024-06-11T22:06:01.386763+  2024-06-11T22:06:01.386763+
  1286  periodic scrub scheduled @ 2024-06-17T07:54:54.133093+
10.9   5219448   5164591  00   8388698   5796048346
9338312  1868  3000  active+undersized+degraded+remapped+backfill_wait
3h  79875'181739392  79875:387412389  [2,1,0]p2  [2,1]p2
 2024-06-12T05:21:00.586747+  2024-05-26T11:10:59.780673+
   539  periodic scrub scheduled @ 2024-06-18T15:32:59.155092+
10.a   5219861   5163635  00  12582912   5841839055
9387200  1916  3000  active+undersized+degraded+remapped+backfill_wait
3h  79875'180205688  79875:379381294  [1,2,0]p1  [1,2]p1
 2024-06-11T12:35:05.571200+  2024-05-22T11:07:16.041773+
  1093  periodic deep scrub scheduled @ 2024-06-17T05:21:40.136463+
10.b   5217949   5217949  00  16777216   5935863260
9462127  1881  3000  active+undersized+degraded+remapped+backfill_wait
3h  79875'181655745  79875:343806807  [0,1,2]p0  [1,2]p1
 2024-06-11T22:41:28.976920+  2024-05-26T08:43:29.217457+
   520  periodic scrub scheduled @ 2024-06-17T17:44:32.764093+
10.c   5221697   5217118  00   4194304   6015217841
9574445  1928  3000  active+undersized+degraded+remapped+backfill_wait
3h  79875'180892826  79875:341490398  [2,1,0]p2  [2,1]p2
 2024-06-11T09:20:58.443473+  2024-05-30T00:13:50.306507+
   768  periodic scrub scheduled @ 2024-06-16T19:41:21.977436+
10.d   5217727   4908764  00 0   5825598519