黄浩然 <huanghaoran_1...@163.com>

---- Replied Message ----
From: <ceph-users-requ...@ceph.io>
Date: 04/25/2025 17:02
To: <ceph-users@ceph.io>
Subject: ceph-users Digest, Vol 130, Issue 155
Send ceph-users mailing list submissions to
ceph-users@ceph.io

To subscribe or unsubscribe via email, send a message with subject or
body 'help' to
ceph-users-requ...@ceph.io

You can reach the person managing the list at
ceph-users-ow...@ceph.io

When replying, please edit your Subject line so it is more specific
than "Re: Contents of ceph-users digest..."

Today's Topics:

1. Re: Patching Ceph cluster (Sake Ceph)
2. Re: Patching Ceph cluster (Sake Ceph)


----------------------------------------------------------------------

Date: Fri, 25 Apr 2025 10:59:52 +0200 (CEST)
From: Sake Ceph <c...@paulusma.eu>
Subject: [ceph-users] Re: Patching Ceph cluster
To: Michael Worsham <mwors...@datadimensions.com>,
"ceph-users@ceph.io" <ceph-users@ceph.io>
Message-ID: <683056728.264684.1745571592...@webmail.strato.com>
Content-Type: text/plain; charset=UTF-8

The tiebreaker is for a stretch cluster, which is what we deployed. It's only used to 
assign the host to a group.

The playbook is indeed written for RHEL, because that's the OS we use. It can 
be improved a lot, but it's a start for someone else. I know I still need to 
share this on GitHub, but I'm too busy at the moment at work and at home.

On 25-04-2025 06:05 CEST, Michael Worsham <mwors...@datadimensions.com> wrote:


I've been reading over the playbook code, and it's nicely written. I know it's 
primarily RHEL focused, but I think it could be modified for Ubuntu/Debian 
platforms as well.


A couple of questions though...


In the test example hosts file, what is the tiebreaker?


I know there isn't a role in the roles folder, but do you have an example of 
one, just so we know what it does?


Thanks.


-- Michael




Get Outlook for Android (https://aka.ms/AAb9ysg)
------------------------------
From: Sake Ceph <c...@paulusma.eu>
Sent: Friday, June 14, 2024 4:28:34 AM
To: Michael Worsham <mwors...@datadimensions.com>; ceph-users@ceph.io 
<ceph-users@ceph.io>
Subject: Re: [ceph-users] Re: Patching Ceph cluster
This is an external email. Please take care when clicking links or opening 
attachments. When in doubt, check with the Help Desk or Security.


I needed to do some cleaning before I could share this :)
Maybe you or someone else can use it.

Kind regards,
Sake

On 14-06-2024 03:53 CEST, Michael Worsham <mwors...@datadimensions.com> wrote:


I'd love to see what your playbook(s) looks like for doing this.

-- Michael
________________________________
From: Sake Ceph <c...@paulusma.eu>
Sent: Thursday, June 13, 2024 4:05 PM
To: ceph-users@ceph.io <ceph-users@ceph.io>
Subject: [ceph-users] Re: Patching Ceph cluster

This is an external email. Please take care when clicking links or opening 
attachments. When in doubt, check with the Help Desk or Security.


Yeah, we fully automated this with Ansible. In short, we do the following:

1. Check if the cluster is healthy before continuing (via the REST API); only HEALTH_OK 
is good.
2. Disable scrub and deep-scrub.
3. Update all applications on all the hosts in the cluster.
4. For every host, one by one, do the following:
4a. Check if applications got updated.
4b. Check via the reboot-hint whether a reboot is necessary.
4c. If applications got updated or a reboot is necessary, do the following:
4c1. Put the host in maintenance mode.
4c2. Reboot the host if necessary.
4c3. Check and wait via 'ceph orch host ls' until the status of the host is maintenance 
and nothing else.
4c4. Take the host out of maintenance mode.
4d. Check if the cluster is healthy before continuing (via the REST API); only warnings 
about scrub and deep-scrub are allowed, and no PGs should be degraded.
5. Enable scrub and deep-scrub when all hosts are done.
6. Check if the cluster is healthy (via the REST API); only HEALTH_OK is good.
7. Done.
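Steps 1, 2, and 4c1 above could be sketched as Ansible tasks. This is a minimal sketch under assumptions, not our actual playbook: it uses the `ceph` CLI instead of the REST API, and the `mgr` group name is a placeholder.

```yaml
# Sketch only: assumes the ceph CLI is available on a host in the (hypothetical) mgr group.
- name: Check cluster is healthy before continuing (step 1)
  ansible.builtin.command: ceph health
  register: health
  changed_when: false
  failed_when: "'HEALTH_OK' not in health.stdout"
  delegate_to: "{{ groups['mgr'][0] }}"

- name: Disable scrub and deep-scrub (step 2)
  ansible.builtin.command: "ceph osd set {{ item }}"
  loop: [noscrub, nodeep-scrub]
  delegate_to: "{{ groups['mgr'][0] }}"

- name: Put the host in maintenance mode (step 4c1)
  ansible.builtin.command: "ceph orch host maintenance enter {{ inventory_hostname }}"
  delegate_to: "{{ groups['mgr'][0] }}"
```

Step 5 would mirror step 2 with `ceph osd unset`.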

For upgrading the OS we have something similar, but exiting maintenance mode is 
broken (with 17.2.7) :(
I need to check the tracker for similar issues and if I can't find anything, I 
will create a ticket.

Kind regards,
Sake

On 12-06-2024 19:02 CEST, Daniel Brown <daniel.h.brown@thermify.cloud> wrote:


I have two Ansible roles, one for enter, one for exit. There are likely better 
ways to do this, and I'll not be surprised if someone here lets me know. 
They're using orch commands via the cephadm shell. I'm using Ansible for other 
configuration management in my environment as well, including setting up 
clients of the ceph cluster.

Below are excerpts from main.yml in the "tasks" for the enter/exit roles. The host 
I'm running Ansible from is one of my Ceph servers. I've limited which processes 
run there, though, so it's in the cluster but not equal to the others.


—————
Enter
—————

- name: Ceph Maintenance Mode Enter
  ansible.builtin.shell:
    cmd: "cephadm shell ceph orch host maintenance enter {{ (ansible_ssh_host | default(ansible_host)) | default(inventory_hostname) }} --force --yes-i-really-mean-it"
  become: true

—————
Exit
—————

- name: Ceph Maintenance Mode Exit
  ansible.builtin.shell:
    cmd: "cephadm shell ceph orch host maintenance exit {{ (ansible_ssh_host | default(ansible_host)) | default(inventory_hostname) }}"
  become: true
  connection: local

- name: Wait for Ceph to be available
  ansible.builtin.wait_for:
    delay: 60
    host: "{{ (ansible_ssh_host | default(ansible_host)) | default(inventory_hostname) }}"
    port: 9100
  connection: local
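A hypothetical playbook wiring two such roles around an update step could look like the following sketch (the role and group names are assumptions, not from the original post):

```yaml
# Hypothetical wiring: role names and the ceph_nodes group are placeholders.
- hosts: ceph_nodes
  serial: 1                           # one node at a time
  become: true
  roles:
    - ceph_maintenance_enter          # hypothetical role holding the Enter tasks
  tasks:
    - name: Apply OS updates
      ansible.builtin.package:
        name: "*"
        state: latest
  post_tasks:
    - name: Leave maintenance mode
      ansible.builtin.import_role:
        name: ceph_maintenance_exit   # hypothetical role holding the Exit tasks
```

With `serial: 1`, Ansible completes the enter/update/exit cycle on each host before touching the next, which matches the one-host-at-a-time advice elsewhere in this thread.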






On Jun 12, 2024, at 11:28 AM, Michael Worsham <mwors...@datadimensions.com> 
wrote:

Interesting. How do you set this "maintenance mode"? If you have a series of 
documented steps that you have to do and could provide as an example, that 
would be beneficial for my efforts.

We are in the process of standing up both a dev-test environment consisting of 
3 Ceph servers (strictly for testing purposes) and a new production environment 
consisting of 20+ Ceph servers.

We are using Ubuntu 22.04.

-- Michael
From: Daniel Brown <daniel.h.brown@thermify.cloud>
Sent: Wednesday, June 12, 2024 9:18 AM
To: Anthony D'Atri <anthony.da...@gmail.com>
Cc: Michael Worsham <mwors...@datadimensions.com>; ceph-users@ceph.io 
<ceph-users@ceph.io>
Subject: Re: [ceph-users] Patching Ceph cluster
This is an external email. Please take care when clicking links or opening 
attachments. When in doubt, check with the Help Desk or Security.


There's also a maintenance mode that you can set for each server as you're 
doing updates, so that the cluster doesn't try to move data from affected 
OSDs while the server being updated is offline or down. I've worked some on 
automating this with Ansible, but have found my process (and/or my cluster) 
still requires some manual intervention while it's running to get things done 
cleanly.



On Jun 12, 2024, at 8:49 AM, Anthony D'Atri <anthony.da...@gmail.com> wrote:

Do you mean patching the OS?

If so, easy -- one node at a time, then after it comes back up, wait until all 
PGs are active+clean and the mon quorum is complete before proceeding.
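That wait can be automated by polling `ceph status`. Here is a hedged sketch: the JSON field names (`quorum`, `monmap.num_mons`, `pgmap.pgs_by_state`) match recent releases, but verify them against your version's output before relying on this.

```yaml
# Sketch only: poll until every PG is active+clean and the mon quorum is complete.
- name: Wait for all PGs active+clean and full mon quorum
  ansible.builtin.command: ceph status -f json
  register: st
  changed_when: false
  retries: 60
  delay: 30
  until: >
    (st.stdout | from_json).quorum | length ==
      (st.stdout | from_json).monmap.num_mons
    and (st.stdout | from_json).pgmap.pgs_by_state | length == 1
    and (st.stdout | from_json).pgmap.pgs_by_state[0].state_name == 'active+clean'
```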



On Jun 12, 2024, at 07:56, Michael Worsham <mwors...@datadimensions.com> wrote:

What is the proper way to patch a Ceph cluster and reboot the servers in said 
cluster if a reboot is necessary for said updates? And is it possible to 
automate it via Ansible?

This message and its attachments are from Data Dimensions and are intended only 
for the use of the individual or entity to which it is addressed, and may 
contain information that is privileged, confidential, and exempt from 
disclosure under applicable law. If the reader of this message is not the 
intended recipient, or the employee or agent responsible for delivering the 
message to the intended recipient, you are hereby notified that any 
dissemination, distribution, or copying of this communication is strictly 
prohibited. If you have received this communication in error, please notify the 
sender immediately and permanently delete the original email and destroy any 
copies or printouts of this email as well as any attachments.
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


------------------------------

Date: Fri, 25 Apr 2025 11:02:28 +0200 (CEST)
From: Sake Ceph <c...@paulusma.eu>
Subject: [ceph-users] Re: Patching Ceph cluster
To: Lukasz Borek <luk...@borek.org.pl>
Cc: "ceph-users@ceph.io" <ceph-users@ceph.io>
Message-ID: <796303375.265011.1745571748...@webmail.strato.com>
Content-Type: text/plain; charset=UTF-8

Nope, it was really broken in 17.2.7. When RHEL 10 becomes available, I will look 
into this part again :)

On 25-04-2025 07:22 CEST, Lukasz Borek <luk...@borek.org.pl> wrote:


For upgrading the OS we have something similar, but exiting maintenance mode is 
broken (with 17.2.7) :(
I need to check the tracker for similar issues and if I can't find anything, I 
will create a ticket.

For 18.2.2, the first maintenance exit command threw an exception for some reason. 
In my patching script I execute the commands in a loop and the second attempt 
usually works.

exit maint 1/3
Error EINVAL: Traceback (most recent call last):
  File "/usr/share/ceph/mgr/mgr_module.py", line 1809, in _handle_command
    return self.handle_command(inbuf, cmd)
  File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 183, in handle_command
    return dispatch[cmd['prefix']].call(self, cmd, inbuf)
  File "/usr/share/ceph/mgr/mgr_module.py", line 474, in call
    return self.func(mgr, **kwargs)
  File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 119, in <lambda>
    wrapper_copy = lambda *l_args, **l_kwargs: wrapper(*l_args, **l_kwargs)  # noqa: E731
  File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 108, in wrapper
    return func(*args, **kwargs)
  File "/usr/share/ceph/mgr/orchestrator/module.py", line 778, in _host_maintenance_exit
    raise_if_exception(completion)
  File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 237, in raise_if_exception
    e = pickle.loads(c.serialized_exception)
TypeError: __init__() missing 2 required positional arguments: 'hostname' and 'addr'

exit maint 2/3
Ceph cluster f3e63d9e-2f4c-11ef-87a2-0f1170f55ed5 on cephbackup-osd1 has exited 
maintenance mode
exit maint 3/3
Error EINVAL: Host cephbackup-osd1 is not in maintenance mode
Fri Apr 25 07:17:58 CEST 2025 cluster state is HEALTH_WARN
Fri Apr 25 07:18:02 CEST 2025 cluster state is HEALTH_WARN
[...]
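The retry loop described above maps naturally onto Ansible's `retries`/`until` keywords. A sketch under assumptions (the `mgr` group is a placeholder; the error text is matched from the log above, so a late retry that reports the host is already out of maintenance is treated as success):

```yaml
# Sketch only: retry maintenance exit past the first-attempt traceback.
- name: Exit maintenance mode, retrying on failure
  ansible.builtin.command: "ceph orch host maintenance exit {{ inventory_hostname }}"
  register: mexit
  retries: 3
  delay: 10
  until: mexit.rc == 0 or 'is not in maintenance mode' in mexit.stderr
  failed_when: mexit.rc != 0 and 'is not in maintenance mode' not in mexit.stderr
  delegate_to: "{{ groups['mgr'][0] }}"
```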





On Thu, 13 Jun 2024 at 22:07, Sake Ceph <c...@paulusma.eu> wrote:


For upgrading the OS we have something similar, but exiting maintenance mode is 
broken (with 17.2.7) :(
I need to check the tracker for similar issues and if I can't find anything, I 
will create a ticket.

Kind regards,
Sake




--

Łukasz Borek
luk...@borek.org.pl

------------------------------

Subject: Digest Footer

_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


------------------------------

End of ceph-users Digest, Vol 130, Issue 155
********************************************