I needed to do some cleaning before I could share this :) Maybe you or someone else can use it.
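For orientation, here is a minimal sketch of the pre-flight part of the procedure quoted below. It is not the actual playbook: it checks health through the plain CLI rather than the REST API the steps mention, and the task names and the cephadm wrapper are illustrative assumptions.

# Pre-flight: require HEALTH_OK before touching anything (sketch only;
# the real playbook does this check via the REST API).
- name: Get cluster status
  ansible.builtin.command: cephadm shell -- ceph status --format json
  register: ceph_status
  changed_when: false
  become: true

- name: Abort unless the cluster reports HEALTH_OK
  ansible.builtin.assert:
    that:
      - (ceph_status.stdout | from_json).health.status == 'HEALTH_OK'
    fail_msg: "Cluster is not HEALTH_OK, refusing to start patching"

- name: Disable scrub and deep-scrub for the duration of the maintenance
  ansible.builtin.command: "cephadm shell -- ceph osd set {{ item }}"
  loop:
    - noscrub
    - nodeep-scrub
  become: true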
Kind regards,
Sake

> Op 14-06-2024 03:53 CEST schreef Michael Worsham <mwors...@datadimensions.com>:
>
> I'd love to see what your playbook(s) looks like for doing this.
>
> -- Michael
> ________________________________
> From: Sake Ceph <c...@paulusma.eu>
> Sent: Thursday, June 13, 2024 4:05 PM
> To: ceph-users@ceph.io <ceph-users@ceph.io>
> Subject: [ceph-users] Re: Patching Ceph cluster
>
> Yeah, we fully automated this with Ansible. In short, we do the following:
>
> 1. Check that the cluster is healthy before continuing (via the REST API); only HEALTH_OK is accepted.
> 2. Disable scrub and deep-scrub.
> 3. Update all applications on all hosts in the cluster.
> 4. For every host, one by one, do the following:
>    4a. Check whether applications were updated.
>    4b. Check via the reboot hint whether a reboot is necessary.
>    4c. If applications were updated or a reboot is necessary:
>        4c1. Put the host in maintenance mode.
>        4c2. Reboot the host if necessary.
>        4c3. Check and wait via 'ceph orch host ls' until the host's status is "maintenance" and nothing else.
>        4c4. Take the host out of maintenance mode.
>    4d. Check that the cluster is healthy before continuing (via the REST API); only warnings about scrub and deep-scrub are allowed, and no PGs may be degraded.
> 5. Re-enable scrub and deep-scrub when all hosts are done.
> 6. Check that the cluster is healthy (via the REST API); only HEALTH_OK is accepted.
> 7. Done.
>
> For upgrading the OS we have something similar, but exiting maintenance mode is broken (with 17.2.7) :(
> I need to check the tracker for similar issues and, if I can't find anything, I will create a ticket.
>
> Kind regards,
> Sake
>
> > Op 12-06-2024 19:02 CEST schreef Daniel Brown <daniel.h.brown@thermify.cloud>:
> >
> > I have two Ansible roles, one for enter, one for exit. There are likely better ways to do this, and I'll not be surprised if someone here lets me know. They're using orch commands via the cephadm shell. I'm using Ansible for other configuration management in my environment as well, including setting up clients of the ceph cluster.
> >
> > Below are excerpts from main.yml in the "tasks" for the enter/exit roles. The host I'm running Ansible from is one of my Ceph servers - I've limited which processes run there, so it's in the cluster but not equal to the others.
> >
> > -----
> > Enter
> > -----
> >
> > - name: Ceph Maintenance Mode Enter
> >   shell:
> >     cmd: 'cephadm shell ceph orch host maintenance enter {{ (ansible_ssh_host|default(ansible_host))|default(inventory_hostname) }} --force --yes-i-really-mean-it'
> >   become: True
> >
> > -----
> > Exit
> > -----
> >
> > - name: Ceph Maintenance Mode Exit
> >   shell:
> >     cmd: 'cephadm shell ceph orch host maintenance exit {{ (ansible_ssh_host|default(ansible_host))|default(inventory_hostname) }}'
> >   become: True
> >   connection: local
> >
> > - name: Wait for Ceph to be available
> >   ansible.builtin.wait_for:
> >     delay: 60
> >     host: '{{ (ansible_ssh_host|default(ansible_host))|default(inventory_hostname) }}'
> >     port: 9100
> >   connection: local
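Roles like the enter/exit excerpts above could be driven one host at a time with a play along these lines. This is a sketch only: the group name, the role names, and the Ubuntu reboot-required hint are assumptions, not taken from either poster's setup.

- hosts: ceph_nodes
  serial: 1            # one host at a time, as in step 4 of the procedure
  become: true
  tasks:
    - name: Check whether the host wants a reboot (Debian/Ubuntu reboot hint)
      ansible.builtin.stat:
        path: /var/run/reboot-required
      register: reboot_hint

    - name: Put the host into maintenance mode
      ansible.builtin.include_role:
        name: ceph_maintenance_enter        # e.g. the "Enter" tasks above
      when: reboot_hint.stat.exists

    - name: Reboot the host
      ansible.builtin.reboot:
        reboot_timeout: 900
      when: reboot_hint.stat.exists

    - name: Take the host out of maintenance mode
      ansible.builtin.include_role:
        name: ceph_maintenance_exit         # e.g. the "Exit" tasks above
      when: reboot_hint.stat.exists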
> > > On Jun 12, 2024, at 11:28 AM, Michael Worsham <mwors...@datadimensions.com> wrote:
> > >
> > > Interesting. How do you set this "maintenance mode"? If you have a series of documented steps that you have to do and could provide as an example, that would be beneficial for my efforts.
> > >
> > > We are in the process of standing up both a dev-test environment consisting of 3 Ceph servers (strictly for testing purposes) and a new production environment consisting of 20+ Ceph servers.
> > >
> > > We are using Ubuntu 22.04.
> > >
> > > -- Michael
> > > ________________________________
> > > From: Daniel Brown <daniel.h.brown@thermify.cloud>
> > > Sent: Wednesday, June 12, 2024 9:18 AM
> > > To: Anthony D'Atri <anthony.da...@gmail.com>
> > > Cc: Michael Worsham <mwors...@datadimensions.com>; ceph-users@ceph.io <ceph-users@ceph.io>
> > > Subject: Re: [ceph-users] Patching Ceph cluster
> > >
> > > There's also a maintenance mode that you can set for each server as you're doing updates, so that the cluster doesn't try to move data from the affected OSDs while the server being updated is offline or down. I've worked some on automating this with Ansible, but have found my process (and/or my cluster) still requires some manual intervention while it's running to get things done cleanly.
> > >
> > > > On Jun 12, 2024, at 8:49 AM, Anthony D'Atri <anthony.da...@gmail.com> wrote:
> > > >
> > > > Do you mean patching the OS?
> > > >
> > > > If so, easy -- one node at a time, then after it comes back up, wait until all PGs are active+clean and the mon quorum is complete before proceeding.
> > > >
> > > > > On Jun 12, 2024, at 07:56, Michael Worsham <mwors...@datadimensions.com> wrote:
> > > > >
> > > > > What is the proper way to patch a Ceph cluster and reboot the servers in said cluster if a reboot is necessary for said updates? And is it possible to automate it via Ansible?
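Anthony's advice to wait until all PGs are active+clean and the mon quorum is complete can also be expressed as a task. A sketch, with the caveat that the JSON field names come from recent 'ceph status --format json' output and are not part of this thread, so verify them against your release:

- name: Wait until all PGs are active+clean and the mon quorum is full
  ansible.builtin.command: cephadm shell -- ceph status --format json
  register: ceph_status
  changed_when: false
  become: true
  retries: 30        # keep polling for up to ~30 minutes
  delay: 60
  until:
    - (ceph_status.stdout | from_json).quorum_names | length == (ceph_status.stdout | from_json).monmap.num_mons
    - (ceph_status.stdout | from_json).pgmap.pgs_by_state | map(attribute='state_name') | reject('equalto', 'active+clean') | list | length == 0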
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io