I needed to do some cleaning before I could share this :) 
Maybe you or someone else can use it. 

Kind regards, 
Sake 

> On 14-06-2024 03:53 CEST, Michael Worsham 
> <mwors...@datadimensions.com> wrote:
> 
>  
> I'd love to see what your playbook(s) looks like for doing this.
> 
> -- Michael
> ________________________________
> From: Sake Ceph <c...@paulusma.eu>
> Sent: Thursday, June 13, 2024 4:05 PM
> To: ceph-users@ceph.io <ceph-users@ceph.io>
> Subject: [ceph-users] Re: Patching Ceph cluster
> 
> 
> 
> Yeah, we fully automated this with Ansible. In short we do the following (a minimal sketch of the first two steps follows the list):
> 
> 1. Check that the cluster is healthy before continuing (via the REST API); only HEALTH_OK is acceptable.
> 2. Disable scrub and deep-scrub.
> 3. Update all applications on all hosts in the cluster.
> 4. For every host, one by one, do the following:
> 4a. Check whether applications were updated.
> 4b. Check via the reboot hint whether a reboot is necessary.
> 4c. If applications were updated or a reboot is necessary, do the following:
> 4c1. Put the host in maintenance mode.
> 4c2. Reboot the host if necessary.
> 4c3. Check and wait via 'ceph orch host ls' until the status of the host is "maintenance" and nothing else.
> 4c4. Take the host out of maintenance mode.
> 4d. Check that the cluster is healthy before continuing (via the REST API); only warnings about scrub and deep-scrub are allowed, and no PGs should be degraded.
> 5. Enable scrub and deep-scrub when all hosts are done.
> 6. Check that the cluster is healthy (via the REST API); only HEALTH_OK is acceptable.
> 7. Done.
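> 
> To give an idea, below is a minimal sketch of steps 1 and 2. This isn't our actual playbook: it checks health via the CLI instead of the REST API for brevity, and the "ceph_admin" inventory group is just a placeholder.
> 
> # Step 1: stop right away unless the cluster reports HEALTH_OK.
> - name: Check that the cluster is healthy
>   ansible.builtin.command: ceph health --format json
>   register: ceph_health
>   changed_when: false
>   failed_when: (ceph_health.stdout | from_json).status != 'HEALTH_OK'
>   run_once: true
>   delegate_to: "{{ groups['ceph_admin'][0] }}"
>   become: true
> 
> # Step 2: pause (deep-)scrubbing for the duration of the patch run.
> - name: Disable scrub and deep-scrub
>   ansible.builtin.command: "ceph osd set {{ item }}"
>   loop:
>     - noscrub
>     - nodeep-scrub
>   run_once: true
>   delegate_to: "{{ groups['ceph_admin'][0] }}"
>   become: true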
> 
> For upgrading the OS we have something similar, but exiting maintenance mode is broken (with 17.2.7) :(
> I need to check the tracker for similar issues, and if I can't find anything, I will create a ticket.
> 
> Kind regards,
> Sake
> 
> > On 12-06-2024 19:02 CEST, Daniel Brown <daniel.h.brown@thermify.cloud> wrote:
> >
> >
> > I have two Ansible roles, one for enter and one for exit. There are likely better ways to do this, and I'll not be surprised if someone here lets me know. They use orch commands via the cephadm shell. I'm using Ansible for other configuration management in my environment as well, including setting up clients of the Ceph cluster.
> >
> >
> > Below are excerpts from main.yml in the "tasks" of the enter/exit roles. The host I'm running Ansible from is one of my Ceph servers; I've limited which processes run there, though, so it's in the cluster but not equal to the others.
> >
> >
> > —————
> > Enter
> > —————
> >
> > - name: Ceph Maintenance Mode Enter
> >   shell:
> >     cmd: 'cephadm shell ceph orch host maintenance enter {{ (ansible_ssh_host|default(ansible_host))|default(inventory_hostname) }} --force --yes-i-really-mean-it'
> >   become: True
> >
> >
> >
> > —————
> > Exit
> > —————
> >
> >
> > - name: Ceph Maintenance Mode Exit
> >   shell:
> >     cmd: 'cephadm shell ceph orch host maintenance exit {{ (ansible_ssh_host|default(ansible_host))|default(inventory_hostname) }}'
> >   become: True
> >   connection: local
> >
> >
> > - name: Wait for Ceph to be available
> >   ansible.builtin.wait_for:
> >     delay: 60
> >     host: '{{ (ansible_ssh_host|default(ansible_host))|default(inventory_hostname) }}'
> >     port: 9100
> >   connection: local
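> >
> > For completeness, a rough sketch of how the enter/exit roles could be sequenced in a play, one host at a time. The role names, the "ceph_nodes" group, and the Ubuntu reboot-required check are placeholders rather than my actual setup:
> >
> > - hosts: ceph_nodes
> >   serial: 1                       # patch one host at a time
> >   become: True
> >   tasks:
> >     - ansible.builtin.import_role:
> >         name: ceph_maint_enter    # the "Enter" role above
> >
> >     - name: Apply OS updates
> >       ansible.builtin.apt:
> >         upgrade: dist
> >         update_cache: true
> >
> >     - name: Check whether a reboot is required
> >       ansible.builtin.stat:
> >         path: /var/run/reboot-required
> >       register: reboot_required
> >
> >     - name: Reboot if required
> >       ansible.builtin.reboot:
> >       when: reboot_required.stat.exists
> >
> >     - ansible.builtin.import_role:
> >         name: ceph_maint_exit     # the "Exit" role above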
> >
> >
> >
> >
> >
> >
> > > On Jun 12, 2024, at 11:28 AM, Michael Worsham 
> > > <mwors...@datadimensions.com> wrote:
> > >
> > > Interesting. How do you set this "maintenance mode"? If you have a series of documented steps that you could provide as an example, that would be beneficial for my efforts.
> > >
> > > We are in the process of standing up both a dev-test environment 
> > > consisting of 3 Ceph servers (strictly for testing purposes) and a new 
> > > production environment consisting of 20+ Ceph servers.
> > >
> > > We are using Ubuntu 22.04.
> > >
> > > -- Michael
> > > From: Daniel Brown <daniel.h.brown@thermify.cloud>
> > > Sent: Wednesday, June 12, 2024 9:18 AM
> > > To: Anthony D'Atri <anthony.da...@gmail.com>
> > > Cc: Michael Worsham <mwors...@datadimensions.com>; ceph-users@ceph.io 
> > > <ceph-users@ceph.io>
> > > Subject: Re: [ceph-users] Patching Ceph cluster
> > >
> > >
> > > There's also a maintenance mode that you can set for each server as you're doing updates, so that the cluster doesn't try to move data from the affected OSDs while the server being updated is offline or down. I've worked some on automating this with Ansible, but have found my process (and/or my cluster) still requires some manual intervention while it's running to get things done cleanly.
> > >
> > >
> > >
> > > > On Jun 12, 2024, at 8:49 AM, Anthony D'Atri <anthony.da...@gmail.com> 
> > > > wrote:
> > > >
> > > > Do you mean patching the OS?
> > > >
> > > > If so, easy -- one node at a time, then after it comes back up, wait 
> > > > until all PGs are active+clean and the mon quorum is complete before 
> > > > proceeding.
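> > > >
> > > > If you script it, here's an untested sketch of that gate in Ansible, assuming an admin host that can run ceph commands (the "ceph_admin" group is a placeholder):
> > > >
> > > > - name: Wait until all PGs are active+clean and no mon is down
> > > >   ansible.builtin.command: ceph status --format json
> > > >   register: ceph_status
> > > >   changed_when: false
> > > >   become: true
> > > >   delegate_to: "{{ groups['ceph_admin'][0] }}"
> > > >   retries: 60
> > > >   delay: 30
> > > >   until: >-
> > > >     (ceph_status.stdout | from_json).pgmap.pgs_by_state
> > > >     | map(attribute='state_name') | list == ['active+clean']
> > > >     and 'MON_DOWN' not in (ceph_status.stdout | from_json).health.checks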
> > > >
> > > >
> > > >
> > > >> On Jun 12, 2024, at 07:56, Michael Worsham 
> > > >> <mwors...@datadimensions.com> wrote:
> > > >>
> > > >> What is the proper way to patch a Ceph cluster and reboot the servers in said cluster if a reboot is necessary for said updates? And is it possible to automate it via Ansible?
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
