It appears the offline VMs may or may not have been related to the backend storage issues, but everything seems to be working now as far as I can tell. Several VMs rebooted around the same time as the Ceph storage issue; however, the hypervisors were all OK. Nothing in the logs explained why those VMs rebooted.
An SSD used for caching metadata failed on one of the Ceph nodes, which caused issues for some of the other OSDs on the same host. The replacement SSD was supposed to be delivered on Friday, but it was delayed and won't arrive until tomorrow. Unfortunately, the clock ran out on the SSD and it failed overnight. I'm going in today to replace the failing SSD with a temporary drive until we can install the proper replacement tomorrow.

If any of you are still having issues with your VMs, please don't hesitate to email powerdev-requ...@osuosl.org to open a ticket.

Thanks for your patience!

On Sun, Dec 19, 2021 at 10:21 AM Lance Albertson <la...@osuosl.org> wrote:

> All,
>
> It appears we had an issue with our backend storage (Ceph) overnight that
> has caused some of the VMs to be offline. I'm currently working on
> resolving the issue right now and will send an update once I have
> everything back online.
>
> I'll reply to any of your related tickets directly once I have an update
> as well.
>
> Thanks for your patience!
>
> --
> Lance Albertson
> Director
> Oregon State University | Open Source Lab
>

--
Lance Albertson
Director
Oregon State University | Open Source Lab
_______________________________________________
openpower mailing list
openpo...@osuosl.org
https://lists.osuosl.org/mailman/listinfo/openpower