Happy to report I got everything up to Luminous. I used your tip to keep the OSDs running, David; thanks again for that.
I'd say this is a potential gotcha for people collocating MONs. It appears that if you're running selinux, even in permissive mode, upgrading the ceph-selinux packages forces a restart of all the OSDs. You're left with a load of OSDs down that you can't start, because you don't have a Luminous mon quorum yet.

On 15 Sep 2017 4:54 p.m., "David" <dclistsli...@gmail.com> wrote:

Hi David

I like your thinking! Thanks for the suggestion. I've got a maintenance window later to finish the update, so I'll give it a try.

On Thu, Sep 14, 2017 at 6:24 PM, David Turner <drakonst...@gmail.com> wrote:

> This isn't a great solution, but it's something you could try. If you stop
> all of the daemons via systemd and start them all in a screen as manually
> running daemons in the foreground, I don't think that yum updating the
> packages can stop or start them. You could copy and paste the running
> command (viewable in ps) to know exactly what to run in the screens to
> start the daemons.
>
> On Wed, Sep 13, 2017 at 6:53 PM David <dclistsli...@gmail.com> wrote:
>
>> Hi All
>>
>> I did a Jewel -> Luminous upgrade on my dev cluster and it went very
>> smoothly.
>>
>> I've attempted the upgrade on a small production cluster but I've hit a
>> snag.
>>
>> After installing the ceph 12.2.0 packages with "yum install ceph" on the
>> first node and accepting all the dependencies, I found that all the OSD
>> daemons, the MON and the MDS running on that node were terminated.
>> Systemd appears to have attempted to restart them all, but the daemons
>> didn't start successfully (not surprising, as the first stage of
>> upgrading all mons in the cluster had not been completed). I was able to
>> start the MON and it's running. The OSDs are all down and I'm reluctant
>> to attempt to start them without upgrading the other MONs in the
>> cluster. I'm also reluctant to attempt upgrading the remaining 2 MONs
>> without understanding what happened.
>>
>> The cluster is on Jewel 10.2.5 (as was the dev cluster).
>> Both clusters are running on CentOS 7.3.
>>
>> The only obvious difference I can see between dev and production is that
>> production has selinux running in permissive mode; on dev it was
>> disabled.
>>
>> Any advice on how to proceed at this point would be much appreciated.
>> The cluster is currently functional, but I have 1 node out of 4 with all
>> OSDs down. I had noout set before the upgrade and I've left it set for
>> now.
>>
>> Here's the journalctl from right after the packages were installed
>> (hostname changed):
>>
>> https://pastebin.com/fa6NMyjG
>>
>> _______________________________________________
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
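[Editor's note] David Turner's workaround above can be sketched roughly as follows. This is a hypothetical sketch only: the helper name `osd_id_from_cmd` and the session naming are mine, not from the thread, and the exact ceph-osd command line on your nodes is whatever ps actually shows.

```shell
# Capture each running ceph-osd's exact foreground command line from ps,
# stop the systemd unit, then relaunch the same command in a detached
# screen session, so a package update that restarts ceph units cannot
# touch the manually-run daemons.

# extract the --id value from a ceph-osd command line (helper name is mine)
osd_id_from_cmd() {
    printf '%s\n' "$1" | sed -n 's/.*--id \([0-9][0-9]*\).*/\1/p'
}

# for each running ceph-osd, move it out of systemd's control
ps -o args= -C ceph-osd | while read -r cmd; do
    id=$(osd_id_from_cmd "$cmd")
    [ -n "$id" ] || continue
    systemctl stop "ceph-osd@${id}"   # systemd stops managing this OSD
    # relaunch the identical command in the foreground of a screen;
    # word-splitting of $cmd back into arguments is intentional here
    screen -dmS "osd${id}" $cmd
done
```

After the mons on all nodes are upgraded, the screen sessions can be stopped and the OSDs handed back to systemd with the usual `systemctl start ceph-osd@<id>`.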