Thanks David for the suggestion, let me try that :)

On Fri, Dec 8, 2017 at 9:28 PM, David Turner <drakonst...@gmail.com> wrote:
> Why are you rebooting the node? You should only need to restart the ceph
> services. You need all of your MONs to be running Luminous before any
> Luminous OSDs will be accepted by the cluster. So you should update the
> packages on each server, restart the MONs, then restart your OSDs. After
> you restart all of the MONs and have a Luminous quorum of MONs, then you
> can start restarting OSDs and/or servers.
>
> If you want to, you can start your MGR daemons before doing the OSDs as
> well, but that step isn't required to have the OSDs come back up. To get
> out of this situation, you should update the packages on your remaining
> MONs and restart the MON service to get all of your MONs running Luminous.
> After that, your 24 down OSDs should come back up.
>
> On Fri, Dec 8, 2017 at 10:51 AM nokia ceph <nokiacephus...@gmail.com> wrote:
>
>> Hello Team,
>>
>> I have a 5-node cluster running kraken 11.2.0 with an EC 4+1 pool.
>>
>> My plan is to upgrade all 5 nodes to 12.2.2 Luminous without any
>> downtime. On the first node I tried the procedure below.
>>
>> I commented out the following directive in ceph.conf:
>> enable experimental unrecoverable data corrupting features = bluestore rocksdb
>>
>> Then I started and enabled ceph-mgr, and then rebooted the node.
>>
>> ## ceph -s
>>     cluster b2f1b9b9-eecc-4c17-8b92-cfa60b31c121
>>      health HEALTH_WARN
>>             2048 pgs degraded
>>             2048 pgs stuck degraded
>>             2048 pgs stuck unclean
>>             2048 pgs stuck undersized
>>             2048 pgs undersized
>>             recovery 1091151/1592070 objects degraded (68.537%)
>>             24/120 in osds are down
>>      monmap e2: 5 mons at {PL8-CN1=10.50.11.41:6789/0,PL8-CN2=10.50.11.42:6789/0,PL8-CN3=10.50.11.43:6789/0,PL8-CN4=10.50.11.44:6789/0,PL8-CN5=10.50.11.45:6789/0}
>>             election epoch 18, quorum 0,1,2,3,4 PL8-CN1,PL8-CN2,PL8-CN3,PL8-CN4,PL8-CN5
>>         mgr active: PL8-CN1
>>      osdmap e243: 120 osds: 96 up, 120 in; 2048 remapped pgs
>>             flags sortbitwise,require_jewel_osds,require_kraken_osds
>>       pgmap v1099: 2048 pgs, 1 pools, 84304 MB data, 310 kobjects
>>             105 GB used, 436 TB / 436 TB avail
>>             1091151/1592070 objects degraded (68.537%)
>>                 2048 active+undersized+degraded
>>   client io 107 MB/s wr, 0 op/s rd, 860 op/s wr
>>
>> After the reboot, all 24 OSDs on the first node show as down, even
>> though the 24 ceph-osd processes are still running:
>>
>> # ps -ef | grep -c ceph-osd
>> 24
>>
>> If I run the same procedure on all 5 nodes in parallel and reboot them
>> together, the cluster comes back up without any issues, but doing it in
>> parallel means downtime, which our management does not accept at the
>> moment. Please help and share your views.
>>
>> I read the upgrade section of
>> https://ceph.com/releases/v12-2-0-luminous-released/ but it didn't help
>> me in this case.
>>
>> So my question is: what is the best method to upgrade each machine
>> without any downtime?
>>
>> Thanks
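For anyone who hits the same state, the per-node order David describes would
look roughly like the sketch below. This is only an outline, assuming
systemd-managed daemons with the stock ceph-mon/ceph-mgr/ceph-osd target
units; package names, unit names, and the package manager (yum/apt) depend
on your distribution.

  # 1. On each node in turn: upgrade the Ceph packages to 12.2.2, then
  #    restart only the MON daemon on that node.
  systemctl restart ceph-mon.target

  # 2. With all five MONs restarted, confirm a Luminous quorum before
  #    touching any OSD.
  ceph mon stat
  ceph versions        # all mons should report 12.2.2 once they run Luminous

  # 3. Optional: bring up the MGR daemons now; not required for the OSDs
  #    to rejoin.
  systemctl restart ceph-mgr.target

  # 4. Only after the MON quorum is on Luminous, restart the OSDs one node
  #    at a time, waiting for recovery before moving to the next node.
  systemctl restart ceph-osd.target
  ceph -s

  # 5. When every daemon is running 12.2.2 (see the upgrade section of the
  #    Luminous release notes linked above):
  ceph osd require-osd-release luminous

The key point is that the OSDs only come back up once all of the MONs are on
Luminous, which is why rebooting a single node ahead of the other MONs leaves
its 24 OSDs marked down even though the ceph-osd processes are running.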
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com