I do it in production

On Thu, Apr 26, 2018, 2:47 AM John Hearns <hear...@googlemail.com> wrote:

> Ronny, talking about reboots, has anyone had experience of live kernel
> patching with CEPH? I am asking out of simple curiosity.
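For what it's worth, on Ubuntu hosts the usual route is Canonical's Livepatch
service. A minimal sketch, assuming an Ubuntu LTS kernel and a Livepatch token
from ubuntu.com/livepatch (the token below is just a placeholder):

    # install and enable the livepatch client
    sudo snap install canonical-livepatch
    sudo canonical-livepatch enable <your-token>
    # check which kernel CVEs have been live-patched
    canonical-livepatch status --verbose

Live patching only covers selected kernel fixes; full kernel upgrades still
need the usual one-host-at-a-time reboot, typically with "ceph osd set noout"
beforehand and "ceph osd unset noout" afterwards so OSDs are not marked out
while the host is down.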
>
> On 25 April 2018 at 19:40, Ronny Aasen <ronny+ceph-us...@aasen.cx> wrote:
>
>> The difference in cost between 2 and 3 servers is not HUGE, but the
>> reliability difference between a size 2/1 pool and a 3/2 pool is massive.
>> A 2/1 pool is just a single fault during maintenance away from data loss,
>> while you need multiple simultaneous faults, and very bad luck, to break
>> a 3/2 pool.
>>
>> I would rather recommend 2/2 pools if you are willing to accept a little
>> downtime when a disk dies: cluster I/O would stop until the remaining
>> disks backfill to cover for the lost disk, but that is better than having
>> inconsistent PGs or data loss because a disk crashed during a routine
>> reboot, or two disks failed at once.
>>
>> Also worth reading is this link, a good explanation:
>> https://www.spinics.net/lists/ceph-users/msg32895.html
>>
>> If you have good backups and are willing to restore the whole pool, it is
>> of course your privilege to run 2/1 pools, but be mindful of the risks of
>> doing so.
>>
>> Kind regards
>> Ronny Aasen
>>
>> BTW: I did not know Ubuntu automagically rebooted after an upgrade. You
>> can probably avoid that reboot somehow in Ubuntu and do the restarts of
>> the services manually, if you wish to maintain service during the
>> upgrade.
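The 2/2 or 3/2 settings Ronny recommends above are per-pool values. A minimal
sketch of checking and changing them (the pool name "cephfs_data" is only an
example, use your own pool names):

    # show size/min_size for every pool
    ceph osd pool ls detail

    # example: raise a pool to size 3 / min_size 2 (needs a third OSD host)
    ceph osd pool set cephfs_data size 3
    ceph osd pool set cephfs_data min_size 2

Raising size triggers backfill of the extra replica, so expect recovery
traffic until the PGs are active+clean again.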
>>
>> On 25.04.2018 11:52, Ranjan Ghosh wrote:
>>
>>> Thanks a lot for your detailed answer. The problem for us, however, was
>>> that we use the Ceph packages that come with the Ubuntu distribution. If
>>> you do an Ubuntu upgrade, all packages are upgraded in one go and the
>>> server is rebooted. You cannot influence anything or start/stop services
>>> one by one etc. This was concerning me, because the upgrade instructions
>>> didn't mention anything about an alternative or what to do in this case.
>>> But someone here enlightened me that - in general - it all doesn't
>>> matter that much *if you are just accepting a downtime*. And, indeed, it
>>> all worked nicely. We stopped all services on all servers, upgraded the
>>> Ubuntu version, rebooted all servers and were ready to go again. We
>>> didn't encounter any problems there. The only problem turned out to be
>>> our own fault and simply a firewall misconfiguration.
>>>
>>> And, yes, we're running "size:2 min_size:1" because we're on a very
>>> tight budget. If I understand correctly, this means: make changes to
>>> files on one server, *eventually* copy them to the other server. I hope
>>> this *eventually* means after a few minutes. Up until now I've never
>>> experienced *any* problems with file integrity with this configuration.
>>> In fact, Ceph is incredibly stable. Amazing. I have never ever had any
>>> issues whatsoever with broken files, partially written files, files that
>>> contain garbage etc., even after starting/stopping services, rebooting
>>> etc. With GlusterFS and other cluster file systems I've experienced many
>>> such problems over the years, so this is what makes Ceph so great. I now
>>> have a lot of trust in Ceph, that it will eventually repair everything
>>> :-) And: if a file that was written a few seconds ago is really lost, it
>>> wouldn't be that bad for our use case. It's a web server. The most
>>> important stuff is in the DB. We have hourly backups of everything. In a
>>> huge emergency, we could even restore the backup from an hour ago if we
>>> really had to. Not nice, but if it happens every 6 years or so due to
>>> some freak hardware failure, I think it is manageable. I accept it's not
>>> the recommended/perfect solution if you have infinite amounts of money
>>> at your hands, but in our case, I think it's not extremely audacious
>>> either to do it like this, right?
>>>
>>> On 11.04.2018 19:25, Ronny Aasen wrote:
>>>
>>>> Ceph upgrades are usually not a problem, but Ceph has to be upgraded in
>>>> the right order. Normally, when each service is on its own machine,
>>>> this is not difficult, but when you have mon, mgr, osd, mds, and
>>>> clients on the same host you have to do it a bit carefully.
>>>>
>>>> I tend to have a terminal open with "watch ceph -s" running, and I
>>>> never touch the next service until the health is OK again.
>>>>
>>>> First apt upgrade the packages on all the hosts. This only updates the
>>>> software on disk, not the running services. Then do the restarts of the
>>>> services in the right order, and only on one host at a time; a
>>>> command-level sketch follows below.
>>>>
>>>> mons: first restart the mon service on all mon-running hosts. All 3
>>>> mons are active at the same time, so there is no "shifting around", but
>>>> make sure the quorum is OK again before you do the next mon.
>>>>
>>>> mgr: then restart mgr on all hosts that run mgr. There is only one
>>>> active mgr at a time, so here there will be a bit of shifting around,
>>>> but it is only for statistics/management, so it may affect your
>>>> "ceph -s" output, not the cluster operation.
>>>>
>>>> osd: restart osd processes one OSD at a time, and make sure you have
>>>> HEALTH_OK before doing the next osd process. Do this for all hosts that
>>>> have OSDs.
>>>>
>>>> mds: restart the MDS daemons one at a time. You will notice the standby
>>>> mds taking over for the mds that was restarted. Do both.
>>>>
>>>> clients: restart clients, meaning remount filesystems, migrate or
>>>> restart VMs, or restart whatever process uses the old Ceph libraries.
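A minimal sketch of that restart order on one host, assuming the stock
systemd units shipped with the distribution packages and that the mon/mgr/mds
ids match the short hostname (the OSD id 0 is only an example):

    # wait for HEALTH_OK between every step (e.g. in "watch ceph -s")
    sudo systemctl restart ceph-mon@$(hostname -s)
    ceph quorum_status | grep quorum_names   # quorum back before the next mon

    sudo systemctl restart ceph-mgr@$(hostname -s)

    sudo systemctl restart ceph-osd@0        # repeat per OSD id on this host
    ceph -s                                  # HEALTH_OK before the next OSD

    sudo systemctl restart ceph-mds@$(hostname -s)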
>>>>
>>>> About pools: since you only have 2 OSDs, you obviously cannot be
>>>> running the recommended 3-replica pools. This makes me worry that you
>>>> may be running size:2 min_size:1 pools, and are daily running the risk
>>>> of data loss due to corruption and inconsistencies, especially when you
>>>> restart OSDs.
>>>>
>>>> If your pools are size:2 min_size:2, then your cluster I/O will stall
>>>> whenever an osd is restarted, until the osd is up and healthy again,
>>>> but you have less chance of data loss than with 2/1 pools.
>>>>
>>>> If you added an osd on a third host, you could run size:3 min_size:2,
>>>> the recommended config, where you have both redundancy and high
>>>> availability.
>>>>
>>>> Kind regards
>>>> Ronny Aasen
>>>>
>>>> On 11.04.2018 17:42, Ranjan Ghosh wrote:
>>>>
>>>>> Ah, nevermind, we've solved it. It was a firewall issue. The only
>>>>> thing that's weird is that it became an issue immediately after an
>>>>> update. Perhaps it has something to do with monitor nodes shifting
>>>>> around or anything. Well, thanks again for your quick support, though.
>>>>> It's much appreciated.
>>>>>
>>>>> BR
>>>>>
>>>>> Ranjan
>>>>>
>>>>> On 11.04.2018 17:07, Ranjan Ghosh wrote:
>>>>>
>>>>>> Thank you for your answer. Do you have any specifics on which thread
>>>>>> you're talking about? I would be very interested to read about a
>>>>>> success story, because I fear that if I update the other node, the
>>>>>> whole cluster comes down.
>>>>>>
>>>>>> On 11.04.2018 10:47, Marc Roos wrote:
>>>>>>
>>>>>>> I think you have to update all osd's, mon's etc. I can remember
>>>>>>> running into a similar issue. You should be able to find more about
>>>>>>> this in the mailing list archive.
>>>>>>>
>>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: Ranjan Ghosh [mailto:gh...@pw6.de]
>>>>>>> Sent: Wednesday, 11 April 2018 16:02
>>>>>>> To: ceph-users
>>>>>>> Subject: [ceph-users] Cluster degraded after Ceph Upgrade 12.2.1 =>
>>>>>>> 12.2.2
>>>>>>>
>>>>>>> Hi all,
>>>>>>>
>>>>>>> We have a two-node cluster (with a third "monitoring-only" node).
>>>>>>> Over the last months, everything ran *perfectly* smoothly. Today, I
>>>>>>> did an Ubuntu "apt-get upgrade" on one of the two servers. Among
>>>>>>> others, the Ceph packages were upgraded from 12.2.1 to 12.2.2. A
>>>>>>> minor release update, one might think. But, to my surprise, after
>>>>>>> restarting the services, Ceph is now in a degraded state :-( (see
>>>>>>> below). Only the first node - which is still on 12.2.1 - seems to be
>>>>>>> running. I did a bit of research and found this:
>>>>>>>
>>>>>>> https://ceph.com/community/new-luminous-pg-overdose-protection/
>>>>>>>
>>>>>>> I did set "mon_max_pg_per_osd = 300" to no avail. I don't know if
>>>>>>> this is the problem at all.
>>>>>>>
>>>>>>> Looking at the status, it seems we have 264 PGs, right? When I enter
>>>>>>> "ceph osd df" (which I found on another website claiming it should
>>>>>>> print the number of PGs per OSD), it just hangs (I need to abort it
>>>>>>> with Ctrl+C).
>>>>>>>
>>>>>>> Hope anybody can help me. The cluster now works with the single
>>>>>>> node, but it is definitely quite worrying because we don't have
>>>>>>> redundancy.
>>>>>>>
>>>>>>> Thanks in advance,
>>>>>>>
>>>>>>> Ranjan
>>>>>>>
>>>>>>>
>>>>>>> root@tukan2 /var/www/projects # ceph -s
>>>>>>>   cluster:
>>>>>>>     id:     19895e72-4a0c-4d5d-ae23-7f631ec8c8e4
>>>>>>>     health: HEALTH_WARN
>>>>>>>             insufficient standby MDS daemons available
>>>>>>>             Reduced data availability: 264 pgs inactive
>>>>>>>             Degraded data redundancy: 264 pgs unclean
>>>>>>>
>>>>>>>   services:
>>>>>>>     mon: 3 daemons, quorum tukan1,tukan2,tukan0
>>>>>>>     mgr: tukan0(active), standbys: tukan2
>>>>>>>     mds: cephfs-1/1/1 up {0=tukan2=up:active}
>>>>>>>     osd: 2 osds: 2 up, 2 in
>>>>>>>
>>>>>>>   data:
>>>>>>>     pools:   3 pools, 264 pgs
>>>>>>>     objects: 0 objects, 0 bytes
>>>>>>>     usage:   0 kB used, 0 kB / 0 kB avail
>>>>>>>     pgs:     100.000% pgs unknown
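When a cluster sits half-upgraded like this, it can help to confirm which
daemons actually run which release and how many PGs each OSD carries. A
minimal sketch, assuming a Luminous (12.2.x) cluster:

    # per-daemon version summary (mon/mgr/osd/mds), available since Luminous
    ceph versions

    # per-OSD utilisation including a PGS column ("ceph osd df" also works)
    ceph osd df tree

If "ceph osd df" itself hangs, that often points at mon/mgr connectivity
problems (in this thread it turned out to be a firewall) rather than at the
PG count itself.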
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com