I do it in production
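
For what it's worth: on Ubuntu one way to do this is the Canonical
Livepatch client (just one of the live patching options; kpatch and
ksplice exist as well), and you can see what has been applied with:

    canonical-livepatch status

Only the kernel is patched, so the Ceph daemons themselves keep running.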

On Thu, Apr 26, 2018, 2:47 AM John Hearns <hear...@googlemail.com> wrote:

> Ronny, talking about reboots, has anyone had experience of live kernel
> patching with CEPH?  I am asking out of simple curiosity.
>
>
> On 25 April 2018 at 19:40, Ronny Aasen <ronny+ceph-us...@aasen.cx> wrote:
>
>> The difference in cost between 2 and 3 servers is not huge, but the
>> reliability difference between a size 2/1 pool and a 3/2 pool is massive.
>> A 2/1 pool is just a single fault during maintenance away from data loss,
>> whereas it takes multiple simultaneous faults, and very bad luck, to break
>> a 3/2 pool.
>>
>> If you are willing to accept a little downtime when a disk dies, I would
>> rather recommend 2/2 pools. Cluster I/O would stop until the remaining
>> disks backfill to cover for the lost one, but that is better than ending
>> up with inconsistent PGs or data loss because a disk crashed during a
>> routine reboot, or because two disks failed at once.
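>>
>> To check what your pools are currently set to, something along these
>> lines should do it (the pool name is just a placeholder):
>>
>>     ceph osd pool ls detail
>>     ceph osd pool get <poolname> size
>>     ceph osd pool get <poolname> min_size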
>>
>> This link is also worth reading; it has a good explanation:
>> https://www.spinics.net/lists/ceph-users/msg32895.html
>>
>> If you have good backups and are willing to restore the whole pool, it is
>> of course your privilege to run 2/1 pools, but be mindful of the risks of
>> doing so.
>>
>>
>> kind regards
>> Ronny Aasen
>>
>> BTW: I did not know Ubuntu automagically rebooted after an upgrade. You
>> can probably avoid that reboot somehow in Ubuntu and do the restarts of
>> the services manually, if you wish to maintain service during the upgrade.
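>>
>> If that reboot comes from unattended-upgrades, it is usually this setting
>> in /etc/apt/apt.conf.d/50unattended-upgrades (assuming that is where it
>> is configured on your system):
>>
>>     Unattended-Upgrade::Automatic-Reboot "false";
>>
>> and the Ceph daemons can then be restarted by hand with systemd, e.g.
>> "systemctl restart ceph-mon@<id>" and so on, as described further down.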
>>
>>
>>
>>
>>
>> On 25.04.2018 11:52, Ranjan Ghosh wrote:
>>
>>> Thanks a lot for your detailed answer. The problem for us, however, was
>>> that we use the Ceph packages that come with the Ubuntu distribution. If
>>> you do an Ubuntu upgrade, all packages are upgraded in one go and the
>>> server is rebooted; you cannot influence anything or start/stop services
>>> one by one. This was concerning me, because the upgrade instructions
>>> didn't mention anything about an alternative or what to do in this case.
>>> But someone here enlightened me that - in general - it all doesn't matter
>>> that much *if you are just accepting a downtime*. And, indeed, it all
>>> worked nicely. We stopped all services on all servers, upgraded the Ubuntu
>>> version, rebooted all servers and were ready to go again. We didn't
>>> encounter any problems there. The only problem turned out to be our own
>>> fault: simply a firewall misconfiguration.
>>>
>>> And, yes, we're running a "size:2 min_size:1" configuration because we're
>>> on a very tight budget. If I understand correctly, this means: make
>>> changes to files on one server, and *eventually* copy them to the other
>>> server. I hope this *eventually* means after a few minutes. Up until now
>>> I've never experienced *any* problems with file integrity with this
>>> configuration. In fact, Ceph is incredibly stable. Amazing. I have never
>>> ever had any issues whatsoever with broken files, partially written
>>> files, files that contain garbage etc., even after starting/stopping
>>> services, rebooting and so on. With GlusterFS and other cluster file
>>> systems I've experienced many such problems over the years, so this is
>>> what makes Ceph so great. I now have a lot of trust in Ceph, that it will
>>> eventually repair everything :-) And if a file that was written a few
>>> seconds ago is really lost, it wouldn't be that bad for our use case.
>>> It's a web server; the most important stuff is in the DB, and we have
>>> hourly backups of everything. In a huge emergency, we could even restore
>>> the backup from an hour ago if we really had to. Not nice, but if it
>>> happens every six years or so due to some freak hardware failure, I think
>>> it is manageable. I accept it's not the recommended/perfect solution if
>>> you have infinite amounts of money at hand, but in our case, I think it's
>>> not extremely audacious either to do it like this, right?
>>>
>>>
>>> On 11.04.2018 at 19:25, Ronny Aasen wrote:
>>>
>>>> Ceph upgrades are usually not a problem:
>>>> Ceph has to be upgraded in the right order. Normally, when each service
>>>> is on its own machine, this is not difficult.
>>>> But when you have mon, mgr, osd, mds, and clients on the same host you
>>>> have to do it a bit carefully.
>>>>
>>>> I tend to have a terminal open with "watch ceph -s" running, and I
>>>> never touch the next service until the health is OK again.
>>>>
>>>> First, apt upgrade the packages on all the hosts. This only updates the
>>>> software on disk, not the running services.
>>>> Then restart the services in the right order, and only on one host at a
>>>> time.
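>>>>
>>>> For the package step, something like this on each host should do
>>>> (assuming the distro ceph packages; adjust the package names to your
>>>> setup):
>>>>
>>>>     apt-get update
>>>>     apt-get install --only-upgrade ceph ceph-mon ceph-mgr ceph-osd ceph-mds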
>>>>
>>>> mons: first restart the mon service on all hosts running a mon.
>>>> All 3 mons are active at the same time, so there is no "shifting
>>>> around", but make sure the quorum is OK again before you do the next mon.
>>>>
>>>> mgr: then restart the mgr on all hosts that run a mgr. There is only one
>>>> active mgr at a time, so here there will be a bit of shifting around,
>>>> but it is only for statistics/management, so it may affect your ceph -s
>>>> output, not the cluster operation.
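>>>>
>>>> Again a sketch, assuming the mgr id also matches the short hostname:
>>>>
>>>>     systemctl restart ceph-mgr@$(hostname -s)
>>>>     ceph -s | grep mgr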
>>>>
>>>> osd: restart the osd processes one OSD at a time, and make sure the
>>>> cluster is HEALTH_OK before doing the next one. Do this for all hosts
>>>> that have OSDs.
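>>>>
>>>> A sketch for a single OSD (id 0 here as an example; many people also
>>>> set the noout flag for the duration of the maintenance so the restarts
>>>> do not trigger rebalancing):
>>>>
>>>>     ceph osd set noout        # once, before starting
>>>>     systemctl restart ceph-osd@0
>>>>     ceph health               # wait for HEALTH_OK, then do the next OSD
>>>>     ceph osd unset noout      # once, after all OSDs are done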
>>>>
>>>> mds: restart the MDSs one at a time. You will notice the standby MDS
>>>> taking over for the MDS that was restarted. Do both.
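>>>>
>>>> Again assuming the mds id matches the short hostname:
>>>>
>>>>     systemctl restart ceph-mds@$(hostname -s)
>>>>     ceph mds stat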
>>>>
>>>> clients: restart the clients; that means remounting filesystems,
>>>> migrating or restarting VMs, or restarting whatever process still uses
>>>> the old Ceph libraries.
>>>>
>>>>
>>>> About pools:
>>>> Since you only have 2 OSDs, you obviously cannot be running the
>>>> recommended 3-replica pools. This makes me worry that you may be running
>>>> size:2 min_size:1 pools, and are running a daily risk of data loss due
>>>> to corruption and inconsistencies, especially when you restart OSDs.
>>>>
>>>> If your pools are size:2 min_size:2, then your cluster will block I/O
>>>> whenever any OSD is restarted, until that OSD is up and healthy again,
>>>> but you have less chance of data loss than with 2/1 pools.
>>>>
>>>> If you added an OSD on a third host, you could run size:3 min_size:2,
>>>> the recommended config, where you get both redundancy and high
>>>> availability.
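>>>>
>>>> Once that third OSD is in, switching an existing pool over would just
>>>> be (pool name is a placeholder; expect some backfill while the third
>>>> copies are created):
>>>>
>>>>     ceph osd pool set <poolname> size 3
>>>>     ceph osd pool set <poolname> min_size 2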
>>>>
>>>>
>>>> kind regards
>>>> Ronny Aasen
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On 11.04.2018 17:42, Ranjan Ghosh wrote:
>>>>
>>>>> Ah, never mind, we've solved it. It was a firewall issue. The only
>>>>> thing that's weird is that it became an issue immediately after an
>>>>> update. Perhaps it has something to do with monitor nodes shifting
>>>>> around. Well, thanks again for your quick support; it's much
>>>>> appreciated.
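>>>>>
>>>>> In case anyone hits the same thing: the Ceph daemons need their ports
>>>>> open between the cluster nodes, roughly 6789/tcp for the mons (plus
>>>>> 3300/tcp for msgr2 on newer releases) and 6800-7300/tcp for the OSDs,
>>>>> MGRs and MDSs. With ufw, something like:
>>>>>
>>>>>     ufw allow from <cluster-net> to any port 6789 proto tcp
>>>>>     ufw allow from <cluster-net> to any port 6800:7300 proto tcp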
>>>>>
>>>>> BR
>>>>>
>>>>> Ranjan
>>>>>
>>>>>
>>>>> On 11.04.2018 at 17:07, Ranjan Ghosh wrote:
>>>>>
>>>>>> Thank you for your answer. Do you have any specifics on which thread
>>>>>> you're talking about? I would be very interested to read about a
>>>>>> success story, because I fear that if I update the other node, the
>>>>>> whole cluster will come down.
>>>>>>
>>>>>>
>>>>>> On 11.04.2018 at 10:47, Marc Roos wrote:
>>>>>>
>>>>>>> I think you have to update all OSDs, mons etc. I can remember
>>>>>>> running into a similar issue. You should be able to find more about
>>>>>>> this in the mailing list archive.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: Ranjan Ghosh [mailto:gh...@pw6.de]
>>>>>>> Sent: Wednesday, 11 April 2018 16:02
>>>>>>> To: ceph-users
>>>>>>> Subject: [ceph-users] Cluster degraded after Ceph Upgrade 12.2.1 =>
>>>>>>> 12.2.2
>>>>>>>
>>>>>>> Hi all,
>>>>>>>
>>>>>>> We have a two-node cluster (with a third "monitoring-only" node).
>>>>>>> Over the last months, everything ran *perfectly* smoothly. Today, I
>>>>>>> did an Ubuntu "apt-get upgrade" on one of the two servers. Among
>>>>>>> others, the ceph packages were upgraded from 12.2.1 to 12.2.2. A
>>>>>>> minor release update, one might think. But, to my surprise, after
>>>>>>> restarting the services, Ceph is now in a degraded state :-( (see
>>>>>>> below). Only the first node - which is still on 12.2.1 - seems to be
>>>>>>> running. I did a bit of research and found this:
>>>>>>>
>>>>>>> https://ceph.com/community/new-luminous-pg-overdose-protection/
>>>>>>>
>>>>>>> I set "mon_max_pg_per_osd = 300", to no avail. I don't know if this
>>>>>>> is the problem at all.
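>>>>>>>
>>>>>>> In case it matters: this option normally goes into the [global]
>>>>>>> section of ceph.conf, followed by a mon restart; on a running
>>>>>>> cluster it can presumably also be injected with something along the
>>>>>>> lines of:
>>>>>>>
>>>>>>>     ceph tell mon.* injectargs '--mon_max_pg_per_osd=300'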
>>>>>>>
>>>>>>> Looking at the status, it seems we have 264 PGs, right? When I enter
>>>>>>> "ceph osd df" (which I found on another website, claiming it should
>>>>>>> print the number of PGs per OSD), it just hangs (I need to abort it
>>>>>>> with Ctrl+C).
>>>>>>>
>>>>>>> I hope somebody can help me. The cluster now works with the single
>>>>>>> node, but it is definitely quite worrying because we don't have any
>>>>>>> redundancy.
>>>>>>>
>>>>>>> Thanks in advance,
>>>>>>>
>>>>>>> Ranjan
>>>>>>>
>>>>>>>
>>>>>>> root@tukan2 /var/www/projects # ceph -s
>>>>>>>     cluster:
>>>>>>>       id:     19895e72-4a0c-4d5d-ae23-7f631ec8c8e4
>>>>>>>       health: HEALTH_WARN
>>>>>>>               insufficient standby MDS daemons available
>>>>>>>               Reduced data availability: 264 pgs inactive
>>>>>>>               Degraded data redundancy: 264 pgs unclean
>>>>>>>
>>>>>>>     services:
>>>>>>>       mon: 3 daemons, quorum tukan1,tukan2,tukan0
>>>>>>>       mgr: tukan0(active), standbys: tukan2
>>>>>>>       mds: cephfs-1/1/1 up  {0=tukan2=up:active}
>>>>>>>       osd: 2 osds: 2 up, 2 in
>>>>>>>
>>>>>>>     data:
>>>>>>>       pools:   3 pools, 264 pgs
>>>>>>>       objects: 0 objects, 0 bytes
>>>>>>>       usage:   0 kB used, 0 kB / 0 kB avail
>>>>>>>       pgs:     100.000% pgs unknown
>>>>>>>
>>>>>>
>>
>
>
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com