I've used both the kernel client and the ceph-fuse driver for mounting the cephfs volume. I didn't notice any network hiccups while failing over, but I was reading large files during my tests (and live), so some caching may have hidden network hiccups for my use case.
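For context, the two mount methods I was comparing look roughly like this (the monitor address, credentials, and mount point are placeholders, adjust them for your cluster):

```
# Kernel client: uses the in-kernel cephfs driver
mount -t ceph 192.168.0.1:6789:/ /mnt/cephfs \
    -o name=admin,secretfile=/etc/ceph/admin.secret

# FUSE client: uses the userspace ceph-fuse daemon
ceph-fuse -m 192.168.0.1:6789 /mnt/cephfs
```

Both need a live cluster and valid keys, so treat this as a sketch of the syntax rather than something to copy verbatim.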
Going back to the memory potentially being a problem: Ceph has a tendency to use 2-3x more memory while it's in a degraded state than when everything is health_ok. Always plan to over-provision your memory by a minimum of 2x. I've seen clusters stuck in an OOM-killer death spiral: the OOM killer kept killing OSDs for running out of memory, which caused more peering and backfilling, which caused more OSDs to be killed by the OOM killer.

On Wed, Jun 14, 2017 at 5:01 PM Daniel Carrasco <d.carra...@i2tic.com> wrote:

> That's strange, because on my test cluster (three nodes, two of them with
> OSDs and all of them with MON and MDS), I've configured the size to 2 and
> min_size to 1. I've restarted all nodes one by one, and the client only
> loses the connection for about 5 seconds until it connects to another MDS.
>
> Are you using the ceph-fuse client or the kernel client?
> I forgot to say that I'm using Debian 8.
>
> Anyway, maybe the problem was what I said before: the clients' connections
> to that node started to fail, but the node was not officially down. And it
> wasn't a client problem, because it happened on both clients and on my
> monitoring service at the same time.
>
> Right now I'm not at the office, so I can't post the config file. Tomorrow
> I'll send it. Anyway, it's the basic file generated by ceph-deploy, plus
> the client network and min_size settings. Just like my test config.
>
> Thanks!!, and greetings!!
>
> On Jun 14, 2017 10:38 PM, "David Turner" <drakonst...@gmail.com> wrote:
>
> I have 3 Ceph nodes, size 3, min_size 2, and I can restart them all one at
> a time to do Ceph and kernel upgrades. The VMs running out of Ceph, the
> clients accessing the MDS, etc. all keep working fine without any problem
> during these restarts. What is your full Ceph configuration? There must be
> something not quite right in there.
>
> On Wed, Jun 14, 2017 at 4:26 PM Daniel Carrasco <d.carra...@i2tic.com>
> wrote:
>
>> On Jun 14, 2017 10:08 PM, "David Turner" <drakonst...@gmail.com> wrote:
>>
>> Not just the min_size of your cephfs data pool, but also your
>> cephfs_metadata pool.
>>
>> Both were at 1. I don't know why, because I don't remember having changed
>> the min_size, and the cluster has had 3 OSDs from the beginning (I did
>> change it on another cluster for testing purposes, but I don't remember
>> doing it on this one). I've changed both to two, but only after the
>> failure.
>>
>> About the size, I use 50GB because it's for a single webpage and I don't
>> need more space.
>>
>> I'll try to increase the memory to 3GB.
>>
>> Greetings!!
>>
>> On Wed, Jun 14, 2017 at 4:07 PM David Turner <drakonst...@gmail.com>
>> wrote:
>>
>>> Ceph recommends 1GB of RAM for every 1TB of OSD space. Your 2GB nodes
>>> are definitely on the low end. 50GB OSDs... I don't know what that will
>>> require, but since you're running the MON and MDS on the same node, I'd
>>> still say that 2GB is low. The Ceph OSD daemon using 1GB of RAM is not
>>> surprising, even at that size.
>>>
>>> When you say you increased the size of the pools to 3, what did you do
>>> to the min_size? Is that still set to 2?
>>>
>>> On Wed, Jun 14, 2017 at 3:17 PM Daniel Carrasco <d.carra...@i2tic.com>
>>> wrote:
>>>
>>>> Finally I've created three nodes, I've increased the size of the pools
>>>> to 3, and I've created 3 MDS daemons (active, standby, standby).
>>>>
>>>> Today the server decided to fail, and I've noticed that failover is not
>>>> working... The ceph -s command showed everything as OK, but the clients
>>>> weren't able to connect, and I had to restart the failing node and
>>>> reconnect the clients manually to make it work again (even though I
>>>> think the active MDS was on another node).
>>>>
>>>> I don't know if maybe it's because the server was not fully down and
>>>> only some connections were failing. I'll do some tests to see.
>>>>
>>>> Another question: how much memory does a node need? I have nodes with
>>>> 2GB of RAM (one MDS, one MON and one OSD each), and they show high
>>>> memory usage (more than 1GB on the OSD).
>>>> The OSD size is 50GB and the data it contains is less than 3GB.
>>>>
>>>> Thanks, and greetings!!
>>>>
>>>> 2017-06-12 23:33 GMT+02:00 Mazzystr <mazzy...@gmail.com>:
>>>>
>>>>> Since your app is an Apache/PHP app, is it possible for you to
>>>>> reconfigure it to use an S3 module rather than a POSIX file open()?
>>>>> Then, with Ceph, drop CephFS and configure the Civetweb S3 gateway?
>>>>> You can have "active-active" endpoints with round-robin DNS or an F5
>>>>> or something. You would also have to repopulate the objects into the
>>>>> rados pools.
>>>>>
>>>>> Also, increase that size parameter to 3. ;-)
>>>>>
>>>>> Lots of work for active-active, but the whole stack will be much more
>>>>> resilient, coming from someone with a ClearCase / NFS / stale file
>>>>> handles up the wazoo background.
>>>>>
>>>>> On Mon, Jun 12, 2017 at 10:41 AM, Daniel Carrasco
>>>>> <d.carra...@i2tic.com> wrote:
>>>>>
>>>>>> 2017-06-12 16:10 GMT+02:00 David Turner <drakonst...@gmail.com>:
>>>>>>
>>>>>>> I have an incredibly lightweight cephfs configuration. I set up an
>>>>>>> MDS on each mon (3 total), and have 9TB of data in cephfs. This data
>>>>>>> only has 1 client, which reads a few files at a time. I haven't
>>>>>>> noticed any downtime when it fails over to a standby MDS. So it
>>>>>>> definitely depends on your workload as to how a failover will affect
>>>>>>> your environment.
>>>>>>>
>>>>>>> On Mon, Jun 12, 2017 at 9:59 AM John Petrini <jpetr...@coredial.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> We use the following in our ceph.conf for MDS failover. We're
>>>>>>>> running one active and one standby. Last time it failed over there
>>>>>>>> was about 2 minutes of downtime before the mounts started
>>>>>>>> responding again, but it did recover gracefully.
>>>>>>>>
>>>>>>>> [mds]
>>>>>>>> max_mds = 1
>>>>>>>> mds_standby_for_rank = 0
>>>>>>>> mds_standby_replay = true
>>>>>>>>
>>>>>>>> ___
>>>>>>>>
>>>>>>>> John Petrini
>>>>>>>>
>>>>>>
>>>>>> Thanks to both.
>>>>>> Right now I'm working on that, because I need a very fast failover.
>>>>>> For now the tests give me a very fast response when an OSD fails
>>>>>> (about 5 seconds), but a very slow response when the main MDS fails
>>>>>> (I haven't measured the real time, but it was down for a long while).
>>>>>> Maybe that was because I created the other MDS after mounting; I've
>>>>>> done some tests just before sending this email and now it looks very
>>>>>> fast (I didn't notice the downtime).
>>>>>>
>>>>>> Greetings!!
>>>>>>
>>>>>> --
>>>>>> _________________________________________
>>>>>>
>>>>>> Daniel Carrasco Marín
>>>>>> Ingeniería para la Innovación i2TIC, S.L.
>>>>>> Tlf: +34 911 12 32 84 Ext: 223
>>>>>> www.i2tic.com
>>>>>> _________________________________________
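To make the earlier advice concrete, the size/min_size changes discussed in this thread can be applied with commands along these lines (cephfs_data and cephfs_metadata are the usual default pool names; substitute your own, and note this needs a live cluster):

```
# Keep 3 replicas of both cephfs pools, and require at least
# 2 copies available before serving I/O
ceph osd pool set cephfs_data size 3
ceph osd pool set cephfs_data min_size 2
ceph osd pool set cephfs_metadata size 3
ceph osd pool set cephfs_metadata min_size 2

# Confirm the settings and watch recovery progress
ceph osd pool ls detail
ceph -s
```

Raising size triggers backfilling while the extra replicas are created, which is exactly when the degraded-state memory overhead mentioned above kicks in, so do it with memory headroom to spare.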
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com