I forgot to say that after upgrading the machine's RAM to 4 GB, the OSD daemons have started to use only about 5% (around 200 MB). It's like magic, and now I have about 3.2 GB of free RAM.
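In case anyone wants to check the same thing on their own nodes, the resident memory of the OSD daemons can be listed with something like the following (a rough sketch: it assumes the processes are named ceph-osd, which is the default; RSS is reported in kilobytes and the exact columns may differ between distros):

    ps -C ceph-osd -o pid,rss,cmd
    # or watch it live:
    top -p "$(pgrep -d, ceph-osd)"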
Greetings!!

2017-06-15 15:08 GMT+02:00 Daniel Carrasco <d.carra...@i2tic.com>:

> Finally, the problem was W3 Total Cache, which seems to be unable to manage HA: when the master Redis host is down, it stops working without trying the slave.
>
> I've added some options to make it faster to detect a down OSD, and the page is online again in about 40 s.
>
> [global]
> fsid = Hidden
> mon_initial_members = alantra_fs-01, alantra_fs-02, alantra_fs-03
> mon_host = 10.20.1.109,10.20.1.97,10.20.1.216
> auth_cluster_required = cephx
> auth_service_required = cephx
> auth_client_required = cephx
> osd mon heartbeat interval = 5
> osd mon report interval max = 10
> mon osd report timeout = 15
> osd fast fail on connection refused = True
>
> public network = 10.20.1.0/24
> osd pool default size = 2
>
> Greetings and thanks for all your help.
>
> 2017-06-14 23:09 GMT+02:00 David Turner <drakonst...@gmail.com>:
>
>> I've used the kernel client and the ceph-fuse driver for mapping the cephfs volume. I didn't notice any network hiccups while failing over, but I was reading large files during my tests (and live), and some caching may have hidden network hiccups for my use case.
>>
>> Going back to the memory potentially being a problem: Ceph has a tendency to use 2-3x more memory while it's in a degraded state as opposed to when everything is health_ok. Always plan to over-provision your memory to account for a minimum of 2x. I've seen clusters stuck in an OOM-killer death spiral because it kept killing OSDs for running out of memory, which caused more peering and backfilling, which caused more OSDs to be killed by the OOM killer.
>>
>> On Wed, Jun 14, 2017 at 5:01 PM Daniel Carrasco <d.carra...@i2tic.com> wrote:
>>
>>> It's strange, because on my test cluster (three nodes, two of them with OSDs and all of them with MON and MDS) I've configured the size to 2 and min_size to 1, I've restarted all the nodes one by one, and the client only loses the connection for about 5 seconds until it connects to another MDS.
>>>
>>> Are you using the ceph client or the kernel client?
>>> I forgot to say that I'm using Debian 8.
>>>
>>> Anyway, maybe the problem was what I said before: the clients' connections to that node started to fail, but the node was not officially down. And it wasn't a client problem, because it happened on both clients and on my monitoring service at the same time.
>>>
>>> Right now I'm not in the office, so I can't post the config file; I'll send it tomorrow.
>>> Anyway, it's the basic file generated by ceph-deploy with the client network and min_size configurations, just like my test config.
>>>
>>> Thanks!!, and greetings!!
>>>
>>> On Jun 14, 2017 at 10:38 PM, "David Turner" <drakonst...@gmail.com> wrote:
>>>
>>> I have 3 Ceph nodes, size 3, min_size 2, and I can restart them all one at a time to do Ceph and kernel upgrades. The VMs running out of Ceph, the clients accessing the MDS, etc. all keep working fine without any problem during these restarts. What is your full Ceph configuration? There must be something not quite right in there.
>>>
>>> On Wed, Jun 14, 2017 at 4:26 PM Daniel Carrasco <d.carra...@i2tic.com> wrote:
>>>>
>>>> On Jun 14, 2017 at 10:08 PM, "David Turner" <drakonst...@gmail.com> wrote:
>>>>
>>>> Not just the min_size of your cephfs data pool, but also your cephfs_metadata pool.
>>>>
>>>> Both were at 1.
>>>> I don't know why, because I don't remember having changed the min_size, and the cluster has had 3 OSDs from the beginning (I did change it on another cluster for testing purposes, but I don't remember changing it on this one). I've changed both to two, but after the failure.
>>>>
>>>> About the size, I use 50 GB because it's for a single webpage and I don't need more space.
>>>>
>>>> I'll try to increase the memory to 3 GB.
>>>>
>>>> Greetings!!
>>>>
>>>> On Wed, Jun 14, 2017 at 4:07 PM David Turner <drakonst...@gmail.com> wrote:
>>>>
>>>>> Ceph recommends 1 GB of RAM for every 1 TB of OSD space. Your 2 GB nodes are definitely on the low end. 50 GB OSDs... I don't know what that will require, but since you're running the mon and mds on the same node, I'd still say that 2 GB is low. The Ceph OSD daemon using 1 GB of RAM is not surprising, even at that size.
>>>>>
>>>>> When you say you increased the size of the pools to 3, what did you do to the min_size? Is that still set to 2?
>>>>>
>>>>> On Wed, Jun 14, 2017 at 3:17 PM Daniel Carrasco <d.carra...@i2tic.com> wrote:
>>>>>
>>>>>> Finally I've created three nodes, increased the size of the pools to 3 and created 3 MDS (active, standby, standby).
>>>>>>
>>>>>> Today the server decided to fail and I noticed that failover is not working... The ceph -s command showed everything as OK, but the clients weren't able to connect, and I had to restart the failing node and reconnect the clients manually to make it work again (even though I think the active MDS was on another node).
>>>>>>
>>>>>> I don't know, maybe it's because the server was not fully down and only some connections were failing. I'll do some tests to see.
>>>>>>
>>>>>> Another question: how much memory does a node need to work? I have nodes with 2 GB of RAM (one MDS, one MON and one OSD each), and they have a high memory usage (more than 1 GB on the OSD).
>>>>>> The OSD size is 50 GB and the data it contains is less than 3 GB.
>>>>>>
>>>>>> Thanks, and greetings!!
>>>>>>
>>>>>> 2017-06-12 23:33 GMT+02:00 Mazzystr <mazzy...@gmail.com>:
>>>>>>
>>>>>>> Since your app is an Apache / PHP app, is it possible for you to reconfigure it to use an S3 module rather than a POSIX open file()? Then, with Ceph, drop CephFS and configure the Civetweb S3 gateway? You can have "active-active" endpoints with round-robin DNS or an F5 or something. You would also have to repopulate the objects into the rados pools.
>>>>>>>
>>>>>>> Also, increase that size parameter to 3. ;-)
>>>>>>>
>>>>>>> Lots of work for active-active, but the whole stack will be much more resilient (coming from someone with a ClearCase / NFS / stale-file-handles-up-the-wazoo background).
>>>>>>>
>>>>>>> On Mon, Jun 12, 2017 at 10:41 AM, Daniel Carrasco <d.carra...@i2tic.com> wrote:
>>>>>>>
>>>>>>>> 2017-06-12 16:10 GMT+02:00 David Turner <drakonst...@gmail.com>:
>>>>>>>>
>>>>>>>>> I have an incredibly light-weight cephfs configuration. I set up an MDS on each mon (3 total), and have 9 TB of data in cephfs. This data only has 1 client that reads a few files at a time. I haven't noticed any downtime when it fails over to a standby MDS. So it definitely depends on your workload as to how a failover will affect your environment.
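(As a quick aside on verifying these failovers: which MDS is currently active and how many standbys are available can be checked from any node that has an admin keyring, for example:

    ceph mds stat

The exact output format varies between releases, but it should show one daemon as up:active and the rest as up:standby, which makes it easy to confirm that a standby really took over after a restart.)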
>>>>>>>>>
>>>>>>>>> On Mon, Jun 12, 2017 at 9:59 AM John Petrini <jpetr...@coredial.com> wrote:
>>>>>>>>>
>>>>>>>>>> We use the following in our ceph.conf for MDS failover. We're running one active and one standby. Last time it failed over there was about 2 minutes of downtime before the mounts started responding again, but it did recover gracefully.
>>>>>>>>>>
>>>>>>>>>> [mds]
>>>>>>>>>> max_mds = 1
>>>>>>>>>> mds_standby_for_rank = 0
>>>>>>>>>> mds_standby_replay = true
>>>>>>>>>>
>>>>>>>>>> John Petrini
>>>>>>>>
>>>>>>>> Thanks to both.
>>>>>>>> I'm working on that right now because I need a very fast failover. For now the tests give me a very fast response when an OSD fails (about 5 seconds), but a very slow response when the main MDS fails (I haven't measured the real time, but it was not working for a long while). Maybe it was because I created the other MDS after mounting, because I've done some tests just before sending this email and now it looks very fast (I didn't notice the downtime).
>>>>>>>>
>>>>>>>> Greetings!!

--
_________________________________________

Daniel Carrasco Marín
Ingeniería para la Innovación i2TIC, S.L.
Tlf: +34 911 12 32 84 Ext: 223
www.i2tic.com
_________________________________________
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com