On 30/09/2013, at 10:49 PM, Moullé Alain <[email protected]> wrote:

> Hi,
> 
> sorry for the delay on this thread, I was unavailable for a few weeks, but 
> FYI, I wanted to share some results I got a few weeks ago:
> 
> I've tried some tests on a configuration and start/stop of 500 Dummy 
> resources, and I got these time values :
> 
> 1/ configuration with successive crm commands ("crm configure primitive ..." 
> for each resource): it takes about 1 hour, so it is not usable
> 2/ with a single crm command "crm configure < File", with all dummy 
> primitives in File: it takes 7 s / that's OK
> 3/ adding one location constraint per dummy primitive with "crm 
> configure < File", with all constraints in File: it takes 27 s / strange but 
> acceptable
> 4/ start of the 500 primitives with successive crm commands ("crm resource 
> start ..." for each): it takes 7 min 28 s / seems unacceptable, especially 
> for dummy resources ...
> 5/ start of the 500 primitives with parallel (backgrounded) crm commands 
> ("crm resource start ... &"): not workable; many of the commands exit with 
> errors, and it still takes a long time anyway
> 6/ start of the 500 primitives in parallel by setting all target-roles to 
> "Started" in Pacemaker:
>    => with crm configure edit : s/Stopped/Started on the 500 primitives
>    Result: around 6 min for all primitives to be started. Also seems 
> unacceptable for dummy resources, and a failover would take roughly 3 min if 
> the primitives are well located, half on one node and half on the other.
> 
> These results are with dummy resources, and we can imagine that with real 
> resources it would take much longer, not to mention the periodic 
> monitoring of 500 primitives ...
> 
> So, based on these results, I think that the practical limit on the number 
> of resources is far below 500 ...
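(Aside: the single-file approach from test 2/ above can be scripted rather 
than typed; a rough sketch, where the file name "dummies.crm" and the 
resource names "dummy-N" are hypothetical:)

```shell
# Rough sketch of approach 2/: generate all primitives into one batch
# file, then load it with a single crm invocation instead of 500.
# "dummies.crm" and "dummy-N" are illustrative names, not from the thread.
N=500
: > dummies.crm                          # create/truncate the batch file
for i in $(seq 1 "$N"); do
    printf 'primitive dummy-%d ocf:pacemaker:Dummy meta target-role=Stopped\n' \
        "$i" >> dummies.crm
done
# One crm call instead of 500 separate ones:
# crm configure < dummies.crm
```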

It's not so much the pure number of resources as how many are doing things at 
the same time.
As Dejan mentioned, there is batch-limit, but finding some way to auto-tune it 
is on the short-term agenda.
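(For reference, until auto-tuning exists, batch-limit is an ordinary cluster 
property and can be tuned by hand; a sketch, where the value 30 is an 
arbitrary example, not a recommendation:)

```shell
# batch-limit caps how many operations the cluster will execute in
# parallel cluster-wide; 30 here is purely illustrative.
crm configure property batch-limit=30

# Equivalently, with the lower-level Pacemaker tool:
crm_attribute --type crm_config --name batch-limit --update 30
```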

> 
> But I wanted to give these results just to keep going on this subject and 
> perhaps get some ideas ...
> 
> Thanks
> Alain
> 
> 
> 
> Le 05/09/2013 10:58, Lars Marowsky-Bree a écrit :
>> On 2013-09-04T08:26:14, Ulrich Windl <[email protected]> 
>> wrote:
>> 
>>> In my experience network traffic grows roughly linearly with the size
>>> of the CIB. At some point you probably have to change communication
>>> parameters to keep the cluster in a happy communication state.
>> Yes, I wish corosync would "auto-tune" to a higher degree. Apparently
>> though, that's a slightly harder problem.
>> 
>> We welcome any feedback on required tunables. Those that we ship on SLE
>> HA worked for us (and even for rather largeish configurations), but they
>> may not be appropriate everywhere.
>> 
>>> Apart from the cluster internals, there may be problems if a node goes
>>> online and hundreds of resources are started in parallel, specifically
>>> if those resources weren't designed for it. I suspect IP addresses,
>>> MD-RAIDs, LVM stuff, drbd, filesystems, exportfs, etc.
>> No, most of these resource scripts *are* supposed to be
>> concurrency-safe. If you find something that breaks, please share the
>> feedback.
>> 
>> It's true that the way concurrent load limitation is implemented in
>> Pacemaker/LRM isn't perfect yet. batch-limit is rather coarse. The
>> per-node LRM child limit is probably the best bet right now. But it
>> doesn't differentiate between starting many light-weight resources in
>> parallel (such as IPaddr) versus heavy-weights (VMs with Oracle
>> databases).
>> 
>> (migration-threshold goes in the same direction.)
>> 
>> Historical context matters. Pacemaker comes from the HA world; we still
>> believe 3-7 node clusters are the largest anyone ought to reasonably
>> build, considering the failure/admin/security domain issues with single
>> points of failure and the increasing likelihood of double failures etc.
>> 
>> But there are several trends -
>> 
>> Even those 3-7 nodes become increasingly powerful multi-core kick-ass
>> boxes. 7 nodes might well host hundreds of resources nowadays (say,
>> above 70 VMs with all their supporting resources).
>> 
>> People build much larger clusters because there's no good way to "divide
>> and conquer" yet - e.g., if you build several 3 or 5 node clusters,
>> there's no support for managing those clusters-of-clusters.
>> 
>> And people use Pacemaker for HPC style deployments (e.g., private
>> clouds with tons of VMs) - because while our HPC support is suboptimal,
>> it is better than the HA support in most of the Cloud offerings.
>> 
>> 
>>> As a note: Just recently we had a failure in MD-RAID activation with no real
>>> reason to be found in syslog, and the cluster got quite confused.
>>> (I had reported this to my favourite supporter (SR 10851868591), but haven't
>>> heard anything since then...)
>> I'll try to dig that out of the support system and give it a look.
>> 
>> 
>> Regards,
>>     Lars
>> 
> 
> _______________________________________________
> Linux-HA mailing list
> [email protected]
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems
