On 30/09/2013, at 10:49 PM, Moullé Alain <[email protected]> wrote:
> Hi,
>
> Sorry for the delay on this thread; I was unavailable for a few weeks.
> But FYI, I wanted to share some results I got a few weeks ago.
>
> I ran some tests on configuring and starting/stopping 500 Dummy
> resources, and I got these timings:
>
> 1/ Configuration with successive "crm configure primitive ..." commands:
>    it takes about 1 hour, so it is not usable.
> 2/ With a single "crm configure < File" command, with all Dummy
>    primitives in File: it takes 7 s. That's OK.
> 3/ Adding just one location constraint per Dummy primitive, again with
>    "crm configure < File" and all constraints in File: it takes 27 s.
>    Strange, but acceptable.
> 4/ Starting the 500 primitives with successive "crm resource start ..."
>    commands: it takes 7 min 28 s. That seems unacceptable, especially
>    for Dummy resources.
> 5/ Starting the 500 primitives with parallel (backgrounded) "crm
>    resource start ... &" commands: not workable; many of the commands
>    exit with errors, and it takes a long time anyway.
> 6/ Starting the 500 primitives in parallel by setting all target-roles
>    to "Started" in Pacemaker (with "crm configure edit":
>    s/Stopped/Started on the 500 primitives): around 6 min for all
>    primitives to be started. That also seems unacceptable for Dummy
>    resources, and a failover would take roughly 3 min if the primitives
>    are well balanced, half on one node and half on the other.
>
> These results are with Dummy resources; with real resources we can
> expect it to take much longer, not to mention the periodic monitoring
> of 500 primitives...
>
> So, based on these results, I think the practical limit on the number
> of resources is far below 500...

It's not so much the pure number of resources, but how many are doing
things at the same time. As Dejan mentioned, there is batch-limit, but
finding some way to auto-tune this is on the short-term agenda.
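For reference, the single-file approach (2/ above) is easy to script. A
sketch, where the resource names, the monitor interval, and the file path
are all illustrative:

```shell
# Sketch of approach 2/: generate all 500 Dummy primitives into one file
# and load them in a single crmsh transaction instead of 500 round-trips.
for i in $(seq 1 500); do
    printf 'primitive dummy-%d ocf:pacemaker:Dummy op monitor interval=30s\n' "$i"
done > /tmp/dummies.crm

# One call instead of 500; the same trick applies to the location
# constraints in approach 3/ (requires a running cluster, so commented out):
# crm configure < /tmp/dummies.crm
```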
> But I wanted to give these results just to keep this subject going, and
> perhaps get some ideas...
>
> Thanks,
> Alain
>
> On 05/09/2013 10:58, Lars Marowsky-Bree wrote:
>> On 2013-09-04T08:26:14, Ulrich Windl <[email protected]>
>> wrote:
>>
>>> In my experience, network traffic grows somewhat linearly with the
>>> size of the CIB. At some point you probably have to change
>>> communication parameters to keep the cluster in a happy
>>> communication state.
>>
>> Yes, I wish corosync would "auto-tune" to a higher degree. Apparently,
>> though, that's a slightly harder problem.
>>
>> We welcome any feedback on required tunables. Those that we ship on
>> SLE HA worked for us (even for rather large configurations), but they
>> may not be appropriate everywhere.
>>
>>> Cluster internals aside, there may be problems if a node goes online
>>> and hundreds of resources are started in parallel, especially if
>>> those resources weren't designed for it. I suspect IP addresses,
>>> MD-RAIDs, LVM stuff, drbd, filesystems, exportfs, etc.
>>
>> No, most of these resource scripts *are* supposed to be
>> concurrency-safe. If you find something that breaks, please share the
>> feedback.
>>
>> It's true that the way concurrent load limitation is implemented in
>> Pacemaker/LRM isn't perfect yet. batch-limit is rather coarse. The
>> per-node LRM child limit is probably the best bet right now, but it
>> doesn't differentiate between starting many light-weight resources in
>> parallel (such as IPaddr) and heavy-weights (VMs with Oracle
>> databases).
>>
>> (migration-threshold goes in the same direction.)
>>
>> Historical context matters. Pacemaker comes from the HA world; we
>> still believe 3-7 node clusters are the largest anyone ought to
>> reasonably build, considering the failure/admin/security domain issues
>> with single points of failure, the increasing likelihood of double
>> failures, etc.
>>
>> But there are several trends.
>>
>> Even those 3-7 nodes are increasingly powerful multi-core kick-ass
>> boxes. 7 nodes might well host hundreds of resources nowadays (say,
>> above 70 VMs with all their supporting resources).
>>
>> People build much larger clusters because there's no good way to
>> "divide and conquer" yet; e.g., if you build several 3- or 5-node
>> clusters, there's no support for managing those clusters-of-clusters.
>>
>> And people use Pacemaker for HPC-style deployments (e.g., private
>> clouds with tons of VMs), because while our HPC support is suboptimal,
>> it is better than the HA support in most of the cloud offerings.
>>
>>> As a note: just recently we had a failure in MD-RAID activation with
>>> no real reason to be found in syslog, and the cluster got quite
>>> confused. (I had reported this to my favourite supporter (SR
>>> 10851868591), but haven't heard anything since then...)
>>
>> I'll try to dig that out of the support system and give it a look.
>>
>> Regards,
>> Lars
>
> _______________________________________________
> Linux-HA mailing list
> [email protected]
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems
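To make the batch-limit knob mentioned above concrete: it is an ordinary
cluster property, settable through crmsh. The value below is only an
illustrative starting point, not a tuned recommendation, and it requires
a running cluster:

```shell
# batch-limit caps how many actions the transition engine dispatches in
# parallel across the whole cluster. 30 is an illustrative value only;
# tune it against your actual resource start/monitor costs.
crm configure property batch-limit=30

# Show the resulting property (prints nothing if it is unset):
crm configure show | grep batch-limit
```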
