I'm trying to put together a quick summary of what I've found out so far with testing juju in an environment with thousands (5000+) of agents.
1) I never ran into problems with connection failures due to socket exhaustion. The default upstart script we write for jujud has "limit nofile 20000 20000", and we seem to properly handle that 1 agent == 1 connection (vs. the old 1 agent == >=2 mongodb connections).

2) Agents seem to consume about 17MB resident according to 'top'. That should mean we can run ~450 agents on an m1.large. Though in my testing I was running ~450 and still had free memory, so I'm guessing there might be some copy-on-write pages (17MB is very close to the size of the jujud binary).

3) On the API server, with 5k active connections, resident memory was 2.2GB for jujud (about 400kB/conn) and only about 55MB for mongodb. DB size on disk was about 650MB. The log file could grow pretty big (up to 2.5GB once everything was up and running, though it compresses to 200MB), but I'll come back to that later. Once all the agents are up and running, they are actually very quiet (almost 0 log statements).

4) If I bring up the units one by one:

  for i in `seq 500`; do
    for j in `seq 10`; do
      juju add-unit --to $j &
    done
    time wait
  done

it ends up triggering O(N^2) behavior in the system. Each unit agent seems to have a watcher for the other units of the same service, so when you add 1 unit, it wakes up all existing units to let them know about it. In theory this is on a 5s rate limit (only 1 wakeup per 5 seconds). In practice it was taking >3s per add-unit call, even when requesting them in parallel. I think this was because of the load on the API server from all the other units waking up and asking for details at the same time.

- From what I can tell, all units take out a watch on their service so that they can monitor its Life and CharmURL. However, adding a unit to a service triggers a change on that service, even though Life and CharmURL haven't changed.
If we split out watching the units-on-a-service from watching the Life and CharmURL of a service, we could avoid the N^2 thundering-herd problem while starting up a bunch of units. Though UpgradeCharm is still going to cause a thundering herd. Response in the log from the last "AddServiceUnits" call: http://paste.ubuntu.com/6329753/ Essentially it triggers 700 calls to Service.Life and CharmURL (I think at this point one of the 10 machines wasn't responding, so it was <1k units running).

5) Along with the load, we weren't caching the IP address of the API machine, which caused us to read the provider-state file from object storage and then ask EC2 for the IP address of that machine. Log of 1 unit agent's connection: http://paste.ubuntu.com/6329661/ Eventually, while starting up, the unit agent would make a request for APIAddresses (I believe it puts that information into the context for hooks that it runs). Occasionally that request gets rate limited by EC2. When that request fails, it triggers us to stop:

  "WatchServiceRelations"
  "WatchConfigSettings"
  "Watch(unit-ubuntu-4073)"  # itself
  "Watch(service-ubuntu)"    # the service it is running

It then seems to restart the unit agent, which goes through the steps of making all the same requests again (get the Life of my unit, get the Life of my service, get the UUID of this environment, etc.; there are 41 requests before it gets to APIAddresses).

6) If you restart jujud (say after an upgrade), it causes all unit agents to redo the 41 startup requests. This seems to be rate limited by the jujud process (up to 600% CPU) and a little bit by Mongo (almost 100% CPU). It takes a while, but with enough horsepower and GOMAXPROCS enabled it does seem to recover (IIRC it took about 20 minutes).

7) If I "juju deploy nrpe-external-master; juju add-relation ubuntu nrpe-external-master", very shortly thereafter "juju status" reports all agents (machine and unit agents) as "agent-state: down". Even the machine-0 agent.
Given I was already close to capacity even on the unit machines, there could be any sort of problem here. I would like to try another test where we are a bit farther away from capacity.

8) We do end up CPU throttled fairly often (especially if we don't set GOMAXPROCS). It is probably worth spending some time profiling what jujud is doing. I have the feeling all of those calls to CharmURL are triggering DB reads from Mongo, which is a bit inefficient. I would be fine doing max(1, NumCPUs()-1) or something similar. I'd rather do it inside jujud than in the cloud-init script, because computing NumCPUs is easier there. But we should have *a* way to scale up the central node that isn't just scaling out to more API servers.

9) We also do seem to hit MongoDB limits. I ended up at 100% CPU for mongod, and it certainly never went above 100%. I didn't see any way to configure mongo to use more CPU. I wonder if it is limited to 1 CPU per connection, or if it is just always 1 CPU. I certainly think we need a way to scale Mongo as well. If it is just 1 CPU per connection, then scaling horizontally with API servers should get us around that limit.

10) Allowing "juju add-unit -n 100 --to X" did make things a lot easier to bring up, though it still takes a while for the request to finish. It felt like the API call triggered work to start happening in the background, which made the current API call take longer to finally complete (as in, minutes once we had >1000 units). I generally went:

  juju deploy ubuntu -n 10
  # grow to 100
  for i in `seq 10`; do juju add-unit -n 9 --to $i & done; time wait
  # grow to 1000
  for i in `seq 10`; do juju add-unit -n 90 --to ...
  # grow to 5000
  for i in `seq 10`; do juju add-unit -n 400 --to ...

The branch with my patches is available at: lp:~jameinel/juju-core/scale-testing

Not everything in there is worth landing in trunk (rudimentary API caching, etc.). That's all I can think of for now, though I think there is more to be explored.
John =:->

--
Juju-dev mailing list
[email protected]
https://lists.ubuntu.com/mailman/listinfo/juju-dev
