With appropriate tuning of various parameters, we have seen customers get
over 1000 executors online (with an average of 2-3 per slave).

With a large number of slaves, you are usually better off using one of the
on-demand retention strategies.

Getting all 1000 executors running builds can be problematic, though.

I am currently working on a framework for testing how Jenkins scales. It
will be published at
https://github.com/jenkinsci/scalability-test-framework once I finish some
refactoring that I have identified as necessary. I want people to have a
mostly stable API when I publish it.

Some other tooling I have built:

* https://github.com/jenkinsci/mock-load-builder-plugin which creates
build jobs that load up the remoting channel with traffic representative
of a chatty build

* https://github.com/jenkinsci/random-job-builder-plugin which triggers
builds of jobs selected at random at a specified rate.

With that tooling you can set up a Jenkins instance with a load of jobs and
have those jobs queue up in a semi-realistic - if stressed - way.
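
To make the idea of "jobs selected at random at a specified rate" concrete,
here is a minimal sketch of such a trigger schedule. It is only an
illustration of the concept and not the plugin's actual implementation; the
function name, the job names and the rate are all hypothetical.

```python
import random

def trigger_schedule(job_names, builds_per_minute, duration_minutes, seed=None):
    """Return (offset_seconds, job_name) pairs: one randomly chosen job per tick.

    Hypothetical helper for illustration only; it does not talk to Jenkins.
    """
    rng = random.Random(seed)
    interval = 60.0 / builds_per_minute          # seconds between triggers
    total = int(builds_per_minute * duration_minutes)
    return [(i * interval, rng.choice(job_names)) for i in range(total)]

# Hypothetical job names, organized in folders of 10 jobs each.
jobs = [f"folder-{i // 10}/mock-job-{i}" for i in range(100)]
schedule = trigger_schedule(jobs, builds_per_minute=30, duration_minutes=2, seed=1)

for offset, job in schedule[:3]:
    print(f"t+{offset:.0f}s trigger {job}")
```

Raising builds_per_minute while watching the master is the same kind of
ramp-up as increasing the build rate until the Web UI stops responding.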

Using all the above I can report that:

A Jenkins 1.553 master on an m3.large can support:

* Connecting 60 JNLP slaves and having them idle (but the system will be
unresponsive for 2-3 minutes after startup); OR
* Connecting 60 SSH slaves and having them idle (but the system will be
unresponsive for 5-6 minutes after startup); OR
* Connecting 60 CloudBees NIO SSH slaves and having them idle

All with 2 executors per slave.

For both of the SSH slave options above, you need to tell the JVM to use
/dev/./urandom as the entropy source. I suspect the newer NIO JNLP mode
will have removed or reduced the unresponsiveness of the master after
startup with JNLP slaves. That unresponsiveness is also a side effect of
my starting all the JNLP slaves at the same time. Real systems will likely
have JNLP slaves connect in a more staggered way and not suffer the thread
contention that locks up the Web UI.
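
For the entropy source, the usual way to point a JVM at /dev/./urandom is
the java.security.egd system property. A minimal sketch of building such a
launch command follows; the helper name and the jenkins.war path are
assumptions, and your own startup script or service configuration may pass
JVM options differently.

```python
def jenkins_launch_command(war_path="jenkins.war", java="java"):
    """Build a JVM command line that reads entropy from /dev/./urandom.

    Hypothetical helper: war_path and java are placeholders for your setup.
    """
    return [
        java,
        # Standard JVM system property selecting the entropy source.
        "-Djava.security.egd=file:/dev/./urandom",
        "-jar",
        war_path,
    ]

print(" ".join(jenkins_launch_command()))
# prints: java -Djava.security.egd=file:/dev/./urandom -jar jenkins.war
```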

On each of those test systems I created 3000 mock jobs organized in
folders. I then upped the rate of builds until the Web UI became
unresponsive.

* JNLP hits system load > 5 and the Web UI is unusable at somewhere between
50 and 55 concurrent builds on an m3.large
* Traditional SSH hits system load > 5 and the Web UI is unusable at between
10 and 12 concurrent builds on an m3.large
* CloudBees NIO SSH hits a system load of 4 at 15 concurrent builds, but the
Web UI remains usable all the way up to 120 concurrent builds. The build
duration, however, increases once you go past 15 concurrent builds. The
cause and effect here is that back-pressure is being applied to the
builds in order to allow the master to remain usable.
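
To put those numbers in proportion, the sketch below expresses each limit
as a share of the available executors. It assumes these runs used the same
60 slaves with 2 executors each as the idle tests above, which would make
120 concurrent builds the point where every executor is busy. The variable
and function names are mine.

```python
AGENTS = 60                # slaves connected in each test (assumed unchanged)
EXECUTORS_PER_AGENT = 2
total_executors = AGENTS * EXECUTORS_PER_AGENT   # 120

# (low, high) concurrent builds at which the Web UI became unusable
ui_limits = {
    "JNLP": (50, 55),
    "Traditional SSH": (10, 12),
}

def share_of_executors(builds, executors=total_executors):
    """Concurrent builds as a percentage of all executors."""
    return 100.0 * builds / executors

for connector, (low, high) in ui_limits.items():
    print(f"{connector}: {share_of_executors(low):.0f}%-"
          f"{share_of_executors(high):.0f}% of {total_executors} executors")

# CloudBees NIO SSH stayed usable with every executor busy.
print(f"CloudBees NIO SSH: {share_of_executors(120):.0f}% of executors")
```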

I picked an m3.large as a reasonably cost-effective machine type that
would let me scale up to a size where you should start seeing scalability
problems, but not so large that I would need a massive army of machines to
saturate it.

HTH


On 28 July 2014 23:51, Maureen Barger <mobar...@gmail.com> wrote:

> Hi - I am wondering if there is a limit to how many slaves can connect to
> one master.
>
> --
> You received this message because you are subscribed to the Google Groups
> "Jenkins Users" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to jenkinsci-users+unsubscr...@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.
>
