Good practices to scale, IMO:

* Don't build on the master.
* Yes. Add agents/slaves, see below.
* Never put more than one executor per agent (the term "slave" is now deprecated). Engineering time is far more expensive than having N agents, and single-executor agents keep builds from stepping on each other's toes.
** We initially used static agents running in a corporate ESX. Now it mostly runs on a Docker Swarm cluster, for about 60 hours a day and ~1000 active jobs.
* IMO, go directly to two or three agents and not just one. This way you may keep your users from designing builds that depend on a specific machine.
** Corollary: never use node names as labels.
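To put rough numbers on the agent count, here is a minimal Python sketch. It is an illustration only, not anything Jenkins ships. It assumes one executor per agent, that the builds are independent of each other, and that the next queued build goes to whichever agent frees up first; the real Jenkins queue is more involved. The function name `makespan` is hypothetical, and the durations are the 30 war files at about 3 minutes each from the question quoted below.

```python
import heapq


def makespan(durations, agents):
    """Minutes until every build has finished on single-executor agents.

    Assumption: builds are independent and each one is handed to the
    agent that becomes idle first (a simplification of the Jenkins queue).
    """
    if agents < 1:
        raise ValueError("need at least one agent")
    free_at = [0] * agents  # time at which each agent is next idle
    heapq.heapify(free_at)
    for minutes in durations:
        start = heapq.heappop(free_at)  # earliest available agent
        heapq.heappush(free_at, start + minutes)
    return max(free_at)


wars = [3] * 30  # 30 matrix configurations, about 3 minutes each
for n in (1, 2, 3):
    print(n, "agent(s):", makespan(wars, n), "minutes")
```

Under these assumptions the 90-minute serial run drops to 45 minutes on two agents and 30 minutes on three, which is the kind of estimate worth having before asking IT for hardware.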
My 2 cents

-- Baptiste

On 27 Jul 2016 8:42 PM, "Bruce Epstein" <goo...@zeusprod.com> wrote:

> Hi -
>
> I'm an experienced Jenkins user (writing Ant scripts, using plugins, etc.)
> but not an IT/administrator, and my IT dept is not that familiar with
> Jenkins scaling.
>
> If anyone can point me to a comprehensive discussion of the best way to
> scale, please provide a url.
>
> Current architecture:
>
> Only one master with just a single executor.
> All jobs are run on the master.
> Running Jenkins 1.652.
> The load is not heavy. We probably never have more than 2 or 3 users
> needing Jenkins at the same time, and usually it is just one. 95% of the
> time, we don't have a scale issue, so I don't want to over-engineer the
> solution.
> We have three or four development teams, and sometimes queue conflicts
> arise. We want to scale up a bit for future growth.
>
> Current problems:
> 1. Some jobs (with three or four sub-jobs) monopolize the queue for 30+
> minutes, preventing other jobs from running. One in particular is a library
> built in response to an svn change, which then triggers four other apps to
> rebuild. These are separate Jenkins jobs and yet they hog the queue
> preventing other users from running any jobs, even "in between" each app
> being rebuilt.
>
> 2. Some multiconfiguration jobs (that build, say, 30 war files), can take
> about 90 minutes to run (3 minutes per iteration). We'd like to cut that
> down, but at least they allow other jobs to run (i.e. don't monopolize the
> queue). These wars can be built in parallel (no need to run in series,
> which is the default for multiconfiguration jobs, I assume).
>
> Things I've tried:
> 1. No matter how I've tried to configure the queue-hogging job, I can't
> get it to "play nicely". Once it starts, it runs all the way through (say,
> 4 subjobs, each taking about 8 minutes).
> So, configuring the master to use,
> say, 2 or 3 executors seems to be one way to allow other jobs to run
> without being shut out.
>
> 2. Increasing the number of executors "works" for some use cases, but it
> also seems to cause jobs to run in parallel that I need to run in sequence.
> I'm unclear on how to prevent multiple executors from being used when I
> want one job to wait for another. Is this just how executors work? How do I
> ensure the extra executors are assigned to other jobs and not just used in
> parallel for the queue-hogging job?
>
> Possible solutions:
> 1. Add slaves? (see below)
> 2. Use multiple executors with BuildFlow or similar plugins to prevent
> jobs being triggered to run in parallel? Even BuildFlow seems to require at
> least two executors, or it hangs up trying to launch the first subjob in
> the flow.
>
> Proposed solution:
>
> 1. Stick with only one master. Creating multiple masters seems unnecessary
> at our size.
> 2. Don't build jobs on the master...leave that to the slaves. (This seems
> to be the best practice?)
> 3. Create two slaves eventually (one is enough for now while we are still
> performing builds on master too)
> 4. Configure one slave to use only one executor. Configure the second
> slave to use multiple executors.
> 5. Configure certain jobs to run on the appropriate slave (single-executor
> or multi-executor) depending on the job's needs.
>
> 6. Should I be looking at CloudBees or plugins like EC2, Heavy Job, or
> One-Shot Executor?
>
> I need someone who has "been there, done that" to give me a reality check
> or alert me to any blindspots before I ask IT to acquire more hardware and
> configure it. I want to have some confidence this will solve the problem
> without being overkill.
>
> Any insights appreciated.
>
> In gratitude, I'm happy to answer any Flex questions. :-)
>
> Thanks,
>
> Bruce
>
> --
> You received this message because you are subscribed to the Google Groups
> "Jenkins Users" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to jenkinsci-users+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/jenkinsci-users/3a038f29-5221-4830-80ae-ca5a70c7ccc7%40googlegroups.com
> <https://groups.google.com/d/msgid/jenkinsci-users/3a038f29-5221-4830-80ae-ca5a70c7ccc7%40googlegroups.com?utm_medium=email&utm_source=footer>
> .
> For more options, visit https://groups.google.com/d/optout.

--
To view this discussion on the web visit https://groups.google.com/d/msgid/jenkinsci-users/CANWgJS7ZULUVegsxiHKNohP3k7ZqoU8MMFA5F3TFHujN9pcZAg%40mail.gmail.com.