New Windows Jenkins Nodes
Greetings all! This might seem like a very quick update/change, but we needed to get some new Windows 2016 boxes into Jenkins rotation. I've swapped out windows-2012-2, 3, windows-2016-2, 3 for new nodes. There are currently builds running on two of the other nodes, which I'll let finish before moving those out of rotation. The new nodes are jenkins-win-he-de-1 through 6. They are better, faster, stronger, so the upgrade should bring improved build speeds. If you have any issues with builds on those nodes, please create a JIRA ticket or ping us/me on Slack. Thanks! -Chris T. #asfinfra
Re: Many Maven builds hanging on Jenkins
I was offline for a week, is it still an issue?

On 22-6-2019 14:12:29, Stefan Seelmann wrote:
Seems it happened again. Ctrl-F "Maven TLP" shows 335 builds running or waiting in the queue.

On 6/17/19 7:31 PM, Tibor Digana wrote:
> Who can rework the Jenkins plugin we use, so that the build won't be
> triggered after Groovy libs have changed?
> Somebody changes [1] and [2] and then all 100 Maven projects run a whole
> bunch of branches.
> The queue is huge in Jenkins, and this is a blocker for the entire
> organization, not only for us in Maven!
>
> [1]: https://gitbox.apache.org/repos/asf?p=maven-jenkins-env.git
> [2]: https://gitbox.apache.org/repos/asf?p=maven-jenkins-lib.git
>
> On Mon, Jun 17, 2019 at 7:16 PM Robert Scholte wrote:
>
>> Sure, will investigate.
>>
>> thanks,
>> Robert
>>
>> On Mon, 17 Jun 2019 15:21:30 +0200, Robert Munteanu wrote:
>>
>>> Hi,
>>>
>>> I noticed today that Jenkins is taking more time to start building
>>> various jobs. Looking at the executors, I think there is a
>>> misconfiguration/problem with some Maven-related jobs:
>>>
>>> - https://builds.apache.org/job/maven-box/job/maven/job/MNG-6672/6/console
>>>   This was started 3d17h ago.
>>>
>>> - https://builds.apache.org/job/maven-box/job/maven-resolver/job/MRESOLVER-12/6/console
>>>   This was started 4d4h ago.
>>>
>>> - https://builds.apache.org/job/maven-box/job/maven/job/master/226/console
>>>   This was started 1d16h ago.
>>>
>>> And quite a few more, but I guess you get the point :-) It would be
>>> great if someone from the Maven project could look into this.
>>>
>>> Thanks!
>>>
>>> Robert
External CI Service Limitations
Hi folks,

As this seems to be a hot topic as of late, I'll provide some information about our usage of external CI services.

Travis CI: The foundation has an agreement with Travis CI to provide our projects with external CI services through them. We currently have approximately 40 executors there, half of which (build-time-wise) are occupied by three projects: Flink (21%), Arrow (18%) and Airflow (13%). The foundation is not looking at increasing the number of executors there while we assess the long-term costs and benefits. We advise projects with higher immediate CI needs to either use our Jenkins CI system or work out a budget/plan for other options (whether internal or external to the foundation), and to assess whether the number and duration of their builds fit the overall goal of the CI need and the resources at our disposal. As of this week, all projects are capped at 5 concurrent builds on Travis.

AppVeyor and CircleCI: The foundation uses the free tier of these services. We have not received any requests for an increase, nor assessed whether one would be beneficial, and at the moment there are permission issues that conflict with our standard policies for repository access.

With regards,
Daniel, on behalf of ASF Infra.
Re: External CI Service Limitations
We also experience huge delays for Airflow (it seems we are the third "whale" according to https://lists.apache.org/thread.html/af52e2a3e865c01596d46374e8b294f2740587dbd59d85e132429b6c@%3Cbuilds.apache.org%3E). We are evaluating other options for funding as well (including getting some credits from Google for Google Cloud Build / GCP), but it will take time to get resources and to switch.

In the meantime maybe INFRA can help to coordinate some effort between Flink/Arrow/Airflow to decrease pressure on Travis? We have considered a few options (and are going to implement some of them shortly, I think). Some of them are not direct changes to the Travis CI builds but other workflow/infrastructure changes that will decrease pressure on Travis:

* We are going to shrink the matrix of builds we run. Currently we have several combinations of Airflow builds, (postgres/mysql/sqlite) x (python 3.5/python 3.6), but we will only run a subset of those rather than the full matrix.

* We are going to combine several of our jobs into one using parallel processing (a rough sketch of the idea follows this message). This is mainly for static code analysis: currently we have one job for each analysis, which makes them run in parallel as separate builds. After the change, once you include machine boot times and use all processors, the overall build time might even be faster than today, AND there will be far fewer VMs to start for the builds.

* We have a separate Kubernetes-related job. It currently runs only one suite of tests specific to Kubernetes, as it requires a special environment setup, but we are looking into the possibility of merging the Kubernetes tests into the main tests (with a faster environment setup via docker-compose) and saving one job (25% of our test jobs). The main jobs will run a bit longer, but the whole overhead of starting an extra job will be gone.

* We are introducing (the PR is in the final stages of review) an easy way for contributors to run static code analysis in their own environment. A lot of our builds are PRs failing because of static code analysis run on Travis. Until now it has been a bit convoluted and not easily reproducible to run the full analysis locally, but we are moving to a fully dockerised setup for builds that will let contributors easily run such checks on their machines, and we will encourage people to run them locally rather than submit PRs just to check whether the code is right.

* Even more: we are introducing and encouraging an easy-to-use "pre-commit" framework in our developer workflow, where the analysis runs at commit time on only the changes being committed (see the hook sketch after this message). This might further decrease the number of builds submitted by contributors.

* Lastly, we are introducing an easy-to-use "simplified development environment" where developers will be able to run all or a subset of the test suites easily on their machines. Currently our setup is fairly convoluted as well, but we have a PR in progress to address it and provide a very easy, again fully dockerised, way to reproduce the test environment.

Maybe the committers from Flink and Arrow can also take a look at non-obvious ways their projects could decrease pressure on Travis (at least for the time being). Maybe there are some quick wins we can apply quickly in a coordinated way and buy more time for switching the infrastructure?
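The two sketches below are illustrative only; neither is Airflow's actual implementation, and the tool names, paths and module names in them are assumptions made up for the example.

First, a minimal sketch of the "combine several jobs into one using parallel processing" idea: a single CI job fans the individual static-analysis tools out across the machine's cores and fails the build if any of them fails.

#!/usr/bin/env python3
"""Run several static-analysis tools in parallel inside one CI job.

The checks listed here are placeholders, not Airflow's real check list.
"""
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor

# Hypothetical checks: name -> command line to run.
CHECKS = {
    "flake8": ["flake8", "."],
    "pylint": ["pylint", "airflow"],
    "mypy": ["mypy", "airflow"],
}

def run_check(name, cmd):
    """Run one tool, capturing its output and exit code."""
    proc = subprocess.run(cmd, capture_output=True, text=True)
    return name, proc.returncode, proc.stdout + proc.stderr

def main():
    failed = []
    # Threads are enough here: each worker just waits on its subprocess,
    # so the tools themselves keep all CPU cores busy.
    with ThreadPoolExecutor(max_workers=len(CHECKS)) as pool:
        results = pool.map(lambda item: run_check(*item), CHECKS.items())
        for name, code, output in results:
            print(f"=== {name}: {'OK' if code == 0 else 'FAILED'} ===")
            if code != 0:
                print(output)
                failed.append(name)
    if failed:
        print("Failed checks: " + ", ".join(failed))
        sys.exit(1)

if __name__ == "__main__":
    main()

Second, a sketch of the "run the analysis at commit time on only the changes being committed" idea, written as a plain git pre-commit hook (the pre-commit framework itself is configured differently; this only shows the underlying technique):

#!/usr/bin/env python3
# Example .git/hooks/pre-commit: lint only the Python files staged
# for this commit. flake8 is an arbitrary example tool.
import subprocess
import sys

def staged_python_files():
    out = subprocess.run(
        ["git", "diff", "--cached", "--name-only", "--diff-filter=ACM"],
        capture_output=True, text=True, check=True,
    ).stdout
    return [f for f in out.splitlines() if f.endswith(".py")]

def main():
    files = staged_python_files()
    if not files:
        return 0  # nothing staged that we care about
    result = subprocess.run(["flake8", *files])
    if result.returncode != 0:
        print("Static checks failed; fix them or commit with --no-verify.")
    return result.returncode

if __name__ == "__main__":
    sys.exit(main())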
Re: External CI Service Limitations
On Tue, Jul 2, 2019 at 11:56 PM Jarek Potiuk wrote:
>...
> In the meantime maybe INFRA can help to coordinate some effort between
> Flink/Arrow/Airflow to decrease pressure on Travis? We have considered a
> few options (and are going to implement some of them shortly, I think).
> Some of them are not direct changes to the Travis CI builds but other
> workflow/infrastructure changes that will decrease pressure on Travis:
>...
> Maybe the committers from Flink and Arrow can also take a look at
> non-obvious ways their projects could decrease pressure on Travis (at
> least for the time being). Maybe there are some quick wins we can apply
> quickly in a coordinated way and buy more time for switching the
> infrastructure?

The above is fabulous. Please continue trading thoughts and working to reduce your Travis loads, for your own benefit and for your fellow projects at the Foundation. This list is the best space to trade such ideas.

I'm not sure what Infra can do here, as our skillset is quite a bit different from what your projects need for reducing load. We'll keep this list apprised of anything we find. If anybody knows of, and/or can recommend a similar type of outsourced build service ... we *absolutely* would welcome pointers. We're gonna keep Jenkins and buildbot around for the foreseeable future, and are interested in outsourced solutions.

Cheers,
Greg Stein
Infrastructure Administrator, ASF
Re: External CI Service Limitations
> On Jul 2, 2019, at 10:21 PM, Greg Stein wrote:
>
> We'll keep this list apprised of anything we find. If anybody knows of,
> and/or can recommend a similar type of outsourced build service ... we
> *absolutely* would welcome pointers.

FWIW, we've been collecting them bit by bit into Apache Yetus ( http://yetus.apache.org/documentation/in-progress/precommit-robots/ ):

* Azure Pipelines
* Circle CI
* Cirrus CI
* Gitlab CI
* Semaphore CI
* Travis CI

They all have some pros and cons. I'm not going to rank them or anything.

I will say, however, it really feels like Gitlab CI is the best bet to pursue, since one can add their own runners to the Gitlab CI infrastructure dedicated to their own projects. That ultimately means that replacing Jenkins slaves is a very real possibility.

(Also, I've requested access to the Github Actions beta, but haven't received anything yet. I have a hunch that the reworking of the OAuth permission model is related, which may make some of these more viable for the ASF.)
Re: External CI Service Limitations
Azure Pipelines has the big plus of supporting Linux, Windows and macOS nodes. And I think you can add your nodes to the pools.

Jeff

On Wed, Jul 3, 2019 at 08:04, Allen Wittenauer wrote:
>
> > On Jul 2, 2019, at 10:21 PM, Greg Stein wrote:
> >
> > We'll keep this list apprised of anything we find. If anybody knows of,
> > and/or can recommend a similar type of outsourced build service ... we
> > *absolutely* would welcome pointers.
>
> FWIW, we've been collecting them bit by bit into Apache Yetus (
> http://yetus.apache.org/documentation/in-progress/precommit-robots/ ):
>
> * Azure Pipelines
> * Circle CI
> * Cirrus CI
> * Gitlab CI
> * Semaphore CI
> * Travis CI
>
> They all have some pros and cons. I'm not going to rank them or anything.
>
> I will say, however, it really feels like Gitlab CI is the best bet to
> pursue, since one can add their own runners to the Gitlab CI infrastructure
> dedicated to their own projects. That ultimately means that replacing
> Jenkins slaves is a very real possibility.
>
> (Also, I've requested access to the Github Actions beta, but haven't
> received anything yet. I have a hunch that the reworking of the OAuth
> permission model is related, which may make some of these more viable
> for the ASF.)
Re: External CI Service Limitations
> On Jul 2, 2019, at 11:12 PM, Jeff MAURY wrote:
>
> Azure Pipelines has the big plus of supporting Linux, Windows and macOS nodes.

There are a few that support various combinations of non-Linux. Gitlab CI has been there for a while. Circle CI has had OS X and is in beta with Windows. Cirrus CI has all those plus FreeBSD; etc., etc. It's quickly becoming required that cloud-based CI systems do more than just throw up a Linux box.

> And I think you can add your nodes to the pools.

I think they are limited to being on Azure tho, IIRC. But I'm probably not remembering correctly. I pretty much gave up on doing anything serious with it.

I really wanted to like Pipelines. The UI is nice. But in the end, Pipelines was one of the more frustrating ones to work with in my experience, and that was with some help from the MS folks. It suffers a death by a thousand cuts: lack of complex, real-world examples, a custom docker binary, pre-populated bits here and there, a ton of env vars, an artifact system that is a total disaster, etc., etc. Lots of small problems that add up to it just not being worth the effort. Hopefully it's improved since I last looked at it months and months ago, though.