Just for your information, Greg: we are now experiencing another issue with Travis CI. Suddenly our jobs are resource-strapped. Our builds started to fail with "Out of Memory" and "Not enough CPU" (to run minikube) errors. I opened a critical infrastructure ticket (https://issues.apache.org/jira/browse/INFRA-18787), disabled automated PR building (the builds are all failing anyway), and temporarily removed the limit on the number of concurrent jobs so that I can test potential solutions, or retest if Travis fixes the problem (big if).

Now we are REALLY blocked.

J.

On Mon, Jul 22, 2019 at 3:40 PM Jarek Potiuk <jarek.pot...@polidea.com> wrote:

> Harsh "no" indeed - I understand where it comes from, as I am looking at
> this from my project's perspective.
>
> But maybe instead of a binary yes/no answer we can think about some
> compromise/temporary solution (for a week or so while we run the release
> and try to migrate out to another CI).
> Maybe increasing just our project's capacity (selectively, if possible) to
> 10 or 15 for a week might help us with the coming release and the
> migration to another CI.
>
> I feel really sad and angry that in this case Apache Infra is not
> a bit more helpful.
>
> It seems that we are suddenly, without any real warning, being punished for
> having very well tested code with good engineering practices (small
> commits, one commit per PR), strong expectations that every commit goes
> through extensive full testing on all environments, and a very active
> community with a lot of contributors. Our workflow really depends on those
> tests working, and such a sudden limitation is not a nice signal to the
> community.
>
> We do whatever we can to limit the pressure on Travis:
>
> - We already decreased the build matrix to a very limited set of tests.
> - I've already asked all contributors to run most of the testing
>   (static analysis especially) locally, and we even merged a big change
>   last week to make that super-easy.
> - We are adding a pre-commit hook framework to make it fully automated
>   and even easier for contributors to run those checks before they hit
>   Travis.
> - We are going to merge some of those jobs into one to reduce the
>   number of jobs.
> - We merged a change that makes it easy to migrate out of Travis.
> - We are already in the process of moving to GKE-provided
>   infrastructure: we initiated discussions with GitLab CI and we managed
>   to secure resources from Google in GCP to run our workflows. I hope I
>   can have a working POC this week and be able to migrate soon after.
>
> The problem is that with the current limitation even our effort to move
> out is hampered: we have a number of changes pending that we need in
> order to finally migrate, but it takes hours for those changes to even go
> through, not to mention merging.
> Is there anything Apache Infra can do to help with this?
>
> J.
>
> On Mon, Jul 22, 2019 at 2:31 PM Greg Stein <gst...@gmail.com> wrote:
>
>> On Mon, Jul 22, 2019 at 2:20 AM Jarek Potiuk <jarek.pot...@polidea.com>
>> wrote:
>>
>> > Hello Everyone (especially the infrastructure),
>> >
>> > Can we increase the number of workers/jobs we have per project now?
>> > Decreasing it to 5 (which I believe is the case) is terrible for us now.
>> > We are nearing the 1.10.4 release of Airflow, and if we have more than
>> > one PR in the queue it waits for several hours to run!
>> >
>> > Can we increase the limit to 15 or so (3 parallel builds for Airflow,
>> > as we are running 5 jobs per stage)?
>> >
>>
>> Sorry to say, but "no". Travis is a *shared* resource.
>>
>> As noted elsethread, before we applied limits, Airflow used about 77,000
>> minutes in a month. That is tying up two executors full-time for the
>> entire month. We have a hundred projects using Travis, and Airflow
>> consumed a 20th of our entire capacity.
>>
>> The limit for all projects shall remain at five (5).
>>
>> You can always run your tests locally, to prepare for your upcoming
>> release. The Foundation's paid resources need to remain shared/available
>> for all of our projects.
>>
>> Regards,
>> Greg Stein
>> Infrastructure Administrator, ASF
>
> --
> Jarek Potiuk
> Polidea <https://www.polidea.com/> | Principal Software Engineer
> M: +48 660 796 129
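
For anyone who wants to sanity-check the capacity figures quoted in this thread, here is a rough back-of-the-envelope sketch. It only uses numbers mentioned above (77,000 minutes, "a 20th" of capacity, 3 parallel builds at 5 jobs per stage). The 30-day month and all function names are my own assumptions for illustration, not anything Travis or Apache Infra publishes.

```python
# Rough check of the Travis usage figures quoted in the thread.
# Assumption (not from the thread): a 30-day month.
MINUTES_PER_DAY = 24 * 60


def executors_tied_up(minutes_used: float, days_in_month: int = 30) -> float:
    """Executors kept busy full-time by `minutes_used` build minutes in a month."""
    return minutes_used / (days_in_month * MINUTES_PER_DAY)


def implied_total_executors(executors_used: float, share_of_capacity: float) -> float:
    """Total executor count implied if `executors_used` is `share_of_capacity` of it."""
    return executors_used / share_of_capacity


def concurrent_jobs_needed(parallel_builds: int, jobs_per_stage: int) -> int:
    """Concurrent job slots needed to run `parallel_builds` builds side by side."""
    return parallel_builds * jobs_per_stage


# Figures taken from the messages above.
airflow_executors = executors_tied_up(77_000)
total_executors = implied_total_executors(airflow_executors, 1 / 20)
requested_limit = concurrent_jobs_needed(parallel_builds=3, jobs_per_stage=5)

print(f"Executors tied up by Airflow: {airflow_executors:.2f}")
print(f"Implied total executors:      {total_executors:.1f}")
print(f"Requested concurrency limit:  {requested_limit}")
```

Under the 30-day assumption, 77,000 minutes comes out a little under two full-time executors, which is consistent with the "two executors" wording, and the requested limit of 15 is simply 3 builds times 5 jobs per stage.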