Merged. I hope things will be more stable for quite a while :). Let me know if you see any instabilities - there are still a few occasional flaky tests, but they should be few and far between and I hope we can efficiently get rid of them now.
J.

On Thu, Dec 8, 2022 at 2:05 AM Jarek Potiuk <ja...@potiuk.com> wrote:

> Two PRs merged. Two more to go:
>
> * https://github.com/apache/airflow/pull/28209
> * https://github.com/apache/airflow/pull/28207
>
> I ran quite a few public runs and I have not seen more memory problems :)
> - so once we merge those, we should be back in green for public runners as
> well (plus the builds should be a bit faster).
>
> J.
>
>
> On Wed, Dec 7, 2022 at 6:36 PM Oliveira, Niko <oniko...@amazon.com.invalid>
> wrote:
>
>> Awesome to hear this!
>>
>> I was really battling this issue last week, very excited for these
>> improvements, let me know if I can help.
>>
>> Cheers,
>> Niko
>> ------------------------------
>> *From:* Jarek Potiuk <ja...@potiuk.com>
>> *Sent:* Tuesday, December 6, 2022 5:54:07 AM
>> *To:* dev@airflow.apache.org
>> *Subject:* [EXTERNAL] [PROPOSAL] Dealing with public runner test failues
>> (Integration tests restructuring)
>>
>> Hey everyone,
>>
>> I think many contributors (non-committers) have started to suffer from
>> frequently failing (disappearing) test runs (mostly for sqlite).
>>
>> Together with @Taragolis, we looked at those recent stability issues
>> with "public runners". They all boil down to the integration tests
>> taking too much memory.
>>
>> Attached is an example screenshot from a debug run that I ran when
>> trying to "catch the problem in the act" with debugging enabled. It
>> seems that just before such a failure we had just 55M of memory left
>> (out of the 7G available in the public runners), and then the runner
>> "disappeared". Looks like the writing is on the wall.
>>
>> There are two ways we will be addressing it shortly (unless someone
>> objects or has more/other ideas to improve it):
>>
>> 1. Improving how integration tests are structured and run
>>
>> * We will reorganize our integration tests to be (similar to system
>> tests) in a separate subfolder of "tests" - this will allow for
>> easier discovery and a better structured approach to all integration
>> tests.
>>
>> * We will STOP running integration tests in our regular test jobs.
>> Instead we will introduce a separate "Integration Test" job that will
>> run only integration tests and that will run the integrations
>> "one-by-one" - i.e. we will not be starting kerberos, mongo, redis
>> all together, but will only start the minimal set of integrations
>> needed for the tests that are using them.
>>
>> 2. Arranging for bigger public runners
>>
>> I am discussing - in the Apache Infrastructure meetings (the next
>> meeting is on Wednesday) - using more powerful public runners. This is
>> possible, and we just need to make sure INFRA/Apache is not overusing
>> the free runners the Apache Software Foundation gets as a generous
>> sponsorship from GitHub. This might actually vastly decrease the
>> feedback time you get as non-committers, as we can get up to 4x
>> faster builds this way.
>>
>> J.