Why was a workflow change that impacts every. single. PR. not discussed in this list before turning it on? I can’t think of a more impactful change to committers.
Just last month weren’t you saying we needed to have more discussion about impactful change on this list? > On 30 Apr 2025, at 10:04, Jarek Potiuk <ja...@potiuk.com> wrote: > Yeah, And I would encourage everyone to try it and provide feedback while > it is enabled. > > So far we identified a few things (and fix them) that made it borderline > unusable (bug in workflow for UI-only changes, GitHub starting to 429-us > with too many requests, and the mysterious "hanging" of the latest > botocore/celery (?) "special tests" -> all that is already addressed in > main, and those issues should not happen (hopefully), so I'd say we can now > "truly" see how it might work. > > And one comment from my side - indeed, I find it nice actually, but it's > definitely not a "deal breaker"- so if others find it too disruptive, I am > definitely not going to die on this hill, I just thought it does improve > the workflow in the way that it allows for mostly "fire and forget" when > you approve the workflow, one thing that it definitely improves is that you > do not have to remember about coming back to merge a request when CI > succeeds. > > J. > > On Wed, Apr 30, 2025 at 10:58 AM Kaxil Naik <kaxiln...@gmail.com> wrote: > >> Forgot to note an additional point in Summary: If we find anything blocking >> us in that period, we will merge >> https://github.com/apache/airflow/pull/50009 to disable auto-merge. >> >> On Wed, 30 Apr 2025 at 14:26, Kaxil Naik <kaxiln...@gmail.com> wrote: >> >>> Jarek & I discussed it on Slack in #internal-airflow-ci-cd. Summary >> below: >>> I have a PR to disable it: https://github.com/apache/airflow/pull/50009. >>> However, given that many countries will be on holiday on May 1 due to >>> Labour Day, and some teething issues were fixed yesterday, we will let it >>> run for a few more days so other committers and contributors can get a >>> chance to try it out and share their experience after the >> experiment/trial >>> is concluded. >>> On Wed, 30 Apr 2025 at 13:59, Kaxil Naik <kaxiln...@gmail.com> wrote: >>>> Whoops yeah. >>>>> Yep. Because it did not have all conversations resolved. We also have >>>> "require resolved conversation" as one of the branch protection >>>> conditions. >>>> I resolved the conversation and it got merged automatically. >>>> Let's adapt it when ready though, I don't see any urgency of getting >>>> enabling auto-merge or getting it contributed immediately to asf INFRA >> when >>>> it isn't critical. It is about priortization >>>> I'd say - if that is really bothering us - let's shape the feature >> rather >>>>> than outright reject it. >>>> On Wed, 30 Apr 2025 at 12:37, Jarek Potiuk <ja...@potiuk.com> wrote: >>>>> On Wed, Apr 30, 2025 at 7:55 AM Kaxil Naik <kaxiln...@gmail.com> >> wrote: >>>>>> To the point that the original PR is still not merged even after I >> had >>>>>> re-triggered the failed tests yesterday: >>>>>> https://github.com/apache/airflow/pull/49727 >>>>> Yep. Because it did not have all conversations resolved. We also have >>>>> "require resolved conversation" as one of the branch protection >>>>> conditions. >>>>> I resolved the conversation and it got merged automatically. >>>>>> On Wed, 30 Apr 2025 at 11:20, Kaxil Naik <kaxiln...@gmail.com> >> wrote: >>>>>>> The gitbox escape hatch isn't it though -- if we are to allow that >>>>> why >>>>>> not >>>>>>> just allow people to merge it directly from github that to go via >> an >>>>>>> "escape hatch". >>>>> Generally speaking GitHub has this option. Currently "admins" have a >>>>> possibility of overriding branch protection (via UI). And it would be >>>>> possible - if INFRA will allow it - to possibly add an .asf.yaml >> feature >>>>> to >>>>> also allow branch protection override for all committers or a subset of >>>>> the >>>>> committers (PMC Members ? ). This is more of a limitation of the >> current >>>>> implementation of permissions than a missing feature. If we all feel >> that >>>>> the gitbox escape hatch is not enough, we can likely even contribute >>>>> such a >>>>> feature to .asf.yaml - if INFRA will be ok with the option. It's very >>>>> easy >>>>> to contribute to - INFRA made it possible, we have a new framework: >>>>> https://github.com/apache/infrastructure-asfyaml - we can even >> implement >>>>> "airflow-only" .asf.yaml feature, that will not be initially available >> to >>>>> other ASF projects and later we can promote it to be available to >>>>> everyone. >>>>> I'd say - if that is really bothering us - let's shape the feature >> rather >>>>> than outright reject it. >>>>>>> I am -1 on this auto-merge feature >>>>> Understood :). But let's give it a bit more time as well and maybe >>>>> improve >>>>> it (see above) - unless we really feel we are blocked now - then it >>>>> should >>>>> be as easy as merging an .asf.yaml change to disable it. >>>>>>> On Wed, 30 Apr 2025 at 11:18, Kaxil Naik <kaxiln...@gmail.com> >>>>> wrote: >>>>>>>> That’s not a single person, it impacts the committers and the PR >>>>> author >>>>>>>> involved too. I don’t see how team productivity soars here. >>>>>>>> On Wed, 30 Apr 2025 at 02:39, Jarek Potiuk <ja...@potiuk.com> >>>>> wrote: >>>>>>>>> But yes, I also miss the previous "merge because I think it's >> safe" >>>>>>>>> workflow. >>>>>>>>> I badly miss it. Personally, It hurts my productivity. >>>>>>>>> But I think the "require status check" to be green is great for >>>>> "team >>>>>>>>> productivity". Usually when single person is impacted more than >>>>> team in >>>>>>>>> general, it's worse for the person impacted, but team >> productivity >>>>>> soars. >>>>>>>>> J. >>>>>>>>> On Tue, Apr 29, 2025 at 11:03 PM Jarek Potiuk <ja...@potiuk.com> >>>>>> wrote: >>>>>>>>>> Just to add comment: >>>>>>>>>> a) there was some instability of "celery/boto" hanging tests >>>>> today >>>>>>>>> that is >>>>>>>>>> rather difficult to address - but we worked around it by just >>>>>> removing >>>>>>>>>> "special-tests" from pre-requisite of merging >>>>>>>>>> b) GitHub today (like literally today!) started to be picky on >>>>> "too >>>>>>>>> many >>>>>>>>>> requests" - I addressed it today for both helm chart and our >>>>> release >>>>>>>>> tests >>>>>>>>>> (we are using bearer-token to authenticate and increase the >>>>> limit - >>>>>>>>> and we >>>>>>>>>> added cache on downloading json schema that was downloaded >>>>> "per-test" >>>>>>>>>> c) in cases like the one mentioned above with intermittent >>>>> failures - >>>>>>>>>> simple "rerun failed jobs" (assuming it will succeed after >>>>> rerun) - >>>>>> is >>>>>>>>>> essentially equivalent of "merge" (unless it fails again which >>>>> for me >>>>>>>>> is a >>>>>>>>>> signal of "DO NOT MERGE") >>>>>>>>>> d) we always have the "gitbox" escape hatch - that allows any >>>>>>>>> committer to >>>>>>>>>> push the fix directly, bypassing the limits: >>>>>>>>>> This is a simple thing for committers: >>>>>>>>>> git add remote gitbox >>>>>> https://gitbox.apache.org/repos/asf/airflow.git >>>>>>>>>> git fetch gitbox >>>>>>>>>> git commit --amend ("add #PR number") >>>>>>>>>> git push gitbox BRANCH_NAME:main (you need to provide your >>>>> apache id >>>>>>>>> and >>>>>>>>>> password) >>>>>>>>>> This is a nice escape hatch that we can use as "exceptional >>>>>> workflow" - >>>>>>>>>> and it works - I did it quite a few times over the last few >>>>> days. Not >>>>>>>>> UI >>>>>>>>>> controlled, but IMHO exceptional workflow should be - well - >>>>>>>>> exceptional. >>>>>>>>>> J. >>>>>>>>>> On Tue, Apr 29, 2025 at 8:52 PM Kaxil Naik < >> kaxiln...@gmail.com> >>>>>>>>> wrote: >>>>>>>>>>> Similar experience as Elad, I am in favor of disabling it tbh. >>>>> For >>>>>>>>>>> example, >>>>>>>>>>> https://github.com/apache/airflow/pull/49727 has a failing >>>>> test as >>>>>>>>> below >>>>>>>>>>> -- >>>>>>>>>>> which is not an issue, and test passes locally so I would want >>>>> to >>>>>>>>> merge it >>>>>>>>>>> but I can't. >>>>>>>>>>> FAILED >> helm-tests/tests/helm_tests/airflow_aux/test_basic_helm_chart.py::TestBaseChartTest::test_priority_classes >>>>>>>>>>> - requests.exceptions.HTTPError: 429 Client Error: Too Many >>>>> Requests >>>>>>>>> for >>>>>>>>>>> url: >> https://raw.githubusercontent.com/yannh/kubernetes-json-schema/master/v1.29.1-standalone-strict/priorityclass-scheduling-v1.json >>>>>>>>>>> On Tue, 29 Apr 2025 at 18:29, Jarek Potiuk <ja...@potiuk.com> >>>>>> wrote: >>>>>>>>>>>> On Tue, Apr 29, 2025 at 1:46 PM Elad Kalif < >>>>> elad...@apache.org> >>>>>>>>> wrote: >>>>>>>>>>>>> Thanks for that Jarek! >>>>>>>>>>>>> I find the lack of ability to merge PRs fast very limiting >>>>> but >>>>>> it >>>>>>>>>>> might >>>>>>>>>>>> be >>>>>>>>>>>>> just something to get used to. >>>>>>>>>>>> Indeed. I also see it, but also I got a few manually pushed >>>>> "must >>>>>>>>> fix >>>>>>>>>>>> quickly" to gitbox, and actually I find it really nice - >>>>> because >>>>>> it >>>>>>>>> is >>>>>>>>>>>> still possible, but it require some extra effort and >>>>> deliberate >>>>>> "ok >>>>>>>>> that >>>>>>>>>>>> one really should be pushed to unblock everyone" - as long >> as >>>>> we >>>>>> all >>>>>>>>>>>> (especially those people that are active in the >>>>>>>>>>>> #internal-airfow-ci-cd channel) know how to do it and can >> fix >>>>>> things >>>>>>>>>>>> quickly, this is actually a nice way to make it into >>>>> "exceptional" >>>>>>>>>>> workflow >>>>>>>>>>>> - which will push us more in making sure airflow main is >>>>> really >>>>>>>>> "green" >>>>>>>>>>> - >>>>>>>>>>>> which ultimately is our goal to make it as green as possible >>>>> all >>>>>> the >>>>>>>>>>> time. >>>>>>>>>>>> What **might help with that** (and also keeping the "enable >>>>> auto >>>>>>>>> merge" >>>>>>>>>>>> might motivate it more to) is to: >>>>>>>>>>>> * speed up the builds - we MUST prioritise now ARC (K8S >>>>>> self-hosted >>>>>>>>>>>> runners) to make our builds simply faster - I started a >>>>> discussion >>>>>>>>> and a >>>>>>>>>>>> small group of people who can work together to complete it >>>>> after >>>>>>>>>>> Hussein's >>>>>>>>>>>> POC) >>>>>>>>>>>> * speed up the image release - with ARM runners (which we >>>>> might be >>>>>>>>> able >>>>>>>>>>> to >>>>>>>>>>>> do independently as recently I think we have >>>>> hypervisor-enabled >>>>>> ARM >>>>>>>>>>> images >>>>>>>>>>>> available as public runners as github made it generally >>>>>> available). >>>>>>>>>>>> * speed up the doc builds for airflow-site - we (mainly >>>>> Pavan) are >>>>>>>>>>> close to >>>>>>>>>>>> complete offloading of the historical release docs to S3 >> and I >>>>>> hope >>>>>>>>> it >>>>>>>>>>> will >>>>>>>>>>>> cut down a lot on doc publishing workflows. >>>>>>>>>>>> J, --------------------------------------------------------------------- To unsubscribe, e-mail: dev-unsubscr...@airflow.apache.org For additional commands, e-mail: dev-h...@airflow.apache.org