We will need to take action to fix our CI problems. See 
https://issues.apache.org/jira/browse/INFRA-23633?focusedCommentId=17600749&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17600749

Jarek has done a large amount of GitHub Actions work with Apache Airflow, and 
his suggestions might be helpful. One of them was Apache Yetus. I think he 
means using the Maven plugins - 
https://yetus.apache.org/documentation/0.14.0/yetus-maven-plugin/


> On Sep 6, 2022, at 4:48 AM, Lari Hotari <lhot...@apache.org> wrote:
> 
> The Apache Infra ticket is https://issues.apache.org/jira/browse/INFRA-23633 
> . 
> 
> -Lari
> 
> On 2022/09/06 11:36:46 Lari Hotari wrote:
>> I asked for an update on the Apache org GitHub Actions usage stats from 
>> Gavin McDonald on the-asf slack in this thread: 
>> https://the-asf.slack.com/archives/CBX4TSBQ8/p1662464113873539?thread_ts=1661512133.913279&cid=CBX4TSBQ8
>>  .
>> 
>> I hope we get this issue resolved since it delays PR processing a lot.
>> 
>> -Lari
>> 
>> On 2022/09/06 11:16:07 Lari Hotari wrote:
>>> Pulsar CI continues to be congested, and the build queue [1] is very long: 
>>> there are currently 147 build jobs in the queue and 16 jobs in progress.
>>> 
>>> I would strongly advise everyone to use "personal CI" to mitigate the long 
>>> delay in CI feedback. You can simply open a PR to your own personal fork of 
>>> apache/pulsar to run the builds in your "personal CI". There are more 
>>> details in the previous emails in this thread.
>>> 
>>> -Lari
>>> 
>>> [1] - build queue: 
>>> https://github.com/apache/pulsar/actions?query=is%3Aqueued
>>> 
>>> On 2022/08/30 12:39:19 Lari Hotari wrote:
>>>> Pulsar CI continues to be congested, and the build queue is long.
>>>> 
>>>> I would strongly advise everyone to use "personal CI" to mitigate the 
>>>> long delay in CI feedback. You can simply open a PR to your own personal 
>>>> fork of apache/pulsar to run the builds in your "personal CI". There are 
>>>> more details in the previous email in this thread.
>>>> 
>>>> Some updates:
>>>> 
>>>> There has been a discussion with Gavin McDonald from ASF infra on the-asf 
>>>> slack about getting usage reports from GitHub to support the 
>>>> investigation. The Slack thread is the same one mentioned in the previous 
>>>> email, https://the-asf.slack.com/archives/CBX4TSBQ8/p1661512133913279 . 
>>>> Gavin already requested the usage report in the GitHub UI, but it 
>>>> produced invalid results.
>>>> 
>>>> I made a change to mitigate a source of additional GitHub Actions 
>>>> overhead. 
>>>> In the past, each cherry-picked commit to a maintenance branch of Pulsar 
>>>> has triggered a lot of workflow runs. 
>>>> 
>>>> The solution for cancelling duplicate builds automatically is to add this 
>>>> concurrency block to the workflow definition:
>>>> concurrency:
>>>>  group: ${{ github.workflow }}-${{ github.ref }}
>>>>  cancel-in-progress: true
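>>>> 
>>>> As a minimal sketch of where this block sits in a workflow file (the 
>>>> workflow name, triggers, and build step below are hypothetical 
>>>> placeholders, not the actual Pulsar workflow):
>>>> 
>>>> ```yaml
>>>> # Hypothetical workflow name; github.workflow expands to this value.
>>>> name: Example CI
>>>> on:
>>>>   push:
>>>>     branches:
>>>>       - branch-2.10
>>>> # Runs of this workflow on the same ref share one concurrency group.
>>>> concurrency:
>>>>   group: ${{ github.workflow }}-${{ github.ref }}
>>>>   cancel-in-progress: true
>>>> jobs:
>>>>   build:
>>>>     runs-on: ubuntu-latest
>>>>     steps:
>>>>       - uses: actions/checkout@v3
>>>>       # Placeholder build command, assumed for illustration only.
>>>>       - run: mvn -B verify
>>>> ```
>>>> 
>>>> With this in place, a new push to the same ref cancels the run that is 
>>>> still in progress for that workflow, since both runs fall into the same 
>>>> concurrency group.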
>>>> 
>>>> I added this to all maintenance branch GitHub Actions workflows:
>>>> 
>>>> branch-2.10 change:
>>>> https://github.com/apache/pulsar/commit/5d2c9851f4f4d70bfe74b1e683a41c5a040a6ca7
>>>> branch-2.9 change:
>>>> https://github.com/apache/pulsar/commit/3ea124924fecf636cc105de75c62b3a99050847b
>>>> branch-2.8 change:
>>>> https://github.com/apache/pulsar/commit/48187bb5d95e581f8322a019b61d986e18a31e54
>>>> branch-2.7 change:
>>>> https://github.com/apache/pulsar/commit/744b62c99344724eacdbe97c881311869d67f630
>>>> 
>>>> branch-2.11 already contains the necessary config for cancelling duplicate 
>>>> builds.
>>>> 
>>>> The benefit of the above change is that when multiple commits are 
>>>> cherry-picked to a branch at once, only the build of the last commit 
>>>> eventually runs; the builds for the intermediate commits are cancelled. 
>>>> The tradeoff is that we don't find out whether one of the earlier commits 
>>>> breaks the build. That's the cost we need to pay. In any case, our build 
>>>> is so flaky that it's hard to determine whether a failed build is caused 
>>>> only by a flaky test or is an actual failure. Because of this, we don't 
>>>> lose anything by cancelling builds. It's more important to save build 
>>>> resources. In the maintenance branches for 2.10 and older, the average 
>>>> total build time consumed is around 20 hours, which is a lot.
>>>> 
>>>> At this time, the overhead of maintenance branch builds doesn't seem to be 
>>>> the source of the problems. There must be some other issue which is 
>>>> possibly related to exceeding a usage quota. Hopefully we get the CI 
>>>> slowness issue solved asap.
>>>> 
>>>> BR,
>>>> 
>>>> Lari
>>>> 
>>>> 
>>>> On 2022/08/26 12:00:20 Lari Hotari wrote:
>>>>> Hi,
>>>>> 
>>>>> GitHub Actions builds have been piling up in the build queue in the last 
>>>>> few days.
>>>>> I posted on bui...@apache.org 
>>>>> https://lists.apache.org/thread/6lbqr0f6mqt9s8ggollp5kj2nv7rlo9s and 
>>>>> created INFRA ticket https://issues.apache.org/jira/browse/INFRA-23633 
>>>>> about this issue.
>>>>> There's also a thread on the-asf slack, 
>>>>> https://the-asf.slack.com/archives/CBX4TSBQ8/p1661512133913279 . 
>>>>> 
>>>>> It seems that our build queue is finally getting picked up, but it would 
>>>>> be great to find out whether we hit a quota and whether that is the 
>>>>> cause of the pauses. 
>>>>> 
>>>>> Another issue is that the master branch broke after merging 2 conflicting 
>>>>> PRs. 
>>>>> The fix is in https://github.com/apache/pulsar/pull/17300 . 
>>>>> 
>>>>> Merging PRs will be slow until we have these 2 problems solved and 
>>>>> existing PRs rebased over the changes. Let's prioritize merging #17300 
>>>>> before pushing more changes.
>>>>> 
>>>>> I'd like to point out that a good way to get build feedback before 
>>>>> sending a PR is to run builds on your personal GitHub Actions CI. The 
>>>>> benefit of this is that it doesn't consume the shared quota, and builds 
>>>>> usually start instantly.
>>>>> There are instructions in the contributors guide about this. 
>>>>> https://pulsar.apache.org/contributing/#ci-testing-in-your-fork
>>>>> You simply open PRs to your own fork of apache/pulsar to run builds on 
>>>>> your personal GitHub Actions CI.
>>>>> 
>>>>> BR,
>>>>> 
>>>>> Lari
