Hi all,

The refactored Pulsar CI workflow, which combines multiple jobs into a single 
workflow, has a small limitation when re-running failed jobs (builds that fail 
because of flaky tests).

The limitation is that you need to wait for all jobs in the workflow to 
complete before the failed jobs can be re-run.
Yesterday GitHub Actions had an issue and the build queue was several hours 
long. When there's enough build capacity and no build queue, the new workflow 
finishes in about 1 hour 20 minutes.

Re-running failed jobs can be requested by commenting "/pulsarbot 
rerun-failure-checks" on the PR. The comment has no effect while any of the 
jobs in the workflow is still executing.
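
As a rough illustration of the rule above, here is a minimal Python sketch 
that decides whether a rerun request could take effect. It assumes job records 
shaped like the entries in the GitHub REST API job listing for a workflow run 
(a "status" field and a "conclusion" field). The function name 
can_rerun_failed_jobs and the sample data are made up for this example and are 
not part of pulsarbot.

```python
def can_rerun_failed_jobs(jobs):
    """Return True when every job has finished and at least one has failed.

    `jobs` is a list of dicts with a "status" key ("queued", "in_progress"
    or "completed") and a "conclusion" key (for example "success" or
    "failure"; None while the job has not finished).
    """
    all_done = all(job["status"] == "completed" for job in jobs)
    any_failed = any(job.get("conclusion") == "failure" for job in jobs)
    return all_done and any_failed


# Hypothetical sample data: one job failed, another is still running.
still_running = [
    {"name": "unit-tests", "status": "completed", "conclusion": "failure"},
    {"name": "integration-tests", "status": "in_progress", "conclusion": None},
]

# Hypothetical sample data: every job has finished, one of them failed.
finished = [
    {"name": "unit-tests", "status": "completed", "conclusion": "failure"},
    {"name": "integration-tests", "status": "completed", "conclusion": "success"},
]

print(can_rerun_failed_jobs(still_running))  # rerun comment would be a no-op
print(can_rerun_failed_jobs(finished))       # rerun comment can take effect
```

In the first case the rerun comment would have to be posted again after the 
remaining job finishes.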

Another source of confusion has been the new test reporting, which shows all 
test results and test failures as checks and annotations in the GitHub UI.

Here's an example:
https://github.com/apache/pulsar/pull/14805/checks?check_run_id=5777139002

Because of a limitation in GitHub Actions, test reports get attached to the 
first workflow when a PR triggers more than one workflow. We still have 
multiple workflows, so the test reports end up attached to the "CI - CPP, 
Python Tests" workflow. Failed tests show up as red check marks. When a test 
is retried, it might have succeeded in a later attempt even though the check 
still shows as failed. This won't prevent merging the PR. Please keep this 
small detail in mind when interpreting the build results.
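
To make the retry case concrete, here is a small Python sketch of how one 
might read a test's attempts. It is only an illustration: the function 
classify_test and the list-of-booleans input are assumptions made for this 
example, not something the test reporting produces.

```python
def classify_test(attempts):
    """Classify a test from its ordered attempt results (True means passed)."""
    if not attempts:
        raise ValueError("at least one attempt is required")
    if all(attempts):
        return "passed"
    if attempts[-1]:
        # Failed at least once and then passed: the check for the failed
        # attempt may still show up red in the GitHub UI.
        return "passed on retry"
    return "failed"


print(classify_test([True]))          # passed
print(classify_test([False, True]))   # passed on retry
print(classify_test([False, False]))  # failed
```

A red check that falls into the "passed on retry" case is the one to 
double-check before assuming the build is broken.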

The test reports are very verbose at the moment, which is a problem when 
checking the PR build results in the GitHub Mobile app. I have created a PR to 
reduce the test reporting in the GitHub Actions UI:
https://github.com/apache/pulsar/pull/14959

Please let me know if there are any other questions or problems that have come 
up with the new refactored Pulsar CI GitHub Actions workflow.

-Lari
