Hi Boschi,

I support this plan, and I think we need to put some effort into making the Pulsar CI more stable.

Thanks,
Xiaoyu Hou

Nicolò Boschi <boschi1...@gmail.com> wrote on Mon, Jul 11, 2022 at 22:14:

> Hi all,
>
> I'd like to start a discussion about the stability of Pulsar CI.
>
> It is common for some test suites in our CI to time out. This is because
> when a test fails the entire suite is retried from the beginning (max 3
> times). (example:
> https://github.com/apache/pulsar/runs/7281063499?check_suite_focus=true)
>
> The command-line retries may sound helpful in making the CI green for a
> given pull request, but they actually hide test failures (which may be
> flaky tests or real issues!!).
>
> Another issue is that you can't easily see the failed test, and most of the
> time the quickest solution is just to blindly restart the failed jobs. This
> is not the correct behaviour and it will make the CI less stable over time.
>
> The plan would be:
> - Remove the retries (see https://github.com/apache/pulsar/pull/16524)
> - Create issues for flaky tests
> - Fix them / move them to quarantine
>
> WDYT?
>
> Thanks,
> Nicolò Boschi