+1

The retry that repeats the whole Maven job often hides the real test failures from the first failed job.

--
Matteo Merli <mme...@apache.org>

On Mon, Jul 11, 2022 at 9:06 PM Anon Hxy <anonhx...@gmail.com> wrote:
>
> Hi Boschi
>
> I support this plan, and I think we need to put some effort into making
> the Pulsar CI more stable.
>
> Thanks,
> Xiaoyu Hou
>
> On Mon, Jul 11, 2022 at 22:14, Nicolò Boschi <boschi1...@gmail.com> wrote:
>
> > Hi all,
> >
> > I'd like to start a discussion about the stability of Pulsar CI.
> >
> > It is common for some test suites in our CI to time out. This is because
> > when a test fails, the entire suite is retried from the beginning (max 3
> > times). (example:
> > https://github.com/apache/pulsar/runs/7281063499?check_suite_focus=true)
> >
> > The command-line retries may sound helpful in making the CI green for a
> > given pull request, but they actually hide test failures (which may be
> > flaky tests or real issues!).
> >
> > Another issue is that you can't easily see which test failed, and most of
> > the time the quickest solution is just to blindly restart the failed jobs.
> > This is not the correct behaviour, and it will make the CI less stable
> > over time.
> >
> > The plan would be:
> > - Remove the retries (see https://github.com/apache/pulsar/pull/16524)
> > - Create issues for flaky tests
> > - Fix them / move them to quarantine
> >
> > WDYT?
> >
> > Thanks,
> > Nicolò Boschi