I agree the split is a good idea. That said, I think the "bigger" problem right now is the Java unit tests job. This job takes ~90 min and currently has the higher chance of failure. Most of the time (and all of the flakiness) is in the broker tests module, which is essentially used as a midway point between unit and integration tests. I'm trying to come up with a reasonable plan to gradually improve the situation there, without losing sanity...
Regarding integration, I think most of the time is spent starting the containers for each new test class. Since the integration tests are already in a (very) much better state than our unit tests, perhaps we could have a single, statically initialized cluster that is shared across many tests, each using its own namespaces/topics. This way we can still have multiple test classes (which makes the code and the results easier to understand) and only pay the startup price once.

Finally, this will be easier to start with only the tests that don't involve killing broker containers, so the Jenkins jobs could probably be split based on that.

On Sat, Feb 9, 2019 at 10:47 AM Eren Avsarogullari <erenavsarogull...@gmail.com> wrote:
>
> +1
>
> Also, i have suggested 'quarantined-list' feature for Apache Heron to
> manage flaky integration-tests as follow. As the long-term, it can also be
> useful for Pulsar if similar feature is already not used.
> https://github.com/apache/incubator-heron/issues/2865
>
> On Sat, 9 Feb 2019 at 04:54, Sanjeev Kulkarni <sanjee...@gmail.com> wrote:
>
> > +1
> >
> > On Fri, Feb 8, 2019 at 8:52 PM Sijie Guo <guosi...@gmail.com> wrote:
> >
> > > Hi all,
> > >
> > > Integration job has been a pain point for merging pull requests. The total
> > > run time of the integration job typically is around an hour. If an
> > > integration test is failing, retrigger the job requires another hour to
> > > run. rerunning the job will take up the docker resources on jenkins nodes.
> > >
> > > I am thinking of breaking down the current integration job into multiple
> > > smaller jobs. This can achieve by specifying different test suite files
> > > using system property. I have an outstanding pull request to introduce an
> > > `integrationTestSuiteFile` system property.
> > > https://github.com/apache/pulsar/pull/3558/
> > >
> > > The initial set of test suites I can think of are:
> > >
> > > - cli test suite: all cli related tests
> > > - function thread test suite: all tests related functions in thread mode
> > > - function process test suite: all tests related functions in process mode
> > > - sql test suite: pulsar sql related tests
> > > - storage test suite: tiered storage related tests.
> > >
> > > Any thoughts? If this is a good direction to go, I can break down the
> > > integration job once PR#3558 is merged.
> > >
> > > - Sijie
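
For illustration only, the shared-cluster idea from the reply at the top of this message could look roughly like the sketch below. It is not taken from the Pulsar code base: `SharedClusterSketch`, `Cluster`, `sharedCluster()` and `newNamespace()` are hypothetical names, the `Cluster` class is a plain-Java stand-in for whatever actually starts the containers, and the `public/...` namespace strings are just example values. The only point is the pattern: the cluster is created once per JVM on first use, and each test class asks it for its own namespace.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class SharedClusterSketch {

    /** Hypothetical stand-in for an expensive, container-backed cluster. */
    static final class Cluster {
        // Counts how many times a cluster was "started" in this JVM.
        static final AtomicInteger STARTS = new AtomicInteger();
        private final AtomicInteger namespaceSeq = new AtomicInteger();

        Cluster() {
            // Assumption: a real suite would start its containers here.
            STARTS.incrementAndGet();
        }

        /** Hands out a unique namespace so tests sharing the cluster do not collide. */
        String newNamespace(String testClass) {
            return "public/" + testClass + "-" + namespaceSeq.incrementAndGet();
        }
    }

    /** Initialization-on-demand holder: the JVM builds the cluster once, on first access. */
    private static final class Holder {
        static final Cluster INSTANCE = new Cluster();
    }

    static Cluster sharedCluster() {
        return Holder.INSTANCE;
    }

    public static void main(String[] args) {
        // Two different "test classes" reuse the same cluster but get separate namespaces.
        String cliNamespace = sharedCluster().newNamespace("CliTest");
        String sqlNamespace = sharedCluster().newNamespace("SqlTest");
        System.out.println(cliNamespace);
        System.out.println(sqlNamespace);
        System.out.println("cluster starts: " + Cluster.STARTS.get());
    }
}
```

Tests that kill broker containers would not fit this pattern, since they would break the cluster for everyone else sharing it; those would still need a cluster of their own.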