Hey Viktor,

I am all for speeding up the tests. Running `:core:integrationTest` already takes an absurd amount of time, and it will only keep growing if we don't do anything about it.

Having said that, I am very worried that your proposal might significantly increase the flakiness of current and future tests. Test flakiness is a huge problem we're already battling. We rarely get green PR builds; it is very common for one or two flaky tests to fail in each PR. We have also found it hard to get a green build for the 2.2 release (https://jenkins.confluent.io/job/apache-kafka-test/job/2.2/).
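To make the trade-off concrete, here is a rough sketch in plain Java. It uses no JUnit and no real brokers: `FakeCluster`, `SharedClusterSketch` and all method names are hypothetical stand-ins, not anything from the Kafka code base. It only illustrates the two lifecycles (one cluster per test versus one cluster per class) and how state left behind in a shared cluster can make a later test fail.

```java
import java.util.HashSet;
import java.util.Set;

public class SharedClusterSketch {

    /** Hypothetical stand-in for an embedded cluster; counts how often one is started. */
    static final class FakeCluster {
        static int startCount = 0;
        private final Set<String> topics = new HashSet<>();

        FakeCluster() {
            startCount++; // in reality this is the expensive part
        }

        void createTopic(String name) {
            if (!topics.add(name)) {
                throw new IllegalStateException("topic already exists: " + name);
            }
        }
    }

    /** One fresh cluster per test method: slow, but every test starts clean. */
    static int runWithClusterPerTest(int testCount) {
        FakeCluster.startCount = 0;
        for (int i = 0; i < testCount; i++) {
            FakeCluster cluster = new FakeCluster();
            cluster.createTopic("test-topic"); // same name is fine, cluster is new
        }
        return FakeCluster.startCount;
    }

    /** One cluster per class: a single startup, each test must use its own topic. */
    static int runWithClusterPerClass(int testCount) {
        FakeCluster.startCount = 0;
        FakeCluster cluster = new FakeCluster();
        for (int i = 0; i < testCount; i++) {
            cluster.createTopic("test-topic-" + i); // unique name per test
        }
        return FakeCluster.startCount;
    }

    /** Two tests that reuse a topic name on a shared cluster interfere with each other. */
    static boolean reusedTopicNameCollides() {
        FakeCluster cluster = new FakeCluster();
        cluster.createTopic("test-topic");
        try {
            cluster.createTopic("test-topic");
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("per-test cluster starts:  " + runWithClusterPerTest(38));
        System.out.println("per-class cluster starts: " + runWithClusterPerClass(38));
        System.out.println("reused topic collides:    " + reusedTopicNameCollides());
    }
}
```

The last method is the part I'm nervous about: with a shared cluster, any test that forgets to isolate or clean up its resources can break a test that runs after it, and that kind of failure tends to depend on execution order.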

On Wed, Feb 27, 2019 at 11:09 AM Viktor Somogyi-Vass <viktorsomo...@gmail.com> wrote:

> Hi Folks,
>
> I've been observing lately that unit tests usually take 2.5 hours to run
> and a very big portion of these are the core tests, where a new cluster is
> spun up for every test. This takes most of the time. I ran a test
> (TopicCommandWithAdminClient, with 38 tests inside) through the profiler
> and it shows, for instance, that running the whole class took 10 minutes
> and 37 seconds, where the useful time was 5 minutes 18 seconds. That's a
> 100% overhead. Without the profiler the whole class takes 7 minutes and 48
> seconds, so the useful time would be between 3 and 4 minutes. This is a
> bigger test though; most of them won't take this much.
> There are 74 classes that implement KafkaServerTestHarness, and just
> running :core:integrationTest takes almost 2 hours.
>
> I think we could greatly speed up these integration tests by creating
> the cluster once per class and running the tests as separate methods. I
> know that this somewhat contradicts the principle that tests should
> be independent, but recreating the cluster for each test seems to be a
> very expensive operation. Also, if the tests act on different resources
> (different topics, etc.) then it might not hurt their independence. There
> might be cases, of course, where this is not possible, but I think there
> could be a lot where it is.
>
> In the optimal case we could cut the testing time by approximately an
> hour. This would save resources and give quicker feedback for PR builds.
>
> What are your thoughts?
> Has anyone thought about this or were there any attempts made?
>
> Best,
> Viktor

--
Best,
Stanislav