On Sat, 2 Nov 2024 at 02:32, Heesung Sohn <hees...@apache.org> wrote: > > Hi, > > I agree with your points that there are tradeoffs between in-memory vs > container test clusters, > and that we should add more "unit" tests to cover behavior and avoid > deep state checks. > We should smartly move/add integ tests in in-memory or container test > clusters, without increasing the test execution time. >
Documenting the target state in a PIP would be useful. One of the roles of PIPs is for documenting designs. However, doing this doesn't have to happen before starting the work to extract the integration test libraries. It will be helpful for designing future improvements and making conscious decisions about the direction. > Also, I think there are some rooms to optimize our integ tests. > For example, many of the integ tests can run in parallel, under the > same cluster. Yes, there's room for improvement. For pure integration tests, the test execution time hasn't been a huge issue for Pulsar CI for some time. There's a lot of resources available in Apache's GitHub account and there is the possibility to split test jobs even further to run in parallel in independent test jobs. I'm not sure if parallel execution in a single test job would be extremely helpful due to the limited resources of each GitHub Actions runner VM. The resources have increased. The GitHub Actions runner VM specs are at https://docs.github.com/en/actions/using-github-hosted-runners/using-github-hosted-runners/about-github-hosted-runners#standard-github-hosted-runners-for-public-repositories . The runner VM has 4 CPUs, 16 GB RAM, 14 GB SSD . It used to be 2 CPUs and 7GB RAM before. The resources have doubled since then. We could run more tests in parallel. However I don't think that pointing tests to a single cluster would be a useful solution due to test state management. Tests become brittle if there isn't sufficient isolation of state between test runs. The results are very nondeterministic. Sharing environments is a very different direction than the move towards deterministic simulation where all parameters are controlled in each test run. In deterministic testing, state management is completely controlled and managed to always produce the same results with the same input. > We probably need to add test cases more often instead of creating new > test classes. > Also, we should reuse PulsarClient and PulsarAdmin as much as > possible, as initiating them in each test can add several seconds > more. > Some of the test classes can be merged, and the clusters can be shared > among them too. I don't agree directly on this point. Technical details shouldn't impact test design. If there's a need to cache some resources across multiple tests, that could be done in other ways. For example, Spring's testing framework has the "DirtiesContext" annotation to control Spring's application context caching. (https://docs.spring.io/spring-framework/reference/testing/annotations/integration-spring/annotation-dirtiescontext.html) . For Pulsar, our unit tests are integration tests in many cases. The improvements should go in PulsarTestContext and possibly building a caching solution for Junit 5 so that a context could be shared when the state of the in-memory Pulsar broker isn't relevant for a particular test and state isolation can be handled in other ways such as unique topic names for each test case as we already do. For the actual integration tests, we should decouple the Pulsar cluster initialization from the test classes. This way we won't have to start merging test classes and other tricks because of technical reasons. I think that a PIP to document the design would be useful so that we can find the proper direction which will be feasible also in the future. > > I consider my proposal received positive feedback in this thread, so I > will move on and raise a PR for this change. Yes, extracting the integration test libraries is something that will be useful. I'd also recommend taking it to the direction where test resource initialization is decoupled from the base test classes. That's something that has been done with the PulsarTestContext solution for the "unit tests" (which are integration tests too). This could happen iteratively so that the first step is to simply decouple the test framework and the test classes in tests/integration. -Lari