Igniters,

I am working on the stability of our TC test runs.

Some of our execution timeouts (hangs, unexpected stops) are caused by
issues in the source code or the environment: the tests themselves, test
runners, configurations, bugs, the Linux OOM killer and so on.

We could fix them by changing code.

But almost all of the recent timeouts happened because many tests were
running disk-intensive operations on the same machine.

Examples:

https://ci.ignite.apache.org/viewLog.html?buildId=1543562&tab=buildResultsDiv&buildTypeId=IgniteTests24Java8_ZooKeeperDiscovery2
https://ci.ignite.apache.org/viewLog.html?buildId=1543518&tab=buildResultsDiv&buildTypeId=IgniteTests24Java8_Basic1

and so on.

To fix this problem, I propose extracting the persistence tests from
"Run Basic" and "Run Cache" into new dedicated TC configurations.

Also, I would add a check that prevents new tests with persistence from
being added to other TC configurations in the future.
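
To make the idea of such a check more concrete, here is a rough sketch in
plain Java. Everything in it is hypothetical: the class name, the TestInfo
holder, the usesPersistence flag and the names of the dedicated
configurations are made up for illustration and do not exist in our code
base. How to detect reliably that a test uses persistence is left open.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical guard: reports persistence tests listed outside the dedicated suites. */
final class PersistenceSuiteGuard {
    /** Assumed names of the new dedicated TC configurations (not real ones). */
    static final Set<String> DEDICATED = new HashSet<>(
        Arrays.asList("Run Basic (Persistence)", "Run Cache (Persistence)"));

    /** Minimal stand-in for a test class and whether it enables persistence. */
    static final class TestInfo {
        final String name;
        final boolean usesPersistence;

        TestInfo(String name, boolean usesPersistence) {
            this.name = name;
            this.usesPersistence = usesPersistence;
        }
    }

    /** Returns "suite: test" entries for persistence tests found in non-dedicated suites. */
    static List<String> findViolations(Map<String, List<TestInfo>> suites) {
        List<String> violations = new ArrayList<>();

        for (Map.Entry<String, List<TestInfo>> e : suites.entrySet()) {
            // Dedicated suites are allowed to contain persistence tests.
            if (DEDICATED.contains(e.getKey()))
                continue;

            for (TestInfo t : e.getValue()) {
                if (t.usesPersistence)
                    violations.add(e.getKey() + ": " + t.name);
            }
        }

        return violations;
    }

    public static void main(String[] args) {
        Map<String, List<TestInfo>> suites = new LinkedHashMap<>();

        suites.put("Run Basic", Arrays.asList(
            new TestInfo("InMemoryTest", false),
            new TestInfo("NewWalTest", true)));
        suites.put("Run Basic (Persistence)", Arrays.asList(
            new TestInfo("WalRecoveryTest", true)));

        // A real check would fail the build when this list is not empty.
        System.out.println(findViolations(suites));
    }
}
```

A check like this could run as an ordinary test in one of the fast
configurations, so a misplaced test is reported before it can cause a
timeout on a shared agent.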

This would allow us to run almost all TC configurations on any agent, while
the configurations with persistence would have agent rules so that they do
not time out.

Thoughts?
