Depending on the number of "flaky" tests, this should not increase the
time too much.
I foresee these "flaky" tests being few and far between. Over time I
imagine this would be a last resort when we cannot fix the test or even
improve the test harness to provide a clean test space for each test.
--Udo
On 27/04/2016 6:42 am, Jens Deppe wrote:
Won't running the Flakes with forkEvery 1 extend precheckin by a fair
bit? I'd prefer to see two separate builds running.
On Tue, Apr 26, 2016 at 11:53 AM, Kirk Lund <[email protected]> wrote:
I'm in favor of running the FlakyTests together at the end of precheckin
using forkEvery 1 on them too.
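For concreteness, here's a rough sketch of what a separate Gradle task with forkEvery 1 might look like. This is a hypothetical fragment, not the actual Geode build config; the task name and the category class name are placeholders:

```groovy
// Hypothetical sketch; category class name is a placeholder.
test {
    useJUnit {
        // Keep the primary test task free of known-flaky tests.
        excludeCategories 'com.example.test.junit.categories.FlakyTest'
    }
}

task flakyTest(type: Test) {
    useJUnit {
        includeCategories 'com.example.test.junit.categories.FlakyTest'
    }
    // Start a fresh JVM for every test class to reduce test pollution.
    forkEvery 1
}
```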
What about running two nightly builds? One that runs all the non-flaky
UnitTests, IntegrationTests and DistributedTests. Plus another nightly
build that runs only FlakyTests? We can run Jenkins jobs on our local
machines that separate FlakyTests out into their own job too, but I'd
like to see the main nightly build go to 100% green (if that's even
possible without encountering many more flickering tests).
-Kirk
On Tue, Apr 26, 2016 at 11:02 AM, Dan Smith <[email protected]> wrote:
+1 for separating these out and running them with forkEvery 1.
I think they should probably still run as part of precheckin and the
nightly builds though. We don't want this to turn into essentially
disabling and ignoring these tests.
-Dan
On Tue, Apr 26, 2016 at 10:28 AM, Kirk Lund <[email protected]> wrote:
Also, I don't think there's much value in continuing to use the "CI"
label. If a test fails in Jenkins, then run the test to see if it fails
consistently. If it doesn't, it's flaky. The developer looking at it
should try to determine the cause of the failure (e.g., "it uses thread
sleeps, or random ports with BindExceptions, or has short timeouts with
probable GC pause") and include that info when adding the FlakyTest
annotation and filing a Jira bug with the Flaky label. If the test fails
consistently, then file a Jira bug without the Flaky label.
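As an illustration of that annotate-and-document pattern, here is a minimal sketch. The Category annotation and FlakyTest interface below are local stand-ins so the snippet compiles without JUnit on the classpath (in real code they would be JUnit's org.junit.experimental.categories.Category and the project's FlakyTest marker interface), and the test method name and ticket id are hypothetical:

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;

// Stand-ins for JUnit's Category annotation and Geode's FlakyTest marker,
// so this sketch is self-contained.
@Retention(RetentionPolicy.RUNTIME)
@interface Category { Class<?>[] value(); }
interface FlakyTest {}

class FlakyExampleTest {
    // Hypothetical flaky test; GEODE-1234 is a placeholder ticket id.
    // Suspected cause recorded next to the annotation, per the proposal:
    // uses a thread sleep and a hard-coded port; see Jira for details.
    @Category(FlakyTest.class)
    public void testAsyncListener() {
        // test body elided
    }

    // Helper: reflectively check whether a public method carries the
    // FlakyTest category.
    static boolean isFlaky(String methodName) {
        try {
            Category c = FlakyExampleTest.class
                    .getMethod(methodName)
                    .getAnnotation(Category.class);
            if (c == null) {
                return false;
            }
            for (Class<?> cat : c.value()) {
                if (cat == FlakyTest.class) {
                    return true;
                }
            }
            return false;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("flaky? " + isFlaky("testAsyncListener"));
    }
}
```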
-Kirk
On Tue, Apr 26, 2016 at 10:24 AM, Kirk Lund <[email protected]> wrote:
There are quite a few test classes that have multiple test methods
annotated with the FlakyTest category.
More thoughts:
In general, I think that if any given test fails intermittently then it
is a FlakyTest. A good test should either pass or fail consistently.
After annotating a test method with FlakyTest, the developer should then
add the Flaky label to the corresponding Jira ticket. What we then do
with the Jira tickets (i.e., fix them) is probably more important than
deciding whether a test is flaky or not.
Rather than try to come up with some flaky process for determining
whether a given test is flaky (e.g., "does it have thread sleeps?"), it
would be better to have a wiki page with examples of flakiness and how
to fix them ("if the test has thread sleeps, then switch to using
Awaitility and do this...").
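To make the Awaitility suggestion concrete, here is a minimal plain-JDK sketch of the underlying polling idea. Awaitility's real API is along the lines of await().atMost(...).until(...); the helper below is just a hand-rolled stand-in to show why polling with a deadline beats a fixed sleep:

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.BooleanSupplier;

class AwaitSketch {
    // Poll the condition until it's true or the timeout elapses, instead of
    // guessing a fixed Thread.sleep() duration that a GC pause or a busy
    // CPU can invalidate.
    static void awaitTrue(BooleanSupplier condition, long timeoutMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (!condition.getAsBoolean()) {
            if (System.currentTimeMillis() > deadline) {
                throw new AssertionError(
                        "condition not met within " + timeoutMillis + " ms");
            }
            try {
                // Short poll interval, not an estimate of completion time.
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new AssertionError("interrupted while awaiting condition", e);
            }
        }
    }

    public static void main(String[] args) {
        AtomicBoolean done = new AtomicBoolean(false);
        new Thread(() -> {
            try {
                Thread.sleep(200); // simulated async work
            } catch (InterruptedException ignored) {
            }
            done.set(true);
        }).start();
        // Generous deadline; returns as soon as the condition holds.
        awaitTrue(done::get, 5000);
        System.out.println("async work observed");
    }
}
```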
-Kirk
On Mon, Apr 25, 2016 at 10:51 PM, Anthony Baker <[email protected]>
wrote:
Thanks Kirk!
~/code/incubator-geode (develop)$ grep -ro "FlakyTest.class" . | grep -v Binary | wc -l | xargs echo "Flake factor:"
Flake factor: 136
Anthony
On Apr 25, 2016, at 9:45 PM, William Markito <[email protected]>
wrote:
+1
Are we also planning to automate the additional build task somehow?
I'd also suggest creating a wiki page with some stats (like how many
FlakyTests we currently have) and the idea behind this effort so we can
keep track and see how it's evolving over time.
On Mon, Apr 25, 2016 at 6:54 PM, Kirk Lund <[email protected]>
wrote:
After completing GEODE-1233, all currently known flickering tests are
now annotated with our FlakyTest JUnit Category.
In an effort to divide our build up into multiple build pipelines that
are sequential and dependable, we could consider excluding FlakyTests
from the primary integrationTest and distributedTest tasks. An
additional build task would then execute all of the FlakyTests
separately. This would hopefully help us get to a point where we can
depend on our primary testing tasks staying green 100% of the time. We
would then prioritize fixing the FlakyTests and, one by one, remove the
FlakyTest category from them.
I would also suggest that we execute the FlakyTests with "forkEvery 1"
to give each test a clean JVM or set of DistributedTest JVMs. That
would hopefully decrease the chance of a GC pause or test pollution
causing flickering failures.
Having reviewed lots of test code and failure stacks, I believe the
primary causes of FlakyTests are timing sensitivity (thread sleeps or
nothing that waits for async activity, timeouts or sleeps that are
insufficient on a busy CPU, during I/O, or during a GC pause) and random
ports via AvailablePort (instead of using zero for an ephemeral port).
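The ephemeral-port alternative can be sketched with plain JDK sockets: bind to port 0 and let the OS choose, rather than picking a "random free" port first and binding later, which races with other processes. The class name here is illustrative:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.ServerSocket;

class EphemeralPortExample {
    // Binding to port 0 asks the OS for a free ephemeral port atomically,
    // avoiding the pick-then-bind race that "find a random free port"
    // helpers (like AvailablePort) are prone to.
    static int bindEphemeral() {
        try (ServerSocket socket = new ServerSocket(0)) {
            return socket.getLocalPort();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Note: closing the socket and then reusing its port number
        // reintroduces a small race; the robust pattern is to hand port 0
        // to the server itself and ask it which port it actually bound.
        System.out.println("OS-assigned port: " + bindEphemeral());
    }
}
```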
Opinions or ideas? Hate it? Love it?
-Kirk
--
~/William