+1

On Mon, Apr 25, 2016 at 6:54 PM, Kirk Lund <[email protected]> wrote:
> After completing GEODE-1233, all currently known flickering tests are now
> annotated with our FlakyTest JUnit Category.
>
> In an effort to divide our build up into multiple build pipelines that are
> sequential and dependable, we could consider excluding FlakyTests from the
> primary integrationTest and distributedTest tasks. An additional build task
> would then execute all of the FlakyTests separately. This would hopefully
> help us get to a point where we can depend on our primary testing tasks
> staying green 100% of the time. We would then prioritize fixing the
> FlakyTests and one by one removing the FlakyTest category from them.
>
> I would also suggest that we execute the FlakyTests with "forkEvery 1" to
> give each test a clean JVM or set of DistributedTest JVMs. That would
> hopefully decrease the chance of a GC pause or test pollution causing
> flickering failures.
>
> Having reviewed lots of test code and failure stacks, I believe that the
> primary causes of FlakyTests are timing sensitivity (thread sleeps, or
> nothing that waits for async activity; timeouts or sleeps that are
> insufficient on a busy CPU or I/O or during a GC pause) and random ports
> via AvailablePort (instead of using zero for ephemeral port).
>
> Opinions or ideas? Hate it? Love it?
>
> -Kirk
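
To make the two root causes named above concrete, here is a minimal sketch in plain Java. It assumes nothing about Geode's own test utilities: the class name FlakyTestSketch and both helper methods are hypothetical, invented only for illustration. The first helper polls a condition up to a deadline rather than relying on one fixed sleep, and the second binds to port zero so the OS assigns a free ephemeral port.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.ServerSocket;
import java.util.function.BooleanSupplier;

// Hypothetical illustration class; not part of Geode.
class FlakyTestSketch {

    // Poll the condition until it holds or the timeout elapses.
    // A fixed Thread.sleep either wastes time or is too short on a busy
    // machine; polling with a generous deadline tolerates slow runs while
    // returning as soon as the async work has finished.
    static boolean awaitCondition(BooleanSupplier condition, long timeoutMillis, long pollMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (System.nanoTime() < deadline) {
            if (condition.getAsBoolean()) {
                return true;
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve interrupt status
                return false;
            }
        }
        // One last check in case the condition became true right at the deadline.
        return condition.getAsBoolean();
    }

    // Bind to port 0 so the OS picks a free ephemeral port. The socket stays
    // open, so no other process can grab the port between "find a free port"
    // and "bind to it". The caller reads the real port via getLocalPort().
    static ServerSocket openOnEphemeralPort() {
        try {
            return new ServerSocket(0);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A test would call openOnEphemeralPort(), pass getLocalPort() to whatever client needs to connect, and use awaitCondition(...) with a timeout well above the expected duration in place of a bare sleep.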
