I have results from 10 runs of all the tests excluding @FlakyTest. These are the only failures:
ubuntu@ip-172-31-44-240:~$ grep FAILED incubator-geode/nohup.out | grep gemfire
com.gemstone.gemfire.internal.cache.wan.CacheClientNotifierDUnitTest > testMultipleCacheServer FAILED
com.gemstone.gemfire.internal.cache.wan.CacheClientNotifierDUnitTest > testMultipleCacheServer FAILED
com.gemstone.gemfire.internal.cache.wan.CacheClientNotifierDUnitTest > testMultipleCacheServer FAILED
com.gemstone.gemfire.cache30.DistributedAckPersistentRegionCCEDUnitTest > testTombstones FAILED
com.gemstone.gemfire.internal.cache.wan.CacheClientNotifierDUnitTest > testMultipleCacheServer FAILED
com.gemstone.gemfire.internal.cache.wan.CacheClientNotifierDUnitTest > testMultipleCacheServer FAILED
com.gemstone.gemfire.internal.cache.wan.parallel.ParallelWANStatsDUnitTest > testParallelPropagationHA FAILED

Anthony

> On Apr 27, 2016, at 7:22 PM, Kirk Lund <[email protected]> wrote:
>
> We currently have over 10,000 tests but only about 147 are annotated with
> FlakyTest. It probably wouldn't cause precheckin to take much longer. My
> main argument for separating the FlakyTests into their own Jenkins build
> job is to get the main build job 100% green while we know the FlakyTest
> build job might "flicker".
>
> -Kirk
>
>
> On Tue, Apr 26, 2016 at 1:58 PM, Udo Kohlmeyer <[email protected]> wrote:
>
>> Depending on the amount of "flaky" tests, this should not increase the
>> time too much.
>> I foresee these "flaky" tests to be few and far between. Over time I
>> imagine this would be a last resort if we cannot fix the test or even
>> improve the test harness to have a clean test space for each test.
>>
>> --Udo
>>
>>
>> On 27/04/2016 6:42 am, Jens Deppe wrote:
>>
>>> By running the Flakes with forkEvery 1, won't it extend precheckin by a
>>> fair bit? I'd prefer to see two separate builds running.
>>>
>>> On Tue, Apr 26, 2016 at 11:53 AM, Kirk Lund <[email protected]> wrote:
>>>
>>>> I'm in favor of running the FlakyTests together at the end of precheckin,
>>>> using forkEvery 1 on them too.
>>>>
>>>> What about running two nightly builds? One that runs all the non-flaky
>>>> UnitTests, IntegrationTests and DistributedTests, plus another nightly
>>>> build that runs only FlakyTests? We can run Jenkins jobs on our local
>>>> machines that separate FlakyTests out into their own job too, but I'd
>>>> like to see the main nightly build go to 100% green (if that's even
>>>> possible without encountering many more flickering tests).
>>>>
>>>> -Kirk
>>>>
>>>>
>>>> On Tue, Apr 26, 2016 at 11:02 AM, Dan Smith <[email protected]> wrote:
>>>>
>>>>> +1 for separating these out and running them with forkEvery 1.
>>>>>
>>>>> I think they should probably still run as part of precheckin and the
>>>>> nightly builds though. We don't want this to turn into essentially
>>>>> disabling and ignoring these tests.
>>>>>
>>>>> -Dan
>>>>>
>>>>> On Tue, Apr 26, 2016 at 10:28 AM, Kirk Lund <[email protected]> wrote:
>>>>>
>>>>>> Also, I don't think there's much value in continuing to use the "CI"
>>>>>> label. If a test fails in Jenkins, then run the test to see if it
>>>>>> fails consistently. If it doesn't, it's flaky. The developer looking
>>>>>> at it should try to determine the cause of it failing (i.e., "it uses
>>>>>> thread sleeps or random ports with BindExceptions or has short
>>>>>> timeouts with probable GC pause") and include that info when adding
>>>>>> the FlakyTest annotation and filing a Jira bug with the Flaky label.
>>>>>> If the test fails consistently, then file a Jira bug without the
>>>>>> Flaky label.
>>>>>>
>>>>>> -Kirk
>>>>>>
>>>>>>
>>>>>> On Tue, Apr 26, 2016 at 10:24 AM, Kirk Lund <[email protected]> wrote:
>>>>>>
>>>>>>> There are quite a few test classes that have multiple test methods
>>>>>>> which are annotated with the FlakyTest category.
>>>>>>>
>>>>>>> More thoughts:
>>>>>>>
>>>>>>> In general, I think that if any given test fails intermittently then
>>>>>>> it is a FlakyTest. A good test should either pass or fail
>>>>>>> consistently. After annotating a test method with FlakyTest, the
>>>>>>> developer should then add the Flaky label to the corresponding Jira
>>>>>>> ticket. What we then do with the Jira tickets (i.e., fix them) is
>>>>>>> probably more important than deciding if a test is flaky or not.
>>>>>>>
>>>>>>> Rather than try to come up with some flaky process for determining
>>>>>>> if a given test is flaky (i.e., "does it have thread sleeps?"), it
>>>>>>> would be better to have a wiki page that has examples of flakiness
>>>>>>> and how to fix them ("if the test has thread sleeps, then switch to
>>>>>>> using Awaitility and do this...").
>>>>>>>
>>>>>>> -Kirk
>>>>>>>
>>>>>>>
>>>>>>> On Mon, Apr 25, 2016 at 10:51 PM, Anthony Baker <[email protected]> wrote:
>>>>>>>
>>>>>>>> Thanks Kirk!
>>>>>>>>
>>>>>>>> ~/code/incubator-geode (develop)$ grep -ro "FlakyTest.class" . |
>>>>>>>>   grep -v Binary | wc -l | xargs echo "Flake factor:"
>>>>>>>> Flake factor: 136
>>>>>>>>
>>>>>>>> Anthony
>>>>>>>>
>>>>>>>>
>>>>>>>>> On Apr 25, 2016, at 9:45 PM, William Markito <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>> +1
>>>>>>>>>
>>>>>>>>> Are we also planning to automate the additional build task somehow?
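[Editor's note: Kirk's wiki-page example above ("if the test has thread sleeps, then switch to using Awaitility and do this...") could be sketched as follows. Awaitility's `await().atMost(...).until(...)` provides exactly this polling pattern; the plain-JDK helper below shows the same idea without the library dependency. Class and method names are hypothetical, not from the Geode code base.]

```java
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

public class AwaitSketch {

    // Poll until the condition holds or the timeout elapses, instead of a
    // single fixed Thread.sleep() that may be too short on a busy CPU or
    // during a GC pause. Awaitility offers this with richer diagnostics.
    static boolean awaitTrue(BooleanSupplier condition, long timeoutMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() >= deadline) {
                return false;
            }
            try {
                Thread.sleep(50); // short poll interval, not a blind wait
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    public static void main(String[] args) {
        long start = System.currentTimeMillis();
        // The condition becomes true after ~200 ms; we proceed as soon as
        // it does, rather than sleeping for a pessimistic fixed interval.
        boolean ok = awaitTrue(() -> System.currentTimeMillis() - start >= 200, 5_000);
        System.out.println(ok); // prints "true"
    }
}
```

A flaky `Thread.sleep(5000); assertTrue(condition);` becomes `assertTrue(awaitTrue(condition, 30_000));` with a long, generous timeout that costs nothing when the condition is met early.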
>>>>>>>>>
>>>>>>>>> I'd also suggest creating a wiki page with some stats (like how
>>>>>>>>> many FlakyTests we currently have) and the idea behind this effort
>>>>>>>>> so we can keep track and see how it's evolving over time.
>>>>>>>>>
>>>>>>>>> On Mon, Apr 25, 2016 at 6:54 PM, Kirk Lund <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> After completing GEODE-1233, all currently known flickering tests
>>>>>>>>>> are now annotated with our FlakyTest JUnit Category.
>>>>>>>>>>
>>>>>>>>>> In an effort to divide our build up into multiple build pipelines
>>>>>>>>>> that are sequential and dependable, we could consider excluding
>>>>>>>>>> FlakyTests from the primary integrationTest and distributedTest
>>>>>>>>>> tasks. An additional build task would then execute all of the
>>>>>>>>>> FlakyTests separately. This would hopefully help us get to a
>>>>>>>>>> point where we can depend on our primary testing tasks staying
>>>>>>>>>> green 100% of the time. We would then prioritize fixing the
>>>>>>>>>> FlakyTests and one by one remove the FlakyTest category from them.
>>>>>>>>>>
>>>>>>>>>> I would also suggest that we execute the FlakyTests with
>>>>>>>>>> "forkEvery 1" to give each test a clean JVM or set of
>>>>>>>>>> DistributedTest JVMs. That would hopefully decrease the chance of
>>>>>>>>>> a GC pause or test pollution causing flickering failures.
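[Editor's note: as a rough sketch of Kirk's proposal (exclude the category from the primary test tasks, run it in a separate task with forkEvery 1), a Gradle fragment might look like the following. This is hypothetical, not Geode's actual build script; the category class name is assumed from the FlakyTest annotation discussed in the thread.]

```groovy
// Assumed fully qualified name of the JUnit category.
def flaky = 'com.gemstone.gemfire.test.junit.categories.FlakyTest'

// Keep the primary test task green by excluding the flaky category.
test {
    useJUnit { excludeCategories flaky }
}

// Run only the flaky tests, with a fresh JVM per test class.
task flakyTest(type: Test) {
    useJUnit { includeCategories flaky }
    forkEvery 1 // clean JVM per test class reduces GC pressure and test pollution
}
```

Note that `forkEvery` counts test classes, not test methods, so `forkEvery 1` gives each test class its own JVM.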
>>>>>>>>>>
>>>>>>>>>> Having reviewed lots of test code and failure stacks, I believe
>>>>>>>>>> that the primary causes of FlakyTests are timing sensitivity
>>>>>>>>>> (thread sleeps, or nothing that waits for async activity, or
>>>>>>>>>> timeouts and sleeps that are insufficient on a busy CPU or I/O or
>>>>>>>>>> during a GC pause) and random ports via AvailablePort (instead of
>>>>>>>>>> using zero for an ephemeral port).
>>>>>>>>>>
>>>>>>>>>> Opinions or ideas? Hate it? Love it?
>>>>>>>>>>
>>>>>>>>>> -Kirk
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> ~/William
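[Editor's note: the random-port cause Kirk mentions can be avoided by binding to port 0 so the OS assigns a free ephemeral port, rather than probing for a "free" port up front and racing other processes to bind it. A minimal JDK-only sketch; the class and method names are hypothetical, not Geode's AvailablePort replacement.]

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.ServerSocket;

public class EphemeralPortExample {

    // Bind to port 0 and let the OS hand out a free ephemeral port. A scan
    // for an available port can race with other processes and fail later
    // with a BindException; binding to 0 cannot.
    static int startOnEphemeralPort() {
        try (ServerSocket socket = new ServerSocket(0)) {
            // A real server would keep this socket open and serve from it;
            // here we only report which port the OS chose.
            return socket.getLocalPort();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int port = startOnEphemeralPort();
        System.out.println(port > 0 && port <= 65535); // prints "true"
    }
}
```

In a test, the server is started on port 0 and the actual port is read back from the socket (or the server API) and passed to the client, so no two tests ever contend for the same hard-coded or pre-scanned port.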
