Looks like those tickets were filed after GEODE-1233 was completed.

-Kirk
On Mon, May 2, 2016 at 1:42 PM, Dan Smith <[email protected]> wrote:

> testMultipleCacheServer *is* annotated as a flaky test. Maybe you aren't
> actually excluding anything?
>
> I'm surprised testTombstones is not annotated as a flaky test. We have at
> least 3 bugs all related to this method that are still open: GEODE-1285,
> GEODE-1332, GEODE-1287.
>
> -Dan
>
> On Mon, May 2, 2016 at 11:25 AM, Anthony Baker <[email protected]> wrote:
>
>> I have results from 10 runs of all the tests excluding @FlakyTest. These
>> are the only failures:
>>
>> ubuntu@ip-172-31-44-240:~$ grep FAILED incubator-geode/nohup.out | grep gemfire
>> com.gemstone.gemfire.internal.cache.wan.CacheClientNotifierDUnitTest > testMultipleCacheServer FAILED
>> com.gemstone.gemfire.internal.cache.wan.CacheClientNotifierDUnitTest > testMultipleCacheServer FAILED
>> com.gemstone.gemfire.internal.cache.wan.CacheClientNotifierDUnitTest > testMultipleCacheServer FAILED
>> com.gemstone.gemfire.cache30.DistributedAckPersistentRegionCCEDUnitTest > testTombstones FAILED
>> com.gemstone.gemfire.internal.cache.wan.CacheClientNotifierDUnitTest > testMultipleCacheServer FAILED
>> com.gemstone.gemfire.internal.cache.wan.CacheClientNotifierDUnitTest > testMultipleCacheServer FAILED
>> com.gemstone.gemfire.internal.cache.wan.parallel.ParallelWANStatsDUnitTest > testParallelPropagationHA FAILED
>>
>> Anthony
>>
>>> On Apr 27, 2016, at 7:22 PM, Kirk Lund <[email protected]> wrote:
>>>
>>> We currently have over 10,000 tests but only about 147 are annotated
>>> with FlakyTest. It probably wouldn't cause precheckin to take much
>>> longer. My main argument for separating the FlakyTests into their own
>>> Jenkins build job is to get the main build job 100% green while we know
>>> the FlakyTest build job might "flicker".
>>> -Kirk
>>>
>>> On Tue, Apr 26, 2016 at 1:58 PM, Udo Kohlmeyer <[email protected]> wrote:
>>>
>>>> Depending on the amount of "flaky" tests, this should not increase the
>>>> time too much. I foresee these "flaky" tests to be few and far between.
>>>> Over time I imagine this would be a last resort if we cannot fix the
>>>> test or even improve the test harness to have a clean test space for
>>>> each test.
>>>>
>>>> --Udo
>>>>
>>>> On 27/04/2016 6:42 am, Jens Deppe wrote:
>>>>
>>>>> By running the Flakes with forkEvery 1, won't it extend precheckin by
>>>>> a fair bit? I'd prefer to see two separate builds running.
>>>>>
>>>>> On Tue, Apr 26, 2016 at 11:53 AM, Kirk Lund <[email protected]> wrote:
>>>>>
>>>>>> I'm in favor of running the FlakyTests together at the end of
>>>>>> precheckin using forkEvery 1 on them too.
>>>>>>
>>>>>> What about running two nightly builds? One that runs all the
>>>>>> non-flaky UnitTests, IntegrationTests and DistributedTests, plus
>>>>>> another nightly build that runs only FlakyTests? We can run Jenkins
>>>>>> jobs on our local machines that separate FlakyTests out into their
>>>>>> own job too, but I'd like to see the main nightly build go to 100%
>>>>>> green (if that's even possible without encountering many more
>>>>>> flickering tests).
>>>>>>
>>>>>> -Kirk
>>>>>>
>>>>>> On Tue, Apr 26, 2016 at 11:02 AM, Dan Smith <[email protected]> wrote:
>>>>>>
>>>>>>> +1 for separating these out and running them with forkEvery 1.
>>>>>>>
>>>>>>> I think they should probably still run as part of precheckin and
>>>>>>> the nightly builds though. We don't want this to turn into
>>>>>>> essentially disabling and ignoring these tests.
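The forkEvery 1 split being discussed above could be wired up in Gradle roughly as follows. This is only a sketch: the flakyTest task name and the category's fully qualified class name are assumptions, not the project's actual build configuration.

```groovy
// Sketch: keep FlakyTest-annotated tests out of the main test task and run
// them in their own task, with a fresh JVM per test class via forkEvery 1.
// The category class name below is an assumption.
test {
    useJUnit {
        excludeCategories 'com.gemstone.gemfire.test.junit.categories.FlakyTest'
    }
}

task flakyTest(type: Test) {
    useJUnit {
        includeCategories 'com.gemstone.gemfire.test.junit.categories.FlakyTest'
    }
    forkEvery 1  // clean JVM for every test class, reducing test pollution
}
```

With this split, the main `test` task can stay green while `flakyTest` is allowed to flicker in a separate build job.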
>>>>>>> -Dan
>>>>>>>
>>>>>>> On Tue, Apr 26, 2016 at 10:28 AM, Kirk Lund <[email protected]> wrote:
>>>>>>>
>>>>>>>> Also, I don't think there's much value continuing to use the "CI"
>>>>>>>> label. If a test fails in Jenkins, then run the test to see if it
>>>>>>>> fails consistently. If it doesn't, it's flaky. The developer
>>>>>>>> looking at it should try to determine the cause of the failure
>>>>>>>> (ie, "it uses thread sleeps or random ports with BindExceptions or
>>>>>>>> has short timeouts with probable GC pause") and include that info
>>>>>>>> when adding the FlakyTest annotation and filing a Jira bug with
>>>>>>>> the Flaky label. If the test fails consistently, then file a Jira
>>>>>>>> bug without the Flaky label.
>>>>>>>>
>>>>>>>> -Kirk
>>>>>>>>
>>>>>>>> On Tue, Apr 26, 2016 at 10:24 AM, Kirk Lund <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> There are quite a few test classes that have multiple test
>>>>>>>>> methods which are annotated with the FlakyTest category.
>>>>>>>>>
>>>>>>>>> More thoughts:
>>>>>>>>>
>>>>>>>>> In general, I think that if any given test fails intermittently
>>>>>>>>> then it is a FlakyTest. A good test should either pass or fail
>>>>>>>>> consistently. After annotating a test method with FlakyTest, the
>>>>>>>>> developer should then add the Flaky label to the corresponding
>>>>>>>>> Jira ticket. What we then do with the Jira tickets (ie, fix them)
>>>>>>>>> is probably more important than deciding if a test is flaky or
>>>>>>>>> not.
>>>>>>>>> Rather than try to come up with some flaky process for
>>>>>>>>> determining if a given test is flaky (ie, "does it have thread
>>>>>>>>> sleeps?"), it would be better to have a wiki page that has
>>>>>>>>> examples of flakiness and how to fix them ("if the test has
>>>>>>>>> thread sleeps, then switch to using Awaitility and do this...").
>>>>>>>>>
>>>>>>>>> -Kirk
>>>>>>>>>
>>>>>>>>> On Mon, Apr 25, 2016 at 10:51 PM, Anthony Baker <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> Thanks Kirk!
>>>>>>>>>>
>>>>>>>>>> ~/code/incubator-geode (develop)$ grep -ro "FlakyTest.class" . | grep -v Binary | wc -l | xargs echo "Flake factor:"
>>>>>>>>>> Flake factor: 136
>>>>>>>>>>
>>>>>>>>>> Anthony
>>>>>>>>>>
>>>>>>>>>> On Apr 25, 2016, at 9:45 PM, William Markito <[email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>> +1
>>>>>>>>>>>
>>>>>>>>>>> Are we also planning to automate the additional build task
>>>>>>>>>>> somehow?
>>>>>>>>>>>
>>>>>>>>>>> I'd also suggest creating a wiki page with some stats (like how
>>>>>>>>>>> many FlakyTests we currently have) and the idea behind this
>>>>>>>>>>> effort so we can keep track and see how it's evolving over
>>>>>>>>>>> time.
>>>>>>>>>>>
>>>>>>>>>>> On Mon, Apr 25, 2016 at 6:54 PM, Kirk Lund <[email protected]> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> After completing GEODE-1233, all currently known flickering
>>>>>>>>>>>> tests are now annotated with our FlakyTest JUnit Category.
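The "switch to using Awaitility" advice above boils down to replacing a fixed Thread.sleep() with bounded polling: Awaitility's await().atMost(...).until(...) does essentially this with a richer API. A minimal plain-JDK sketch of the same pattern; the AwaitExample class and awaitTrue helper below are hypothetical, not Geode code:

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.BooleanSupplier;

public class AwaitExample {
    // Poll a condition until it holds, failing only after a generous
    // timeout, instead of a fixed sleep that is too short on a busy
    // machine and too long everywhere else.
    public static void awaitTrue(BooleanSupplier condition, long timeoutMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (!condition.getAsBoolean()) {
            if (System.currentTimeMillis() > deadline) {
                throw new AssertionError(
                    "condition not met within " + timeoutMillis + " ms");
            }
            try {
                Thread.sleep(100);  // short poll interval, not the whole wait
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new AssertionError("interrupted while waiting", e);
            }
        }
    }

    public static void main(String[] args) {
        AtomicBoolean done = new AtomicBoolean(false);
        // Simulate an async task completing after a short delay.
        new Thread(() -> {
            try { Thread.sleep(200); } catch (InterruptedException ignored) { }
            done.set(true);
        }).start();
        awaitTrue(done::get, 10_000);
        System.out.println("done=" + done.get());
    }
}
```

The test stays fast when the condition is met quickly, yet tolerates a slow CI machine up to the full timeout.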
>>>>>>>>>>>> In an effort to divide our build up into multiple build
>>>>>>>>>>>> pipelines that are sequential and dependable, we could
>>>>>>>>>>>> consider excluding FlakyTests from the primary integrationTest
>>>>>>>>>>>> and distributedTest tasks. An additional build task would then
>>>>>>>>>>>> execute all of the FlakyTests separately. This would hopefully
>>>>>>>>>>>> help us get to a point where we can depend on our primary
>>>>>>>>>>>> testing tasks staying green 100% of the time. We would then
>>>>>>>>>>>> prioritize fixing the FlakyTests and, one by one, removing the
>>>>>>>>>>>> FlakyTest category from them.
>>>>>>>>>>>>
>>>>>>>>>>>> I would also suggest that we execute the FlakyTests with
>>>>>>>>>>>> "forkEvery 1" to give each test a clean JVM or set of
>>>>>>>>>>>> DistributedTest JVMs. That would hopefully decrease the chance
>>>>>>>>>>>> of a GC pause or test pollution causing flickering failures.
>>>>>>>>>>>>
>>>>>>>>>>>> Having reviewed lots of test code and failure stacks, I
>>>>>>>>>>>> believe that the primary causes of FlakyTests are timing
>>>>>>>>>>>> sensitivity (thread sleeps, nothing that waits for async
>>>>>>>>>>>> activity, or timeouts and sleeps that are insufficient on a
>>>>>>>>>>>> busy CPU, during heavy I/O, or during a GC pause) and random
>>>>>>>>>>>> ports via AvailablePort (instead of using zero for an
>>>>>>>>>>>> ephemeral port).
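On the random-ports point above: binding to port 0 lets the OS pick a free ephemeral port atomically, which avoids the pick-then-bind race (and resulting BindExceptions) that comes from choosing a "free" port up front and binding later. A small plain-java.net illustration; it does not use Geode's AvailablePort API:

```java
import java.io.IOException;
import java.net.ServerSocket;

public class EphemeralPortExample {
    public static void main(String[] args) throws IOException {
        // Port 0 asks the OS to assign any free ephemeral port at bind time,
        // so no other process can steal the port between "pick" and "bind".
        try (ServerSocket server = new ServerSocket(0)) {
            int port = server.getLocalPort();  // the port the OS actually assigned
            System.out.println("listening on ephemeral port " + port);
            // Pass `port` to the client side of the test instead of a
            // pre-picked number.
        }
    }
}
```

The only requirement is that the test reads the assigned port back and hands it to whatever needs to connect, rather than assuming a number chosen in advance.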
>>>>>>>>>>>> Opinions or ideas? Hate it? Love it?
>>>>>>>>>>>>
>>>>>>>>>>>> -Kirk
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> ~/William
