Won't running the flaky tests with forkEvery 1 extend precheckin by a fair bit? I'd prefer to see two separate builds running.
On Tue, Apr 26, 2016 at 11:53 AM, Kirk Lund <[email protected]> wrote:

> I'm in favor of running the FlakyTests together at the end of precheckin,
> using forkEvery 1 on them too.
>
> What about running two nightly builds? One that runs all the non-flaky
> UnitTests, IntegrationTests and DistributedTests, plus another nightly
> build that runs only FlakyTests? We can run Jenkins jobs on our local
> machines that separate FlakyTests out into their own job too, but I'd like
> to see the main nightly build go to 100% green (if that's even possible
> without encountering many more flickering tests).
>
> -Kirk
>
> On Tue, Apr 26, 2016 at 11:02 AM, Dan Smith <[email protected]> wrote:
>
>> +1 for separating these out and running them with forkEvery 1.
>>
>> I think they should probably still run as part of precheckin and the
>> nightly builds though. We don't want this to turn into essentially
>> disabling and ignoring these tests.
>>
>> -Dan
>>
>> On Tue, Apr 26, 2016 at 10:28 AM, Kirk Lund <[email protected]> wrote:
>>
>>> Also, I don't think there's much value in continuing to use the "CI"
>>> label. If a test fails in Jenkins, run the test to see if it fails
>>> consistently. If it doesn't, it's flaky. The developer looking at it
>>> should try to determine the cause of the failure (e.g., "it uses thread
>>> sleeps or random ports with BindExceptions, or has short timeouts with a
>>> probable GC pause") and include that info when adding the FlakyTest
>>> annotation and filing a Jira bug with the Flaky label. If the test fails
>>> consistently, then file a Jira bug without the Flaky label.
>>>
>>> -Kirk
>>>
>>> On Tue, Apr 26, 2016 at 10:24 AM, Kirk Lund <[email protected]> wrote:
>>>
>>>> There are quite a few test classes that have multiple test methods
>>>> annotated with the FlakyTest category.
>>>>
>>>> More thoughts:
>>>>
>>>> In general, I think that if any given test fails intermittently then it
>>>> is a FlakyTest. A good test should either pass or fail consistently.
>>>> After annotating a test method with FlakyTest, the developer should
>>>> then add the Flaky label to the corresponding Jira ticket. What we then
>>>> do with the Jira tickets (i.e., fix them) is probably more important
>>>> than deciding whether a test is flaky or not.
>>>>
>>>> Rather than try to come up with some flaky process for determining
>>>> whether a given test is flaky (e.g., "does it have thread sleeps?"), it
>>>> would be better to have a wiki page with examples of flakiness and how
>>>> to fix them ("if the test has thread sleeps, then switch to using
>>>> Awaitility and do this...").
>>>>
>>>> -Kirk
>>>>
>>>> On Mon, Apr 25, 2016 at 10:51 PM, Anthony Baker <[email protected]> wrote:
>>>>
>>>>> Thanks Kirk!
>>>>>
>>>>> ~/code/incubator-geode (develop)$ grep -ro "FlakyTest.class" . | grep -v Binary | wc -l | xargs echo "Flake factor:"
>>>>> Flake factor: 136
>>>>>
>>>>> Anthony
>>>>>
>>>>> On Apr 25, 2016, at 9:45 PM, William Markito <[email protected]> wrote:
>>>>>
>>>>>> +1
>>>>>>
>>>>>> Are we also planning to automate the additional build task somehow?
>>>>>>
>>>>>> I'd also suggest creating a wiki page with some stats (like how many
>>>>>> FlakyTests we currently have) and the idea behind this effort, so we
>>>>>> can keep track and see how it's evolving over time.
>>>>>>
>>>>>> On Mon, Apr 25, 2016 at 6:54 PM, Kirk Lund <[email protected]> wrote:
>>>>>>
>>>>>>> After completing GEODE-1233, all currently known flickering tests
>>>>>>> are now annotated with our FlakyTest JUnit Category.
>>>>>>>
>>>>>>> In an effort to divide our build up into multiple build pipelines
>>>>>>> that are sequential and dependable, we could consider excluding
>>>>>>> FlakyTests from the primary integrationTest and distributedTest
>>>>>>> tasks. An additional build task would then execute all of the
>>>>>>> FlakyTests separately. This would hopefully help us get to a point
>>>>>>> where we can depend on our primary testing tasks staying green 100%
>>>>>>> of the time. We would then prioritize fixing the FlakyTests and, one
>>>>>>> by one, removing the FlakyTest category from them.
>>>>>>>
>>>>>>> I would also suggest that we execute the FlakyTests with "forkEvery
>>>>>>> 1" to give each test a clean JVM or set of DistributedTest JVMs.
>>>>>>> That would hopefully decrease the chance of a GC pause or test
>>>>>>> pollution causing flickering failures.
>>>>>>>
>>>>>>> Having reviewed lots of test code and failure stacks, I believe that
>>>>>>> the primary causes of FlakyTests are timing sensitivity (thread
>>>>>>> sleeps, or nothing that waits for async activity, or timeouts and
>>>>>>> sleeps that are insufficient on a busy CPU, under heavy I/O, or
>>>>>>> during a GC pause) and random ports via AvailablePort (instead of
>>>>>>> using zero for an ephemeral port).
>>>>>>>
>>>>>>> Opinions or ideas? Hate it? Love it?
>>>>>>>
>>>>>>> -Kirk
>>>>>>
>>>>>> --
>>>>>> ~/William
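The build-task split and "forkEvery 1" idea discussed in the thread could be wired up in Gradle roughly as below. This is a minimal sketch, not the actual Geode build: the fully qualified category class name and the `flakyTest` task name are assumptions for illustration.

```groovy
// Sketch: exclude FlakyTest-annotated tests from the primary test task,
// and run them in a separate task with a fresh JVM per test class.
// The category class name below is an assumption, not Geode's real one.
test {
    useJUnit {
        excludeCategories 'com.gemstone.gemfire.test.junit.categories.FlakyTest'
    }
}

task flakyTest(type: Test) {
    useJUnit {
        includeCategories 'com.gemstone.gemfire.test.junit.categories.FlakyTest'
    }
    // "forkEvery 1": a clean JVM for each test class, reducing test
    // pollution and GC-pause interference between tests.
    forkEvery = 1
}
```

A nightly Jenkins job could then invoke `gradle test` and `gradle flakyTest` as two separate builds, matching the two-builds preference voiced at the top of the thread.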
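On the random-ports point: binding to port 0 asks the OS for a free ephemeral port atomically, avoiding the find-then-bind race (and the resulting BindExceptions) of an AvailablePort-style scan. A minimal stdlib-only illustration; the class and method names are invented for the example:

```java
import java.io.IOException;
import java.net.ServerSocket;

// Illustrates the "use zero for an ephemeral port" suggestion: the OS
// assigns a free port at bind time, so two tests can never pick the
// same "available" port and then race to bind it.
public class EphemeralPortExample {

    // Bind to port 0, report the OS-assigned port, then release it.
    static int osAssignedPort() throws IOException {
        try (ServerSocket socket = new ServerSocket(0)) {
            return socket.getLocalPort();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("OS assigned ephemeral port " + osAssignedPort());
    }
}
```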
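On the thread-sleep point, where the thread suggests switching to Awaitility: the underlying pattern can be sketched with the standard library alone. This is not Awaitility's API, just the idea of polling a condition until a deadline instead of sleeping a fixed, guessed duration that may be too short on a busy machine:

```java
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

// Stdlib-only sketch of an Awaitility-style wait: poll a condition
// until it holds or the deadline passes, rather than Thread.sleep()ing
// a fixed amount and hoping async work has finished.
public class AwaitSketch {

    public static boolean await(BooleanSupplier condition, long timeoutMillis)
            throws InterruptedException {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        while (System.nanoTime() < deadline) {
            if (condition.getAsBoolean()) {
                return true; // condition met before the deadline
            }
            Thread.sleep(50); // short poll interval, not a guess at total latency
        }
        return condition.getAsBoolean(); // one last check at the deadline
    }

    public static void main(String[] args) throws InterruptedException {
        long start = System.currentTimeMillis();
        boolean ok = await(() -> System.currentTimeMillis() - start > 200, 5000);
        System.out.println(ok ? "condition met" : "timed out");
    }
}
```

The test stays fast when the condition is met quickly, and only consumes the full timeout in the failure case, which is exactly what a fixed sleep cannot do.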
