+1 from my side as well. I also like the suggestion to revert a patch causing new test failures instead of expecting the owner to fix the issues in follow-up jiras.
And I would also have a borderline "radical" suggestion: if a flaky test is failing and the committer isn't sure at a glance that the failure is benign (a stats diff failure, for example, I would consider low risk; a race condition, on the other hand, high risk), put an ignore annotation on it (a small jUnit sketch of what I mean is at the bottom of this mail). In my view a flaky test is pretty much useless. If it breaks, people will just ignore it (because it fails so often...), and anyway, how can you be sure that the green run is the standard and the failure is the exception, and not the other way around? We have thousands of tests; ignoring a dozen won't decrease coverage significantly, and it will take us that much closer to a point where we can demand a green pre-commit run.

Thanks

On Fri, Oct 14, 2016 at 9:45 AM, Prasanth Jayachandran <pjayachand...@hortonworks.com> wrote:

> +1 on the proposal. Adding more to it.
>
> A lot of time has been spent on improving the test runtime and bringing
> down the flaky tests. The following jiras should give an overview of the
> effort involved:
> https://issues.apache.org/jira/browse/HIVE-14547
> https://issues.apache.org/jira/browse/HIVE-13503
>
> Committers, please ensure that the reported failures are absolutely not
> related to the patch before committing it.
>
> I would also propose the following to keep the test suite clean, along
> with some tips for keeping test runs fast:
>
> 1) Revert a patch that is causing a failure. It should be the
> responsibility of the contributor to make sure the patch is not causing
> any failures. I am against creating follow-ups for fixing test failures,
> because they usually get ignored or given lower priority, causing wasted
> effort and time on failure analysis for every other developer waiting to
> commit their patch.
>
> 2) +1 from reviewers AFTER a clean green run, or once a reviewer is
> convinced that the test failures are unrelated. Maybe we should stop
> giving conditional +1s and wait for a clean green run.
>
> 3) Avoid slow tests (there is a jira to print out the runtime of newly
> added tests). In general, it's good to have many smaller tests as opposed
> to a single big test. If a qfile or junit test is big, splitting it up
> will help in parallelizing it and avoiding stragglers.
>
> 4) Avoid adding tests to MiniMr (slowest of all).
>
> 5) Try to keep the test runtime (see the surefire report to get the
> correct runtime without initialization time) under a minute.
>
> 6) Avoid adding more read-only tables to the init script, as this will
> increase the initialization time.
>
> 7) If the test case does not require an explain plan, avoid it, as most
> failures are explain diffs.
>
> 8) If the test case requires explain but does not depend on table or
> partition stats, explicitly set stats for the table or partition.
> Explicitly setting stats avoids expensive stats computation time and
> flakiness due to stats diffs.
>
> 9) Prefer jUnit over qtest.
>
> 10) Add an explicit timeout to jUnit tests to avoid indefinite hanging
> (surefire times out after 40 mins).
>
> Thoughts?
>
> Thanks
> Prasanth
>
> On Oct 13, 2016, at 11:10 PM, Siddharth Seth <ss...@apache.org> wrote:
>
> There's been a lot of work to make the test runs faster, as well as more
> reliable, via HIVE-14547, HIVE-13503, and several other jiras. Test
> runtimes are around the 1 hour mark, and going down. There were a few
> green pre-commit runs (after years?). At the same time, there are still
> some flaky tests.
> We really should try to keep the test runtimes down, as well as the
> number of failures, so that the pre-commit runs can provide useful
> information.
>
> I'm not sure what the current approach is w.r.t. pre-commit runs before a
> commit. What I've seen in other projects is that the pre-commit needs to
> run, and come back (mostly) clean, before a commit goes in. Between what
> used to be 5-day wait times and inconsistent runs, I don't think this has
> always been followed in Hive.
>
> It'll be useful to start relying on pre-commit test results again. Given
> the flaky tests, I'd suggest the following:
> 1. Pre-commit must be run on a patch before committing (with very few
> exceptions).
> 2. A green test run is ideal.
> 3. In case there are failures, keep track of them as sub-jiras under a
> flaky-test umbrella jira (some are under HIVE-14547 already), to be
> eventually fixed.
> 4. Before committing, cite the relevant jira for a flaky test (create and
> cite one if it doesn't already exist).
>
> This should help us build up a list of flaky tests over various runs,
> which will hopefully get fixed at some point.
>
> Thoughts?
>
> Thanks,
> Sid
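
To make the ignore suggestion above concrete, here is a minimal jUnit 4 sketch (the class name, test name, and jira number are made-up placeholders, not real Hive code):

    import org.junit.Ignore;
    import org.junit.Test;

    public class TestSomeFlakyFeature {  // hypothetical test class

      // Disabled instead of being left flaky; the placeholder jira
      // HIVE-XXXXX would track the actual fix.
      @Ignore("flaky: race condition, see HIVE-XXXXX")
      @Test
      public void testConcurrentReaders() throws Exception {
        // ... original test body stays unchanged ...
      }
    }

The test still shows up as skipped in every run, so it stays visible without adding noise to pre-commit results.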
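
And for point 10 in Prasanth's list, a sketch of what an explicit timeout could look like, again with made-up names, assuming a plain jUnit 4 setup (a Timeout rule for the whole class, or the timeout attribute per test):

    import org.junit.Rule;
    import org.junit.Test;
    import org.junit.rules.Timeout;

    public class TestSomethingPotentiallyHanging {  // hypothetical test class

      // Class-wide cap (5 minutes, in ms), well under the 40 min
      // surefire limit mentioned above.
      @Rule
      public Timeout globalTimeout = new Timeout(300000);

      // Or a per-test cap, also in milliseconds.
      @Test(timeout = 60000)
      public void testFinishesUnderAMinute() throws Exception {
        // ... test body ...
      }
    }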