Making global formatting changes will create a patch that touches almost every line, making git annotate useless and the history of changes hard to trace.
Perhaps disabling tests should require a separate +1 (ideally from the initial author).

On 10/14/16, 12:27 PM, "Siddharth Seth" <ss...@apache.org> wrote:

>Once we agree upon what the flow should be, I'm also in favor of reverting patches which break things. Such commits cause problems for all subsequent patches - where multiple people end up debugging the test.
>In terms of disabling tests - we may want to consider how often a test fails before disabling it. There are flaky tests - and I suspect we'll end up with a lot of disabled tests if a test is disabled for a single failure.
>There have been offline suggestions about adding findbugs analysis, rat checks etc. The indentation checks, I think, fall into the same category.
>Would be really nice to hear from others as well. These changes will be painful to start with - but should help once the tests stabilize further.
>
>On Fri, Oct 14, 2016 at 2:01 AM, Peter Vary <pv...@cloudera.com> wrote:
>
>> +1 from me too.
>>
>> I think it lowers the barrier for contributors if there are clean and easy-to-follow rules for adding a patch. I had a very good experience with the Hadoop commit flow, where only "all green" results are accepted.
>>
>> If we are thinking about changes in the commit flow, it would be great to run code format checks before and after the patch, and call out any increase in formatting errors. I see too many nits in reviews where the reviewer points out formatting problems. This could be avoided, and the reviewer could concentrate on the real code instead of formatting.
>>
>> After we decide on the new commit flow we should update the documentation on the wiki as well.
>>
>> Of course I am happy to help out with any of the related tasks too, so if you need my help just say so :)
>>
>> Thanks,
>> Peter
>>
>> > On Oct 14, 2016, at 10:29 AM, Zsombor Klara <zsombor.kl...@cloudera.com> wrote:
>> >
>> > +1 from my side as well. I also like the suggestion to revert a patch causing new test failures instead of expecting the owner to fix the issues in follow-up jiras.
>> >
>> > And I would also have a borderline "radical" suggestion: if a flaky test is failing and the committer isn't sure at a glance that the failure is benign (a stat diff failure, for example, I would consider low risk; race conditions, on the other hand, high risk), put an ignore annotation on it. In my view a flaky test is pretty much useless. If it breaks, people will just ignore it (because it fails so often...), and anyway how can you be sure that the green run is the standard and the failure is the exception, and not the other way around? We have thousands of tests; ignoring a dozen won't decrease coverage significantly and will take us that much closer to a point where we can demand a green pre-commit run.
>> >
>> > Thanks
>> >
>> > On Fri, Oct 14, 2016 at 9:45 AM, Prasanth Jayachandran <pjayachand...@hortonworks.com> wrote:
>> >
>> >> +1 on the proposal. Adding more to it.
>> >>
>> >> A lot of time has been spent on improving the test runtime and bringing down the flaky tests.
>> >> The following jiras should give an overview of the effort involved:
>> >> https://issues.apache.org/jira/browse/HIVE-14547
>> >> https://issues.apache.org/jira/browse/HIVE-13503
>> >>
>> >> Committers, please ensure that the reported failures are absolutely not related to the patch before committing it.
>> >>
>> >> I would also propose the following to keep the test suite clean, along with some tips to keep test runs fast:
>> >>
>> >> 1) Revert a patch that is causing a failure. It should be the responsibility of the contributor to make sure the patch is not causing any failures. I am against creating follow-ups for fixing test failures, because they usually get ignored or deprioritized, causing wasted effort and failure-analysis time for every other developer waiting to commit their patch.
>> >>
>> >> 2) +1 from reviewers AFTER a clean green run, or if a reviewer is convinced that the test failures are unrelated. Maybe we should stop conditional +1s and wait for a clean green run.
>> >>
>> >> 3) Avoid slow tests (there is a jira to print out the runtime of newly added tests). In general, it's good to have many smaller tests as opposed to a single big test. If the qfile or junit test is big, splitting it up will help in parallelizing the pieces and avoiding stragglers.
>> >>
>> >> 4) Avoid adding tests to MiniMr (slowest of all).
>> >>
>> >> 5) Try to keep the test runtime (see surefire-report to get the correct runtime without initialization time) under a minute.
>> >>
>> >> 6) Avoid adding more read-only tables to the init script, as this will increase the initialization time.
>> >>
>> >> 7) If the test case does not require an explain plan, avoid it, as most failures are explain diffs.
>> >>
>> >> 8) If the test case requires explain and does not depend on table or partition stats, explicitly set stats for the table or partition. Explicitly setting stats avoids expensive stats computation time and flakiness due to stats diffs.
>> >>
>> >> 9) Prefer jUnit over qtest.
>> >>
>> >> 10) Add an explicit timeout for jUnit tests to avoid indefinite hanging (surefire times out after 40 mins). (See the sketch at the end of this thread.)
>> >>
>> >> Thoughts?
>> >>
>> >> Thanks
>> >> Prasanth
>> >>
>> >> On Oct 13, 2016, at 11:10 PM, Siddharth Seth <ss...@apache.org> wrote:
>> >>
>> >> There's been a lot of work to make the test runs faster, as well as more reliable, via HIVE-14547, HIVE-13503, and several other jiras. Test runtimes are around the 1 hour mark, and going down. There were a few green pre-commit runs (after years?). At the same time, there are still some flaky tests.
>> >>
>> >> We really should try to keep the test runtimes down, as well as the number of failures - so that the pre-commit runs can provide useful information.
>> >>
>> >> I'm not sure what the current approach is w.r.t. pre-commit runs before a commit. What I've seen in other projects is that the pre-commit needs to run, and come back clean (mostly), before a commit goes in. Between what used to be 5-day wait times, and inconsistent runs - I don't think this is always followed in Hive.
>> >>
>> >> It'll be useful to start relying on pre-commit test results again. Given the flaky tests, I'd suggest the following:
>> >> 1. Pre-commit must be run on a patch before committing (with very few exceptions).
>> >> 2. A green test run is ideal.
>> >> 3. In case there are failures - keep track of these as sub-jiras under a flaky test umbrella jira (some are under HIVE-14547 already) - to be eventually fixed.
>> >> 4. Before committing - cite relevant jiras for a flaky test (create and cite if it doesn't already exist).
>> >>
>> >> This should help us build up a list of flaky tests over various runs, which will hopefully get fixed at some point.
>> >>
>> >> Thoughts?
>> >>
>> >> Thanks,
>> >> Sid
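
Regarding point 10 in Prasanth's list: a minimal sketch of what an explicit per-test timeout could look like, assuming JUnit 4 (class and method names below are made up for illustration). Both the timeout attribute on @Test and the Timeout rule are standard JUnit 4 features; the rule bounds every test in the class, while the attribute bounds a single method.

    import java.util.concurrent.TimeUnit;

    import org.junit.Rule;
    import org.junit.Test;
    import org.junit.rules.Timeout;

    // Hypothetical test class showing two common ways to bound a jUnit
    // test's runtime so a hang fails fast instead of waiting for the
    // surefire-level timeout.
    public class TestSomethingWithTimeout {

      // Class-wide limit: every @Test method in this class must finish
      // within 2 minutes, otherwise the test is failed.
      @Rule
      public Timeout globalTimeout = new Timeout(2, TimeUnit.MINUTES);

      // Per-method limit (in milliseconds) for a test known to be slower.
      @Test(timeout = 300_000)
      public void testPotentiallySlowOperation() throws Exception {
        // ... test body ...
      }
    }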