TestNegativeCliDriver failures are concerning. We have had practically no negative test coverage for the last couple of weeks. I think we should treat this as a blocker for the Hive 3.0.0 release. Thoughts?
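For context, the memory-increase and dump-on-OOM requests discussed in the thread below would typically land in the surefire argLine of the test build. A minimal sketch of such a fragment; the heap size and dump path here are placeholders, not Hive's actual ptest settings:

```xml
<!-- Hypothetical maven-surefire-plugin fragment. -Xmx value and
     HeapDumpPath are illustrative assumptions, not Hive's real config. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-surefire-plugin</artifactId>
  <configuration>
    <argLine>
      -Xmx2048m
      -XX:+HeapDumpOnOutOfMemoryError
      -XX:HeapDumpPath=${project.build.directory}/tmp
    </argLine>
  </configuration>
</plugin>
```

With `-XX:+HeapDumpOnOutOfMemoryError` the JVM writes an hprof file on OOM, which could then be archived alongside the test logs for investigation.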
On Thu, Apr 5, 2018 at 9:50 PM, Prasanth Jayachandran <pjayachand...@hortonworks.com> wrote:
> It may be because of a legitimate memory issue/leak. An alternative is to
> decrease the batch size of the negative cli driver on the ptest cluster,
> but then we wouldn't know whether there is an actual memory issue.
>
> Thanks
> Prasanth
>
> On Thu, Apr 5, 2018 at 1:36 PM -0700, "Vineet Garg" <vg...@hortonworks.com> wrote:
>
> TestNegativeCliDriver tests are still failing with
> java.lang.OutOfMemoryError: GC overhead limit exceeded.
> Can we increase the amount of memory for the tests?
>
> Vineet
>
> On Mar 5, 2018, at 11:35 AM, Sergey Shelukhin wrote:
>
> On a semi-related note, I noticed recently that the negative tests seem
> to OOM in setup from time to time.
> Can we increase the amount of memory for the tests a little, and/or add
> a heap dump on OOM, saved to the test logs directory, so we could
> investigate?
>
> On 18/3/5, 11:07, "Vineet Garg" wrote:
>
> +1 for a nightly build. We could generate reports to identify both
> frequent and sporadic test failures, plus other interesting bits like
> average build time, Yetus failures, etc. It would also help narrow the
> range of culprit commits down to one day.
> If you decide to go ahead with this, I would like to help.
>
> Vineet
>
> On Mar 5, 2018, at 8:50 AM, Sahil Takiar wrote:
>
> Wow, that HBase UI looks super useful. +1 to having something like that.
>
> If not, +1 to having a proper nightly build; it would help devs identify
> which commits break which tests. I find that git-bisect can take a long
> time to run and can be difficult to use (e.g. finding a known good
> commit isn't always easy).
>
> On Mon, Mar 5, 2018 at 9:03 AM, Peter Vary wrote:
>
> Without a nightly build, and with this many flaky tests, it is very hard
> to identify the breaking commits. We can use something like bisect and
> multiple test runs.
> There is a more elegant way to do this with nightly test runs:
> https://issues.apache.org/jira/browse/HBASE-15917
> https://builds.apache.org/job/HBASE-Find-Flaky-Tests/lastSuccessfulBuild/artifact/dashboard.html
>
> This also helps identify the flaky tests, and it creates a continuous,
> updated list of them.
>
> On Feb 23, 2018, at 6:55 PM, Sahil Takiar wrote:
>
> +1
>
> Does anyone have suggestions on how to efficiently identify which commit
> is breaking a test? Is it just git-bisect, or is there an easier way?
> Hive QA isn't always that helpful: it will say a test has been failing
> for the past "x" builds, but that doesn't help much since Hive QA isn't
> a nightly build.
>
> On Thu, Feb 22, 2018 at 10:31 AM, Vihang Karajgaonkar <vih...@cloudera.com> wrote:
>
> +1
> Commenting on the JIRA and giving a 24-hour heads-up (excluding
> weekends) would be good.
>
> On Thu, Feb 22, 2018 at 10:19 AM, Alan Gates wrote:
>
> +1.
>
> Alan.
>
> On Thu, Feb 22, 2018 at 8:25 AM, Thejas Nair wrote:
>
> +1
> I agree, this makes sense. The number of failures keeps increasing.
> A 24-hour heads-up in either case before a revert would be good.
>
> On Thu, Feb 22, 2018 at 2:45 AM, Peter Vary wrote:
>
> I agree with Zoltan. The continuously breaking tests make it very hard
> to spot real issues.
> Any thoughts on doing this automatically?
>
> On Feb 22, 2018, at 10:47 AM, Zoltan Haindrich wrote:
>
> Hello,
>
> In the last couple of weeks the number of broken tests has started to go
> up... and even though I run bisect etc. from time to time, sometimes
> people don't react to my comments/tickets.
>
> Because keeping this many failing tests makes it easier for a new one to
> slip in, I think reverting the patch that introduced the test failures
> would also help in some cases.
> I think it would help a lot, to prevent further test breaks, to revert
> the patch if either of the following conditions is met:
>
> C1) the notification/comment about the fact that the patch indeed broke
> a test has gone unanswered for at least 24 hours.
>
> C2) the patch has been in for 7 days but the test failure is still not
> addressed (note that in this case there might be a conversation about
> fixing it, but enabling other people to work in a cleaner environment is
> more important than a single patch; and if it can't be fixed in 7 days,
> well, it might not get fixed in a month).
>
> I would also like to note that I've seen a few tickets picked up by
> people who were not involved in creating the original change, and
> although the intention was good, they might miss the context of the
> original patch and may "fix" the tests the wrong way: accept a q.out
> that is inappropriate, or ignore the test...
>
> Would it be OK to implement this from now on? It makes my efforts
> practically useless if people are not reacting...
>
> Note: just so we are on the same page, this is only about a single test
> that fails on its own; I feel that flaky tests are an entirely different
> topic.
>
> Cheers,
> Zoltan
>
>
> --
> Sahil Takiar
> Software Engineer
> takiar.sa...@gmail.com | (510) 673-0309
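Since git-bisect comes up several times above as the fallback for finding a breaking commit, here is a self-contained sketch of automating that search with `git bisect run`. The toy repository and the `grep` check are stand-ins; in a real run the test command would be something like `mvn test -Dtest=TestFoo`:

```shell
#!/bin/sh
# Build a toy repo with four commits; the third one ("c3") breaks the "test".
set -e
tmp=$(mktemp -d)
cd "$tmp"
git init -q repo && cd repo
git config user.email dev@example.com
git config user.name dev

echo pass > state && git add state && git commit -qm c1
git commit -q --allow-empty -m c2
echo fail > state && git add state && git commit -qm c3   # the breaking commit
git commit -q --allow-empty -m c4

# Mark the endpoints: HEAD (c4) is known bad, HEAD~3 (c1) is known good.
git bisect start HEAD HEAD~3
# git bisect run treats exit 0 as "good" and 1-124 as "bad"; a real run
# would invoke the failing qtest here instead of grep.
git bisect run sh -c 'grep -qx pass state'
first_bad=$(git rev-parse refs/bisect/bad)
git bisect reset
echo "first bad commit: $(git log -1 --format=%s "$first_bad")"
```

The main caveats raised in the thread still apply: each bisect step is a full test run, and a known good starting commit is needed, which is exactly what a nightly build would provide.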