Unstable tests ruin our days!

Zoltan Haindrich Wed, 11 Dec 2019 01:18:55 -0800

Hey all!

Recently the number of unstable tests has risen above the "usual" level...
These are usually hard to fix - and in most cases need a deep dive in the area 
where the test operates.
Because of that I tend to just reattach the patch to the jira to get another 
run in a day or so...


The downside of the above approach is that reattaching on unrelated failures 
has a positive HiveQA queue size coefficient.

There is another downside which might not be obvious at first: it reduces the 
trust in the system, and as a result there were cases when I did reattach the 
patch but it was a genuine failure... it seemed unrelated, but actually it was 
related.


Instead of continuing to reattach patches every day; I would like to propose a 
way to handle them:

* check that the failing test has nothing to do with the actual patch

* it's important to be able to run tests on our machines - but the most 
important thing is that HiveQA is able to run them successfully; for this 
reason I think the best is to have 2 HiveQA runs for the same changeset where 
the unstable test fails in one of them

  * you can search the jira for the testcase and see if other patches have 
also bumped into it
  * ?
* add a comment that you are about to disable the test in HIVE-22621, and 
commit it
  * I think it would be ok to skip the regular code change process
* create a new subtask under HIVE-22619 with the details you know about the 
failing testcase
* (resubmit your patch)
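
To illustrate the "2 HiveQA runs for the same changeset" check from the list 
above, here is a minimal Python sketch. It assumes you have copied the names 
of the failed tests from the two runs by hand; the function names and the test 
names in the usage lines are made up for illustration, and nothing here is an 
existing HiveQA feature.

```python
def unstable_candidates(failed_in_run_a, failed_in_run_b):
    """Return tests that failed in exactly one of the two runs.

    Both arguments are iterables of test names taken from two runs of the
    same changeset (hypothetical input, collected manually).
    """
    a, b = set(failed_in_run_a), set(failed_in_run_b)
    # Symmetric difference: failed in one run but not in the other.
    return sorted(a ^ b)


def consistent_failures(failed_in_run_a, failed_in_run_b):
    """Return tests that failed in both runs; these deserve a closer look
    before being called unrelated to the patch."""
    return sorted(set(failed_in_run_a) & set(failed_in_run_b))


# Hypothetical test names, for illustration only.
run_a = ["TestFoo.testBar", "TestBaz.testQux"]
run_b = ["TestBaz.testQux"]
print(unstable_candidates(run_a, run_b))
print(consistent_failures(run_a, run_b))
```

A test that shows up only in the first list is a candidate for disabling; one 
that fails in both runs is more likely a genuine failure caused by the patch.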

What do you think?

cheers,
Zoltan
