Hey all!
Recently the number of unstable tests has risen above the "usual" level...
These are usually hard to fix, and in most cases need a deep dive into the area
where the test operates.
Because of that, I tend to just reattach the patch to the jira to get another
run in a day or so...
The downside of this approach is that every reattach on an unrelated failure
makes the HiveQA queue longer.
There is another downside which might not be obvious at first: it reduces
trust in the system. As a result, there were cases when I reattached the patch
even though it was a genuine failure... it seemed unrelated, but actually it
was related.
Instead of continuing to reattach patches every day, I would like to propose a
way to handle them:
* check that the failing test has nothing to do with the actual patch
* it's important to be able to run tests on our own machines, but the most
important thing is that HiveQA is able to run them successfully; for this
reason I think the best evidence is having 2 HiveQA runs for the same
changeset where the unstable test fails in one of them
* you can search the jira for the testcase and see whether other patches have
also bumped into it
* ?
* add a comment saying that you are about to disable the test in HIVE-22621,
and commit it
* I think it would be ok to skip the regular code change process
* create a new subtask under HIVE-22619 with the details you know about the
failing testcase
* (resubmit your patch)
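To make the "2 HiveQA runs for the same changeset" check concrete, here is a rough sketch of how it could be automated. Everything in it is an assumption for illustration: the function name, the `(changeset_id, test_name, passed)` tuple format and the sample data are hypothetical, and nothing here reads real HiveQA output.

```python
from collections import defaultdict


def find_unstable_tests(runs):
    """Return the tests that both passed and failed for the same changeset.

    `runs` is an iterable of (changeset_id, test_name, passed) tuples.
    This input format is hypothetical; it is not what HiveQA produces.
    """
    outcomes = defaultdict(set)
    for changeset_id, test_name, passed in runs:
        # Collect every distinct outcome seen for this test on this changeset.
        outcomes[(changeset_id, test_name)].add(bool(passed))

    # A test is a candidate only if the same changeset saw a pass AND a failure.
    unstable = {
        test_name
        for (_, test_name), seen in outcomes.items()
        if seen == {True, False}
    }
    return sorted(unstable)


# Made-up sample data: two runs of the same patch, plus one unrelated run.
runs = [
    ("patch-01", "TestA", True),
    ("patch-01", "TestA", False),
    ("patch-01", "TestB", False),
    ("patch-01", "TestB", False),
    ("patch-02", "TestC", True),
]
print(find_unstable_tests(runs))
```

In this made-up data only TestA is reported. TestB failed in both runs, so it may well be a genuine failure caused by the patch and should not be disabled on this evidence alone.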
What do you think?
cheers,
Zoltan