+1; this essentially converts flaky automated tests into manual release
tests until the automation gets fixed. It's an improvement over the current
behavior of simply disabling tests, because when tests are disabled the
quality signal is lost. This also creates a stronger incentive to fix
tests: fixing the automation means you're no longer on the hook to run
manual tests.

This could significantly increase the validation work for a release, since
each flaky test will need to be manually verified. I think it's worth it,
and it will push us to fix flaky tests. For reference, there
are currently 34 unresolved JIRA issues under the test-failures component
[1].

[1]
https://issues.apache.org/jira/browse/BEAM-6280?jql=project%20%3D%20BEAM%20AND%20resolution%20%3D%20Unresolved%20AND%20component%20%3D%20test-failures%20ORDER%20BY%20priority%20DESC%2C%20updated%20DESC
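
For anyone who wants to track that count from a script, here is a minimal
sketch that recovers the JQL filter from the link in [1] and builds a search
URL from it. It assumes the standard JIRA REST search endpoint
(/rest/api/2/search) is reachable on issues.apache.org; the helper names
jql_from_link and build_search_url are made up for illustration, and nothing
here touches the network.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Assumption: the Apache JIRA instance serves the standard REST API here.
JIRA_BASE = "https://issues.apache.org/jira"

# The filter encoded in link [1], written out as plain JQL.
FLAKY_JQL = (
    "project = BEAM AND resolution = Unresolved "
    "AND component = test-failures "
    "ORDER BY priority DESC, updated DESC"
)


def jql_from_link(link):
    """Extract the decoded 'jql' query parameter from a JIRA browse link."""
    return parse_qs(urlsplit(link).query)["jql"][0]


def build_search_url(jql, max_results=50):
    """Build a REST search URL for the given JQL (hypothetical helper)."""
    query = urlencode({"jql": jql, "maxResults": max_results})
    return f"{JIRA_BASE}/rest/api/2/search?{query}"


if __name__ == "__main__":
    print(build_search_url(FLAKY_JQL))
```

Fetching that URL and reading the "total" field of the JSON response should
give the number of open test-failures issues, if the endpoint behaves as
assumed.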

On Mon, Jan 7, 2019 at 4:03 PM Ahmet Altay <[email protected]> wrote:

> This is a good idea. Some suggestions:
> - It would be nicer if we could figure out a process to act on flaky tests
> more frequently than releases.
> - Another improvement in the process would be having actual owners of
> issues rather than auto assigned component owners. A few folks have 100+
> assigned issues. Unassigning those issues, and finding owners who would
> have time to work on identified flaky tests would be helpful.
>
>
> On Mon, Jan 7, 2019 at 3:45 PM Kenneth Knowles <[email protected]> wrote:
>
>> I love this idea. It can easily feel like bugs filed for Jenkins
>> flakes/failures just get lost if there is no process for looking them over
>> regularly.
>>
>> I would suggest that test failures / flakes all get filed with Fix
>> Version = whatever release is next. Then at release time we can triage the
>> list, making sure none might be a symptom of something that should block
>> the release. One modification to your proposal is that after manual
>> verification that it is safe to release I would move Fix Version to the
>> next release instead of closing, unless the issue really is fixed or
>> otherwise not reproducible.
>>
>> For automation, I wonder if there's something automatic already available
>> somewhere that would:
>>
>>  - mark the Jenkins build to "Keep This Build Forever"
>>  - be *very* careful to try to find an existing bug, else it will be spam
>>  - file bugs to "test-failures" component
>>  - set Fix Version to the "next" - right now we have 2.7.1 (LTS), 2.11.0
>> (next mainline), 3.0.0 (dreamy incompatible ideas) so need the smarts to
>> choose 2.11.0
>>
>> If not, I think doing this stuff manually is not that bad, assuming we
>> can stay fairly green.
>>
>> Kenn
>>
>> On Mon, Jan 7, 2019 at 3:20 PM Sam Rohde <[email protected]> wrote:
>>
>>> Hi All,
>>>
>>> There are a number of tests in our system that are either flaky or
>>> permanently red. I suggest adding most, if not all, of the tests
>>> (style, unit, integration, etc.) to the release validation step. This
>>> would give us a regular cadence for keeping Beam green and free of flaky
>>> tests.
>>>
>>> There are a number of ways of implementing this, but what I think might
>>> work the best is to set up a process that either manually or automatically
>>> creates a JIRA for the failing test and assigns it to a component tagged
>>> with the release number. The release can then continue when all JIRAs are
>>> closed by either fixing the failure or manually testing to ensure no
>>> adverse side effects (this is in case there are environmental issues in the
>>> testing infrastructure or otherwise).
>>>
>>> Thanks for reading, what do you think?
>>> - Is there another, easier way to ensure that no test failures go
>>> unfixed?
>>> - Can the process be automated?
>>> - What am I missing?
>>>
>>> Regards,
>>> Sam
>>>
>>>
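
On Kenn's point above about needing "the smarts" to choose 2.11.0 out of
2.7.1, 2.11.0 and 3.0.0: one possible heuristic, sketched below, is to drop
patch releases (such as the LTS 2.7.1) and anything outside the current major
line, then take the lowest version that remains. This rule is an assumption
for illustration, not an agreed Beam policy, and pick_next_fix_version is a
hypothetical helper.

```python
def parse_version(version):
    """Turn '2.11.0' into (2, 11, 0) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def pick_next_fix_version(unreleased, current_major):
    """Pick the next mainline release from a list of unreleased versions.

    Assumed rule: mainline releases have patch == 0 and share the current
    major version; the lowest such version is the "next" one.
    """
    candidates = []
    for version in unreleased:
        major, _minor, patch = parse_version(version)
        if major == current_major and patch == 0:
            candidates.append(version)
    if not candidates:
        raise ValueError("no unreleased mainline version found")
    return min(candidates, key=parse_version)


if __name__ == "__main__":
    print(pick_next_fix_version(["2.7.1", "2.11.0", "3.0.0"], current_major=2))
```

Comparing parsed tuples rather than raw strings matters here, because a
plain string comparison would sort "2.11.0" before "2.9.0".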

-- 




Got feedback? tinyurl.com/swegner-feedback
