What do you think about crowd-sourcing?

1. Fix Version = 2.10.0
2. If assigned, ping the ticket and maybe the assignee; unassign if unresponsive
3. If unassigned, assign it to yourself while thinking about it
4. If you can route it a bit closer to someone who might know, great
5. If it doesn't look like a blocker (after routing the best you can), Fix Version = 2.11.0
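To make that concrete, here is a rough sketch of what one pass over steps 1-5 could look like if scripted against JIRA. This is only an illustration: the python "jira" client, the server credentials, the username, and the JQL are assumptions, and the ping/unassign and routing decisions stay human judgment calls.

    # Hypothetical sketch of one triage pass over the steps above, using the
    # python "jira" client. Server, credentials, username, and versions are
    # placeholders.
    from jira import JIRA

    jira = JIRA(server="https://issues.apache.org/jira",
                basic_auth=("user", "password"))
    me = "my-jira-username"  # assumption: the triager's JIRA username

    issues = jira.search_issues(
        'project = BEAM AND resolution = Unresolved '
        'AND component = test-failures AND fixVersion = "2.10.0"'
    )

    for issue in issues:
        if issue.fields.assignee is None:
            # Step 3: unassigned -> take it while thinking about it.
            jira.assign_issue(issue, me)
        else:
            # Step 2: assigned -> ping the ticket; whether to unassign an
            # unresponsive assignee stays a human call.
            jira.add_comment(issue, "Triaging for 2.10.0: still a blocker?")
        # Steps 4-5 stay manual: route to a likely owner if you can, and if it
        # doesn't look like a blocker, bump Fix Version to the next release:
        # issue.update(fields={"fixVersions": [{"name": "2.11.0"}]})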
I think this has enough mutexes that there should be no duplicated work if it is followed. And every step is a standard use of Fix Version and Assignee, so there's not really any special policy needed.

Kenn

On Thu, Jan 10, 2019 at 4:25 PM Mikhail Gryzykhin <[email protected]> wrote:

> +1
>
> Although we should be cautious when enabling this policy. We have a decent
> backlog of bugs that we need to plumb through.
>
> --Mikhail
>
> On Thu, Jan 10, 2019 at 11:44 AM Scott Wegner <[email protected]> wrote:
>
>> +1, this sounds good to me.
>>
>> I believe the next step would be to open a PR to add this to the release
>> guide:
>> https://github.com/apache/beam/blob/master/website/src/contribute/release-guide.md
>>
>> On Wed, Jan 9, 2019 at 12:04 PM Sam Rohde <[email protected]> wrote:
>>
>>> Cool, thanks for all of the replies. Does this summary sound reasonable?
>>>
>>> *Problem:* there are a number of failing tests (including flaky ones) that
>>> don't get looked at and aren't necessarily green upon cutting a new Beam
>>> release.
>>>
>>> *Proposed Solution:*
>>>
>>>    - Add all tests to the release validation
>>>    - For every failing test (including flaky ones), create a JIRA attached
>>>      to the Beam release and add it to the "test-failures" component*
>>>    - If a test is continuously failing:
>>>       - fix it
>>>       - add the fix to the release
>>>       - close out the JIRA
>>>    - If a test is flaky:
>>>       - try to fix it
>>>       - if fixed:
>>>          - add the fix to the release
>>>          - close out the JIRA
>>>       - else:
>>>          - manually test it
>>>          - move "Fix Version" to the next release
>>>    - The release validation can continue when all JIRAs are closed out.
>>>
>>> *Why this is an improvement:*
>>>
>>>    - Ensures that every test is a valid signal (as opposed to disabling
>>>      failing tests)
>>>    - Creates an incentive to automate tests (no longer on the hook to
>>>      manually test)
>>>    - Creates a forcing function to fix flaky tests (once fixed, a test no
>>>      longer needs to be manually tested)
>>>    - Ensures that every failing test gets looked at
>>>
>>> *Why this may not be an improvement:*
>>>
>>>    - More effort for release validation
>>>    - May slow down release velocity
>>>
>>> * for brevity, it might be better to create one JIRA per component
>>>   containing a summary of its failing tests
>>>
>>> -Sam
>>>
>>> On Tue, Jan 8, 2019 at 10:25 AM Ahmet Altay <[email protected]> wrote:
>>>
>>>> On Tue, Jan 8, 2019 at 8:25 AM Kenneth Knowles <[email protected]> wrote:
>>>>
>>>>> On Tue, Jan 8, 2019 at 7:52 AM Scott Wegner <[email protected]> wrote:
>>>>>
>>>>>> For reference, there are currently 34 unresolved JIRA issues under
>>>>>> the test-failures component [1].
>>>>>>
>>>>>> [1]
>>>>>> https://issues.apache.org/jira/browse/BEAM-6280?jql=project%20%3D%20BEAM%20AND%20resolution%20%3D%20Unresolved%20AND%20component%20%3D%20test-failures%20ORDER%20BY%20priority%20DESC%2C%20updated%20DESC
>>>>>
>>>>> And there are 19 labeled with flake or sickbay:
>>>>> https://issues.apache.org/jira/issues/?filter=12343195
>>>>>
>>>>>> On Mon, Jan 7, 2019 at 4:03 PM Ahmet Altay <[email protected]> wrote:
>>>>>>
>>>>>>> This is a good idea. Some suggestions:
>>>>>>> - It would be nicer if we could figure out a process to act on flaky
>>>>>>> tests more frequently than releases.
>>>>>
>>>>> Any ideas? We could just have some cadence and try to establish the
>>>>> practice of having a deflake thread every couple of weeks? How about we
>>>>> add it to release verification as a first step and then continue to
>>>>> discuss?
>>>>
>>>> Sounds great. I do not know enough JIRA, but I am hoping that a
>>>> solution can come in the form of tooling. If we could configure JIRA with
>>>> SLOs per issue type, we could have customized reports on which issues are
>>>> not getting enough attention and then do a load balance among us.
>>>>
>>>>>>> - Another improvement in the process would be having actual owners of
>>>>>>> issues rather than auto-assigned component owners. A few folks have 100+
>>>>>>> assigned issues. Unassigning those issues, and finding owners who would
>>>>>>> have time to work on identified flaky tests, would be helpful.
>>>>>
>>>>> Yikes. Two issues here:
>>>>>
>>>>> - sounds like Jira component owners aren't really working for us as a
>>>>> first point of contact for triage
>>>>> - a person shouldn't really have more than 5 Jiras assigned, or if you
>>>>> get really loose maybe 20 (I am guilty of having 30 at this moment...)
>>>>>
>>>>> Maybe this is one or two separate threads?
>>>>
>>>> I can fork this to another thread. I think both issues are related
>>>> because component owners are more likely to be in this situation. I agree
>>>> with the assessment of two issues.
>>>>
>>>>> Kenn
>>>>>
>>>>>>> On Mon, Jan 7, 2019 at 3:45 PM Kenneth Knowles <[email protected]> wrote:
>>>>>>>
>>>>>>>> I love this idea. It can easily feel like bugs filed for Jenkins
>>>>>>>> flakes/failures just get lost if there is no process for looking them
>>>>>>>> over regularly.
>>>>>>>>
>>>>>>>> I would suggest that test failures / flakes all get filed with Fix
>>>>>>>> Version = whatever release is next. Then at release time we can triage
>>>>>>>> the list, making sure none might be a symptom of something that should
>>>>>>>> block the release. One modification to your proposal: after manual
>>>>>>>> verification that it is safe to release, I would move Fix Version to
>>>>>>>> the next release instead of closing, unless the issue really is fixed
>>>>>>>> or otherwise not reproducible.
>>>>>>>>
>>>>>>>> For automation, I wonder if there's something automatic already
>>>>>>>> available somewhere that would:
>>>>>>>>
>>>>>>>> - mark the Jenkins build to "Keep This Build Forever"
>>>>>>>> - be *very* careful to try to find an existing bug, else it will
>>>>>>>> be spam
>>>>>>>> - file bugs to the "test-failures" component
>>>>>>>> - set Fix Version to the "next" - right now we have 2.7.1 (LTS),
>>>>>>>> 2.11.0 (next mainline), 3.0.0 (dreamy incompatible ideas), so it needs
>>>>>>>> the smarts to choose 2.11.0
>>>>>>>>
>>>>>>>> If not, I think doing this stuff manually is not that bad, assuming
>>>>>>>> we can stay fairly green.
>>>>>>>>
>>>>>>>> Kenn
>>>>>>>>
>>>>>>>> On Mon, Jan 7, 2019 at 3:20 PM Sam Rohde <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> Hi All,
>>>>>>>>>
>>>>>>>>> There are a number of tests in our system that are either flaky or
>>>>>>>>> permanently red. I am suggesting we add most, if not all, of the tests
>>>>>>>>> (style, unit, integration, etc.) to the release validation step. In
>>>>>>>>> this way, we will add a regular cadence to ensuring greenness and no
>>>>>>>>> flaky tests in Beam.
>>>>>>>>>
>>>>>>>>> There are a number of ways of implementing this, but what I think
>>>>>>>>> might work best is to set up a process that either manually or
>>>>>>>>> automatically creates a JIRA for the failing test and assigns it to a
>>>>>>>>> component tagged with the release number. The release can then
>>>>>>>>> continue when all JIRAs are closed, by either fixing the failure or
>>>>>>>>> manually testing to ensure no adverse side effects (this is in case
>>>>>>>>> there are environmental issues in the testing infrastructure or
>>>>>>>>> otherwise).
>>>>>>>>>
>>>>>>>>> Thanks for reading, what do you think?
>>>>>>>>> - Is there another, easier way to ensure that no test failures go
>>>>>>>>> unfixed?
>>>>>>>>> - Can the process be automated?
>>>>>>>>> - What am I missing?
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>> Sam
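As a rough illustration of the automation Kenn describes above (find failing Jenkins jobs, look hard for an existing bug before filing, file into the test-failures component, set Fix Version to the next mainline release), here is a hypothetical sketch. The job list, credentials, duplicate-matching heuristic, and the choice of the python "jira" client plus the Jenkins JSON API are all assumptions, not an existing tool.

    # Hypothetical sketch only: job names, credentials, and the duplicate
    # check are placeholders, not an existing Beam tool.
    import requests
    from jira import JIRA

    JENKINS = "https://builds.apache.org"
    JOBS = ["beam_PreCommit_Java_Cron"]   # assumption: jobs to watch
    NEXT_MAINLINE = "2.11.0"              # needs "the smarts" to pick this

    jira = JIRA(server="https://issues.apache.org/jira",
                basic_auth=("user", "password"))

    for job in JOBS:
        build = requests.get(
            f"{JENKINS}/job/{job}/lastCompletedBuild/api/json").json()
        if build.get("result") == "SUCCESS":
            continue

        # Be *very* careful to find an existing bug first, else this is spam.
        existing = jira.search_issues(
            'project = BEAM AND resolution = Unresolved '
            f'AND component = test-failures AND summary ~ "{job}"'
        )
        if existing:
            jira.add_comment(existing[0], f"Still failing: {build['url']}")
            continue

        jira.create_issue(
            project="BEAM",
            issuetype={"name": "Bug"},
            summary=f"{job} is failing",
            description=f"Automatically filed for failing build: {build['url']}",
            components=[{"name": "test-failures"}],
            fixVersions=[{"name": NEXT_MAINLINE}],
        )
        # Marking the Jenkins build "Keep This Build Forever" would go here
        # (Jenkins exposes this as an action on the build).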
