Re: Workflow improvement

Dmitriy Pavlov Fri, 24 Aug 2018 06:32:38 -0700

Hi Dmitriy,

Sounds like a plan ;) I totally agree.
Just one proposal: I would like to avoid hiding any test failures. Two
separate tables named 'Possible Blockers' and 'Other failures' should make
this completely clear. The PR comment can contain only the first category.


We had a private discussion with Anton K., and he proposed a very
interesting idea that I would like to bring here. We can add configuration
to the TC bot and mark some tests and some suites as 'Known Stable'. This
means that a failure of such a suite or test should be considered a blocker
for merge every time it fails, even if its fail rate is non-zero. The initial
list of such suites is:
 - Build Apache Ignite
 - License check
And tests:
 - .Net API Parity check
Please share your vision.
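
To make the proposal concrete, here is a minimal sketch of how the two tables
could be filled. It is only an illustration under assumptions: the names
Failure, is_possible_blocker and split_failures are hypothetical and are not
part of the TC bot, and the "0.0% fail rate in master" rule is the one
discussed earlier in this thread.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple

# Hypothetical configuration; the names come from the initial list above.
KNOWN_STABLE_SUITES = {"Build Apache Ignite", "License check"}
KNOWN_STABLE_TESTS = {".Net API Parity check"}


@dataclass(frozen=True)
class Failure:
    suite: str
    test: Optional[str]      # None when the whole suite failed or timed out
    master_fail_rate: float  # recent fail rate in master, in percent


def is_possible_blocker(f: Failure) -> bool:
    # 'Known Stable' suites and tests block a merge on every failure,
    # even if their recorded fail rate is non-zero.
    if f.suite in KNOWN_STABLE_SUITES or f.test in KNOWN_STABLE_TESTS:
        return True
    # Otherwise a failure is suspicious only if it never fails in master.
    return f.master_fail_rate == 0.0


def split_failures(failures: Iterable[Failure]) -> Tuple[List[Failure], List[Failure]]:
    # Nothing is hidden: every failure lands in exactly one of the two tables.
    items = list(failures)
    blockers = [f for f in items if is_possible_blocker(f)]
    others = [f for f in items if not is_possible_blocker(f)]
    return blockers, others
```

With this split, only the first list would go into the PR comment, while both
lists stay visible on the analysis page.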

Meanwhile, I've updated the TC bot.
- The underlying Apache Ignite DB was refactored to use a lower partition
count, so restarts should be faster.
- The last 100 builds are now saved as the basis for our failure rate
statistics. The flakiness border was not changed, so more tests will now be
considered flaky.
- Now 4 builds in a row must fail or time out before a notification is
sent.
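
The last two rules can be sketched roughly as follows. This is not the bot's
actual code: the RunHistory class, its method names and the status strings are
assumptions made for illustration; only the numbers 100 and 4 come from the
list above.

```python
from collections import deque

HISTORY_SIZE = 100          # builds kept as the failure rate basis
FAILS_IN_ROW_TO_NOTIFY = 4  # consecutive bad builds before a notification
BAD_STATUSES = {"FAILED", "TIMED_OUT"}  # hypothetical status names


class RunHistory:
    def __init__(self) -> None:
        # A bounded deque drops the oldest build once 100 are stored.
        self.runs = deque(maxlen=HISTORY_SIZE)

    def add(self, status: str) -> None:
        self.runs.append(status)

    def fail_rate(self) -> float:
        # Share of failed or timed-out builds among the stored ones.
        if not self.runs:
            return 0.0
        bad = sum(1 for s in self.runs if s in BAD_STATUSES)
        return bad / len(self.runs)

    def should_notify(self) -> bool:
        # True only when the latest 4 builds all failed or timed out.
        if len(self.runs) < FAILS_IN_ROW_TO_NOTIFY:
            return False
        latest = list(self.runs)[-FAILS_IN_ROW_TO_NOTIFY:]
        return all(s in BAD_STATUSES for s in latest)
```

A single successful build between failures resets the streak, so a suite that
alternates between success and failure never triggers a notification under
this rule.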

Sincerely,
Dmitriy Pavlov

Thu, 23 Aug 2018 at 17:09, Dmitrii Ryabov <somefire...@gmail.com>:

> Hi, Dmitriy,
>
> I propose the following steps:
>
> 1. Show current 0% tests in a separate table at the top of the analysis
> results page. Thus, we'll see the most suspicious (new or very flaky) tests
> first. Or we can hide other tests under a "More >>" button, like the top
> long-running tests.
> 2. Create a button that, when clicked, writes the info about 0% tests
> to the PR.
> 3. Replace the button with a webhook for TeamCity (for Run All), which will
> start the analysis on TCH and write the results to the PR.
> 4. ...
>
> What do you think?
>
>
> 2018-08-22 8:55 GMT+03:00 Dmitrii Ryabov <somefire...@gmail.com>:
>
> > I think we should check not the last N runs, but all runs in the last N
> > days.
> >
> > A simple rule to detect flaky fails by hand: get the test history ordered
> > by TEST_STATUS_DESC and check its dates. As I see it, we can get the list
> > of failures from TC. We don't need to check successful runs, because they
> > are uninformative for our needs.
> >
> > 2018-08-21 20:24 GMT+03:00 Dmitriy Pavlov <dpavlov....@gmail.com>:
> >
> >> Hi Dmitriy,
> >>
> >> The Bot is able to detect a frequent change of test status, but
> >> currently only the last 50 runs count. The same is true for the failure
> >> rate.
> >>
> >> This value can easily be changed to 70 or 100; moreover, the auto
> >> trigger feature gives us many more builds.
> >>
> >> We can improve these rules. We can track not only a status change, but a
> >> status change without any code changes. We can somehow save this data in
> >> the RunStat class. Let's create a better rule, and later we can code it.
> >>
> >> Sincerely,
> >> Dmitriy Pavlov
> >>
> >> Tue, 21 Aug 2018 at 19:22, Dmitrii Ryabov <somefire...@gmail.com>:
> >>
> >> > I think a plugin will look prettier, but comments can contain any
> >> > information, so they can be more useful. I agree with your idea to
> >> > create a bot instead of a plugin.
> >> >
> >> > As for the fail rate, I'm not sure it works as you describe.
> >> > I'm looking at my runAll [1]. There is
> >> > `IgniteCacheGroupsTest.testCacheApiTxReplicated`
> >> > in the `Cache 3` suite with fail rate = 0.0%. But it is flaky and fails
> >> > in the master branch [2].
> >> >
> >> > [1] https://mtcga.gridgain.com/pr.html?serverId=apache&suiteId=IgniteTests24Java8_RunAll&branchForTc=pull%2F4519%2Fhead&action=Latest
> >> > [2] https://ci.ignite.apache.org/project.html?projectId=IgniteTests24Java8&buildTypeId=&tab=testDetails&testNameId=5628470782089555961&order=TEST_STATUS_DESC&branch_IgniteTests24Java8=__all_branches__&itemsCount=50
> >> >
> >> > 2018-08-21 18:00 GMT+03:00 Dmitriy Pavlov <dpavlov....@gmail.com>:
> >> >
> >> > > Hi Dmitrii,
> >> > >
> >> > > I'm not sure we're able to install GitHub apps on Apache mirrors.
> >> > >
> >> > > The simplest solution, which can be as efficient as a plugin, is a
> >> > > fake MTCGA bot account on GitHub that will post PR comments using
> >> > > the GitHub API. What do you think?
> >> > >
> >> > > A new test failure can be identified by the Ignite TC Bot by a
> >> > > recent master fail rate = 0.0%. The same rule can be applied to
> >> > > timed-out suites.
> >> > >
> >> > > Sincerely,
> >> > > Dmitriy Pavlov
> >> > >
> >> > > Tue, 21 Aug 2018 at 16:16, Dmitrii Ryabov <somefire...@gmail.com>:
> >> > >
> >> > > > Hello, Igniters!
> >> > > >
> >> > > > I want to suggest an improvement for TeamCity Helper [1]: we need
> >> > > > an easy way to get the list of failed tests that don't fail in the
> >> > > > master branch. These tests should:
> >> > > > * fail in the PR
> >> > > > * not fail in the master
> >> > > > * not be flaky.
> >> > > >
> >> > > > Also, I want to suggest creating a GitHub plugin that will notify
> >> > > > the PR if it has such tests. The PR will have a marker that allows
> >> > > > or prohibits the merge. This marker will be shown near the PR
> >> > > > conflicts.
> >> > > >
> >> > > > An allowing marker will be shown in case of:
> >> > > > * no new fails.
> >> > > >
> >> > > > A prohibiting marker will be shown in cases of:
> >> > > > * new fails: the tests must be fixed.
> >> > > > * a new timed-out test suite: the suite should be restarted or the
> >> > > > tests must be fixed.
> >> > > > * runAll wasn't launched: the tests must be launched.
> >> > > >
> >> > > > This will make test checks much faster and easier. It will also
> >> > > > decrease the number of merges with new failed tests caused by
> >> > > > inattention to the tests.
> >> > > >
> >> > > > Further, we can expand the plugin by adding new checks that show
> >> > > > PR quality.
> >> > > >
> >> > > > [1] https://github.com/apache/ignite-teamcity-bot
> >> > > >
> >> > >
> >> >
> >>
> >
> >
>
