If we still need to accept PRs for Flink-1.9/1.10, that could explain why we 
still need that command hint.
Chesnay, thanks for your explanation.
________________________________
From: Chesnay Schepler <ches...@apache.org>
Sent: Monday, May 25, 2020 18:17
To: dev@flink.apache.org <dev@flink.apache.org>; Yun Tang <myas...@live.com>
Subject: Re: [DISCUSS] Switch to Azure Pipelines as the primary CI tool / 
switch off Travis

The Travis bot commands must be retained so long as we accept PRs for
1.9/1.10.

On 25/05/2020 10:50, Yun Tang wrote:
> I noticed that Travis-related bot commands still exist on the GitHub PR
> page, and I think we should remove the command hint now.
> ________________________________
> From: Robert Metzger <rmetz...@apache.org>
> Sent: Thursday, April 23, 2020 15:44
> To: dev <dev@flink.apache.org>
> Subject: Re: [DISCUSS] Switch to Azure Pipelines as the primary CI tool / 
> switch off Travis
>
> FYI: I have moved the Flink PR and master builds from my personal Azure
> account to a PMC controlled account:
> https://dev.azure.com/apache-flink/apache-flink/_build
>
> On Fri, Apr 17, 2020 at 8:28 PM Robert Metzger <rmetz...@apache.org> wrote:
>
>> Thanks a lot for bringing up this topic again.
>> The reason why I was hesitant to decommission Travis was that we were
>> still facing some issues with the Azure infrastructure that I want to
>> resolve, so that we have a strong test coverage.
>>
>> In the last few weeks, we had the following issues:
>> - unstable e2e tests (we are running the e2e tests much more frequently,
>> thus we see more failures (and discover actual bugs!))
>> - network issues (mostly around downloading maven artifacts. This is
>> solved at the cost of slower builds. I'm preparing a fix to have stable &
>> fast maven downloads)
>> - the private builds were never really stable (this is work in progress.
>> the situation is definitely better than the private Travis builds)
>> - I haven't followed the overall master stability closely before February,
>> but I have the feeling that April so far was a pretty unstable month on
>> master. Piotr is regularly reverting commits that somehow broke master. The
>> problem with unstable master is that it causes "CI fatigue", where people
>> assume that failing builds are not worth investigating anymore, leading to
>> more instability. This is not a problem of the CI infrastructure itself,
>> but it makes me less confident switching systems :)
>>
>>
>> Unless something unexpected happens, I'm proposing to disable pull request
>> processing on Travis next week.
>>
>>
>>
>> On Fri, Apr 17, 2020 at 10:05 AM Gary Yao <g...@apache.org> wrote:
>>
>>> I am in favour of decommissioning Travis.
>>>
>>> Moreover, I wanted to use this thread to raise another issue with Travis
>>> that I have discovered recently; many of the builds running in my private
>>> Travis account are timing out in the compilation stage (i.e., compilation
>>> takes more than 50 minutes). This means that I am not able to reliably run
>>> a full build on a CI server without creating a pull request. If other
>>> developers also experience this issue, it would speak for putting more
>>> effort into making Azure Pipelines the project-wide default.
>>>
>>> Best,
>>> Gary
>>>
>>> On Thu, Mar 26, 2020 at 12:26 PM Yu Li <car...@gmail.com> wrote:
>>>
>>>> Thanks for the clarification Robert.
>>>>
>>>> Since the first step of the plan is to replace the Travis PR runs, I
>>>> checked all PR builds from 2020-01-01 (PR#10735-11526) [1], and below is
>>>> the result:
>>>>
>>>> * Travis FAILURE: 298
>>>> * Travis SUCCESS: 649 (68.5%)
>>>> * Azure FAILURE: 420
>>>> * Azure SUCCESS: 571 (57.6%)
>>>>
>>>> Since the patch for each run is equivalent for Travis and Azure, there
>>>> seems to be a slightly higher failure rate (about 10 percentage points)
>>>> when running in Azure.
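The percentages above can be re-derived from the raw counts. The following is a minimal sketch that only uses the four numbers quoted in the message; the function name `success_rate` is made up for illustration.

```python
def success_rate(success: int, failure: int) -> float:
    """Return the share of successful runs as a percentage."""
    total = success + failure
    return 100.0 * success / total


# Counts as reported in the message above.
travis = success_rate(success=649, failure=298)
azure = success_rate(success=571, failure=420)

# Difference in percentage points between the two success rates.
gap = travis - azure

print(f"Travis: {travis:.1f}%  Azure: {azure:.1f}%  gap: {gap:.1f} points")
```

Note that the two systems did not run the same number of builds (947 vs. 991 in total), so the gap compares rates, not identical sets of runs.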
>>>>
>>>> However, with the just-merged fix for uploading logs (FLINK-16480), I
>>>> believe the success rate of Azure could compete with Travis now
>>>> (uploading files contributes to 20% of the failures according to the
>>>> report [2]).
>>>>
>>>> So I'm +1 to disable travis runs according to the numbers.
>>>>
>>>> Best Regards,
>>>> Yu
>>>>
>>>> [1] https://github.com/apache/flink/pulls?q=is%3Apr+created%3A%3E%3D2020-01-01
>>>> [2] https://dev.azure.com/rmetzger/Flink/_pipeline/analytics/stageawareoutcome?definitionId=4
>>>>
>>>> On Thu, 26 Mar 2020 at 03:28, Robert Metzger <rmetz...@apache.org> wrote:
>>>>> Thank you for your responses.
>>>>>
>>>>> @Yu Li: In the current master, the log upload always fails if the e2e
>>>>> job failed. I just merged a PR that fixes this issue [1]. The problem was
>>>>> not really the network stability, but rather a problem with the
>>>>> interaction of the jobs in the pipeline (the e2e job did not set the
>>>>> right variables for the log upload).
>>>>> Secondly, you are looking at the report of the "flink-ci.flink" pipeline,
>>>>> where pull requests are built. Naturally, pull request builds fail all
>>>>> the time, because the PRs are not yet perfect.
>>>>>
>>>>> "flink-ci.flink-master" is the right pipeline to look at:
>>>>> https://dev.azure.com/rmetzger/Flink/_pipeline/analytics/stageawareoutcome?definitionId=8&contextType=build
>>>>>
>>>>> We have a fairly high number of failures there, because we currently
>>>>> have some issues downloading the maven artifacts [2]. I'm already working
>>>>> with Chesnay on merging a fix for that.
>>>>>
>>>>>
>>>>> [1] https://github.com/apache/flink/commit/1c86b8b9dd05615a3b2600984db738a9bf388259
>>>>> [2] https://issues.apache.org/jira/browse/FLINK-16720
>>>>>
>>>>>
>>>>>
>>>>> On Wed, Mar 25, 2020 at 4:48 PM Chesnay Schepler <ches...@apache.org> wrote:
>>>>>
>>>>>> The easiest way to disable travis for pushes is to remove all builds
>>>>>> from the .travis.yml with a push/pr condition.
>>>>>>
>>>>>> On 25/03/2020 15:03, Robert Metzger wrote:
>>>>>>> Thank you for the feedback so far.
>>>>>>>
>>>>>>> Responses to the items Chesnay raised:
>>>>>>>
>>>>>>>> - by virtue of maintaining the past 2 releases we will have to
>>>>>>>> maintain any Travis infrastructure as long as 1.10 is supported,
>>>>>>>> i.e., until 1.12
>>>>>>> Okay. I wasn't sure about the exact policy there.
>>>>>>>
>>>>>>>
>>>>>>>> - the azure setup doesn't appear to be equivalent yet since the java
>>>>>>>> e2e profile isn't setting the hadoop switch (-Pe2e-hadoop), as a
>>>>>>>> result of which SQLClientKafkaITCase isn't run
>>>>>>>>
>>>>>>> I filed a ticket to address this:
>>>>>>> https://issues.apache.org/jira/browse/FLINK-16778
>>>>>>>
>>>>>>>
>>>>>>>> - the nightly scripts still seem to be using a maven version other
>>>>>>>> than 3.2.5; from today on master:
>>>>>>>> 2020-03-25T05:31:52.7412964Z [INFO] --------< org.apache.flink:flink-end-to-end-tests-common-kafka >--------
>>>>>>>> 2020-03-25T05:31:52.7413854Z [INFO] Building flink-end-to-end-tests-common-kafka 1.11-SNAPSHOT [39/46]
>>>>>>>> 2020-03-25T05:31:52.7414689Z [INFO] --------------------------------[ jar ]---------------------------------
>>>>>>>> 2020-03-25T05:31:52.7518360Z [INFO]
>>>>>>>> 2020-03-25T05:31:52.7519770Z [INFO] --- maven-checkstyle-plugin:2.17:check (validate) @ flink-end-to-end-tests-common-kafka ---
>>>>>>>>
>>>>>>> I'm planning to address this as part of
>>>>>>> https://issues.apache.org/jira/browse/FLINK-16411, where I work on
>>>>>>> centralizing all mvn invocations.
>>>>>>>
>>>>>>>
>>>>>>>> - there is no real benefit in retiring the travis support in CiBot;
>>>>>>>> the important part is whether Travis is run or not for pull requests.
>>>>>>>> From what I can tell though azure seems to be working fine for pull
>>>>>>>> requests, so +1 to at least disable the travis PR runs.
>>>>>>> So we disable Travis for https://github.com/flink-ci/flink ? I will
>>>>>>> do it once there are no new concerns and the above tickets are resolved.
>>>>>>>
>>>>>>> What about disabling travis for master pushes? (e.g. removing the
>>>>>>> .travis.yml file from master)?
>>>>>>>
>>>>>>>
>>>>>>> @Dian:
>>>>>>> Thanks a lot for your feedback.
>>>>>>>
>>>>>>>> - The report of Azure is still not viewable[1] (I noticed that Hequn
>>>>>>>> has also reported this issue in another thread). This is very useful
>>>>>>>> information.
>>>>>>> You are referring to the emails sent to builds@f.a.o, right?
>>>>>>> I have reported this both as a bug [1] and a feature request [2] to
>>>>>>> Azure. But I don't believe they will resolve this issue anytime soon.
>>>>>>> Azure has a notifications API that we could use to build a service that
>>>>>>> sends emails to that list, but I feel that this is really a waste of
>>>>>>> time.
>>>>>>> The URL in the link even contains the ID of the build. We would just
>>>>>>> need to extract this ID and generate the appropriate URL. I will try to
>>>>>>> directly reach the product management of AZP, maybe I can get some
>>>>>>> attention from there.
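The ID extraction mentioned above could look roughly like the sketch below. It assumes that the `builduri` query parameter always ends in `/Build/Build/<id>` (as in the notification link quoted in this thread), that the project is named `Flink`, and that the viewable page follows the `_build/results?buildId=<id>` pattern; the function name is hypothetical and none of this is confirmed by the thread.

```python
import re
from urllib.parse import parse_qs, urlparse


def build_results_url(notification_url: str, project: str = "Flink") -> str:
    """Turn a notification link into a build results link (assumed URL layout)."""
    parsed = urlparse(notification_url)
    # parse_qs also decodes percent-encoding, e.g. %3a%2f%2f%2f becomes :///
    build_uri = parse_qs(parsed.query)["builduri"][0]
    match = re.search(r"/Build/Build/(\d+)$", build_uri)
    if match is None:
        raise ValueError(f"no build ID found in {build_uri!r}")
    build_id = match.group(1)
    # The first path segment of the notification link is the organization.
    organization = parsed.path.strip("/").split("/")[0]
    return (
        f"https://dev.azure.com/{organization}/{project}"
        f"/_build/results?buildId={build_id}"
    )


# Shortened form of the link from the thread (tracking_data omitted).
example = (
    "https://dev.azure.com/rmetzger/web/build.aspx"
    "?pcguid=03e2a4fd-f647-46c5-a324-527d2c2984ce"
    "&builduri=vstfs%3a%2f%2f%2fBuild%2fBuild%2f6593"
)
print(build_results_url(example))
```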
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> [1] https://developercommunity.visualstudio.com/content/problem/957778/third-parties-are-unable-to-access-notification-li.html?childToView=960403#comment-960403
>>>>>>> [2] https://developercommunity.visualstudio.com/content/idea/960472/third-parties-are-unable-to-access-notification-li-1.html
>>>>>>>
>>>>>>>
>>>>>>> On Wed, Mar 25, 2020 at 10:34 AM Chesnay Schepler <ches...@apache.org> wrote:
>>>>>>>
>>>>>>>> It was left out since it adds significant additional complexity and
>>>>>>>> the value is dubious at best for PRs that aren't merged shortly after
>>>>>>>> the build has finished.
>>>>>>>>
>>>>>>>> On 25/03/2020 10:28, Dian Fu wrote:
>>>>>>>>> Thanks for the information. I'm sorry that I wasn't aware of this
>>>>>>>>> before. I have checked the Travis build log and confirmed that this
>>>>>>>>> is true.
>>>>>>>>> @Chesnay Are there any specific reasons for this, and is it possible
>>>>>>>>> to add this back for Azure Pipelines?
>>>>>>>>> Thanks,
>>>>>>>>> Dian
>>>>>>>>>
>>>>>>>>>> On March 25, 2020, at 4:43 PM, Chesnay Schepler <ches...@apache.org> wrote:
>>>>>>>>>>
>>>>>>>>>> @Dian we haven't been rebasing PRs against master for months,
>>>>>>>>>> ever since we switched to CiBot.
>>>>>>>>>> On 25/03/2020 09:29, Dian Fu wrote:
>>>>>>>>>>> Hi Robert,
>>>>>>>>>>>
>>>>>>>>>>> Thanks a lot for your great work!
>>>>>>>>>>>
>>>>>>>>>>> Overall I'm +1 to switch to Azure as the primary CI tool if it's
>>>>>>>>>>> stable enough, as I think there is no need to run both Travis and
>>>>>>>>>>> Azure for one single PR.
>>>>>>>>>>> However, there are still some improvements to make, and it would be
>>>>>>>>>>> great if these issues could be addressed before fully switching to
>>>>>>>>>>> Azure:
>>>>>>>>>>> - The report of Azure is still not viewable[1] (I noticed that
>>>>>>>>>>> Hequn has also reported this issue in another thread). This is very
>>>>>>>>>>> useful information.
>>>>>>>>>>> - For the PR tests of the Azure pipeline, it seems that the PR is
>>>>>>>>>>> not rebased onto the master code before running the tests.
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Dian
>>>>>>>>>>>
>>>>>>>>>>> [1] https://dev.azure.com/rmetzger/web/build.aspx?pcguid=03e2a4fd-f647-46c5-a324-527d2c2984ce&builduri=vstfs%3a%2f%2f%2fBuild%2fBuild%2f6593&tracking_data=eyJTb3VyY2UiOiJFbWFpbCIsIlR5cGUiOiJOb3RpZmljYXRpb24iLCJTSUQiOiIzMzk0MzciLCJTVHlwZSI6IkdSUCIsIlJlY2lwIjoxLCJfeGNpIjp7Ik5JRCI6NDAyODQ3NzksIk1SZWNpcCI6Im0wPTEgIiwiQWN0IjoiMTNjNDc3YWMtZTBjYS00MjJkLTkxOTItZWI0NzFkZmUzMWY0In0sIkVsZW1lbnQiOiJoZXJvL2N0YSJ9
>>>>>>>>>>>> On March 25, 2020, at 3:33 PM, Chesnay Schepler <ches...@apache.org> wrote:
>>>>>>>>>>>> Some thoughts:
>>>>>>>>>>>> - by virtue of maintaining the past 2 releases we will have to
>>>>>>>>>>>> maintain any Travis infrastructure as long as 1.10 is supported,
>>>>>>>>>>>> i.e., until 1.12
>>>>>>>>>>>> - the azure setup doesn't appear to be equivalent yet since the java
>>>>>>>>>>>> e2e profile isn't setting the hadoop switch (-Pe2e-hadoop), as a
>>>>>>>>>>>> result of which SQLClientKafkaITCase isn't run
>>>>>>>>>>>> - the nightly scripts still seem to be using a maven version other
>>>>>>>>>>>> than 3.2.5; from today on master:
>>>>>>>>>>>> 2020-03-25T05:31:52.7412964Z [INFO] --------< org.apache.flink:flink-end-to-end-tests-common-kafka >--------
>>>>>>>>>>>> 2020-03-25T05:31:52.7413854Z [INFO] Building flink-end-to-end-tests-common-kafka 1.11-SNAPSHOT [39/46]
>>>>>>>>>>>> 2020-03-25T05:31:52.7414689Z [INFO] --------------------------------[ jar ]---------------------------------
>>>>>>>>>>>> 2020-03-25T05:31:52.7518360Z [INFO]
>>>>>>>>>>>> 2020-03-25T05:31:52.7519770Z [INFO] --- maven-checkstyle-plugin:2.17:check (validate) @ flink-end-to-end-tests-common-kafka ---
>>>>>>>>>>>> - there is no real benefit in retiring the travis support in CiBot;
>>>>>>>>>>>> the important part is whether Travis is run or not for pull requests.
>>>>>>>>>>>> From what I can tell though azure seems to be working fine for pull
>>>>>>>>>>>> requests, so +1 to at least disable the travis PR runs.
>>>>>>>>>>>> On 23/03/2020 14:48, Robert Metzger wrote:
>>>>>>>>>>>>> Hey devs,
>>>>>>>>>>>>>
>>>>>>>>>>>>> I would like to discuss whether it makes sense to fully switch to
>>>>>>>>>>>>> Azure Pipelines and phase out our Travis integration.
>>>>>>>>>>>>> More information on our Azure integration can be found here:
>>>>>>>>>>>>> https://cwiki.apache.org/confluence/display/FLINK/2020/03/22/Migrating+Flink%27s+CI+Infrastructure+from+Travis+CI+to+Azure+Pipelines
>>>>>>>>>>>>>
>>>>>>>>>>>>> Travis will stay for the release-1.10 and older branches, as I have
>>>>>>>>>>>>> set up Azure only for the master branch.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Proposal:
>>>>>>>>>>>>> - We keep the flinkbot infrastructure supporting both Travis and
>>>>>>>>>>>>> Azure around while we still receive pull requests and pushes for
>>>>>>>>>>>>> the "master" and "release-1.10" branches.
>>>>>>>>>>>>> - We remove the travis-specific files from "master", so that builds
>>>>>>>>>>>>> are not triggered anymore.
>>>>>>>>>>>>> - Once we receive no more builds at Travis (because 1.11 has been
>>>>>>>>>>>>> released), we remove the remaining travis-related infrastructure.
>>>>>>>>>>>>>
>>>>>>>>>>>>> What do you think?
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Best,
>>>>>>>>>>>>> Robert
>>>>>>>>
>>>>>>
