Until the ARM CI is ready, I can turn off the CI test that runs for each PR
and have it triggered only by a PR comment. That is quite easy to do in OpenLab.

OpenLab has many job pipelines[1]. Right now I use the `check` pipeline in
https://github.com/apache/flink/pull/9416. Its trigger contains both
github_action and github_comment[2]. I can create a new pipeline for Flink
whose trigger contains only github_comment, like:

trigger:
  github:
    - event: pull_request
      action: comment
      comment: (?i)^\s*recheck_arm_build\s*$

This way the ARM job will not run for every PR. It will run only for PRs
that receive a `recheck_arm_build` comment.
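
To illustrate which comments would start the job, here is a minimal Python
sketch. It only exercises the regular expression from the trigger above. The
function name `triggers_arm_build` is made up for this example, and I am
assuming the pattern is applied to the whole comment body; how the CI system
actually evaluates it may differ.

```python
import re

# Pattern copied from the trigger definition above.
# (?i) makes it case-insensitive; ^\s* and \s*$ allow surrounding whitespace.
RECHECK = re.compile(r"(?i)^\s*recheck_arm_build\s*$")


def triggers_arm_build(comment: str) -> bool:
    """Hypothetical helper: True if the comment matches the trigger pattern."""
    return RECHECK.search(comment) is not None


if __name__ == "__main__":
    samples = [
        "recheck_arm_build",
        "  Recheck_ARM_Build \n",
        "please recheck_arm_build",
        "recheck",
    ]
    for text in samples:
        print(repr(text), triggers_arm_build(text))
```

Because the pattern is anchored at both ends, the comment has to consist of
the keyword alone (apart from whitespace); mentioning it inside a longer
sentence would not start the job.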

Then, once the ARM CI is ready, I can add the per-PR trigger back.


Nightly tests can be added as well, of course. OpenLab has a kind of job
called a `periodic job`, which we can use for Flink's nightly tests.
If any error occurs, the report can be sent to bui...@flink.apache.org as
well.

[1]:
https://github.com/theopenlab/openlab-zuul-jobs/blob/master/zuul.d/pipelines.yaml
[2]:
https://github.com/theopenlab/openlab-zuul-jobs/blob/master/zuul.d/pipelines.yaml#L10-L19

On Mon, Aug 26, 2019 at 6:13 PM, Stephan Ewen <se...@apache.org> wrote:

> Adding CI builds for ARM only makes sense when we actually take them into
> account as "blocking a merge", otherwise there is no point in having them.
> So we would need to be prepared to do that.
>
> The cases where something runs in UNIX/x64 but fails on ARM are few, and
> so far they seem to have been related to libraries or some magic that tries
> to do system dependent actions outside Java.
>
> One worthwhile discussion could be whether to run the ARM CI builds as part
> of the nightly tests, not on every commit.
> There are a lot of nightly tests, for example for different Java / Scala /
> Hadoop versions.
>
> On Mon, Aug 26, 2019 at 10:46 AM Xiyuan Wang <wangxiyuan1...@gmail.com>
> wrote:
>
> > Sorry, maybe my wording was misleading.
> >
> > We are just starting to add ARM support, so the CI is non-voting at the
> > moment to avoid blocking normal Flink development.
> >
> > But once the ARM CI works well and is stable enough, we should mark it as
> > voting. That means that in the future, if the ARM test fails in a PR, the
> > PR cannot be merged. The test log should tell developers what the error
> > is. If a developer needs to debug the details on an ARM VM, OpenLab can
> > provide one.
> >
> > Adding ARM CI can make sure that Flink supports ARM natively.
> >
> > I left a workflow in the PR; I'd like to repeat it here:
> >
> >    1. Add the basic build script to ensure that the CI system and build
> >    job work as expected. The job should be marked as non-voting first,
> >    which means a CI test failure won't block a Flink PR from being merged.
> >    2. Add the test script to run the unit/integration tests. At this step
> >    the -fn (--fail-never) parameter will be added to mvn test, so that
> >    the full set of Flink test cases runs and we can find which tests
> >    fail on ARM.
> >    3. Fix the test failures one by one.
> >    4. Once all the tests pass, remove the -fn parameter and keep
> >    watching the CI's status for some days. If bugs come up, fix them
> >    as we usually do for travis-ci.
> >    5. Once the CI is stable enough, remove the non-voting tag, so that
> >    the ARM CI will be the same as travis-ci: one of the gates for Flink
> >    PRs.
> >    6. Finally, the Flink community can announce and release a Flink ARM
> >    version.
> >
> >
> > On Mon, Aug 26, 2019 at 2:25 PM, Chesnay Schepler <ches...@apache.org> wrote:
> >
> >> I'm sorry, but if these issues are only fixed later anyway I see no
> >> reason to run these tests on each PR. We're just adding noise to each PR
> >> that everyone will just ignore.
> >>
> >> I'm curious as to the benefit of having this directly in Flink; why
> >> aren't the ARM builds run outside of the Flink project, and fixes for it
> >> provided?
> >>
> >> It seems to me like nothing about these arm builds is actually handled
> >> by the Flink project.
> >>
> >> On 26/08/2019 03:43, Xiyuan Wang wrote:
> >> > Thanks to Stephan for bringing up this topic.
> >> >
> >> > The package build jobs work well now. I have a simple online demo
> >> > which is built and run on an ARM VM. Feel free to give it a try[1].
> >> >
> >> > As the first step for ARM support, maybe it's good to add them now.
> >> >
> >> > As for the next step, the test part is still broken. It relates to
> >> > some points we found:
> >> >
> >> > 1. Some unit tests fail[1] because of the Java code. This kind of
> >> > failure can be fixed easily.
> >> > 2. Some tests fail because they depend on third-party libraries[2],
> >> > including frocksdb, MapR Client and Netty, which have no ARM release.
> >> >      a. Frocksdb: I'm testing it locally now with `make check_some` and
> >> > `make jtest`, similar to its travis job. Three tests fail under `make
> >> > check_some`. Please see the ticket for more details. Once the tests
> >> > pass, frocksdb can release an ARM package.
> >> >      b. MapR Client: this belongs to the MapR company. At this moment,
> >> > maybe we should skip MapR support for Flink on ARM.
> >> >      c. Netty: Netty actually runs well on our ARM machine. We will ask
> >> > the Netty community to release ARM support. If they do not want to,
> >> > OpenLab will host a Maven repository for some common libraries on ARM.
> >> >
> >> >
> >> > For Chesnay's concern:
> >> >
> >> > Firstly, the OpenLab team will keep maintaining and fixing the ARM CI.
> >> > That means that once a build or test fails, we'll fix it at once.
> >> > Secondly, OpenLab can provide ARM VMs to everyone for reproducing and
> >> > testing. You just need to create a Test Request issue in openlab[3].
> >> > Then we'll create ARM VMs for you, and you can log in and do whatever
> >> > you want.
> >> >
> >> > Does it make sense?
> >> >
> >> > [1]: http://114.115.168.52:8081/#/overview
> >> > [1]: https://issues.apache.org/jira/browse/FLINK-13449
> >> >        https://issues.apache.org/jira/browse/FLINK-13450
> >> > [2]: https://issues.apache.org/jira/browse/FLINK-13598
> >> > [3]: https://github.com/theopenlab/openlab/issues/new/choose
> >> >
> >> >
> >> >
> >> >
> >> > On Sat, Aug 24, 2019 at 12:10 AM, Chesnay Schepler <ches...@apache.org> wrote:
> >> >
> >> >> I'm wondering what we are supposed to do if the build fails?
> >> >> We aren't providing any guides on setting up an ARM dev environment,
> >> >> so reproducing it locally isn't possible.
> >> >>
> >> >> On 23/08/2019 17:55, Stephan Ewen wrote:
> >> >>> Hi all!
> >> >>>
> >> >>> As part of the Flink on ARM effort, there is a pull request that
> >> >>> triggers a build on OpenLabs CI for each push and runs tests on ARM
> >> >>> machines.
> >> >>>
> >> >>> Currently that build is roughly equivalent to what the "core" and
> >> >>> "tests" profiles do on Travis.
> >> >>> The result will be posted to the PR comments, similar to the Flink
> >> >>> Bot's Travis build result.
> >> >>> The build currently passes :-) so Flink seems to be okay on ARM.
> >> >>>
> >> >>> My suggestion would be to try and add this and gather some
> >> >>> experience with it.
> >> >>> The Travis build results should be our "ground truth" and the ARM CI
> >> >>> (openlabs CI) would be "informational only" at the beginning, but
> >> >>> helping us understand when we break ARM support.
> >> >>>
> >> >>> You can see this in the PR that adds the openlabs CI config:
> >> >>> https://github.com/apache/flink/pull/9416
> >> >>>
> >> >>> Any objections?
> >> >>>
> >> >>> Best,
> >> >>> Stephan
> >> >>>
> >> >>
> >>
> >>
>
