Hey Xiyuan,
thanks a lot for checking out Travis' ARM-based offering.

As part of the "Reducing build times" discussion [1], we have considered moving
away from Travis to Azure Pipelines. What I want to say is that Travis
might not be important for the Flink community in the long run.
I think running the ARM tests on the OpenLab infrastructure as a cron job
is fine, until all tests are passing on ARM. Once we have achieved full ARM
compatibility, we can consider integrating the ARM build into the regular
check (through Flinkbot) to ensure that we maintain the architecture
support.

Thank you also for posting links to pull requests fixing the test issues. I
hope a committer soon finds time to take a look.

I have the feeling that there's currently no committer who feels
responsible for helping to get this effort done. I'm only guessing why we
have this situation, but I see the following potential reasons:
1. There are too many other competing pull requests that are considered
more important.
2. People don't believe that ARM support is important for Flink. It seems
that Apache Spark used to have ARM support, and is now re-adding it after
OpenLab reached out to them as well [2]. There seem to be some research
projects [3] or some marketing? [4] for it, but the number of people
actually asking for it is unclear to me at this point.
I am not aware of any data or anecdotal knowledge of ARM-based server
platforms being adopted in Flink's space. As long as we don't have users
asking for it, it remains a bet. For Apache Kafka, I could not find any
evidence that goes beyond toy projects.
3. Since OpenLab apparently reached out to multiple open source projects
regarding ARM support, I wonder about OpenLab's long-term commitment and
motivation. I assume your goal is to help grow the adoption of the ARM
CPU architecture by making sure as many tools as possible are supported on
it. I don't want to stand in your way of growing ARM's adoption, and the
benefit for Flink is also clear: we will potentially reach more users, and
we might get additional attention for the project. On the other hand, I see
risks, such as OpenLab losing interest / funding / ... in the middle of
the project.

I personally don't feel comfortable reviewing the changes, because I
haven't been very active in the day-to-day development of Flink recently,
and I don't want to make changes in code areas I'm not fully confident in.
But I hope that this discussion sheds some light on the reasons for
the low activity on the effort.

Best,
Robert

[1] https://lists.apache.org/thread.html/4d7e6b1fd5c570973a68de91438dd9045afdae1685b1d1467b2149ce@%3Cdev.flink.apache.org%3E
[2] http://apache-spark-developers-list.1001551.n3.nabble.com/Re-Ask-for-ARM-CI-for-spark-td27415.html#a27440
[3] https://developer.arm.com/-/media/Arm%20Developer%20Community/Images/White%20Paper%20and%20Webinar%20Images/HPC%20White%20Papers/UCAM_Arm_Spark_2017.pdf?revision=6e22a6b7-16a0-4478-8eca-2835b69c7305
[4] http://www.sparkonarm.com/


On Mon, Oct 21, 2019 at 10:22 AM Xiyuan Wang <wangxiyuan1...@gmail.com>
wrote:

> According to my tests, the Travis ARM CI is not ready. For example:
> 1. Java 8 support is missing:
> https://travis-ci.community/t/about-the-arm-cpu-architecture-category/5336/4
> 2. The cache function is not supported:
> https://travis-ci.community/t/no-cache-support-on-arm64/5416
>
> Without the cache, the compile job timed out after 50 min. It's not a good
> time to use the Travis ARM CI at the moment, whereas OpenLab doesn't have
> any such limitations (it can provide 16U16G VMs with no time limit for CI
> jobs).
>
> Just FYI, any response is welcome.
>
> Thanks. Regards.
>
> On Wed, Oct 16, 2019 at 10:37 AM Xiyuan Wang <wangxiyuan1...@gmail.com> wrote:
>
> > Hi all,
> >
> > Recently Travis announced that ARM support is in alpha release [1].
> > Since Flink is already integrated with Travis, I think it would be quite
> > easy for Flink to use it for ARM CI.
> >
> > Maybe some of you know that I'm working on Flink ARM testing and support.
> > I previously suggested using OpenLab [2] as the ARM CI infrastructure.
> > Though it's not hard to use OpenLab, it would still introduce some new
> > concepts and burden to Flink. The Flink team has another choice now.
> >
> > As discussed before, we can still run the ARM CI as a cron job first. I
> > have been running a POC e2e test in OpenLab for some days [3] (of course,
> > it can be changed to Travis).
> >
> > Following the Travis x86 tests, it includes:
> >
> > flink-end-to-end-test-part1
> >     split_checkpoints.sh and split_sticky.sh
> > flink-end-to-end-test-part2
> >     split_heavy.sh and split_ha.sh
> > flink-end-to-end-test-part3
> >     split_misc.sh and split_misc_hadoopfree.sh
> >
> > part1 and part2 run well. part3 is not stable; I need more time to fix
> > it. The container part is not included because of problem 5 mentioned
> > below.
> >
> > I did some hacks to make sure the jobs pass. They include:
> > 1. Frocksdb ARM package:
> > https://issues.apache.org/jira/browse/FLINK-13598 (Not solved)
> > 2. PrometheusReporterEndToEndITCase doesn't support ARM arch:
> > https://issues.apache.org/jira/browse/FLINK-14086 (PR for fix:
> > https://github.com/apache/flink/pull/9768)
> > 3. Elasticsearch Xpack Machine Learning doesn't support ARM:
> > https://issues.apache.org/jira/browse/FLINK-14126 (PR for fix:
> > https://github.com/apache/flink/pull/9765)
> > 4. maven-shade-plugin 3.2.1 doesn't work on ARM for Flink (Fixed, thanks
> > @Dian Fu)
> > 5. flink e2e container test doesn't support ARM:
> > https://issues.apache.org/jira/browse/FLINK-14241 (PR for fix:
> > https://github.com/apache/flink/pull/9782)
> >
> > No matter which CI Flink will use, all the bugs mentioned above should be
> > fixed. Please help review these PRs, and if you have any questions,
> > please let me know.
> >
> > Whichever CI Flink chooses, I'd like to keep working on Flink ARM
> > support and keep testing and fixing ARM-related bugs.
> >
> > Thanks very much.
> >
> >
> > [1]: https://blog.travis-ci.com/2019-10-07-multi-cpu-architecture-support
> > [2]: https://openlabtesting.org/
> > [3]: http://status.openlabtesting.org/builds?project=apache%2Fflink
> >
>
