Thanks Vinay for the initiative. It makes sense to add support for different architectures.
+1 for the branch idea. Good luck!

-Ayush

> On 03-Sep-2019, at 6:19 AM, 张铎(Duo Zhang) <palomino...@gmail.com> wrote:
>
> For HBase, we purged all the protobuf-related things from the public API,
> and then upgraded to a shaded and relocated version of protobuf. We have
> created a repo for this:
>
> https://github.com/apache/hbase-thirdparty
>
> But since the hadoop dependencies still pull in the protobuf 2.5 jars, our
> coprocessors are still on protobuf 2.5. Recently we opened a discussion
> on how to deal with upgrading the coprocessors. Glad to see that the
> hadoop community is also willing to solve the problem.
>
> Anu Engineer <aengin...@cloudera.com.invalid> wrote on Tue, Sep 3, 2019 at 1:23 AM:
>
>> +1 for the branch idea. Just FYI, your biggest problem is proving that
>> Hadoop and the downstream projects work correctly after you upgrade core
>> components like Protobuf.
>> So while branching and working on a branch is easy, merging back after you
>> upgrade some of these core components is insanely hard. You might want to
>> make sure that the community buys into upgrading these components in trunk.
>> That way we will get testing, and downstream components will notice when
>> things break.
>>
>> That said, I have lobbied for the upgrade of Protobuf for a really long
>> time; I have argued that 2.5 is out of support and we cannot stay on that
>> branch forever, or else we need to take ownership of the Protobuf 2.5 code
>> base. It has been rightly pointed out to me that while all the arguments I
>> make are correct, it is a very complicated task to upgrade Protobuf, and
>> the worst part is we will not even know what breaks until downstream
>> projects pick up these changes and work against us.
>>
>> If we work off Hadoop version 3, and assume that we have "shading" in
>> place for all deployments, it might be possible to get there; still a
>> daunting task.
>>
>> So best of luck with the branch approach, but please remember: merging
>> back will be hard. Just my 2 cents.
>>
>> - Anu
>>
>> On Sun, Sep 1, 2019 at 7:40 PM Zhenyu Zheng <zhengzhenyul...@gmail.com>
>> wrote:
>>
>>> Hi,
>>>
>>> Thanks Vinaya for bringing this up and thanks Sheng for the idea. A
>>> separate branch with its own ARM CI seems a really good idea.
>>> By doing this we won't break any of the ongoing development in trunk,
>>> and a CI can be a very good way to show what the current problems are and
>>> what has been fixed. It will also provide a very good view for
>>> contributors who are interested in working on this. We can finally merge
>>> the branch back to trunk once the community thinks it is good enough and
>>> stable enough. We can donate ARM machines to the existing CI system for
>>> the job.
>>>
>>> I wonder if this approach is possible?
>>>
>>> BR,
>>>
>>> On Thu, Aug 29, 2019 at 11:29 AM Sheng Liu <liusheng2...@gmail.com>
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>> Thanks Vinay for bringing this up. I am a member of the "Openlab"
>>>> community mentioned by Vinay. I am working on building and testing
>>>> Hadoop components on an aarch64 server these days. Besides the missing
>>>> ARM platform dependencies (issues #1, #2 and #3 mentioned by Vinay),
>>>> other similar issues have also been found; for example, the "PhantomJS"
>>>> dependency package is also missing for aarch64.
>>>>
>>>> To promote ARM support for Hadoop, we have discussed and hope to add
>>>> an ARM-specific CI to the Hadoop repo. We are not sure whether there
>>>> would be any potential effect or conflict on the trunk branch, so maybe
>>>> creating an ARM-specific branch for this work is a better choice. What
>>>> do you think?
>>>>
>>>> Hope to hear thoughts from you :)
>>>>
>>>> BR,
>>>> Liu sheng
>>>>
>>>> Vinayakumar B <vinayakum...@apache.org> wrote on Tue, Aug 27, 2019 at 5:34 AM:
>>>>
>>>>> Hi Folks,
>>>>>
>>>>> ARM has lately become well known for its processing capability and has
>>>>> the potential to run Bigdata workloads.
>>>>> Many users have been moving to ARM machines due to their low cost.
>>>>>
>>>>> In the past there were attempts to compile Hadoop on ARM (Raspberry Pi)
>>>>> for experimental purposes. Today the ARM architecture is taking on some
>>>>> of the server-side processing as well. So there will be (or already is)
>>>>> a real need for Hadoop to support the ARM architecture as well.
>>>>>
>>>>> There are a bunch of users who are trying out building Hadoop on ARM,
>>>>> trying to add ARM CI to hadoop, and facing issues[1]. Also some
>>>>>
>>>>> As of today, Hadoop does not compile on ARM due to the issues below,
>>>>> found from testing done in openlab in [2].
>>>>>
>>>>> 1. Protobuf :
>>>>> -------------------
>>>>> The Hadoop project (and also some downstream projects) is stuck on
>>>>> protobuf version 2.5.0 for backward compatibility reasons.
>>>>> Protobuf-2.5.0 is no longer maintained in the community. Protobuf 3.x
>>>>> is being actively and widely adopted, and it still provides wire
>>>>> compatibility for proto2 messages. However, there are some compilation
>>>>> issues in the generated Java code, which can induce problems
>>>>> downstream. For this reason the protobuf upgrade from 2.5.0 was not
>>>>> taken up.
>>>>> From 3.0.0 onwards, hadoop supports shading of libraries to avoid
>>>>> classpath problems in downstream projects.
>>>>> There are patches available to fix compilation in Hadoop. But we need
>>>>> to find a way to upgrade protobuf to the latest version and still
>>>>> maintain the downstream's classpath using the shading feature of the
>>>>> Hadoop build.
>>>>>
>>>>> There is a Jira for the protobuf upgrade[3], created even before shade
>>>>> support was added to Hadoop. We now need to revisit the Jira and
>>>>> continue exploring possibilities.
>>>>>
>>>>> 2. leveldbjni:
>>>>> ---------------
>>>>> The current leveldbjni used in YARN does not support the ARM
>>>>> architecture. We need to check whether any of the future versions
>>>>> support ARM and whether hadoop can upgrade to that version.
>>>>>
>>>>> 3. hadoop-yarn-csi's dependency 'protoc-gen-grpc-java:1.15.1'
>>>>> -------------------------
>>>>> 'protoc-gen-grpc-java:1.15.1' does not provide an ARM executable by
>>>>> default in the maven repository. The workaround is to build it locally
>>>>> and keep it in the local maven repository.
>>>>> We need to check whether any future version of 'protoc-gen-grpc-java'
>>>>> has an ARM executable and whether hadoop-yarn-csi can upgrade to it.
>>>>>
>>>>> Once the compilation issues are solved, there might be many native-code
>>>>> related issues due to the different architectures.
>>>>> So to explore everything, we need to join hands and proceed.
>>>>>
>>>>> Let us discuss and check whether anybody else out there also needs
>>>>> Hadoop support on ARM architectures and is ready to lend their hands
>>>>> and time to this work.
>>>>>
>>>>> [1] https://issues.apache.org/jira/browse/HADOOP-16358
>>>>> [2]
>>>>> https://issues.apache.org/jira/browse/HADOOP-16358?focusedCommentId=16904887&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16904887
>>>>> [3] https://issues.apache.org/jira/browse/HADOOP-13363
>>>>>
>>>>> -Vinay
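
Regarding item 3 in Vinay's mail ('protoc-gen-grpc-java:1.15.1' has no ARM executable in the maven repository), here is a minimal sketch in Python of how one might work out which platform-specific file name to look for before deciding to build locally. It is only an illustration under stated assumptions: that the plugin binaries are published as `<artifactId>-<version>-<classifier>.exe`, and that classifiers follow the os-maven-plugin style (for example `linux-aarch_64`). The helper names and alias tables are hypothetical, cover only a few common platforms, and are not taken from the Hadoop build.

```python
import platform

# Simplified alias tables (assumption: modeled on os-maven-plugin style
# classifiers; only a handful of common platforms are covered here).
_OS_ALIASES = {"linux": "linux", "darwin": "osx", "windows": "windows"}
_ARCH_ALIASES = {
    "x86_64": "x86_64",
    "amd64": "x86_64",
    "aarch64": "aarch_64",
    "arm64": "aarch_64",
}


def detect_classifier(system=None, machine=None):
    """Return a classifier such as 'linux-aarch_64' for the given platform.

    When no arguments are passed, the values for the current machine are used.
    """
    system = (system or platform.system()).lower()
    machine = (machine or platform.machine()).lower()
    os_name = _OS_ALIASES.get(system)
    arch = _ARCH_ALIASES.get(machine)
    if os_name is None or arch is None:
        raise ValueError(f"unsupported platform: {system}/{machine}")
    return f"{os_name}-{arch}"


def plugin_artifact_name(version, classifier):
    """Build the expected file name of the protoc-gen-grpc-java executable."""
    return f"protoc-gen-grpc-java-{version}-{classifier}.exe"


if __name__ == "__main__":
    classifier = detect_classifier()
    print(plugin_artifact_name("1.15.1", classifier))
```

If the printed file name is not available in the repository for the version in use, the local-build workaround described in item 3 still applies.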