Hi all, We received comments and suggestions from contributors, committers and PMC members regarding the proposal: https://docs.google.com/document/d/1kE_f-r-ANh9qOeapdPwQPHhaJTS7IMiqDQAS8ESi4TA/edit?ts=5d529ec0
@Vinod Kumar Vavilapalli <vino...@apache.org> could you suggest what we should do next? Could you help send this to the ASF board?

Thanks,
Wangda Tan

On Tue, Aug 13, 2019 at 4:36 PM Wangda Tan <wheele...@gmail.com> wrote:

> Hi folks,
>
> I just drafted a proposal that I intend to send to the PMC list and the board for their thoughts. Thanks to Xun Liu for providing thoughts about future directions/architecture, and to Keqiu Hu for reviews.
>
> Title: "Apache Submarine for Apache Top-Level Project"
>
> https://docs.google.com/document/d/1kE_f-r-ANh9qOeapdPwQPHhaJTS7IMiqDQAS8ESi4TA/edit
>
> I plan to send it to the PMC list/board next Monday, so any comments/suggestions are welcome.
>
> Thanks,
> Wangda
>
> On Tue, Jul 30, 2019 at 6:01 PM 俊平堵 <dujunp...@gmail.com> wrote:
>
>> Thanks Vinod for these great suggestions. I agree with most of your comments above.
>>
>> "For the Apache Hadoop community, this will be treated simply as a code change and so needs a committer +1?" IIUC, this should be treated as a feature branch merge, so maybe 3 committer +1s are needed here according to https://hadoop.apache.org/bylaws.html?
>>
>> bq. Can somebody who has cycles and has been on the ASF lists for a while look into the process here?
>>
>> I can check with ASF members who have experience with this if no one has done so yet.
>>
>> Thanks,
>>
>> Junping
>>
>> On Mon, Jul 29, 2019 at 9:46 PM Vinod Kumar Vavilapalli <vino...@apache.org> wrote:
>>
>>> Looks like there's a meaningful push behind this.
>>>
>>> Given the desire is to fork off Apache Hadoop, you'd want to make sure this enthusiasm turns into building a real, independent, but more importantly a sustainable community.
>>>
>>> Given that there were two official releases off the Apache Hadoop project, I doubt if you'd need to go through the incubator process. Instead you can directly propose a new TLP to the ASF board. The last few times this happened were with ORC, and long before that with Hive, HBase etc.
>>> Can somebody who has cycles and has been on the ASF lists for a while look into the process here?
>>>
>>> For the Apache Hadoop community, this will be treated simply as a code change and so needs a committer +1? You can be more gentle by formally holding a vote once a process doc is written down.
>>>
>>> Back to the sustainable community point, as part of drafting this proposal, you'd definitely want to make sure all of the Apache Hadoop PMC/Committers can exercise their will to join this new project as PMC/Committers respectively without any additional constraints.
>>>
>>> Thanks
>>> +Vinod
>>>
>>> > On Jul 25, 2019, at 1:31 PM, Wangda Tan <wheele...@gmail.com> wrote:
>>> >
>>> > Thanks everybody for sharing your thoughts. I saw positive feedback from 20+ contributors!
>>> >
>>> > So I think we should move it forward. Any suggestions about what we should do?
>>> >
>>> > Best,
>>> > Wangda
>>> >
>>> > On Mon, Jul 22, 2019 at 5:36 PM neo <n...@pingcap.com> wrote:
>>> >
>>> >> +1, This is neo from the TiDB & TiKV community.
>>> >> Thanks Xun for bringing this up.
>>> >>
>>> >> In TiKV, our open source distributed KV storage system and a CNCF project, Hadoop Submarine's machine learning engine helps us optimize data storage and solve some problems with data hotspots and data shuffles.
>>> >>
>>> >> We are also preparing to use the Hadoop Submarine machine learning engine to improve the performance of TiDB, our open source distributed relational database.
>>> >>
>>> >> I think if Submarine can be independent, it will develop faster and better.
>>> >> Thanks to the Hadoop community for developing Submarine!
>>> >>
>>> >> Best Regards,
>>> >> neo
>>> >> www.pingcap.com / https://github.com/pingcap/tidb / https://github.com/tikv
>>> >>
>>> >> On Mon, Jul 22, 2019 at 4:07 PM Xun Liu <liu...@apache.org> wrote:
>>> >>
>>> >>> @adam.antal
>>> >>>
>>> >>> The Submarine development team has completed the following preparations:
>>> >>> 1. Established a temporary test repository on GitHub.
>>> >>> 2. Changed the package name of Hadoop Submarine from org.hadoop.submarine to org.submarine.
>>> >>> 3. Combined the LinkedIn/TonY code into the Hadoop Submarine module.
>>> >>> 4. Connected the GitHub repository to the Travis CI system and ran all test cases on it.
>>> >>> 5. Several Hadoop Submarine users completed a system test using the code in this repository.
>>> >>>
>>> >>> On Mon, Jul 22, 2019 at 9:38 AM 赵欣 <xinz...@seu.edu.cn> wrote:
>>> >>>
>>> >>>> Hi
>>> >>>>
>>> >>>> I am a teacher at Southeast University (https://www.seu.edu.cn/). Our major is electrical engineering. Our teaching teams and students use Hadoop Submarine for big data analysis and automation control of electrical equipment.
>>> >>>>
>>> >>>> Many thanks to the Hadoop community for providing us with machine learning tools like Submarine.
>>> >>>>
>>> >>>> I hope Hadoop Submarine keeps getting better and better.
>>> >>>>
>>> >>>> ==============================
>>> >>>> 赵欣
>>> >>>> 东南大学电气工程学院
>>> >>>> -----------------------------------------------------
>>> >>>> Zhao XIN
>>> >>>> School of Electrical Engineering
>>> >>>> ==============================
>>> >>>> 2019-07-18
>>> >>>>
>>> >>>> *From:* Xun Liu <liu...@apache.org>
>>> >>>> *Date:* 2019-07-18 09:46
>>> >>>> *To:* xinzhao <xinz...@seu.edu.cn>
>>> >>>> *Subject:* Fwd: Re: Any thoughts making Submarine a separate Apache project?
>>> >>>> >>> >>>> >>> >>>> ---------- Forwarded message --------- >>> >>>> 发件人: dashuiguailu...@gmail.com <dashuiguailu...@gmail.com> >>> >>>> Date: 2019年7月17日周三 下午3:17 >>> >>>> Subject: Re: Re: Any thoughts making Submarine a separate Apache >>> >> project? >>> >>>> To: Szilard Nemeth <snem...@cloudera.com.invalid>, runlin zhang < >>> >>>> runlin...@gmail.com> >>> >>>> Cc: Xun Liu <liu...@apache.org>, common-dev < >>> >>> common-...@hadoop.apache.org>, >>> >>>> yarn-dev <yarn-...@hadoop.apache.org>, hdfs-dev < >>> >>>> hdfs-dev@hadoop.apache.org>, mapreduce-dev < >>> >>>> mapreduce-...@hadoop.apache.org>, submarine-dev < >>> >>>> submarine-...@hadoop.apache.org> >>> >>>> >>> >>>> >>> >>>> +1 ,Good idea, we are very much looking forward to it. >>> >>>> >>> >>>> ------------------------------ >>> >>>> dashuiguailu...@gmail.com >>> >>>> >>> >>>> >>> >>>> *From:* Szilard Nemeth <snem...@cloudera.com.INVALID> >>> >>>> *Date:* 2019-07-17 14:55 >>> >>>> *To:* runlin zhang <runlin...@gmail.com> >>> >>>> *CC:* Xun Liu <liu...@apache.org>; Hadoop Common >>> >>>> <common-...@hadoop.apache.org>; yarn-dev < >>> yarn-...@hadoop.apache.org>; >>> >>>> Hdfs-dev <hdfs-dev@hadoop.apache.org>; mapreduce-dev >>> >>>> <mapreduce-...@hadoop.apache.org>; submarine-dev >>> >>>> <submarine-...@hadoop.apache.org> >>> >>>> *Subject:* Re: Any thoughts making Submarine a separate Apache >>> project? >>> >>>> +1, this is a very great idea. >>> >>>> As Hadoop repository has already grown huge and contains many >>> >> projects, I >>> >>>> think in general it's a good idea to separate projects in the early >>> >>> phase. >>> >>>> >>> >>>> >>> >>>> On Wed, Jul 17, 2019, 08:50 runlin zhang <runlin...@gmail.com> >>> wrote: >>> >>>> >>> >>>>> +1 ,That will be great ! 
>>> >>>>>
>>> >>>>>> On Jul 10, 2019, at 3:34 PM, Xun Liu <liu...@apache.org> wrote:
>>> >>>>>>
>>> >>>>>> Hi all,
>>> >>>>>>
>>> >>>>>> This is Xun Liu. I contribute to the Submarine project, which runs deep learning workloads alongside big data workloads on Hadoop clusters.
>>> >>>>>>
>>> >>>>>> A number of integrations of Submarine with other projects are finished or in progress, such as Apache Zeppelin, TonY, and Azkaban. The next step for Submarine is to integrate with more projects like Apache Arrow, Redis, MLflow, etc., to handle end-to-end machine learning use cases like model serving, notebook management, and advanced training optimizations (like auto parameter tuning, memory cache optimizations for large training datasets, etc.), and to run on other platforms like Kubernetes or natively in the cloud. LinkedIn also wants to donate the TonY project to Apache so we can put Submarine and TonY together in the same codebase (page 30: https://www.slideshare.net/xkrogen/hadoop-meetup-jan-2019-tony-tensorflow-on-yarn-and-beyond#30).
>>> >>>>>>
>>> >>>>>> This expands the scope of the original Submarine project in exciting new ways. Toward that end, would it make sense to create a separate Submarine project at Apache? This could speed adoption of Submarine and allow it to grow into a full-blown machine learning platform.
>>> >>>>>>
>>> >>>>>> There will be lots of technical details to work out, but any initial thoughts on this?
>>> >>>>>>
>>> >>>>>> Best Regards,
>>> >>>>>> Xun Liu
>>> >>>>>
>>> >>>>> ---------------------------------------------------------------------
>>> >>>>> To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
>>> >>>>> For additional commands, e-mail: common-dev-h...@hadoop.apache.org