Re: Any thoughts making Submarine a separate Apache project?

Wangda Tan Fri, 23 Aug 2019 19:14:21 -0700

Hi all,

We received comments and suggestions from contributors, committers and PMC
members regarding the proposal:
https://docs.google.com/document/d/1kE_f-r-ANh9qOeapdPwQPHhaJTS7IMiqDQAS8ESi4TA/edit?ts=5d529ec0


@Vinod Kumar Vavilapalli <[email protected]> could you provide suggestions
regarding what we should do next? Could you help to send this to the ASF
board?

Thanks,
Wangda Tan

On Tue, Aug 13, 2019 at 4:36 PM Wangda Tan <[email protected]> wrote:

> Hi folks,
>
> I just drafted a proposal that I intend to send to the PMC list and the
> board for feedback. Thanks to Xun Liu for providing thoughts on future
> directions/architecture, and to Keqiu Hu for reviewing it.
>
> Title: "Apache Submarine for Apache Top-Level Project"
>
>
> https://docs.google.com/document/d/1kE_f-r-ANh9qOeapdPwQPHhaJTS7IMiqDQAS8ESi4TA/edit
>
> I plan to send it to the PMC list/board next Monday, so any
> comments/suggestions are welcome.
>
> Thanks,
> Wangda
>
>
> On Tue, Jul 30, 2019 at 6:01 PM 俊平堵 <[email protected]> wrote:
>
>> Thanks Vinod for these great suggestions. I agree with most of your
>> comments above.
>>  "For the Apache Hadoop community, this will be treated simply as
>> code-change and so need a committer +1?". IIUC, this should be treated as
>> a feature branch merge, so three committer +1s may be needed here
>> according to https://hadoop.apache.org/bylaws.html?
>>
>> bq. Can somebody who has cycles and has been on the ASF lists for a while
>> look into the process here?
>> I can check with ASF members who have experience with this, if no one has
>> done so yet.
>>
>> Thanks,
>>
>> Junping
>>
>> On Mon, Jul 29, 2019 at 9:46 PM Vinod Kumar Vavilapalli <[email protected]> wrote:
>>
>>> Looks like there's a meaningful push behind this.
>>>
>>> Given that the desire is to fork off from Apache Hadoop, you'd want to
>>> make sure this enthusiasm turns into building a real, independent and,
>>> more importantly, sustainable community.
>>>
>>> Given that there were two official releases off the Apache Hadoop
>>> project, I doubt you'd need to go through the incubator process. Instead
>>> you can directly propose a new TLP to the ASF board. The last time this
>>> happened was with ORC, and long before that with Hive, HBase etc. Can
>>> somebody who has cycles and has been on the ASF lists for a while look
>>> into the process here?
>>>
>>> For the Apache Hadoop community, this will be treated simply as
>>> code-change and so need a committer +1? You can be more gentle by formally
>>> holding a vote once a process doc is written down.
>>>
>>> Back to the sustainable community point, as part of drafting this
>>> proposal, you'd definitely want to make sure all of the Apache Hadoop
>>> PMC/Committers can exercise their will to join this new project as
>>> PMC/Committers respectively without any additional constraints.
>>>
>>> Thanks
>>> +Vinod
>>>
>>> > On Jul 25, 2019, at 1:31 PM, Wangda Tan <[email protected]> wrote:
>>> >
>>> > Thanks everybody for sharing your thoughts. I saw positive feedback
>>> > from 20+ contributors!
>>> >
>>> > So I think we should move it forward. Any suggestions about what we
>>> > should do?
>>> >
>>> > Best,
>>> > Wangda
>>> >
>>> > On Mon, Jul 22, 2019 at 5:36 PM neo <[email protected]> wrote:
>>> >
>>> >> +1. This is neo from the TiDB & TiKV community.
>>> >> Thanks Xun for bringing this up.
>>> >>
>>> >> In TiKV, our open source distributed KV storage system and a CNCF
>>> >> project, Hadoop Submarine's machine learning engine helps us optimize
>>> >> data storage and solve some problems with data hotspots and data
>>> >> shuffling.
>>> >>
>>> >> We are also preparing to use the Hadoop Submarine machine learning
>>> >> engine to improve the performance of TiDB, our open source distributed
>>> >> relational database.
>>> >>
>>> >> I think that if Submarine becomes independent, it will develop faster
>>> >> and better.
>>> >> Thanks to the Hadoop community for developing Submarine!
>>> >>
>>> >> Best Regards,
>>> >> neo
>>> >> www.pingcap.com / https://github.com/pingcap/tidb /
>>> >> https://github.com/tikv
>>> >>
>>> >> On Mon, Jul 22, 2019 at 4:07 PM Xun Liu <[email protected]> wrote:
>>> >>
>>> >>> @adam.antal
>>> >>>
>>> >>> The Submarine development team has completed the following
>>> >>> preparations:
>>> >>> 1. Established a temporary test repository on GitHub.
>>> >>> 2. Changed the package name of Hadoop Submarine from
>>> >>> org.hadoop.submarine to org.submarine.
>>> >>> 3. Merged the LinkedIn/TonY code into the Hadoop Submarine module.
>>> >>> 4. Connected the GitHub repository to the travis-ci system; all test
>>> >>> cases have been run there.
>>> >>> 5. Several Hadoop Submarine users completed system tests using the
>>> >>> code in this repository.
>>> >>>
>>> >>> On Mon, Jul 22, 2019 at 9:38 AM 赵欣 <[email protected]> wrote:
>>> >>>
>>> >>>> Hi
>>> >>>>
>>> >>>> I am a teacher at Southeast University (https://www.seu.edu.cn/).
>>> >>>> Our major is electrical engineering. Our teaching teams and students
>>> >>>> use Hadoop Submarine for big data analysis and automated control of
>>> >>>> electrical equipment.
>>> >>>>
>>> >>>> Many thanks to the Hadoop community for providing us with machine
>>> >>>> learning tools like Submarine.
>>> >>>>
>>> >>>> I hope Hadoop Submarine keeps getting better and better.
>>> >>>>
>>> >>>>
>>> >>>> ==============================
>>> >>>> 赵欣
>>> >>>> School of Electrical Engineering, Southeast University
>>> >>>>
>>> >>>> -----------------------------------------------------
>>> >>>>
>>> >>>> Zhao XIN
>>> >>>>
>>> >>>> School of Electrical Engineering
>>> >>>>
>>> >>>> ==============================
>>> >>>> 2019-07-18
>>> >>>>
>>> >>>>
>>> >>>> *From:* Xun Liu <[email protected]>
>>> >>>> *Date:* 2019-07-18 09:46
>>> >>>> *To:* xinzhao <[email protected]>
>>> >>>> *Subject:* Fwd: Re: Any thoughts making Submarine a separate Apache
>>> >>>> project?
>>> >>>>
>>> >>>>
>>> >>>> ---------- Forwarded message ---------
>>> >>>> From: [email protected] <[email protected]>
>>> >>>> Date: Wed, Jul 17, 2019 at 3:17 PM
>>> >>>> Subject: Re: Re: Any thoughts making Submarine a separate Apache
>>> >>>> project?
>>> >>>> To: Szilard Nemeth <[email protected]>, runlin zhang <
>>> >>>> [email protected]>
>>> >>>> Cc: Xun Liu <[email protected]>, common-dev <
>>> >>>> [email protected]>, yarn-dev <[email protected]>, hdfs-dev <
>>> >>>> [email protected]>, mapreduce-dev <
>>> >>>> [email protected]>, submarine-dev <
>>> >>>> [email protected]>
>>> >>>>
>>> >>>>
>>> >>>> +1, good idea. We are very much looking forward to it.
>>> >>>>
>>> >>>> ------------------------------
>>> >>>> [email protected]
>>> >>>>
>>> >>>>
>>> >>>> *From:* Szilard Nemeth <[email protected]>
>>> >>>> *Date:* 2019-07-17 14:55
>>> >>>> *To:* runlin zhang <[email protected]>
>>> >>>> *CC:* Xun Liu <[email protected]>; Hadoop Common
>>> >>>> <[email protected]>; yarn-dev <[email protected]>;
>>> >>>> Hdfs-dev <[email protected]>; mapreduce-dev
>>> >>>> <[email protected]>; submarine-dev
>>> >>>> <[email protected]>
>>> >>>> *Subject:* Re: Any thoughts making Submarine a separate Apache
>>> >>>> project?
>>> >>>> +1, this is a great idea.
>>> >>>> As the Hadoop repository has already grown huge and contains many
>>> >>>> projects, I think in general it's a good idea to separate projects
>>> >>>> in the early phase.
>>> >>>>
>>> >>>>
>>> >>>> On Wed, Jul 17, 2019, 08:50 runlin zhang <[email protected]> wrote:
>>> >>>>
>>> >>>>> +1, that will be great!
>>> >>>>>
>>> >>>>>> On Jul 10, 2019, at 3:34 PM, Xun Liu <[email protected]> wrote:
>>> >>>>>>
>>> >>>>>> Hi all,
>>> >>>>>>
>>> >>>>>> This is Xun Liu. I contribute to the Submarine project, which runs
>>> >>>>>> deep learning workloads together with big data workloads on Hadoop
>>> >>>>>> clusters.
>>> >>>>>>
>>> >>>>>> A number of integrations of Submarine with other projects are
>>> >>>>>> finished or in progress, such as Apache Zeppelin, TonY, and Azkaban.
>>> >>>>>> The next step for Submarine is to integrate with more projects such
>>> >>>>>> as Apache Arrow, Redis, MLflow, etc.; to handle end-to-end machine
>>> >>>>>> learning use cases like model serving, notebook management, and
>>> >>>>>> advanced training optimizations (auto parameter tuning, memory cache
>>> >>>>>> optimizations for large training datasets, etc.); and to run on
>>> >>>>>> other platforms like Kubernetes or natively on the cloud. LinkedIn
>>> >>>>>> also wants to donate the TonY project to Apache so that we can put
>>> >>>>>> Submarine and TonY together in the same codebase (page #30 of
>>> >>>>>> https://www.slideshare.net/xkrogen/hadoop-meetup-jan-2019-tony-tensorflow-on-yarn-and-beyond#30
>>> >>>>>> ).
>>> >>>>>>
>>> >>>>>> This expands the scope of the original Submarine project in exciting
>>> >>>>>> new ways. Toward that end, would it make sense to create a separate
>>> >>>>>> Submarine project at Apache? This could speed up adoption of
>>> >>>>>> Submarine and allow it to grow into a full-blown machine learning
>>> >>>>>> platform.
>>> >>>>>>
>>> >>>>>> There will be lots of technical details to work out, but any initial
>>> >>>>>> thoughts on this?
>>> >>>>>>
>>> >>>>>> Best Regards,
>>> >>>>>> Xun Liu
>>> >>>>>
>>> >>>>>
>>> >>>>>
>>> ---------------------------------------------------------------------
>>> >>>>> To unsubscribe, e-mail: [email protected]
>>> >>>>> For additional commands, e-mail: [email protected]
>>> >>>>>
>>> >>>>>
>>> >>>>
>>> >>>>
>>> >>>
>>> >>
