Re: Re: Any thoughts making Submarine a separate Apache project?
+1, good idea. We are very much looking forward to it.

dashuiguailu...@gmail.com

From: Szilard Nemeth
Date: 2019-07-17 14:55
To: runlin zhang
CC: Xun Liu; Hadoop Common; yarn-dev; Hdfs-dev; mapreduce-dev; submarine-dev
Subject: Re: Any thoughts making Submarine a separate Apache project?

+1, this is a great idea.
As the Hadoop repository has already grown huge and contains many projects, I think in general it's a good idea to separate projects at an early phase.

On Wed, Jul 17, 2019, 08:50 runlin zhang wrote:

> +1, that will be great!
>
> > On July 10, 2019, at 3:34 PM, Xun Liu wrote:
> >
> > Hi all,
> >
> > This is Xun Liu, contributing to the Submarine project for running deep
> > learning workloads together with big data workloads on Hadoop clusters.
> >
> > A number of integrations of Submarine with other projects are finished
> > or in progress, such as Apache Zeppelin, TonY, and Azkaban. The next
> > step for Submarine is to integrate with more projects like Apache Arrow,
> > Redis, MLflow, etc., to handle end-to-end machine learning use cases
> > like model serving, notebook management, and advanced training
> > optimizations (like auto parameter tuning, memory cache optimizations
> > for large training datasets, etc.), and to make it run on other
> > platforms like Kubernetes or natively on the cloud. LinkedIn also wants
> > to donate the TonY project to Apache so we can put Submarine and TonY
> > together in the same codebase (Page #30:
> > https://www.slideshare.net/xkrogen/hadoop-meetup-jan-2019-tony-tensorflow-on-yarn-and-beyond#30
> > ).
> >
> > This expands the scope of the original Submarine project in exciting new
> > ways. Toward that end, would it make sense to create a separate Submarine
> > project at Apache? This could speed up adoption of Submarine and allow
> > Submarine to grow into a full-blown machine learning platform.
> >
> > There will be lots of technical details to work out, but any initial
> > thoughts on this?
> >
> > Best Regards,
> > Xun Liu
>
> -
> To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
> For additional commands, e-mail: common-dev-h...@hadoop.apache.org
Re: Re: Any thoughts making Submarine a separate Apache project?
+1. Submarine is already in use at our company (贝壳找房) and is performing well. Looking forward to the next step, which will provide more features.

dashuiguailu...@gmail.com

From: Oliver Hu
Date: 2019-07-19 07:50
To: Jeff Zhang
CC: sid yu; Xun Liu; Hadoop Common; yarn-dev; Hdfs-dev; mapreduce-dev; submarine-dev
Subject: Re: Any thoughts making Submarine a separate Apache project?

+1 (non-binding).
Making Submarine a separate project would make it easier to integrate with other components in the ML pipeline and to expand across platforms.

On Thu, Jul 18, 2019 at 2:48 AM Jeff Zhang wrote:

> +1. This is Jeff Zhang from the Zeppelin community.
> Thanks, Xun, for bringing this up. Submarine was integrated into Zeppelin
> several months ago, and I already see some early adoption of it in China.
> AI is a fast-growing area; I believe moving into a separate project would
> help Submarine keep up with new trends in AI and release new features
> more quickly than before.
>
> On Thu, Jul 18, 2019 at 2:06 PM sid yu wrote:
>
> > +1. We are looking forward to it. The idea is great.
> >
> > > On Jul 10, 2019, at 3:34 PM, Xun Liu wrote:
> > >
> > > Hi all,
> > >
> > > This is Xun Liu, contributing to the Submarine project for running deep
> > > learning workloads together with big data workloads on Hadoop clusters.
> > >
> > > A number of integrations of Submarine with other projects are finished
> > > or in progress, such as Apache Zeppelin, TonY, and Azkaban. The next
> > > step for Submarine is to integrate with more projects like Apache Arrow,
> > > Redis, MLflow, etc., to handle end-to-end machine learning use cases
> > > like model serving, notebook management, and advanced training
> > > optimizations (like auto parameter tuning, memory cache optimizations
> > > for large training datasets, etc.), and to make it run on other
> > > platforms like Kubernetes or natively on the cloud. LinkedIn also wants
> > > to donate the TonY project to Apache so we can put Submarine and TonY
> > > together in the same codebase (Page #30:
> > > https://www.slideshare.net/xkrogen/hadoop-meetup-jan-2019-tony-tensorflow-on-yarn-and-beyond#30
> > > ).
> > >
> > > This expands the scope of the original Submarine project in exciting
> > > new ways. Toward that end, would it make sense to create a separate
> > > Submarine project at Apache? This could speed up adoption of Submarine
> > > and allow Submarine to grow into a full-blown machine learning platform.
> > >
> > > There will be lots of technical details to work out, but any initial
> > > thoughts on this?
> > >
> > > Best Regards,
> > > Xun Liu
>
> --
> Best Regards
>
> Jeff Zhang
Re: Thoughts about moving submarine to a separate git repo?
+1. I agree that developing Submarine independently will let it better keep pace with developments in machine learning.

dashuiguailu...@gmail.com

From: Xun Liu
Date: 2019-08-16 12:43
To: common-dev; yarn-dev; hdfs-dev; submarine-dev
Subject: Thoughts about moving submarine to a separate git repo?

Dear Submarine developers,

My name is Xun Liu, and I am a member of the Hadoop Submarine development team. I have been one of the major contributors to Submarine since June 2018. I want to hear your thoughts about creating a separate GitHub repo under Apache for Submarine development.

This effort is independent of the spin-off of Submarine from the Hadoop project [ https://lists.apache.org/thread.html/3fab657f905d081b536d9081dc404f7fd20c80eb824c857bc8e16e3b@]. However, once the spin-off is approved, this effort can benefit the follow-up processes as well.

The Submarine dev community has a total of 8 developers and submits an average of 4 to 5 PRs per day. But only a limited number of Hadoop committers actively help review and merge patches, which delays development. So we created an external GitHub repo [ https://github.com/hadoopsubmarine/submarine] and moved all the code for the Hadoop Submarine project into it. This way, everyone can review each other's code, and development of Hadoop Submarine is now very fast.

Also, Submarine now has little dependency on Hadoop, so we want a separate CI/CD pipeline to test and release Submarine instead of building the whole of Hadoop every time. Keeping Submarine under Hadoop will introduce unnecessary dependencies into Hadoop's top-level pom.xml.

Our development process still complies with the development rules of the Hadoop community: first, create a ticket in the Submarine JIRA; then develop in the external GitHub repo, with the JIRA ID included in the title of each PR.

Once the Apache GitHub repo is created, we are going to move all external commits to the new Apache GitHub repo.

Any suggestions are welcome!

Best Regards
Xun Liu