Hi, Морковкин

I am very happy to be your mentor for GSOC. :-)
I believe that by completing this work, I can also learn a lot.

Please watch https://issues.apache.org/jira/browse/ZEPPELIN-4018

> On March 8, 2019, at 12:08 AM, Морковкин, Василий Владимирович 
> <morkovkin...@phystech.edu> wrote:
> 
> Hi! For fun, I've sketched a toy prototype of a workflow manager in Scala. It 
> makes it easy to impose dependencies on the execution order of tasks. Check 
> it out: https://scastie.scala-lang.org/aRJGberkQ4CWatyABCOJcQ . It reproduces 
> the flow shown in the attached picture.
> Xun Liu, it would be great to clarify whether you agree to be a mentor 
> specifically within GSOC, or outside of it. :)
> 
> ----------------------------------------
> Best regards, Basil Morkovkin
> 
> On Thu, Mar 7, 2019 at 11:32, Jeff Zhang <zjf...@gmail.com> wrote:
> 
> Thanks, Liu, for taking this over. I will help review the design.
> 
> On Thu, Mar 7, 2019 at 4:05 PM, Xun Liu <neliu...@163.com> wrote:
> Hi Vasiliy Morkovkin
> 
> Thank you very much for your willingness to implement the workflow 
> feature.
> I will work with you on this as my highest priority.
> I am planning to update the system design documentation for workflow first at 
> https://issues.apache.org/jira/browse/ZEPPELIN-4018 .
> Please add yourself as a watcher on ZEPPELIN-4018.
> That way you will get notifications of document updates in a timely 
> manner.
> 
> We can discuss all questions in the ZEPPELIN-4018 JIRA comments.
> If you need to, you can email me at liuxun...@gmail.com , and I will reply 
> as quickly as I can.
> Do you think this kind of cooperation is OK?
> 
> 
> @moon, @Jeff, @Jongyoul Lee: if you are interested, please help us improve our 
> system design. Thanks!
> 
> :-)
> 
> > On March 7, 2019, at 6:04 AM, Морковкин, Василий Владимирович 
> > <morkovkin...@phystech.edu> wrote:
> > 
> > Thank you for such detailed feedback!
> > I am definitely interested in working on the workflow implementation with you, 
> > Xun Liu! Could you become a mentor in GSOC for this task?
> > Some front-end work is not a problem at all.
> > I'm ready to work at least 30 hours per week in the summer. For now, I'd 
> > like to take on some smaller tasks to get a closer look at the existing codebase 
> > and to get familiar with your development workflow. Do you have such tasks 
> > in mind?
> > 
> > On Wed, Mar 6, 2019 at 05:23, Xun Liu <neliu...@163.com> wrote:
> > Hi Vasiliy Morkovkin
> > 
> > I have written up my thoughts on workflow at 
> > https://issues.apache.org/jira/browse/ZEPPELIN-4018 .
> > 
> > Because Zeppelin has more than 20 interpreters, 
> > data analysts can use it for a variety of data development work, 
> > and much of that work is interdependent. For example, 
> > developing machine learning algorithms relies on Spark to 
> > preprocess the data, and so on.
> > 
> > Existing open-source workflow software includes Azkaban and Airflow.
> > Azkaban is relatively simple and covers most scenarios, and 
> > our company uses it.
> > Airflow looks complicated and I have not used it.
> > In fact, I have previously implemented workflows for notes and 
> > paragraphs in Zeppelin via Azkaban:
> > https://youtu.be/2r6q-2Tq7hk?t=33
> > 
> > However, I think Zeppelin should have built-in workflow capabilities 
> > instead of relying on external software to schedule notes in Zeppelin, for 
> > the following reasons:
> > 1. We have moved from the data processing era to the algorithm 
> > era. Once Zeppelin has its own workflow, it will form a data loop.
> > 
> > 2. Zeppelin's powerful interactive processing capabilities help algorithm 
> > engineers improve their productivity.
> > Zeppelin should give the algorithm engineer more direct control, 
> > instead of handing the algorithm to other teams (or software) to run the 
> > workflow.
> > 
> > 3. Zeppelin knows more about the processing status of data than Azkaban and 
> > Airflow do,
> > so the built-in workflow will have better performance, user experience and 
> > control.
> > 
> > If you are interested in workflow (ZEPPELIN-4018), 
> > I am willing to work with you to complete all of the system design and code 
> > development work.
> > 
> > :-)
> > 
> >> On March 6, 2019, at 9:32 AM, Jeff Zhang <zjf...@gmail.com> wrote:
> >> 
> >> Hi Basil,
> >> 
> >> Thanks for your interest in Zeppelin. Here are my comments on the tickets
> >> you are interested in.
> >> 
> >> 1. https://issues.apache.org/jira/browse/ZEPPELIN-3651
> >>    This involves two sides of work, frontend and backend.
> >>    In the frontend, we should use Arrow JS to handle the table data, including
> >> displaying it and processing it (such as aggregation).
> >>    In the backend, we should use Arrow for each language and allow them to
> >> exchange data within the same process, and use Arrow IPC to exchange data
> >> across processes.
> >>    Overall, this is a pretty large task. If you really want to do it, I would
> >> suggest that you take on just part of it.
> >> 
> >> 2. https://issues.apache.org/jira/browse/ZEPPELIN-3994
> >>    Regarding model serving, I don't have a clear picture of this. Others
> >> can comment on it.
> >> 
> >> 3. https://issues.apache.org/jira/browse/ZEPPELIN-4018
> >>    Job scheduling is pretty important for Zeppelin; I would make this
> >> the highest priority among these tickets. Airflow is one
> >> option, but I am open to other solutions. First we need to figure out how
> >> users schedule jobs in Zeppelin, then choose the right framework. It would
> >> also involve some frontend work.
> >> 
> >> 4. https://issues.apache.org/jira/browse/ZEPPELIN-3857
> >>    Spark 2.4.0 support is already there, but Scala 2.12 is not
> >> supported yet. It won't be a big project for GSOC, IMO.
> >> 
> >> 5. OLAP.
> >>    Regarding OLAP, as long as the OLAP engine provides a JDBC interface,
> >> Zeppelin can support it very well. But we could create a specific interpreter
> >> for an OLAP engine if its native API performs better than JDBC. Another thing
> >> I can think of for improving OLAP is visualization: although Zeppelin already
> >> supports some built-in visualizations, some are still
> >> missing. We could provide more.
> >> 
> >> 6. Auto-completions.
> >>    We already support IPython [1] in Zeppelin, which provides almost the
> >> same auto-completion as Jupyter. But it lacks access to the Python API
> >> docs, which is also pretty important for Python users, IMO. SQL is another
> >> popular language in Zeppelin, but it doesn't provide a good
> >> code-completion experience either; we can do better there as well.
> >> 
> >> 7. Notifications.
> >>    I think notifications can be integrated into job scheduling. A notification
> >> can be sent when a job fails or succeeds.
> >> 
> >> 
> >> Let us know which JIRA you are most interested in, and please also consider
> >> how much time you can spend on this. Again, we very much appreciate your
> >> interest in Zeppelin and look forward to your contribution.
> >> 
> >> 
> >> [1]
> >> http://zeppelin.apache.org/docs/0.8.1/interpreter/python.html#ipython-support
> >> 
> >> 
> >> 
> >> On Wed, Mar 6, 2019 at 7:41 AM, Морковкин, Василий Владимирович 
> >> <morkovkin...@phystech.edu> wrote:
> >> 
> >>> Thank you for your replies! I've checked the existing set of issues and found
> >>> several interesting ones:
> >>> - https://issues.apache.org/jira/browse/ZEPPELIN-3651 seems to be a very nice
> >>> way to increase analytical processing performance using the Arrow project;
> >>> - https://issues.apache.org/jira/browse/ZEPPELIN-3994 deploying models
> >>> independently of ZeppelinServer sounds quite intriguing too, although there
> >>> is much to think about;
> >>> - https://issues.apache.org/jira/browse/ZEPPELIN-4018 at first glance,
> >>> https://airflow.apache.org/ seems to be useful for implementing complex
> >>> execution workflows.
> >>> Those tasks are large in scope and intriguing, requiring complex architectural
> >>> solutions.
> >>> I've also found a ticket that is probably suitable for me to get
> >>> involved in the project:
> >>> - https://issues.apache.org/jira/browse/ZEPPELIN-3857 . What do you think?
> >>> Is there any "low-hanging fruit"?
> >>> 
> >>> And I have several ideas of my own. Some of them might not be relevant due
> >>> to the vision of the project or other reasons. Just ideas:
> >>> - OLAP. As Zeppelin is a tool aimed at analytics, it seems quite
> >>> logical to add more integrations with existing OLAP solutions like Pinot,
> >>> ClickHouse and Druid. So far I've found an integration only with Kylin;
> >>> - Better autocompletion. Jupyter offers not only a list of already
> >>> initialized variables, but also quick access to documentation. It's
> >>> convenient;
> >>> - Notifications. Some colleagues would appreciate a notification
> >>> service that sends you messages (via email, a Slack bot or something else)
> >>> indicating that your long-running paragraphs have completed.
> >>> 
> >>> Feedback is much appreciated :)
> >>> 
> >>> It would be wonderful if someone agreed to sacrifice their time and become a
> >>> mentor in the GSOC program!
> >>> 
> >>> ----------------------------------------
> >>> Best regards, Basil Morkovkin.
> >>> 
> >>> 
> >>> On Tue, Mar 5, 2019 at 11:48, Jongyoul Lee <jongy...@gmail.com> wrote:
> >>> 
> >>>> Hello,
> >>>> 
> >>>> I've confirmed that I can add more issues for GSOC. Can you explain what you
> >>>> would like to contribute to? I can add more issues.
> >>>> 
> >>>> JL
> >>>> 
> >>>> On Tue, Mar 5, 2019 at 1:03 PM Xun Liu <neliu...@163.com> wrote:
> >>>> 
> >>>>> Hi, Vasiliy Morkovkin
> >>>>> 
> >>>>> Welcome to the zeppelin community! :-)
> >>>>> 
> >>>>>> On March 5, 2019, at 11:49 AM, Jongyoul Lee <jongy...@gmail.com> wrote:
> >>>>>> 
> >>>>>> Thanks for contacting Zeppelin with your interest.
> >>>>>> 
> >>>>>> I added FE topics for GSOC because FE is the most urgent issue I have
> >>>>>> thought about. We always encourage contributions to Zeppelin on a variety
> >>>>>> of topics, including your own ideas.
> >>>>>> 
> >>>>>> Please describe your ideas in more detail.
> >>>>>> 
> >>>>>> Thanks.
> >>>>>> JL
> >>>>>> 
> >>>>>> On Tue, Mar 5, 2019 at 10:41 AM moon soo Lee <m...@apache.org> wrote:
> >>>>>> 
> >>>>>>> Hi,
> >>>>>>> 
> >>>>>>> Great to see your interest in the project. Thanks!
> >>>>>>> Looks like we need volunteers for a mentor and some backend subjects for
> >>>>>>> GSoC 2019.
> >>>>>>> Any ideas?
> >>>>>>> 
> >>>>>>> Best,
> >>>>>>> moon
> >>>>>>> 
> >>>>>>> 
> >>>>>>> 
> >>>>>>> 
> >>>>>>> On Mon, Mar 4, 2019 at 3:05 PM Vasiliy Morkovkin
> >>>>>>> <morkovkin...@phystech.edu> wrote:
> >>>>>>> 
> >>>>>>>> Hi everyone, I'm pursuing a bachelor's degree at the Moscow Institute of
> >>>>>>>> Physics and Technology and am eager to contribute to Zeppelin in the
> >>>>>>>> context of GSOC 2019. I've become a real fan of Zeppelin over the past
> >>>>>>>> couple of months, using it at my job. But I have found only one ticket
> >>>>>>>> (a front-end task) with the GSOC 2019 label on your Jira. Perhaps you
> >>>>>>>> have ideas for new features or improvements in Zeppelin but don't have
> >>>>>>>> enough hands for them. It would be wonderful if anyone agreed to mentor
> >>>>>>>> these ideas within GSOC :)
> >>>>>>>> I have been working as a Scala developer (back-end) for 1.5 years.
> >>>>>>>> I can also write Java or Python without any problems if necessary.
> >>>>>>>> I am really fond of databases and high-load systems. I also have
> >>>>>>>> experience with some other great Apache projects like Cassandra, Kafka
> >>>>>>>> and Spark.
> >>>>>>>> 
> >>>>>>>> Best regards, Basil Morkovkin.
> >>>>>>>> 
> >>>>>>>> 
> >>>>>>> 
> >>>>>> 
> >>>>>> 
> >>>>>> --
> >>>>>> 이종열, Jongyoul Lee, 李宗烈
> >>>>>> http://madeng.net
> >>>>> 
> >>>>> 
> >>>> 
> >>>> --
> >>>> 이종열, Jongyoul Lee, 李宗烈
> >>>> http://madeng.net
> >>>> 
> >>> 
> >> 
> >> 
> >> -- 
> >> Best Regards
> >> 
> >> Jeff Zhang
> > 
> 
> 
> 
> -- 
> Best Regards
> 
> Jeff Zhang
