Hi Liu,

See this link https://community.apache.org/gsoc.html


On Fri, Mar 8, 2019 at 1:58 PM, Xun Liu <neliu...@163.com> wrote:

> Hi, Jongyoul Lee, Морковкин
>
> I looked up information about GSOC. Do we still need to apply through the
> Zeppelin community first?
> I don't know much about GSOC. Besides helping with the project, what other
> work does the mentor need to do?
>
> > On Mar 8, 2019, at 10:01 AM, Xun Liu <neliu...@163.com> wrote:
> >
> > Hi, Морковкин
> >
> > I am very happy to be your mentor for GSOC. :-)
> > I believe that by completing this work, I can also learn a lot.
> >
> > Please watch https://issues.apache.org/jira/browse/ZEPPELIN-4018
> >
> >> On Mar 8, 2019, at 12:08 AM, Морковкин, Василий Владимирович <morkovkin...@phystech.edu> wrote:
> >>
> >> Hi! For fun I've sketched a toy prototype of a workflow manager in Scala.
> >> It makes it easy to impose dependencies on the execution order of tasks.
> >> Check this out: https://scastie.scala-lang.org/aRJGberkQ4CWatyABCOJcQ
> >> It reproduces the flow shown in the attached picture.
> >> Xun Liu, it would be great to clarify whether you agree to be a mentor
> >> specifically within GSOC, or outside of it? :)
> >>
> >> ----------------------------------------
> >> Best regards, Basil Morkovkin
> >>
> >> On Thu, Mar 7, 2019 at 11:32, Jeff Zhang <zjf...@gmail.com> wrote:
> >>
> >> Thanks Liu for taking over this, I will help review the design.
> >>
> >> On Thu, Mar 7, 2019 at 4:05 PM, Xun Liu <neliu...@163.com> wrote:
> >> Hi Vasiliy Morkovkin
> >>
> >> Thank you very much for your willingness to implement the workflow
> >> feature.
> >> I will make working with you my highest priority.
> >> I am planning to first update the system design documentation for workflow
> >> at https://issues.apache.org/jira/browse/ZEPPELIN-4018
> >> Please add yourself as a watcher on ZEPPELIN-4018.
> >> That way you will get notifications about document updates in a timely
> >> manner.
> >>
> >> We can discuss all questions in the ZEPPELIN-4018 JIRA comments.
> >> If you need to, you can also email me at liuxun...@gmail.com and I will
> >> reply as quickly as I can.
> >> Do you think this kind of cooperation is OK?
> >>
> >>
> >> @moon, @Jeff, @Jongyoul Lee, if you are interested, please help us improve
> >> our system design. Thanks!
> >>
> >> :-)
> >>
> >>> On Mar 7, 2019, at 6:04 AM, Морковкин, Василий Владимирович <morkovkin...@phystech.edu> wrote:
> >>>
> >>> Thank you for such detailed feedback!
> >>> I am definitely interested in working on the workflow implementation with
> >>> you, Xun Liu! Could you become a mentor in GSOC for this task?
> >>> Some front-end work is not a problem at all.
> >>> I'm ready to work at least 30 hours per week in the summer, while for now
> >>> I'd like to take on some smaller tasks to get a closer look at the existing
> >>> codebase and to get familiar with your development workflow. Do you have
> >>> such tasks in mind?
> >>>
> >>> On Wed, Mar 6, 2019 at 05:23, Xun Liu <neliu...@163.com> wrote:
> >>> Hi Vasiliy Morkovkin
> >>>
> >>> I have shared my thoughts on workflow at
> >>> https://issues.apache.org/jira/browse/ZEPPELIN-4018
> >>>
> >>> Because Zeppelin has more than 20 interpreters, data analysts can use it
> >>> for a wide variety of data development work.
> >>> Much of that work is interdependent. For example, developing machine
> >>> learning algorithms relies on Spark to preprocess the data, and so on.
> >>>
> >>> Existing open source workflow software includes Azkaban and Airflow.
> >>> Azkaban is relatively simple and covers most scenarios; our company is
> >>> using it.
> >>> Airflow looks complicated and I have not used it.
> >>> In fact, I have previously implemented workflows for notes and paragraphs
> >>> in Zeppelin via Azkaban:
> >>> https://youtu.be/2r6q-2Tq7hk?t=33
> >>>
> >>> However, I think Zeppelin should have built-in workflow capabilities
> >>> instead of relying on external software to schedule notes, for the
> >>> following reasons:
> >>> 1. Now that we have moved from the data processing era to the algorithm
> >>> era, a built-in workflow would let Zeppelin form a closed data loop.
> >>>
> >>> 2. Zeppelin's powerful interactive processing capabilities help algorithm
> >>> engineers improve their productivity.
> >>> Zeppelin should give algorithm engineers more direct control, instead of
> >>> handing their algorithms to other teams (or software) to run the workflow.
> >>>
> >>> 3. Zeppelin knows more about the processing status of data than Azkaban
> >>> and Airflow do, so a built-in workflow will have better performance, user
> >>> experience and control.
> >>>
> >>> If you are interested in workflow (ZEPPELIN-4018),
> >>> I am willing to work with you to complete all of the system design and
> >>> code development work.
> >>>
> >>> :-)
> >>>
> >>>> On Mar 6, 2019, at 9:32 AM, Jeff Zhang <zjf...@gmail.com> wrote:
> >>>>
> >>>> Hi Basil,
> >>>>
> >>>> Thanks for your interest in Zeppelin. Here are my comments on the tickets
> >>>> you are interested in.
> >>>>
> >>>> 1. https://issues.apache.org/jira/browse/ZEPPELIN-3651
> >>>>   This involves two sides of work, frontend and backend.
> >>>>   In the frontend, we should use Arrow JS to handle the table data,
> >>>>   including displaying and processing it (such as aggregation).
> >>>>   In the backend, we should use Arrow for each language and allow them to
> >>>>   exchange data within the same process, and use Arrow IPC to exchange
> >>>>   data across processes.
> >>>>   Overall, this is a pretty large task. If you really want to do it, I
> >>>>   would suggest you take on just part of it.
> >>>>
> >>>> 2. https://issues.apache.org/jira/browse/ZEPPELIN-3994
> >>>>   Regarding model serving, I don't have a clear picture of this. Others
> >>>>   can comment on it.
> >>>>
> >>>> 3. https://issues.apache.org/jira/browse/ZEPPELIN-4018
> >>>>   Job scheduling is pretty important for Zeppelin; I would make this the
> >>>>   highest priority among these tickets. Airflow is one option, but I am
> >>>>   open to other solutions. First we need to figure out how users schedule
> >>>>   jobs in Zeppelin, then choose the right framework. It would also
> >>>>   involve some frontend work.
> >>>>
> >>>> 4. https://issues.apache.org/jira/browse/ZEPPELIN-3857
> >>>>   Spark 2.4.0 support is already there, but Scala 2.12 is not supported
> >>>>   yet. It won't be a big project for GSOC IMO.
> >>>>
> >>>> 5. OLAP.
> >>>>   Regarding OLAP, as long as the OLAP engine provides a JDBC interface,
> >>>>   Zeppelin can support it very well. But we could create a specific
> >>>>   interpreter for an OLAP engine if its native API performs better than
> >>>>   JDBC. Another thing I can think of for improving OLAP is visualization:
> >>>>   although Zeppelin already supports some built-in visualizations, some
> >>>>   are still missing. We could provide more.
> >>>>
> >>>> 6. Auto-completions.
> >>>>   We already support IPython [1] in Zeppelin, which provides almost the
> >>>>   same auto-completion as Jupyter, but it lacks access to the Python API
> >>>>   docs. This is also pretty important for Python users IMO. SQL is
> >>>>   another popular language in Zeppelin, but it doesn't provide a good
> >>>>   code-completion experience either; we can do better there as well.
> >>>>
> >>>> 7. Notifications.
> >>>>   I think notifications can be integrated into job scheduling. A
> >>>>   notification can be sent when a job fails or succeeds.
> >>>>
> >>>>
> >>>> Let us know which JIRA you are most interested in, and also please
> >>>> consider how much time you can spend on this. Again, we very much
> >>>> appreciate your interest in Zeppelin and look forward to your
> >>>> contribution.
> >>>>
> >>>>
> >>>> [1] http://zeppelin.apache.org/docs/0.8.1/interpreter/python.html#ipython-support
> >>>>
> >>>>
> >>>>
> >>>> On Wed, Mar 6, 2019 at 7:41 AM, Морковкин, Василий Владимирович <morkovkin...@phystech.edu> wrote:
> >>>>
> >>>>> Thank you for your replies! I've checked the existing set of issues and
> >>>>> found several curious ones:
> >>>>> - https://issues.apache.org/jira/browse/ZEPPELIN-3651 seems to be a very
> >>>>> nice way to increase analytical processing performance using the Arrow
> >>>>> project;
> >>>>> - https://issues.apache.org/jira/browse/ZEPPELIN-3994 deploying models
> >>>>> independently of ZeppelinServer sounds quite intriguing too, although
> >>>>> there is much to think about;
> >>>>> - https://issues.apache.org/jira/browse/ZEPPELIN-4018 at first glance,
> >>>>> https://airflow.apache.org/ seems to be useful in implementing complex
> >>>>> execution workflows.
> >>>>> Those tasks are global and intriguing, requiring complex architectural
> >>>>> solutions.
> >>>>> Also, I've probably found a ticket that is suitable for me to get
> >>>>> involved in the project:
> >>>>> - https://issues.apache.org/jira/browse/ZEPPELIN-3857. What do you think?
> >>>>> Are there any "low-hanging fruits"?
> >>>>>
> >>>>> And I have several ideas of my own. Some of them might not be relevant
> >>>>> due to the vision of the project or other reasons. Just ideas:
> >>>>> - OLAP. As Zeppelin is a tool aimed at analytics, it seems quite logical
> >>>>> to add more integrations with existing OLAP solutions like Pinot,
> >>>>> ClickHouse and Druid. Currently I've found integration only with Kylin;
> >>>>> - Better autocompletion. Jupyter offers not only a list of already
> >>>>> initialized variables, but also quick access to documentation. It's
> >>>>> convenient;
> >>>>> - Notifications. Some colleagues would appreciate a notification
> >>>>> service that sends you messages (via mail, a Slack bot or something
> >>>>> else) indicating that your long-running paragraphs have completed.
> >>>>>
> >>>>> Feedback is much appreciated :)
> >>>>>
> >>>>> It would be wonderful if someone agreed to sacrifice their time and
> >>>>> become a mentor in the GSOC program!
> >>>>>
> >>>>> ----------------------------------------
> >>>>> Best regards, Basil Morkovkin.
> >>>>>
> >>>>>
> >>>>> On Tue, Mar 5, 2019 at 11:48, Jongyoul Lee <jongy...@gmail.com> wrote:
> >>>>>
> >>>>>> Hello,
> >>>>>>
> >>>>>> I've confirmed that I can add more issues for GSOC. Can you explain what
> >>>>>> you would like to contribute to? I can add more issues.
> >>>>>>
> >>>>>> JL
> >>>>>>
> >>>>>> On Tue, Mar 5, 2019 at 1:03 PM Xun Liu <neliu...@163.com> wrote:
> >>>>>>
> >>>>>>> Hi, Vasiliy Morkovkin
> >>>>>>>
> >>>>>>> Welcome to the zeppelin community! :-)
> >>>>>>>
> >>>>>>>> On Mar 5, 2019, at 11:49 AM, Jongyoul Lee <jongy...@gmail.com> wrote:
> >>>>>>>>
> >>>>>>>> Thanks for contacting Zeppelin with your interest.
> >>>>>>>>
> >>>>>>>> I added FE topics for GSOC because FE is the most urgent issue I have
> >>>>>>>> thought about. We always encourage contributions to Zeppelin on a range
> >>>>>>>> of topics, including your own ideas.
> >>>>>>>>
> >>>>>>>> Please tell us more.
> >>>>>>>>
> >>>>>>>> Thanks.
> >>>>>>>> JL
> >>>>>>>>
> >>>>>>>> On Tue, Mar 5, 2019 at 10:41 AM moon soo Lee <m...@apache.org> wrote:
> >>>>>>>>
> >>>>>>>>> Hi,
> >>>>>>>>>
> >>>>>>>>> Great to see your interest in the project. Thanks!
> >>>>>>>>> Looks like we need volunteers for a mentor and some backend subjects
> >>>>>>>>> for GSoC 2019.
> >>>>>>>>> Any ideas?
> >>>>>>>>>
> >>>>>>>>> Best,
> >>>>>>>>> moon
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> On Mon, Mar 4, 2019 at 3:05 PM Vasiliy Morkovkin <morkovkin...@phystech.edu> wrote:
> >>>>>>>>>
> >>>>>>>>>> Hi everyone, I'm pursuing a bachelor's degree at the Moscow Institute
> >>>>>>>>>> of Physics and Technology and am eager to contribute to Zeppelin in
> >>>>>>>>>> the context of GSOC 2019. I've become a real fan of Zeppelin over the
> >>>>>>>>>> past couple of months, using it at my job. But I have found only one
> >>>>>>>>>> ticket (a front-end task) with the GSOC 2019 label in your Jira.
> >>>>>>>>>> Perhaps you have ideas for new features or improvements in Zeppelin,
> >>>>>>>>>> but don't have enough hands for them. It would be wonderful if anyone
> >>>>>>>>>> agreed to mentor these ideas within GSOC :)
> >>>>>>>>>> I have been working as a Scala developer (back-end) for 1.5 years.
> >>>>>>>>>> I can also write in Java or Python without any problems if necessary.
> >>>>>>>>>> I'm really fond of databases and high-load systems. I also have
> >>>>>>>>>> experience with some other great Apache projects like Cassandra,
> >>>>>>>>>> Kafka and Spark.
> >>>>>>>>>>
> >>>>>>>>>> Best regards, Basil Morkovkin.
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> --
> >>>>>>>> 이종열, Jongyoul Lee, 李宗烈
> >>>>>>>> http://madeng.net
> >>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>> --
> >>>>>> 이종열, Jongyoul Lee, 李宗烈
> >>>>>> http://madeng.net
> >>>>>>
> >>>>>
> >>>>
> >>>>
> >>>> --
> >>>> Best Regards
> >>>>
> >>>> Jeff Zhang
> >>>
> >>
> >>
> >>
> >> --
> >> Best Regards
> >>
> >> Jeff Zhang
> >
>
>
>

-- 
Best Regards

Jeff Zhang