Hi, Jongyoul Lee, Морковкин

I looked up some information about GSoC. Do we still need to apply on behalf
of the Zeppelin community first?
I don't know much about GSoC. Besides helping with the project, what other
work does a mentor need to do?

> On Mar 8, 2019, at 10:01 AM, Xun Liu <neliu...@163.com> wrote:
> 
> Hi, Морковкин
> 
> I am very happy to be your mentor for GSOC. :-)
> I believe that by completing this work, I can also learn a lot.
> 
> Please watch https://issues.apache.org/jira/browse/ZEPPELIN-4018
> 
>> On Mar 8, 2019, at 12:08 AM, Морковкин, Василий Владимирович
>> <morkovkin...@phystech.edu> wrote:
>> 
>> Hi! For fun I've sketched a toy prototype of a workflow manager in Scala. It
>> makes it easy to impose dependencies on the execution order of tasks. Check
>> it out: https://scastie.scala-lang.org/aRJGberkQ4CWatyABCOJcQ . It
>> reproduces the flow shown in the attached picture.
>> Xun Liu, it would be great to clarify whether you agree to be a mentor
>> specifically within GSOC, or outside of it? :)
>> 
>> ----------------------------------------
>> Best regards, Basil Morkovkin
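
The idea of imposing dependencies on task execution order, as described in the message above, can be sketched roughly as follows. This is not the linked Scastie prototype (its code is not reproduced in this thread) and not any Zeppelin API; the function names `execution_order` and `run_workflow` and the task names in the usage note are hypothetical, chosen only for illustration. The sketch assumes tasks are plain callables and uses Kahn's topological sort.

```python
from collections import deque


def execution_order(dependencies):
    """Return task names in an order that respects the dependencies.

    `dependencies` maps each task to the tasks that must finish first.
    Raises ValueError if the dependencies contain a cycle.
    (Hypothetical helper, for illustration only.)
    """
    # Collect every task, including ones that appear only as a dependency.
    tasks = set(dependencies)
    for deps in dependencies.values():
        tasks.update(deps)

    remaining = {task: set(dependencies.get(task, ())) for task in tasks}
    dependents = {task: [] for task in tasks}
    for task, deps in remaining.items():
        for dep in deps:
            dependents[dep].append(task)

    # Kahn's algorithm: repeatedly release tasks with no unmet dependencies.
    ready = deque(sorted(t for t, deps in remaining.items() if not deps))
    order = []
    while ready:
        task = ready.popleft()
        order.append(task)
        for child in sorted(dependents[task]):
            remaining[child].discard(task)
            if not remaining[child]:
                ready.append(child)

    if len(order) != len(tasks):
        raise ValueError("cycle detected in task dependencies")
    return order


def run_workflow(actions, dependencies):
    """Run each callable in `actions` after the tasks it depends on."""
    deps = {task: dependencies.get(task, ()) for task in actions}
    results = {}
    for task in execution_order(deps):
        results[task] = actions[task]()
    return results
```

For example, `execution_order({"train": {"preprocess"}, "report": {"train"}})` places `preprocess` before `train` and `train` before `report`. A real scheduler would also need to handle failures, retries and parallel execution, which this sketch leaves out.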
>> 
>> On Thu, Mar 7, 2019 at 11:32, Jeff Zhang <zjf...@gmail.com> wrote:
>> 
>> Thanks, Liu, for taking this over. I will help review the design.
>> 
>> On Thu, Mar 7, 2019 at 4:05 PM, Xun Liu <neliu...@163.com> wrote:
>> Hi Vasiliy Morkovkin
>> 
>> Thank you very much for your willingness to implement the workflow
>> feature.
>> I will work with you with the highest priority.
>> I am planning to first update the system design documentation for workflow
>> at https://issues.apache.org/jira/browse/ZEPPELIN-4018 .
>> Please add yourself as a Watcher on ZEPPELIN-4018.
>> That way you will get notifications about document updates in a timely
>> manner.
>> 
>> We can discuss all questions in the ZEPPELIN-4018 JIRA comments.
>> If you need to, you can email me at liuxun...@gmail.com and I will reply
>> as quickly as I can.
>> Do you think this kind of cooperation is OK?
>> 
>> 
>> @moon, @Jeff, @Jongyoul Lee, if you are interested, please help us improve
>> our system design. Thanks!
>> 
>> :-)
>> 
>>> On Mar 7, 2019, at 6:04 AM, Морковкин, Василий Владимирович
>>> <morkovkin...@phystech.edu> wrote:
>>> 
>>> Thank you for such detailed feedback!
>>> I am definitely interested in working on the workflow implementation with
>>> you, Xun Liu! Could you become a mentor in GSOC for this task?
>>> Some front-end work is not a problem at all.
>>> I'm ready to work at least 30 hours per week in the summer, while for now
>>> I'd like to take on some smaller tasks to get a closer look at the existing
>>> codebase and become familiar with your development workflow. Do you have
>>> such tasks in mind?
>>> 
>>> On Wed, Mar 6, 2019 at 05:23, Xun Liu <neliu...@163.com> wrote:
>>> Hi Vasiliy Morkovkin
>>> 
>>> I have shared my thoughts on workflow at
>>> https://issues.apache.org/jira/browse/ZEPPELIN-4018
>>> 
>>> Because there are more than 20 interpreters in Zeppelin, data analysts can
>>> use it for a wide variety of data development work.
>>> Much of that work is interdependent. For example, developing machine
>>> learning algorithms relies on Spark to preprocess data, and so on.
>>> 
>>> Existing open source workflow software includes Azkaban and Airflow.
>>> Azkaban is relatively simple and covers most scenarios, and our company is
>>> using it.
>>> Airflow looks complicated and I have not used it.
>>> In fact, I have previously implemented workflow for notes and paragraphs
>>> in Zeppelin via Azkaban:
>>> https://youtu.be/2r6q-2Tq7hk?t=33
>>> 
>>> However, I think Zeppelin should have built-in workflow capabilities
>>> instead of relying on external software to schedule notes, for the
>>> following reasons:
>>> 1. Now that we have moved from the data processing era to the algorithm
>>> era, once Zeppelin has its own workflow, it will form a data loop.
>>> 
>>> 2. Zeppelin's powerful interactive processing capabilities help algorithm
>>> engineers improve their productivity.
>>> Zeppelin should give algorithm engineers more direct control, instead of
>>> having them hand the algorithm to other teams (or software) to run the
>>> workflow.
>>> 
>>> 3. Zeppelin knows more about the processing status of data than Azkaban or
>>> Airflow, so a built-in workflow will have better performance, user
>>> experience and control.
>>> 
>>> If you are interested in workflow (ZEPPELIN-4018),
>>> I am willing to work with you to complete all the system design and code
>>> development work.
>>> 
>>> :-)
>>> 
>>>> On Mar 6, 2019, at 9:32 AM, Jeff Zhang <zjf...@gmail.com> wrote:
>>>> 
>>>> Hi Basil,
>>>> 
>>>> Thanks for your interest in Zeppelin. Here are my comments on the tickets
>>>> you are interested in.
>>>> 
>>>> 1. https://issues.apache.org/jira/browse/ZEPPELIN-3651
>>>>   This involves work on two sides, frontend and backend:
>>>>   In the frontend, we should use Arrow JS to handle the table data,
>>>> including displaying and processing it (such as aggregation).
>>>>   In the backend, we should use Arrow for each language and allow them to
>>>> exchange data within the same process, and use Arrow IPC to exchange data
>>>> across processes.
>>>>  Overall, this is a pretty large task. If you really want to do it, I would
>>>> suggest you take just part of it.
>>>> 
>>>> 2. https://issues.apache.org/jira/browse/ZEPPELIN-3994
>>>>   Regarding model serving, I don't have a clear picture of this. Others
>>>> can comment on it.
>>>> 
>>>> 3. https://issues.apache.org/jira/browse/ZEPPELIN-4018
>>>>   Job scheduling is pretty important for Zeppelin. I would make this the
>>>> highest priority for Zeppelin among these tickets. Airflow is one
>>>> option, but I am open to other solutions. First we need to figure out how
>>>> users schedule jobs in Zeppelin, then choose the right framework. It would
>>>> also involve some frontend work.
>>>> 
>>>> 4. https://issues.apache.org/jira/browse/ZEPPELIN-3857
>>>>   Spark 2.4.0 support is already there, but Scala 2.12 is not
>>>> supported yet. It won't be a big project for GSOC IMO.
>>>> 
>>>> 5. OLAP.
>>>>   Regarding OLAP, as long as the OLAP engine provides a JDBC interface,
>>>> Zeppelin can support it very well. But we could create a specific
>>>> interpreter for an OLAP engine if its native API performs better than
>>>> JDBC. Another thing I can think of for improving OLAP is visualization.
>>>> Although Zeppelin already supports some built-in visualizations, some are
>>>> still missing. We could provide more.
>>>> 
>>>> 6. Auto-completions.
>>>>  We already support IPython [1] in Zeppelin, which provides almost the
>>>> same auto-completion as Jupyter. But it lacks access to the Python API
>>>> docs. This is also pretty important for Python users IMO. SQL is another
>>>> popular language in Zeppelin, but it doesn't provide a good
>>>> code-completion experience either; we can do better there as well.
>>>> 
>>>> 7. Notifications.
>>>>  I think notifications can be integrated into job scheduling. A
>>>> notification can be sent when a job fails or succeeds.
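
As a rough illustration of item 7 (sending a notification when a job fails or succeeds), the scheduler could wrap each job in a small hook like the one below. This is only a sketch under assumptions: `run_with_notification` and the `notify` callback are hypothetical names and not part of Zeppelin, and the actual delivery channel (mail, Slack bot, and so on) is left to whatever callable is passed in.

```python
def run_with_notification(job_name, job, notify):
    """Run `job` and report its outcome through the `notify` callable.

    Hypothetical helper: `notify` is any function that accepts a message
    string, for example a mail sender or a chat-bot client wrapper.
    """
    try:
        result = job()
    except Exception as exc:
        # Report the failure, then re-raise so the scheduler still sees it.
        notify(f"Job '{job_name}' failed: {exc}")
        raise
    notify(f"Job '{job_name}' succeeded")
    return result
```

Keeping the notifier as a plain callable means the scheduling code does not need to know how messages are delivered.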
>>>> 
>>>> 
>>>> Let us know which JIRA you are most interested in, and please also consider
>>>> how much time you can spend on it. Again, we very much appreciate your
>>>> interest in Zeppelin and look forward to your contribution.
>>>> 
>>>> 
>>>> [1]
>>>> http://zeppelin.apache.org/docs/0.8.1/interpreter/python.html#ipython-support
>>>> 
>>>> 
>>>> 
>>>> On Wed, Mar 6, 2019 at 7:41 AM, Морковкин, Василий Владимирович
>>>> <morkovkin...@phystech.edu> wrote:
>>>> 
>>>>> Thank you for your replies! I've checked the existing set of issues and
>>>>> found several curious ones:
>>>>> - https://issues.apache.org/jira/browse/ZEPPELIN-3651 seems to be a very
>>>>> nice way to increase analytical processing performance using the Arrow
>>>>> project;
>>>>> - https://issues.apache.org/jira/browse/ZEPPELIN-3994 deploying models
>>>>> independently of ZeppelinServer sounds quite intriguing too, although
>>>>> there is much to think about;
>>>>> - https://issues.apache.org/jira/browse/ZEPPELIN-4018 at first glance,
>>>>> https://airflow.apache.org/ seems to be useful in implementing complex
>>>>> execution workflows.
>>>>> Those tasks are global and intriguing, requiring complex architectural
>>>>> solutions.
>>>>> I've also probably found a ticket that is suitable for getting me
>>>>> involved in the project:
>>>>> - https://issues.apache.org/jira/browse/ZEPPELIN-3857. What do you think?
>>>>> Are there any "low hanging fruits"?
>>>>> 
>>>>> And I have several ideas of my own. Some of them might not be relevant due
>>>>> to the vision of the project or other reasons. Just ideas:
>>>>> - OLAP. As Zeppelin is a tool aimed at analytics, it seems to be quite
>>>>> logical to add more integrations with existing OLAP solutions like Pinot,
>>>>> ClickHouse and Druid. Currently I've found integration only with Kylin;
>>>>> - Better autocompletion. Jupyter offers not only a list of already
>>>>> initialized variables, but also quick access to documentation. It's
>>>>> convenient;
>>>>> - Notifications. Some colleagues would have appreciated the notifications
>>>>> service, which sends you messages (via mail, Slack bot or something else)
>>>>> indicating that your long-running paragraphs have completed.
>>>>> 
>>>>> Feedback is much appreciated :)
>>>>> 
>>>>> It would be wonderful if someone agreed to sacrifice his time and become a
>>>>> mentor in GSOC program!
>>>>> 
>>>>> ----------------------------------------
>>>>> Best regards, Basil Morkovkin.
>>>>> 
>>>>> 
>>>>> On Tue, Mar 5, 2019 at 11:48, Jongyoul Lee <jongy...@gmail.com> wrote:
>>>>> 
>>>>>> Hello,
>>>>>> 
>>>>>> I've confirmed I could add more issues for GSOC. Can you explain what you
>>>>>> would like to contribute to? I can add more issues
>>>>>> 
>>>>>> JL
>>>>>> 
>>>>>> On Tue, Mar 5, 2019 at 1:03 PM Xun Liu <neliu...@163.com> wrote:
>>>>>> 
>>>>>>> Hi, Vasiliy Morkovkin
>>>>>>> 
>>>>>>> Welcome to the zeppelin community! :-)
>>>>>>> 
>>>>>>>> On Mar 5, 2019, at 11:49 AM, Jongyoul Lee <jongy...@gmail.com> wrote:
>>>>>>>> 
>>>>>>>> Thanks for contacting Zeppelin with your interest.
>>>>>>>> 
>>>>>>>> I added FE topics for GSOC because FE is the most urgent issue I have
>>>>>>>> thought about. We always encourage contributions to Zeppelin on a range
>>>>>>>> of topics, including your own ideas.
>>>>>>>> 
>>>>>>>> Please tell us more.
>>>>>>>> 
>>>>>>>> Thanks.
>>>>>>>> JL
>>>>>>>> 
>>>>>>>> On Tue, Mar 5, 2019 at 10:41 AM moon soo Lee <m...@apache.org> wrote:
>>>>>>>> 
>>>>>>>>> Hi,
>>>>>>>>> 
>>>>>>>>> Great to see your interest in the project. Thanks!
>>>>>>>>> Looks like we need volunteers for a mentor and some backend subjects for
>>>>>>>>> GSoC2019.
>>>>>>>>> Any ideas?
>>>>>>>>> 
>>>>>>>>> Best,
>>>>>>>>> moon
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> On Mon, Mar 4, 2019 at 3:05 PM Vasiliy Morkovkin
>>>>>>>>> <morkovkin...@phystech.edu> wrote:
>>>>>>>>> 
>>>>>>>>>> Hi everyone, I'm pursuing a bachelor's degree at the Moscow Institute of
>>>>>>>>>> Physics and Technology and am eager to contribute to Zeppelin in the
>>>>>>>>>> context of GSOC 2019. I've become a real fan of Zeppelin over the past
>>>>>>>>>> couple of months, using it at my job. But I have found only one ticket
>>>>>>>>>> (a front-end task) with the GSOC 2019 label on your Jira. Perhaps you
>>>>>>>>>> have ideas for new features or improvements in Zeppelin but don't have
>>>>>>>>>> enough hands for them. It would be wonderful if anyone agreed to mentor
>>>>>>>>>> these ideas within GSOC :)
>>>>>>>>>> I have been working as a Scala developer (back-end) for 1.5 years.
>>>>>>>>>> I can also write in Java or Python without any problems if necessary.
>>>>>>>>>> I'm really fond of databases and highload. I also have experience with
>>>>>>>>>> some other great Apache projects like Cassandra, Kafka and Spark.
>>>>>>>>>> 
>>>>>>>>>> Best regards, Basil Morkovkin.
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> --
>>>>>>>> 이종열, Jongyoul Lee, 李宗烈
>>>>>>>> http://madeng.net
>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>>> --
>>>>>> 이종열, Jongyoul Lee, 李宗烈
>>>>>> http://madeng.net
>>>>>> 
>>>>> 
>>>> 
>>>> 
>>>> -- 
>>>> Best Regards
>>>> 
>>>> Jeff Zhang
>>> 
>> 
>> 
>> 
>> -- 
>> Best Regards
>> 
>> Jeff Zhang
> 

