Hello, everyone,

Because Zeppelin has more than 20 interpreters, data analysts can use it for many kinds of data development, and much of that work is interdependent. For example, developing a machine learning algorithm relies on Spark to preprocess the data, and so on.
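To make the dependency idea concrete, here is a minimal sketch in plain Python. It is not Zeppelin code and does not use any Zeppelin API; the note names and the `execution_order` helper are made up for illustration. It only shows how "note B depends on note A" can be turned into a run order, assuming Python 3.9 or later for the standard `graphlib` module.

```python
from graphlib import TopologicalSorter

# Hypothetical note names (not real Zeppelin notes).
# Each note maps to the set of notes that must finish before it can run.
NOTE_DEPENDENCIES = {
    "spark_preprocess": set(),
    "ml_training": {"spark_preprocess"},
}


def execution_order(dependencies):
    """Return note names ordered so that every dependency comes first.

    Raises graphlib.CycleError if the notes depend on each other in a loop.
    """
    return list(TopologicalSorter(dependencies).static_order())


if __name__ == "__main__":
    for note in execution_order(NOTE_DEPENDENCIES):
        print("run note:", note)
```

A built-in workflow would presumably track this kind of ordering itself, rather than leaving it to an external scheduler.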
Zeppelin should have built-in workflow capabilities instead of relying on external software to schedule its notes, for the following reasons:

1. We have moved from the data processing era to the algorithm era. With its own workflow, Zeppelin will have a complete ecosystem covering both data processing and algorithm operations.
2. Zeppelin's powerful interactive processing capabilities help algorithm engineers work more productively. Zeppelin should give algorithm engineers more direct control, instead of having them hand the algorithm to other teams (or software) to build the workflow.
3. Zeppelin knows more about the processing status of the data than Azkaban or Airflow do, so a built-in workflow will have better performance, user experience, and control.

Typical use case

This matters especially in machine learning, where tasks generally run for a long time. A typical example is as follows:

1) Obtain data from HDFS through Spark;
2) Clean and convert the data through Spark SQL;
3) Extract features from the data through Spark;
4) Write the TensorFlow algorithm through Hadoop Submarine;
5) Distribute the TensorFlow algorithm as a job to YARN or k8s for batch processing;
6) Publish the trained model and provide online prediction services;
7) Run model prediction with Flink;
8) Receive incremental data through Flink to update the model incrementally.

Therefore, Zeppelin especially needs the ability to arrange workflows.

I have completed a draft of the Zeppelin workflow system design. Please review it; you can modify the document directly or add comments.

JIRA: https://issues.apache.org/jira/browse/ZEPPELIN-4018
gdoc: https://docs.google.com/document/d/1pQjVifOC1knPBuw3LVvby7GyNDXaeBq1ltRg6x4vDxM/edit

:-)

Xun Liu
2019-03-11