GitHub user Leemoonsoo reopened a pull request: https://github.com/apache/zeppelin/pull/3240
[ZEPPELIN-3840] Zeppelin on Kubernetes

### What type of PR is it?

This PR adds the ability to run Zeppelin on Kubernetes. It aims for

- Zero configuration to start Zeppelin on Kubernetes.
- Running everything on Kubernetes: Zeppelin, interpreters, and Spark.
- High customizability, to accommodate varied user demands and extensions.

Key features are

- Provides a `zeppelin-server.yaml` file for `kubectl` to run Zeppelin server.
- All interpreters automatically run as Pods.
- The Spark interpreter is automatically configured to use [Spark on Kubernetes](https://spark.apache.org/docs/latest/running-on-kubernetes.html).
- A reverse proxy is configured to access the Spark UI.

To do

- [ ] Document how the reverse proxy for Spark UI works and how to configure a custom domain.
- [ ] Document how to customize the zeppelin-server and interpreter yaml.
- [ ] Document new configurations.

### How it works

#### Run Zeppelin Server on Kubernetes

`k8s/zeppelin-server.yaml` is provided to run Zeppelin Server with a few sidecars and configurations. This file is easy to publish (users can consume it with `curl`) and highly customizable, while still including everything necessary.

#### K8s Interpreter launcher

This PR adds a new module, `launcher-k8s-standard`, under the `zeppelin/zeppelin-plugins/launcher/k8s-standard/` directory. This launcher is [automatically selected](https://github.com/apache/zeppelin/pull/3240/files#diff-82fddd2ffb77aaffc4b9cf7b5b1eaa79) when Zeppelin is running on Kubernetes. The launcher handles both the Spark interpreter and all other interpreters.

The launcher launches an interpreter as a Pod using the template [k8s/interpreter/100-interpreter-pod.yaml](https://github.com/apache/zeppelin/pull/3240/files#diff-d9ce62e2c992d32f0184d7edb862f3c4). The filename has a `100-` prefix because the launcher consumes all files in the directory in alphabetical order on interpreter start/stop. Users can drop more files here to extend or customize the interpreter, and the filename can be used to control the order.
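The role of the numeric prefix can be illustrated with a small shell sketch. It is not the launcher's code: it assumes the launcher compares filenames as plain strings, and the helper `list_specs` and every filename other than `100-interpreter-pod.yaml` are hypothetical.

```shell
# Hypothetical helper: print the files of a directory in the order a
# plain byte-wise string sort would visit them.
list_specs() {
  # LC_ALL=C keeps the sort independent of the user's locale
  ( cd "$1" && ls -1 | LC_ALL=C sort )
}

# Throwaway directory with made-up extra spec files, for illustration only.
demo_dir=$(mktemp -d)
touch "$demo_dir/100-interpreter-pod.yaml" \
      "$demo_dir/050-extra-configmap.yaml" \
      "$demo_dir/200-extra-service.yaml"

list_specs "$demo_dir"
# prints:
#   050-extra-configmap.yaml
#   100-interpreter-pod.yaml
#   200-extra-service.yaml
```

Under this string-ordering assumption, a prefix such as `50-` would sort after `100-`, so zero-padding the prefix to three digits keeps the intended order.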
The template is rendered by [jinjava](https://github.com/HubSpot/jinjava).

#### Spark interpreter

When the interpreter group is `spark`, K8sRemoteInterpreterProcess [sets the necessary Spark configuration](https://github.com/apache/zeppelin/pull/3240/files#diff-6d1d3084f55bdd519e39ede4a619e73dR297) automatically to use [Spark on Kubernetes](https://spark.apache.org/docs/latest/running-on-kubernetes.html). The user doesn't have to configure anything. It uses client mode.

#### Spark UI

We could make the user manually configure port-forwarding or take some other step to access the Spark UI, but that's not optimal. Ideally, the Spark UI is automatically accessible whenever the user has access to the Zeppelin UI, without any extra configuration.

To enable this, the Zeppelin server Pod has a reverse proxy as a sidecar, which splits traffic between the Zeppelin server and the Spark UI running in the other Pod. It assumes both `service.domain.com` and `*.service.domain.com` point to the nginx proxy address. `service.domain.com` is directed to ZeppelinServer, and `*.service.domain.com` is directed to the interpreter Pod.

`<port>-<interpreter pod svc name>.service.domain.com` is the convention for accessing any application running in an interpreter Pod. If the Spark interpreter Pod is running with the name `spark-axefeg` and the Spark UI is running on port 4040,

```
4040-spark-axefeg.service.domain.com
```

is the address to access the Spark UI.

The default service domain is [localtest.me:8080](https://github.com/apache/zeppelin/pull/3240/files#diff-56ccb2e2c2617b27dbaae866d9431e51R22). `localtest.me` and `*.localtest.me` point to `127.0.0.1`, so it works with `kubectl port-forward`.

### What is the Jira issue?

https://issues.apache.org/jira/browse/ZEPPELIN-3840

### How should this be tested?

Prepare a Kubernetes cluster with enough resources (cpus > 5, mem > 6g). If you're using [minikube](https://github.com/kubernetes/minikube), check your capacity using the `kubectl describe node` command before starting.
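When testing, it may help to have the Spark UI address convention from the Spark UI section in concrete form. The following is a minimal sketch, not part of the PR: the helper `interpreter_app_host` is hypothetical, and `spark-axefeg` is only the example name from the description, so the actual interpreter Pod service name in your cluster will differ.

```shell
# Hypothetical helper: build the host name for an application running in
# an interpreter Pod, following <port>-<svc name>.<service domain>.
interpreter_app_host() {
  port=$1     # port the application listens on, e.g. 4040 for Spark UI
  svc=$2      # interpreter Pod service name
  domain=$3   # service domain that points to the reverse proxy
  printf '%s-%s.%s\n' "$port" "$svc" "$domain"
}

# Example from the description above.
interpreter_app_host 4040 spark-axefeg service.domain.com
# prints: 4040-spark-axefeg.service.domain.com

# Same Pod with the default service domain used for local testing.
interpreter_app_host 4040 spark-axefeg localtest.me:8080
# prints: 4040-spark-axefeg.localtest.me:8080
```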
You'll need to build the Zeppelin docker image and the Spark docker image to test. Please follow the guide in docs/quickstart/kubernetes.md.

To try it quickly without building docker images, I have uploaded pre-built images to Docker Hub: `moon/zeppelin:0.9.0-SNAPSHOT` and `moon/spark:2.4.0`. Try the following commands

```
# Fetch the server spec and point it at the pre-built images.
# A function is used because a pipeline stored in a variable is not
# re-parsed as a pipeline when the variable is expanded.
zeppelin_server_yaml() {
  curl -s https://raw.githubusercontent.com/Leemoonsoo/zeppelin/kubernetes/k8s/zeppelin-server.yaml \
    | sed 's/apache\/zeppelin:0.9.0-SNAPSHOT/moon\/zeppelin:0.9.0-SNAPSHOT/' \
    | sed 's/spark:2.4.0/moon\/spark:2.4.0/'
}
zeppelin_server_yaml | kubectl apply -f -
```

And port forward

```
kubectl port-forward zeppelin-server 8080:80
```

And browse http://localhost:8080

To clean up

```
zeppelin_server_yaml | kubectl delete -f -
```

### Screenshots (if appropriate)

See this video https://youtu.be/7E4ZGn4pnTo

### Future work

- Per interpreter docker image

### Questions:

* Do the license files need an update? no
* Are there breaking changes for older versions? no
* Does this need documentation?
yes

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/Leemoonsoo/zeppelin kubernetes

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/zeppelin/pull/3240.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #3240

----
commit d2f3d5b7e1ad00da9148e6baf866d0d0506274e2
Author: Lee moon soo <moon@...>
Date:   2018-11-05T17:43:19Z

    add k8s-standard launcher module

commit 07489f76df68e68b4b85025d71bf5b2afc460098
Author: Lee moon soo <leemoonsoo@...>
Date:   2018-11-18T23:05:25Z

    kubectl with exec

commit 5f602a65ef9700b6a5eb05790ff5e8585c8d26bf
Author: Lee moon soo <moon@...>
Date:   2018-11-20T03:38:00Z

    K8sRemoteInterpreterProcess

commit 52bb6c7e15ab2fe64fa4089c7edf982acf407af9
Author: Lee moon soo <leemoonsoo@...>
Date:   2018-11-21T20:31:20Z

    add k8s dir in package

commit 36cf391a4256636336b83c7010c0db12867f49dd
Author: Lee moon soo <leemoonsoo@...>
Date:   2018-11-21T20:31:53Z

    correct plugin name

commit 58f9f19094afee7121c6cdebd83148742b0b1ce4
Author: Lee moon soo <leemoonsoo@...>
Date:   2018-11-21T20:32:26Z

    add rbac

commit 2fd2ac8c396a570b58cf77a6e32e09a2df4a89b8
Author: Lee moon soo <leemoonsoo@...>
Date:   2018-11-21T20:32:52Z

    kubernetes mode configuration

commit 9f1b7a1691020ca43a91d14be8a8ebe9ab37adfe
Author: Lee moon soo <leemoonsoo@...>
Date:   2018-11-21T20:33:22Z

    run kubernetes launcher

commit 0dea3836b09c90bb7cd45c288d8a8e0b37d0cd0e
Author: Lee moon soo <leemoonsoo@...>
Date:   2018-11-22T02:56:59Z

    create and connect interpreter pod

commit 18b8f68cb11e54ed54b222e7da670c32902c2d79
Author: Lee moon soo <leemoonsoo@...>
Date:   2018-11-22T04:02:10Z

    print spec file contents on debug log

commit 86e8764357c5645d248ae468d9322443b01e6ca9
Author: Lee moon soo <leemoonsoo@...>
Date:   2018-11-22T04:02:27Z

    add services on RBAC

commit 7fe9823b1f2d6f311104af38b674b77494e2e970
Author: Lee moon soo <leemoonsoo@...>
Date:   2018-11-22T05:12:40Z

    interpreter pod cascade delete on zeppelin-server delete

commit 263d859d426339eddfd02554fde32f870cac710b
Author: Lee moon soo <leemoonsoo@...>
Date:   2018-11-22T05:33:59Z

    use headless service for interpreter pod

commit 7a87367561a11371a7bb77858d39c5821a5d870f
Author: Lee moon soo <leemoonsoo@...>
Date:   2018-11-22T21:12:14Z

    configure spark on kubernetes

commit a4072e6b90c57514b8ea690abef15ddbdb370a9d
Author: Lee moon soo <leemoonsoo@...>
Date:   2018-11-22T22:35:09Z

    add signal handler

commit 2960dcb878eb29eb2fb515f50bc728bd659498ec
Author: Lee moon soo <leemoonsoo@...>
Date:   2018-11-22T22:35:34Z

    configure namespace

commit b0e2c36c68b28310afe98272f6bc6b1408d305d7
Author: Lee moon soo <leemoonsoo@...>
Date:   2018-11-22T22:35:57Z

    Rbac role, rolebinding

commit 0d472ea522ec8d99118bbaed4734fdb36cfc8076
Author: Lee moon soo <leemoonsoo@...>
Date:   2018-11-23T03:15:21Z

    load properties and environment variables

commit f4166ad04c088c900ba8586445e02a9242b2934b
Author: Lee moon soo <leemoonsoo@...>
Date:   2018-11-23T04:31:12Z

    make spark container image configurable

commit 2b579ff12943bf76ae6aadacd52bceda3cb5b382
Author: Lee moon soo <leemoonsoo@...>
Date:   2018-11-23T05:49:47Z

    let user override namespace

commit f30561189ce0e92511d59b7190908d376f2264c5
Author: Lee moon soo <leemoonsoo@...>
Date:   2018-11-23T05:50:17Z

    rename file

commit 0f7c0d4e8671ebde3b281411cb66fa692a272b37
Author: Lee moon soo <leemoonsoo@...>
Date:   2018-11-23T05:53:50Z

    add license

commit 9341fcbfea43264c181c57e3e4a73b68ab9b60b6
Author: Lee moon soo <leemoonsoo@...>
Date:   2018-11-23T05:54:11Z

    install kubectl and configure log4j in docker image

commit 3078bac55036a4f16125268c18ab3b6340bd53d7
Author: Lee moon soo <leemoonsoo@...>
Date:   2018-11-24T05:44:45Z

    spark ui support

commit ec09b8b88226d477566d5f77ce3218bda9242499
Author: Lee moon soo <leemoonsoo@...>
Date:   2018-11-24T07:15:28Z

    add test

commit e9ce64fe790dff110af849322720f475e949d100
Author: Lee moon soo <leemoonsoo@...>
Date:   2018-11-24T08:42:02Z

    update dockerfile

commit 64a56b5c94194829b469d9dd30aad8c8c6e7a51f
Author: Lee moon soo <leemoonsoo@...>
Date:   2018-11-24T10:39:02Z

    Add docs

----