GitHub user Leemoonsoo reopened a pull request:

    https://github.com/apache/zeppelin/pull/3240

    [ZEPPELIN-3840] Zeppelin on Kubernetes

    ### What type of PR is it?
    This PR adds the ability to run Zeppelin on Kubernetes. It aims for:
    
     - Zero configuration to start Zeppelin on Kubernetes.
     - Running everything on Kubernetes: Zeppelin, interpreters, and Spark.
     - High customizability to adapt to various user demands and extensions.
    
    Key features are:
    
     - Provides a `zeppelin-server.yaml` file for `kubectl` to run Zeppelin server.
     - All interpreters automatically run as Pods.
     - Spark interpreter is automatically configured to use [Spark on Kubernetes](https://spark.apache.org/docs/latest/running-on-kubernetes.html).
     - A reverse proxy is configured to provide access to Spark UI.
    
    To do
     - [ ] Document how the reverse proxy for Spark UI works and how to configure a custom domain.
     - [ ] Document how to customize zeppelin-server and interpreter yaml.
     - [ ] Document new configurations
    
    ### How it works
    
    #### Run Zeppelin Server on Kubernetes
    `k8s/zeppelin-server.yaml` is provided to run Zeppelin Server with a few
    sidecars and configurations.
    This file is easy to publish (users can fetch it with `curl`) and highly
    customizable, while still including everything necessary.
    
    #### K8s Interpreter launcher
    This PR adds a new module, `launcher-k8s-standard`, under the
    `zeppelin/zeppelin-plugins/launcher/k8s-standard/` directory. This launcher is
    [automatically selected](https://github.com/apache/zeppelin/pull/3240/files#diff-82fddd2ffb77aaffc4b9cf7b5b1eaa79)
    when Zeppelin is running on Kubernetes. The launcher handles both the Spark
    interpreter and all other interpreters.
    
    The launcher launches each interpreter as a Pod using the template
    [k8s/interpreter/100-interpreter-pod.yaml](https://github.com/apache/zeppelin/pull/3240/files#diff-d9ce62e2c992d32f0184d7edb862f3c4).
    The filename has a `100-` prefix because the launcher consumes all files in
    the directory in alphabetical order on interpreter start/stop. Users can drop
    more files here to extend or customize the interpreter, and use filenames to
    control the order. The template is rendered by
    [jinjava](https://github.com/HubSpot/jinjava).
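
As a rough illustration of the ordering rule described above, here is a minimal Python sketch (the launcher itself is Java and renders with jinjava; this is not its code). It assumes file names are compared as plain strings. The `200-extra-service.yaml` file, the `pod_name` variable, and the tiny `{{ name }}` substitution are hypothetical stand-ins for illustration only.

```python
import re
import tempfile
from pathlib import Path


def ordered_specs(spec_dir):
    """Return the spec files in a directory sorted by file name."""
    files = (p for p in Path(spec_dir).iterdir() if p.is_file())
    return sorted(files, key=lambda p: p.name)


def render(template_text, variables):
    """Replace {{ name }} placeholders; a tiny stand-in for a Jinja-style engine."""
    def substitute(match):
        return str(variables[match.group(1)])
    return re.sub(r"\{\{\s*(\w+)\s*\}\}", substitute, template_text)


# Demo with hypothetical contents; only 100-interpreter-pod.yaml is named in the PR.
with tempfile.TemporaryDirectory() as spec_dir:
    Path(spec_dir, "200-extra-service.yaml").write_text("name: {{ pod_name }}-extra\n")
    Path(spec_dir, "100-interpreter-pod.yaml").write_text("name: {{ pod_name }}\n")
    specs = ordered_specs(spec_dir)
    names = [p.name for p in specs]
    rendered = [render(p.read_text(), {"pod_name": "spark-axefeg"}) for p in specs]
```

With plain string comparison, a name such as `99-foo.yaml` would sort after `100-interpreter-pod.yaml`, so fixed-width numeric prefixes are the safer choice if the launcher orders names this way.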
    
    #### Spark interpreter
    
    When the interpreter group is `spark`, K8sRemoteInterpreterProcess
    [sets the necessary Spark configuration](https://github.com/apache/zeppelin/pull/3240/files#diff-6d1d3084f55bdd519e39ede4a619e73dR297)
    automatically to use [Spark on Kubernetes](https://spark.apache.org/docs/latest/running-on-kubernetes.html).
    The user doesn't have to configure anything. It uses client mode.
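
The exact properties set by K8sRemoteInterpreterProcess are in the linked diff. As a hedged sketch of what a client mode Spark on Kubernetes configuration generally involves, the Python below builds a property map from standard Spark property names. The function name and every value passed to it (API server address, namespace, image, driver host) are illustrative assumptions, not values taken from this PR.

```python
def spark_k8s_client_conf(api_server, namespace, image, driver_host):
    """Build Spark properties for client mode on Kubernetes (illustrative only)."""
    return {
        # Spark expects the Kubernetes master URL in the form k8s://<api server URL>.
        "spark.master": f"k8s://{api_server}",
        "spark.submit.deployMode": "client",
        "spark.kubernetes.namespace": namespace,
        "spark.kubernetes.container.image": image,
        # In client mode the executors must be able to reach the driver,
        # which here would run inside the interpreter Pod.
        "spark.driver.host": driver_host,
    }


# Hypothetical values for illustration.
conf = spark_k8s_client_conf(
    api_server="https://kubernetes.default.svc",
    namespace="default",
    image="spark:2.4.0",
    driver_host="spark-axefeg.default.svc",
)
submit_args = [f"--conf {key}={value}" for key, value in sorted(conf.items())]
```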
    
    #### Spark UI
    
    We could make users manually configure port-forwarding or take some other
    step to access Spark UI, but that's not optimal. Ideally, Spark UI is
    automatically accessible whenever a user has access to Zeppelin UI, without
    any extra configuration.
    
    To enable this, the Zeppelin server Pod has a reverse proxy as a sidecar,
    which splits traffic between Zeppelin server and the Spark UI running in the
    other Pod. It assumes that both `service.domain.com` and
    `*.service.domain.com` point to the nginx proxy address. `service.domain.com`
    is directed to ZeppelinServer, and `*.service.domain.com` is directed to the
    interpreter Pod.
    
    `<port>-<interpreter pod svc name>.service.domain.com` is the convention for
    accessing any application running in an interpreter Pod. If the Spark
    interpreter Pod is running with the name `spark-axefeg` and Spark UI is
    running on port 4040,
    
    ```
    4040-spark-axefeg.service.domain.com
    ```
    
    is the address to access Spark UI. The default service domain is
    [localtest.me:8080](https://github.com/apache/zeppelin/pull/3240/files#diff-56ccb2e2c2617b27dbaae866d9431e51R22).
    Because `localtest.me` and `*.localtest.me` point to `127.0.0.1`, this works
    with `kubectl port-forward`.
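
The host naming convention above can be sketched in a few lines of Python. This only illustrates the routing decision; the real split is done by the nginx sidecar, and the function names and return shape here are hypothetical.

```python
def spark_ui_host(port, svc_name, service_domain):
    """Build the host name for an application running in an interpreter Pod."""
    return f"{port}-{svc_name}.{service_domain}"


def route(host, service_domain):
    """Decide where a request goes based on its host name.

    Returns ("zeppelin-server", None, None) for the bare service domain, or
    ("interpreter", svc_name, port) for <port>-<svc name>.<service domain>.
    """
    host = host.lower()
    if host == service_domain:
        return ("zeppelin-server", None, None)
    suffix = "." + service_domain
    if not host.endswith(suffix):
        raise ValueError(f"{host!r} is not under {service_domain!r}")
    label = host[: -len(suffix)]
    # The first dash separates the port from the service name,
    # so service names may themselves contain dashes.
    port_text, dash, svc_name = label.partition("-")
    if not dash or not port_text.isdigit() or not svc_name:
        raise ValueError(f"{label!r} does not match <port>-<svc name>")
    return ("interpreter", svc_name, int(port_text))


target = route("4040-spark-axefeg.service.domain.com", "service.domain.com")
local_host = spark_ui_host(4040, "spark-axefeg", "localtest.me:8080")
```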
    
    
    ### What is the Jira issue?
    https://issues.apache.org/jira/browse/ZEPPELIN-3840
    
    ### How should this be tested?
    
    Prepare a Kubernetes cluster with enough resources (cpus > 5, mem > 6g).
    If you're using [minikube](https://github.com/kubernetes/minikube), check
    your capacity with the `kubectl describe node` command before you start.
    
    You'll need to build a Zeppelin docker image and a Spark docker image to
    test. Please follow the guide in docs/quickstart/kubernetes.md.
    
    To quickly try it without building docker images, I have uploaded pre-built
    images to Docker Hub: `moon/zeppelin:0.9.0-SNAPSHOT` and `moon/spark:2.4.0`.
    Try the following commands:
    
    ```
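    # The variable holds a whole pipeline as text, so it must be run through
    # eval; a bare $ZEPPELIN_SERVER_YAML would pass "|" and "sed" to curl as arguments.
    ZEPPELIN_SERVER_YAML="curl -s https://raw.githubusercontent.com/Leemoonsoo/zeppelin/kubernetes/k8s/zeppelin-server.yaml \
      | sed 's/apache\/zeppelin:0.9.0-SNAPSHOT/moon\/zeppelin:0.9.0-SNAPSHOT/' \
      | sed 's/spark:2.4.0/moon\/spark:2.4.0/'"
    eval "$ZEPPELIN_SERVER_YAML" | kubectl apply -f -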
    ```
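
To make the effect of the two `sed` substitutions in the command above concrete, here is a small Python sketch of the same rewrite. The sample lines are hypothetical and not copied from `zeppelin-server.yaml`. The sketch matches literal strings, whereas `sed` treats `.` as a wildcard.

```python
SUBSTITUTIONS = [
    ("apache/zeppelin:0.9.0-SNAPSHOT", "moon/zeppelin:0.9.0-SNAPSHOT"),
    ("spark:2.4.0", "moon/spark:2.4.0"),
]


def rewrite_images(yaml_text):
    """Apply each substitution to the first match on every line (sed without the g flag)."""
    lines = []
    for line in yaml_text.splitlines():
        for old, new in SUBSTITUTIONS:
            line = line.replace(old, new, 1)
        lines.append(line)
    return "\n".join(lines)


# Hypothetical input lines for illustration.
sample = "image: apache/zeppelin:0.9.0-SNAPSHOT\nvalue: spark:2.4.0"
rewritten = rewrite_images(sample)
```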
    
    And port forward
    
    ```
    kubectl port-forward zeppelin-server 8080:80
    ```
    
    And browse http://localhost:8080
    
    To clean up
    
    ```
    eval "$ZEPPELIN_SERVER_YAML" | kubectl delete -f -
    ```
    
    ### Screenshots (if appropriate)
    See this video https://youtu.be/7E4ZGn4pnTo
    
    
    ### Future work
    
     - Per interpreter docker image
    
    ### Questions:
    * Do the license files need an update? no
    * Are there breaking changes for older versions? no
    * Does this need documentation? yes


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/Leemoonsoo/zeppelin kubernetes

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/zeppelin/pull/3240.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #3240
    
----
commit d2f3d5b7e1ad00da9148e6baf866d0d0506274e2
Author: Lee moon soo <moon@...>
Date:   2018-11-05T17:43:19Z

    add k8s-standard launcher module

commit 07489f76df68e68b4b85025d71bf5b2afc460098
Author: Lee moon soo <leemoonsoo@...>
Date:   2018-11-18T23:05:25Z

    kubectl with exec

commit 5f602a65ef9700b6a5eb05790ff5e8585c8d26bf
Author: Lee moon soo <moon@...>
Date:   2018-11-20T03:38:00Z

    K8sRemoteInterpreterProcess

commit 52bb6c7e15ab2fe64fa4089c7edf982acf407af9
Author: Lee moon soo <leemoonsoo@...>
Date:   2018-11-21T20:31:20Z

    add k8s dir in package

commit 36cf391a4256636336b83c7010c0db12867f49dd
Author: Lee moon soo <leemoonsoo@...>
Date:   2018-11-21T20:31:53Z

    correct plugin name

commit 58f9f19094afee7121c6cdebd83148742b0b1ce4
Author: Lee moon soo <leemoonsoo@...>
Date:   2018-11-21T20:32:26Z

    add rbac

commit 2fd2ac8c396a570b58cf77a6e32e09a2df4a89b8
Author: Lee moon soo <leemoonsoo@...>
Date:   2018-11-21T20:32:52Z

    kubernetes mode configuration

commit 9f1b7a1691020ca43a91d14be8a8ebe9ab37adfe
Author: Lee moon soo <leemoonsoo@...>
Date:   2018-11-21T20:33:22Z

    run kubernetes launcher

commit 0dea3836b09c90bb7cd45c288d8a8e0b37d0cd0e
Author: Lee moon soo <leemoonsoo@...>
Date:   2018-11-22T02:56:59Z

    create and connect interpreter pod

commit 18b8f68cb11e54ed54b222e7da670c32902c2d79
Author: Lee moon soo <leemoonsoo@...>
Date:   2018-11-22T04:02:10Z

    print spec file contents on debug log

commit 86e8764357c5645d248ae468d9322443b01e6ca9
Author: Lee moon soo <leemoonsoo@...>
Date:   2018-11-22T04:02:27Z

    add services on RBAC

commit 7fe9823b1f2d6f311104af38b674b77494e2e970
Author: Lee moon soo <leemoonsoo@...>
Date:   2018-11-22T05:12:40Z

    interpreter pod cascade delete on zeppelin-server delete

commit 263d859d426339eddfd02554fde32f870cac710b
Author: Lee moon soo <leemoonsoo@...>
Date:   2018-11-22T05:33:59Z

    use headless service for interpreter pod

commit 7a87367561a11371a7bb77858d39c5821a5d870f
Author: Lee moon soo <leemoonsoo@...>
Date:   2018-11-22T21:12:14Z

    configure spark on kubernetes

commit a4072e6b90c57514b8ea690abef15ddbdb370a9d
Author: Lee moon soo <leemoonsoo@...>
Date:   2018-11-22T22:35:09Z

    add signal handler

commit 2960dcb878eb29eb2fb515f50bc728bd659498ec
Author: Lee moon soo <leemoonsoo@...>
Date:   2018-11-22T22:35:34Z

    configure namespace

commit b0e2c36c68b28310afe98272f6bc6b1408d305d7
Author: Lee moon soo <leemoonsoo@...>
Date:   2018-11-22T22:35:57Z

    Rbac role, rolebinding

commit 0d472ea522ec8d99118bbaed4734fdb36cfc8076
Author: Lee moon soo <leemoonsoo@...>
Date:   2018-11-23T03:15:21Z

    load properties and environment variables

commit f4166ad04c088c900ba8586445e02a9242b2934b
Author: Lee moon soo <leemoonsoo@...>
Date:   2018-11-23T04:31:12Z

    make spark container image configurable

commit 2b579ff12943bf76ae6aadacd52bceda3cb5b382
Author: Lee moon soo <leemoonsoo@...>
Date:   2018-11-23T05:49:47Z

    let user override namespace

commit f30561189ce0e92511d59b7190908d376f2264c5
Author: Lee moon soo <leemoonsoo@...>
Date:   2018-11-23T05:50:17Z

    rename file

commit 0f7c0d4e8671ebde3b281411cb66fa692a272b37
Author: Lee moon soo <leemoonsoo@...>
Date:   2018-11-23T05:53:50Z

    add license

commit 9341fcbfea43264c181c57e3e4a73b68ab9b60b6
Author: Lee moon soo <leemoonsoo@...>
Date:   2018-11-23T05:54:11Z

    install kubectl and configure log4j in docker image

commit 3078bac55036a4f16125268c18ab3b6340bd53d7
Author: Lee moon soo <leemoonsoo@...>
Date:   2018-11-24T05:44:45Z

    spark ui support

commit ec09b8b88226d477566d5f77ce3218bda9242499
Author: Lee moon soo <leemoonsoo@...>
Date:   2018-11-24T07:15:28Z

    add test

commit e9ce64fe790dff110af849322720f475e949d100
Author: Lee moon soo <leemoonsoo@...>
Date:   2018-11-24T08:42:02Z

    update dockerfile

commit 64a56b5c94194829b469d9dd30aad8c8c6e7a51f
Author: Lee moon soo <leemoonsoo@...>
Date:   2018-11-24T10:39:02Z

    Add docs

----

