> On 7 Oct 2015, at 09:26, Dominik Fries <dominik.fr...@woodmark.de> wrote:
>
> Hello Folks,
>
> We want to deploy several spark projects and want to use a unique project
> user for each of them. Only the project user should start the spark
> application and have the corresponding packages installed.
>
> Furthermore a personal user, which belongs to a specific project, should
> start a spark application via the corresponding spark project user as proxy.
> (Development)
>
> The Application is currently running with ipython / pyspark. (HDP 2.3 -
> Spark 1.3.1)
>
> Is this possible or what is the best practice for a spark multi tenancy
> environment ?
>

Deploy on a Kerberized YARN cluster: each application instance then runs as a different Unix user in the cluster, with the appropriate access to HDFS, so the applications are isolated from one another. The issue then becomes "do workloads clash with each other?". If you want to isolate dev and production, using node labels to keep dev work off the production nodes is the standard technique.
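
As a rough illustration of how a submission might look, here is a minimal Python sketch that only assembles a spark-submit command line. The helper name `build_submit_command`, the script name `analysis.py`, the queue and label name `dev`, and the user `project_a` are all made up for the example. The `--proxy-user` option needs matching Hadoop proxy-user (impersonation) settings on the cluster, and the `spark.yarn.executor.nodeLabelExpression` property is, as far as I recall, only available in Spark releases newer than 1.3.1, so treat that part as an assumption to check against your version. On older versions a YARN queue whose default node label expression points at the dev nodes may be the more practical route.

```python
def build_submit_command(app_path, queue, proxy_user=None, executor_label=None):
    """Return a spark-submit argument list for a YARN deployment.

    All names passed in are placeholders chosen by the caller; nothing
    here contacts the cluster, it only assembles the command line.
    """
    # yarn-client is the Spark 1.x master URL for client-mode YARN jobs.
    cmd = ["spark-submit", "--master", "yarn-client", "--queue", queue]

    if proxy_user:
        # Run the application as the project user instead of the caller.
        cmd += ["--proxy-user", proxy_user]

    if executor_label:
        # Assumption: this property is honoured by the Spark version in use.
        cmd += ["--conf",
                "spark.yarn.executor.nodeLabelExpression=" + executor_label]

    cmd.append(app_path)
    return cmd


if __name__ == "__main__":
    # Hypothetical dev submission: personal user acting as project_a.
    print(" ".join(build_submit_command(
        "analysis.py", "dev", proxy_user="project_a", executor_label="dev")))
```

The returned list can be handed to `subprocess.call` on a gateway node where the personal user already holds a valid Kerberos ticket.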