Hello Spark experts, We are currently evaluating Spark on our cluster that already supports MRv2 over YARN.
We have noticed a problem when running jobs concurrently: a running Spark job will not release its resources until the job finishes. Ideally, if two people run any combination of MRv2 and Spark jobs, the resources should be shared fairly between them.

I have noticed a feature called "dynamic resource allocation" in Spark 1.2, but it does not seem to solve the problem, because it releases resources only when Spark is IDLE, not while it is BUSY. What I am looking for is an approach similar to MapReduce, where a new user obtains a fair share of the resources while other jobs are still running. I haven't been able to locate any further information on this, yet I feel it must be a pretty common issue for a lot of users.

So:

1. What is your experience with multitenant (multiple-user) Spark clusters on YARN?
2. Is Spark architecturally suited to releasing resources while it is busy? Is this a planned feature, or is it something that conflicts with the idea of Spark executors?

Thanks
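For context, this is roughly how we enabled dynamic allocation (a sketch, not our exact settings; Spark 1.2 on YARN also requires the external shuffle service to be configured on each NodeManager):

```
# spark-defaults.conf (illustrative values only)
spark.dynamicAllocation.enabled              true
spark.dynamicAllocation.minExecutors         1
spark.dynamicAllocation.maxExecutors         20
# seconds an executor must sit idle before it is released
spark.dynamicAllocation.executorIdleTimeout  60
# required by dynamic allocation so shuffle data survives executor removal
spark.shuffle.service.enabled                true
```

Even with this in place, executors are only removed once they have been idle past the timeout, so a single busy job still holds everything it was granted.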