Hello Spark experts, We are currently evaluating Spark on our cluster that already supports MRv2 over YARN.
We have noticed a problem when running jobs concurrently: a running Spark job will not release its resources until the job finishes. Ideally, if two people run any combination of MRv2 and Spark jobs, the resources should be shared fairly between them.

I have noticed a feature called "dynamic resource allocation" in Spark 1.2, but it does not seem to solve the problem, because it releases resources only when Spark is IDLE, not while it is BUSY. What I am looking for is an approach similar to MapReduce, where a new user obtains a fair share of the resources while other jobs are still running. I haven't been able to locate any further information on this, yet I feel it must be a pretty common issue for a lot of users.

So:

1. What is your experience with multitenant (multiple-user) Spark clusters on YARN?
2. Is Spark architecturally suited to releasing resources while it is busy? Is this a planned feature, or is it something that conflicts with the idea of Spark executors?

Thanks
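For context, this is roughly how we enabled dynamic allocation (a sketch, not our exact settings; Spark 1.2 on YARN also requires the external shuffle service to be configured on each NodeManager):

```
# spark-defaults.conf (illustrative values only)
spark.dynamicAllocation.enabled              true
spark.dynamicAllocation.minExecutors         1
spark.dynamicAllocation.maxExecutors         20
# seconds an executor must sit idle before it is released
spark.dynamicAllocation.executorIdleTimeout  60
# required by dynamic allocation so shuffle data survives executor removal
spark.shuffle.service.enabled                true
```

Even with this in place, executors are only removed once they have been idle past the timeout, so a single busy job still holds everything it was granted.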