Thanks for the clarification, Till. I agree that, with the current semantics of per-job mode, one should deploy a new cluster for each part of the job. Apart from the performance concern, it also means that PerJobExecutor actually knows how to deploy a cluster, which differs from the description that an Executor submits a job.
Anyway, it sounds workable and narrows the changes.