I'm looking to bypass the master entirely. I manage the workers outside of Spark, so I want to start the driver, then start workers that connect directly to the driver.
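To make that concrete, here is roughly what I have in mind. This is a sketch only: CoarseGrainedExecutorBackend is Spark-internal and its command-line flags may differ between versions, and the driver host, port, paths, app id and core counts below are all made up. It assumes the driver is already running with a fixed spark.driver.port so the URL is known ahead of time.

import scala.sys.process._

object DirectExecutors {
  def main(args: Array[String]): Unit = {
    // Endpoint of the already-running driver. The port must match the
    // driver's spark.driver.port; "driver-host" is hypothetical.
    val driverUrl = "spark://CoarseGrainedScheduler@driver-host:40000"
    val appId = "app-00000000000000-0000" // should be the driver's sc.applicationId
    val hosts = Seq("node1", "node2", "node3")

    hosts.zipWithIndex.foreach { case (host, i) =>
      // CoarseGrainedExecutorBackend's flags are an internal detail and
      // may not match your Spark version.
      Seq("ssh", host,
          "/opt/spark/bin/spark-class",
          "org.apache.spark.executor.CoarseGrainedExecutorBackend",
          "--driver-url", driverUrl,
          "--executor-id", i.toString,
          "--hostname", host,
          "--cores", "4",
          "--app-id", appId).run() // fire-and-forget; real code would keep the Process handles
    }
  }
}

Since each executor would register straight with the driver's scheduler endpoint, neither the master nor the worker daemons would be involved.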
Anyway, it looks like I will have to live with our current solution for a while.

On Thu, May 19, 2016 at 8:32 PM Marcelo Vanzin <van...@cloudera.com> wrote:
> Hi Mathieu,
>
> There's nothing like that in Spark currently. For that, you'd need a
> new cluster manager implementation that knows how to start executors
> on those remote machines (e.g. by running ssh or something).
>
> In the current master there's an interface you can implement to try
> that if you really want to (ExternalClusterManager), but it's
> currently "private[spark]" and it probably wouldn't be a very simple
> task.
>
>
> On Thu, May 19, 2016 at 10:45 AM, Mathieu Longtin
> <math...@closetwork.org> wrote:
> > First, a bit of context: we use Spark on a platform where each user
> > starts workers as needed. This has the advantage that all permission
> > management is handled by the OS, so users can only read files they
> > have permission to.
> >
> > To do this, we have a utility that does the following:
> > - start a master
> > - start worker managers on a number of servers
> > - "submit" the Spark driver program
> > - the driver then talks to the master and tells it how many
> >   executors it needs
> > - the master tells the worker nodes to start executors and talk to
> >   the driver
> > - the executors are started
> >
> > From here on, the master doesn't do much, and neither do the process
> > managers on the worker nodes.
> >
> > What I would like to do is simplify this to:
> > - Start the driver program
> > - Start executors on a number of servers, telling them where to find
> >   the driver
> > - The executors connect directly to the driver
> >
> > Is there a way I could do this without the master and the worker
> > managers?
> >
> > Thanks!
> >
> > --
> > Mathieu Longtin
> > 1-514-803-8977
>
> --
> Marcelo
--
Mathieu Longtin
1-514-803-8977
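P.S. For the archives, a rough sketch of what the ExternalClusterManager route Marcelo mentions might look like. The trait's shape is as I read it in the current master and may well change before it's public; since it's private[spark], the class has to live in an org.apache.spark package, and the "ssh://" master URL scheme and the SshClusterManager name are inventions for this example.

package org.apache.spark.scheduler

import org.apache.spark.SparkContext
import org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend

// Sketch only: the trait is private[spark], so this has to compile inside
// an org.apache.spark package. The "ssh://" URL scheme is made up here.
private[spark] class SshClusterManager extends ExternalClusterManager {

  // SparkContext picks the manager whose canCreate accepts the master URL.
  override def canCreate(masterURL: String): Boolean =
    masterURL.startsWith("ssh://")

  override def createTaskScheduler(sc: SparkContext, masterURL: String): TaskScheduler =
    new TaskSchedulerImpl(sc)

  override def createSchedulerBackend(
      sc: SparkContext,
      masterURL: String,
      scheduler: TaskScheduler): SchedulerBackend = {
    // A real implementation would subclass CoarseGrainedSchedulerBackend
    // and, in start(), ssh to the hosts named in masterURL to launch
    // executor backends that connect straight back to the driver.
    new CoarseGrainedSchedulerBackend(
      scheduler.asInstanceOf[TaskSchedulerImpl], sc.env.rpcEnv)
  }

  override def initialize(scheduler: TaskScheduler, backend: SchedulerBackend): Unit =
    scheduler.asInstanceOf[TaskSchedulerImpl].initialize(backend)
}

If I read the code right, you'd register the class in META-INF/services/org.apache.spark.scheduler.ExternalClusterManager and then run with a master URL like ssh://node1,node2,node3.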