The driver (the process started by spark-submit) runs locally, and the executors run on any of thousands of servers. So far, I haven't tried more than 500 executors. Right now, I run a master on the same server as the driver.
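To make this concrete, the current launch sequence looks roughly like the sketch below. hosts.txt, my_job.py, the port, and the core count are placeholders, and in our case the remote workers are really scheduled through Grid Engine rather than ssh:

    # Start a standalone master on this host (the same host as the driver).
    $SPARK_HOME/sbin/start-master.sh

    # Start a worker on each execution host, pointed at the master.
    # ssh stands in for whatever starts a process on a remote host;
    # $SPARK_HOME and $(hostname -f) expand locally, which is what we
    # want: the same shared-filesystem path and the master's FQDN.
    for host in $(cat hosts.txt); do
      ssh "$host" "$SPARK_HOME/sbin/start-slave.sh spark://$(hostname -f):7077"
    done

    # Submit the driver in client mode so it runs locally. The master then
    # tells each worker to start an executor, and the executors connect
    # back to the driver. The core cap keeps the job from grabbing every
    # core in the cluster.
    $SPARK_HOME/bin/spark-submit \
      --master "spark://$(hostname -f):7077" \
      --deploy-mode client \
      --total-executor-cores 500 \
      my_job.py

What I would like to replace this with is sketched at the bottom of this message.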
On Thu, May 19, 2016 at 3:49 PM Mich Talebzadeh <mich.talebza...@gmail.com> wrote:

> OK, so you are using some form of NFS-mounted file system shared among the
> nodes, and basically you start the processes through spark-submit.
>
> Standalone mode is a simple cluster manager included with Spark. It
> manages resources, so it is not clear to me what you are referring to as a
> "worker manager" here.
>
> This is my take on your model:
> The application will go and grab all the cores in the cluster.
> You only have one worker, which lives within the driver JVM process.
> The Driver node runs on the same host as the cluster manager.
> The Driver requests resources from the Cluster Manager to run tasks. In
> this case, is there only one Executor for the Driver? The Executor runs
> tasks for the Driver.
>
> HTH
>
> Dr Mich Talebzadeh
>
> On 19 May 2016 at 20:37, Mathieu Longtin <math...@closetwork.org> wrote:
>
>> No master and no node manager, just the processes that do the actual
>> work.
>>
>> We use the "standalone" version because we have a shared file system and
>> a way of allocating computing resources already (Univa Grid Engine). If
>> an executor were to die, we have other ways of restarting it; we don't
>> need the worker manager to deal with it.
>>
>> On Thu, May 19, 2016 at 3:16 PM Mich Talebzadeh <mich.talebza...@gmail.com> wrote:
>>
>>> Hi Mathieu,
>>>
>>> What does this approach provide that the norm lacks?
>>>
>>> So basically each node has its own master in this model?
>>>
>>> Are these supposed to be individual standalone servers?
>>>
>>> Thanks
>>>
>>> Dr Mich Talebzadeh
>>>
>>> On 19 May 2016 at 18:45, Mathieu Longtin <math...@closetwork.org> wrote:
>>>
>>>> First, a bit of context: we use Spark on a platform where each user
>>>> starts workers as needed. This has the advantage that all permission
>>>> management is handled by the OS, so users can only read files they
>>>> have permission to read.
>>>>
>>>> To do this, we have a utility that does the following:
>>>> - start a master
>>>> - start worker managers on a number of servers
>>>> - "submit" the Spark driver program
>>>> - the driver then talks to the master and tells it how many executors
>>>> it needs
>>>> - the master tells the worker nodes to start executors and talk to
>>>> the driver
>>>> - the executors are started
>>>>
>>>> From here on, the master doesn't do much, and neither do the process
>>>> managers on the worker nodes.
>>>>
>>>> What I would like to do is simplify this to:
>>>> - start the driver program
>>>> - start executors on a number of servers, telling them where to find
>>>> the driver
>>>> - the executors connect directly to the driver
>>>>
>>>> Is there a way I could do this without the master and worker managers?
>>>>
>>>> Thanks!
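To make the question concrete, here is roughly what the executor side of the simplified launch would look like. This is only a sketch: org.apache.spark.executor.CoarseGrainedExecutorBackend is an internal class, the flags below are my reading of the Spark 1.6 source and may change between releases, and the driver side still needs something that accepts these direct registrations, which is exactly the part the master normally brokers.

    # Run once per executor on each execution host. DRIVER_HOST and
    # DRIVER_PORT are the driver's spark.driver.host and spark.driver.port;
    # APP_ID must match the driver's application ID. The assembly jar path
    # is a placeholder for wherever Spark lives on the shared file system.
    java -cp "$SPARK_HOME/lib/spark-assembly-1.6.1-hadoop2.6.0.jar" \
      org.apache.spark.executor.CoarseGrainedExecutorBackend \
      --driver-url "spark://CoarseGrainedScheduler@${DRIVER_HOST}:${DRIVER_PORT}" \
      --executor-id 1 \
      --hostname "$(hostname -f)" \
      --cores 8 \
      --app-id "$APP_ID"

--
Mathieu Longtin
1-514-803-8977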