The driver (the process started by spark-submit) runs locally. The
executors run on any of thousands of servers. So far, I haven't tried more
than 500 executors.

Right now, I run a master on the same server as the driver.
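
For illustration, here's roughly how a job like this gets submitted (host
name, core count, and the script name my_job.py are illustrative):

    # driver runs inside this spark-submit process ("client" deploy mode);
    # executors are requested from the standalone master on the same host
    spark-submit \
      --master spark://localhost:7077 \
      --deploy-mode client \
      --total-executor-cores 500 \
      my_job.py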

On Thu, May 19, 2016 at 3:49 PM Mich Talebzadeh <mich.talebza...@gmail.com>
wrote:

> OK, so you are using some form of NFS-mounted file system shared among the
> nodes, and you basically start the processes through spark-submit.
>
> In standalone mode, Spark ships with a simple cluster manager of its own. It
> handles the resource management, so it is not clear to me what you are
> referring to as a worker manager here.
>
> This is my take on your model:
>  The application will go and grab all the cores in the cluster.
> You only have one worker, which lives within the driver JVM process.
> The Driver node runs on the same host that the cluster manager runs on.
> The Driver asks the Cluster Manager for resources to run tasks. In this
> case there is only one executor for the Driver? The Executor runs tasks for
> the Driver.
>
>
> HTH
>
> Dr Mich Talebzadeh
>
>
>
> LinkedIn:
> https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>
>
>
> http://talebzadehmich.wordpress.com
>
>
>
> On 19 May 2016 at 20:37, Mathieu Longtin <math...@closetwork.org> wrote:
>
>> No master and no node manager, just the processes that do actual work.
>>
>> We use the "standalone" version because we already have a shared file system
>> and a way of allocating computing resources (Univa Grid Engine). If an
>> executor were to die, we have other ways of restarting it, so we don't need
>> the worker manager to deal with it.
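>>
>> For illustration, a rough sketch of how we might launch executors under
>> the grid engine (the script name start_executor.sh is hypothetical, and
>> the exact qsub options depend on the site setup):
>>
>>   # hypothetical UGE array job: one task per executor slot;
>>   # -b y runs the command directly, -t 1-16 starts 16 tasks
>>   qsub -t 1-16 -b y -cwd ./start_executor.sh spark://driver-host:7077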
>>
>> On Thu, May 19, 2016 at 3:16 PM Mich Talebzadeh <
>> mich.talebza...@gmail.com> wrote:
>>
>>> Hi Mathieu
>>>
>>> What does this approach provide that the norm lacks?
>>>
>>> So basically each node has its own master in this model.
>>>
>>> Are these supposed to be individual standalone servers?
>>>
>>>
>>> Thanks
>>>
>>>
>>> Dr Mich Talebzadeh
>>>
>>>
>>>
>>> LinkedIn:
>>> https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>>>
>>>
>>>
>>> http://talebzadehmich.wordpress.com
>>>
>>>
>>>
>>> On 19 May 2016 at 18:45, Mathieu Longtin <math...@closetwork.org> wrote:
>>>
>>>> First a bit of context:
>>>> We use Spark on a platform where each user starts workers as needed.
>>>> This has the advantage that all permission management is handled by the OS,
>>>> so the users can only read files they have permission to.
>>>>
>>>> To do this, we have a utility that does the following (rough sketch below):
>>>> - start a master
>>>> - start worker managers on a number of servers
>>>> - "submit" the Spark driver program
>>>> - the driver then talks to the master, telling it how many executors it
>>>> needs
>>>> - the master tells the worker nodes to start executors and talk to the
>>>> driver
>>>> - the executors are started
>>>>
>>>> From here on, the master doesn't do much, and neither do the process
>>>> managers on the worker nodes.
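>>>>
>>>> For illustration, the utility amounts to something like this (the host
>>>> list, paths, and my_job.py are illustrative; start-master.sh and
>>>> start-slave.sh are the standard standalone scripts shipped with Spark):
>>>>
>>>>   # 1. start a master on the current host
>>>>   $SPARK_HOME/sbin/start-master.sh
>>>>   # 2. start a worker manager on each server in the list
>>>>   for h in $(cat hosts.txt); do
>>>>     ssh "$h" "$SPARK_HOME/sbin/start-slave.sh spark://$(hostname -f):7077"
>>>>   done
>>>>   # 3. submit the driver, which then asks the master for executors
>>>>   $SPARK_HOME/bin/spark-submit --master spark://$(hostname -f):7077 my_job.py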
>>>>
>>>> What I would like to do is simplify this to (rough sketch below):
>>>> - Start the driver program
>>>> - Start executors on a number of servers, telling them where to find
>>>> the driver
>>>> - The executors connect directly to the driver
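>>>>
>>>> Something like this, perhaps, assuming Spark's internal executor entry
>>>> point can be invoked directly (CoarseGrainedExecutorBackend is an
>>>> internal, version-dependent interface, so the exact arguments below are
>>>> a guess, not a supported API; host, port, and ids are illustrative):
>>>>
>>>>   # hypothetical: start an executor that registers straight with the
>>>>   # driver's scheduler endpoint, with no master or worker in between
>>>>   java -cp "$SPARK_HOME/jars/*" \
>>>>     org.apache.spark.executor.CoarseGrainedExecutorBackend \
>>>>     --driver-url spark://CoarseGrainedScheduler@driver-host:41222 \
>>>>     --executor-id 1 --hostname "$(hostname -f)" \
>>>>     --cores 4 --app-id app-0001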
>>>>
>>>> Is there a way I could do this without the master and worker managers?
>>>>
>>>> Thanks!
>>>>
>>>>
>
--
Mathieu Longtin
1-514-803-8977
