It sounds like you've answered your own question, right? --executor-memory means the memory per executor. If no single agent can offer 200GB of memory for an executor, then the driver will accept no offers.
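For example, on a cluster like yours (180GB of memory spread across several agents), an invocation like the sketch below would be accepted, because each executor fits inside a single agent's offer. This is only a rough sketch: the master URL, class name, and the 16g figure are placeholders, and 16g only works if some agent can actually offer that much.

    # --executor-memory is PER EXECUTOR and must fit within one
    # agent's offer; --total-executor-cores is the cap on cores
    # summed across ALL executors.
    spark-submit \
      --master mesos://zk://master.mesos:2181/mesos \
      --executor-memory 16g \
      --total-executor-cores 64 \
      --class com.example.MyJob \
      my-job.jar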
On Thu, Feb 2, 2017 at 1:01 PM, Ji Yan <ji...@drive.ai> wrote:

> Sorry, to clarify: I was using --executor-memory for memory and
> --total-executor-cores for CPU cores.
>
> On Thu, Feb 2, 2017 at 12:56 PM, Michael Gummelt <mgumm...@mesosphere.io> wrote:
>
>> What CLI args are you referring to? I'm aware of spark-submit's
>> arguments (--executor-memory, --total-executor-cores, and
>> --executor-cores).
>>
>> On Thu, Feb 2, 2017 at 12:41 PM, Ji Yan <ji...@drive.ai> wrote:
>>
>>> I have done an experiment on this today. It shows that only CPUs are
>>> tolerant of an insufficient cluster size when a job starts. My
>>> cluster has 180GB of memory and 64 cores. When I run spark-submit
>>> (on Mesos) with --cpu_cores set to 1000, the job starts up with 64
>>> cores. But when I set --memory to 200GB, the job fails to start
>>> with "Initial job has not accepted any resources; check your
>>> cluster UI to ensure that workers are registered and have
>>> sufficient resources".
>>>
>>> It is also confusing to me that --cpu_cores specifies the number of
>>> CPU cores across all executors, but --memory specifies the
>>> per-executor memory requirement.
>>>
>>> On Mon, Jan 30, 2017 at 11:34 AM, Michael Gummelt <mgumm...@mesosphere.io> wrote:
>>>
>>>> On Mon, Jan 30, 2017 at 9:47 AM, Ji Yan <ji...@drive.ai> wrote:
>>>>
>>>>> > Tasks begin scheduling as soon as the first executor comes up
>>>>>
>>>>> Thanks all for the clarification. Is this the default behavior of
>>>>> Spark on Mesos today? I think this is what we are looking for,
>>>>> because sometimes a job can take up lots of resources and later
>>>>> jobs cannot get all the resources they ask for. If a Spark job
>>>>> starts with only a subset of the resources it asks for, does it
>>>>> know to expand its resources later when more become available?
>>>>
>>>> Yes.
>>>>
>>>>> > Launch each executor with at least 1GB RAM, but if mesos offers
>>>>> > 2GB at some moment, then launch an executor with 2GB RAM
>>>>>
>>>>> This is less useful in our use case, but I am also quite
>>>>> interested in cases where it could be helpful. I think it would
>>>>> also improve overall resource utilization on the cluster: when
>>>>> another job with a hard resource requirement starts up, the extra
>>>>> resources given to the first job could be flexibly re-allocated
>>>>> to the second.
>>>>>
>>>>> On Sat, Jan 28, 2017 at 2:32 PM, Michael Gummelt <mgumm...@mesosphere.io> wrote:
>>>>>
>>>>>> We've talked about that, but it hasn't become a priority because
>>>>>> we haven't had a driving use case. If anyone has a good argument
>>>>>> for "variable" resource allocation like this, please let me know.
>>>>>>
>>>>>> On Sat, Jan 28, 2017 at 9:17 AM, Shuai Lin <linshuai2...@gmail.com> wrote:
>>>>>>
>>>>>>> > An alternative behavior is to launch the job with the best
>>>>>>> > resource offer Mesos is able to give
>>>>>>>
>>>>>>> Michael has just given an excellent explanation of dynamic
>>>>>>> allocation support in Mesos. But IIUC, what you want to achieve
>>>>>>> is something like this (using RAM as an example): "Launch each
>>>>>>> executor with at least 1GB RAM, but if Mesos offers 2GB at some
>>>>>>> moment, then launch an executor with 2GB RAM."
>>>>>>>
>>>>>>> I wonder what the benefit of that would be? To reduce "resource
>>>>>>> fragmentation"?
>>>>>>>
>>>>>>> Anyway, that is not supported at the moment.
>>>>>>> In all the cluster managers Spark supports (Mesos, YARN,
>>>>>>> standalone, and the upcoming Spark on Kubernetes), you have to
>>>>>>> specify the cores and memory of each executor.
>>>>>>>
>>>>>>> It may never be supported, because only Mesos has the concept
>>>>>>> of offers, owing to its two-level scheduling model.
>>>>>>>
>>>>>>> On Sat, Jan 28, 2017 at 1:35 AM, Ji Yan <ji...@drive.ai> wrote:
>>>>>>>
>>>>>>>> Dear Spark Users,
>>>>>>>>
>>>>>>>> Is there currently a way to dynamically allocate resources to
>>>>>>>> Spark on Mesos? Within Spark we can specify the CPU cores and
>>>>>>>> memory before running a job. As I understand it, the Spark job
>>>>>>>> will not run if the CPU/memory requirement is not met. This
>>>>>>>> may lead to a decrease in the overall utilization of the
>>>>>>>> cluster. An alternative behavior would be to launch the job
>>>>>>>> with the best resource offer Mesos is able to give. Is this
>>>>>>>> possible with the current implementation?
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Ji

--
Michael Gummelt
Software Engineer
Mesosphere
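For anyone finding this thread later: what is supported, per the discussion above, is varying the number of fixed-size executors via dynamic allocation, not the size of each executor. A rough sketch of the relevant flags follows; the master URL, class name, and sizes are placeholders, and on Mesos this also requires the external shuffle service to be running on every agent (e.g. started via sbin/start-mesos-shuffle-service.sh from the Spark distribution).

    # Each executor is a fixed 4GB, but the executor COUNT can grow
    # and shrink between 1 and 16 as load and Mesos offers allow.
    spark-submit \
      --master mesos://zk://master.mesos:2181/mesos \
      --executor-memory 4g \
      --conf spark.dynamicAllocation.enabled=true \
      --conf spark.shuffle.service.enabled=true \
      --conf spark.dynamicAllocation.minExecutors=1 \
      --conf spark.dynamicAllocation.maxExecutors=16 \
      --class com.example.MyJob \
      my-job.jar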