What CLI args are you referring to? I'm aware of spark-submit's arguments (--executor-memory, --total-executor-cores, and --executor-cores).
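For concreteness, a minimal sketch of an invocation using those flags against a Mesos master (the master URL, class, jar, and resource values below are placeholders for illustration, not taken from this thread):

    # Hypothetical example: 4 GB per executor, 16 cores total, 4 per executor
    spark-submit \
      --master mesos://mesos-master.example.com:5050 \
      --executor-memory 4G \
      --total-executor-cores 16 \
      --executor-cores 4 \
      --class com.example.MyApp \
      my-app.jar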
On Thu, Feb 2, 2017 at 12:41 PM, Ji Yan <ji...@drive.ai> wrote:

> I have done an experiment on this today. It shows that only CPUs are
> tolerant of an insufficient cluster size when a job starts. On my cluster,
> I have 180 GB of memory and 64 cores. When I run spark-submit (on Mesos)
> with --cpu_cores set to 1000, the job starts up with 64 cores, but when I
> set --memory to 200 GB, the job fails to start with "Initial job has not
> accepted any resources; check your cluster UI to ensure that workers are
> registered and have sufficient resources".
>
> Also, it is confusing to me that --cpu_cores specifies the number of CPU
> cores across all executors, while --memory specifies the per-executor
> memory requirement.
>
> On Mon, Jan 30, 2017 at 11:34 AM, Michael Gummelt <mgumm...@mesosphere.io>
> wrote:
>
>> On Mon, Jan 30, 2017 at 9:47 AM, Ji Yan <ji...@drive.ai> wrote:
>>
>>>> Tasks begin scheduling as soon as the first executor comes up
>>>
>>> Thanks all for the clarification. Is this the default behavior of Spark
>>> on Mesos today? I think this is what we are looking for, because
>>> sometimes a job can take up lots of resources, and later jobs cannot get
>>> all the resources they ask for. If a Spark job starts with only a subset
>>> of the resources it asks for, does it know to expand its resources later
>>> when more resources become available?
>>
>> Yes.
>>
>>>> Launch each executor with at least 1GB RAM, but if mesos offers 2GB at
>>>> some moment, then launch an executor with 2GB RAM
>>>
>>> This is less useful in our use case, but I am also quite interested in
>>> cases in which it could be helpful. I think it would also help with
>>> overall resource utilization on the cluster: when another job with a
>>> hard resource requirement starts up, the extra resources held by the
>>> first job could be flexibly re-allocated to the second job.
>>>
>>> On Sat, Jan 28, 2017 at 2:32 PM, Michael Gummelt
>>> <mgumm...@mesosphere.io> wrote:
>>>
>>>> We've talked about that, but it hasn't become a priority because we
>>>> haven't had a driving use case. If anyone has a good argument for
>>>> "variable" resource allocation like this, please let me know.
>>>>
>>>> On Sat, Jan 28, 2017 at 9:17 AM, Shuai Lin <linshuai2...@gmail.com>
>>>> wrote:
>>>>
>>>>>> An alternative behavior is to launch the job with the best resource
>>>>>> offer Mesos is able to give
>>>>>
>>>>> Michael has just given an excellent explanation of dynamic allocation
>>>>> support in Mesos. But IIUC, what you want to achieve is something like
>>>>> this (using RAM as an example): "Launch each executor with at least
>>>>> 1GB RAM, but if Mesos offers 2GB at some moment, then launch an
>>>>> executor with 2GB RAM".
>>>>>
>>>>> I wonder what the benefit of that is? To reduce "resource
>>>>> fragmentation"?
>>>>>
>>>>> Anyway, that is not supported at the moment. In all of Spark's
>>>>> supported cluster managers (Mesos, YARN, standalone, and the upcoming
>>>>> Spark on Kubernetes), you have to specify the cores and memory of each
>>>>> executor.
>>>>>
>>>>> It may never be supported, because only Mesos has the concept of
>>>>> offers, owing to its two-level scheduling model.
>>>>>
>>>>> On Sat, Jan 28, 2017 at 1:35 AM, Ji Yan <ji...@drive.ai> wrote:
>>>>>
>>>>>> Dear Spark Users,
>>>>>>
>>>>>> Is there currently a way to dynamically allocate resources to Spark
>>>>>> on Mesos? Within Spark, we can specify the CPU cores and memory
>>>>>> before running a job. The way I understand it, the Spark job will not
>>>>>> run if the CPU/memory requirement is not met. This may lead to a
>>>>>> decrease in the overall utilization of the cluster. An alternative
>>>>>> behavior is to launch the job with the best resource offer Mesos is
>>>>>> able to give. Is this possible with the current implementation?
>>>>>>
>>>>>> Thanks
>>>>>> Ji

--
Michael Gummelt
Software Engineer
Mesosphere
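P.S. On the total-vs-per-executor confusion above: the asymmetry also exists in the supported settings, where spark.cores.max caps the cores requested across all executors while spark.executor.memory is allocated to each executor. A hypothetical sketch (master URL, class, jar, and values are again placeholders):

    # Hypothetical example: at most 64 cores in total, 4 GB per executor
    spark-submit \
      --master mesos://mesos-master.example.com:5050 \
      --conf spark.cores.max=64 \
      --conf spark.executor.memory=4g \
      --class com.example.MyApp \
      my-app.jar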