> Instead of requesting `[driver,executor].memory`, we should just request `[driver,executor].memory + [driver,executor].memoryOverhead`. I think this case is a bit clearer than the CPU case, so I went ahead and filed an issue <https://issues.apache.org/jira/browse/SPARK-23825> with more details and made a PR <https://github.com/apache/spark/pull/20943>.
I think this suggestion makes sense.

> One way to solve this could be to request more than 1 core from Kubernetes per task. The exact amount we should request is unclear to me (it largely depends on how many threads actually get spawned for a task).

I wonder if this is being addressed by PR #20553 <https://github.com/apache/spark/pull/20553> written by Yinan. Yinan?

Thanks,
Kimoon

On Thu, Mar 29, 2018 at 5:14 PM, David Vogelbacher <dvogelbac...@palantir.com> wrote:

> Hi,
>
> At the moment driver and executor pods are created using the following requests and limits:
>
>             *CPU*                                              *Memory*
> *Request*   [driver,executor].cores                            [driver,executor].memory
> *Limit*     Unlimited (but can be specified using              [driver,executor].memory +
>             spark.kubernetes.[driver,executor].limit.cores)    [driver,executor].memoryOverhead
>
> Specifying the requests like this leads to problems if the pods only get the requested amount of resources and none of the optional (limit) resources, as can happen in a fully utilized cluster.
>
> *For memory:*
>
> Let's say we have a node with 100 GiB of memory and 5 pods, each with 20 GiB of memory and 5 GiB of memoryOverhead.
>
> At the beginning all 5 pods use 20 GiB of memory and all is well. If a pod then starts using its overhead memory, it will get killed, as there is no more memory available, even though we told Spark that it can use 25 GiB of memory.
>
> Instead of requesting `[driver,executor].memory`, we should just request `[driver,executor].memory + [driver,executor].memoryOverhead`.
>
> I think this case is a bit clearer than the CPU case, so I went ahead and filed an issue <https://issues.apache.org/jira/browse/SPARK-23825> with more details and made a PR <https://github.com/apache/spark/pull/20943>.
>
> *For CPU:*
>
> As it turns out, there can be performance problems if we only have `executor.cores` available (which means we have one core per task). This was raised here <https://github.com/apache-spark-on-k8s/spark/issues/352> and is the reason the CPU limit was set to unlimited.
>
> This issue stems from the fact that in general there will be more than one thread per task, resulting in performance impacts if only one core is available.
>
> However, I am not sure that simply setting the limit to unlimited is the best solution, because it means that even if the Kubernetes cluster can perfectly satisfy the resource requests, performance might still be very bad.
>
> I think we should guarantee that an executor is able to do its work well (without performance issues or getting killed, as could happen in the memory case) with the resources it is guaranteed by Kubernetes.
>
> One way to solve this could be to request more than 1 core from Kubernetes per task. The exact amount we should request is unclear to me (it largely depends on how many threads actually get spawned for a task).
>
> We would need to find a way to determine this automatically, or at least come up with a better default value than 1 core per task.
>
> Does somebody have ideas or thoughts on how to solve this best?
>
> Best,
> David
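To make the memory arithmetic in the quoted example concrete, here is a minimal sketch. The node size, pod count, and memory settings are the hypothetical values from the example above, not measurements from a real cluster.

```scala
object MemoryRequestExample {
  // Hypothetical values from the example above.
  val nodeCapacityGiB = 100
  val pods            = 5
  val executorMemGiB  = 20 // spark.executor.memory
  val overheadGiB     = 5  // spark.executor.memoryOverhead

  def main(args: Array[String]): Unit = {
    val totalRequested = pods * executorMemGiB                  // 100 GiB: every pod fits
    val totalLimit     = pods * (executorMemGiB + overheadGiB)  // 125 GiB: exceeds the node
    println(s"requested = $totalRequested GiB, limit = $totalLimit GiB, node = $nodeCapacityGiB GiB")
    // The moment any pod dips into its overhead, total usage can exceed the
    // node's capacity and the kubelet may kill a pod, even though each pod is
    // still within the 25 GiB that Spark believes it can use.
  }
}
```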
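For the memory fix itself, a minimal sketch of what requesting memory + memoryOverhead could look like with the fabric8 Kubernetes client used by the Spark Kubernetes backend. The values are illustrative and this is not the actual code in the linked PR.

```scala
import io.fabric8.kubernetes.api.model.{Quantity, ResourceRequirementsBuilder}

object ExecutorMemoryRequest {
  // Illustrative values; in Spark these come from spark.executor.memory
  // and spark.executor.memoryOverhead.
  val executorMemoryMiB  = 20 * 1024
  val overheadMiB        = 5 * 1024
  val memoryWithOverhead = new Quantity(s"${executorMemoryMiB + overheadMiB}Mi")

  // Setting both the request and the limit to memory + overhead means Kubernetes
  // only schedules the pod onto a node that can hold the full amount, so dipping
  // into the overhead can no longer get the pod killed on a fully utilized node.
  val executorResources = new ResourceRequirementsBuilder()
    .addToRequests("memory", memoryWithOverhead)
    .addToLimits("memory", memoryWithOverhead)
    .build()
}
```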
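On the CPU side, if we did decide to request more than one core per task, the change could be as small as scaling the request by a configurable factor. The factor name and default below are purely hypothetical; the email deliberately leaves the right value open.

```scala
import io.fabric8.kubernetes.api.model.{Quantity, ResourceRequirementsBuilder}

object ExecutorCpuRequest {
  val executorCores = 4 // spark.executor.cores (illustrative)
  // Hypothetical multiplier for how many physical cores to request per task
  // slot; 1.0 reproduces today's behaviour of one core requested per task.
  val coresPerTaskFactor = 1.5

  // Express the scaled request in millicores, e.g. 4 * 1.5 -> "6000m".
  val cpuRequest = new Quantity(s"${(executorCores * coresPerTaskFactor * 1000).toInt}m")

  // Request the scaled amount and leave the limit unset, mirroring the current
  // "unlimited" default.
  val executorResources = new ResourceRequirementsBuilder()
    .addToRequests("cpu", cpuRequest)
    .build()
}
```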