There is only one executor on each worker. I see one pyspark.daemon, but when the streaming job starts a batch I see that it spawns 4 other pyspark.daemon processes. After the batch completes, the 4 pyspark.daemon processes die and only one is left.
I think this behavior was introduced by SPARK-2764 (https://issues.apache.org/jira/browse/SPARK-2764), where pyspark.daemon was revamped.

On Wed, Jun 15, 2016 at 11:34 PM, Jeff Zhang <zjf...@gmail.com> wrote:

> >>> I am seeing this issue too with pyspark (using Spark 1.6.1). I have set spark.executor.cores to 1, but I see that whenever a streaming batch starts processing data, the python -m pyspark.daemon processes increase gradually to about 5 (raising CPU usage on the box about 4-5x; each pyspark.daemon takes around 100% CPU).
>
> >>> After the processing is done, 4 pyspark.daemon processes go away and we are left with one until the next batch run. Also, sometimes the CPU usage for the executor process spikes to about 800% even though spark.executor.cores is set to 1.
>
> As I understand it, each Spark task consumes at most one Python process. In this case (spark.executor.cores=1), there should be at most one Python process per executor. Since there are 4 Python processes here, I suspect there are at least 4 executors on this machine. Could you check that?
>
> On Thu, Jun 16, 2016 at 6:50 AM, Sudhir Babu Pothineni <sbpothin...@gmail.com> wrote:
>
>> Hi Ken, it may also be related to Grid Engine job scheduling. If it is 16 cores (virtual cores?), Grid Engine allocates 16 slots; if you use 'max' scheduling, it will send 16 processes sequentially to the same machine, and on top of that each Spark job has its own executors. Limiting the number of jobs scheduled to the machine to the number of physical cores of a single CPU will solve the problem if it is related to GE. If you are sure it's related to Spark, please ignore.
>>
>> -Sudhir
>>
>> Sent from my iPhone
>>
>> On Jun 15, 2016, at 8:53 AM, Gene Pang <gene.p...@gmail.com> wrote:
>>
>> As Sven mentioned, you can use Alluxio to store RDDs in off-heap memory, and you can then share that RDD across different jobs. If you would like to run Spark on Alluxio, this documentation can help: http://www.alluxio.org/documentation/master/en/Running-Spark-on-Alluxio.html
>>
>> Thanks,
>> Gene
>>
>> On Tue, Jun 14, 2016 at 12:44 AM, agateaaa <agate...@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> I am seeing this issue too with pyspark (using Spark 1.6.1). I have set spark.executor.cores to 1, but I see that whenever a streaming batch starts processing data, the python -m pyspark.daemon processes increase gradually to about 5 (raising CPU usage on the box about 4-5x; each pyspark.daemon takes around 100% CPU).
>>>
>>> After the processing is done, 4 pyspark.daemon processes go away and we are left with one until the next batch run. Also, sometimes the CPU usage for the executor process spikes to about 800% even though spark.executor.cores is set to 1.
>>>
>>> e.g. top output:
>>>
>>>   PID USER  PR NI    VIRT    RES   SHR S  %CPU %MEM   TIME+ COMMAND
>>> 19634 spark 20  0 8871420 1.790g 32056 S 814.1  2.9 0:39.33 /usr/lib/j+  <-- EXECUTOR
>>> 13897 spark 20  0   46576  17916  6720 S 100.0  0.0 0:00.17 python -m +  <-- pyspark.daemon
>>> 13991 spark 20  0   46524  15572  4124 S  98.0  0.0 0:08.18 python -m +  <-- pyspark.daemon
>>> 14488 spark 20  0   46524  15636  4188 S  98.0  0.0 0:07.25 python -m +  <-- pyspark.daemon
>>> 14514 spark 20  0   46524  15636  4188 S  94.0  0.0 0:06.72 python -m +  <-- pyspark.daemon
>>> 14526 spark 20  0   48200  17172  4092 S   0.0  0.0 0:00.38 python -m +  <-- pyspark.daemon
>>>
>>> Is there any way to control the number of pyspark.daemon processes that get spawned?
>>>
>>> Thank you,
>>> Agateaaa
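For reference, a minimal sketch of where the settings discussed in this thread are applied (illustrative values and app name only, not taken from the messages above; a streaming job in standalone mode is assumed):

    # Sketch only: the knobs discussed in this thread, set on a SparkConf.
    from pyspark import SparkConf, SparkContext
    from pyspark.streaming import StreamingContext

    conf = (SparkConf()
            .setAppName("daemon-count-test")              # hypothetical app name
            .set("spark.executor.cores", "1")             # worker threads per executor; roughly one pyspark.daemon each
            .set("spark.python.worker.memory", "512m"))   # per-daemon aggregation limit (soft; spills to disk above it)

    sc = SparkContext(conf=conf)
    ssc = StreamingContext(sc, batchDuration=10)          # 10-second batches, purely illustrative

As the replies below note, the daemon count can still exceed spark.executor.cores by a low multiple, so this bounds concurrency rather than hard-capping the number of Python processes.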
>>> On Sun, Mar 27, 2016 at 1:08 AM, Sven Krasser <kras...@gmail.com> wrote:
>>>
>>>> Hey Ken,
>>>>
>>>> 1. You're correct, cached RDDs live on the JVM heap. (There's an off-heap storage option using Alluxio, formerly Tachyon, with which I have no experience, however.)
>>>>
>>>> 2. The worker memory setting is not a hard maximum, unfortunately. What happens is that during aggregation the Python daemon will check its process size. If the size is larger than this setting, it will start spilling to disk. I've seen many occasions where my daemons grew larger. Also, you're relying on Python's memory management to free up space again once objects are evicted. In practice, leave this setting reasonably small but make sure there's enough free memory on the machine so you don't run into OOM conditions. If the lower memory setting causes strain for your users, make sure they increase the parallelism of their jobs (smaller partitions meaning less data is processed at a time).
>>>>
>>>> 3. I believe that is the behavior you can expect when setting spark.executor.cores. I've not experimented much with it and haven't looked at that part of the code, but what you describe also reflects my understanding. Please share your findings here; I'm sure they will be very helpful to others, too.
>>>>
>>>> One more suggestion for your users is to move to the PySpark DataFrame API. Much of the processing will then happen in the JVM, and you will bump into fewer Python resource contention issues.
>>>>
>>>> Best,
>>>> -Sven
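To illustrate that last suggestion with a hypothetical toy example (column names and data are made up, not from this thread): the RDD version ships every record through the pyspark.daemon workers, while the DataFrame version is planned and executed largely inside the JVM.

    # Toy comparison, Spark 1.6-era API. Sketch only.
    from pyspark import SparkContext
    from pyspark.sql import SQLContext

    sc = SparkContext(appName="rdd-vs-dataframe-sketch")
    sqlContext = SQLContext(sc)

    rdd = sc.parallelize([("a", 1), ("b", 2), ("a", 3)])

    # RDD route: the lambda runs inside pyspark.daemon worker processes.
    rdd_totals = rdd.reduceByKey(lambda x, y: x + y).collect()

    # DataFrame route: the groupBy/sum is executed by the JVM, so far less
    # data (and work) crosses the JVM/Python boundary.
    df = sqlContext.createDataFrame(rdd, ["key", "value"])
    df_totals = df.groupBy("key").sum("value").collect()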
>>>> On Sat, Mar 26, 2016 at 1:38 PM, Carlile, Ken <carli...@janelia.hhmi.org> wrote:
>>>>
>>>>> This is extremely helpful!
>>>>>
>>>>> I'll have to talk to my users about how the python memory limit should be adjusted and what their expectations are. I'm fairly certain we bumped it up in the dark past when jobs were failing because of insufficient memory for the python processes.
>>>>>
>>>>> So just to make sure I'm understanding correctly:
>>>>>
>>>>> - JVM memory (set by SPARK_EXECUTOR_MEMORY and/or SPARK_WORKER_MEMORY?) is where the RDDs are stored. Currently both of those values are set to 90GB.
>>>>> - spark.python.worker.memory controls how much RAM each python task can take, maximum (roughly speaking). Currently set to 4GB.
>>>>> - spark.task.cpus controls how many java worker threads will exist and thus indirectly how many pyspark daemon processes will exist.
>>>>>
>>>>> I'm also looking into fixing my cron jobs so they don't stack up, by implementing flock in the jobs and changing how teardowns of the spark cluster work as far as failed workers.
>>>>>
>>>>> Thanks again,
>>>>> —Ken
>>>>>
>>>>> On Mar 26, 2016, at 4:08 PM, Sven Krasser <kras...@gmail.com> wrote:
>>>>>
>>>>> My understanding is that the spark.executor.cores setting controls the number of worker threads in the executor in the JVM. Each worker thread then communicates with a pyspark daemon process (these are not threads) to stream data into Python. There should be one daemon process per worker thread (but as I mentioned I sometimes see a low multiple).
>>>>>
>>>>> Your 4GB limit for Python is fairly high; that means even for 12 workers you're looking at a max of 48GB (and it frequently goes beyond that). You will be better off using a lower number there and instead increasing the parallelism of your job (i.e. dividing the job into more and smaller partitions).
>>>>>
>>>>> On Sat, Mar 26, 2016 at 7:10 AM, Carlile, Ken <carli...@janelia.hhmi.org> wrote:
>>>>>
>>>>>> Thanks, Sven!
>>>>>>
>>>>>> I know that I've messed up the memory allocation, but I'm trying not to think too much about that (because I've advertised it to my users as "90GB for Spark works!" and that's how it displays in the Spark UI, totally ignoring the python processes). So I'll need to deal with that at some point… especially since I've set the max python memory usage to 4GB to work around other issues!
>>>>>>
>>>>>> The load issue comes in because we have a lot of background cron jobs (mostly to clean up after spark…), and those will stack up behind the high load and keep stacking until the whole thing comes crashing down. I will look into how to avoid this stacking, as I think one of my predecessors had a way, but that's why the high load nukes the nodes. I don't have spark.executor.cores set, but will setting that to, say, 12 limit the pyspark threads, or will it just limit the jvm threads?
>>>>>>
>>>>>> Thanks!
>>>>>> Ken
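On the cron-job stacking mentioned above: one common guard is a non-blocking flock, so a new cleanup run exits immediately instead of queueing behind an earlier run on an overloaded node. A sketch only; the lock path and cleanup body are placeholders, and the actual cleanup jobs here may well be shell scripts rather than Python.

    # Sketch: skip this run if a previous cleanup run still holds the lock.
    import fcntl
    import sys

    LOCK_PATH = "/var/lock/spark-cleanup.lock"  # hypothetical path

    def cleanup():
        pass  # the real Spark teardown/cleanup work would go here

    if __name__ == "__main__":
        lock_file = open(LOCK_PATH, "w")
        try:
            # LOCK_NB: fail immediately rather than blocking behind the
            # previous run and piling up load.
            fcntl.flock(lock_file, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except (IOError, OSError):
            sys.exit(0)  # previous run still active; do nothing this time
        try:
            cleanup()
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)
            lock_file.close()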
>>>>>> On Mar 25, 2016, at 9:10 PM, Sven Krasser <kras...@gmail.com> wrote:
>>>>>>
>>>>>> Hey Ken,
>>>>>>
>>>>>> I also frequently see more pyspark daemons than the configured concurrency; often it's a low multiple. (There was an issue pre-1.3.0 that caused this to be quite a bit higher, so make sure you at least have a recent version; see SPARK-5395.)
>>>>>>
>>>>>> Each pyspark daemon tries to stay below the configured memory limit during aggregation (which is separate from the JVM heap, as you note). Since the number of daemons can be high and the memory limit is per daemon (each daemon is actually a process, not a thread, and therefore has its own memory that it tracks against the configured per-worker limit), I found memory depletion to be the main source of pyspark problems on larger data sets. Also, as Sea already noted, the memory limit is not firm and individual daemons can grow larger.
>>>>>>
>>>>>> With that said, a run queue of 25 on a 16 core machine does not sound great but also not awful enough to knock it offline. I suspect something else may be going on. If you want to limit the amount of work running concurrently, try reducing spark.executor.cores (under normal circumstances this would leave parts of your resources underutilized).
>>>>>>
>>>>>> Hope this helps!
>>>>>> -Sven
>>>>>>
>>>>>> On Fri, Mar 25, 2016 at 10:41 AM, Carlile, Ken <carli...@janelia.hhmi.org> wrote:
>>>>>>
>>>>>>> Further data on this.
>>>>>>>
>>>>>>> I'm watching another job right now where there are 16 pyspark.daemon threads, all of which are trying to get a full core (remember, this is a 16 core machine). Unfortunately, the java process actually running the spark worker is trying to take several cores of its own, driving the load up. I'm hoping someone has seen something like this.
>>>>>>>
>>>>>>> —Ken
>>>>>>>
>>>>>>> On Mar 21, 2016, at 3:07 PM, Carlile, Ken <carli...@janelia.hhmi.org> wrote:
>>>>>>>
>>>>>>> No further input on this? I discovered today that the pyspark.daemon thread count was actually 48, which makes a little more sense (at least it's a multiple of 16), and it seems to be happening at the reduce and collect portions of the code.
>>>>>>>
>>>>>>> —Ken
>>>>>>>
>>>>>>> On Mar 17, 2016, at 10:51 AM, Carlile, Ken <carli...@janelia.hhmi.org> wrote:
>>>>>>>
>>>>>>> Thanks! I found that part just after I sent the email… whoops. I'm guessing that's not an issue for my users, since it's been set that way for a couple of years now.
>>>>>>>
>>>>>>> The thread count is definitely an issue, though, since if enough nodes go down, they can't schedule their spark clusters.
>>>>>>>
>>>>>>> —Ken
>>>>>>>
>>>>>>> On Mar 17, 2016, at 10:50 AM, Ted Yu <yuzhih...@gmail.com> wrote:
>>>>>>>
>>>>>>> I took a look at docs/configuration.md. Though I didn't find an answer for your first question, I think the following pertains to your second question:
>>>>>>>
>>>>>>> <tr>
>>>>>>>   <td><code>spark.python.worker.memory</code></td>
>>>>>>>   <td>512m</td>
>>>>>>>   <td>
>>>>>>>     Amount of memory to use per python worker process during aggregation, in the same
>>>>>>>     format as JVM memory strings (e.g. <code>512m</code>, <code>2g</code>). If the memory
>>>>>>>     used during aggregation goes above this amount, it will spill the data into disks.
>>>>>>>   </td>
>>>>>>> </tr>
>>>>>>>
>>>>>>> On Thu, Mar 17, 2016 at 7:43 AM, Carlile, Ken <carli...@janelia.hhmi.org> wrote:
>>>>>>>
>>>>>>>> Hello,
>>>>>>>>
>>>>>>>> We have an HPC cluster that we run Spark jobs on using standalone mode, plus a number of scripts I've built up to dynamically schedule and start spark clusters within the Grid Engine framework. Nodes in the cluster have 16 cores and 128GB of RAM.
>>>>>>>>
>>>>>>>> My users use pyspark heavily. We've been having a number of problems with nodes going offline with extraordinarily high load. I was able to look at one of those nodes today before it went truly sideways, and I discovered that the user was running 50 pyspark.daemon threads (remember, this is a 16 core box), and the load was somewhere around 25 or so, with all CPUs maxed out at 100%.
>>>>>>>>
>>>>>>>> So while the spark worker is aware it's only got 16 cores and behaves accordingly, pyspark seems to be happy to overrun everything like crazy. Is there a global parameter I can use to limit pyspark threads to a sane number, say 15 or 16? It would also be interesting to set a memory limit, which leads to another question.
>>>>>>>>
>>>>>>>> How is memory managed when pyspark is used? I have the spark worker memory set to 90GB, and there is 8GB of system overhead (GPFS caching), so if pyspark operates outside of the JVM memory pool, that leaves it at most 30GB to play with, assuming there is no overhead outside the JVM's 90GB heap (ha ha).
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Ken Carlile
>>>>>>>> Sr. Unix Engineer
>>>>>>>> HHMI/Janelia Research Campus
>>>>>>>> 571-209-4363
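Pulling the numbers in this thread together, a rough back-of-the-envelope sketch of the Python-side memory budget on one of these nodes (the 4GB per-worker limit comes from the later messages; the daemon count of roughly one per core is an assumption, and the limit is soft, so real usage can be higher):

    # Sketch: memory budget for one 16-core / 128GB node, using figures from this thread.
    node_ram_gb        = 128
    jvm_heap_gb        = 90   # SPARK_WORKER_MEMORY / SPARK_EXECUTOR_MEMORY
    system_overhead_gb = 8    # GPFS caching, etc.
    python_worker_gb   = 4    # spark.python.worker.memory
    concurrent_daemons = 16   # assume roughly one pyspark.daemon per core

    headroom_gb = node_ram_gb - jvm_heap_gb - system_overhead_gb    # 30
    python_worst_case_gb = concurrent_daemons * python_worker_gb    # 64

    print("RAM left outside the JVM: %d GB" % headroom_gb)
    print("Python aggregation worst case before spilling: %d GB" % python_worst_case_gb)
    # 64GB > 30GB: the node can be overcommitted well before any daemon starts
    # spilling, which matches the advice to keep spark.python.worker.memory small
    # and raise parallelism instead.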
>
> --
> Best Regards
>
> Jeff Zhang