Thanks Sean. When the executors had only 2 GB, they were restarted every 2-3 hours with OOMKilled errors.
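A side question on capturing the failure state: for the 2 GB configuration, would it have been reasonable to ask the JVM itself to write a heap dump at the moment of the OOM? A sketch of what I mean, assuming executor JVM options can be passed through spark.executor.extraJavaOptions and that /tmp/dumps is only a placeholder for a path actually writable inside the pod:

```shell
# Sketch only: have each executor JVM dump its heap when it hits
# OutOfMemoryError. The dump path must exist and be writable inside
# the executor container; /tmp/dumps below is a placeholder.
spark-submit \
  --conf "spark.executor.extraJavaOptions=-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp/dumps" \
  ... # rest of our usual submit arguments
```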
When I increased executor memory to 12 GB and the number of cores to 12
(2 executors, 6 cores per executor), the OOMKilled errors and restarts
stopped, but memory usage peaks at 14 GB after a few hours and stays
there. Does this indicate it was a memory allocation issue and that we
should use this higher memory configuration?

-EXEC_CORES=2
-TOTAL_CORES=4
-EXEC_MEMORY=2G
+EXEC_CORES=6
+TOTAL_CORES=12
+EXEC_MEMORY=12G

Could you provide some exact steps and documentation on how to collect
the heap dump and install all the right packages/environment? I tried
the steps below in a bash shell on the executor; the jmap command does
not work.

Mem: 145373940K used, 118435384K free, 325196K shrd, 4452968K buff, 20344056K cached
CPU:  7% usr  2% sys  0% nic 89% idle  0% io  0% irq  0% sirq
Load average: 9.57 10.67 12.05 24/21360 7741
  PID  PPID USER     STAT   VSZ %VSZ CPU %CPU COMMAND
   16     1 nobody   S    3809m   0%  26   0% /usr/bin/java -Dlog4j.configurationFile=log4j.properties -Dspark.driver.port=34137 -Xms2G -Xmx2G -cp /opt/hadoop/etc/hadoop::/opt/sp
 7705     0 nobody   S     2620   0%  41   0% bash
 7731  7705 nobody   R     1596   0%  22   0% top
    1     0 nobody   S      804   0%  74   0% /sbin/tini -s -- /usr/bin/java -Dlog4j.configurationFile=log4j.properties -Dspark.driver.port=34137 -Xms2G -Xmx2G -cp /opt/hadoop/et

bash-5.1$ jmap -dump:live,format=b,file=application_heap_dump.bin 16
bash: jmap: command not found
bash-5.1$ jmap
bash: jmap: command not found

Thanks
Kiran

On Sat, Sep 25, 2021 at 5:28 AM Sean Owen <sro...@gmail.com> wrote:

> It could be 'normal' - executors won't GC unless they need to.
> It could be state in your application, if you're storing state.
> You'd want to dump the heap to take a first look
>
> On Sat, Sep 25, 2021 at 7:24 AM Kiran Biswal <biswalki...@gmail.com>
> wrote:
>
>> Hello Experts
>>
>> I have a spark streaming application (DStream).
>> I use Spark 3.0.2 and Scala 2.12. This application reads from about
>> 20 different Kafka topics and produces a single stream; I filter the
>> RDD per topic and store the results in Cassandra.
>>
>> I see a steady increase in executor memory over the hours until it
>> reaches the maximum allocated memory, and then it stays at that
>> value. No matter how much memory I allocate to the executor, this
>> pattern is seen. I suspect a memory leak.
>>
>> Any guidance you may be able to provide on how to debug this will be
>> highly appreciated.
>>
>> Thanks in advance
>> Regards
>> Kiran
>>
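P.S. In case it helps others following this thread, here is what I plan to try next to get a dump despite jmap being missing from the container image. This is a sketch only: it assumes a Kubernetes deployment, that the executor image is Alpine-based with apk available (the package name may differ on other base images), and <executor-pod> is a placeholder for the actual pod name. PID 16 is the executor JVM from the top output above.

```shell
# Option 1: install the full JDK tools inside the running executor
# container, then run jmap against the executor JVM (PID 16).
kubectl exec -it <executor-pod> -- apk add --no-cache openjdk11
kubectl exec -it <executor-pod> -- jmap -dump:live,format=b,file=/tmp/heap.bin 16

# Option 2: copy the standalone jattach utility into the pod; it uses
# the same dynamic-attach protocol as jmap but is a single binary,
# so nothing needs to be installed in the image.
kubectl cp ./jattach <executor-pod>:/tmp/jattach
kubectl exec -it <executor-pod> -- /tmp/jattach 16 dumpheap /tmp/heap.bin

# Either way, pull the dump out for analysis (e.g. with Eclipse MAT):
kubectl cp <executor-pod>:/tmp/heap.bin ./heap.bin
```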