SunShun created ZEPPELIN-5634:
---------------------------------

             Summary: Memory is not released after restarting the Spark 
interpreter from the notebook page
                 Key: ZEPPELIN-5634
                 URL: https://issues.apache.org/jira/browse/ZEPPELIN-5634
             Project: Zeppelin
          Issue Type: Improvement
    Affects Versions: 0.10.0, 0.9.0
         Environment: Hadoop 2.7.2

Spark 2.4.2 / 2.4.8 (Scala 2.11)

Zeppelin 0.10 / master branch
            Reporter: SunShun


Each time a paragraph is run in a new Spark notebook, it consumes some memory 
inside the Spark interpreter process (JVM).

However, when the interpreter is restarted from the notebook page, the used 
memory is not released (especially with Spark 2.x), and even more memory is 
consumed to run the paragraph in the backend, because a new Spark interpreter 
is created.

The memory is only freed when the interpreter is restarted from all open 
notebooks; only then is the current Spark interpreter JVM killed.

If more notebooks are open, this can even cause an OOM issue, as there is not 
enough memory to serve them.

I ran an experiment to observe this problem, using jmap to measure the used 
memory inside the JVM (a sketch of the commands follows the numbers below).
{quote}Given that the available memory for the driver is 1 GB:

After the 1st notebook runs, used memory = 455 MB;

After the 2nd notebook runs, used memory = 610 MB;

After the 2nd notebook's interpreter is restarted, used memory = 608 MB 
(almost no change);

After the 2nd notebook runs again, used memory = 770 MB.
{quote}
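For reference, a minimal sketch of how such numbers can be collected on JDK 8 
(which Spark 2.4 runs on); the process name filter and the <driver-pid> 
placeholder are illustrative:
{code:bash}
# Locate the Spark interpreter / driver JVM (the main class name may vary
# by deploy mode; in Zeppelin it is typically RemoteInterpreterServer)
jps -lm | grep -i -e spark -e interpreter

# Print the current heap usage of that JVM (JDK 8 jmap)
jmap -heap <driver-pid>
{code}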
Would it be possible to release the memory when restarting from the notebook 
page? That would help mitigate the chance of an OOM issue on the Spark driver 
side.
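
As a hedged way to check whether the retained heap is genuinely reachable (a 
real leak) rather than merely uncollected garbage: jmap's -histo:live option 
forces a full GC before printing the class histogram, so heap usage measured 
right afterwards reflects only live objects.
{code:bash}
# -histo:live triggers a full GC first; if used memory is still high after
# this, the old interpreter's objects remain reachable and the leak is real
jmap -histo:live <driver-pid> | head -n 20
{code}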


