Hello,

I am facing two separate issues with Spark in my project that are driving me 
crazy. I am currently running on EMR (Spark 1.5.2 + YARN), using the 
"--executor-memory 40G" option.

Problem #1
=========

Some of my processes get killed by YARN because the container exceeds the 
physical memory limit YARN assigned to it. I have been able to work around this 
by increasing the spark.yarn.executor.memoryOverhead parameter to 8G, but that 
doesn't seem like a good solution.

My understanding is that the JVM running my Spark process gets 40 GB of heap 
memory (-Xmx40G), and that under memory pressure the GC should kick in to 
ensure that the heap never exceeds those 40 GB. My PermGen is set to 510 MB, 
which is a very long way from the 8 GB I need to set as overhead. The problem 
seems to happen when I .cache() very big RDDs and then perform operations that 
require shuffling (cogroup and similar).
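
To make the numbers concrete, here is a rough sketch of the arithmetic as I 
understand it. It assumes the YARN container request is simply the executor 
memory plus the overhead, and that Spark 1.5 defaults the overhead to the 
larger of 384 MB and 10% of the executor memory. Both are my assumptions, and 
the function names are only illustrative. Any rounding YARN applies to the 
request is not modeled.

```python
# All sizes in MB. Values come from the message above unless marked otherwise.
EXECUTOR_MEMORY_MB = 40 * 1024   # --executor-memory 40G (the -Xmx40G heap)
PERMGEN_MB = 510                 # PermGen setting, allocated outside the heap
OVERHEAD_MB = 8 * 1024           # spark.yarn.executor.memoryOverhead workaround

# Assumption: default overhead in Spark 1.5 on YARN is max(384 MB, 10% of heap).
DEFAULT_OVERHEAD_MIN_MB = 384


def default_overhead_mb(executor_memory_mb):
    """Overhead Spark would request if memoryOverhead were left unset."""
    return max(DEFAULT_OVERHEAD_MIN_MB, executor_memory_mb // 10)


def container_request_mb(executor_memory_mb, overhead_mb):
    """Assumed container size: heap plus off-heap overhead allowance."""
    return executor_memory_mb + overhead_mb


default_container = container_request_mb(
    EXECUTOR_MEMORY_MB, default_overhead_mb(EXECUTOR_MEMORY_MB))
workaround_container = container_request_mb(EXECUTOR_MEMORY_MB, OVERHEAD_MB)

# Off-heap usage that PermGen alone does not account for.
unexplained_mb = OVERHEAD_MB - PERMGEN_MB

print("default container (MB):   ", default_container)
print("workaround container (MB):", workaround_container)
print("overhead beyond PermGen:  ", unexplained_mb)
```

Under these assumptions the default allowance would be about 4 GB, the 
workaround brings the container to 48 GB, and roughly 7.5 GB of the overhead is 
not explained by PermGen.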

- Who is using all that off-heap memory?
- Are there any tools in the Spark ecosystem that might help me debug this?


Problem #2
=========

Some tasks fail because the heartbeat did not reach the master within 120 
seconds. Again, I can more or less work around this by increasing the timeout 
to 5 minutes, but I don't feel this addresses the real problem.

- Does the heartbeat have its own thread or would a long-running .map() block 
the heartbeat?
- What conditions would prevent the heartbeat from being sent?
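
To frame the first question: if the heartbeat had its own thread, I would 
expect long-running work on another thread not to stop it on its own. Below is 
a generic Python sketch of that expectation. It is not Spark's implementation, 
and `run_with_heartbeat` and `long_running_map` are hypothetical names used 
only for illustration.

```python
import threading
import time


def run_with_heartbeat(work, interval_s):
    """Run work() on the calling thread while a daemon thread records beats."""
    beats = []
    stop = threading.Event()

    def heartbeat_loop():
        # Event.wait returns False on timeout and True once stop is set.
        while not stop.wait(interval_s):
            beats.append(time.monotonic())

    beater = threading.Thread(target=heartbeat_loop, daemon=True)
    beater.start()
    try:
        result = work()
    finally:
        stop.set()
        beater.join()
    return result, len(beats)


def long_running_map():
    # Stand-in for a slow .map(); a short sleep keeps the example quick.
    time.sleep(0.3)
    return "done"


result, beat_count = run_with_heartbeat(long_running_map, interval_s=0.05)
print(result, beat_count)
```

In this toy the beats keep arriving while the work runs. It does not model 
anything that pauses the whole process, such as a long stop-the-world GC pause, 
which would stall both threads at once.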

Many thanks in advance for any help with this,
Ximo.
