Are you referencing member variables or other objects of your driver in your
transformations? Those would have to be serialized and shipped to each executor
when that job kicks off.
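
To illustrate the point, here is a minimal sketch in plain Java with no Spark involved. It assumes the closure is serialized with standard Java serialization, and all names in it (ClosureSizeDemo, Driver, SerFunction, lookupTable) are hypothetical, made up for the illustration. It compares the serialized size of a function that reads a member field, and so drags the whole enclosing object along, with one that reads a local copy of the same value.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.function.Function;

public class ClosureSizeDemo {
    // Hypothetical helper: a Function whose lambdas can be Java-serialized.
    interface SerFunction<T, R> extends Function<T, R>, Serializable {}

    // Stand-in for a driver-side object that holds a large member.
    static class Driver implements Serializable {
        final byte[] lookupTable = new byte[1_000_000]; // about 1 MB of state
        final int offset;

        Driver(int offset) {
            this.offset = offset;
        }

        // Reads this.offset, so the lambda captures `this`, and
        // serializing it also serializes lookupTable.
        SerFunction<Integer, Integer> usingField() {
            return x -> x + offset;
        }

        // Copies the field to a local first, so only one int is captured.
        SerFunction<Integer, Integer> usingLocalCopy() {
            final int localOffset = offset;
            return x -> x + localOffset;
        }
    }

    // Number of bytes produced by standard Java serialization of `o`.
    static int serializedSize(Object o) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(o);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.size();
    }

    public static void main(String[] args) {
        Driver d = new Driver(7);
        System.out.println("using field:      " + serializedSize(d.usingField()) + " bytes");
        System.out.println("using local copy: " + serializedSize(d.usingLocalCopy()) + " bytes");
    }
}
```

In this sketch the first function serializes to more than a megabyte because the whole Driver object travels with it, while the second stays small. If your transformations look like the first case, copying the values you need into local variables before the transformation is one thing to try.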
On 7/22/16, 8:54 AM, "Jacek Laskowski" wrote:
Hi,
I can't specifically answer your question, but my understanding of
Task Deserialization Time is that it's the time taken to deserialize a
serialized task received from the driver before it gets run. See
https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/executor/Executor.scala#L236
Hi,
I'm running a simple job (reading a sequential file and collecting the
data at the driver) in yarn-client mode. When looking at the history
server UI, the Task Deserialization Time varies widely between tasks
(5 ms to 5 s). What contributes to this Task Deserialization Time?
Thank you in advance!