The PySpark app's stdout/stderr log shows this oddity.
Traceback (most recent call last):
  File "/root/spark/notebooks/ingest/XXX.py", line 86, in <module>
    print pdfRDD.collect()[:5]
  File "/root/spark/python/lib/pyspark.zip/pyspark/rdd.py", line 773, in collect
  File "/root/spark/python/lib/py4j-0.8
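One thing worth noting about the line the traceback points at: `collect()[:5]` materializes the entire RDD on the driver before slicing, whereas PySpark's `take(5)` only pulls as many partitions as needed. A minimal sketch of the difference, using a plain-Python lazy iterator as a stand-in for the RDD (the names `records` and `first_five` are illustrative, not from the original post):

```python
from itertools import islice

def records():
    # Stand-in for a large dataset; yields values lazily,
    # the way an RDD avoids materializing everything at once.
    for i in range(10_000_000):
        yield i

# collect-then-slice style: forces every record into driver memory first.
# first_five = list(records())[:5]        # expensive on a big dataset

# take-style: stops after the first five records.
first_five = list(islice(records(), 5))
print(first_five)  # [0, 1, 2, 3, 4]
```

In PySpark terms, the equivalent change would be replacing `pdfRDD.collect()[:5]` with `pdfRDD.take(5)`; if the RDD is large, the collect alone could account for a long apparent hang.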
Is this the stderr output from a worker? Are any files being written? Can
you run in debug and see how far it's getting?
This alone doesn't give me a direction to look in without the actual logs
from $SPARK_HOME or the stderr from the worker UI.
Just IMHO; maybe someone else knows what this means.
Hi all,
Wondering if someone can provide some insight into why this PySpark app is
just hanging. Here is the output.
...
15/12/03 01:47:05 INFO TaskSetManager: Starting task 21.0 in stage 0.0 (TID 21, 10.65.143.174, PROCESS_LOCAL, 1794787 bytes)
15/12/03 01:47:05 INFO TaskSetManager: Starting task 22