Hi, I built a Spark job that is very slow. ThreadPoolExecutor is invoked in every second task of my custom Spark pipeline step.
Additionally, I noticed that Spark spends a lot of its time in garbage collection, and sometimes 0 tasks are launched while the driver is still waiting.

I also posted this at http://stackoverflow.com/questions/41298550/spark-threadpoolexecutor-very-often-called-in-tasks with a minimal example at https://github.com/geoHeil/sparkContrastCoding

Looking forward to any input on how to speed up this Spark job.

Cheers,
Georg