Hi,
The minimal solution is to enable dynamic allocation and set the idle
timeout to a low value. This ensures that idle executors are killed and
their resources become available for others to use:

    spark.dynamicAllocation.enabled
    spark.dynamicAllocation.executorIdleTimeout

(a configuration sketch is appended below the quoted mail).

If you would like to understand "wastage" vs. "completion time" with
different executor counts, try Sparklens:
https://github.com/qubole/sparklens

thanks,
rohitk

On Fri, Jun 15, 2018 at 6:32 PM, Alessandro Liparoti
<alessandro.l...@gmail.com> wrote:
> Hi,
>
> I would like to briefly present my use case and gather useful
> suggestions from the community. I am developing a Spark job which
> reads from and writes to Hive heavily. I usually use 200 executors with
> 12 GB of memory each and a parallelism level of 600. The main run of the
> application consists of these phases: read from HDFS, persist, small and
> simple aggregations, write to HDFS. These steps are repeated a certain
> number of times.
>
> When I write to Hive, I aim for partitions of approximately 50-70 MB,
> so before writing I repartition the output into approximately 15 parts
> (according to the data size). The writing phase takes around 1.5 minutes,
> which means that for 1.5 minutes only 15 out of 600 possible active tasks
> are running in parallel. This looks like a big waste of resources. How
> would you solve this problem?
>
> I am trying to experiment with the FAIR scheduler and job pools, but it
> does not seem to improve things much; for some reason, I cannot get more
> than 4 parallel jobs running. I am investigating this right now and may
> provide more details afterwards.
>
> I would like to know whether this use case is normal, what you would do,
> and whether in your opinion I am doing something wrong.
>
> Thanks,
> Alessandro Liparoti
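
A minimal sketch of the dynamic allocation settings above, set when
building the session (the 60s timeout and the 15/200 executor bounds are
illustrative values, not recommendations; the application name is
hypothetical; the external shuffle service is required for dynamic
allocation on YARN):

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder()
      .appName("hive-etl")  // hypothetical application name
      .config("spark.dynamicAllocation.enabled", "true")
      // release executors that sit idle for more than 60 seconds
      .config("spark.dynamicAllocation.executorIdleTimeout", "60s")
      .config("spark.dynamicAllocation.minExecutors", "15")
      .config("spark.dynamicAllocation.maxExecutors", "200")
      // the external shuffle service must be on for dynamic allocation
      .config("spark.shuffle.service.enabled", "true")
      .enableHiveSupport()
      .getOrCreate()

The same settings can equally be passed as --conf flags to spark-submit.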
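
Regarding the repartition-into-~15-parts write described in the quoted
mail: a hedged sketch of one way to size the output files, where df holds
the aggregated result and estimatedSizeMB is a size estimate computed
elsewhere (all names here are hypothetical):

    import org.apache.spark.sql.DataFrame

    def writeCompacted(df: DataFrame, table: String, estimatedSizeMB: Long): Unit = {
      val targetFileMB = 60  // aim for ~50-70 MB output files
      val numFiles = math.max(1, (estimatedSizeMB / targetFileMB).toInt)

      // coalesce avoids the extra shuffle that repartition triggers, but it
      // can also lower the parallelism of the stage computing df; repartition
      // preserves upstream parallelism at the cost of a shuffle.
      df.coalesce(numFiles)
        .write
        .mode("overwrite")
        .insertInto(table)  // the target table must already exist
    }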
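
On the FAIR scheduler point: pools only yield parallel jobs if the jobs
are submitted from separate driver threads, and one possible reason for
being stuck at 4 parallel jobs is the size of the thread pool doing the
submitting (Scala's default global ExecutionContext has one thread per
driver core). A sketch under those assumptions; "tables" is a hypothetical
list of (DataFrame, table name) pairs, and a "writes" pool is assumed to
be defined in fairscheduler.xml with spark.scheduler.mode=FAIR enabled:

    import java.util.concurrent.Executors
    import scala.concurrent.{Await, ExecutionContext, Future}
    import scala.concurrent.duration.Duration
    import org.apache.spark.sql.{DataFrame, SparkSession}

    def writeAllInParallel(spark: SparkSession, tables: Seq[(DataFrame, String)]): Unit = {
      // An explicit pool, since the default ExecutionContext can silently
      // cap the number of concurrently submitted Spark jobs.
      val pool = Executors.newFixedThreadPool(8)
      implicit val ec: ExecutionContext = ExecutionContext.fromExecutor(pool)

      val jobs = tables.map { case (df, table) =>
        Future {
          // Local properties are per-thread, so set the pool inside the task.
          spark.sparkContext.setLocalProperty("spark.scheduler.pool", "writes")
          df.coalesce(15).write.mode("overwrite").insertInto(table)
        }
      }
      Await.result(Future.sequence(jobs), Duration.Inf)
      pool.shutdown()
    }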