Hello all, I am running a 3~4 node cluster under YARN. I have a small dataset (~500k records) but a huge number of internal tasks, for example looping over different segments of the data and running many computations on each segment.
It looks like strategies such as disabling serialization and increasing the number of executors at the expense of cores per executor help a lot. I also need to consider context switching and data locality. Any general ideas?
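For reference, this is roughly how I am configuring the job, a minimal sketch only; the app name and the concrete numbers (instances, cores, memory, locality wait) are placeholders, not my exact settings:

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Sketch of the "more executors, fewer cores each" setup I described.
val conf = new SparkConf()
  .setAppName("segmented-computations")          // placeholder name
  .set("spark.executor.instances", "8")          // more executors...
  .set("spark.executor.cores", "2")              // ...with fewer cores each
  .set("spark.executor.memory", "2g")            // placeholder value
  // How long Spark waits for a data-local slot before falling back
  // to a less local one; relevant to the locality question above.
  .set("spark.locality.wait", "3s")

val sc = new SparkContext(conf)
```

Thanks, Saif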