Hey Varun!
I'm not sure about your actual query, but hive.exec.parallel enables stages to
be executed in parallel.
The full Tez DAG is usually "one stage" of the execution (but you should take a
look at the EXPLAIN output).
If you were using the MR engine there might have been some speedup; but with Tez,
independent tasks are already executed in parallel inside Tez, regardless of
this setting.
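For illustration, here is a rough sketch of a query shape where the setting could matter on the MR engine. The table names table_a and table_b are made up, and I have not measured this; the idea is only that the two aggregations do not depend on each other, so they should show up as independent stages in the STAGE DEPENDENCIES section of the EXPLAIN output.

```sql
-- Sketch only: table_a and table_b are hypothetical tables.
-- Use the MR engine, where each aggregation typically becomes its own stage.
set hive.execution.engine=mr;
-- Allow independent stages to run at the same time.
set hive.exec.parallel=true;
-- Upper bound on how many stages may run concurrently.
set hive.exec.parallel.thread.number=8;

-- Check the STAGE DEPENDENCIES section: stages that do not depend on each
-- other are the ones hive.exec.parallel can run concurrently.
EXPLAIN
SELECT 'a' AS src, count(*) AS cnt FROM table_a
UNION ALL
SELECT 'b' AS src, count(*) AS cnt FROM table_b;
```

On Tez I would expect the same query to compile into a single DAG, so the setting should make little difference there.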
cheers,
Zoltan
On 5/7/19 9:00 PM, Varun Rao wrote:
Hello,
We were wondering what the benefits are of setting hive.exec.parallel to true. I know that this will execute any possible tasks in parallel. For example, MapReduce stages,
sampling stages, merge stages, limit stages etc. will be executed in parallel, allowing the overall job to be completed more quickly. However, my coworker and I decided
to run a 400-line query in separate Tez sessions. I would set hive.exec.parallel=true and he would set it to false. However, we see almost no improvement in speed. I
am assuming that is because the stages of my query are dependent on one another (join a with b with c with d). Is this the case? Can you give me examples of queries where
there would be an improvement in speed (perhaps in TPC-DS) when one sets hive.exec.parallel to true?
Thanks
Yours Truly,
Varun Rao