Hi Ed,
I agree with you that one of the big overheads in Hive is the startup time
of a job while it acquires containers and launches AMs and tasks. I just
wanted to draw your attention to something that is already in Hive right now
that addresses some of this. When using HiveServer2 in Tez mode, w
So everyone is running around saying "Hive is slow", "x is faster". I think
Hive's biggest issue is that the entire MR2 process to acquire containers
and then launch a job in them is super overkill. I have seen it result in 40
seconds of startup time for what amounts to a 2 second job. In the old Hadoop
0.20