Re: Dealing with monstrous hive startup overhead

2014-07-10 Thread Vikram Dixit
Hi Ed, I agree that one of the big overheads in Hive is job startup time, while the job acquires containers and launches AMs and tasks. I just wanted to draw your attention to something already in Hive that addresses some of this. When using HiveServer2 in Tez mode, w

Dealing with monstrous hive startup overhead

2014-07-10 Thread Edward Capriolo
So everyone is running around saying "Hive is slow" and "x is faster". I think Hive's biggest issue is that the entire MR2 process of acquiring containers and then launching a job in them is super overkill. I see it result in a 40-second startup time for what amounts to a 2-second job. In the old Hadoop 0.20