Hi guys,
I’m playing with the Hive on Tez integration code, and I have a couple of questions about resource allocation. As I understand it (correct me if I am wrong), Hive creates a DAG composed of MapVertex and ReduceVertex nodes, and Tez later translates each vertex into tasks that run in potentially multiple containers. How are the resource requirements determined in the current implementation (how many containers each vertex needs, and how much CPU and memory each one requires), and where can I find the corresponding code?

Thank you!
Yunqi